LyXFileFormatReverse

Notes on the LyX file format from the author of a lyx-to-xml converter.

See also LyXFileFormat.

General structure

A .lyx file consists of a header and a body. For a simple document, the body is a series of paragraphs.

#LyX 2.0 created this file. For more info see http://www.lyx.org/
\lyxformat 413
\begin_document
\begin_header
...
\end_header

\begin_body

\begin_layout Section
Lorem ipsum...
\end_layout

\begin_layout Standard
Lorem ipsum dolor sit amet, ...
\end_layout

\begin_layout Standard
Duis autem vel eum iriure dolor in ...
\end_layout

\end_body
\end_document
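
As a rough illustration of this structure, the following Python sketch splits a file into its format number, header lines and body lines. The function name split_lyx is hypothetical, and the sketch assumes that each marker (\begin_header, \end_header, \begin_body, \end_body) sits alone on its own line, as in the example above.

```python
def split_lyx(source: str):
    """Return (format_number, header_lines, body_lines) for a .lyx file.

    Assumes each structural marker occupies a whole line, as in the
    sample document shown in these notes.
    """
    lines = source.splitlines()

    # The format number follows the \lyxformat keyword.
    fmt = None
    for line in lines:
        if line.startswith("\\lyxformat"):
            fmt = int(line.split()[1])
            break

    # Everything strictly between the begin/end markers.
    header = lines[lines.index("\\begin_header") + 1:lines.index("\\end_header")]
    body = lines[lines.index("\\begin_body") + 1:lines.index("\\end_body")]
    return fmt, header, body
```

A missing marker raises ValueError from list.index, which is probably acceptable for a converter that only handles well-formed files.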

Paragraph

See Paragraph::write in src/Paragraph.cpp.

\n\begin_deeper (optionally)
\n\end_deeper (optionally)
\n\begin_layout name\n
...parameters...
...children...
\n\end_layout\n

Parameters: lines such as \paragraph_spacing onehalf or \align block. They probably always start with a backslash.

Children: loop over them while tracking the current column position. For each child:

  • If there is a font change, write it.
  • If it is a character:
    • a backslash is written as \backslash
    • if a dot is followed by a space character (code 0x20), a newline is inserted after the dot
    • if the column position is > 79, a newline is forced after the character
    • if the column position is > 70 and the character is a space (code 0x20), a newline is forced after the space
  • If it is an inset:
    • some insets are written directly, without a begin/end group
    • the others are wrapped as follows (here _ stands for a space):
      \begin_inset_ or \n\begin_inset_
      inset-specific write
      \n\end_inset\n\n
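
The character rules above can be sketched in Python as follows. This is a reading of the rules as stated in these notes, not a port of Paragraph::write, so details may differ from what LyX really does. The function name write_chars is hypothetical, and putting \backslash on a line of its own is an assumption made so that the token cannot run into the text that follows it.

```python
def write_chars(text: str) -> str:
    """Serialize character data using the line-breaking rules in these notes."""
    out = []
    column = 0
    for i, ch in enumerate(text):
        if ch == "\\":
            # Assumption: the token is isolated on its own line.
            out.append("\n\\backslash\n")
            column = 0
            continue

        out.append(ch)
        column += 1

        next_is_space = i + 1 < len(text) and text[i + 1] == " "
        if ch == "." and next_is_space:
            # A dot followed by a space: break right after the dot.
            out.append("\n")
            column = 0
        elif column > 79 or (column > 70 and ch == " "):
            # Hard limit, or a soft break at a space past column 70.
            out.append("\n")
            column = 0
    return "".join(out)
```

With these rules, the input "End. Next" is written with a line break directly after the dot, and no output line grows beyond 80 characters.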

TODO: insets written directly: which ones?

TODO: how are code listings kept from being corrupted?

Insets

Each inset type has its own serialization. Fortunately, they all follow a common pattern.

name type\n
options (without backslashes now)
inset text

Some options must be present and must appear in a specific order. For example, Flex requires the option status open, and Float figure requires the options wide, sideways and status.
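
A converter could check this before writing a file. The Python sketch below is only an illustration: the table REQUIRED_OPTIONS contains just the two cases mentioned in these notes, and both it and the function name has_required_options are hypothetical, not taken from LyX.

```python
# Required option names, in the order they must appear.
# Only the two cases named in these notes are listed; this is not a full table.
REQUIRED_OPTIONS = {
    "Flex": ["status"],
    "Float": ["wide", "sideways", "status"],
}


def has_required_options(inset_name: str, option_lines: list) -> bool:
    """Check that the required options are present and in the expected order."""
    # Option lines look like "status open": the first word is the option name.
    names = iter(line.split()[0] for line in option_lines if line.strip())
    # "req in names" consumes the iterator up to the match, so this is an
    # ordered subsequence test: each option must come after the previous one.
    return all(req in names for req in REQUIRED_OPTIONS.get(inset_name, []))
```

Insets that are not in the table are accepted as they are, which keeps the check harmless for inset types these notes do not describe.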

The wrapping \begin_inset and \end_inset are written by the paragraph writer.

Inset text is actually not character data, but an object of type InsetText, which is a wrapper for an object of type Text, which is usually a list of paragraphs (and I suspect it's exactly the type for the content of a document).

Insets are always inside a paragraph.

Character styles

More precisely, these are insets of the type "Flex". The paragraph style used inside them is named Plain Layout.

So, the paragraph

Lorem ipsum dolor sit amet, consetetur sadipscing elitr, ...

in which the words "ipsum dolor" are emphasized, is serialized as:

\begin_layout Standard
Lorem 
\begin_inset Flex Emph
status collapsed

\begin_layout Plain Layout
ipsum dolor
\end_layout

\end_inset

 sit amet, consetetur sadipscing elitr, ...
\end_layout
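
A something-to-lyx converter could produce such an inset with a small helper like the Python sketch below. The name flex_inset is hypothetical, and the text is assumed to be already escaped and free of newlines; the layout of the output simply copies the example above.

```python
def flex_inset(style: str, text: str, status: str = "collapsed") -> str:
    """Serialize one character-style (Flex) inset.

    The caller is expected to place the result on its own line inside
    the surrounding paragraph, as in the example in these notes.
    """
    return (
        f"\\begin_inset Flex {style}\n"
        f"status {status}\n"
        "\n"
        "\\begin_layout Plain Layout\n"
        f"{text}\n"
        "\\end_layout\n"
        "\n"
        "\\end_inset\n"
    )
```

Calling flex_inset("Emph", "ipsum dolor") yields the eight lines from \begin_inset to \end_inset shown in the example.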

Character escaping

Backslash becomes \backslash.

Forced newline:

\begin_inset Newline newline
\end_inset

LyX ignores a plain newline in character data, so a something-to-lyx converter has to handle newlines in its input specially. If the newline:

  • is important: it is replaced by the Newline inset
  • is not important: it is replaced by \n_ (a newline plus a space)
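
The two cases can be sketched in Python like this. The function name convert_newlines and the flag keep_breaks are hypothetical; how a converter decides whether a newline is important depends on its input format. The inset is wrapped with the \n before \begin_inset and the \n\n after \end_inset described in the Paragraph section.

```python
# The forced-newline inset, wrapped the way the paragraph writer wraps insets.
NEWLINE_INSET = "\n\\begin_inset Newline newline\n\\end_inset\n\n"


def convert_newlines(text: str, keep_breaks: bool) -> str:
    """Replace input newlines by something LyX will not silently drop.

    keep_breaks=True  -> every newline becomes a Newline inset.
    keep_breaks=False -> every newline becomes newline plus space, so the
                         words on both sides stay separated.
    """
    replacement = NEWLINE_INSET if keep_breaks else "\n "
    return text.replace("\n", replacement)
```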

TODO

LyX grammar

There is no official grammar, but the format is regular enough to propose one.

LYX      := HEADER, BEGIN_DOCUMENT, DOCUMENT, END_DOCUMENT
DOCUMENT := LAYOUT*

LAYOUT   := LAYOUT_BEGIN, LAYOUT_OPTION*, (CHARDATA | INSET)*, LAYOUT_END
LAYOUT_BEGIN   := \begin_layout LAYOUT_NAME
                  Can contain spaces (special name "Plain Layout")
LAYOUT_OPTION  := OPTION_NAME OPTION_VALUE?
OPTION_NAME    := Name prefixed with a backslash. Should not be confused
                  with character data '\backslash' .....
LAYOUT_END     := \end_layout

INSET          := INSET_BEGIN, INSET_OPTION*, LAYOUT*, INSET_END
INSET_BEGIN    := \begin_inset INSET_NAME INSET_ANNOTATION*
INSET_OPTION   := OPTION_NAME OPTION_VALUE
INSET_END      := \end_inset

TABULAR        := angle-brackets-style

Tables are different.

To repeat:

  • An inset is always inside a paragraph (= inside a layout). This paragraph can also contain character data.
  • Character data is always inside a layout.

A LyX parser can be found here: http://github.com/olpa/tex, file "lyxml/lyxparser.py".
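
For orientation, here is a minimal recursive-descent sketch of the LAYOUT and INSET rules in Python. It is not the code from lyxparser.py; all function names and the dictionary layout are hypothetical. It works on a list of lines, reads inset options as lines without a leading backslash (as described in the Insets section), treats every other backslash line inside a layout as an option, and does not handle TABULAR or malformed input.

```python
def parse_layouts(lines, pos=0):
    """Parse LAYOUT*; return (list_of_layouts, next_position)."""
    layouts = []
    while pos < len(lines):
        if lines[pos].startswith("\\begin_layout"):
            layout, pos = parse_layout(lines, pos)
            layouts.append(layout)
        elif lines[pos].strip() == "":
            pos += 1          # blank separator line
        else:
            break             # something that is not a layout
    return layouts, pos


def parse_layout(lines, pos):
    """Parse one LAYOUT starting at its \\begin_layout line."""
    name = lines[pos][len("\\begin_layout"):].strip()  # may contain spaces
    node = {"layout": name, "options": [], "children": []}
    pos += 1
    while lines[pos] != "\\end_layout":
        line = lines[pos]
        if line.startswith("\\begin_inset"):
            inset, pos = parse_inset(lines, pos)
            node["children"].append(inset)
            continue
        if line == "\\backslash":
            node["children"].append("\\")            # escaped character data
        elif line.startswith("\\"):
            node["options"].append(line[1:].split(None, 1))
        elif line:
            node["children"].append(line)            # plain character data
        pos += 1
    return node, pos + 1


def parse_inset(lines, pos):
    """Parse one INSET starting at its \\begin_inset line."""
    words = lines[pos].split()[1:]
    node = {"inset": words[0], "annotations": words[1:],
            "options": [], "layouts": []}
    pos += 1
    # Inset options: non-empty lines without a leading backslash.
    while lines[pos] and not lines[pos].startswith("\\"):
        node["options"].append(lines[pos].split(None, 1))
        pos += 1
    node["layouts"], pos = parse_layouts(lines, pos)
    while lines[pos] != "\\end_inset":               # skip anything unparsed
        pos += 1
    return node, pos + 1
```

Feeding the lines of the Flex example from the "Character styles" section to parse_layouts gives one Standard layout whose children are the text before the inset, a Flex inset containing one Plain Layout, and the text after it.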

Tables

A table is stored as an HTML-like chunk.

\begin_inset Tabular
<lyxtabular version="3" rows="2" columns="2">
<features tabularvalignment="middle">
<column alignment="center" valignment="top" width="0">
<column alignment="center" valignment="top" width="0">
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
r1c1
\end_layout

\end_inset
</cell>
...
</row>
...
</lyxtabular>

\end_inset

The elements cell, row and lyxtabular have closing tags; the elements features and column do not.

The order of the attributes is significant.

Each cell element contains a Text inset, inside which the usual serialization as a list of layouts resumes.
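
Because the chunk is only HTML-like (some elements are never closed), a generic XML parser will probably not accept it. A line-based approach is one option, sketched below in Python with the standard re module. The function name parse_tag is hypothetical, and the sketch assumes one tag per line with double-quoted attribute values, as in the example above. Attributes are returned as a list of pairs so that their order is preserved.

```python
import re

# One whole tag per line: optional "/", element name, then name="value" pairs.
TAG_RE = re.compile(r'<(/?)(\w+)((?:\s+\w+="[^"]*")*)\s*>')
ATTR_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_tag(line: str):
    """Return (name, is_closing, [(attr, value), ...]) or None if not a tag."""
    match = TAG_RE.fullmatch(line.strip())
    if match is None:
        return None
    closing, name, attrs = match.groups()
    # findall keeps the attributes in document order.
    return name, bool(closing), ATTR_RE.findall(attrs)
```

Lines for which parse_tag returns None (such as \begin_inset Text or cell text) would be passed on to the ordinary layout and inset handling.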

Category Development

Page last modified on 2012-09-26 09:59 UTC