<Chapter Label="ch:intro"><Heading>Introduction and Example</Heading>
The main purpose of the &GAPDoc; package is to define a file format for
documentation of &GAP;-programs and -packages (see <Cite Key="GAP4" />). The
problem is that such documentation should be readable in several output
formats. For example it should be possible to read the documentation inside
the terminal in which &GAP; is running (a text mode) and there should be a
printable version in high typesetting quality (produced by some version of
&TeX;). It is also popular to view &GAP;'s online help with a Web-browser
via an HTML-version of the documentation. Nowadays one can use &LaTeX; and
standard viewer programs to produce and view on the screen <C>dvi</C>- or
<C>pdf</C>-files with full support of internal and external hyperlinks.
Certainly there will be other interesting document formats and tools in this
direction in the future. <P/>
Our aim is to find a <Emph>format for writing</Emph> the documentation which
allows a relatively easy translation into the output formats just mentioned
and which hopefully makes it easy to translate to future output formats as
well. <P/>
To make documentation written in the &GAPDoc; format directly usable, we
also provide a set of programs, called converters, which produce text-,
hyperlinked &LaTeX;- and HTML-output versions of a &GAPDoc; document. These
programs are developed by the first named author. They run completely inside
&GAP;, i.e., no external programs are needed. You only need <C>latex</C> and
<C>pdflatex</C> to process the &LaTeX; output. These programs are described
in Chapter <Ref Chap="ch:conv"/>.
The definition of the &GAPDoc; format uses XML, the <Q>eXtendible Markup Language</Q>. This is a standard (defined by the W3C consortium, see
<URL>https://www.w3c.org</URL>) which lays down a syntax for adding markup to
a document or to some data. It allows to define document structures via
introducing markup <E>elements</E> and certain relations between them. This
is done in a <E>document type definition</E>. The file <F>gapdoc.dtd</F>
contains such a document type definition and is the central part of the
&GAPDoc; package. <P/>
The easiest way for getting a good idea about this is probably to look at an
example. The Appendix <Ref Appendix="app:3k+1" /> contains a short but
complete &GAPDoc; document for a fictitious share package. In the next
section we will go through this document, explain basic facts about XML and
the &GAPDoc; document type, and give pointers to more details in later parts
of this documentation. <P/>
In the last Section <Ref Sect="sec:faq" /> of this introductory chapter
we try to answer some general questions about the decisions which lead to
the &GAPDoc; package.
This line just tells a human reader and computer programs that the file
is a document with XML markup and that the text is encoded in the UTF-8
character set (other common encodings are ASCII or ISO-8895-X encodings).
Everything in a XML file between <Q><C><!--</C></Q> and
<Q><C>--></C></Q> is a comment and not part of the document content.
<Listing Type="from 3k+1.xml">
<![CDATA[<!DOCTYPE Book SYSTEM "gapdoc.dtd">
]]></Listing>
This line says that the document contains markup which is defined in
the system file <F>gapdoc.dtd</F> and that the markup obeys certain
rules defined in that file (the ending <F>dtd</F> means <Q>document type
definition</Q>). It further says that the actual content of the document
consists of an element with name <Q>Book</Q>. And we can really see that the
remaining part of the file is enclosed as follows:
This demonstrates the basics of the markup in XML. This part of the document
is an <Q>element</Q>. It consists of the <Q>start tag</Q> <C><![CDATA[<Book
Name="3k+1">]]></C>, the <Q>element content</Q> and the <Q>end tag</Q>
<C><![CDATA[</Book>]]></C> (end tags always start with <C></</C>). This element also has an <Q>attribute</Q> <C>Name</C> whose <Q>value</Q> is
<C>3k+1</C>.
<P/>
If you know HTML, this will look familiar to you. But there are some
important differences: The element name <C>Book</C> and attribute name
<C>Name</C> are <E>case sensitive</E>. The value of an attribute must
<E>always</E> be enclosed in quotes. In XML <E>every</E> element has a start
and end tag (which can be combined for elements defined as <Q>empty</Q>, see
for example <C><TableOfContents/></C> below).
<P/>
If you know &LaTeX;, you are familiar with quite different
types of markup, for example: The equivalent of the <C>Book</C> element in &LaTeX; is <C>\begin{document} ...
\end{document}</C>. The sectioning in &LaTeX; is not
done by explicit start and end markup, but implicitly via heading
commands like <C>\section</C>. Other markup is done by using
braces <C>{}</C> and putting some commands inside. And for
mathematical formulae one can use the <C>$</C> for the start
<E>and</E> the end of the markup. In XML <E>all</E> markup looks similar to
that of the <C>Book</C> element. <P/>
The content of the <C>TitlePage</C> element consists again of elements. In
Chapter <Ref Chap="DTD" /> we describe which elements are allowed
within a <C>TitlePage</C> and that their ordering is prescribed in this
case. In the (stupid) name of the author you see that a German umlaut is
used directly (in ISO-latin1 encoding).
<P/>
Contrary to &LaTeX;- or HTML-files this markup does not say anything about
the actual layout of the title page in any output version of the document.
It just adds information about the <E>meaning</E> of pieces of text. <P/>
Within the <C>Copyright</C> element there are two more things to learn about XML markup. The <C><P/></C> is a complete element. It is a combined
start and end tag. This shortcut is allowed for elements which are defined
to be always <Q>empty</Q>, i.e., to have no content. You may have already
guessed that <C><P/></C> is used as a paragraph separator. Note that
empty lines do not separate paragraphs (contrary to &LaTeX;). <P/>
Note that elements in XML must always be properly nested, as in this
example. A construct like <C><![CDATA[<a><b>...</a></b>]]></C> is <E>not</E>
allowed.
This is another example of an <Q>empty element</Q>. It just means that a
table of contents for the whole document should be included into any output version of the document.
<P/>
After this the main text of the document follows inside certain sectioning
elements:
These elements are used similarly to <Q>\chapter</Q> and
<Q>\section</Q> in &LaTeX;. But note that the explicit end tags are
necessary here.
<P/>
The sectioning commands allow to assign an optional attribute <Q>Label</Q>.
This can be used for referring to a section inside the document.
<P/>
The text of the first section starts as follows. The whitespace in the text
is unimportant and the indenting is not necessary.
<Listing Type="from 3k+1.xml">
<![CDATA[ Let <M>k \in &NN;</M> be a natural number. We consider the
sequence <M>n(i, k), i \in &NN;,</M> with <M>n(1, k) = k</M> and
else
]]></Listing>
Here we come to the interesting question how to type mathematical formulae
in a &GAPDoc; document. We did not find any alternative for writing formulae
in &TeX; syntax. (There is MATHML, but even simple formulae contain a lot
of markup, become quite unreadable and they are cumbersome to type.
Furthermore there seem to be no tools available which translate such
formulae in a nice way into &TeX; and text.) So, formulae are essentially
typed as in &LaTeX;. (Actually, it is also possible to type unicode
characters of some mathematical symbols directly, or via an entity like the
<C>&NN;</C> above.) There are three types of elements containing
formulae: <Q>M</Q>, <Q>Math</Q> and <Q>Display</Q>. The first two are for
in-text formulae and the third is for displayed formulae. Here <Q>M</Q> and
<Q>Math</Q> are equivalent, when translating a &GAPDoc; document into
&LaTeX;. But they are handled differently for terminal text (and HTML)
output. For the content of an <Q>M</Q>-element there are defined rules for a
translation into well readable terminal text. More complicated formulae are
in <Q>Math</Q> or <Q>Display</Q> elements and they are just printed as they
are typed in text output. So, to make a section well readable inside a
terminal window you should try to put as many formulae as possible into
<Q>M</Q>-elements. In our example text we used the notation <C>n(i, k)</C>
instead of <C>n_i(k)</C> because it is easier to read in text mode. See
Sections <Ref Sect="GDformulae"/> and <Ref Sect="sec:misc" /> for
more details. <P/>
A few lines further on we find two non-internal references.
The first within the <Q>Cite</Q>-element is the citation of a book. In
&GAPDoc; we use the widely used &BibTeX; database format for reference
lists. This does not use XML but has a well documented structure which is
easy to parse. And many people have collections of references readily
available in this format. The reference list in an output version of the
document is produced with the empty element
close to the end of our example file. The attribute <Q>Databases</Q>
give the name(s) of the database (<F>.bib</F>) files which contain the
references.
<P/>
Putting a Web-address into an <Q>URL</Q>-element allows one to create a
hyperlink in output formats which allow this.
<P/>
The second section of our example contains a special kind of subsection
defined in &GAPDoc;.
<Listing Type="from 3k+1.xml">
<![CDATA[ <ManSection>
<Func Name="ThreeKPlusOneSequence" Arg="k[, max]"/>
<Description>
This function computes for a natural number <A>k</A> the
beginning of the sequence <M>n(i, k)</M> defined in section
<Ref Sect="sec:theory"/>. The sequence stops at the first
<M>1</M> or at <M>n(<A>max</A>, k)</M>, if <A>max</A> is
given.
<Example>
gap> ThreeKPlusOneSequence(101); "Sorry, not yet implemented. Wait for Version 84 of the package"
</Example>
</Description>
</ManSection>
]]></Listing>
A <Q>ManSection</Q> contains the description of some function, operation,
method, filter and so on. The <Q>Func</Q>-element describes the name of a
<E>function</E> (there are also similar elements <Q>Oper</Q>, <Q>Meth</Q>,
<Q>Filt</Q> and so on) and names for its arguments, optional arguments
enclosed in square brackets. See Section <Ref Sect="sec:mansect" /> for
more details. <P/>
In the <Q>Description</Q> we write the argument names as <Q>A</Q>-elements.
A good description of a function should usually contain an example of its
use. For this there are some verbatim-like elements in &GAPDoc;, like
<Q>Example</Q> above (here, clearly, whitespace matters which causes a
slightly strange indenting). <P/>
The text contains an internal reference to the first section via the
explicitly defined label <C>sec:theory</C>.
<P/>
The first section also contains a <Q>Ref</Q>-element which refers to the
function described here. Note that there is no explicit label for such a
reference. The pair <C><![CDATA[<Func Name="ThreeKPlusOneSequence" Arg="k[,
max]"/>]]> and ThreeKPlusOneSequence"/>]]>
does the cross referencing (and hyperlinking if possible) implicitly via the
name of the function.
<P/>
Here is one further element from our example document which we want to
explain.
This is again an empty element which just says that an output version of the
document should contain an index. Many entries for the index are generated
automatically because the <Q>Func</Q> and similar elements implicitly
produce such entries. It is also possible to include explicit additional
entries in the index.
<List>
<Mark>Are those XML files too ugly to read and edit?</Mark>
<Item>
Just have a look and decide yourself. The markup needs more characters
than most &TeX; or &LaTeX; markup. But the structure of the document is
easier to see. If you configure your favorite editor well, you do not need
more key strokes for typing the markup than in &LaTeX;.
</Item>
<Mark>Why do we not use &LaTeX; alone?</Mark>
<Item>
&LaTeX; is good for writing books. But &LaTeX; files are generally
difficult to parse and to process to other output formats like text
for browsing in a terminal window or HTML (or new formats which may
become popular in the future). &GAPDoc; markup is one step more
abstract than &LaTeX; insofar as it describes meaning instead of
appearance of text. The inner workings of &LaTeX; are too complicated
to learn without pain, which makes it difficult to overcome problems
that occur occasionally.
</Item>
<Mark>Why XML and not a newly defined markup language?</Mark>
<Item> XML is a well defined standard that is more and more widely used. Lots
of people have thought about it. Years of experience with SGML went into the
design. It is easy to explain, easy to parse and lots of tools are available,
there will be more in the future.
</Item>
</List>
</Section>
</Chapter>
¤ Dauer der Verarbeitung: 0.15 Sekunden
(vorverarbeitet)
¤
Die Informationen auf dieser Webseite wurden
nach bestem Wissen sorgfältig zusammengestellt. Es wird jedoch weder Vollständigkeit, noch Richtigkeit,
noch Qualität der bereit gestellten Informationen zugesichert.
Bemerkung:
Die farbliche Syntaxdarstellung ist noch experimentell.