<Chapter Label="HowEnter">
<Heading>How To Type a &GAPDoc; Document</Heading>

In this chapter we give a more formal description of what you need to start
to type documentation in &GAPDoc; XML format. Many details were already
explained by example in Section <Ref Sect="sec:3k+1expl"/> of the
introduction.<P/>

We do <E>not</E> answer the question <Q>How to <E>write</E> a &GAPDoc;
document?</Q> in this chapter. You can (hopefully) find an answer to
this question by studying the example in the introduction, see <Ref
Sect="sec:3k+1expl"/>, and learning about more details in the reference
Chapter <Ref Chap="DTD" />.<P/>

The definitive source for all details of the official XML standard with useful
annotations is:<P/>

<URL>http://www.xml.com/axml/axml.html</URL><P/>

Although this document must be quite technical, it is surprisingly
readable.<P/>

<Section Label="EnterXML">
<Heading>General XML Syntax</Heading>

We will now discuss the pieces of text which can occur in a general XML
document. We start with those pieces which do not contribute to the actual
content of the document.

<Subsection Label="XMLhead">
<Heading>Head of XML Document</Heading>

Each XML document should have a head which states that it is an XML document
in some encoding and which XML-defined language is used. In case of a
&GAPDoc; document this should always look as in the following example.

<Log>
<![CDATA[<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Book SYSTEM "gapdoc.dtd">]]>
</Log>

See <Ref Subsect="XMLenc"/> for a remark on the <Q>encoding</Q>
statement.<P/>

(There may be local entity definitions inside the <C>DOCTYPE</C> statement,
see Subsection <Ref Subsect="GDent" /> below.)
</Subsection>

<Subsection Label="XMLcomment">
<Heading>Comments</Heading>

A <Q>comment</Q> in XML starts with the character sequence
<Q><C>&lt;!--</C></Q> and ends with the sequence <Q><C>--></C></Q>.
Between
these sequences there must not be two adjacent dashes <Q><C>--</C></Q>.

</Subsection>

<Subsection Label="XMLprocinstr">
<Heading>Processing Instructions</Heading>

A <Q>processing instruction</Q> in XML starts with the character sequence
<Q><C>&lt;?</C></Q> followed by a name (<Q><C>xml</C></Q> is only allowed
at the very beginning of the document to declare that it is an XML document,
see <Ref Subsect="XMLhead"/>). After that any characters may follow, except
that the ending sequence <Q><C>?></C></Q> must not occur within the
processing instruction.

</Subsection>

 <P/>
And now we turn to those parts of the document which contribute to its
actual content.

<Subsection Label="XMLnames">
<Heading>Names in XML and Whitespace</Heading>

A <Q>name</Q> in XML (used for element and attribute identifiers, see below)
must start with a letter (in the encoding of the document) or with a
colon <Q><C>:</C></Q> or underscore <Q><C>_</C></Q> character. The
following characters may also be digits, dots <Q><C>.</C></Q> or dashes
<Q><C>-</C></Q>.<P/>

This is a simplified description of the rules in the standard, which are
concerned with lots of Unicode ranges to specify what a <Q>letter</Q>
is.<P/>

Sequences only consisting of the following characters are considered as
<E>whitespace</E>: blanks, tabs, carriage return characters and new line
characters.

</Subsection>

<Subsection Label="XMLel">
<Heading>Elements</Heading>

The actual content of an XML document consists of <Q>elements</Q>.
An element has some <Q>content</Q> with a leading <Q>start tag</Q>
(<Ref Subsect="XMLstarttag"/>) and a trailing <Q>end tag</Q> (<Ref
Subsect="XMLendtag"/>). The content can contain further elements but they
must be properly nested.
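<P/>
As a minimal sketch of proper nesting (the element names <C>a</C> and
<C>b</C> are hypothetical and chosen only for illustration, they are not
&GAPDoc; elements), the inner element is closed before the outer one:

<Listing Type="Example">
<![CDATA[<a> outer text <b> inner text </b> more outer text </a>]]>
</Listing>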
One can define elements whose content is always
empty; those elements can also be entered with a single combined tag (<Ref
Subsect="XMLcombtag"/>).
</Subsection>

<Subsection Label="XMLstarttag">
<Heading>Start Tags</Heading>

A <Q>start tag</Q> consists of a less-than-character <Q><C>&lt;</C></Q>
directly followed (without whitespace) by an element name (see <Ref
Subsect="XMLnames"/>), optional attributes, optional whitespace, and a
greater-than-character <Q><C>></C></Q>.<P/>

An <Q>attribute</Q> consists of some whitespace and then its name
followed by an equal sign <Q><C>=</C></Q> which is optionally enclosed by
whitespace, and the attribute value, which is enclosed either in single
or double quotes. The attribute value may not contain the type of
quote used as a delimiter or the character <Q><C>&lt;</C></Q>. The
character <Q><C>&amp;</C></Q> may only appear to start an entity,
see <Ref Subsect="XMLent"/>. We describe
in <Ref Subsect="AttrValRules"/> how
to enter special characters in attribute values.<P/>

Note especially that no whitespace is allowed between the starting
<Q><C>&lt;</C></Q> character and the element name. The quotes around an
attribute value cannot be omitted. The names of elements and attributes are
<E>case sensitive</E>.
</Subsection>

<Subsection Label="XMLendtag">
<Heading>End Tags</Heading>

An <Q>end tag</Q> consists of the two characters <Q><C>&lt;/</C></Q>
directly followed by the element name, optional whitespace and a
greater-than-character <Q><C>></C></Q>.
</Subsection>

<Subsection Label="XMLcombtag">
<Heading>Combined Tags for Empty Elements</Heading>

Elements which always have empty content can be written with a single
tag.
This looks like a start tag (see <Ref Subsect="XMLstarttag"/>)
<E>except</E> that the trailing greater-than-character <Q><C>></C></Q>
is substituted by the two character sequence <Q><C>/></C></Q>.

</Subsection>

<Subsection Label="XMLent">
<Heading>Entities</Heading>

An <Q>entity</Q> in XML is a macro for some substitution text. There are two
types of entities. <P/>

A <Q>character entity</Q> can be used to specify characters in the encoding
of the document (this can be useful for entering non-ASCII characters which you
cannot manage to type in directly). They are entered with a sequence
<Q><C>&amp;#</C></Q>, directly followed by either some decimal digits
or an <Q><C>x</C></Q> and some hexadecimal digits, directly followed by a
semicolon <Q><C>;</C></Q>. Using such a character entity is equivalent
to typing the corresponding character directly.<P/>

Then there are references to <Q>named entities</Q>. They are entered with an
ampersand character <Q><C>&amp;</C></Q> directly followed by a name which
is directly followed by a semicolon <Q><C>;</C></Q>. Such entities must be
declared somewhere by giving a substitution text. This text is included in
the document and the document is parsed again afterwards. The exact rules
are a bit subtle but you probably want to use this only in simple cases.
Predefined entities for &GAPDoc; are described in <Ref Subsect="XMLspchar"/>
and <Ref Subsect="GDent"/>.<P/>

</Subsection>

<Subsection Label="XMLspchar">
<Heading>Special Characters in XML</Heading>

We have seen that the less-than-character <Q><C>&lt;</C></Q> and the
ampersand character <Q><C>&amp;</C></Q> start a tag or entity reference in
XML. To get these characters into the document text one has to use
entity references, namely <Q><C>&amp;lt;</C></Q> to get <Q><C>&lt;</C></Q>
and <Q><C>&amp;amp;</C></Q> to get <Q><C>&amp;</C></Q>.
Furthermore
<Q><C>&amp;gt;</C></Q> must be used to get <Q><C>></C></Q> when the string
<Q><C>]]&gt;</C></Q> appears in element content (and not as delimiter of a
<C>CDATA</C> section explained below).<P/>

Another possibility is to use a <C>CDATA</C> statement explained
in <Ref Subsect="XMLcdata"/>.

</Subsection>

<Subsection Label="AttrValRules">
<Heading>Rules for Attribute Values</Heading>

Attribute values can contain entities which are substituted recursively.
However, a &lt; character must not be introduced by the substitution,
except via the entity &amp;lt; or a character entity (there is
no XML parsing for evaluating the attribute value, just entity substitutions).
</Subsection>

<Subsection Label="XMLcdata">
<Heading><C>CDATA</C></Heading>

Pieces of text which contain many characters which can be
misinterpreted as markup can be enclosed by the character sequences
<Q><C><![CDATA[<![CDATA[]]></C></Q> and <Q><C>]]&gt;</C></Q>. Everything
between these sequences is considered as content of the document and is not
further interpreted as XML text. All the rules explained so far in this
section do <E>not apply</E> to such a part of the document. The only
document content which cannot be entered directly inside a <C>CDATA</C>
statement is the sequence <Q><C>]]&gt;</C></Q>.
This can be entered as
<Q><C>]]&amp;gt;</C></Q> outside the <C>CDATA</C> statement.

<Listing Type="Example">
A nesting of tags like <![CDATA[<a> <b> </a> </b>]]> is not allowed.
</Listing>

</Subsection>

<Subsection Label="XMLenc">
<Heading>Encoding of an XML Document</Heading>

We suggest using the UTF-8 encoding for writing &GAPDoc; XML documents.
But the tools described in Chapter <Ref Chap="ch:conv" /> also work
with ASCII or the various ISO-8859-X encodings (ISO-8859-1 is also
called latin1 and covers most special characters for western European
languages).

</Subsection>

<Subsection Label="XMLvalid">
<Heading>Well Formed and Valid XML Documents</Heading>

We want to mention two further important words which are often used in the
context of XML documents. A piece of text becomes a <Q>well formed</Q> XML
document if all the formal rules described in this section are fulfilled.
<P/>

But this says nothing about the content of the document. To give
this content a meaning one needs a declaration of the element and
corresponding attribute names as well as of named entities which are
allowed. Furthermore there may be restrictions on how such elements can be
nested. This <E>definition of an XML based markup language</E> is done in a
<Q>document type definition</Q>. An XML document which contains only
elements and entities declared in such a document type definition and obeys
the rules given there is called <Q>valid (with respect to this document type
definition)</Q>.<P/>

The main file of the &GAPDoc; package is <F>gapdoc.dtd</F>. This contains
such a definition of a markup language. We are not going to explain the
formal syntax rules for document type definitions in this section.
But in
Chapter <Ref Chap="DTD"/> we will explain enough about it to understand
the file <F>gapdoc.dtd</F> and so the markup language defined there.

</Subsection>
</Section>

<Section Label="EnterGD">
<Heading>Entering &GAPDoc; Documents</Heading>

Here are some additional rules for writing &GAPDoc; XML documents.

<Subsection Label="otherspecchar">
<Heading>Other special characters</Heading>
As &GAPDoc; documents are used to produce &LaTeX; and HTML
documents, the question arises how to deal with characters with a
special meaning for other applications. For example,
<Q><C>&amp;</C></Q>,
<Q><C>#</C></Q>,
<Q><C>$</C></Q>,
<Q><C>%</C></Q>,
<Q><C>~</C></Q>,
<Q><C>\</C></Q>,
<Q><C>{</C></Q>,
<Q><C>}</C></Q>,
<Q><C>_</C></Q>,
<Q><C>^</C></Q>,
<Q><C>&nbsp;</C></Q> (this is a non-breakable space,
<Q><C>~</C></Q> in &LaTeX;) have a special meaning for &LaTeX;, and
<Q><C>&amp;</C></Q>,
<Q><C>&lt;</C></Q>,
<Q><C>></C></Q> have a special meaning for HTML (and XML).
In &GAPDoc; you can usually just type these characters directly; it is
the task of the converter programs which translate to some output format
to take care of such special characters.
The exceptions to this simple
rule are:
<List >
<Item>
&amp; and &lt; must be entered as <C>&amp;amp;</C> and
<C>&amp;lt;</C> as explained in <Ref Subsect="XMLspchar"/>.
</Item>
<Item>The content of the &GAPDoc; elements <C>&lt;M></C>,
<C>&lt;Math></C> and <C>&lt;Display></C> is &LaTeX; code,
see <Ref Sect="MathForm"/>.</Item>
<Item>The content of an <C>&lt;Alt></C> element with <C>Only</C>
attribute contains code for the specified output type, see
<Ref Subsect="Alt"/>.</Item>
</List>

Remark: In former versions of &GAPDoc; one had to use particular
entities for all the special characters mentioned above
(<C>&amp;tamp;</C>, <C>&amp;hash;</C>,
<C>&amp;dollar;</C>, <C>&amp;percent;</C>, <C>&amp;tilde;</C>,
<C>&amp;bslash;</C>, <C>&amp;obrace;</C>, <C>&amp;cbrace;</C>,
<C>&amp;uscore;</C>, <C>&amp;circum;</C>, <C>&amp;tlt;</C>, <C>&amp;tgt;</C>).
These are no longer needed, but they are still defined for backwards
compatibility with older &GAPDoc; documents.

</Subsection>

<Subsection Label="GDformulae">
<Heading>Mathematical Formulae</Heading>

Mathematical formulae in &GAPDoc; are typed as in &LaTeX;. They must be
the content of one of three types of &GAPDoc; elements concerned with
mathematical formulae: <Q><C>Math</C></Q>, <Q><C>Display</C></Q>, and
<Q><C>M</C></Q> (see Sections <Ref Subsect="Math"/> and <Ref
Subsect="M"/> for more details). The first two correspond to &LaTeX;'s
math mode and display math mode. The last one is a special form of the
<Q><C>Math</C></Q> element type, that imposes certain restrictions on
the content. On the other hand the content of an <Q><C>M</C></Q> element
is processed in a well defined way for text terminal or HTML output.
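<P/>
As a hedged illustration (the formulae are made up for this sketch and are
not taken from an existing manual), a short inline formula and a displayed
formula might be typed as follows:

<Listing Type="Example">
<![CDATA[The element <M>x^2 + y</M> lies in the ideal.
<Display>
\sum_{i=1}^n x_i = 1
</Display>]]>
</Listing>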
The
<Q><C>Display</C></Q> element also has an attribute such that its
content is processed as in <Q><C>M</C></Q> elements.<P/>

Note that the content of these elements is &LaTeX; code, but
the special characters
<Q><C>&lt;</C></Q> and <Q><C>&amp;</C></Q> for XML must be entered via
the entities described in <Ref Subsect="XMLspchar"/> or by using a
<C>CDATA</C> statement, see <Ref Subsect="XMLcdata"/>.<P/>

</Subsection>

<Subsection Label="GDent">
<Heading>More Entities</Heading>

In &GAPDoc; there are some more predefined entities:

<Table Align="|l|l|">
<Caption>Predefined Entities in the &GAPDoc; system</Caption>
<HorLine/>
<Row> <Item><C>&amp;GAP;</C></Item> <Item>&GAP;</Item> </Row>
<HorLine/>
<Row> <Item><C>&amp;GAPDoc;</C></Item> <Item>&GAPDoc;</Item> </Row>
<HorLine/>
<Row> <Item><C>&amp;TeX;</C></Item> <Item>&TeX;</Item> </Row>
<HorLine/>
<Row> <Item><C>&amp;LaTeX;</C></Item> <Item>&LaTeX;</Item> </Row>
<HorLine/>
<Row> <Item><C>&amp;BibTeX;</C></Item> <Item>&BibTeX;</Item> </Row>
<HorLine/>
<Row> <Item><C>&amp;MeatAxe;</C></Item> <Item>&MeatAxe;</Item> </Row>
<HorLine/>
<Row> <Item><C>&amp;XGAP;</C></Item> <Item>&XGAP;</Item> </Row>
<HorLine/>
<Row> <Item><C>&amp;copyright;</C></Item> <Item>&copyright;</Item> </Row>
<HorLine/>
<Row> <Item><C>&amp;nbsp;</C></Item> <Item><Q>&nbsp;</Q></Item> </Row>
<HorLine/>
<Row> <Item><C>&amp;ndash;</C></Item> <Item>&ndash;</Item> </Row>
<HorLine/>
</Table>

Here <C>&amp;nbsp;</C> is a non-breakable space character.
<P/>

Additional entities are defined for some mathematical symbols, see <Ref
Sect="MathForm"/> for more details.
<P/>
One can define further local entities right inside the head (see <Ref
Subsect="XMLhead"/>) of a &GAPDoc; XML document as in the following example.

<Listing Type="Example">
<![CDATA[<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE Book SYSTEM "gapdoc.dtd"
[ <!ENTITY MyEntity "some longish <E>text</E> possibly with
markup">
]>]]>
</Listing>

These additional definitions go into the <C>&lt;!DOCTYPE</C> tag in square
brackets. Such new entities are used like this: <C>&amp;MyEntity;</C> <P/>

</Subsection>

</Section>
</Chapter>