<Chapter Label="HowEnter">
<Heading>How To Type a &GAPDoc; Document</Heading>

In this chapter we give a more formal description of what you need to know
to start typing documentation in &GAPDoc; XML format. Many details were
already explained by example in Section&nbsp;<Ref  Sect="sec:3k+1expl"/> of
the introduction.<P/>

We  do <E>not</E>  answer the  question  <Q>How to  <E>write</E> a  &GAPDoc;
document?</Q>  in  this chapter.  You  can  (hopefully)  find an  answer  to
this question  by studying  the example  in the  introduction, see&nbsp;<Ref
Sect="sec:3k+1expl"/>,  and learning  about  more details  in the  reference
Chapter&nbsp;<Ref Chap="DTD" />.<P/>

The definitive source for all details of the official XML standard, with
useful annotations, is:<P/>

<URL>http://www.xml.com/axml/axml.html</URL><P/>

Although this document is necessarily quite technical, it is surprisingly
readable.<P/>

<Section Label="EnterXML">
<Heading>General XML Syntax</Heading>

We will  now discuss the  pieces of  text which can  occur in a  general XML
document. We start  with those pieces which do not  contribute to the actual
content of the document.

<Subsection Label="XMLhead">
<Heading>Head of XML Document</Heading>

Each XML document should have a head which states that it is an XML document
in some encoding and which XML-defined language is used. In the case of a
&GAPDoc; document this should always look as in the following example.

<Log>
<![CDATA[<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Book SYSTEM "gapdoc.dtd">]]>
</Log>

See&nbsp;<Ref  Subsect="XMLenc"/>  for  a   remark  on  the  <Q>encoding</Q>
statement.<P/>

(There may be local entity  definitions inside the <C>DOCTYPE</C> statement,
see Subsection&nbsp;<Ref Subsect="GDent" /> below.)
</Subsection>

<Subsection Label="XMLcomment">
<Heading>Comments</Heading>

A   <Q>comment</Q>    in   XML   starts   with    the   character   sequence
<Q><C>&lt;!--</C></Q> and ends with the sequence <Q><C>--></C></Q>. Between
these sequences there must not be two adjacent dashes <Q><C>--</C></Q>.
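<P/>
As a small illustration (the comment texts are made up for this example),
the first line below is a valid comment, while the second is not well
formed because of the two adjacent dashes inside it.

<Listing Type="Example">
<![CDATA[<!-- This text is ignored when the document is processed. -->
<!-- Not allowed -- two adjacent dashes inside a comment. -->]]>
</Listing>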

</Subsection>

<Subsection Label="XMLprocinstr">
<Heading>Processing Instructions</Heading>

A <Q>processing  instruction</Q> in XML  starts with the  character sequence
<Q><C>&lt;?</C></Q> followed by a name (the name <Q><C>xml</C></Q> is only
allowed at the very beginning of the document, to declare that it is an XML
document, see <Ref Subsect="XMLhead"/>). After that any characters may
follow, except that the ending sequence <Q><C>?></C></Q> must not occur
within the processing instruction.
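<P/>
For illustration, a processing instruction could look as follows. The name
<C>mytool</C> and the text after it are hypothetical and only chosen to show
the syntax.

<Listing Type="Example">
<![CDATA[<?mytool any text except the ending sequence may appear here?>]]>
</Listing>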

</Subsection>

&nbsp;<P/>
And now  we turn  to those  parts of  the document  which contribute  to its
actual content.

<Subsection Label="XMLnames">
<Heading>Names in XML and Whitespace</Heading>

A <Q>name</Q> in XML (used for element and attribute identifiers, see below)
must  start with  a letter  (in  the encoding  of  the document)  or with  a
colon  <Q><C>:</C></Q> or  underscore <Q><C>_</C></Q>  character. The
following  characters may  also  be digits,  dots  <Q><C>.</C></Q> or dashes
<Q><C>-</C></Q>.<P/>

This is  a simplified description  of the rules  in the standard,  which are
concerned  with lots  of  Unicode  ranges to  specify  what a  <Q>letter</Q>
is.<P/>

Sequences consisting only of the following characters are considered as
<E>whitespace</E>:  blanks, tabs,  carriage return  characters and  new line
characters.
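<P/>
Applying the simplified rules above to a few made-up strings gives the
following picture (the names are chosen for illustration only).

<Listing Type="Example">
<![CDATA[Section      allowed: letters only
_my.name-2   allowed: underscore first, then letters, dot, dash, digit
:label       allowed: starts with a colon
2ndPart      not allowed: starts with a digit
-dash        not allowed: starts with a dash]]>
</Listing>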

</Subsection>

<Subsection Label="XMLel">
<Heading>Elements</Heading>

The  actual  content  of  an   XML  document  consists  of  <Q>elements</Q>.
An  element  has  some  <Q>content</Q>   with  a  leading  <Q>start  tag</Q>
(<Ref  Subsect="XMLstarttag"/>)   and  a   trailing  <Q>end   tag</Q>  (<Ref
Subsect="XMLendtag"/>). The  content can  contain further elements  but they
must be properly nested. One can define elements whose content is always
empty; those elements can also be entered with a single combined tag (<Ref
Subsect="XMLcombtag"/>).
</Subsection>

<Subsection Label="XMLstarttag">
<Heading>Start Tags</Heading>

A  <Q>start tag</Q>  consists  of  a less-than-character  <Q><C>&lt;</C></Q>
directly  followed (without  whitespace) by  an element  name (see&nbsp;<Ref
Subsect="XMLnames"/>),  optional  attributes,  optional  whitespace,  and  a
greater-than-character <Q><C>></C></Q>.<P/>

An  <Q>attribute</Q>  consists   of  some  whitespace  and   then  its  name
followed by  an equal sign  <Q><C>=</C></Q> which is optionally  enclosed by
whitespace,  and the  attribute value,  which is  enclosed either  in single
or double quotes. The attribute value must not contain the type of quote
used as a delimiter or the character <Q><C>&lt;</C></Q>, and the character
<Q><C>&amp;</C></Q> may only appear to start an entity,
see&nbsp;<Ref Subsect="XMLent"/>. We describe
in&nbsp;<Ref Subsect="AttrValRules"/> how
to enter special characters in attribute values.<P/>

Note  especially  that  no  whitespace   is  allowed  between  the  starting
<Q><C>&lt;</C></Q> character  and the  element name.  The quotes  around an
attribute value cannot be omitted. The  names of elements and attributes are
<E>case sensitive</E>.
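<P/>
The following lines sketch these rules, using the start tag of this
subsection as an example; the variants are written only for illustration.

<Listing Type="Example">
<![CDATA[<Subsection Label="XMLstarttag">     allowed
<Subsection Label = 'XMLstarttag' >  allowed: single quotes, extra whitespace
< Subsection Label="XMLstarttag">    not allowed: whitespace after <
<Subsection Label=XMLstarttag>       not allowed: quotes are missing
<subsection label="XMLstarttag">     different element and attribute names]]>
</Listing>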
</Subsection>

<Subsection Label="XMLendtag">
<Heading>End Tags</Heading>

An  <Q>end  tag</Q>  consists  of the  two  characters  <Q><C>&lt;/</C></Q>
directly  followed   by  the  element   name,  optional  whitespace   and  a
greater-than-character <Q><C>></C></Q>.
</Subsection>

<Subsection Label="XMLcombtag">
<Heading>Combined Tags for Empty Elements</Heading>

Elements  which always  have  empty content  can be  written  with a  single
tag.  This looks  like a  start tag  (see&nbsp;<Ref Subsect="XMLstarttag"/>)
<E>except</E> that  the trailing  greater-than-character <Q><C>></C></Q>
is replaced by the two-character sequence <Q><C>/></C></Q>.
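<P/>
For example, the paragraph separator used throughout this document is such
an empty element. Under the rules above the first two lines below are
equivalent ways to enter it, and the third line shows a combined tag with an
attribute.

<Listing Type="Example">
<![CDATA[<P></P>
<P/>
<Ref Subsect="XMLstarttag"/>]]>
</Listing>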

</Subsection>

<Subsection Label="XMLent">
<Heading>Entities</Heading>

An <Q>entity</Q> in XML is a macro for some substitution text. There are two
types of entities. <P/>

A <Q>character entity</Q> can be used  to specify characters in the encoding
of the document (this can be useful for entering non-ASCII characters which
you cannot manage to type in directly). They are entered with a sequence
<Q><C>&amp;#</C></Q>, directly followed by  either some decimal digits
or an  <Q><C>x</C></Q> and some  hexadecimal digits, directly followed  by a
semicolon <Q><C>;</C></Q>. Using such a character entity is equivalent
to typing the corresponding character directly.<P/>

Then there are references to <Q>named entities</Q>. They are entered with an
ampersand character  <Q><C>&amp;</C></Q> directly followed by  a name which
is directly followed  by a semicolon <Q><C>;</C></Q>. Such  entities must be
declared somewhere by  giving a substitution text. This text  is included in
the document  and the document is  parsed again afterwards. The  exact rules
are a  bit subtle but you  probably want to  use this only in  simple cases.
Predefined entities for &GAPDoc; are described in <Ref Subsect="XMLspchar"/>
and <Ref Subsect="GDent"/>.<P/>
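As a small illustration, both character entities in the first line below
stand for the letter ä (decimal 228 is hexadecimal e4), and the second line
uses the named entity for the name of this package.

<Listing Type="Example">
<![CDATA[K&#228;se and K&#xe4;se are the same word.
This manual describes &GAPDoc;.]]>
</Listing>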

</Subsection>

<Subsection Label="XMLspchar">
<Heading>Special Characters in XML</Heading>

We  have  seen  that  the  less-than-character  <Q><C>&lt;</C></Q>  and  the
ampersand character <Q><C>&amp;</C></Q>  start a tag or  entity reference in
XML.  To  get  these characters  into  the  document  text  one has  to  use
entity references,  namely <Q><C>&amp;lt;</C></Q> to  get <Q><C>&lt;</C></Q>
and   <Q><C>&amp;amp;</C></Q>   to  get   <Q><C>&amp;</C></Q>.   Furthermore
<Q><C>&amp;gt;</C></Q> must be  used to get <Q><C>></C></Q>  when the string
<Q><C>]]&gt;</C></Q> appears in  element content (and not as  delimiter of a
<C>CDATA</C> section explained below).<P/>

Another   possibility  is   to  use   a  <C>CDATA</C>   statement  explained
in&nbsp;<Ref Subsect="XMLcdata"/>.
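<P/>
For instance, to get the text in the second line below into a document, one
can type the first line (the condition itself is made up for this example).

<Listing Type="Example">
<![CDATA[if a &lt; b &amp;&amp; b > c then
if a < b && b > c then]]>
</Listing>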

</Subsection>

<Subsection Label="AttrValRules">
<Heading>Rules for Attribute Values</Heading>

Attribute values can contain entities which are substituted recursively.
However, the substitution must not introduce a &lt; character, except via
the entity <C>&amp;lt;</C> or a character entity (there is no XML parsing
for evaluating the attribute value, just entity substitutions).
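<P/>
A sketch with a hypothetical element <C>Thing</C> and attribute <C>Note</C>
(both names are only chosen for illustration): the first line is allowed,
while the second is not because a raw less-than-character appears in the
attribute value.

<Listing Type="Example">
<![CDATA[<Thing Note="a &lt; b &amp; c"/>
<Thing Note="a < b"/>]]>
</Listing>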
</Subsection>

<Subsection Label="XMLcdata">
<Heading><C>CDATA</C></Heading>

Pieces   of   text   which   contain    many   characters   which   can   be
misinterpreted  as  markup  can  be  enclosed  by  the  character  sequences
<Q><C><![CDATA[<![CDATA[]]></C></Q>  and  <Q><C>]]&gt;</C></Q>.  Everything
between these sequences is considered as  content of the document and is not
further interpreted  as XML  text. All  the rules explained  so far  in this
section  do <E>not  apply</E>  to such  a  part of  the  document. The  only
document  content which  cannot be  entered directly  inside a  <C>CDATA</C>
statement  is the  sequence <Q><C>]]&gt;</C></Q>.  This can  be entered  as
<Q><C>]]&amp;gt;</C></Q> outside the <C>CDATA</C> statement.

<Listing Type="Example">
A nesting of tags like <![CDATA[<a> <b> </a> </b>]]> is not allowed.
</Listing>
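<P/>
As a further sketch, the first line below shows one way to enter the made-up
text in the second line, which contains the critical sequence: the
<C>CDATA</C> statement is closed right after the two brackets and the
greater-than-character is entered as an entity.

<Listing Type="Example">
&lt;![CDATA[a[b[0]]]]&gt;&amp;gt; 0
a[b[0]]&gt; 0
</Listing>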

</Subsection>

<Subsection Label="XMLenc">
<Heading>Encoding of an XML Document</Heading>

We suggest using the UTF-8 encoding for writing &GAPDoc; XML documents.
But  the tools  described in  Chapter <Ref  Chap="ch:conv" />  also work
with  ASCII or  the  various ISO-8859-X  encodings  (ISO-8859-1 is  also
called latin1  and covers most  special characters for  western European
languages).

</Subsection>

<Subsection Label="XMLvalid">
<Heading>Well Formed and Valid XML Documents</Heading>

We want to mention  two further important words which are  often used in the
context of XML  documents. A piece of text becomes  a <Q>well formed</Q> XML
document if  all the formal rules  described in this section  are fulfilled.
<P/>

But  this  says  nothing  about  the   content  of  the  document.  To  give
this  content  a  meaning  one  needs  a  declaration  of  the  element  and
corresponding  attribute  names as  well  as  of  named entities  which  are
allowed. Furthermore there may be restrictions on how such elements can be
nested. This <E>definition of an XML  based markup language</E> is done in a
<Q>document  type  definition</Q>.  An  XML  document  which  contains  only
elements and entities declared in such  a document type definition and obeys
the rules given there is called <Q>valid (with respect to this document type
definition)</Q>.<P/>

The main  file of the  &GAPDoc; package is <F>gapdoc.dtd</F>.  This contains
such a  definition of  a markup language.  We are not  going to  explain the
formal syntax  rules for document type  definitions in this section.  But in
Chapter&nbsp;<Ref Chap="DTD"/> we will explain enough about it to understand
the file <F>gapdoc.dtd</F> and so the markup language defined there.
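<P/>
To illustrate the difference with a made-up fragment: the first line below
is not well formed because the tags are not properly nested. The second line
is well formed, but it can only be valid with respect to a document type
definition which declares elements called <C>a</C> and <C>b</C> and allows
this nesting.

<Listing Type="Example">
<![CDATA[<a> <b> </a> </b>
<a> <b> </b> </a>]]>
</Listing>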

</Subsection>
</Section>

<Section Label="EnterGD">
<Heading>Entering &GAPDoc; Documents</Heading>

Here are some additional rules for writing &GAPDoc; XML documents.

<Subsection Label="otherspecchar">
<Heading>Other Special Characters</Heading>
As &GAPDoc; documents are used to produce  &LaTeX; and HTML
documents, the question arises of how to deal with characters with a
special meaning for other applications (for example
<Q><C>&amp;</C></Q>,
<Q><C>#</C></Q>,
<Q><C>$</C></Q>,
<Q><C>%</C></Q>,
<Q><C>~</C></Q>,
<Q><C>\</C></Q>,
<Q><C>{</C></Q>,
<Q><C>}</C></Q>,
<Q><C>_</C></Q>,
<Q><C>^</C></Q>,
<Q><C>&nbsp;</C></Q> (this is a non-breakable space,
<Q><C>~</C></Q> in &LaTeX;) have a special meaning for &LaTeX; and
<Q><C>&amp;</C></Q>,
<Q><C>&lt;</C></Q>,
<Q><C>></C></Q> have a special meaning for HTML (and XML)).
In &GAPDoc; you can usually just type these characters directly; it is
the task of the converter programs which translate to some output format
to take care of such special characters. The exceptions to this simple
rule are:
<List>
<Item>
&amp; and &lt; must be entered as <C>&amp;amp;</C> and
<C>&amp;lt;</C> as explained in <Ref Subsect="XMLspchar"/>.
</Item>
<Item>The content of the &GAPDoc; elements <C>&lt;M></C>,
<C>&lt;Math></C> and <C>&lt;Display></C> is &LaTeX; code,
see <Ref  Sect="MathForm"/>.</Item>
<Item>The content of an <C>&lt;Alt></C> element with <C>Only</C>
attribute contains code for the specified output type, see
<Ref Subsect="Alt"/>.</Item>
</List>

Remark: In former versions of &GAPDoc; one had to use particular
entities for all the special characters mentioned above
(<C>&amp;tamp;</C>, <C>&amp;hash;</C>,
<C>&amp;dollar;</C>, <C>&amp;percent;</C>, <C>&amp;tilde;</C>,
<C>&amp;bslash;</C>, <C>&amp;obrace;</C>, <C>&amp;cbrace;</C>,
<C>&amp;uscore;</C>, <C>&amp;circum;</C>, <C>&amp;tlt;</C>, <C>&amp;tgt;</C>).
These are no longer needed, but they are still defined for backwards
compatibility with older &GAPDoc; documents.
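<P/>
For instance, outside of the elements listed as exceptions above, the
made-up sentence in the second line below can be typed as shown in the
first line; only the characters &amp; and &lt; need entities.

<Listing Type="Example">
<![CDATA[Costs: 50% of $10 for item #3 if a_1 &lt; b^2 &amp; x > 0.
Costs: 50% of $10 for item #3 if a_1 < b^2 & x > 0.]]>
</Listing>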

</Subsection>

<Subsection Label="GDformulae">
<Heading>Mathematical Formulae</Heading>

Mathematical formulae in &GAPDoc; are typed  as in &LaTeX;. They must be
the content  of one of three  types of &GAPDoc; elements  concerned with
mathematical  formulae:  <Q><C>Math</C></Q>, <Q><C>Display</C></Q>,  and
<Q><C>M</C></Q>  (see Sections&nbsp;<Ref  Subsect="Math"/> and&nbsp;<Ref
Subsect="M"/> for more  details). The first two  correspond to &LaTeX;'s
math mode and display  math mode. The last one is a  special form of the
<Q><C>Math</C></Q> element  type, that  imposes certain  restrictions on
the content. On the other hand the content of an <Q><C>M</C></Q> element
is processed in a well defined way for text terminal or HTML output. The
<Q><C>Display</C></Q>  element  also  has  an attribute  such  that  its
content is processed as in <Q><C>M</C></Q> elements.<P/>

Note that the content of these elements is &LaTeX; code, but
the special  characters
<Q><C>&lt;</C></Q>  and <Q><C>&amp;</C></Q>  for XML  must be  entered via
the  entities described  in&nbsp;<Ref  Subsect="XMLspchar"/> or  by using  a
<C>CDATA</C> statement, see&nbsp;<Ref Subsect="XMLcdata"/>.<P/>
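As a sketch, the following two lines show both ways of entering the same
made-up formula in a <C>Math</C> element, first with an entity and then
with a <C>CDATA</C> statement.

<Listing Type="Example">
&lt;Math>\sum_{i &amp;lt; n} x_i&lt;/Math>
&lt;Math>&lt;![CDATA[\sum_{i &lt; n} x_i]]&gt;&lt;/Math>
</Listing>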

</Subsection>

<Subsection Label="GDent">
<Heading>More Entities</Heading>

In &GAPDoc; there are some more predefined  entities:

<Table Align="|l|l|">
<Caption>Predefined Entities in the &GAPDoc; system</Caption>
<HorLine/>
<Row> <Item><C>&amp;GAP;</C></Item>       <Item>&GAP;</Item> </Row>
<HorLine/>
<Row> <Item><C>&amp;GAPDoc;</C></Item>    <Item>&GAPDoc;</Item> </Row>
<HorLine/>
<Row> <Item><C>&amp;TeX;</C></Item>       <Item>&TeX;</Item> </Row>
<HorLine/>
<Row> <Item><C>&amp;LaTeX;</C></Item>     <Item>&LaTeX;</Item> </Row>
<HorLine/>
<Row> <Item><C>&amp;BibTeX;</C></Item>    <Item>&BibTeX;</Item> </Row>
<HorLine/>
<Row> <Item><C>&amp;MeatAxe;</C></Item>   <Item>&MeatAxe;</Item> </Row>
<HorLine/>
<Row> <Item><C>&amp;XGAP;</C></Item>      <Item>&XGAP;</Item> </Row>
<HorLine/>
<Row> <Item><C>&amp;copyright;</C></Item> <Item>&copyright;</Item> </Row>
<HorLine/>
<Row> <Item><C>&amp;nbsp;</C></Item> <Item><Q>&nbsp;</Q></Item> </Row>
<HorLine/>
<Row> <Item><C>&amp;ndash;</C></Item> <Item>&ndash;</Item> </Row>
<HorLine/>
</Table>

Here <C>&amp;nbsp;</C> is a non-breakable space character.
<P/>

Additional entities are defined for some mathematical symbols, see <Ref
Sect="MathForm"/> for more details.
<P/>
One can define  further  local entities right inside  the head (see&nbsp;<Ref
Subsect="XMLhead"/>) of a &GAPDoc; XML document as in the following example.

<Listing Type="Example">
<![CDATA[<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE Book SYSTEM "gapdoc.dtd"
  [ <!ENTITY MyEntity "some longish <E>text</E> possibly with markup">
  ]>]]>
</Listing>

These additional definitions go into  the <C>&lt;!DOCTYPE</C> tag in square
brackets. Such new entities are used like this: <C>&amp;MyEntity;</C> <P/>

</Subsection>

</Section>
</Chapter>