This document is a tutorial guide to some new features of the ODD language introduced at
release 3.0 of TEI P5. It assumes the reader already knows something about how ODD is
designed and used, and presents only those aspects which have changed with the introduction
of Pure ODD. For discussion and background information about the
motivation for these changes see
Two major changes are described. Firstly, the content model of an element
specification (that is, the content of the content element in an elementSpec) is
now expressed using some new TEI elements, rather than (as previously) expressions in the
RELAX NG syntax. Secondly, TEI-defined data specifications are now expressed using a
dataSpec element rather than by a macro specification (macroSpec of
type="dt"), and a new dataRef element is used to select one as content for the
datatype element which defines the datatype of an attribute.
Defining a content model
In Pure ODD, we use the content element to describe the intended content of an
element.
If the element concerned is empty, that is, it has no content at all, but only
attributes, then the content element itself is empty:
an element which may not contain anything
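As a minimal sketch of what such a specification might look like, assuming a hypothetical element identifier emptyExample (the name is ours, chosen only for illustration):

```xml
<!-- "emptyExample" is a hypothetical identifier used only for illustration -->
<elementSpec ident="emptyExample">
  <desc>an element which may not contain anything</desc>
  <!-- an empty content element: no child elements and no text are permitted -->
  <content/>
</elementSpec>
```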
If the element concerned contains text, we use the special element textNode:
an element which may contain only textual data
A text node may be of any length, including zero.
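A content model of this kind could be sketched as follows; only the content element is shown, and the surrounding elementSpec is assumed:

```xml
<content>
  <!-- textNode stands for character data of any length, including none -->
  <textNode/>
</content>
```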
More usually, an element has what is known as element content. In this case,
the content element will contain references to one or more other elements, each
represented by an elementRef element. If there is only one such child element it
can be given directly:
an element which may contain only a one element
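A sketch of such a content model, assuming that an element named one is specified elsewhere in the same ODD:

```xml
<content>
  <!-- the key attribute names the element being referenced -->
  <elementRef key="one"/>
</content>
```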
The attributes minOccurs and maxOccurs are used to indicate
repetition. In the following example, we define an element which must contain two or more
occurrences of the one element:
an element which may contain two or more one elements
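One way this might be written, as a sketch which assumes that the value unbounded is used for maxOccurs to indicate that there is no upper limit:

```xml
<content>
  <!-- at least two occurrences of one, with no upper limit -->
  <elementRef key="one" minOccurs="2" maxOccurs="unbounded"/>
</content>
```

When either attribute is omitted, its value is taken to be 1, so an elementRef with neither attribute means exactly one occurrence.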
In some unusual circumstances (for example, defining the content of an element such as the TEI's egXML element)
it may be necessary to say that a content model should permit any element at all, or any element from one or more specific namespaces. The anyElement
element is provided for this purpose:
an element which may contain one or more elements not taken from the TEI namespace
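A possible rendering of this model is sketched below. It assumes that the except attribute of anyElement lists the namespaces to be excluded, and that the TEI namespace is identified by its usual URI:

```xml
<content>
  <!-- one or more elements from any namespace other than the TEI namespace -->
  <anyElement except="http://www.tei-c.org/ns/1.0"
              minOccurs="1" maxOccurs="unbounded"/>
</content>
```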
Grouping elements
An element may contain references to more than one different element. These elements
may be grouped in one of three ways: as a sequence, as an alternation, or interleaved.
In a sequence, all of the child elements must follow each other in the same order as
they appear within the content element:
an element which may contain a one element followed by a two
element
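A sequence of this kind might be sketched as follows, again showing only the content element:

```xml
<content>
  <sequence>
    <!-- one must come first, immediately followed by two -->
    <elementRef key="one"/>
    <elementRef key="two"/>
  </sequence>
</content>
```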
In an alternation, any of the child elements may appear:
an element which contains either a one element or a
two element
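The corresponding alternation could be sketched in the same way:

```xml
<content>
  <alternate>
    <!-- exactly one of the two references is satisfied -->
    <elementRef key="one"/>
    <elementRef key="two"/>
  </alternate>
</content>
```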
In an interleaved model, the child elements may appear in any order:
an element which contains either a one element followed by a
two element, or a two element followed by a one element.
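A sketch of the interleaved form follows. It assumes that the interleave element named in this tutorial takes elementRef children in the same way as the other grouping elements; its availability should be checked against the TEI release you are targeting:

```xml
<content>
  <!-- assumption: interleave is used here as described in the text above -->
  <interleave>
    <!-- both children are required, but in either order -->
    <elementRef key="one"/>
    <elementRef key="two"/>
  </interleave>
</content>
```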
Not all target schema languages support the concept of interleaving. An ODD processor
may map specifications using the interleave element to a less precise construct
in the target language, or to a combination of constructs in different constraint
languages.
The attributes minOccurs and maxOccurs are also used on these
grouping elements to indicate repetition of the group. For example:
an element which contains up to three repetitions of pairs of elements, each
containing a one followed by a two
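A repeated group of this kind might be sketched like this, with the occurrence indicator placed on the group rather than on its members:

```xml
<content>
  <!-- the pair (one, two) may occur once, twice, or three times -->
  <sequence maxOccurs="3">
    <elementRef key="one"/>
    <elementRef key="two"/>
  </sequence>
</content>
```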
When alternations are repeated, any one of the child elements may appear any number of
times:
an element which contains one or more one or two
elements
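A repeated alternation could be sketched as follows, assuming unbounded as the value meaning "no upper limit":

```xml
<content>
  <!-- each repetition chooses freely between one and two -->
  <alternate maxOccurs="unbounded">
    <elementRef key="one"/>
    <elementRef key="two"/>
  </alternate>
</content>
```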
Occurrence indicators may be given at both levels. For example:
an element which contains up to three repetitions of two or more one
elements followed by a two element
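Combining indicators at both levels might look like the following sketch:

```xml
<content>
  <!-- outer indicator: the whole group occurs between one and three times -->
  <sequence maxOccurs="3">
    <!-- inner indicator: within each group, two or more one elements -->
    <elementRef key="one" minOccurs="2" maxOccurs="unbounded"/>
    <elementRef key="two"/>
  </sequence>
</content>
```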
Sequences, alternations, and interleaved sequences may all be nested and combined as
necessary, permitting quite complex structures to be expressed.
Mixed content
An element may have what is known as mixed content, meaning that it may
contain a mixture of text fragments and some specified elements. This can be represented
in Pure ODD using the alternate element:
an element which contains any combination of text nodes, one
elements, and two elements
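A mixed content model of this kind could be sketched as a repeatable, optional alternation which includes a text node among its choices:

```xml
<content>
  <!-- zero or more of: text, one, or two, in any combination -->
  <alternate minOccurs="0" maxOccurs="unbounded">
    <textNode/>
    <elementRef key="one"/>
    <elementRef key="two"/>
  </alternate>
</content>
```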
It may also, more economically, be represented using interleave:
an element which contains any combination of text nodes, one
elements, and two elements
A text node may appear anywhere within an alternation or a sequence. For example:
an element which may contain either two or more one elements, or a
text node
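Such a model might be sketched as follows. Note that the text node here is an alternative to the elements rather than something mixed in among them:

```xml
<content>
  <alternate>
    <!-- either at least two one elements ... -->
    <elementRef key="one" minOccurs="2" maxOccurs="unbounded"/>
    <!-- ... or text only -->
    <textNode/>
  </alternate>
</content>
```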
However, not all current schema languages support such content models.
Class references
A classRef element can be used within a content model in the same way as an
elementRef. It is a shorthand way of saying that any member, or all members,
of a named model class of elements is permitted. For example:
an element which contains a sequence of up to three elements which are members
of the model.digital class
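A sketch of such a content model is given below. It assumes that a class called model.digital has been specified elsewhere in the ODD, and places the repetition on an enclosing sequence; this is one of several ways the same model might be written:

```xml
<content>
  <!-- up to three elements, each drawn from the members of model.digital -->
  <sequence maxOccurs="3">
    <classRef key="model.digital"/>
  </sequence>
</content>
```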
The classRef element here is understood to mean any one member of the class.
Its attribute expand can be used to specify other meanings. For example, supposing
that the class model.digital has members dig1, dig2, and
dig3, a reference to the class can have the following meanings:
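Value of expand | Expansion of classRef
alternate [default] | any one of dig1, dig2, or dig3
sequence | dig1, dig2, and dig3, in that order, each exactly once
sequenceOptional | dig1, dig2, and dig3, in that order, each optional
sequenceOptionalRepeatable | dig1, dig2, and dig3, in that order, each occurring zero or more times
sequenceRepeatable | dig1, dig2, and dig3, in that order, each occurring one or more times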
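To illustrate one of these values, the two fragments below are intended to be equivalent. This is a sketch which assumes that sequenceOptional means each member of the class may appear at most once, in the order dig1, dig2, dig3:

```xml
<!-- a class reference with an explicit expansion ... -->
<content>
  <classRef key="model.digital" expand="sequenceOptional"/>
</content>

<!-- ... and the content model it is assumed to stand for -->
<content>
  <sequence>
    <elementRef key="dig1" minOccurs="0"/>
    <elementRef key="dig2" minOccurs="0"/>
    <elementRef key="dig3" minOccurs="0"/>
  </sequence>
</content>
```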
A model class contains a predefined set of element types which we wish to manipulate or
reference together, usually because they can all appear in the same context, or share
other properties. Pure ODD allows us to define such classes by means of a
classSpec element. Once defined, members of that class can be referenced by
means of a classRef element, as we have just seen. An element specification
(not discussed here) includes a specification of the classes of which it is a member in
its classes element.
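The relationship between a class and its members might be sketched as follows. The description of the class and the content model given for dig1 are assumptions made only to complete the example:

```xml
<!-- the class itself: type="model" marks it as a model class -->
<classSpec type="model" ident="model.digital">
  <desc>groups the elements dig1, dig2, and dig3 (illustrative description)</desc>
</classSpec>

<!-- one member of the class; dig2 and dig3 would be declared similarly -->
<elementSpec ident="dig1">
  <desc>a member of the model.digital class</desc>
  <classes>
    <memberOf key="model.digital"/>
  </classes>
  <!-- assumed content model, chosen only for illustration -->
  <content>
    <textNode/>
  </content>
</elementSpec>
```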
Datatypes
A further part of the specification for an element is the list of attributes it may bear,
provided by an attList element. The specification for an attribute (provided by
an attDef element) also includes information about the kinds of value it is
permitted to take, for example, whether it is a date or an integer. We call this its
datatype.
In Pure ODD, a datatype is specified using the dataRef element. In the following
example we define an attribute called count on the element one, and
specify that its value must be a non-negative integer:
an empty element with an attribute
indicates how often the one element is used
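A specification along these lines might be sketched as follows, using the names given above:

```xml
<elementSpec ident="one">
  <desc>an empty element with an attribute</desc>
  <content/>
  <attList>
    <attDef ident="count">
      <desc>indicates how often the one element is used</desc>
      <datatype>
        <!-- name (not key) points to a datatype defined outside the ODD -->
        <dataRef name="nonNegativeInteger"/>
      </datatype>
    </attDef>
  </attList>
</elementSpec>
```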
The name nonNegativeInteger
identifies one of the datatypes defined by the
W3C as part of its schema language, and is not further defined by our ODD.
It is however possible to provide a more detailed specification for a datatype using the
dataSpec element to document its values, intended uses, etc. The TEI defines
many such datatypes in the Guidelines, and these can also be re-used directly within your
ODD. The attribute key is used (rather than name) to indicate that a
TEI-defined or locally-defined datatype is intended, as in the following example:
indicates how often the one element is used
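The attribute definition might then be sketched like this; only the attDef is shown, and its enclosing attList and elementSpec are assumed:

```xml
<attDef ident="count">
  <desc>indicates how often the one element is used</desc>
  <datatype>
    <!-- key points to a dataSpec, here one supplied by the TEI -->
    <dataRef key="teidata.count"/>
  </datatype>
</attDef>
```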
In this example, the name teidata.count
identifies a TEI datatype specification.
That specification is provided by a dataSpec element with the identifier
teidata.count, which is provided as part of the TEI system. A TEI
dataSpec is similar to an elementSpec. Its content element
may contain a dataRef element or a valList, or a number of such elements
combined using the elements alternate or sequence in the same way as
elementRefs are combined.
For example, if we wish to say that the value of the attribute count can be
either a non-negative integer or the string unknown we would first define a
dataSpec element with appropriate content:
permits non-negative integers or the string unknown
or equivalent
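Such a data specification might be sketched as follows. The identifier data.countOrUnknown is hypothetical, and the use of a closed value list is our assumption:

```xml
<!-- "data.countOrUnknown" is a hypothetical identifier -->
<dataSpec ident="data.countOrUnknown">
  <desc>permits non-negative integers or the string unknown</desc>
  <content>
    <alternate>
      <!-- either a W3C non-negative integer ... -->
      <dataRef name="nonNegativeInteger"/>
      <!-- ... or the literal value listed here (closed list assumed) -->
      <valList type="closed">
        <valItem ident="unknown"/>
      </valList>
    </alternate>
  </content>
</dataSpec>
```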
As noted above, the key attribute on dataRef can then be used to
refer to this data specification:
may indicate how often the one element is used if we know
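A sketch of the reference follows, in which data.countOrUnknown is a hypothetical identifier standing in for whatever name the data specification was given:

```xml
<attDef ident="count">
  <desc>may indicate how often the one element is used if we know</desc>
  <datatype>
    <!-- hypothetical identifier of a locally defined dataSpec -->
    <dataRef key="data.countOrUnknown"/>
  </datatype>
</attDef>
```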
A dataRef element can also be used to define the content of an element. For
example, there is a TEI data specification called teidata.xpath which can
be used to indicate that the value of an attribute must be a conformant XPath
specification:
indicates an XPath to the nodes required
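An attribute constrained in this way might be sketched as follows; the attribute identifier select is hypothetical:

```xml
<!-- "select" is a hypothetical attribute name -->
<attDef ident="select">
  <desc>indicates an XPath to the nodes required</desc>
  <datatype>
    <dataRef key="teidata.xpath"/>
  </datatype>
</attDef>
```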
The same dataRef might also be used to indicate that the content of an element
must be a conformant XPath specification:
indicates an XPath to the nodes required
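For element content, the same reference would sit directly inside the content element, as in this sketch (the enclosing elementSpec is assumed):

```xml
<content>
  <!-- the text of the element must satisfy the teidata.xpath datatype -->
  <dataRef key="teidata.xpath"/>
</content>
```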
Note however that not all schema languages support the ability to constrain element content in
this way.
Macro specifications
A small number of comparatively complex content models are frequently used by other TEI specifications. Rather than define
them afresh each time, it is convenient to reference a macro, the value of which contains their definition. For example,
the following code defines a macro called macro.xText:
a content model alternating any gaiji-like element and plain text
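A macro of this kind might be sketched as follows. It assumes that the gaiji-like elements are those belonging to the TEI class model.gLike, and that the alternation may repeat freely:

```xml
<macroSpec ident="macro.xText">
  <desc>a content model alternating any gaiji-like element and plain text</desc>
  <content>
    <!-- any mixture of text and members of model.gLike (assumed class) -->
    <alternate minOccurs="0" maxOccurs="unbounded">
      <textNode/>
      <classRef key="model.gLike"/>
    </alternate>
  </content>
</macroSpec>
```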
To use this declaration, a content model can simply reference it by means of a macroRef:
an element containing any gaiji-like element and plain text
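The reference itself could be sketched like this, with the enclosing elementSpec assumed:

```xml
<content>
  <!-- stands for the content model defined by the macro -->
  <macroRef key="macro.xText"/>
</content>
```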