Implementing These Tag Sets

Modular DTD Design

The Book Tag Set, the Collection Tag Set, and the full Archiving and Interchange Suite have been written as a set of XML DTD “modules”, each of which is a separate physical file. No module is an entire DTD by itself, but these modules can be combined into a number of different tag sets. Modules are primarily intended for maintenance and creation of new tag sets; all the elements of the same “type” (class) are stored together.

The Book Tag Set and Collection Tag Set have been built from these modules. Both Tag Sets, along with individual modules from the Archiving and Interchange Suite, are described in this Tag Library.

The major disadvantage of a modular system is the longer learning curve, since it may not be immediately obvious where within the system to find a particular element or attribute cluster. To help with this, each element page includes an expanded content model and also names the module in which that element is defined.

There are many advantages to such a modular approach. The smaller units are written once, maintained in one place, and used in many different tag sets. This makes it much easier to keep lower level structures consistent across document types, while allowing for any real differences that analysis identifies. A tag set for a new function (such as an authoring tag set) or a new publication type (such as a book) can be built quickly, since most of the necessary components will already be defined in the Suite. Editorial and production personnel can bring the experience gained on one tagging project directly to the next with very little loss or retraining. Customized software (including authoring, typesetting, and electronic display tools) can be written once, shared among projects, and modified only for real distinctions.

Modules in the Suite are primarily intended to group elements for maintenance. There are different kinds of modules. A module may:

Learn the DTD Structure

If you want to learn about the Tag Set in order to write a new tag set based on this Tag Set or to modify either of these Tag Sets:

How To Make New Tag Sets

Parameter Entities Modules to Customize and Change

Parameter entities are the major mechanism for customizing a Tag Set or creating a new Tag Set from the modules in the Suite. Individual Tag Sets will be constructed by 1) establishing element and attribute combinations and content models using parameter entities in one of the Tag-Set-specific customizing modules and 2) choosing appropriate modules from the Suite that declare the elements needed. For example, if the base tag set contained 6 kinds of lists and 2 table models, a more specific tag set, such as an authoring tag set, might use a Customize Classes Module to redefine the List Class to name only 3 lists and redefine the Display Class to allow only one table model.
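The restriction described above might look like the following minimal sketch. The entity name display.class and every element name other than def-list and list are assumptions made for illustration; they are not taken from the Suite.

```dtd
<!-- Hypothetical Customize Classes Module for an authoring tag set.
     Element names other than def-list and list are invented, and
     the entity name display.class is assumed. -->

<!-- Keep only three of the base tag set's six kinds of list. -->
<!ENTITY % list.class     "def-list | list | simple-list">

<!-- Keep only one of the base tag set's two table models. -->
<!ENTITY % display.class  "fig | graphic | table-wrap">
```

These declarations take effect only if the customizing module is read before the module holding the default declarations, because an XML parser treats the first declaration of an entity as binding and ignores later ones.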

The standard modules to create a customized tag set are: the DTD itself, a module to name its components, and as many override modules and new elements modules as necessary. Typical modules for a new tag set are:

Element Classes Concept

Many of the elements in the Book Tag Set have been grouped into loose element classes. There is no hard and fast rule for what constitutes a class; each one is a design decision, a matter of judgment. These classes are designed to ease customization to meet the particular needs of new tag sets. Base classes for the Archiving and Interchange Suite are defined in a separate Default Element Classes Module (%default-classes.ent;).

Content models are built from sequences of elements and from OR groups, which are typically classes or mixes. As an example, the content model for a Paragraph element is declared to be an OR group (that is, a choice) of data characters and any of the elements named in the Paragraph Elements mix. The mix %p-elements; is declared to be a large OR group of many other element-defining classes: the Block Display Class Elements, the Mathematical Expressions Class Elements, the List Class Elements, the Citation Class Elements, and others.
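As a rough sketch of how such a paragraph model could be assembled: the entity names math.class and citation.class and all class members shown here are assumptions for illustration, and the real classes in the Suite name many more elements.

```dtd
<!-- Classes: OR groups of element names (members are illustrative) -->
<!ENTITY % block-display.class "fig | table-wrap">
<!ENTITY % math.class          "disp-formula | inline-formula">
<!ENTITY % list.class          "def-list | list">
<!ENTITY % citation.class      "element-citation | mixed-citation">

<!-- Mix: an OR group of classes. The leading OR bar lets the mix
     follow #PCDATA inside a mixed content model. -->
<!ENTITY % p-elements
  "| %block-display.class; | %math.class; | %list.class; | %citation.class;">

<!-- Content model: character data or any element named in the mix -->
<!ELEMENT p (#PCDATA %p-elements;)* >
```

Because the parameter-entity references sit inside declarations, a fragment like this must live in an external DTD module rather than in a document's internal subset.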

These element classes can be viewed as building blocks for the larger parameter entities that define element mixes. (Note: A mix describes a usage circumstance for a group of elements, such as all the paragraph-level elements, all the elements allowed inside a table cell, all the elements inside a paragraph, or all the inline elements.) For example, to add another block display item to the Block Display Class Elements, you would do two things: edit the %block-display.class; parameter entity in the Tag-Set-specific Book Class override Module, which overrides the default parameter entity in the Suite’s Default Element Classes Module, and create a new module containing the Element Declaration of the new block display item.
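A minimal sketch of those two steps follows. The element pull-quote is hypothetical, and fig and table-wrap merely stand in for the real default members of the class, which are not listed in this section.

```dtd
<!-- Step 1. In the Tag-Set-specific class override module, which is
     read before the Suite's default classes module: restate the
     class with one additional, hypothetical member. -->
<!ENTITY % block-display.class "fig | table-wrap | pull-quote">

<!-- Step 2. In a new element module, called later from the DTD:
     declare the new element itself. -->
<!ELEMENT pull-quote (#PCDATA)* >
```

The override must repeat every default member that is still wanted, since the later default declaration of %block-display.class; is ignored entirely once an earlier declaration exists.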

Parameter Entity Names for Classes and Mixes

PARAMETER ENTITY: SAME FUNCTION, SAME NAME — The Suite modules and initial Tag Sets have used a series of parameter entity naming conventions consistently. While parsing software cannot enforce these parameter entity naming or usage conventions, these conventions can make it much easier for a person to know how the content models work and what must be modified to make a Tag Set change.

CLASSES — Classes are functional groupings of elements used together in an OR group. Each class is named with a parameter entity, and all class parameter entity names end in the suffix “.class”:

 <!ENTITY % list.class "def-list | list">

A class, by definition, should never be made empty; the class should be removed from all models where you do not want the class elements included.

MIXES — Mixes are functional OR groups of classes; mixes should never contain element names directly. All mixes must be declared after all classes, since mixes are composed of classes. Mix names have no set suffix; for example, they may end in “-mix” or “-elements”. Content models and content model overrides use mixes and classes for all OR groups. Only content model sequences are made up of element names directly.

MODEL OVERRIDES — parameter entity mixes for overriding a content model are of two styles: 1) inline mixes and 2) full content model replacements. These two groupings have been defined and named separately to preserve the mixed-content or element-content nature of the models in DTDs derived from the Suite.

The inline parameter entities to be intermingled with character data (#PCDATA) in a mixed content model are named with a suffix “-elements”. For example, “%institution-elements;” would be used in the content model for the element <institution>:

 <!ENTITY % institution-elements "| %subsup.class;" >
 <!ELEMENT  institution (#PCDATA %institution-elements;)* >

All inline mixes begin with an OR bar, so that the mix can be removed leaving just character data (#PCDATA):

 <!ENTITY % rendition-plus "| %emphasis.class;  | %subsup.class;" >
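To illustrate why the leading OR bar matters, the following sketch removes the inline mix from <institution>. It assumes the empty declaration is placed in a customization module that is read before the default one.

```dtd
<!-- Override, read before the default declaration: an empty mix -->
<!ENTITY % institution-elements "" >

<!-- Unchanged element declaration from the Suite -->
<!ELEMENT  institution (#PCDATA %institution-elements;)* >

<!-- After expansion the model is equivalent to (#PCDATA)* -->
```

Had the OR bar been written in the element declaration instead of in the mix, emptying the mix would leave a dangling bar and an invalid content model.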

The override of a complete content model will be named with a suffix “-model” and should include the entire content model, including the enclosing parentheses:

 <!ENTITY % kwd-group-model "(title?, (%kwd.class;)+ )" >
 <!ELEMENT  kwd-group %kwd-group-model; >
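A full-model override could be sketched as follows. The stricter model shown is hypothetical; it assumes that the element named by the keyword class is <kwd> and that the override is read before the default declaration.

```dtd
<!-- Hypothetical override in a customization module: the title
     becomes required and only <kwd> elements may follow it. -->
<!ENTITY % kwd-group-model "(title, kwd+)" >

<!-- Unchanged element declaration from the Suite. It supplies no
     parentheses of its own, so the entity must carry them. -->
<!ELEMENT  kwd-group %kwd-group-model; >
```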
 

Subsidiary section:

How To Build a New Custom Tag Set

Subsidiary section:

Tag Set and Suite Naming Conventions

Modules in These DTDs

The Book Tag Set and Collection Tag Set were written as customizations of the Archiving and Interchange Suite. The basic Suite has modules for defining tables, paragraphs, etc. The Book Tag Set (book.dtd) and its customization modules define the elements for a monograph or book and also call in the Suite modules. In contrast, the Collection Tag Set (bookcollection.dtd) and its customization modules define a group of books, where the intent is to give information about the collection and list its members, but not to provide the actual content of any member book.

The Book Tag Set expressed as a DTD consists of the Book Tag Set module itself (book.dtd), the four book customization modules that are used to override element and attribute declarations in the Suite (%bookcustom-modules.ent;, %bookcustom-classes.ent;, %bookcustom-mixes.ent;, %bookcustom-models.ent;), and the four modules that add new elements and attributes over and above what the Suite provides (%bookmeta.ent;, %bookpart.ent;, %bookimagemap.ent;, and %bookmultilink.ent;).

Book DTD

(File name book.dtd) The top-level Book Tag Set Module that declares the document element (<book>) and the other top-level elements that define the primary components of a book (book metadata, book front matter, body, and back matter). The DTD invokes all the modules it uses, by reference, as external parameter entities, in this order: first, the NCBI Book Tag Set Module of Modules, to name all Book-specific customized modules; second, the Suite Module of Modules, to name all the potential modules from the Suite; third, the customized and default modules for the parameter entities naming element classes, mixes, and models; fourth, the Common Module for shared elements and attribute lists; and finally, as needed, the other Suite element modules and the four new Book-specific element modules, %bookmeta.ent;, %bookpart.ent;, %bookimagemap.ent;, and %bookmultilink.ent;.
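The call order can be pictured with the following sketch of a driver file. The parameter entity names are those used in this Tag Library, but the system identifiers (file names) are assumptions, and the real book.dtd calls more element modules than are shown here.

```dtd
<!-- Sketch of the call order in a driver file like book.dtd.
     System identifiers (file names) are assumptions. -->

<!-- 1. Book Module of Modules: declared and called in the DTD -->
<!ENTITY % bookcustom-modules.ent SYSTEM "bookcustom-modules.ent">
%bookcustom-modules.ent;

<!-- 2. Suite Module of Modules: likewise declared and called here -->
<!ENTITY % modules.ent SYSTEM "modules.ent">
%modules.ent;

<!-- 3. Classes: overrides first, then defaults -->
%bookcustom-classes.ent;
%default-classes.ent;

<!-- 4. Mixes: overrides first, then defaults -->
%bookcustom-mixes.ent;
%default-mixes.ent;

<!-- 5. Content model and attribute overrides -->
%bookcustom-models.ent;

<!-- 6. Common Module, then the other element modules as needed -->
%common.ent;
%list.ent;
%para.ent;
%bookmeta.ent;
%bookpart.ent;
%bookimagemap.ent;
%bookmultilink.ent;
```

Each override module precedes the default module it overrides because an XML parser honors only the first declaration of a given entity.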

Book override Modules

The NCBI Book Tag Set customization modules override the definitions in the modules of the Suite:

Book-Specific Module to Name Modules

(parameter entity %bookcustom-modules.ent;) Defines all the external modules that are specific to the Collection Tag Set or Book Tag Set (except itself, which must be both named and called inside a DTD). A Tag Set can select from these modules by referencing the module names through external parameter entities. The entities are declared in this module, but referenced (that is, actually called in) in the DTD proper. To include a set of elements (such as all the lists or all the MathML elements), a DTD references the external parameter entity of the module (defined in this module) that contains these declarations.

Note: The Book Tag Set Module of Modules and the Suite Module of Modules need to be the first two external modules called by either the Collection Tag Set or the Book Tag Set. Customization modules for classes, mixes, and models will typically be called following the Book Tag Set Module of Modules and the Suite Module of Modules.

Book-Specific Class overrides Module

(parameter entity %bookcustom-classes.ent;) Sets up parameter entities that will be used to override the Suite’s default classes (those that are described in the %default-classes.ent; module)

Note: This module must be called before the %default-classes.ent; module (which this module overrides) and before the %bookcustom-mixes.ent; and %bookcustom-models.ent; modules (which may build on classes defined in this module).

Book-Specific Mix overrides Module

(parameter entity %bookcustom-mixes.ent;) Sets up parameter entities that will be used to override default mixes (groupings made of “classes”) prescribed by the %default-mixes.ent; module

Note: This module must be called after the Book Customize Classes Module (%bookcustom-classes.ent;) and the Default Classes Module (%default-classes.ent;) but before the %default-mixes.ent; module (which this module overrides) and the %bookcustom-models.ent; module (which may build on mixes defined in this module).

Book-Specific Models/Attributes overrides Module

(parameter entity %bookcustom-models.ent;) Sets up parameter entities that will be used to override default content model parameter entities set elsewhere in the Suite. Also defines customizable attribute Declared Values and attribute lists for the Tag Set being defined.

Note: This module must be called after the Book Tag Set Customize Mixes Module (%bookcustom-mixes.ent;) and the Default Mixes Module (%default-mixes.ent;) but before any “base” modules of the Suite.

Book Element Modules

The new elements added just for the Book Tag Set are defined in the following modules:

Book Tag Set Metadata Module

(parameter entity %bookmeta.ent;) Describes book-specific metadata elements that are not defined in the Suite metadata module %articlemeta.ent;

Book Tag Set Book Part Module

(parameter entity %bookpart.ent;) Declares book-component-level metadata, such as chapter-specific or part-specific metadata elements

Book Tag Set Image Map Module

(parameter entity %bookimagemap.ent;) Declares the elements used to create client-side image maps, which make hot spots on graphics

Book Tag Set Multilink Module

(parameter entity %bookmultilink.ent;) Defines links to external resources (Note: The external and multiple external links defined in this module are used in the Book Tag Set for external links instead of the XLink mechanism. The XLink mechanism, although deprecated, is still specified in all of the Suite modules.)

Suite Setup Modules

The basic Suite modules overridden by the Book-specific customization modules just named include the following:

Suite Module to Name the Modules

(parameter entity %modules.ent;) Defines all the external modules that are part of the modular Archiving and Interchange Suite (except itself, which must be both named and called inside a DTD). A Tag Set selects from these modules by referencing the module names through external parameter entities. The entities are declared in the Suite Module of Modules (%modules.ent;), but referenced (or not) in the DTD proper. To include a set of elements (such as all the book metadata or all the display elements) a DTD references the external parameter entity of the module that contains these declarations.

Note: The Book Tag Set Module of Modules and the Suite Module of Modules need to be the first two external modules called by either the Collection Tag Set or Book Tag Set. Customization modules for classes, mixes, and models will typically be called following the Book Tag Set Module of Modules and the Suite Module of Modules.
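The split between naming a module and calling it can be sketched as follows. The two fragments belong to different files, and the system identifiers are assumptions made for illustration.

```dtd
<!-- Inside the Suite Module of Modules: each module is only NAMED.
     The system identifiers are assumptions for this sketch. -->
<!ENTITY % list.ent  SYSTEM "list.ent">
<!ENTITY % math.ent  SYSTEM "math.ent">

<!-- Inside the DTD proper: a module is included only if referenced. -->
%list.ent;     <!-- the List Class elements are now declared -->

<!-- %math.ent; is never referenced in this sketch, so the
     elements it would declare are not part of the tag set. -->
```

Declaring an external parameter entity without referencing it has no effect on the resulting tag set, which is what allows one Module of Modules to serve several different DTDs.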

Suite Default Element Classes Module

(parameter entity %default-classes.ent;) Sets up the parameter entities that name the element members of each class that will be used to establish the content models

Note: This module must be called before the Book Customize Mixes Module (%bookcustom-mixes.ent;) and the Default Element Mixes Module (%default-mixes.ent;), as well as the Book Customize Models Module, %bookcustom-models.ent; (which builds on those modules).

Suite Default Element Mixes Module

(parameter entity %default-mixes.ent;) Sets up the parameter entities that name mixes (groupings made of “classes”) that will be used to establish the content models

Note: This module must be called before the Book Customize Models Module (%bookcustom-models.ent;) or any “base” module of the Interchange Suite.

Collection Tag Set

The Collection Tag Set, which models a collection list and description, is a small Tag Set that defines a few collection elements and uses all of the book and Suite modules.

Collection Tag Set

(File name bookcollection.dtd) The top-level Collection Tag Set Module that declares the document element (<collection>) and the other top-level elements that define a grouping of related books (collection metadata, book front matter, body, and back matter). All elements except these few, and except the elements needed to flesh out a collection’s metadata (such as <collection-list>), are declared in the modules of the Suite. The DTD invokes all the other modules it uses, by reference, as external parameter entities, in this order: first, the Book Tag Set Module of Modules, to name all Book-specific customized modules; second, the Suite Module of Modules, to name all the potential modules from the Suite; third, the customized and default modules for the parameter entities naming element classes, mixes, and models; fourth, the Common Module for shared elements and attribute lists; and finally, as needed, all the other modules, including the Tag-Set-specific element modules, %bookmeta.ent;, %bookpart.ent;, %bookimagemap.ent;, and %bookmultilink.ent;.

Basic Suite Modules

The modules comprising the rest of the Suite that are used to build both the Book Tag Set and the Collection Tag Set are the following:

Common (Shared) Elements Module

(parameter entity %common.ent;) Declarations for elements, attributes, entities, and notations that are shared by more than one class module

Note: This module must be called before any of the modules comprising the Interchange Suite.

Article Metadata Elements Module

(parameter entity %articlemeta.ent;) Declares the metadata elements (issue elements and article header elements) used to describe a journal article. This module has been incorporated in the Book Tag Set and Collection Tag Set to include metadata elements that, although previously declared to model journal articles, are also used in the metadata of a book or book component such as a chapter.

Back Matter Elements Module

(parameter entity %backmatter.ent;) Declares elements that are not part of the main textual flow of a work, but are considered to be ancillary material such as appendices, glossaries, and bibliographic reference lists

Display Class Elements Module

(parameter entity %display.ent;) Declares the display-related elements, such as figures, graphics, math, chemical expressions and structures, tables, etc.

Format Class Elements Module

(parameter entity %format.ent;) Declares elements concerned with rendition of output, for example, printing on a page or display on a screen. This module includes the elements in the Appearance Class, the Break Class, and the Emphasis Class.

Funding Elements Module

(parameter entity %funding.ent;) Declares elements that model open access, grant, sponsorship, or other funding information, for example, the grant number (<award-id>) and the grant holder (<principal-award-recipient>)

Link Class Elements Module

(parameter entity %link.ent;) Declares elements that are links (internal or external) by definition, such as URLs (<uri>) and internal cross references (<xref>)

List Class Elements Module

(parameter entity %list.ent;) Declares the elements in the List Class; these are all lists except the lists of bibliographic references (citations). Lists are considered to be composed of items.

Math Class Elements Module

(parameter entity %math.ent;) Declares the elements in the math classes such as display equations

NLM Citation Module

(parameter entity %nlmcitation.ent;) Adds the model for the NLM structured bibliographic citation element (<nlm-citation>). This element is now obsolete and should be replaced by either the <element-citation> or the <mixed-citation> element.

Paragraph-Like Elements Module

(parameter entity %para.ent;) Declares structural, non-display elements that may appear in the same places as a paragraph. These elements are named in the various paragraph class parameter entities.

Subject Phrase Class Elements Module

(parameter entity %phrase.ent;) Declares the Phrase Class elements, that is, names the inline, subject-specific elements.

Bibliographic Reference (Citation) Class Elements Module

(parameter entity %references.ent;) Declares the bibliographic reference elements

Related Object Elements Module

(parameter entity %related-object.ent;) Defines the container element <related-object>, used as a container for text links to a related object, possibly accompanied by a very brief description of the object

Section Class Elements Module

(parameter entity %section.ent;) Declares the elements of the Section Class, that is, declares all section-level elements in the Book Tag Set (or Collection Tag Set).

MathML Setup Module

(parameter entity %mathmlsetup.ent;) Invokes the MathML modules

Tag Set Creation Note: To include the MathML elements, a Tag Set must reference this module. This module sets up all parameter entities needed to use the MathML Tag Set and references (invokes) the MathML 2.0 Tag Set Module, which, in turn, invokes all the other MathML modules.

MathML 2.0 Tag Set Module

(parameter entity %mathml.dtd;) Mathematical Markup Language (MathML) 2.0, an XML application for describing mathematical notation and capturing both its structure and content

MathML 2.0 Qualified Names 1.0

(parameter entity %mathml-qname.mod;) Declares parameter entities to support namespace-qualified names, namespace declarations, and name prefixing for MathML, as well as declares the parameter entities used to provide namespace-qualified names for all MathML element types

Extra Entities for MathML 2.0

(parameter entity %ent-mmlextra;) Used for MathML processing

Aliases for MathML 2.0

(parameter entity %ent-mmlalias;) Used for MathML processing

XHTML Table Setup Module

(parameter entity %XHTMLtablesetup.ent;) Sets all parameter entities needed by the XHTML table model, and then invokes the module containing that model

Tag Set Creation Note: To include the XHTML table model in a tag set, a Tag Set must reference this module. This module sets up all parameter entities needed to use the XHTML table model and references (invokes) the XHTML Table Model Module. (See next item.)

XHTML Table Model Module

(parameter entity %xhtml-table-1.mod;) The public XML DTD version of the XHTML table model. This module is invoked from the module %XHTMLtablesetup.ent;. (See previous item.) This is the default table model for this Tag Set.

XHTML Table Style Module

(parameter entity %xhtml-inlstyle-1.mod;) Declares the @style attribute, which supports inline style markup for elements such as <td> and <tr> within XHTML tables.

OASIS XML Table Setup Module

(parameter entity %oasis-tablesetup.ent;) Note: Not used in the current Book Tag Set. Sets all parameter entities needed by the OASIS (CALS) Exchange table model, and then invokes the module containing that model

Tag Set Creation Note: To include the OASIS table model in a Tag Set, the DTD must reference this module. This module sets up all parameter entities needed to use the OASIS table model and references (invokes) the OASIS XML Exchange Table Model Module. This module has been modified to use a namespace prefix of “oasis” for all OASIS table elements, to disambiguate these elements and thus permit both the CALS and XHTML table models to be used in one tag set, should the developer choose to do this. There is a separate Tag Library (http://dtd.nlm.nih.gov/options/OASIS/tag-library/19990315/index.html) describing the OASIS elements, attributes, and parameter entities.

OASIS XML Exchange Table Model Module

(parameter entity %oasis-exchange.ent;) Note: Not used in the current Book and Book Collection Tag Sets. The OASIS (CALS) Exchange table model. This module is invoked in %oasis-tablesetup.ent;.

XML Special Characters Module

(parameter entity %xmlspecchars.ent;) Standard ISO XML special character entities used in this Tag Set

Custom Special Characters Module

(parameter entity %chars.ent;) Custom special character entities created specifically for use in this Tag Set

Notation Declarations Module

(parameter entity %notat.ent;) Container module for the Notation Declarations to be used with this Suite. These notations have been placed in their own module for easy expansion or replacement.