General Introduction

The Journal Archiving and Interchange Document Type Definition (DTD) and the base Suite provide a set of XML modules that name and define the elements, and their attributes, for describing the textual and graphical content of journal articles (including some non-article material found in journals such as letters, editorials, and book and product reviews). The modules were developed as part of an effort to create XML applications through which materials on health-related disciplines can be shared and reused electronically. The modules can be used to construct DTDs for authoring and archiving journal articles as well as DTDs for transferring journal articles from publishers to archives and between archives. Although the full Suite was developed to support electronic production, the structures should be adequate to support some print production as well.

This Tag Library describes the DTD Suite and the first of the DTDs developed from the Suite, the Journal Archiving and Interchange DTD. What the Tag Library provides is listed under “Structure of This Tag Library”.


DTD Design Principles


Purpose: Preservation of Intellectual Content

The intent of this DTD Suite is to “preserve the intellectual content of journals independent of the form in which that content was originally delivered”. The tags defined here will be used to describe journal articles that originate with many publishers and societies but whose content will be stored in repositories, such as the NLM PubMed Central repository. Therefore, the Suite has been optimized for conversion from a variety of journal source DTDs, with the intent of providing a single format in which publishers can deliver their content to a wide range of archives. There are so many journal DTDs currently in use by publishers, repositories, content-aggregators, scientific societies, and compositors that this Suite cannot possibly incorporate all the variation to be found in such diverse models. But a wide variety of structures can be accommodated, because the content models for the elements have been made very flexible, including a wide range of elements with nearly all structures optional.

The conversion focus also means that this is a larger, more inclusive DTD than might have been necessary if the intent had been, for example, to create only a journal-authoring DTD. Many elements have been created explicitly so that information tagged by publishers will not be discarded when they convert material from another DTD to an archival interchange or repository DTD created from this Suite. Because of the broad scope of the several proposed electronic archives, this Suite contains elements and attributes that may occur in only a very few journals. Attribute values that a particular DTD would restrict to a list of options were declared as data character values so that all options could be accepted. Care has been taken to provide several mechanisms (frequently information-classing attributes) to preserve the intellectual content of a document structure when that structure is converted from another DTD or schema to this one, even if there is no exact element equivalent of the structure.

The exact replication of the look and feel of any particular journal has not been a consideration. Therefore, many purely formatting mechanisms have not been included.


Introduction to the Journal Archiving and Interchange DTD

The Journal Archiving and Interchange DTD defines a document that is a component of a journal such as an article, a book or product review, a letter to the editor, etc. Each such document may have up to four components, which must appear in the following order:

  • Front matter (required). The article front matter contains the metadata (header information) such as the article title, the journal in which it appears, the date and issue of publication for that issue of that journal, etc.
  • Body of the article (required). The body of the article is the main textual and graphic content of the article. This usually consists of paragraphs and sections, which may themselves contain figures, tables, sidebars (boxed text), etc. The front matter and body may be all there is to an article.
  • Back matter for the article (optional). If present, contains information that is ancillary to the main text, such as a glossary, appendix, or list of references.
  • Optionally, either one or more responses or one or more subordinate articles:
    • Response (rarely used). A response is a commentary on the article itself, for example, an opinion from an editor on the importance of the article or a reply from the original author to a letter concerning his article.
    • Sub-Article (rarely used). A sub-article is a small article that is completely contained inside another article.
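
As a rough illustration of the ordering described above, a tagged document might have the following outline. This is a minimal sketch and is not claimed to be valid against the DTD: the comments stand in for content the models may require, and the tag name <response> is assumed here from the element name Response.

```xml
<article>
  <!-- 1. Front matter (required): metadata about the article and journal -->
  <front>
    <!-- article title, journal, publication date, issue, ... -->
  </front>
  <!-- 2. Body (required): paragraphs, sections, figures, tables, ... -->
  <body>
    <!-- main textual and graphical content -->
  </body>
  <!-- 3. Back matter (optional): glossary, appendix, reference list -->
  <back>
    <!-- ancillary material -->
  </back>
  <!-- 4. Optional: one or more responses, or one or more sub-articles -->
  <response>
    <!-- commentary on the article -->
  </response>
</article>
```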

For the purposes of the review of draft 1.0 of the Journal Archiving and Interchange DTD, the DTD draws on all the modules in the Suite, as provided.


Modular DTD Design

The Archiving and Interchange DTD Suite has been written as a set of XML DTD “modules”, each of which is a separate physical file. No module is an entire DTD by itself, but the modules can be combined into a number of different DTDs, for example, both an Archiving and Interchange DTD and an Archival Repository DTD. Modules are primarily intended to ease maintenance; all the elements of the same “type” (class) are stored together.

The first DTD (an Archiving and Interchange DTD for the National Library of Medicine) has been built from these modules, and both this DTD and the individual modules are described in this Tag Library.

The major disadvantage of a modular system is the longer learning curve, because it may not be immediately obvious where within the system to find a particular element or attribute cluster. To help with this, the description of each element in the Element Section of this documentation names the module in which that element is defined.

There are many advantages to such a modular approach. The smaller units are written once, maintained in one place, and used in many different DTDs. This makes it much easier to keep lower-level structures consistent across document types, while allowing for any real differences that analysis identifies. A DTD for a new function (such as an authoring DTD) or a new publication type can be built quickly, because most of the necessary components will already be defined in the DTD Suite. Editorial and production personnel can bring the experience gained on one tagging project directly to the next with very little loss or retraining. Customized software (including authoring, typesetting, and electronic display tools) can be written once, shared among projects, and modified only for real distinctions.


Potential DTDs from This Suite of Modules

It has been proposed that three DTDs be created initially from the modules in this DTD Suite:

  • Archival Repository DTD: An XML DTD used by one or more archives internally to store their content, making the repository more uniform for consistent processing and searching. This DTD may also be used to transfer content between archives. This DTD will be designed to be open and inclusive to allow journal articles to be translated from a wide variety of proprietary journal article DTDs. As an example, the repository DTD can include more than one table model to enable conversionless inclusion of both the CALS-derivative and the XHTML-derivative table models that publishers have used.
  • Archiving and Interchange DTD: A DTD into which publishers’ original XML (or SGML) content can be converted, providing a common format for transferring content to any one of a number of archives. This DTD may also be used to transfer content between archives. The initial Archiving and Interchange DTD will be written as a proper subset of a larger Archival Repository DTD, so that all documents tagged according to the Archiving and Interchange DTD will be, by definition, also deliverable as Archival Repository DTD documents without conversion. This will allow lossless transfer of journal articles between archives.
  • Authoring DTD: A Journal Publishing DTD has also been developed that is intended for writing and editing new journal articles. This is a more prescriptive and restrictive subset of the Journal Archiving and Interchange DTD, designed to enable “good” journal coding and to provide assistance to the authors and editors through more restricted models than the repository and interchange DTDs allow.


Accessibility

Because access for a wide range of output devices as well as for the visually impaired is becoming more and more important in the STM journal community, the modules in this Suite were designed to follow, as much as possible, the W3C Web Content Accessibility Guidelines 2.0 working draft, which was the latest accessibility specification available when the Suite was constructed. That specification gives accessibility guidelines on many levels, from design through application. The guidelines that pertain to the modeling of materials were followed to at least Level-2 compliance. For example, a Long Description <long-desc> element was defined as part of many other elements such as <fig>; therefore, it can be added not only to all figures and other graphical objects but to any section of the text (for example, to a sidebar <boxed-text>) to provide an accessible description of the object. The xml:lang attribute was added to all section-level elements and many paragraph-level elements to permit explicit indication of the language of the content, as required by these guidelines. The Abbreviation or Acronym <abbrev> element (also to be used for acronyms) was added to meet Checkpoint 4.3[3.5].
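
The accessibility features named above might appear together in a tagged section as sketched below. This fragment is illustrative only: the tag names <sec>, <title>, and <label>, the id attribute, the sample text, and the order of the children inside <fig> are assumptions rather than statements of the content models, and a real figure would normally also point to its image.

```xml
<!-- xml:lang states the language of the section content -->
<sec xml:lang="en">
  <title>Sample Section</title>
  <!-- abbrev marks an abbreviation or acronym -->
  <p>The archive is maintained by the <abbrev>NLM</abbrev>.</p>
  <fig id="fig1">
    <label>Figure 1</label>
    <!-- short name usable "behind the picture" -->
    <alt-text>Sample diagram</alt-text>
    <!-- fuller accessible description of the object -->
    <long-desc>A diagram with three boxes connected left to right by arrows.</long-desc>
  </fig>
</sec>
```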


How to Read This Tag Library


Terms and Definitions

Element

Elements are nouns, such as “Speech” and “Speaker”, that represent components of journal articles, the articles themselves, and accompanying metadata.

Attribute

Attributes hold facts about an element, such as which type of list (e.g., numbered, bulleted, or plain) is being requested when using the List <list> tag, or the name of a pointer to an external file that contains an image. Each attribute has both a name (such as list-type) and a value (such as “bulleted”).

Metadata

Data about the data, for example, bibliographic information. The distinction is between metadata elements that describe an article (such as the name of the journal in which an article was published) versus elements that contain the textual and graphical content of the article.
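
To make the distinction between elements and attributes concrete, the fragment below shows a List <list> element carrying a list-type attribute whose value is “bulleted”, as in the definition above. It is a minimal sketch: the <list-item> tag name and the sample text are assumptions, and the values the DTD actually accepts for list-type should be checked in the Attributes Section.

```xml
<!-- element: list; attribute name: list-type; attribute value: bulleted -->
<list list-type="bulleted">
  <list-item><p>First point</p></list-item>
  <list-item><p>Second point</p></list-item>
</list>
```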


How to Start Using This Tag Library

A full DTD Suite delivery package includes the Journal Archiving and Interchange DTD, the customization file for each DTD, the files that make up the full Archiving and Interchange DTD Suite modules, one or more tagged sample documents, and this documentation, as a set of linked HTML files. How you use the documentation will depend on what you need to learn about the modules and the DTD.

If you want to learn about the elements and the attributes in the Suite or to learn how the journal article model is constructed, here is a good way to start.

  • Read the Tag Library General Introduction, taking particular note of the next section that describes the parts of the Tag Library, so you will know what resources are available. Stop just before the section “How to Make New DTDs from These Modules”.
  • Next, if you do not know the symbols used in the Document Hierarchy diagrams, read the “Key to the Near and Far Diagrams”.
  • Scan the Document Hierarchy diagrams to get a good sense of the top-level elements and their contents. (Find what is inside an <article>, then what is inside each of the four large pieces of an article; keep working your way down.)
  • Pick an element from one of the diagrams. (Look up the element in the Elements Section to find the full name of the element, its definition, usage notes, content allowed, and any attributes. Look up one of the attributes to find its full name, usage notes, and potential values.)
  • Finally, if you are interested in conversion from a particular source:
    • Look at an article in a printed or online journal or look at the DTD for the other journal. Can all of the information you want to store from an article fit into the models shown in the diagrams? Do you have, or know how to get, all of the information the models require? Will that information always be available for documents that are complete and correct? How difficult will it be to identify the parts of the information using the elements and attributes described in these models? Would changes to one or more models make this easier?
    • Now look at some non-article content, such as a news column, a book review, or some letters to the editor. Are there tags to handle all of these article types and all of their components?

If you want to learn about the DTD Suite to write a new DTD:

  • Skim the Tag Library General Introduction. (Start reading at the section “How to Make New DTDs from These Modules”. Read the Parameter Entities that name the classes. Scan the DTD modules.)
  • If you do not know the symbols used in the Document Hierarchy diagrams, then read the “Key to the Near and Far Diagrams”.
  • Use the Document Hierarchy diagrams to give you a good sense of the top-level elements and their contents.
  • Pick an element from one of the diagrams. (Look up the element in the Elements Section to find the full name of the element, the definition, usage notes, content allowed, and attributes list. Look up one of the attributes to find its full name, usage notes, and potential values.)
  • Read the DTD Modules, given at the end of this documentation.
    New DTDs are created by writing a new DTD module and a new customization module; therefore, you might want to read (in order) the DTD module (archivearticle.dtd), the module that names all the other modules (%modules.ent;), and the customization module (%archivecustomize.ent;). Then read any one of the class modules.


Structure of This Tag Library

This Tag Library contains the following sections:

Introduction

An introduction to the contents of this Tag Library, to the design philosophy and intended usage of the Archiving and Interchange DTD Suite, and to the first DTD, the Journal Archiving and Interchange DTD.

Elements Section

Descriptions of the elements used in the Journal Archiving and Interchange DTD and DTD Suite modules. The element descriptions are listed in alphabetical order by tag name. (Note: Each element has two names: a “tag name” (formally called an element-type name) that is used in tagged documents, the DTDs, and by the software; and an “element name” (usually longer) that provides a fuller, more descriptive name for the benefit of human readers. For example, a tag name might be <disp-quote> with the corresponding element name Quote, Displayed, or a tag name might be <verse-group> with the corresponding element name Verse Form for Poetry.)

Attributes Section

Descriptions of the attributes in the DTD modules. Like elements, attributes have two names: a shorter machine-readable one and a (usually longer) human-readable one. Attributes are listed in order by the shorter machine-readable name, for example, by list-type rather than by the more informal, easier-to-read Type of List.

Parameter Entity Section (For Implementors Only)

Names (with occasional descriptions) and contents of the Parameter Entities in the DTD modules.

Context Table

Listings of where each element may be used. All elements are given in a simple alphabetical list. There is a single table for the elements from all the Suite modules that are called from the DTD.

The Context Table is formatted in two columns. The first column lists an element’s tag name, and the second column lists the tag names of all the elements in which the first element may occur. For example, if the first column contains the front matter element <front> and the second column contains only the article element <article>, this means that the <front> (Front Matter) element may only be used inside an <article> (Article) element.

Most elements may be used inside more than one other element. For example, the attribution element <attrib> (Attribution) may be used inside both block quote <disp-quote> and poem <verse-group> elements.

Note: These Context Table listings (which list where an element may be used) are the inverse of the content description that is given as a part of each element in the element section, which lists what can be inside the named element.

Document Hierarchy Diagrams

Tree-like graphical representations of the content of many elements. This can be a fast visual way to determine the structure of an article or of any element within an article.

Index by Tag Name

Index of element descriptions, alphabetically by tag name (element-type name).

Index by Element Name

Index of element descriptions, alphabetically by element name (the longer, more descriptive name).

DTD Section

Copies of the Journal Archiving and Interchange DTD, its customization module, and the full Archiving and Interchange DTD Suite of XML DTD modules described in this Tag Library.


Tag Library Typographic Conventions

<alt-text>

The tag name of an element (written in lowercase with the entire name surrounded by “< >”).

Alternate Text Name (For a Figure, Etc.)

The element name (long descriptive name of an element) or the descriptive name of an attribute (written in title case, that is, with important words capitalized and the words separated by spaces).

must not

Emphasis to stress a point.

How to Make New DTDs from These Modules (For Implementors Only)


Modular DTD Design

This DTD Suite has been written as a series of XML DTD modules that can be combined into a number of different DTDs. The modules are separate physical files that, taken together, define all element structures (such as tables, math, chemistry, paragraphs, sections, figures, footnotes, and reference elements) as well as attributes and entities in the Suite.

Modules are primarily intended to group elements for maintenance. There are different kinds of modules. A module may either:

  • be a building block for a base DTD (such as the Module to Name the Modules %modules.ent;)
  • define the elements inside a particular structure, for example, the Reference Elements Module names all the potential components of bibliographic reference lists
  • name the members of a “class” of elements, where class is a named grouping of elements that share a similar usage or potential location. For example, the Phrase Class module defines small floating elements that may occur within text, such as inside a paragraph or a title, or that describe textual content, for example, a disease name, drug name, or the name of a discipline.
  • be a module of “editorial convenience”, for example, the common module that holds elements and attributes used in the content models of the class elements


Parameter Entities to Customize and Change

The Archiving and Interchange DTD Suite makes intensive use of Parameter Entities as the major mechanism for customizing a DTD or creating a new DTD from the modules in the Suite. Individual DTDs will be constructed by 1) establishing element and attribute combinations and content models using Parameter Entities and 2) then choosing appropriate modules from the Suite that declare the elements needed. Two modules are central to this reuse: the Module to Name the Modules (which names all of the component modules in the DTD Suite) and the Customization or Customize Classes module (which defines Parameter Entities that will be used to build element models and attribute lists). New DTDs can be constructed by:

  • defining in a DTD module the document element (the top-level such as article, book, or report) and any other structural elements unique to the new document type
  • selecting, from the Module to Name the Modules, those modules that contain the elements needed for the DTD (for instance, selecting lists and not selecting math elements)
  • redefining selected content models and attribute lists through the use of Parameter Entities in the Customization Module

For example, if the base DTD contained six kinds of lists and two table models, a more specific DTD, such as an authoring DTD, might use the Customization Module to redefine the List Class to name only three lists and redefine the Display Class to allow only one table model.
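
A sketch of what such overrides might look like in a Customization Module follows. The file name and the three list element names are hypothetical placeholders, and which entity carries the table models depends on the base DTD; %table.class; is used here because the class list in this Tag Library names <table> and <oasis:table> under it. The mechanism relies on a rule of XML itself: when an entity is declared more than once, the first declaration encountered is binding, so overrides must be read before the modules that hold the defaults.

```xml
<!-- authoringcustomize.ent: hypothetical Customization Module -->

<!-- List Class narrowed to three kinds of list
     (placeholder element names, not taken from the Suite) -->
<!ENTITY % list.class "list-a | list-b | list-c">

<!-- Only the XHTML table model is allowed; oasis:table is left out -->
<!ENTITY % table.class "table">

<!-- Any later declaration of %list.class; or %table.class; in a
     module read after this one is ignored by the XML processor. -->
```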


Element Classes

Many of the elements in the Journal Archiving and Interchange DTD have been grouped into loose element classes. These classes are designed to make it easy to customize these DTDs to meet the particular needs of new DTDs, such as an authoring DTD. Most classes are defined in separate modules that bear their name, although a few are defined in the Common Module. Thus, the Link Class is defined in the Link Module, the List Class is defined in the List Module, etc. For such class modules, comments at the top of the module name the Parameter Entity used to invoke the class and define the default class membership. (The real class membership is always defined in the DTD-specific Customization Module.) For example, to add a new type of list to the List Class elements, you would edit the List Class Elements Parameter Entity in the Customization Module, add the new Element Declaration to the List Module, and be sure that the DTD was invoking the list module.

These element classes can be viewed as building blocks that will be used to build larger Parameter Entities for element mixes. A mix describes a usage circumstance that all the elements share (such as all the paragraph-level elements, all the elements allowed inside a table cell, all the elements inside a paragraph, or all of the inline elements). Content models are built from these mixes. As an example, the content model for a Paragraph <p> is declared to be an OR group (that is, a choice) of data characters and any of the elements named in the mix called %inside-para;, where the inside-paragraph-mix is declared to be a large OR group of many other element-defining classes: the Block Display Class, the Math Class, the List Class, the Link Class, etc.
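
The layering of classes, mixes, and content models described above can be sketched in DTD syntax as follows. The class memberships shown are illustrative and are not the Suite's defaults, and the real %inside-para; mix names more classes than the four used here. Declarations of this kind belong in external module files, because XML does not allow parameter-entity references inside markup declarations in the internal subset.

```xml
<!-- Classes: small named groups of elements (memberships illustrative) -->
<!ENTITY % block-display.class "fig | boxed-text">
<!ENTITY % math.class          "inline-formula | disp-formula">
<!ENTITY % list.class          "list">
<!ENTITY % link.class          "xref | uri">

<!-- Mix: one large OR group assembled from the classes -->
<!ENTITY % inside-para
  "%block-display.class; | %math.class; | %list.class; | %link.class;">

<!-- Content model: data characters or any element named in the mix -->
<!ELEMENT p (#PCDATA | %inside-para;)*>
```

Changing the membership of one class in the Customization Module then changes every content model built from a mix that includes that class.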

There are also a few groupings that are not pure classes but just groupings of convenience. For example, there is no “Address Class”; there is a Parameter Entity called %address-elements; that holds a few of the elements, such as country, email, and fax number, that are the contents of an address element <address> and are defined in the Common Module. There is no hard and fast rule for what constitutes a class; each one is a design decision, a matter of judgment.


How to Build a New DTD

As an illustration, one series of steps to build a new DTD could be as follows.

  1. Create a new customization module, defining any overrides to the classes or to other Parameter Entities. This will set up the Parameter Entities and element class definitions used for the content models and attribute lists for the new DTD.
  2. Write any new class modules you need or add any additional elements or attributes to the existing class modules and element group modules.
  3. Construct a new DTD module. Within that DTD:
    • Define the document element and any other unique elements.
    • Use an External Parameter Entity to reference the standard Module to Name the Modules (%modules.ent;), which names all the potential modules.
    • Use an External Parameter Entity to reference the new specific Customization Module (suggested entity: %xxxcustomize.ent;, where “xxx” is the name of the DTD) that establishes the structures.
    • Use an External Parameter Entity to reference the standard Common Module (%common.ent;) that defines elements and attributes so common they are used in many places in the other modules.
    • Use External Parameter Entities to reference each of the modules you need for your DTD. (Note: Do not reference the modules you do not need. For example, if you do not want to use MathML tagging, first use the Customization Module to change the definition of the Math Class (%math.class;), then do not call the MathML Setup Module (%mathmlsetup.ent;), which is the module that calls in all of the other MathML modules.)
    • Define any Parameter Entities needed in the DTD module itself.
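
The steps above might produce a DTD module shaped like the sketch below. The file names and SYSTEM identifiers are hypothetical (a real delivery may use public identifiers), “xxx” stands for the name of the new DTD, and the content model given for <article> is a simplified illustration. The top-level element declarations are placed last here so that any Parameter Entities they rely on are already declared; that placement is an assumption of this sketch.

```xml
<!-- xxx.dtd: hypothetical top-level DTD module -->

<!-- 1. Name all the potential modules; this declares entities
        such as common.ent and list.ent without referencing them -->
<!ENTITY % modules.ent SYSTEM "modules.ent">
%modules.ent;

<!-- 2. DTD-specific overrides of classes and attribute lists -->
<!ENTITY % xxxcustomize.ent SYSTEM "xxxcustomize.ent">
%xxxcustomize.ent;

<!-- 3. Elements and attributes shared by many modules -->
%common.ent;

<!-- 4. Only the modules this DTD needs -->
%list.ent;
%para.ent;
%link.ent;
<!-- mathmlsetup.ent is deliberately not referenced: no MathML here -->

<!-- 5. Document element; declarations for front, body, and back
        would follow in the same file -->
<!ELEMENT article (front, body, back?)>
```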

The idea is that all lower-level elements (paragraphs, lists, figures, etc.) will be defined in the modules, not in the DTD itself. The DTD will be fairly short and only include definitions of the topmost elements. In the case of the Journal Archiving and Interchange DTD, the elements <article>, <front>, <body>, and <back> are defined.


The Element Classes in the Suite

The classes described here are defined in the Journal Archiving and Interchange DTD Customization Module and have been used to divide the elements into physical modules. The documentation for the classes and their current default element contents are listed in the Parameter Entity Section toward the end of this Tag Library. In the Parameter Entity Section, the names of the elements in a group or class are listed within quotation marks, separated by vertical bars. For example, Phrase Class will be listed as “%phrase.class;” and shown to contain:

"abbrev | named-content"

which means that the two elements <abbrev> and <named-content> are defined as Phrase Class elements.

Accessibility Class

(%access.class;) Elements added to make it easier to process journal articles in ways that are more accessible to people and devices with special needs, for example, the visually handicapped. Includes, for example, the element <alt-text>, which is a short phrase name or description of an object, usually a graphical object, that can be used “behind the picture” on a website or pronounced in a talking system [defined in the Common Module].

Appearance Class

(%appearance.class;) Formatting elements (usage discouraged) used primarily in tables, for example, a horizontal rule [defined in the Format Module].

Break Class

(%break.class;) Formatting element (usage discouraged) used to force a line break, primarily in tables and titles [defined in the Format Module].

Citation Class

(%citation.class;) Reference to an external document (a citation) as used within, for example, the text of a paragraph [defined in the Common Module].

Conference Class

(%conference.class;) Metadata elements that may be used to describe a conference, for example, the conference name, theme, and sponsoring organization [defined in the Common Module].

Display Class

(Several Parameter Entities: %block-display.class;, %inline-display.class;, %simple-display.class;) Graphical or other display-related elements, including figures, chemical formulas, and images [defined in the Display Class Module].

Emphasis Class

(%emphasis.class;) Used to produce rendering/typographical distinctions such as superscript, subscript, or bold text [defined in the Format Module].

Label Class

(%label.class;) The label element used to hold the number, prefix character, or prefix word or phrase of a labeled object such as a table, figure, or footnote [defined in the Common Module].

Link Class

(%link.class;, %simple-link.class;, %ext-links.class;) Elements that associate one location with another, including cross references, and URIs for links to the World Wide Web [defined in the Link Module].

List Class

(%list.class;) The types of lists used in text, including numbered lists and bulleted lists [defined in the List Module].

Math Class

(%math.class;) The mathematical elements (such as Formula, Inline <inline-formula> and Formula, Display <disp-formula>) and elements that can contain the MathML tags [defined in the Math Module].

Paragraph Class

(%para.class;, %rest-of-para.class;, %intable-para.class;) Information for the reader that is at the same structural level as a paragraph, including both regular paragraphs and specially named paragraphs that may have distinctive uses or different displays, such as dialogs and formal statements [defined in the Common Module and the Paragraph Module].

Personal Name Class

(%person-name.class;) The element components of a person’s name (such as <surname>) that can be used, for example, inside the name of a contributor [defined in the Common Module].

Phrase Class

(%phrase.class;) Inline elements that surround a word or phrase in the text because the subject (content) should be identified to support some kind of display, searching, or processing. For example, a <named-content> element could be used to identify a drug name, genus/species, product, etc. [defined in the Phrase Module].

Reference Class

(%references.class;) The elements that may be included inside a Citation (bibliographic reference) [defined in the Reference Module].

Section Class

(%sec.class;) The elements that are at the same hierarchical level as a section [defined in the Section Module].

Table Class

(%table.class;) Elements that contain the rows and columns inside the Table Wrapper element <table-wrap>. The following elements can be set up for inclusion: Table (XHTML table model) <table>.

In the full modular DTD Suite, the element <oasis:table> could also be selected, but it is not included in the Journal Archiving and Interchange DTD.


Modules in the Archiving and Interchange DTD Suite

There is, thus far, one DTD built from the Suite of modules, the Journal Archiving and Interchange DTD (archivearticle.dtd). The DTD module and its specific customization module (%archivecustomize.ent;) define a DTD focused on archival repository use and interchange. The following modules are critical for the customization process that creates that DTD:

Journal Archiving and Interchange DTD

(File name archivearticle.dtd) The top-level Journal Archiving and Interchange DTD Module that declares the document element (Article) and the other top-level elements that define a journal article (article, front matter, body, back matter, and sub-articles or responses). All elements but these few are declared in the modules of the Suite. The DTD invokes all the other modules it uses, by reference, as external Parameter Entities: first the Module to Name the Modules is called to name all the potential modules; then the Customization Module to set up any necessary Parameter Entities; then the Common Module for shared elements and attribute lists; and then all the other modules needed.

Module to Name the Modules

(Parameter Entity %modules.ent;) Defines all the external modules that are part of the modular Archiving and Interchange DTD Suite (except itself and the Customization Module, which must be both named and called inside a DTD). A DTD selects from these modules by referencing the module names through external Parameter Entities. The entities are declared in the Module to Name the Modules (%modules.ent;) but referenced (or not) in the DTD proper. To include a set of elements (such as all the lists or all the MathML elements), a DTD references the external Parameter Entity of the module that contains these declarations. Note: The Module to Name the Modules needs to be the first external module called by a DTD. A Customization Module will typically be called after this module.

Journal Archiving and Interchange DTD Customization Module

(Parameter Entity %archivecustomize.ent;) Sets up the Parameter Entities that name the element members of each class that will be used to establish the content models. Also defines customizable attribute Declared Values and attribute lists for the DTD being defined. Note: This module must be called after the Module to Name the Modules (%modules.ent;) but before any other module.
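
The division of labor between naming a module and calling it can be sketched from the side of the Module to Name the Modules. The entity names below are those listed in this Tag Library; the SYSTEM identifiers are hypothetical file names. The module only declares the entities, and a DTD decides which of them to reference.

```xml
<!-- modules.ent (sketch): declarations only, no references -->
<!ENTITY % common.ent SYSTEM "common.ent">
<!ENTITY % list.ent   SYSTEM "list.ent">
<!ENTITY % math.ent   SYSTEM "math.ent">
<!ENTITY % para.ent   SYSTEM "para.ent">

<!-- A DTD that wants the list elements references the entity
     list.ent after calling this module; a DTD that does not want
     the math elements simply never references math.ent. -->
```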

The modules composing the rest of the DTD Suite are:

Common (Shared) Elements Module

(Parameter Entity %common.ent;) Declarations for elements, attributes, entities, and Notations that are shared by more than one class module. Note: This module must be called before any of the class or element grouping modules.

Article Metadata Elements Module

(Parameter Entity %articlemeta.ent;) Declares the metadata elements (issue elements and article header elements) used to describe a journal article. (Note: Metadata elements that describe the journal are in the Journal Metadata Module, %journalmeta.ent;.)

Back Matter Elements Module

(Parameter Entity %backmatter.ent;) Declares elements that are not part of the main textual flow of a work but are considered to be ancillary material, such as appendices, glossaries, and bibliographic reference lists.

Display Class Elements Module

(Parameter Entity %display.ent;) Declares the display-related elements, such as figures, graphics, math, chemical expressions and structures, and tables.

Format Class Elements Module

(Parameter Entity %format.ent;) Declares elements concerned with rendition of output, for example, printing on a page or displaying on a screen. This module includes the elements in the Appearance Class, the Break Class, and the Emphasis Class.

Journal Metadata Elements Module

(Parameter Entity %journalmeta.ent;) Declares the elements used to describe the journal in which a journal article is published. (Note: The issue and article metadata are defined in the Article Metadata module, %articlemeta.ent;.)

Link Class Elements Module

(Parameter Entity %link.ent;) Declares the elements in the Link Class; these are elements that are links (internal or external) by definition, such as URLs <uri> and internal cross references <xref>.

List Class Elements Module

(Parameter Entity %list.ent;) Declares the elements in the List Class; these are all lists except the lists of bibliographic references (citations). Lists are considered to be composed of items.

Math Class Elements Module

(Parameter Entity %math.ent;) Declares the elements in the math classes, such as display equations.

Paragraph-Like Elements Module

(Parameter Entity %para.ent;) Declares structural, non-display elements that may appear in the same places as a paragraph. These elements are named in the various paragraph-class Parameter Entities.

Subject Phrase Class Elements Module

(Parameter Entity %phrase.ent;) Declares the Phrase Class elements, that is, the inline, subject-specific elements. At the time of DTD creation there was only one such element, which carries an attribute to name its type. If more specific subject words (such as “gene”) are added to a later version of this DTD, they will be added to the %phrase.class; entity and declared in this module or in %common.ent;.

Bibliographic Reference (Citation) Class Elements Module

(Parameter Entity %references.ent;) Declares the bibliographic reference elements.

Section Class Elements Module

(Parameter Entity %section.ent;) Declares the elements of the Section Class, that is, declares all section-level elements in the Journal Archiving and Interchange DTD. At the time of this initial DTD creation, there is only one such element, Section <sec> itself, but future expansion to named sections (such as <methodology> or <materials>) or any new section-level structures would be added here.

MathML Setup Module

(Parameter Entity %mathmlsetup.ent;) Invokes the MathML modules. (DTD Creation Note: To include the MathML elements, a DTD must reference this module. This module sets up all Parameter Entities needed to use the MathML tagset and references (invokes) the MathML 2.0 DTD Module, which, in turn, invokes all of the other MathML modules.)

MathML 2.0 DTD Module

(Parameter Entity %MathML DTD;) Mathematical Markup Language (MathML) 2.0, an XML application for describing mathematical notation and capturing both its structure and content.

MathML 2.0 Qualified Names 1.0

(Parameter Entity %mathml-qname.mod;) Declares Parameter Entities to support namespace-qualified names, namespace declarations, and name prefixing for MathML, as well as declares the Parameter Entities used to provide namespace-qualified names for all MathML element types.

Extra Entities for MathML 2.0

(Parameter Entity %ent-mmlextra;) Used for MathML processing.

Aliases for MathML 2.0

(Parameter Entity %ent-mmlalias;) Used for MathML processing.

XHTML Table Setup Module

(Parameter Entity %XHTMLtablesetup.ent;) Sets all Parameter Entities needed by the HTML 4.0 (XHTML) table model and then references (invokes) the XHTML Table Model Module, which contains that model. (DTD Creation Note: To include the XHTML Table Model, reference this module from the DTD.)

XHTML Table Model Module

(Parameter Entity %htmltable.dtd;) The public XML version of the HTML 4.0 (XHTML) table model. This module is invoked in %XHTMLtablesetup.ent;.

OASIS XML Table Setup Module

(Parameter Entity %oasis-tablesetup.ent;) Sets all Parameter Entities needed by the OASIS (CALS) Exchange table model and then references (invokes) the OASIS XML Exchange Table Model Module, which contains that model. (DTD Creation Note: To include the OASIS Table Model, reference this module from the DTD.)

OASIS XML Exchange Table Model Module

(Parameter Entity %oasis-exchange.ent;) The OASIS (CALS) Exchange table model. This module is invoked in %oasis-tablesetup.ent;.

XML Special Characters Module

(Parameter Entity %xmlspecchars.ent;) Standard ISO XML special character entities used in this DTD.

Custom Special Characters Module

(Parameter Entity %chars.ent;) Custom special character entities created specifically for use in this DTD.

Notation Declarations Module

(Parameter Entity %notat.ent;) Container module for the Notation Declarations to be used with this DTD Suite. These notations have been placed in their own module for easy expansion or replacement.


Archiving and Interchange Suite Naming Conventions


XML Component Naming Conventions

Element and attribute names that originate with the Archiving and Interchange DTD Suite are in all lowercase. Element and attribute names taken from PUBLIC modules incorporated into these DTDs (e.g., MathML and the various table modules) keep the case in which they appear in the original module. In names made up of two words, the words are separated by a hyphen, for example, <def-list> and <term-head>.

Classes are functional groupings of elements, defined and used together. Each class is named with a Parameter Entity, and all class Parameter Entity names end in the suffix “.class”.
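The naming rules above lend themselves to a simple automated check. The Python sketch below is a hedged illustration: the function names are hypothetical, and the exact character set (ASCII lowercase letters and digits, with hyphens between words) is an assumption, since the text only states "all lowercase" and "separated by a hyphen". Names taken from PUBLIC modules such as MathML are outside its scope.

```python
import re

# Assumed shape of a Suite-native name: lowercase words joined by hyphens.
# Restricting the characters to ASCII letters and digits is an assumption.
SUITE_NAME = re.compile(r"[a-z][a-z0-9]*(?:-[a-z0-9]+)*")


def is_suite_style_name(name):
    """True if `name` looks like a Suite-native element or attribute name."""
    return SUITE_NAME.fullmatch(name) is not None


def is_class_entity_name(name):
    """True if `name` carries the ".class" suffix used for class entities."""
    return name.endswith(".class") and len(name) > len(".class")
```

For example, `def-list` and `term-head` pass the first check, while a mixed-case or underscore-separated name does not; `phrase.class` passes the second check.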


File Naming Conventions

This Tag Library describes the components of the first of the DTDs to be constructed, the Journal Archiving and Interchange DTD. This DTD consists of a base DTD module (delivered as the file archivearticle.dtd), which references the other DTD modules.

The individual modules (as delivered) have been given DOS/Windows-style three-character suffixes indicating their type:

*.dtd

A module that can be used as the top level of an XML hierarchy. This suffix is used for the top level of the Journal Archiving and Interchange DTD, archivearticle.dtd, and has also been kept unchanged for public DTD modules that have been included in this DTD, such as the MathML DTD and the XHTML table model.

*.ent

A DTD fragment for incorporation into a full DTD. May contain element declarations, entity declarations, etc.

*.mod

A DTD fragment for incorporation into a full DTD. May contain element declarations, entity declarations, etc. This extension has the same meaning as *.ent and is only used to maintain the extension names dictated by the inclusion of PUBLIC DTD fragments, for example, mathml2-qname-1.mod.
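The three suffixes can be summarized as a small lookup. This Python sketch is only an illustration of the convention described above; the function name `module_role` and the role strings are hypothetical labels chosen for the example.

```python
import os

# Roles taken from the suffix descriptions above. Per the text, *.mod has
# the same meaning as *.ent, so both map to the same (hypothetical) label.
SUFFIX_ROLES = {
    ".dtd": "top-level DTD",
    ".ent": "DTD fragment",
    ".mod": "DTD fragment",
}


def module_role(filename):
    """Return the role implied by a module's suffix, or None if unknown."""
    _, suffix = os.path.splitext(filename)
    return SUFFIX_ROLES.get(suffix)
```

With this mapping, archivearticle.dtd is reported as a top-level DTD and mathml2-qname-1.mod as a DTD fragment.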

Each DTD and module has been assigned a unique formal public identifier (FPI). The comments in the DTD never reference file names directly; instead, a file is referred to by the name of its external Parameter Entity, which supplies both the FPI and a system name for the file. Each external Parameter Entity has been set to the initial delivery filename.

Although the DTD cannot dictate graphic file names, the comments do suggest that the best practice for graphic file names for documents tagged according to this DTD Suite would be to limit the names and path names to these characters: letters (both upper- and lowercase), numbers, underscore, hyphen, and period. All such names will be assumed to be case sensitive. DOS-style file extensions may be used.
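The recommended character set for graphic file names can be expressed as a regular expression. The Python sketch below is a minimal illustration under two assumptions that the text does not state: "letters" and "numbers" are taken to mean ASCII letters and digits, and "/" is taken as the separator between path components. The function name is hypothetical.

```python
import re

# Characters recommended by the DTD comments: letters (either case),
# digits, underscore, hyphen, and period. ASCII-only is an assumption.
SAFE_COMPONENT = re.compile(r"[A-Za-z0-9_.-]+")


def is_recommended_graphic_path(path, separator="/"):
    """True if every component of `path` uses only recommended characters.

    Treating "/" as the path separator is an assumption of this sketch;
    empty components (such as those produced by "//") are rejected.
    """
    components = path.split(separator)
    return all(SAFE_COMPONENT.fullmatch(part) for part in components)
```

Because names are assumed to be case sensitive, a check like this deliberately does not fold case: Fig1.tif and fig1.tif are both acceptable but are treated as different files.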


Phase II DTD Work

In the interest of getting a version of this Suite into production as quickly as practical, several structures and functions that might be appropriately included in a journal DTD have been delayed until a future version of this Suite. Such components include:

  • questions and answers, except as they can be modeled with the current DTDs by using paragraphs and lists
  • proper systematic identification keys, except as they can be tagged using regular list structures
  • continuing medical education material
  • forms and fill-in-the-blank areas
  • conflict of interest statements and financial disclosures, except as they can be modeled using paragraphs and footnotes
  • electronic and digital rights management material
  • advertising included in the journal (for example, job ads, classified advertising, and display advertising)
  • calendars, meeting schedules, and announcements, except as these can be handled as ordinary articles or sections within articles
  • material specific to an individual journal, such as author guidelines, policy and scope statements, editorial or advisory boards, and detailed indicia


Acknowledgments

We thank bmj.com, Molecular Biology of the Cell, and The Proceedings of the National Academy of Sciences of the U.S.A. for providing the sample articles used in this Tag Library.