General Introduction

The Journal Archiving and Interchange Document Type Definition (hereafter Journal Archiving DTD) defines elements and attributes that describe the content of journal articles (including some non-article material found in journals, such as letters, editorials, and book and product reviews). The DTD is intended for use by archives as a repository DTD, as well as for transferring journal articles among publishers and archives. The DTD is very loose, flexible, and permissive in an attempt to capture nearly anything a publisher might have tagged.

This Journal Archiving DTD comprises a few DTD-specific modules and uses (by reference) the base modules of the Journal Archiving and Interchange DTD Suite. The modules of that Suite were developed as part of an effort to create XML applications through which materials on health-related disciplines could be shared and reused electronically. The Suite can be used to construct many DTDs in addition to this one. Although the full Suite was developed to support electronic production, the structures should be adequate to support some print production as well.

This Tag Library describes the DTD Suite and the Journal Archiving and Interchange DTD, by providing:


Introduction to Version 2.1

There were several rationales for revising the Archiving and Interchange DTD Suite and this DTD made from it:

The major changes for this release were to amend the Module of Modules (%modules.ent;) to reflect the new MathML and character sets, including a revised directory structure for both the MathML modules and the general entity set modules. This necessitated changing the version of all the DTD files, to point to the new Module of Modules. Minor changes requested by the committee are also reflected in many of the modules.

Editors’ Note: We took the opportunity of the revision to fix typos, alignment errors, element order in parameter entities, and infelicities in the wording of comments. We wish to thank the many users


DTD Design Principles


Purpose: Preservation of Intellectual Content

The intent of this DTD Suite is to “preserve the intellectual content of journals independent of the form in which that content was originally delivered”. The tags defined here will be used to describe journal articles that originate with many publishers and societies but whose content will be stored in repositories such as the NLM PubMed Central repository. Therefore, the Suite has been optimized for conversion from a variety of journal source DTDs, with the intent of providing a single format in which publishers can deliver their content to a wide range of archives. There are so many journal DTDs currently in use by publishers, repositories, content-aggregators, scientific societies, and compositors that this Suite cannot possibly incorporate all the variation to be found in such diverse models. But a wide variety of structures can be accommodated, since the content models for the elements have been made very flexible, including a wide range of elements with nearly all content structures optional.

The conversion focus also means that this is a larger, more inclusive DTD than might have been necessary if the intent had been, for example, to create only a journal authoring DTD. Many elements have been created explicitly so that information tagged by publishers would not be discarded when they converted material from another DTD to an archival interchange or repository DTD created from this Suite. Because of the broad scope of the several proposed electronic archives, this Suite contains elements and attributes that may only occur in a very few journals. Attribute values that a particular DTD would restrict to a list of options were declared as data character values, so that all options could be accepted. Care has been taken to provide several mechanisms (frequently, information classing attributes) to preserve the intellectual content of a document structure when that structure is converted from another DTD or schema to this one, even if there is no exact element equivalent of the structure.
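As a rough illustration of the point about attribute values, the contrast between a restricted list of options and a data character declaration might look like the sketch below. The two declarations are alternatives taken from two different hypothetical DTDs, not parts of a single DTD, and the value list is assumed for this example rather than copied from any publisher's DTD.

```xml
<!-- Hypothetical source DTD: the attribute accepts only the listed values -->
<!ATTLIST list
  list-type (bulleted | numbered | plain) #IMPLIED>

<!-- Hypothetical permissive equivalent: any character data is accepted,
     so values from many source DTDs can survive conversion unchanged -->
<!ATTLIST list
  list-type CDATA #IMPLIED>
```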

The exact replication of the look and feel of any particular journal has not been a consideration. Therefore, many purely formatting mechanisms have not been included.


Scope: Journal Articles Plus

Although the DTD Suite aims at much broader application, this Journal Archiving and Interchange DTD models journal articles, where a journal article is defined as the typical research article found in an STM journal. By design, the definition of an article includes some article-like material found in journals, for example, letters, editorials, short news pieces, obituaries, meeting reports, and book and product reviews.

By design, this is a model for journal articles, not for complete journals. Thus, there is no overarching model for a collection of articles. In addition, the following journal material has not been described by this DTD/schema:

  • Company or product display advertising;
  • Job search or classified advertising;
  • Calendars, meeting schedules, and conference announcements (except as these can be handled as ordinary articles or sections within articles); and
  • Material specific to an individual journal, such as Author Guidelines, Policy and Scope statements, Editorial or advisory boards, detailed indicia, etc.


Introduction to the Journal Archiving and Interchange DTD

The Journal Archiving and Interchange DTD defines a document that is a top-level component of a journal such as an article, a book or product review, or a letter to the editor. Each such document is composed of one or more parts; if there is more than one part, they must appear in the following order:

  • Front matter (required). The article front matter contains the metadata for the article (also called article header information), for example, the article title, the journal in which it appears, the date and issue of publication, a copyright statement, etc. This is not textual front matter as in books; it is bibliographic information about the article and the journal in which it was published.
  • Body of the article (optional). The body of the article is the main textual and graphic content of the article. This usually consists of paragraphs and sections, which may themselves contain figures, tables, sidebars (boxed text), etc. The body of the article is optional to accommodate those repositories that just keep article header information and do not tag the textual content.
  • Back matter for the article (optional). If present, the article back matter contains information that is ancillary to the main text, such as a glossary, appendix, or list of cited references.
  • Following the front, body, and back, there may be either one or more responses or one or more subordinate articles:
    • Response (rarely used). A response is a commentary on the article itself, for example, an opinion from an editor on the importance of the article or a reply from the original author to a letter concerning his article.
    • Sub-article (rarely used). A sub-article is a small article that is completely contained inside another article.
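The ordering rules above could be expressed in a content model along the following lines. This is a sketch written from the prose description only; the actual declaration in archivearticle.dtd may differ in its details.

```xml
<!-- Sketch: front matter is required; body and back matter are optional;
     the article may end with either sub-articles or responses, not both -->
<!ELEMENT article (front, body?, back?, (sub-article+ | response+)?)>
```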


Modular DTD Design

The Archiving and Interchange DTD Suite has been written as a set of XML DTD “modules”, each of which is a separate physical file. No module is an entire DTD by itself, but these modules can be combined into a number of different DTDs, for example, both an Archival and Interchange DTD and an Archival Repository DTD. Modules are primarily intended for maintenance; all the elements of the same “type” (class) are stored together.

The first DTD (an Archival and Interchange DTD for the National Library of Medicine) has been built from these modules; both this DTD and the individual modules will be described in this Tag Library.

The major disadvantage of a modular system is the longer learning curve, since it may not be immediately obvious where within the system to find a particular element or attribute cluster. To help with this, the description of each element in the Element Section of this documentation names the module in which that element is defined.

There are many advantages to such a modular approach. The smaller units are written once, maintained in one place, and used in many different DTDs. This makes it much easier to keep lower level structures consistent across document types, while allowing for any real differences that analysis identifies. A DTD for a new function (such as an authoring DTD) or a new publication type can be built quickly, since most of the necessary components will already be defined in the DTD Suite. Editorial and production personnel can bring the experience gained on one tagging project directly to the next with very little loss or retraining. Customized software (including authoring, typesetting, and electronic display tools) can be written once, shared among projects, and modified only for real distinctions.


Potential DTDs from This Suite of Modules

It has been proposed that three DTDs be created initially from the modules in this DTD Suite:

  • Archival and Interchange DTD — (Journal Archiving and Interchange DTD) A DTD into which publishers’ original XML (or SGML) content can be converted, providing a common format for storing this material and for transferring content to any one of a number of archives. This DTD has been designed to be open and inclusive, to allow journal articles to be translated from a wide variety of proprietary journal article DTDs. It is permissive by design, intended to capture whatever warts and wrinkles exist in Publishers’ content.
  • Publishing DTD — (Journal Publishing DTD) A DTD for use as a repository DTD by archives which desire consistent processing and searching capability. A Journal Publishing DTD has been developed that is a more prescriptive and restrictive subset of the Journal Archiving and Interchange DTD Suite than the Archiving DTD (above). This DTD may also be used to transfer content between archives.
  • Authoring DTD — A DTD intended for writing and editing new journal articles. The Authoring DTD will be designed to enable “good” journal coding and to provide assistance to authors and editors through more restricted models than the interchange and repository DTDs allow.


Accessibility

Because access for a wide range of output devices, as well as for the visually impaired, is becoming more and more important in the STM journal community, the modules in this Suite were designed to follow, as much as possible, the W3C Web Content Accessibility Guidelines 2.0 working draft (22 August 2002), which was the latest accessibility specification available when the Suite was initially constructed. That specification gives accessibility guidelines on many levels, from design through application. The guidelines that pertain to the modeling of materials were followed to at least Level-2 compliance. For example, a Long Description <long-desc> element was defined as part of many other elements, such as Figure <fig>, so it can be added not only to all figures and other graphical objects, but to any section of the text (for example, to a Boxed Text <boxed-text>) to provide an accessible description of the object. The xml:lang attribute was added to all section-level elements and many paragraph-level elements to permit explicit indication of the language of the content, as required by these guidelines. The Abbreviation or Acronym <abbrev> element was added to meet Checkpoint 4.3.
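A tagged fragment using these accessibility features might look roughly like the following. The <sec> element name, the id value, the sample text, and the exact position of <long-desc> inside <fig> are assumptions made for illustration; the element descriptions give the real content models.

```xml
<!-- Illustrative fragment only; see the assumptions stated above -->
<sec xml:lang="fr">
  <fig id="fig1">
    <long-desc>Line graph showing average daily temperature rising
      steadily from January to July.</long-desc>
  </fig>
</sec>
```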


How To Read This Tag Library


Terms and Definitions

Element

Elements are nouns, like “Speech” and “Speaker”, that represent components of journal articles, the articles themselves, and accompanying metadata.

Attribute

Attributes hold facts about an element, such as which type of list (e.g., numbered, bulleted, or plain) is being requested when using the List <list> tag, or the name of a pointer to an external file that contains an image. Each attribute has both a name (e.g., list-type) and a value (e.g., “bulleted”).
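In a tagged document, the name and value pair described above might appear as in this small sketch. The <list-item> and <p> children are assumed here for illustration and should be checked against the element descriptions.

```xml
<!-- list-type is the attribute name; "bulleted" is its value -->
<list list-type="bulleted">
  <list-item><p>First point</p></list-item>
  <list-item><p>Second point</p></list-item>
</list>
```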

Metadata

Data about the data, for example, bibliographic information. The distinction is between metadata elements which describe an article (such as the name of the journal in which an article was published) versus elements which contain the textual and graphical content of the article.


How To Start Using This Tag Library

A full DTD Suite delivery package includes the Journal Archiving and Interchange DTD, the customization files for this DTD, the modules that comprise the full Archiving and Interchange DTD Suite, one or more tagged sample documents, and this documentation, as a set of linked HTML files. How you use the documentation will depend on what you need to learn about the modules and the DTD.

If you want to learn about the elements and the attributes in the Suite or to learn how the journal article model is constructed, here is a good way to start.

  • Read the Tag Library General Introduction, taking particular note of the next section which describes the parts of the Tag Library, so you will know what resources are available. Stop just before the section “How To Make New DTDs From These Modules”.
  • Next, if you do not know the symbols used in the Document Hierarchy diagrams, read the “Key to the Near & Far® Diagrams”.
  • Scan the Document Hierarchy diagrams to get a good sense of the top-level elements and their contents. (Find what is inside an <article>, then what is inside each of the four large pieces of an article, and keep working your way down.)
  • Pick an element from one of the diagrams. (Look up the element in the Elements Section to find the full name of the element, its definition, usage notes, content allowed, and any attributes. Look up one of the attributes to find its full name, usage notes, and potential values.)
  • Finally, if you are interested in conversion from a particular source:
    • Look at an article in a printed or online journal or look at the DTD for the other journal. Can all the information you want to store from an article fit into the models shown in the diagrams? Do you have, or know how to get, all the information the models require? Will that information always be available for documents that are complete and correct? How difficult will it be to identify the parts of the information using the elements and attributes described in these models? Would changes to one or more models make this easier?
    • Now look at some non-article content, such as a news column, a book review, or some letters to the editor. Are there tags to handle all these article types and all their components?

If you want to learn about the DTD Suite in order to write a new DTD:

  • Skim the Tag Library General Introduction. (Start really reading at the section “How To Make New DTDs From These Modules”. Read the Parameter Entities that name the classes. Scan the DTD modules.)
  • If you do not know the symbols used in the Document Hierarchy diagrams, then read the “Key to the Near & Far® Diagrams”.
  • Use the Document Hierarchy diagrams to give you a good sense of the top-level elements and their contents.
  • Pick an element from one of the diagrams. (Look up the element in the Elements Section to find the full name of the element, the definition, usage notes, content allowed, and attributes list. Look up one of the attributes to find its full name, usage notes, and potential values.)
  • Read the DTD Modules, given at the end of this documentation.
    New DTDs are created by writing a new DTD module and new customization modules, so you might want to read (in order) the DTD module (archivearticle.dtd), the module that names all the other modules (%modules.ent;), the module that names all customization modules (%archivecustom-modules.ent;), and the customization modules themselves (%archivecustom-classes.ent;, %archivecustom-mixes.ent;, and %archivecustom-models.ent;). (You might also wish to familiarize yourself with the relationship between the “customization” modules and the “default” modules for classes, mixes, and models.) Those can be followed by any one of the class modules (although the DTD Suite has been designed to have the %common.ent; module precede any other class module).


Structure of This Tag Library

This Tag Library contains the following sections:

Introduction

This introduction to the contents of this Tag Library, to the design philosophy and intended usage of the Archiving and Interchange DTD Suite, and to the Journal Archiving and Interchange DTD.

Elements Section

Descriptions of the elements used in the Journal Archiving and Interchange DTD and DTD Suite modules. The element descriptions are listed in alphabetical order by tag name. (Note: Each element has two names: a “tag name” (formally called an element-type name) that is used in tagged documents, in the DTDs, and by the software, and an “element name” (usually longer) that provides a fuller, more descriptive name for the benefit of human readers. For example, a tag name might be <disp-quote> with the corresponding element name Quote, Displayed, or a tag name might be <verse-group> with the corresponding element name Verse Form for Poetry.)

Attributes Section

Descriptions of the attributes in the DTD modules. Like elements, attributes also have two names: the shorter machine-readable one and a (usually longer) human-readable one. Attributes are listed in order by the shorter machine-readable names, for example, the attribute short name list-type instead of the more informal, easier to read: Type of List.

Parameter Entity Section (For Implementors Only)

Names (with occasional descriptions) and contents of the Parameter Entities in the DTD modules.

Context Table

Listings of where each element may be used. All elements are given in a simple alphabetical list. There is a single table for the elements from all the Suite modules that are called from the DTD.

The Context Table is formatted in two columns. The first column lists an element’s tag name, and the second column lists the tag names of all the elements in which the first element may occur. For example, if the first column contains the front matter element <front> and the second column contains only the article element <article>, this means that the <front> (Front Matter) element may only be used inside an <article> (Article) element.

Most elements may be used inside more than one other element. For example, the element <access-date> (Access Date for Cited Work) may be used inside the <citation>, <product>, and <related-article> elements.

Note: These Context Table listings (which list where an element may be used) are the inverse of the content definition that is given as a part of each element description, which lists what can be inside the named element.

Document Hierarchy Diagrams

Tree-like graphical representations of the content of many elements. This can be a fast visual way to determine the structure of an article or of any element within an article.

Index By Tag Name

Index of element descriptions, alphabetically by tag name (element-type name)

Index By Element Name

Index of element descriptions, alphabetically by element name (the longer, more descriptive name)

DTD Section

Copies of the Journal Archiving and Interchange DTD, its customization modules, and the full Archiving and Interchange DTD Suite of XML DTD modules described in this Tag Library


Tag Library Typographic Conventions

<alt-text>

The tag name of an element (written in lower case, with the entire name surrounded by “< >”)

Alternate Text Name (For a Figure, Etc.)

The element name (long descriptive name of an element) or the descriptive name of an attribute (written in title case, that is, with important words capitalized and the words separated by spaces)

must not

Emphasis to stress a point

How To Make New DTDs from These Modules (For Implementors Only)


Modular DTD Design

This DTD Suite has been written as a series of XML DTD modules that can be combined into a number of different DTDs. The modules are separate physical files that, taken together, define all element structures (such as tables, math, chemistry, paragraphs, sections, figures, footnotes, and reference elements), as well as attributes and entities in the Suite.

Modules in the Suite are primarily intended to group elements for maintenance. There are different kinds of modules. A module may either:

  • Be a building block for a base DTD (such as the Module to Name the Modules)
  • Define the elements inside a particular structure. For example, the Bibliography References (Citation) Elements Module names all the potential components of bibliographic reference lists.
  • Name the members of a “class” of elements, where class is a named grouping of elements that share a similar usage or potential location. For example, the Phrase-Level Content Elements Module defines small floating elements that may occur within text, such as inside a paragraph or a title, or that describe textual content, for example, a disease name, drug name, or the name of a discipline.
  • Be a module of “editorial convenience”. For example, the Common (Shared) Element Declarations Module holds elements and attributes used in the content models of the various class elements.


Parameter Entities Modules to Customize and Change

Parameter Entities are the major mechanism for customizing a DTD or creating a new DTD from the modules in the Suite. Individual DTDs will be constructed by 1) establishing element and attribute combinations and content models using Parameter Entities in one of the DTD-specific customizing modules and 2) choosing appropriate modules from the Suite that declare the elements needed. For example, if the base DTD contained 6 kinds of lists and 2 table models, a more specific DTD, such as an authoring DTD, might use a Customize Classes Module to redefine the List Class to name only 3 lists and redefine the Display Class to allow only one table model.
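The list example above can be sketched with two declarations of the same Parameter Entity. In XML, when an entity is declared more than once, the first declaration read is the one that binds, so an over-ride works by being read before the default. The entity name %list.class; and all the element names below are hypothetical, chosen only to show the pattern.

```xml
<!-- Customize Classes Module of the more specific DTD (read FIRST).
     The entity name and element names here are hypothetical. -->
<!ENTITY % list.class "def-list | list | simple-list">

<!-- Suite default classes module (read LATER). Because %list.class; is
     already declared, an XML parser ignores this second declaration. -->
<!ENTITY % list.class
  "def-list | list | simple-list | gloss-list | check-list | outline-list">
```

Any content model that refers to %list.class; then allows only the three lists named in the first declaration.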

The standard modules to create a customized DTD are: the DTD itself, a module to name its components, and as many over-ride modules and new elements modules as necessary. Typical modules for a new DTD are:

  • DTD — The DTD module (.dtd) for the new DTD (At a minimum, this module declares the top-level element (such as article, book, or report) and any other structural elements unique to the new document type.)
  • DTD-specific Module of Modules — The DTD-specific module of modules, to name all the new modules created expressly for the new DTD
  • Class Over-rides — DTD-specific over-rides of the Suite default element classes
  • Mix Over-rides — DTD-specific over-rides of the Suite default class mixes
  • Model Over-rides — DTD-specific content model over-rides for the content models in the modules of the suite (using “-elements” and “-model” Parameter Entities)
  • New Models — DTD-specific new elements (for example, a new Book DTD might add book-specific metadata elements)


Element Classes

Many of the elements in the Journal Archiving and Interchange DTD have been grouped into loose element classes. There is no hard and fast rule for what constitutes a class; each one is a design decision, a matter of judgment. These classes are designed to ease customization to meet the particular needs of new DTDs. Base classes for the DTD Suite are defined in a separate Default Element Classes Module (%default-classes.ent;).

Content models are built from sequences of elements and from OR groups, which are typically classes or mixes. As an example, the content model for a Paragraph element is declared to be an OR group (that is, a choice) of text, numbers, or special characters and any of the elements named in the Paragraph Elements mix. The mix %p-elements; is declared to be a large OR group of many other element-defining classes: the Block Display Class Elements, the Mathematical Expressions Class Elements, the List Class Elements, the Citation Class Elements, and others.
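A much-reduced sketch of this layering follows. The class and mix names come from this Tag Library, but the members shown for each class are assumptions, and the real %p-elements; mix names many more classes. Declarations like these belong in the external DTD subset, where Parameter Entity references are permitted inside other declarations.

```xml
<!-- Classes: each names a few elements (members shown are assumptions) -->
<!ENTITY % block-display.class "boxed-text | fig">
<!ENTITY % citation.class      "citation">
<!ENTITY % phrase.class        "abbrev | named-content">

<!-- Mix: an OR group assembled from classes. The leading bar lets the
     mix follow #PCDATA directly in the content model below. -->
<!ENTITY % p-elements
  "| %block-display.class; | %citation.class; | %phrase.class;">

<!-- Paragraph: any mixture of text and the elements named in the mix -->
<!ELEMENT p (#PCDATA %p-elements;)*>
```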

These element classes can be viewed as building blocks that will be used to build larger Parameter Entities for element mixes. (Note: A mix describes a usage circumstance for a group of elements, such as all the paragraph-level elements, all the elements allowed inside a table cell, all the elements inside a paragraph, or all the inline elements.) For example, to add another block display item to the Block Display Class Elements, you would do two things: edit the %block-display.class; Parameter Entity in the DTD-specific Archive Class Over-ride Module, which overrides the default Parameter Entity in the DTD Suite’s Default Element Classes Module, and create a new module containing the Element Declaration of the new block display item.
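The two changes needed to add a block display item might be sketched as follows. The element <case-study>, its content model, and the default class members repeated in the over-ride are all invented for this example.

```xml
<!-- In the DTD-specific class over-ride module: repeat the default
     members (assumed here) and add the new element -->
<!ENTITY % block-display.class "boxed-text | fig | case-study">

<!-- In a new DTD-specific element module: declare the new element.
     <case-study> and its content model are hypothetical. -->
<!ELEMENT case-study (p+)>
```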


How To Build a New Custom DTD


The Concept

The basic idea for a new DTD is that all lower-level elements (paragraphs, lists, figures, etc.) will be defined in modules — either the modules of the base Suite or in new DTD-specific modules rather than in the DTD itself. The new DTD will be fairly short and include only definitions of the topmost elements, at least the document element and maybe its children.

Modules are defined (declared) using External Parameter Entities in the Suite’s Module to Name the Modules or in the DTD-specific Module of Modules. Modules are called (referenced) in the DTD proper, in the order needed to define the Parameter Entities in sequence.
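The difference between declaring a module and calling it can be shown in two lines. The entity name below is the one this Tag Library uses for the Archiving DTD class over-rides; the system identifier (file path) is an assumption, and a real module of modules may use public identifiers as well.

```xml
<!-- In a module of modules: DECLARE the module as an external parameter
     entity. The file path shown is an assumption. -->
<!ENTITY % archivecustom-classes.ent SYSTEM "archivecustom-classes.ent">

<!-- In the DTD proper: CALL the module, which reads in its declarations -->
%archivecustom-classes.ent;
```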

Version 2.1 of this Journal Archiving and Interchange DTD was written as an example of the new best-practice customization technique. A new variant DTD that follows this plan will probably consist of the following modules:

  • A DTD module to define the top-level elements (for example, archivearticle.dtd);
  • A DTD-specific Module of Modules to name new non-Suite modules in the DTD (for example, %archivecustom-modules.ent;);
  • A DTD-specific definition of element classes to add new classes and over-ride the Suite default classes (for example, %archivecustom-classes.ent;);
  • A DTD-specific definition of element mixes to add new mixes and over-ride the default mixes (for example, %archivecustom-mixes.ent;);
  • A DTD-specific module of content model over-rides (for example, %archivecustom-models.ent;);
  • DTD-specific modules to hold new element declarations; and
  • All or most of the modules in the Suite.


Making a Variant DTD

To show the process, here is a series of instructions for making a new DTD, illustrated by showing how the Journal Archiving and Interchange DTD was created from the modules of the whole Suite.

  1. Modules — Write a new DTD-specific Module of Modules which defines all new customization modules the DTD needs. (As an example, the Archiving DTD created the module %archivecustom-modules.ent;, which contains the definitions of the class-over-ride module %archivecustom-classes.ent;, the mix-over-ride module %archivecustom-mixes.ent;, and the models-over-ride module %archivecustom-models.ent;.)
  2. Class Over-rides — Write a DTD-specific class-over-ride module, defining any over-rides to the Suite classes, which are defined in the default classes module, %default-classes.ent;. (As an example, the Archiving DTD created the module %archivecustom-classes.ent;, in which a new model for %contrib-info.class; was declared and an entirely new class %x.class; was added.)
  3. Mix Over-rides — Write a DTD-specific mix-over-ride module defining any over-rides to the Suite mixes, which are defined in the default mixes module, %default-mixes.ent;. (As an example, the Archiving DTD created the module %archivecustom-mixes.ent;, in which a new mix %all-phrase; was declared and then used in many existing mixes such as %simple-phrase;.)
  4. Model Over-rides — Create a DTD-specific content-model-over-ride module defining any over-rides to the content models and attribute lists for the DTD Suite. (As an example, the Archiving DTD created the module %archivecustom-models.ent;, in which element collections (suffixed “-elements”) that will be mixed with #PCDATA were redefined, full content models over-rides (suffixed “-model”) were redefined, and some new attributes and attribute lists were added.)
  5. New Elements — Write any new element modules needed. These will define any new block-level or phrase-level elements. (As an example, the Archiving DTD did not need any new elements not in the Suite, but the new NLM Book DTD added modules for book metadata and book component parts.)
  6. DTD Module — With those modules in place, construct a new DTD module. Within that module:
    • Use an External Parameter Entity Declaration to name and then call the DTD-specific Module of Modules. (For the Archiving DTD, the module %archivecustom-modules.ent;)
    • Use an External Parameter Entity Declaration to name and then call the DTD Suite Module of Modules, which names all the potential modules. (For the Archiving DTD, the module %modules.ent;)
    • Use an External Parameter Entity reference to call the DTD-specific class over-rides. (For the Archiving DTD, the module %archivecustom-classes.ent;)
    • Use an External Parameter Entity reference to call the DTD Suite default classes. (For the Archiving DTD, the module %default-classes.ent;)
    • Use an External Parameter Entity reference to call the DTD-specific mix over-rides. (For the Archiving DTD, the module %archivecustom-mixes.ent;)
    • Use an External Parameter Entity reference to call the DTD Suite default mixes. (For the Archiving DTD, the module %default-mixes.ent;)
    • Use an External Parameter Entity reference to call the DTD-specific content models and attribute list over-rides. (For the Archiving DTD, the module %archivecustom-models.ent;)
    • Use an External Parameter Entity reference to call in the standard Common Module (%common.ent;) that defines elements and attributes so common they are used by many modules.
    • Select, from the Module of Modules, those modules which contain the elements needed for the DTD (for instance, selecting lists and not selecting math elements) and call in each of the modules needed. (The Archiving DTD calls these in alphabetical order, since the order does not matter.)
    • Define the document element and any other unique elements and entities needed for this DTD. (For example, the Archiving DTD declares only six elements — <article> [the top-level element] and its components: <front>, <body>, <back>, <sub-article>, and <response>.)
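Taken together, the steps for the DTD module suggest a skeleton like the one below. It is a sketch of the calling order only, not a copy of archivearticle.dtd: the system identifiers are assumptions, and %list.ent; and %para.ent; are hypothetical stand-ins for whichever class modules a DTD selects.

```xml
<!-- 1. Declare and call the DTD-specific Module of Modules -->
<!ENTITY % archivecustom-modules.ent SYSTEM "archivecustom-modules.ent">
%archivecustom-modules.ent;

<!-- 2. Declare and call the Suite Module of Modules -->
<!ENTITY % modules.ent SYSTEM "modules.ent">
%modules.ent;

<!-- 3. Classes: DTD-specific over-rides first, then the Suite defaults -->
%archivecustom-classes.ent;
%default-classes.ent;

<!-- 4. Mixes: DTD-specific over-rides first, then the Suite defaults -->
%archivecustom-mixes.ent;
%default-mixes.ent;

<!-- 5. Content model and attribute list over-rides -->
%archivecustom-models.ent;

<!-- 6. Common module, ahead of any other class module -->
%common.ent;

<!-- 7. Selected class modules (hypothetical names, any order) -->
%list.ent;
%para.ent;

<!-- 8. Declarations for <article>, <front>, <body>, <back>,
        <sub-article>, and <response> go here -->
```

Each over-ride module is called before the corresponding default module because the first declaration of a Parameter Entity is the one an XML parser keeps.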


The Element Classes in the Suite

The classes described here — with a few exceptions, such as %x.class;, noted below — are defined in the Journal Archiving and Interchange DTD Suite Default Element Classes Module (%default-classes.ent;) and have been used to divide the elements into physical modules. The documentation for the classes and their current default element contents are listed in the Parameter Entity Section toward the end of this Tag Library. In the Parameter Entity Section, the names of the elements in a group or class are listed within quotation marks, separated by vertical bars. For example, Phrase Class will be listed as “%phrase.class;” and shown to contain:

(abbrev | named-content)

which means that the two elements <abbrev> and <named-content> are defined as Phrase Class Elements.
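In a tagged paragraph, the two Phrase Class elements might be used as in this fragment. The content-type attribute and its value are assumptions made for illustration; the Attributes Section lists what <named-content> actually accepts.

```xml
<!-- Illustrative use of the two Phrase Class elements inside a paragraph -->
<p>The <abbrev>DTD</abbrev> Suite was developed for sharing material on
<named-content content-type="discipline">health-related
disciplines</named-content>.</p>
```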

Accessibility Class

(%access.class;) Elements added to make journal articles more accessible to people with special needs (for example, people with visual impairments) and to the devices that meet those needs. Includes, for example, the element <alt-text>, a short phrase that names or describes an object, usually a graphical object, and that can be used “behind the picture” on a website or pronounced in an audio system.

Address Class

(%address.class;) Potential element components of an address, such as <country> or <fax>

Appearance Class

(%appearance.class;) Formatting elements used primarily in tables, for example, a horizontal rule (usage discouraged)

Appendix Class

(%app.class;) A construct containing only the appendix for use in the back matter of an article

Back Matter Class

(%back.class;) Ancillary elements, typically used in the back matter of an article, section, etc.

Break Class

(%break.class;) Formatting element used to force a line break, primarily in tables and titles (usage discouraged)

Citation Class

(%citation.class;) Reference (a citation) to an external document as used within, for example, the text of a paragraph

Conference Class

(%conference.class;) Metadata elements that may be used to describe a conference, for example, the conference name, theme, and sponsoring organization

Contributor Information Class

(%contrib-info.class;) Metadata about a contributor [Defined in the %archivecustom-classes.ent; module]

Corresponding Author Class

(%corresp.class;) Elements associated with the corresponding author

Date Class

(%date.class;) Dates and other matters of history such as a Date as a String

Date Parts Class

(%date-parts.class;) The components of a date, such as <year> or <season>

Definition Class

(%def.class;) Definitions (<def>) and other elements to match with terms and abbreviations

Degree Class

(%degree.class;) The academic or professional degrees that accompany a person’s name

Display Class

(Several Parameter Entities: %caption.class;, %block-display.class;, %display-back-matter.class;, %fig-display.class;, %inline-display.class;, %just-base-display.class;, %simple-display.class;, %simple-intable-display.class;) Graphical or other display-related elements, including figures, chemical formulas, and images [%inline-display.class; defined in the %archivecustom-classes.ent; module]

Emphasis Class

(%emphasis.class;, %subsup.class;) Used to produce rendering/typographical distinctions, such as superscript, subscript, or bold text [Defined in the %archivecustom-classes.ent; module]

Front and Back Class

(%front-back.class;) Ancillary elements, typically used in the front or back matter of an article

Identifier Class

(%id.class;) DOIs and other identifiers used by publishers at many levels, for example, for an <abstract> or a <fig>

Keyword Class

(%kwd.class;) Keywords and other elements which name a subject term, critical expression, key phrase, etc. associated with an entire document and used for identification and indexing purposes [<x> which holds generated punctuation or other generated text, for example, the commas or semicolons between keywords, defined in the %archivecustom-classes.ent; module]

Label Class

(%label.class;) The label element, used to hold the number, prefix character, or prefix word or phrase of a labeled object such as a table, figure, or footnote

Link Class

(Several Parameter Entities: %address-link.class;, %article-link.class;, %simple-link.class;, %fn-link.class;) Elements that associate one location with another, including cross references, and URIs for links to the World Wide Web

List Class

(%list.class;) The types of lists used in text, including numbered lists and bulleted lists

Math Class

(Several Parameter Entities: %math.class;, %block-math.class;, %inline-math.class;) The mathematical elements (<mml:math>, <tex-math>) and the elements that can contain them (such as <inline-formula> and <disp-formula>)

Name Class

(%name.class;) The elements that hold the personal names of individuals (such as <string-name>) or the collaboration names of groups (<collab>) who produce products or articles

Paragraph Class

(Several Parameter Entities: %just-para.class;, %rest-of-para.class;, %intable-para.class;) Information for the reader that is at the same structural level as a paragraph, including both regular paragraphs and specially-named paragraphs that may have distinctive uses or different displays, such as dialogs and formal statements

Personal Name Class

(%person-name.class;) The element components of a person’s name (such as <surname>), which can be used, for example, inside the name of a contributor

Phrase Class

(%phrase.class;) Inline elements that surround a word or phrase in text because the subject (content) should be identified to support some kind of display, searching, or processing (such as <named-content> to identify a drug name, genus/species, product, etc.)

Reference Class

(%references.class;) The elements that may be included inside a Citation (bibliographic reference)

Reference List Class

(%ref-list.class;) A construct containing only the reference list (defined in References Module) for use in the back matter of an article

Section Back Matter Class

(%sec-back.class;) Ancillary elements, typically used in the back matter of a section, etc.

Section Class

(%sec.class;) The elements that are at the same hierarchical level as a section

Table Class

(Several Parameter Entities: %table.class;, %just-table.class;, %table-foot.class;, %tbody.class;) Elements that contain the rows and columns inside the Table Wrapper element (<table-wrap>). The following XHTML table model elements can be set up for inclusion: <table>.

In the full modular DTD Suite, the OASIS table model element <oasis:table> may also be selected, but it is not included in the Journal Archiving and Interchange DTD or documented in this Tag Library.

X Class

(%x.class;) Class containing a single element to hold generated punctuation or other generated text, for example, the commas or semicolons between keywords [Defined in the %archivecustom-classes.ent; module]


Modules in the Archiving and Interchange DTD Suite

The DTD Suite was created to allow a multiplicity of DTDs, based on the needs of the intended use, for example, an authoring DTD versus one for a repository. The Journal Archiving and Interchange DTD (archivearticle.dtd) and its specific customization modules (%archivecustom-classes.ent;, %archivecustom-mixes.ent;, %archivecustom-models.ent;, and %archivecustom-modules.ent;) define a DTD focused on archiving and interchange. The following modules are critical for the customization process that creates that DTD:

Journal Archiving and Interchange DTD

(File name archivearticle.dtd) The top-level Journal Archiving and Interchange DTD Module that declares the document element (Article) and the other top-level elements that define a journal article (front matter, back matter, and sub-articles or responses). All elements but these few are declared in the modules of the Suite. The DTD invokes all the other modules it uses, by reference, as external Parameter Entities: first the Archiving DTD-Specific Module of Modules is called to name all Archive-specific customized modules, then the Suite Module of Modules is called to name all the potential modules from the Suite, then customized and default modules are called (for Parameter Entities naming element classes, mixes, and models), then the Common Module for shared elements and attribute lists is called, and finally all the other modules are called as needed.
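
The calling order described above can be sketched as follows. This is an outline rather than the delivered file: the SYSTEM identifiers are assumed file names, the delivered DTD also names a formal public identifier for each module, and only two of the remaining Suite modules are shown.

 <!-- 1. Archiving DTD-Specific Module of Modules: declared, then called -->
 <!ENTITY % archivecustom-modules.ent SYSTEM "archivecustom-modules.ent">
 %archivecustom-modules.ent;

 <!-- 2. Suite Module of Modules: declared, then called -->
 <!ENTITY % modules.ent SYSTEM "modules.ent">
 %modules.ent;

 <!-- 3. Classes, mixes, and models; each customization precedes the default it overrides -->
 %archivecustom-classes.ent;
 %default-classes.ent;
 %archivecustom-mixes.ent;
 %default-mixes.ent;
 %archivecustom-models.ent;

 <!-- 4. Common Module for shared elements and attribute lists -->
 %common.ent;

 <!-- 5. All other modules, as needed (two shown) -->
 %articlemeta.ent;
 %list.ent;

Only the two Modules of Modules are declared in the DTD itself; the entities referenced in steps 3 through 5 are declared inside those two modules.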

Module to Name Archiving DTD-Specific Modules

(Parameter Entity %archivecustom-modules.ent;) Defines all the external modules that are specific to the Archiving DTD (except itself, which must be both named and called inside a DTD). A DTD selects from these modules by referencing the module names through external Parameter Entities. The entities are declared in the Archiving DTD-Specific Module of Modules (%archivecustom-modules.ent;), but referenced (or not) in the DTD proper. To include a set of elements (such as all the lists or all the MathML elements), a DTD references the external Parameter Entity of the module that contains these declarations.

Note: The Archiving DTD-Specific Module of Modules and the Suite Module to Name the Modules need to be the first two external modules called by the Archiving DTD. Customization modules for classes, mixes, and models will typically be called following the Archiving DTD-Specific Modules and the Module to Name the Modules.

Suite Module to Name the Modules

(Parameter Entity %modules.ent;) Defines all the external modules that are part of the modular Archiving and Interchange DTD Suite (except itself, which must be both named and called inside a DTD). A DTD selects from the Suite modules by referencing the module names through external Parameter Entities. The entities are declared in the Module to Name the Modules (%modules.ent;), but referenced (or not) in the DTD proper. To include a set of elements (such as all the article metadata or all the display elements), a DTD references the external Parameter Entity of the module that contains these declarations.

Note: The Archiving DTD-Specific Module of Modules and the Suite Module to Name the Modules need to be the first two external modules called by the Archiving DTD. Customization modules for classes, mixes, and models will typically be called next, following these two.
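
A minimal sketch of the split between declaring and referencing, for a hypothetical DTD built from the Suite. The file names are assumed to match the entity names, and the delivered module also supplies a formal public identifier for each entry. Following the selection example used earlier in this Tag Library, lists are included and the math elements are not.

 <!-- In the Module to Name the Modules: entities are only declared -->
 <!ENTITY % list.ent SYSTEM "list.ent">
 <!ENTITY % math.ent SYSTEM "math.ent">

 <!-- In the DTD proper: a module is included only when its entity is referenced -->
 %list.ent;
 <!-- %math.ent; is never referenced, so its declarations are not read -->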

Archiving DTD-Specific Class Customizations Module

(Parameter Entity %archivecustom-classes.ent;) Sets up Parameter Entities that will be used to override default classes prescribed by the %default-classes.ent; module

Note: This module must be called after the Archiving DTD-Specific Module of Modules (%archivecustom-modules.ent;) and the Suite Module to Name the Modules (%modules.ent;) but before any other module, including specifically the %default-classes.ent; module (which this module overrides) and the %archivecustom-mixes.ent; and %archivecustom-models.ent; modules (which build on this module).
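
The ordering requirement follows from a general XML rule: when the same entity is declared more than once, the first declaration is binding. The sketch below shows the effect with a hypothetical customization that drops <def-list> from the List Class; the Archiving DTD is not claimed to make this change.

 <!-- In the class customization module, which is called first -->
 <!ENTITY % list.class "list">

 <!-- In the default classes module, which is called later;
      ignored because list.class is already declared -->
 <!ENTITY % list.class "def-list | list">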

Suite Default Element Classes Module

(Parameter Entity %default-classes.ent;) Sets up the Parameter Entities that name the element members of each class that will be used to establish the content models

Note: This module must be called before the Archiving Customize Mixes Module (%archivecustom-mixes.ent;) and the Default Element Mixes Module (%default-mixes.ent;), as well as the Archiving Customize Models Module, %archivecustom-models.ent; (which builds on those modules).

Archiving DTD-Specific Mix Customizations Module

(Parameter Entity %archivecustom-mixes.ent;) Sets up Parameter Entities that will be used to override default mixes (groupings made of “classes”) prescribed by the %default-mixes.ent; module

Note: This module must be called after the Archiving Customize Classes Module (%archivecustom-classes.ent;) and the Default Classes Module (%default-classes.ent;) but before any other module, including specifically the %default-mixes.ent; module (which this module overrides) and the %archivecustom-models.ent; module (which builds on this module).

Suite Default Element Mixes Module

(Parameter Entity %default-mixes.ent;) Sets up the Parameter Entities that name mixes (groupings made of “classes”) that will be used to establish the content models

Note: This module must be called before the Archiving Customize Models Module (%archivecustom-models.ent;) or any “base” module of the interchange Suite.

Archiving DTD-Specific Models/Attributes Customizations Module

(Parameter Entity %archivecustom-models.ent;) Sets up Parameter Entities that will be used to override default content model Parameter Entities set elsewhere in the Suite. Also defines customizable attribute Declared Values and attribute lists for the DTD being defined.

Note: This module must be called after the Archiving DTD Customize Mixes Module (%archivecustom-mixes.ent;) and Default Mixes Module (%default-mixes.ent;) but before any “base” module of the interchange Suite.

The modules comprising the rest of the DTD Suite are:

Common (Shared) Elements Module

(Parameter Entity %common.ent;) Declarations for elements, attributes, entities, and notations that are shared by more than one class module

Note: This module must be called before any of the modules comprising the interchange Suite.

Article Metadata Elements Module

(Parameter Entity %articlemeta.ent;) Declares the metadata elements (issue elements and article header elements) used to describe a journal article

Note: Metadata elements that describe the journal are in the Journal Metadata Module, %journalmeta.ent;.

Back Matter Elements Module

(Parameter Entity %backmatter.ent;) Declares elements that are not part of the main textual flow of a work, but are considered to be ancillary material such as appendices, glossaries, and bibliographic reference lists

Display Class Elements Module

(Parameter Entity %display.ent;) Declares the display-related elements, such as figures, graphics, math, chemical expressions and structures, tables, etc.

Format Class Elements Module

(Parameter Entity %format.ent;) Declares elements concerned with rendition of output, for example, printing on a page or display on a screen. This module includes the elements in the Appearance Class, the Break Class, and the Emphasis Class.

Journal Metadata Elements Module

(Parameter Entity %journalmeta.ent;) Declares the elements used to describe the journal in which a journal article is published

Note: The issue and article metadata is defined in the Article Metadata module, %articlemeta.ent;.

Link Class Elements Module

(Parameter Entity %link.ent;) Declares elements that are links (internal or external) by definition, such as URLs (<uri>) and internal cross references (<xref>)

List Class Elements Module

(Parameter Entity %list.ent;) Declares the elements in the List Class; these are all lists except the lists of bibliographic references (citations). Lists are considered to be composed of items.

Math Class Elements Module

(Parameter Entity %math.ent;) Declares the elements in the math classes such as display equations

Paragraph-Like Elements Module

(Parameter Entity %para.ent;) Declares structural, non-display elements that may appear in the same places as a paragraph. These elements are named in the various paragraph class Parameter Entities.

Subject Phrase Class Elements Module

(Parameter Entity %phrase.ent;) Declares the Phrase Class elements, that is, names the inline, subject-specific elements. At the time of this DTD’s creation, there were only two phrase-level elements. If more specific subject words (such as “gene”) are added to later versions of this DTD, they should be added to the %phrase.class; Parameter Entity and defined in this module or in the Common Module, %common.ent;.

Bibliographic Reference (Citation) Class Elements Module

(Parameter Entity %references.ent;) Declares the bibliographic reference elements

Section Class Elements Module

(Parameter Entity %section.ent;) Declares the elements of the Section Class, that is, declares all section-level elements in the Journal Archiving and Interchange DTD. At the time of this DTD’s creation, there was only one such element, Section (<sec>) itself, but any named sections (such as <methodology> or <materials>) or other new section-level structures introduced in the future would be added here.

MathML Setup Module

(Parameter Entity %mathmlsetup.ent;) Invokes the MathML modules

DTD Creation Note: To include the MathML elements, a DTD must reference this module. This module sets up all Parameter Entities needed to use the MathML tagset and references (invokes) the MathML 2.0 DTD Module, which, in turn, invokes all the other MathML modules.

MathML 2.0 DTD Module

(Parameter Entity %mathml.dtd;) Mathematical Markup Language (MathML) 2.0, an XML application for describing mathematical notation and capturing both its structure and content

MathML 2.0 Qualified Names 1.0

(Parameter Entity %mathml-qname.mod;) Declares Parameter Entities to support namespace-qualified names, namespace declarations, and name prefixing for MathML, as well as declares the Parameter Entities used to provide namespace-qualified names for all MathML element types

Extra Entities for MathML 2.0

(Parameter Entity %ent-mmlextra;) Used for MathML processing

Aliases for MathML 2.0

(Parameter Entity %ent-mmlalias;) Used for MathML processing

XHTML Table Setup Module

(Parameter Entity %XHTMLtablesetup.ent;) Sets all Parameter Entities needed by the HTML 4.0 (XHTML) table model, and then invokes the module containing that model

DTD Creation Note: To include the XHTML table model, a DTD must reference this module. This module sets up all Parameter Entities needed to use the XHTML table model and references (invokes) the XHTML Table Model Module.

XHTML Table Model Module

(Parameter Entity %htmltable.dtd;) The public XML version of the HTML 4.0 (XHTML) table model. This module is invoked in %XHTMLtablesetup.ent;.

OASIS XML Table Setup Module

(Parameter Entity %oasis-tablesetup.ent;) Note: Not used in the current Archiving DTD. Sets all Parameter Entities needed by the OASIS (CALS) Exchange table model, and then invokes the module containing that model

DTD Creation Note: To include the OASIS table model, a DTD must reference this module. This module sets up all Parameter Entities needed to use the OASIS table model and references (invokes) the OASIS XML Exchange Table Model Module.

OASIS XML Exchange Table Model Module

(Parameter Entity %oasis-exchange.ent;) Note: Not used in the current Archiving DTD. The OASIS (CALS) Exchange table model. This module is invoked in %oasis-tablesetup.ent;.

XML Special Characters Module

(Parameter Entity %xmlspecchars.ent;) Standard ISO XML special character entities used in this DTD

Custom Special Characters Module

(Parameter Entity %chars.ent;) Custom special character entities created specifically for use in this DTD

Notation Declarations Module

(Parameter Entity %notat.ent;) Container module for the Notation Declarations to be used with this DTD Suite. These notations have been placed in their own module for easy expansion or replacement.


Archiving and Interchange DTD Suite Naming Conventions


XML Component Naming Conventions


Basic Element and Attribute Naming Rules

  • CASE — Element, attribute, and entity names that originate with the Archiving and Interchange DTD Suite are in all lower case. Element and attribute names taken from PUBLIC modules (e.g., MathML and various table modules) incorporated into these DTDs are in the case in which they were found in the original module.
  • TWO-WORD NAMES — Elements named with two words are separated by a hyphen, for example, <def-list> and <term-head>.
  • WORD STANDARDIZATION — Abbreviations are standardized so that, for example, “figure” is always used as “fig” (as in the element <fig-group>) and group is not abbreviated (as in the elements <fig-group>, <kwd-group>, and <fn-group>). The naming rules are described in the Archiving and Interchange DTD Naming Rules section of this Tag Library.

Parameter Entity Names for Classes and Mixes

PARAMETER ENTITY: SAME FUNCTION, SAME NAME — The Suite modules and initial DTDs have used a series of Parameter Entity naming conventions consistently. While parsing software cannot enforce these Parameter Entity naming or usage conventions, these conventions can make it much easier for a person to know how the content models work and what must be modified to make a DTD change.

CLASSES — Classes are functional groupings of elements used together in an OR group. Each class is named with a Parameter Entity, and all class Parameter Entity names end in the suffix “.class”:

 <!ENTITY % list.class "def-list | list">

A class, by definition, should never be made “empty”; the class should be removed from all models where you do not want the class elements included.

MIXES — Mixes are functional OR groups of classes; mixes should never contain element names directly. All mixes must be declared after all classes, since mixes are composed of classes. Mix names have no set suffix; for example, they may end in “-mix” or “-elements”. Content models and content model over-rides use mixes and classes for all OR groups. Only content model sequences are made up of element names directly.
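
A sketch of the layering of classes and mixes. The mix name example-mix is hypothetical; list.class and phrase.class carry the contents shown elsewhere in this Tag Library.

 <!-- Classes name elements and are declared first -->
 <!ENTITY % list.class   "def-list | list">
 <!ENTITY % phrase.class "abbrev | named-content">

 <!-- Hypothetical mix: an OR group of classes, with no element names -->
 <!ENTITY % example-mix "%list.class; | %phrase.class;">

To exclude lists wherever this mix is used, a customization would declare the mix earlier with "%phrase.class;" alone. Declaring list.class as an empty string instead would leave a dangling OR bar in the mix, and any content model built from it would no longer parse.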

MODEL OVER-RIDES — Parameter Entity mixes for over-riding a content model are of two styles: 1) inline mixes and 2) full content model replacements. These two groupings have been defined and named separately to preserve the mixed-content or element-content nature of the models in DTDs derived from the Suite.

The inline Parameter Entities to be intermingled with character data (#PCDATA) in a mixed content model are named with a suffix “-elements”. For example, “%access-date-elements;” would be used in the content model for the element <access-date>:

 <!ENTITY % access-date-elements "| %date-parts.class; | %x.class;" >
 <!ELEMENT  access-date (#PCDATA %access-date-elements;)* >

All inline mixes begin with an OR bar, so that the mix can be removed leaving just character data (#PCDATA):

 <!ENTITY % rendition-plus "| %all-phrase;" >
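
As a sketch of that removal, a customization module called before the default declaration could set the mix to an empty string, leaving the model as character data only. The Archiving DTD is not claimed to make this particular change; it is shown only to illustrate why the leading OR bar belongs inside the entity.

 <!-- Hypothetical over-ride, declared before the default -->
 <!ENTITY % access-date-elements "" >

 <!-- Unchanged element declaration; the model now resolves to (#PCDATA)* -->
 <!ELEMENT  access-date (#PCDATA %access-date-elements;)* >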

The over-ride of a complete content model will be named with a suffix “-model” and should include the entire content model, including the enclosing parentheses:

 <!ENTITY % kwd-group-model "(title?, (%kwd.class; | %x.class;)+ )" >
 <!ELEMENT  kwd-group %kwd-group-model; >
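
A replacement of this kind takes effect when it is declared in a customization module that is called before the default declaration. In the sketch below the replacement model is hypothetical and simply drops the X Class from the keyword group.

 <!-- Hypothetical over-ride, for example in the models customization module -->
 <!ENTITY % kwd-group-model "(title?, (%kwd.class;)+ )" >

 <!-- Unchanged element declaration; it picks up the first declaration -->
 <!ELEMENT  kwd-group %kwd-group-model; >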
 

File Naming Conventions

DTD — This Tag Library describes the components for the Journal Archiving and Interchange DTD. This DTD consists of a base DTD module (delivered as the file archivearticle.dtd) which calls in all the other modules as External Parameter Entities. Each module specific to this DTD (therefore, not part of the Suite) takes the prefix “archivecustom-”.

Each DTD and module has been assigned a unique formal public identifier (fpi). File names are never referenced directly in the comments in the DTD; the file is referred to by the name of the external Parameter Entity, which names the fpi and a system name for the file. The external Parameter Entity has been set to the initial delivery filename.
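
The general shape of such an external Parameter Entity is sketched below. Both identifiers are placeholders invented for illustration; the actual formal public identifiers and system names are given in the delivered modules.

 <!-- Placeholder fpi and system name, not the delivered values -->
 <!ENTITY % example-module.ent
     PUBLIC "-//EXAMPLE//DTD Hypothetical Example Module v0.0//EN"
            "example-module.ent" >

 <!-- The module is then called through the entity reference -->
 %example-module.ent;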

The individual modules of both the Suite and the DTD (as delivered) have been given DOS/Windows 3-character suffixes indicating their type:

*.dtd

A module that can be used as the top level of an XML hierarchy. Used for the Journal Archiving and Interchange DTD top level, archivearticle.dtd, but also taken unchanged for public DTD modules that have been included in this DTD such as the MathML DTD and the XHTML table model.

*.ent

A DTD fragment for incorporation into a full DTD. May contain element declarations, entity declarations, etc.

*.mod

A DTD fragment for incorporation into a full DTD. May contain element declarations, entity declarations, etc. This extension has the same meaning as *.ent and is only used to maintain the extension names dictated by the inclusion of PUBLIC DTD fragments, for example, mathml2-qname-1.mod.

While the DTD cannot dictate graphic file names, the comments do suggest that best practice for naming graphic files in documents tagged according to this DTD Suite would be to limit the names and path names to these characters: letters (both upper and lower case), numbers, underscore, hyphen, and period. All such names will be assumed to be case sensitive. DOS-style file extensions may be used.


Phase II DTD Work

Modeling several structures and functions that might appropriately be part of a DTD using this Suite has been delayed until a later version of the Suite. Such components include:

  • Questions and Answers (except as they can be modeled with the current DTD by using paragraphs and lists);
  • Proper systematic identification keys (except as they can be tagged using regular list structures);
  • Continuing Medical Education material;
  • Forms and fill-in-the-blank areas;
  • Conflict of Interest statements and Financial Disclosures (except as they can be modeled using paragraphs and footnotes);
  • Electronic and Digital Rights Management material;
  • Advertising included in a journal (for example, employment listings, classified advertising, and display advertising);
  • Calendars, meeting schedules, and announcements (except as these can be handled as ordinary articles or sections within articles); and
  • Material specific to an individual journal such as Author Guidelines, Policy and Scope statements, Editorial or advisory boards, detailed indicia, etc.


Acknowledgments

We thank bmj.com, Molecular Biology of the Cell, and The Proceedings of the National Academy of Sciences of the U.S.A. for providing the sample articles used in this tag library.