General Introduction

The Article Authoring Document Type Definition (hereafter Authoring DTD) is used to create new journal articles tagged with XML elements. The DTD defines the elements (and their attributes) that describe the content of most journal articles, as well as the content of some non-article journal material, such as editorials and book reviews. The person using this DTD is typically one of the article’s authors or someone (such as a copyeditor) to whom the author has sent the XML files. The DTD is intended for use with XML-aware or text-editing software to guide the writer in choosing tags correctly. This DTD is a simplified version of the Journal Publishing (Blue) DTD, which describes the biological and medical journal articles archived in PubMed Central.

This Tag Library provides the documentation for the authoring tag set, including examples of element and attribute usage. The documentation is divided into two levels: author documentation, and implementor documentation, which is more detailed and is intended for people who need to modify or maintain the DTD. Author-level reference material includes:

Implementor-level material is also included:

DTD Design Principles

Purpose and Characteristics

This Authoring DTD creates a standardized format for new journal articles that authors can use to submit publications to journals and to archives such as PubMed Central. The focus on authoring means that this is a smaller, tighter, less inclusive DTD than would have been necessary to create a journal archiving DTD. Unlike an archiving DTD, which needs to accommodate a wide variety of structures and offers very flexible content models for elements, this DTD is more prescriptive than descriptive and includes many elements whose content must occur in a specified order. This is a DTD optimized for authorship of new journal articles, where regularization and control of content are important, and where it is useful rather than harmful to have only one way to tag a structure.

Since no assumptions can be made concerning the processing software and editorial situation that will receive an article authored in this DTD, tagging that forces specific formatting has been avoided. There is no way for an author to number lists explicitly, for example, or to number the cited references manually, since many journals have their own citation policies. Numbers for the cited references must be generated by software to match editorial policy. For proofing purposes, the stylesheets that may be used to produce PDF from these tagged articles number the references with counting numbers, and the stylesheets that produce HTML for proofing display the identifier of the <ref> as the number.
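As an illustration of tagging without explicit numbers, the fragments below sketch an ordered list and a reference list in which the author supplies only structure and identifiers. The list-type value, the identifiers, and the text are illustrative assumptions, and the citation content inside each <ref> is left as a placeholder; the element descriptions in this Tag Library remain the authority on what is allowed.

```xml
<!-- Ordered list: the author states the kind of list, never the numbers. -->
<list list-type="order">
  <list-item><p>Prepare the sample.</p></list-item>
  <list-item><p>Record the measurement.</p></list-item>
</list>

<!-- Reference list: each <ref> carries an identifier only.
     Display numbers are generated later by software. -->
<ref-list>
  <ref id="B1"><!-- citation content for the first reference --></ref>
  <ref id="B2"><!-- citation content for the second reference --></ref>
</ref-list>
```

With tagging like this, a PDF proofing stylesheet of the kind described above would show the references as 1 and 2, while an HTML proofing stylesheet would show B1 and B2.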

Scope: Journal Articles

  • Articles. This is not a DTD for complete journals. The Authoring DTD models journal articles, where such an article is defined as the typical research article found in a scientific, technical, and medical (STM) journal. By design, the definition of an article is broad enough to include other article-like material found in journals, for example, editorials, short news pieces, obituaries, meeting reports, and book or product reviews.
  • Header and Body. This DTD describes both the metadata for a journal article and the content of the article. Both are required to make a complete article.
  • Multidisciplinary. Although designed for biomedical journals, this DTD should be sufficiently general to describe not only STM journals but technical journals in any field.
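The header-and-body split described in the list above can be pictured with a minimal skeleton. This is only a sketch: it assumes the <front> and <body> element names used across the Archiving and Interchange DTD Suite, and it replaces the required metadata and content with comments, so it is not a valid article as written.

```xml
<article>
  <front>
    <!-- header: metadata for the article (title, authors, abstract, and so on) -->
  </front>
  <body>
    <!-- content of the article (sections, paragraphs, figures, and so on) -->
  </body>
</article>
```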

Modular DTD Design

The Authoring DTD has been written as a set of “modules” that make use of the modules of the Archiving and Interchange DTD Suite. Each module is a separate physical file, no module is an entire DTD by itself, and modules can be combined into a number of different DTDs. The module files are primarily intended for ease of constructing new DTDs and ease of maintenance.

The major disadvantage of a modular system is the longer learning curve, since it may not be immediately obvious where within the system to find a particular element or attribute cluster. To help with this, the description of each element in the Element Section of this documentation names the module in which that element is defined.
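As a rough sketch of how separate module files can be combined into one DTD, the fragment below uses standard XML external parameter entities. The file and entity names are hypothetical and do not correspond to the actual modules of the Suite.

```xml
<!-- Hypothetical DTD driver file: all names here are illustrative only. -->

<!-- Declare each module file as an external parameter entity. -->
<!ENTITY % example-common.ent SYSTEM "example-common.ent">
<!ENTITY % example-list.ent   SYSTEM "example-list.ent">

<!-- Reference each entity to pull that module's declarations into the DTD. -->
%example-common.ent;
%example-list.ent;
```

A different DTD can reuse the same module files by declaring and referencing a different selection of them.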


Because access for a wide range of output devices, as well as for the visually impaired, is increasingly important in the STM journal community, the modules in the Archiving and Interchange DTD Suite were designed to follow, as far as possible, the W3C Web Content Accessibility Guidelines 2.0 working draft. The Suite, on which this DTD is based, used the August 2002 version of that draft, which was the latest accessibility specification available when the Suite was first constructed. The specification sets out accessibility guidelines at many levels, from design through application. The guidelines that pertain to the modeling of materials were followed to at least Level-2 compliance. For example, a Long Description <long-desc> element was defined as part of many other elements, such as Figure <fig>, so that it can be added not only to all figures and other graphical objects but also to any section of the text (for example, to a Boxed Text <boxed-text>) to provide an accessible description of the object. The xml:lang attribute was added to all section-level elements and many paragraph-level elements to permit explicit indication of the language of the content, as required by these guidelines. The Abbreviation or Acronym <abbrev> element was added to meet Checkpoint 4.3.
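The fragments below sketch how these accessibility features might appear in a tagged article. The identifiers, language code, and text are invented for illustration, the graphic reference is left as a placeholder, and the exact position of each element should be checked against its description in this Tag Library.

```xml
<!-- A figure with a long description for readers who cannot see the image. -->
<fig id="F1">
  <caption><p>Culture growth over time.</p></caption>
  <long-desc>Line graph in which the horizontal axis is time and the
    vertical axis is cell density; the curve rises and then levels off.</long-desc>
  <!-- reference to the image file goes here -->
</fig>

<!-- A section whose language is stated explicitly with xml:lang. -->
<sec xml:lang="fr">
  <title>Résumé</title>
  <p>Texte du résumé.</p>
</sec>

<!-- An abbreviation marked up inside running text. -->
<p>Samples were amplified by <abbrev>PCR</abbrev>.</p>
```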

Subsidiary sections:

Documentation for Authors

Documentation for Implementors


We thank Molecular Biology of the Cell and The Proceedings of the National Academy of Sciences of the U.S.A. for providing the sample articles used in this Tag Library.