NLM Journal Archiving and Interchange Tag Suite

Version 2.1

[Updated versions of the Tag Suite have been released. Current version information is available here.]

There were several rationales for revising the NLM Archiving and Interchange Tag Suite:

Although the changes are fully backwards compatible for XML documents (document instances), the new Archiving Tag Set and the full Tag Suite may not be backwards compatible for all previous customizations.

Summary of Version 2.1 Changes

The major change for this release was to amend the Module of Modules (%modules.ent;) to reflect the new MathML and character sets, including a revised directory structure for both the MathML modules and the general entity set modules. This necessitated changing the version of all the Tag Suite files to point to the new Module of Modules. Minor changes requested by the committee are also reflected in many of the modules.

Editor's Note: We took the opportunity of the revision to fix typos, alignment errors, element order in parameter entities, and infelicities in the wording of comments. We wish to thank the many users who brought these to our attention.

Changes to the Entire Suite

The version number for the Suite modules was set to “v2.1 20050630”.

Versioning Note: All modules change version numbers at a numbered release, but, for a dot release, a module that has not changed (for example, %phrase.ent;) retains its previous version number. Therefore only modules that have changed are marked as version 2.1.

MathML DTD Upgrade

The new MathML required no changes to the math setup files %math.ent; and %mathmlsetup.ent;. Suite Version 2.1 adds the latest version of the MathML 2.0 DTD modules (mathml2.dtd,v 1.12 2003/11/04). The following files have been replaced (no new modules were added):

The new parameter entity, %MathMLstrict;, was left at its default value of “IGNORE”. Setting this entity to “INCLUDE” would enable marked sections that enforce stricter checking of MathML syntax rules.
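
As a rough sketch of how this switch might be flipped for a single document, the parameter entity can be redeclared in the internal DTD subset. The internal subset is read before the external DTD, so the declaration there takes precedence over the default. The public and system identifiers shown are assumptions for illustration only; substitute the ones used by your own installation.

```xml
<!DOCTYPE article
  PUBLIC "-//NLM//DTD Journal Archiving and Interchange DTD v2.1 20050630//EN"
         "archivearticle.dtd" [
  <!-- Redeclare the switch so the stricter MathML marked sections are included -->
  <!ENTITY % MathMLstrict "INCLUDE">
]>
```

Documents that omit this override are parsed with the default “IGNORE” setting.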

Not only do the new math modules completely replace the old, but there has been a directory level change. The module %mathml2-qname-1.mod; is now inside the top level (as a peer of the %mathml.dtd;) instead of one level down in the mathml directory.

MathML Namespaces

For reasons of backwards compatibility, the MathML prefix for the Suite will continue to be “mml”, although the latest MathML DTD defaults to a prefix of “m”.

(Implementor’s Note: In Version 2.1 (as in all previous versions) the MathML namespace pseudoattribute has been implemented as a FIXED attribute in the DTD. Some XML processors (for example, certain implementations of the MSXML parser) do not recognize the defaulted value and require that the MathML namespace be declared explicitly on the top-level <article> element in the instance. The same implementations also require an explicit pseudoattribute for the XLink namespace.)
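
For processors affected by this issue, a minimal sketch of the explicit declarations might look like the following. The namespace URIs are the standard W3C ones for MathML and XLink; the element content is only a placeholder.

```xml
<article xmlns:mml="http://www.w3.org/1998/Math/MathML"
         xmlns:xlink="http://www.w3.org/1999/xlink">
  <!-- front matter, body, and back matter go here -->
</article>
```

Because the declarations repeat the values already fixed in the DTD, adding them should not change how conforming processors read the document.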

MathML Character Set Upgrade

The sets of general entities for special characters for the Suite have always been taken directly from the W3C MathML character sets. Since the MathML site has modified their character entity sets, the Suite was changed to match. The new sets of entities:

Therefore a new directory structure was adopted for the sets of character entities in the Suite. To match the new MathML directories, there are now 3 character subdirectories:


Characters defined originally in the SGML specification ISO 8879 (directory patterned on MathML)


Characters originally defined in 8879 but redefined in ISO Tech Report ISO 9573-13 (directory patterned on MathML)


The three Greek alphabet sets not used in MathML but carried forward because they were used in earlier versions of the DTD Suite (Suite specific)

The module %xmlspecchars.ent; was modified to invoke the new files and directory structure.

The MathML DTD parameter entity mathml-charent.module was set to IGNORE to get rid of the invocation of the character sets from within the MathML DTD itself. Characters for the Suite must be called independently of MathML so that the Suite can be used without MathML. Since ignoring all of the entity set calls in MathML DTD also gets rid of the mmlextra and mmlalias calls, those were also added to the MathML setup module, which was already calling the MathML DTD.
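
The arrangement described above can be sketched roughly as follows. This is an illustration under assumptions, not the text of the actual setup module: the entity names and file paths are hypothetical, and the public identifiers are the ones commonly used for the MathML 2.0 DTD and its two extra entity sets.

```xml
<!-- Switch off the character entity calls inside the MathML DTD -->
<!ENTITY % mathml-charent.module "IGNORE">

<!-- Call the two MathML-specific sets that the switch above also suppresses
     (entity names and paths are hypothetical) -->
<!ENTITY % ent-mmlextra PUBLIC "-//W3C//ENTITIES Extra for MathML 2.0//EN"
                               "mathml/mmlextra.ent">
%ent-mmlextra;
<!ENTITY % ent-mmlalias PUBLIC "-//W3C//ENTITIES Aliases for MathML 2.0//EN"
                               "mathml/mmlalias.ent">
%ent-mmlalias;

<!-- Invoke the MathML DTD itself (path is hypothetical) -->
<!ENTITY % mathml.dtd PUBLIC "-//W3C//DTD MathML 2.0//EN" "mathml2.dtd">
%mathml.dtd;
```

The “IGNORE” declaration must come before the MathML DTD is invoked, since the first declaration of a parameter entity is the one that takes effect.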

(Implementor Alert: On the W3C website, the current MathML DTD includes some entity files in both the 8879 and 9573 sets. For example, isoamsa.ent is in both directories but has added characters in the iso9573-13 directory. This Suite chose to use the most inclusive of the entity files referenced in the MathML DTD. We also did not fix the well-known "dagger" problem: our entity sets still contain entities for dagger and double dagger in both the isopub and isoamsb modules, and the preferred double dagger within MathML [but not elsewhere] is the MathML alias %ddagger;.)

Copyright and Permissions

The copyright information in the previous Suite was limited to a spartan <copyright-statement> and <copyright-year>. A new <copyright-holder> element was added and a new wrapper element <permissions> was added to consolidate the copyright and licensing information. The model for <permissions> is the following, in order, even for Archiving (Green):

For backwards compatibility, <permissions> was added to places (such as <article-metadata>) where one of the copyright elements was also allowed. The Permissions element does not replace the copyright elements that were already there; it is in addition to them. The documentation will explain that using the Permissions wrapper is best practice, but previously tagged material will not need to be changed.
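
A hedged sketch of the wrapper in use, limited to the three copyright elements named above, might look like this. The element order and the sample values are assumptions made for illustration, and licensing elements are left out; consult the <permissions> content model for the authoritative sequence.

```xml
<permissions>
  <!-- Sample values only; order of children is assumed, not confirmed -->
  <copyright-statement>Copyright 2005 Example Publisher</copyright-statement>
  <copyright-year>2005</copyright-year>
  <copyright-holder>Example Publisher</copyright-holder>
</permissions>
```

Previously tagged material that places the copyright elements directly in the metadata remains valid and does not need to be rewrapped.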

In order to make this change, the following were moved to the common module:

Permissions have been added to:

Minor Base Suite Changes

The following changes were made in several modules. Each module has an updated change history.


Both an old-style OASIS SOCAT and an XML catalog file will be delivered with the Suite. The XML catalog contains instructions for modifying and setting up the catalog.

        "-//OASIS//DTD Entity Resolution XML Catalog V2.1//EN"
        <catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog" prefer="public">
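
To illustrate how such a catalog resolves a Suite DTD, a minimal sketch of a single entry follows. The public identifier and the relative path are assumptions for illustration; the delivered catalog file is the authoritative source for the real entries.

```xml
<?xml version="1.0"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog" prefer="public">
  <!-- Map a public identifier to a local copy of the DTD (both values assumed) -->
  <public publicId="-//NLM//DTD Journal Archiving and Interchange DTD v2.1 20050630//EN"
          uri="archivearticle.dtd"/>
</catalog>
```

With prefer="public", a resolver uses the public identifier to find the local file even when the document also supplies a system identifier.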

Tag Sets

These Tag Sets are available in version 2.1:

National Center for Biotechnology Information
U.S. National Library of Medicine
8600 Rockville Pike, Bethesda, MD 20894

Last updated: September 14, 2012