Version 2.1
[Updated versions of the Tag Set have been released. The current version is available here.]
Version 2.1 was released on November 16, 2005. Following is a list of updates.
The files for Version 2.1 are available:
by anonymous FTP: //ftp.ncbi.nih.gov/pub/archive_dtd/books/
on the web: book.dtd
Documentation
The Tag Library for version 2.1 is here: https://dtd.nlm.nih.gov/book/tag-library/2.1/
There were several rationales for revising the Archiving and Interchange Tag Suite and this Tag Set made from it:
The W3C had produced new DTD and schema versions of MathML;
The W3C had produced (in association with MathML) new, more complete sets of general character entities;
The Authoring Tag Set was ready for release and it seemed prudent both to use the latest of everything for the new Tag Set and not to go ahead of the other Tag Sets; and
The AIT Working Group, and others via the listserv, had requested changes since the last release.
Although the changes are fully backwards compatible for XML documents (document instances), the new Archiving Tag Set and the full Tag Suite may not be backwards compatible for all previous customizations.
The major changes for this release were to amend the Module of Modules (%modules.ent;) to reflect the new MathML and character sets, including a revised directory structure for both the MathML modules and the general entity set modules. This necessitated changing the version of all the Suite files, to point to the new Module of Modules. Minor changes requested by the committee are also reflected in many of the modules.
Editor's Note: We took the opportunity of the revision to fix typos, alignment errors, element order in parameter entities, and infelicities in the wording of comments. We wish to thank the many users who brought these to our attention.
The version number for the Suite modules, the Publishing DTD (Blue), the Archiving DTD (Green), and the three NLM Book DTDs was set to “v2.1 20050630”. Since the entire Suite was revised at once, the new Authoring DTD (Pumpkin) was created at version 2.1 of the same date, to make it more obvious which version of the Suite it uses.
Versioning Note: All modules change version numbers at a numbered release, but, for a dot release, a module that has not changed (for example, %phrase.ent;) retains its previous version number. Therefore only modules that have changed are marked as version 2.1.
The new MathML required no changes to the math setup files %math.ent; and %mathmlsetup.ent;. Suite Version 2.1 adds the latest version of the MathML 2.0 DTD modules (mathml2.dtd,v 1.12 2003/11/04). The following files have been replaced (there were no new modules added):
mathml2.dtd;
mathml/mmlextra.ent;
mathml/mmlalias.ent; and
mathml2-qname-1.mod.
The new parameter entity %MathMLstrict; was left at its default of “IGNORE”. Setting this entity to “INCLUDE” would enable marked sections that enforce stricter checking of MathML syntax rules.
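As a rough sketch of how such a switch behaves, the external DTD fragment below declares the entity as “INCLUDE” before the MathML DTD would be read. Because the first declaration of a parameter entity is the binding one, the later default is then disregarded. The marked section and the entity %example-strict.attrib; are hypothetical and are not copied from the MathML DTD.

```xml
<!-- Customization module, read BEFORE the MathML DTD is invoked.
     The first declaration of a parameter entity is the binding one. -->
<!ENTITY % MathMLstrict "INCLUDE">

<!-- Default as it would appear later, inside the MathML DTD.
     It is ignored here because the entity is already declared. -->
<!ENTITY % MathMLstrict "IGNORE">

<!-- Hypothetical marked section guarded by the switch: its contents
     are read only when MathMLstrict resolves to INCLUDE. -->
<![%MathMLstrict;[
<!ENTITY % example-strict.attrib
  "mathvariant  (normal | bold | italic)  #IMPLIED">
]]>
```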
Not only do the new math modules completely replace the old, but there has been a directory level change. The module %mathml2-qname-1.mod; is now inside the top level (as a peer of the %mathml.dtd;) instead of one level down in the mathml directory.
For reasons of backwards compatibility, the MathML prefix for the Suite will continue to be “mml”, although the latest MathML DTD defaults to a prefix of “m”.
(Implementor’s Note: In Version 2.1 (as in all previous versions) the MathML namespace pseudoattribute has been implemented as a FIXED attribute in the DTD. Some XML processors (for example, certain implementations of the MSXML parser) do not recognize the defaulted value and require that the MathML namespace be declared explicitly on the top-level <article> element in the instance. The same implementations also require an explicit pseudoattribute for the XLink namespace.)
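For processors that behave this way, a workaround along the following lines may help. It is a minimal sketch that declares both namespaces on the top-level element of the instance; the element content is omitted.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Both namespaces are declared explicitly instead of relying on
     the FIXED attribute defaults supplied by the DTD. -->
<article xmlns:mml="http://www.w3.org/1998/Math/MathML"
         xmlns:xlink="http://www.w3.org/1999/xlink">
  <!-- Article content goes here. The mml: and xlink: prefixes now
       resolve even if the processor never reads the DTD defaults. -->
</article>
```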
The sets of general entities for special characters for the Suite have always been taken directly from the W3C MathML character sets. Since the MathML site has modified their character entity sets, the Suite was changed to match. The new sets of entities:
Match Unicode 4.0;
Have changed the older private use areas to 4.0 mappings; and
Use a new directory structure, which separates the ISO 8879 (SGML) sets from the ISO 9573-13 (ISO tech rpt) sets.
Therefore a new directory structure was adopted for the sets of character entities in the Suite. To match the new MathML directories, there are now 3 character subdirectories:
iso8879 | Characters defined originally in the SGML specification ISO 8879 (directory patterned on MathML) |
iso9573-13 | Characters originally defined in 8879 but redefined in ISO Tech Report ISO 9573-13 (directory patterned on MathML) |
xmlchars | The three Greek alphabet sets not used in MathML but carried forward because they were used in earlier versions of the DTD Suite (Suite specific) |
The module %xmlspecchars.ent; was modified to invoke the new files and directory structure.
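The kind of invocation involved might look like the external DTD fragment below, with one entity set called from each of the three subdirectories. Only isoamsa.ent is named in these notes; the other file names and all of the entity names are assumptions made for illustration.

```xml
<!-- Sketch of calls from a character-set module (external DTD
     fragment). Paths are relative to the Suite's top-level directory. -->

<!-- A set from the ISO 8879 directory (file name assumed) -->
<!ENTITY % ISOlat1 SYSTEM "iso8879/isolat1.ent">
%ISOlat1;

<!-- A set from the ISO 9573-13 directory -->
<!ENTITY % ISOamsa SYSTEM "iso9573-13/isoamsa.ent">
%ISOamsa;

<!-- One of the Suite-specific Greek sets (file name assumed) -->
<!ENTITY % ISOgrk1 SYSTEM "xmlchars/isogrk1.ent">
%ISOgrk1;
```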
The MathML DTD parameter entity mathml-charent.module was set to IGNORE to get rid of the invocation of the character sets from within the MathML DTD itself. Characters for the Suite must be called independently of MathML so that the Suite can be used without MathML. Since ignoring all of the entity set calls in MathML DTD also gets rid of the mmlextra and mmlalias calls, those were also added to the MathML setup module, which was already calling the MathML DTD.
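A setup module arranged this way could be sketched as follows. The switch name and the two file paths come from these notes; the entity names, the use of system identifiers, and the order of the calls are assumptions.

```xml
<!-- Sketch of a MathML setup module (external DTD fragment). -->

<!-- Switch off the character-set calls inside the MathML DTD. -->
<!ENTITY % mathml-charent.module "IGNORE">

<!-- The switch above also suppresses the two MathML-specific sets,
     so they are called directly. Entity names are hypothetical. -->
<!ENTITY % ent-mmlextra SYSTEM "mathml/mmlextra.ent">
%ent-mmlextra;
<!ENTITY % ent-mmlalias SYSTEM "mathml/mmlalias.ent">
%ent-mmlalias;

<!-- Finally, invoke the MathML DTD itself. -->
<!ENTITY % mathml.dtd SYSTEM "mathml2.dtd">
%mathml.dtd;
```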
(Implementor Alert: On the W3C website, the current MathML DTD includes some entity files in both the 8879 and 9573 sets. For example, isoamsa.ent is in both directories, but has added characters in the iso9573-13 directory. This Suite chose to use the most inclusive of the entity files referenced in the MathML DTD. We also did not fix the well-known "dagger" problem: our entity sets still contain entities for dagger and double dagger in both the isopub and isoamsb modules, and the preferred double dagger within MathML [but not elsewhere] is the MathML alias %ddagger;.)
The copyright information in the previous Suite was limited to a Spartan <copyright-statement> and <copyright-year>. A new <copyright-holder> element was added, and a new wrapper element <permissions> was added to consolidate the copyright and licensing information. The model for <permissions> is the following, in order, even for Archiving (Green):
copyright-statement
copyright-year
copyright-holder
license
For backwards compatibility, <permissions> was added to places (such as <article-meta>) where one of the copyright elements was also allowed. The Permissions element does not replace the copyright elements that were already there; it is allowed in addition to them. The documentation will explain that using the Permissions wrapper is best practice, but previously tagged material will not need to be changed.
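A wrapper tagged in the order given above might look like the fragment below. All text content is invented for illustration, and the use of a paragraph inside <license> is an assumption about the license model.

```xml
<!-- Instance fragment; names and dates are invented. -->
<permissions>
  <copyright-statement>Copyright 2005 Example Press</copyright-statement>
  <copyright-year>2005</copyright-year>
  <copyright-holder>Example Press</copyright-holder>
  <license>
    <p>This work may be copied for noncommercial use.</p>
  </license>
</permissions>
```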
In order to make this change, the following were moved to the common module:
%license-atts;
%license-model;
<copyright-year>
<license>
Permissions have been added to:
<appendix>
<article-meta>
The parameter entity display-back-matter.class and thus to the default of the following elements:
<array>
<boxed-text>
<chem-struct-wrapper>
<disp-formula>
<disp-quote>
<fig>
<graphic>
<preformat>
<statement>
<supplementary-material>
<table-wrap>
<table-wrap-foot>
<verse-group>
The following changes were made in several modules. Each module has an updated change history.
List Item — The attribute list for <list-item> was made into a parameter entity, so that, for example, individual DTDs could change the attribute “id” from CDATA to type ID. (There was already a parameter entity for <list> to allow the same change.)
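The override mechanism can be sketched as the external DTD fragment below. The entity name %list-item-atts; is an assumption, since these notes do not name it. The customization's declaration takes effect because it is read first.

```xml
<!-- In a customization module, read before the Suite's list module:
     make the id attribute a true ID. -->
<!ENTITY % list-item-atts
  "id  ID  #IMPLIED">

<!-- In the Suite's module: the default declaration, which is ignored
     when the entity has already been declared. -->
<!ENTITY % list-item-atts
  "id  CDATA  #IMPLIED">

<!-- The element takes whatever attribute list is in effect. -->
<!ATTLIST list-item
  %list-item-atts;>
```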
Titles and Subtitle
The new element <journal-subtitle> was defined in %common.ent; and used in <journal-meta>.
Added <journal-subtitle> to <journal-meta> through the parameter entity %journal-meta-model;
Added <journal-subtitle> to the references class
The xml:lang attribute was associated with <subtitle> element.
Added the optional <trans-subtitle> element to the <article-meta> model through the parameter entity %title-group-model;.
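A title group using the subtitle changes might be tagged roughly as follows. The titles are invented, and the order of the elements is an assumption about %title-group-model; rather than a quotation of it.

```xml
<!-- Instance fragment; all titles are invented. -->
<title-group>
  <article-title>Water Quality in Urban Streams</article-title>
  <!-- xml:lang is now available on subtitle -->
  <subtitle xml:lang="en">A Ten-Year Survey</subtitle>
  <trans-title>Qualité de l'eau des cours d'eau urbains</trans-title>
  <!-- the new optional translated subtitle -->
  <trans-subtitle>Une enquête sur dix ans</trans-subtitle>
</title-group>
```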
Attribute Changes
Journal Identifier Attributes — The parameter entity %related-article-atts; was changed to use the parameter entity %journal-id-atts;. The entity %journal-id-atts; was moved to %common.ent; to allow this use.
doaj — New values “doaj” (Directory of Open Access Journals) and “manuscript” (Manuscript) were added to %pub-id-types; as well as to the list of suggested journal ID types.
Hard-coded Date Attributes — In the common modules, the parameter entity %date-atts; was defined, but not used on the <date> element. Since the attribute list was hard-coded at the element, it could not be over-ridden. The parameter entity is now used, allowing the over-ride to work as designed.
X-Generated Text — Added xml:space attribute with a value of “preserve” to the <x> element (per list request).
Both an old-style OASIS SOCAT and an XML catalog file will be delivered with the Suite. The XML catalog contains instructions for modifying and setting up the catalog.
<!DOCTYPE catalog PUBLIC "-//OASIS//DTD Entity Resolution XML Catalog V2.1//EN" "http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd">
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog" prefer="public">
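For orientation, a complete but minimal XML catalog could look like the sketch below. It maps one public identifier to a local file. Both the identifier and the path are assumptions chosen for illustration and should be checked against the delivered catalog.

```xml
<?xml version="1.0"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog"
         prefer="public">
  <!-- Resolve the DTD's public identifier to a local copy.
       Identifier and path are illustrative assumptions. -->
  <public publicId="-//NLM//DTD Book DTD v2.1 20050630//EN"
          uri="book.dtd"/>
</catalog>
```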
Most changes were reflections of Suite changes:
Permissions — The permissions wrapper has been added to:
<book-meta>
<book-part-meta>
<digital-edition-meta>
Translated Subtitle — Added optional <trans-subtitle> element to <book-meta> model through %book-title-group-model;.
There were additional minor changes:
Reference Types — Book needed the same new reference types which Historical already had, so moved the ref-type over-ride from the Historical override module to the Book override module. The Historical Tag Set first calls bookcustom-models then historical-models, so there is no content change in the Historical Tag Set.
<preformat> — Book needed to be able to use <named-content> inside <preformat>, which Historical could already do, because Historical mixes <named-content> into the emphasis class, to use for yellow highlighting, underlining, etc. This seemed useful for Book as well, so the emphasis override was created in the Book custom classes module to include <named-content>. Now both Book and Historical treat <named-content> as one of the emphasis elements, and use it almost anywhere. (Note that Historical cannot use the SAME override, because Historical also adds <annotation> to the emphasis class.) Since <named-content> had to be removed from the phrase.class to avoid clashes (as the emphasis class is more inclusive), phrase.class was also redefined to remove <named-content>.
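The pair of overrides described here can be sketched as the external DTD fragment below. The element lists are abbreviated and illustrative, not the Book Tag Set's actual class definitions. The clash being avoided is that a mixed-content model may not name the same element twice.

```xml
<!-- Book custom classes module (external DTD fragment).
     Element lists are abbreviated and illustrative. -->

<!-- named-content joins the emphasis elements. -->
<!ENTITY % emphasis.class
  "bold | italic | monospace | named-content | sc | sub | sup |
   underline">

<!-- named-content leaves the phrase class, so that a model mixing
     both classes does not list it twice. -->
<!ENTITY % phrase.class
  "abbrev">
```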
Removed the Parameter Entity %title-group-model;, since it was a remnant, not used in the book Tag Sets.
A Frequently Asked Questions page is available.
Available Schemas
In addition to the DTD format, the Tag Set is also available as a W3C XML schema and as a RELAX NG schema. Both are generated directly from the DTD and neither is intended for maintenance. See the individual schema pages for more information.
National Center for Biotechnology Information. Last updated: November 21, 2008