Saturday, May 24, 2008

S1000D Content Reuse for Aircraft Documentation

One of the justifications for moving to an XML-based S1000D content management system (CMS) is the ability to reduce cost and improve quality by reusing content. In the aerospace industry, hundreds of thousands of pages of maintenance and operation documentation are produced and maintained for every new aircraft project. Warnings and cautions are a good example of reuse in aerospace documentation. They describe hazards that may cause injury, death, or damage to the aircraft. For product liability reasons, these warnings and cautions are carefully reviewed and approved by qualified personnel. Technical authors may be required to reuse these warnings and cautions verbatim across all documents. In this post, I will discuss some principles and practices that facilitate S1000D content reuse.

From a technical perspective, the key to successful reuse in S1000D is the W3C XInclude specification. The S1000D specification does not make reference to XInclude. The reason is that earlier versions of S1000D were based on SGML. Some S1000D CMSs still rely on the SGML/XML 1.0 external parsed entity mechanism for implementing reuse. This approach has several limitations and should be avoided. The preferred approach in modern XML content applications is to use XInclude, which allows the transclusion not only of whole chunks of XML content, but also of elements (addressed using XPath/XPointer) within those chunks. The following are some examples:

<xi:include href="dm.xml"/>
<xi:include href="dm.xml" xpointer="warning-001"/>

In the first example, a data module file named dm.xml is included. In the second example, an element with ID value "warning-001" within the data module is included.
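To make the two forms of inclusion concrete, here is a minimal Python sketch of what an XInclude processor does with them. It is an illustration under stated assumptions, not a conforming implementation: the STORE dictionary and its sample data module are hypothetical stand-ins for files, a bare xpointer value is assumed to match an attribute literally named "id" (a real processor relies on DTD or schema ID typing), and includes nested inside included content are not followed.

```python
import copy
import xml.etree.ElementTree as ET

XI_INCLUDE = "{http://www.w3.org/2001/XInclude}include"

# Hypothetical stand-in for data module files; a real system would read them
# from the file system or from the CSDB.
STORE = {
    "dm.xml": (
        "<dmodule><content>"
        '<warning id="warning-001"><para>Disconnect electrical power.</para></warning>'
        '<para id="para-001">Remove the access panel.</para>'
        "</content></dmodule>"
    ),
}


def resolve_includes(elem, store):
    """Replace every xi:include below elem, in place, with the content it targets."""
    for index, child in enumerate(list(elem)):
        if child.tag != XI_INCLUDE:
            resolve_includes(child, store)
            continue
        target = ET.fromstring(store[child.get("href")])
        pointer = child.get("xpointer")
        if pointer is not None:
            # Assumption: a bare pointer names the value of an "id" attribute.
            matches = [e for e in target.iter() if e.get("id") == pointer]
            if not matches:
                raise LookupError(f"no element with id {pointer!r} in {child.get('href')}")
            target = matches[0]
        replacement = copy.deepcopy(target)
        replacement.tail = child.tail  # keep the text that followed the include
        elem[index] = replacement
    return elem


doc = ET.fromstring(
    '<procedure xmlns:xi="http://www.w3.org/2001/XInclude">'
    '<xi:include href="dm.xml" xpointer="warning-001"/>'
    "<step>Remove the pump.</step>"
    "</procedure>"
)
resolve_includes(doc, STORE)
print(ET.tostring(doc, encoding="unicode"))
```

After the call, the procedure contains a copy of the warning in place of the xi:include element, followed by the original step.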

Using XInclude in an S1000D content application requires some modifications to the XML Schema used for the authoring of data modules to allow the insertion of xi:include elements. However, these modifications will still produce valid S1000D documents since you're not altering the structure of your documents, but rather simply modularizing the content.

While we are on the subject of inclusion, the XLink specification can be used as a simpler alternative to the XML 1.0 unparsed entity and notation mechanism (another concept inherited from SGML) for including illustrations into S1000D documents.

At the DocTrain 2007 conference in Boston, I gave a presentation on how to integrate training and documentation using S1000D and the Shareable Content Object Reference Model (SCORM) specification. One way to reuse S1000D content in SCORM is to assign a unique ID to all elements in S1000D data modules (DMs) that are reusable, such as paragraphs, steps, warnings, cautions, notes, tables, etc. This can be done automatically using the XSLT generate-id() function. The instructional designer then searches the S1000D common source database (CSDB) to find and display relevant DMs. She can then use XInclude to include reusable elements from S1000D DMs into SCORM shareable content objects (SCOs). When this is done, the SCOs are automatically updated when the DMs are updated.
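As a rough illustration of the ID-assignment step, the following Python sketch does the same job outside XSLT. The set of reusable element names, the "id" attribute name, and the prefix-tag-counter naming scheme are assumptions made for the example. Whichever tool is used, the IDs need to be written back into the data modules once, because XSLT processors are not obliged to return the same generate-id() value on every run and an include can only target an ID that stays put.

```python
import xml.etree.ElementTree as ET

# Assumed element names for reusable content; adjust to the schema in use.
REUSABLE = {"para", "step", "warning", "caution", "note", "table"}


def assign_ids(root, prefix):
    """Give every reusable element without an id a unique, readable one."""
    used = {e.get("id") for e in root.iter() if e.get("id") is not None}
    counters = {}
    for elem in root.iter():
        if elem.tag not in REUSABLE or elem.get("id") is not None:
            continue
        # Advance the per-tag counter until the candidate id is unused.
        while True:
            counters[elem.tag] = counters.get(elem.tag, 0) + 1
            candidate = f"{prefix}-{elem.tag}-{counters[elem.tag]:03d}"
            if candidate not in used:
                break
        elem.set("id", candidate)
        used.add(candidate)
    return root


dm = ET.fromstring(
    "<dmodule><content>"
    '<warning id="dm1-warning-001"><para>Depressurize the system.</para></warning>'
    "<warning><para>Wear eye protection.</para></warning>"
    "</content></dmodule>"
)
assign_ids(dm, "dm1")
print([(e.tag, e.get("id")) for e in dm.iter() if e.get("id")])
```

Existing IDs are left untouched, so running the function again on an already processed data module changes nothing.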

Successful S1000D reuse requires adherence to the principle of context-agnostic content. For example, to make it possible to reuse a warning across multiple documents in different contexts, one should avoid formulations such as "refer to the illustration in the next section" inside the warning.

Enforcing the principle of context-agnostic content can be semi-automated using an assertion-based schema language like ISO Schematron to report the occurrence of keywords such as "previous", "next", "below", etc. The warning shall be routed through a comprehensive review and approval workflow provided by the CMS before final publication. The principle of business rules definitions and enforcement ensures that reusable content is of the highest quality. Consider a dual-purpose data module that is written to be reused by both training and publications. A business rule could require the use of a certain language style (e.g. active as opposed to passive voice) for the dual-purpose data module.
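The keyword check itself is simple enough to sketch in a few lines of Python. This is not Schematron, only an approximation of what such a rule reports; the element names "warning" and "caution", the "id" attribute, and the sample content are assumptions, and the keyword list contains just the three words mentioned above.

```python
import re
import xml.etree.ElementTree as ET

# Keywords taken from the examples in the text; extend the list as needed.
CONTEXT_WORDS = re.compile(r"\b(previous|next|below)\b", re.IGNORECASE)


def find_context_dependent(root, tags=("warning", "caution")):
    """Return (element id, keyword) pairs for context-dependent wording."""
    findings = []
    for elem in root.iter():
        if elem.tag not in tags:
            continue
        text = "".join(elem.itertext())  # all text inside the element
        for match in CONTEXT_WORDS.finditer(text):
            findings.append((elem.get("id"), match.group(0).lower()))
    return findings


dm = ET.fromstring(
    "<content>"
    '<warning id="warning-001"><para>Disconnect electrical power.</para></warning>'
    '<warning id="warning-002"><para>Refer to the illustration in the next section.</para></warning>'
    "</content>"
)
print(find_context_dependent(dm))  # [('warning-002', 'next')]
```

A check like this only flags candidates. A human reviewer still has to decide whether each hit really ties the warning to its surrounding context.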

Another principle that can help when the content cannot be context-agnostic is the parameterization of reusable content. With parameterization, you include variable references in the reusable content that are resolved at run time. The eXist XML database has an elegant way of handling this using a combination of XInclude and XQuery, as in the following example:

<xi:include href="warning.xq?var1=material&amp;var2=process"/>

Here warning.xq is a stored XQuery which is compiled and executed by eXist to return the root element of the warning. The content of the warning depends on the material and process used to carry out the maintenance procedure. var1 and var2 are passed as global external variables to the XQuery.
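The mechanics of a parameterized include can be imitated in Python to show the idea. This sketch does not use eXist or XQuery: build_warning is a hypothetical stand-in for the stored query, the warning wording and the values "solvent" and "degreasing" are invented for the example, and the href is shown as it looks after XML parsing, with the ampersand already unescaped.

```python
import xml.etree.ElementTree as ET
from urllib.parse import parse_qs, urlsplit


def build_warning(material, process):
    """Hypothetical stand-in for the stored query: returns a warning element."""
    warning = ET.Element("warning")
    para = ET.SubElement(warning, "para")
    para.text = f"Wear protective gloves when handling {material} during {process}."
    return warning


def resolve_parameterized(href):
    """Read var1 and var2 from the include href and build the warning."""
    query = parse_qs(urlsplit(href).query)
    return build_warning(query["var1"][0], query["var2"][0])


warning = resolve_parameterized("warning.xq?var1=solvent&var2=degreasing")
print(ET.tostring(warning, encoding="unicode"))
```

The same reviewed warning text is therefore stored once, and only the variable parts differ between the procedures that include it.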

The issue of content granularity is directly related to the principle of context-agnostic content. Although the data module is the basic unit of information in S1000D, content can be managed at a lower level of granularity. An interesting feature of some XML editors is the ability to select an element inside an XML document and convert that element into an XIncluded file. So while a technical author is writing a warning inside a data module, she can pull out that warning as an XIncluded XML file if she determines that the warning could be reusable in other publications.
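What such an editor feature does behind the scenes can be sketched as follows. This is an assumption about the general technique, not a description of any particular editor; the store dictionary stands in for the file system, and the file name and sample content are hypothetical.

```python
import copy
import xml.etree.ElementTree as ET

XI_NS = "http://www.w3.org/2001/XInclude"
ET.register_namespace("xi", XI_NS)  # serialize the namespace with the xi prefix


def extract_to_include(parent, child, href, store):
    """Move child into store[href] and leave an xi:include in its place."""
    extracted = copy.copy(child)
    extracted.tail = None  # the trailing text stays in the source document
    store[href] = ET.tostring(extracted, encoding="unicode")

    include = ET.Element(f"{{{XI_NS}}}include", {"href": href})
    include.tail = child.tail
    parent[list(parent).index(child)] = include
    return include


store = {}
content = ET.fromstring(
    "<content>"
    "<warning><para>Depressurize the hydraulic system.</para></warning>"
    "<para>Remove the actuator.</para>"
    "</content>"
)
extract_to_include(content, content[0], "warning-hydraulic.xml", store)
print(store["warning-hydraulic.xml"])
print(ET.tostring(content, encoding="unicode"))
```

The data module keeps its structure, and the warning now lives in its own file where other publications can include it.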

Another area where XQuery facilitates reuse is the dynamic assembly of content based on product attributes such as applicability, security, and skill level. S1000D has a comprehensive metadata facility called IDSTATUS that can be leveraged to filter content. A good example is applicability filtering. In the case of an aircraft, the applicability of an S1000D maintenance or operation procedure can depend on the following attributes and conditions (among others):

  1. Manufacturer serial number
  2. Aircraft registration number
  3. Service bulletin incorporation
  4. Location of maintenance
  5. Aviation regulations
  6. Temperature, wind speed, and sandy conditions.

XInclude and XQuery can be used together to package content into S1000D publication modules by executing queries that filter content based on metadata in the IDSTATUS.
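The filter-then-assemble idea can be illustrated with a deliberately simplified Python sketch. Real S1000D applicability is far richer than this: the DataModule class, its serial number range and service bulletin fields, the file names, and the "pm" wrapper element are all hypothetical simplifications, and plain Python filtering takes the place of XQuery.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

XI_NS = "http://www.w3.org/2001/XInclude"
ET.register_namespace("xi", XI_NS)


@dataclass(frozen=True)
class DataModule:
    """Hypothetical, heavily reduced applicability metadata for one DM."""
    href: str
    serial_from: int
    serial_to: int
    required_bulletins: frozenset = frozenset()


def assemble(modules, serial, incorporated_bulletins):
    """Build a publication module that includes only the applicable DMs."""
    pm = ET.Element("pm")
    for dm in modules:
        in_range = dm.serial_from <= serial <= dm.serial_to
        bulletins_ok = dm.required_bulletins <= incorporated_bulletins  # subset test
        if in_range and bulletins_ok:
            ET.SubElement(pm, f"{{{XI_NS}}}include", {"href": dm.href})
    return pm


modules = [
    DataModule("dm-pump-removal.xml", 1, 50),
    DataModule("dm-pump-removal-sb.xml", 1, 50, frozenset({"SB-29-001"})),
    DataModule("dm-pump-removal-late.xml", 51, 999),
]
pm = assemble(modules, serial=12, incorporated_bulletins={"SB-29-001"})
print([include.get("href") for include in pm])
```

For aircraft serial number 12 with the bulletin incorporated, the first two data modules are included and the third is filtered out.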

An important condition for content reuse is the principle of discoverability of reusable content. Obviously, you cannot reuse a piece of content if you don't know that it exists and where to find it. A technical author should be able to query or browse the S1000D CSDB to find relevant reusable content. To facilitate enterprise-wide content reuse, I highly recommend a CSDB based on a native XQuery-compliant XML database and deployed as a web application. That will allow authors to perform both full-text and structured queries on the CSDB. The query should return a list of data modules or reusable chunks. The author should then be able to select the reusable chunk to automatically insert an XInclude targeting that chunk.

In support of the principle of reusable content discoverability, appropriate metadata should be added to the content. The DMs already have comprehensive metadata in the IDSTATUS section. Reusable content at a lower level of granularity (like a warning) should also have appropriate metadata specified.

An XQuery-enabled native XML database can help with the governance of your reuse initiative by providing powerful reporting capabilities. For example, you can easily run an XQuery to find all documents that contain an XInclude to a particular chunk. This is important for understanding the impact of updates to that chunk. Another potential issue that could require some attention is the versioning of reusable content. Some form of notification mechanism can be helpful to alert consumers to changes to reusable content. This can take the form of an Atom feed to which consumers can subscribe.
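A "where used" report of this kind boils down to scanning every document for xi:include elements that point at the chunk in question. The Python sketch below shows the logic on a small in-memory collection. The DOCUMENTS dictionary and its file names are hypothetical, and a native XML database would answer the same question with an indexed XQuery rather than by parsing every document.

```python
import xml.etree.ElementTree as ET

XI_INCLUDE = "{http://www.w3.org/2001/XInclude}include"
XI_DECL = 'xmlns:xi="http://www.w3.org/2001/XInclude"'

# Hypothetical document collection standing in for the CSDB.
DOCUMENTS = {
    "pump-removal.xml": f'<proc {XI_DECL}><xi:include href="warnings.xml" xpointer="warning-001"/></proc>',
    "pump-install.xml": f'<proc {XI_DECL}><xi:include href="warnings.xml" xpointer="warning-002"/></proc>',
    "wheel-change.xml": "<proc><step>Jack the aircraft.</step></proc>",
}


def where_used(documents, href, pointer=None):
    """List the documents that include href, optionally a specific element of it."""
    users = []
    for name, xml_text in documents.items():
        root = ET.fromstring(xml_text)
        for include in root.iter(XI_INCLUDE):
            same_file = include.get("href") == href
            same_element = pointer is None or include.get("xpointer") == pointer
            if same_file and same_element:
                users.append(name)
                break  # one hit per document is enough for the report
    return sorted(users)


print(where_used(DOCUMENTS, "warnings.xml"))
print(where_used(DOCUMENTS, "warnings.xml", "warning-001"))
```

The first call lists both pump procedures. The second narrows the report to the single document that includes warning-001.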

It is important to select an XML authoring tool that has good support for XInclude. Fortunately, some commercial XML editors now have decent support for XInclude. However, these XML editors remain complex specialist tools that are often used only by professional technical authors in documentation departments. At one of our aerospace customers, manufacturing assembly and functional test procedures were used to create installation and testing procedures for service publications. To allow their engineers to contribute S1000D content, we designed a light XML authoring application based on an XForms front-end with XML data persisted in a native XML database using a RESTful API.

Any data reuse strategy should look beyond training and publication to identify opportunities to reuse data and streamline processes across the entire aircraft lifecycle.

1 comment:

Denis Corbeil said...

Good article Joel,

As you may be aware, Bombardier CSeries program is going with ASD S1000D specification and XML for the Technical Publications, Training, and looking at it for Operations (FTP).

Regards,

Denis Corbeil
Manager, CSeries
Technical Publications