Tuesday, June 22, 2010

Health Information Exchanges (HIEs): Emerging Architectural Patterns

The following are some emerging architectural patterns in the nascent field of HIEs:

  • Decentralized and Hybrid Models utilizing a service-oriented architecture (SOA). A centralized registry stores metadata about the type and location of clinical data available in edge systems connected to the HIE. For privacy and security reasons, the clinical data itself is kept at its source as opposed to a centralized repository. Upon request, a Record Locator Service (RLS) finds the data in edge systems and securely routes the data to authorized requestors.

  • A Data Use and Reciprocal Support Agreement (DURSA) provides the legal framework for participation in the HIE.

  • Use of a master patient index (MPI) as a core infrastructure for patient matching. The MPI stores identifying information on all patients with records in participating systems.

  • Use of Integrating the Health Enterprise (IHE) profiles such as Patient Identity Cross-Reference (PIX) and Cross Enterprise Document Exchange (XSD.b) to facilitate patient discovery and the query and retrieval of clinical documents.

  • Health Information Event Messaging to provide the ability to subscribe to health information.

  • Interoperability with NHIN Exchange to enable exchange of data with other state and federal agencies such as the Department of Veterans Affairs (VA) and the Department of Defense (DoD). This requires support for NHIN messaging standards such as the Web Services Interoperability (WS-I) Profiles WS-I Basic v 2.0 and WS-I Security v 1.1. WS-I Basic specifies the use of SOAP 1.2, WSDL 1.1, WS-Addressing 1.0, WS-Policy 1.5 , MTOM 1.0 , and UDDI v3.0.2 for the NHIN Web Services Registry. WS-I Security defines the security standards for NHIN Exchange including TLS, AES 128, X.509, XML D-Sig, and Attachment Security. NHIN has adopted an authentication and authorization framework based on SAML 2.0.

  • Use of the HL7 Continuity of Care Documents (CCD) as the data exchange standards for clinical documents. Meaningful Use criteria allow the ASTM CCR specification as well.

  • Health Record Banks (HRBs) containing Personal Health Records (PHRs) as participating nodes in the HIE. The HRBs allow patients to exercise control over their health records by granting permissions to specific providers to view those health records.

  • Ability to connect to the HIE through a local EMR or a web-based portal (for example to allow access for physicians without an EMR).

  • For simple and secure interoperability, the NHIN Direct draft proposal at the time of this writing is to use:


    • SMTP as a backbone protocol
    • S/MIME-signed and encrypted messages for security
    • IHE XDM for content and metadata packaging
    • IHE XDR, REST, and Email (POP/IMAP) as edge protocols
    • TLS (with a server certificate only) for on-the-wire security
    • XDR as the backbone for NHIN Exchange.

Saturday, June 12, 2010

Putting XQuery to Work in Healthcare

The following are some of the challenges that healthcare organizations will be facing during the next few years:

  • Conversion from HIPAA 4010 to 5010
  • Conversion from ICD-9 to ICD-10
  • Efficiently storing, querying, processing, and exchanging Electronic Health Records (EHRs)
  • Mapping from HL7 2.x to HL7 v3 messages
  • Assembling EHRs by aggregating data from multiple organizations participating in Health Information Exchanges (HIEs).


XQuery is not just a query language for XML data sources. It is also a very powerful declarative, strongly typed, and side-effect free programming language for processing and manipulating XML documents. XQuery is a natural solution for querying and aggregating data coming from heterogeneous sources such as relational databases, native XML databases, file systems, and legacy data formats such as EDI. Some developers will find XQuery easier to use than XSLT because XQuery has a SQL-like syntax.


Migration to HIPAA 5010 and ICD 10

Conversion from HIPAA 4010 to 5010 and ICD-9 to ICD-10 will be a priority on the agenda in the next three years (details on final compliance dates can be found on this HHS web page).

The XQuery and XQuery Update Facility specifications provide a simple and elegant solution to this conversion challenge.


Health Information Exchanges (HIEs) and the Virtual Health Record

In a HIE with multiple participating organizations, EHR data must be assembled either through a centralized, federated, or hybrid data model. The data needed to assemble a longitudinal EHR (a virtual health record) for a patient could be coming from several providers, payers, lab companies, and medical devices. XQuery was designed to handle that type of XML processing use case.


Storing, Updating, and Querying EHRs

The HL7 CCD and ASTM CCR have been retained as Meaningful Use XML data exchange standards for EHRs. Mapping between an XML HL7 CCD representation (which is derived from the HL7 UML-based Reference Information Model or RIM) and an existing relational database structure is not trivial. IBM has been granted a patent entitled "Conversion of hierarchically-structured HL7 specifications to relational databases". The HL7 RIMBAA project provides some best practices on mapping RIM objects to a relational database structure.

With the emergence of native XML databases such as Oracle XML DB and IBM pureXML, XML is no longer just a messaging format. It can be used as a format for storing and querying data as well.

This article shows a sample code of updating an EHR stored in HL7 CDA format in an IBM DB2 pureXML native XML database.


Mapping from HL7 2.x to HL7 v3 messages

In countries like Canada where HL7 v3 has been adopted, a frequent challenge is to map from legacy HL7 2.x messages to HL7 v3 messages for example for lab results. An XQuery-based transform can be used to map from an HL7 v2.x XML structure to an HL7 v3 XML structure.


An Alternative to GELLO?

In a previous post entitled "Clinical Decision Support: Crossing the Chasm", I argued that Clinical Decision Support Systems (CDSS) implementers should be free to use any programming language of their choice. GELLO is an HL7 standard which specifies an expression and query language for CDSS. The following are the requirements for a CDSS expression and query language as specified in the GELLO specification:

  • vendor-independent
  • platform-independent
  • object-oriented and compatible with the vMR
  • easy to read/write
  • side-effect free
  • flexible
  • extensible


XQuery satisfies all these requirements except the third. XQuery is a functional programming language with no side effect as opposed to an object-oriented programming language. GELLO settled on the OMG Object Constraint Language (OCL). The following paragraph from the GELLO specification explains why XQuery (known as XQL at the time) wasn't selected:

XQL is a query language designed specifically for XML documents. XML documents are unordered, labeled trees, with nodes representing the document entity, elements, attributes, processing instructions and comments. The implied data model behind XML neither matches that of a relational data model nor that of an object-oriented data model. XQL is a query language for XML in the same sense as SQL is a query language for relational tables. Since the HL7 RIM data model and the vMR data model are both object-oriented, it is clear that XQL is not an appropriate approach for an object-oriented query and expression language.


That might have been true back in 2004 in an object-oriented world. Today, if the inputs to a CDSS are EHRs represented in HL7 CCD or ASTM CCR format, and those EHRs are stored in an XQuery compliant native XML database, then XQuery could be a strong candidate for an expression and query language for the CDSS.

Wednesday, June 9, 2010

Data Modeling for Electronic Health Records (EHR) Systems

Getting the data model right is of paramount importance for an Electronic Health Records (EHR) system. The factors that drive the data model include but are not limited to:

  • Patient safety
  • Support for clinical workflows
  • Different uses of the data such as input to clinical decision support systems
  • Reporting and analytics
  • Regulatory requirements such as Meaningful Use criteria.

Model First

Proven methodologies like contract-first web service design and model driven development (MDD) put the emphasis on deriving application code from the data model and not the other way around. Thousands of line of code can be auto-generated from the model, so it's important to get the model right.


Requirements Gathering

The objective here is to determine the entities, their attributes, and the relationships between those entities. For example, what are the attributes that are necessary to describe a patient's condition and how do you express the fact that a condition is a manifestation of an allergy? The data modeler should work closely with clinicians to gather those requirements. Industry standards should be leveraged as well. For example, HITSP C32 defines the data elements for each EHR data module such as conditions, medications, allergies, and lab results. These data elements are then mapped to the HL7 Continuity of Care Document (CCD) XML schema.

The HL7 CCD is itself derived from the HL7 Reference Information Model (RIM). The latter is expressed as a set of UML class diagrams and is the foundation model for health care and clinical data. A simpler alternative to the CCD is the ASTM Continuity of Care Records (CCR). Both the CCD and CCR provide an XML schema for data exchange and are Meaningful Use criteria. Another relevant data model is the HL7 vMR (Virtual Medical Record) which aims to define a data model for the input and output of Clinical Decision Support Systems (CDSS).

These standards can be cumbersome to use as such from a software development perspective. Nonetheless, they can inform the design of the data model for an EHR system. Alignment with the CCD and CCR will facilitate data exchange with other providers and organizations. The following are Meaningful Use criteria for data exchange:

  1. Electronically receive a patient summary record, from other providers and organizations including, at a minimum, diagnostic test results, problem list, medication list, medication allergy list, immunizations, and procedures and upon receipt of a patient summary record formatted in an alternative standard specified in Table 2A row 1, displaying it in human readable format.

  2. Enable a user to electronically transmit a patient summary record to other providers and organizations including, at a minimum, diagnostic test results, problem list, medication list, medication allergy list, immunizations, and procedures in accordance with the standards specified in Table 2A row 1.



Applying Data Modeling Patterns

Applying data modeling patterns allows model consistency and quality. Relational data modeling is a well established discipline. My favorite resource for relational data modeling patterns is: The Data Model Resource Book, Vol. 3: Universal Patterns for Data Modeling.

Some XML Schema best practices can be found here.


Data Stores

Today, options for data store are no longer limited to relational databases. Alternatives include: native XML databases (e.g. DB2 pureXML), Entity-Attribute-Value with Classes and Relationships (EAV/CR), and Resource Description Framework (RDF) stores.

Native XML databases are more resilient to schema changes and do not require handling the impedance mismatch between XML documents, Java objects, and relational tables which can introduce design complexity, performance, and maintainability issues.

Storing EHRs in an RDF store can enable the inference of medical facts based on existing explicit medical facts. Such inferences can be driven by an ontology expressed in OWL or a set of rules expressed in a rule language such SWRL. Semantic Web technologies can also be helpful in checking the consistency of a model, data and knowledge integration across domains (e.g. the genomics and clinical domains), and for managing classification schemes like medical terminologies. RDF, OWL, and SWRL have been successfully implemented in Clinical Decision Support Systems (CDSS).

The data modeling notation used should be independent of the storage model or at least compatible with the latter. For example, if native XML storage is used, then a relational modeling notation might not be appropriate. In general, UML provides the right level of abstraction for implementation-agnostic modeling.


Due Diligence

When adopting a "noSQL" storage model, it is important to ensure that (a) the database can meet performance and scalability criteria and (b) the team has the skills to develop and maintain the database. Due diligence should be performed through benchmarking using a tool such as the IBM Transaction Processing over XML (TpoX). The team might need formal training in a new query language like XQuery or SPARQL.


A Longitudinal View of the Patient Health

Maintaining an up-to-date and truly longitudinal view of a patient's medical history requires merging and reconciling data from heterogeneous sources including providers' EMR systems, lab companies, medical devices, and payers' claim transaction repositories. The data model should facilitate the assembly of data from such diverse sources. XML tools based on XSLT, XQuery, or XQuery Update can be used to automate the merging.


The Importance of Data Validation

Data validation can be performed at the database layer, the application layer, and the UI layer. The data model should support the validation of the data. The following are examples of techniques that can be used for data validation:

  • XML Schema for structural validation of XML documents
  • ISO Schematron (based on XPath 2.0 and XSLT 2.0) for business rules validation of XML documents
  • A business rules engine like Drools
  • A data processing framework like Smooks
  • The validation features of a UI framework such as JSF2
  • The built-in validation features of the database.


The Future: Modeling with the NIEM IEPD


The HHS ONC issued an RFP for using the National Information Exchange Model (NIEM) Information Exchange Package Documentation (IEPD) process for healthcare data exchange. The ONC will release a NIEM Concept of Operations (ConOps). The NIEM IEPD process is explained here.