Sunday, December 11, 2011

The New Economics of Healthcare and What It Means for Health IT

Healthcare expenditures represented 17.6% of US GDP in 2009 and continue to rise. In a recently released report titled "Therapy or Surgery? A Prescription for Canada's Health Systems", Don Drummond, former chief economist at TD Bank, warned that "assuming total program spending and revenues grow at the nominal GDP growth rate of 4 percent, healthcare would comprise 80 percent of the Ontario budget by 2030, up from 46 percent today."

To address quality and cost challenges, the concept of a value-based healthcare system is gaining significant ground. Articles on the subject have recently appeared in prestigious publications such as the New England Journal of Medicine (NEJM) and Health Affairs. There is no clear consensus among industry experts that initiatives such as the Accountable Care Organization (ACO) and the Patient-Centered Medical Home (PCMH) will lead to dramatic improvements in quality and cost reductions. However, everyone seems to agree that the ultimate goal of a healthcare system should be to maximize positive patient health outcomes per dollar spent.

Measuring Healthcare Value

Michael Porter is arguably the leading thinker on the subject of healthcare value. In an article titled "What Is Value in Health Care?" that appeared in the New England Journal of Medicine, he wrote:
"Value should always be defined around the customer, and in a well-functioning health care system, the creation of value for patients should determine the rewards for all other actors in the system."

Current process measures and the regulatory requirements to report them are necessary but not sufficient. The healthcare industry seems to have escaped the business process reengineering and quality improvement movements that permeated many industries in the nineties. Fortunately, thanks to the hard work of visionary leaders like Donald Berwick, former Administrator of the US Centers for Medicare & Medicaid Services (CMS) and co-founder of the Institute for Healthcare Improvement (IHI), quality measures reporting is now an essential component of incentive programs such as Meaningful Use and the ACO model. Healthcare transformation during the next few years will focus on the migration from a paradigm based on the volume of services delivered to one based on measuring value (patient-centered outcomes) as well as the total costs incurred throughout the care cycle to achieve those outcomes.

Patient-centered outcome measures include essential metrics such as mortality, functional status, time to recovery, severity of side effects, and remission (e.g., depression remission at six and twelve months). These measures should take into account the values, goals, and wishes of the patient. Therefore, patient-centered outcomes should also include the patient's own evaluation of the care received. One important implication of this shift from fee-for-service to value is the growing importance of wellness, prevention, early screening, and disease and population management. In short, the elimination of waste and the optimization of healthcare delivery through the standardization of care pathways and treatment protocols based on the latest scientific evidence will become a top priority.


Measuring Healthcare Costs

The total costs incurred throughout the care cycle represent the assets used to achieve those outcomes: employees, consumables, facilities, medical devices, equipment, energy, and computing resources such as software and hardware. Measuring these costs would appear intuitive and straightforward at first. However, in the US healthcare system, Medicare reimbursement is based not on the actual usage of resources, but on so-called relative value units (RVUs). Service charges are calculated based on a formula that includes three RVUs: one for physician work, one for practice expense, and one for malpractice expense. Similar reimbursement schemes are used in other developed countries as well.
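
As an illustration, the Medicare Physician Fee Schedule multiplies each of the three RVUs by a geographic practice cost index (GPCI) and the sum by a national conversion factor. Here is a minimal sketch of that arithmetic in Java; the RVU, GPCI, and conversion factor values below are purely illustrative, not actual fee schedule values:

    // Sketch of the Medicare Physician Fee Schedule formula:
    // payment = (workRvu*workGpci + peRvu*peGpci + mpRvu*mpGpci) * conversionFactor
    public class RvuPayment {

        static double payment(double workRvu, double peRvu, double mpRvu,
                              double workGpci, double peGpci, double mpGpci,
                              double conversionFactor) {
            double totalRvu = workRvu * workGpci + peRvu * peGpci + mpRvu * mpGpci;
            return totalRvu * conversionFactor;
        }

        public static void main(String[] args) {
            // Hypothetical office visit: 1.42 work RVUs, 1.20 practice expense RVUs,
            // 0.07 malpractice RVUs, neutral geography (all GPCIs = 1.0).
            double charge = payment(1.42, 1.20, 0.07, 1.0, 1.0, 1.0, 34.0);
            System.out.printf("Allowed charge: $%.2f%n", charge); // $91.46
        }
    }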

In an article titled "The Big Idea: How to Solve the Cost Crisis in Health Care", Michael Porter and Robert Kaplan proposed time-driven activity-based costing (TDABC) as a more accurate methodology for measuring costs in healthcare. Bundled Payment (using episodes of care as a basis for payment and value measurement) is emerging as a solution to contain healthcare costs.
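
To make TDABC concrete, its two core parameters are the capacity cost rate of each resource (cost of capacity supplied divided by practical capacity) and the time each resource is consumed by a care process step. A minimal sketch, with invented resource names and numbers:

    import java.util.List;

    // Time-driven activity-based costing (TDABC) sketch.
    // cost of a step = capacityCostRate * minutesUsed, where
    // capacityCostRate = cost of capacity supplied / practical capacity in minutes.
    public class TdabcSketch {

        record Resource(String name, double monthlyCost, double practicalCapacityMinutes) {
            double capacityCostRate() { return monthlyCost / practicalCapacityMinutes; }
        }

        record ProcessStep(Resource resource, double minutesUsed) {
            double cost() { return resource.capacityCostRate() * minutesUsed; }
        }

        public static void main(String[] args) {
            Resource nurse = new Resource("Nurse", 8_000.0, 9_600.0);          // ~$0.83/min
            Resource physician = new Resource("Physician", 24_000.0, 8_000.0); // $3.00/min

            List<ProcessStep> visit = List.of(
                    new ProcessStep(nurse, 15),       // intake and vitals
                    new ProcessStep(physician, 20));  // examination

            double total = visit.stream().mapToDouble(ProcessStep::cost).sum();
            System.out.printf("Cost of the visit: $%.2f%n", total); // $72.50
        }
    }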


Implications for Healthcare Information Technology

The trend toward bundled payments and other cost containment schemes will put increasing pressure on providers to maximize the value of the assets involved in the care cycle. This will be achieved by optimizing the decision-making process with software that can analyze large data sets and make specific and accurate recommendations faster than humanly possible.

In the clinical domain in particular, the following are examples of health IT solutions that will make a big difference in maximizing value for patients:

  • Automated execution of Clinical Practice Guidelines (CPGs), Care Pathways (CPs), and treatment protocols using technologies such as business rules, predictive analytics, and Business Process Management (BPM) (see the rule sketch at the end of this list).

  • Creation of disease registries as well as secondary use of EHR data to track patient outcomes and compliance with CPGs, and to support Comparative Effectiveness Research (CER) and Patient-Centered Outcome Research (PCOR). Advanced analytics will play an important role in providing insights into clinical data on the effectiveness of various treatment options based on the clinical profile of a specific patient or subpopulation of patients. Personalized medicine leveraging advances in genomics will play an important role here as well. This will require computing power to handle the large data sets involved in making clinical decisions based on genomic data.

  • Applications of Knowledge Representation, Reasoning, Natural Language Processing (NLP), Speech Recognition, Information Retrieval, and Machine Learning to enable next generation Clinical Question Answering (CQA).

  • Clinical Knowledge Management (CKM) to support a learning health system. The Institute of Medicine (IOM) released a report earlier this year titled "Digital Infrastructure for the Learning Health System: The Foundation for Continuous Improvement in Health and Health Care". The report describes the learning health system as:

    "delivery of best practice guidance at the point of choice, continuous learning and feedback in both health and health care, and seamless, ongoing communication among participants, all facilitated through the application of IT."


  • Social Health Enterprise tools that allow clinicians to communicate and collaborate beyond email.

  • Leveraging mobile devices and tablets to provide information and cognitive support to clinicians, patients, and caregivers while enforcing strict security.

  • Healthcare interoperability standards. As Doug Fridsma, Director of the Office of Standards and Interoperability, puts it in a recent blog post: "standards are not optional". Standardization at the data, security, and transport level is necessary. However, care should be taken to ensure that these standards can be widely implemented by health IT vendors. Adopting healthcare profiles of cross-industry standards and creating open source reference implementations using the tools and techniques developers are familiar with (as was done by the ONC-sponsored Direct Project) can help meet that objective.

    On the other hand, over-standardization in areas that are going through a rapid rate of technological innovation could have a negative impact.

  • Applications of human factors research to enable the effective use of technology in clinical settings. Examples include: implementation of usability guidelines to reduce alert fatigue in clinical decision support (CDS) and the use of speech recognition, checklists, simulation, Visual Analytics, and disease-specific documentation templates or Smart Forms.

    There are many lessons to be learned from other mission-critical industries that have adopted automation. Following several incidents and accidents related to the introduction of the "Glass Cockpit" about 25 years ago, human factors training known as Cockpit Resource Management (CRM) is now standard practice in the aviation industry.

  • Lastly, Cloud Computing and Service-Oriented Architecture (SOA) will allow health enterprises to reduce costs and share computing resources while focusing on their core competency: medical care.
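
Returning to the first item in the list above, here is a minimal sketch of a guideline encoded as an executable rule. The domain classes and the guideline threshold are invented for illustration; a production system would typically express this in a rule engine such as Drools rather than in hand-coded conditionals:

    import java.time.LocalDate;
    import java.time.Period;
    import java.util.ArrayList;
    import java.util.List;

    // Illustrative guideline: diabetic patients should have an HbA1c test
    // at least every six months.
    public class GuidelineRule {

        record Patient(String id, boolean diabetic, LocalDate lastHbA1cTest) {}

        static List<String> evaluate(Patient p, LocalDate today) {
            List<String> recommendations = new ArrayList<>();
            if (p.diabetic()) {
                boolean overdue = p.lastHbA1cTest() == null
                        || Period.between(p.lastHbA1cTest(), today).toTotalMonths() >= 6;
                if (overdue) {
                    recommendations.add("Order HbA1c test for patient " + p.id());
                }
            }
            return recommendations;
        }

        public static void main(String[] args) {
            Patient p = new Patient("pat-001", true, LocalDate.of(2011, 1, 15));
            evaluate(p, LocalDate.of(2011, 12, 11)).forEach(System.out::println);
        }
    }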

Saturday, November 12, 2011

Thoughts on the Query Health Initiative

I have been following the Query Health Initiative of the ONC Standards and Interoperability Framework with great interest. The following are key goals of the Query Health Initiative:

  • Identify standards and services for distributed population health queries to EHRs, HIEs, and other clinical data sources such as registries.

  • Define a framework to allow partners to create their own distributed query networks (with or without intermediaries called "Network Data Partners"). I believe that the solution should not mandate a specific implementation in order to foster innovation in the field.

  • Support a number of use cases such as quality measures reporting, public health surveillance, comparative effectiveness research (CER), and patient-centered outcome research (PCOR).

  • Support queries over a common and extensible Clinical Information Model (CIM).

  • Support security, audit trails, privacy, patient consent directives, and other policy and legal requirements. Techniques such as the de-identification of data will be essential to maintaining privacy.

  • Create a solution that can be implemented with a financially sustainable model. I think there are many lessons to be learned here from failed or struggling Health Information Exchange (HIE) initiatives.


The Query Health Initiative supports a distributed model as opposed to a centralized model. This allows data to be kept in the originating systems and securely queried and aggregated.

Previous initiatives to create such a distributed query health network include:

  • i2b2 (Informatics for Integrating Biology and the Bedside) and SHRINE (Shared Health Research Information Network) - a scalable informatics framework that will enable clinical researchers to use existing clinical data as well as genomic data for research and discovery.

  • hQuery - an open source project by MITRE which leverages the ability of certified EHRs to produce C32 or CCR documents. hQuery is based on a MongoDB document database and uses JavaScript Map and Reduce functions.

  • PopMedNet – a multi-purpose distributed network for secondary use of EHR, administrative, claims, and registry data.


In addition, large health enterprises are investing considerable efforts and resources in building clinical data warehouses and analytics capabilities for their own internal needs. Examples are the Enterprise Data Trust at Mayo Clinic and the STRIDE (Stanford Translational Research Integrated Database Environment) project at Stanford University. These existing systems may have to eventually participate in distributed population health query networks. Most of these systems are based on SQL databases which have reached a high level of maturity in terms of scalability, performance, and the availability of data processing and analytics techniques and tools.

However, NoSQL and alternatives based on Big Data Analytics (e.g. Hadoop and Hive) are currently making significant inroads into the enterprise. Emerging NoSQL databases include key-value stores, document databases, graph databases, and triple stores such as those based on the SPARQL query language for RDF. In some cases, these NoSQL databases provide superior scalability when compared to SQL databases. They are increasingly popular with developers because they simplify the application development process by eliminating the need for Object Relational Mapping (ORM). For example, in document-oriented databases such as MongoDB (which is used in hQuery), objects are persisted as JSON documents.
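
For example, a patient summary can be persisted directly as a document, with no mapping layer between objects and tables. A minimal sketch using the MongoDB Java driver; the database, collection, and field names are invented for illustration:

    import com.mongodb.client.MongoClient;
    import com.mongodb.client.MongoClients;
    import com.mongodb.client.MongoCollection;
    import org.bson.Document;
    import java.util.List;

    // Sketch: persisting a patient summary as a JSON-like document,
    // with no object-relational mapping layer in between.
    public class DocumentStoreSketch {
        public static void main(String[] args) {
            try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
                MongoCollection<Document> patients =
                        client.getDatabase("ehr").getCollection("patients");

                Document summary = new Document("patientId", "pat-001")
                        .append("conditions", List.of(
                                new Document("code", "73211009")  // SNOMED CT: diabetes mellitus
                                        .append("codeSystem", "SNOMED CT")))
                        .append("medications", List.of(
                                new Document("name", "metformin").append("dose", "500 mg")));

                patients.insertOne(summary);
            }
        }
    }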

So, I believe that the technical choices made for the Query Health Initiative should be grounded in the current reality of enterprise data management.

  1. First, I think that queries should be formulated in a declarative as opposed to a procedural manner. This rules out an approach based on JavaScript.

  2. Second, I believe that queries should be formulated in an established query language. By established query language, I mean a standard like SQL, SPARQL, or XQuery that was designed specifically for the purpose of querying data stores. This rules out standards such as the HL7 Health Quality Measures Format (HQMF) or any implementation of the HL7 CDA. In fact, quality measures reporting is just one of many use cases in Query Health. In my opinion, in a value-based healthcare system, patient-centered outcome measurement is even more important than quality measures, which are essentially process measures and do not necessarily correlate with improved patient outcomes.

    By the way, I believe this same principle should extend to other ONC Standards and Interoperability efforts such as the Data Segmentation Initiative, which is trying to define an interoperable approach to implementing privacy policies, consent directives, and authorizations. The Data Segmentation Initiative should embrace the approach taken by the OASIS Cross-Enterprise Security and Privacy Authorization (XSPA) profiles, which consists of defining healthcare profiles for well-established and recognized standards such as SAML, XACML, and WS-Trust. This contrasts with an approach that would consist of creating a CDA implementation for patient consent directives. This discussion on patient consent directives is relevant to the Query Health Initiative.

    i2b2, one of the projects considered by the Query Health Initiative, uses SQL. i2b2 is a well-engineered, robust, and proven architecture. The i2b2 data model is based on the "star schema", which has a central "fact" table where each row represents an observation about a patient (see the query sketch after this list). The current implementation of the clinical research chart (CRC) in i2b2 is based on the Oracle and Microsoft SQL Server databases.

    The maturity of SQL-based tools could be a deciding factor.

  3. Third, the adopted solution should leave the door open to innovation by giving participants the choice of embracing alternative and emerging solutions such as SPARQL or UnQL, a newly proposed query language for NoSQL document databases. Erik Meijer and Gavin Bierman from Microsoft Research wrote a paper titled "A co-Relational Model of Data for Large Shared Data Banks" in which they propose coSQL, a common query language for both SQL and NoSQL databases.
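
To illustrate the star schema mentioned in the second point, here is a sketch of a population-level count over a central fact table, executed through JDBC. The table and column names are modeled on i2b2's observation_fact but are illustrative, as is the connection URL:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    // Sketch: a query network participant answering a population count
    // against a star schema (names modeled on i2b2's observation_fact).
    public class StarSchemaQuery {
        public static void main(String[] args) throws Exception {
            String sql = "SELECT COUNT(DISTINCT patient_num) AS patient_count "
                       + "FROM observation_fact "
                       + "WHERE concept_cd = ?";

            try (Connection con = DriverManager.getConnection(
                         "jdbc:postgresql://localhost/i2b2demo", "user", "secret");
                 PreparedStatement ps = con.prepareStatement(sql)) {
                ps.setString(1, "ICD9:250.00"); // illustrative diabetes code
                try (ResultSet rs = ps.executeQuery()) {
                    if (rs.next()) {
                        System.out.println("Matching patients: " + rs.getLong("patient_count"));
                    }
                }
            }
        }
    }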


In the era of Clinical Question Answering (CQA), Natural Language Processing (NLP) and ontologies will play a critical role in clinical data repositories. SPARQL-based queries, when combined with ontologies, could offer significant advantages over traditional SQL-based systems (see my previous post titled "Why Do We Need Ontologies in Healthcare Applications"). Furthermore, standard vocabularies (such as SNOMED CT) and value sets are essential components of clinical data repositories. These terminologies are often derived from ontologies, so a solution that integrates well with ontologies will be important. I believe that the CTS2 specification satisfies all the vocabulary and value set requirements for Query Health. CTS2 is also currently being implemented by commercial vocabulary tool vendors and various open source projects.

The W3C R2RML (RDB to RDF Mapping Language) specification allows existing applications to provide an RDF view over relational databases. The SPARQL 1.1 query language supports key Query Health requirements such as aggregates, grouping, and subqueries, while the SPARQL 1.1 Federation Extensions specification supports federated queries.
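
As a sketch of what such a query could look like, here is a SPARQL 1.1 aggregate (GROUP BY/COUNT) executed with the Apache Jena API over an RDF view of clinical data; the ex: vocabulary and the data file name are invented for illustration:

    import org.apache.jena.query.*;
    import org.apache.jena.rdf.model.Model;
    import org.apache.jena.rdf.model.ModelFactory;

    // Sketch: a SPARQL 1.1 aggregate query over patient data exposed as RDF,
    // e.g. through an R2RML mapping. The ex: vocabulary is invented.
    public class SparqlAggregateSketch {
        public static void main(String[] args) {
            Model model = ModelFactory.createDefaultModel();
            model.read("patients.ttl"); // hypothetical RDF view of the clinical data

            String queryString =
                "PREFIX ex: <http://example.org/clinical#> "
              + "SELECT ?diagnosis (COUNT(DISTINCT ?patient) AS ?patients) "
              + "WHERE { ?patient ex:hasDiagnosis ?diagnosis } "
              + "GROUP BY ?diagnosis "
              + "HAVING (COUNT(DISTINCT ?patient) >= 10)"; // crude small-cell suppression

            try (QueryExecution qe = QueryExecutionFactory.create(
                    QueryFactory.create(queryString), model)) {
                ResultSet results = qe.execSelect();
                while (results.hasNext()) {
                    QuerySolution row = results.next();
                    System.out.println(row.get("diagnosis") + ": " + row.get("patients"));
                }
            }
        }
    }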

The Translational Medicine Ontology is designed as a unifying ontology for the integration of EHR, genomic, treatment, drug, and other types of clinical data. This allows the creation of knowledge bases that can be queried with SPARQL to answer important questions related to clinical research as well as patient care. If you are interested in this topic, I highly recommend these two papers:

Saturday, August 27, 2011

Why Do We Need Ontologies in Healthcare Applications

There is an ongoing thread on the HL7 mailing list about "what can OWL do?" in the wake of Grahame Grieve's recent post titled "HL7 needs a fresh look because V3 has failed".

This post is my answer to "what can OWL do?".


Ontology vs. Information Model


Ontologies are our conceptualization (understanding) of the world, while information models (models of data structures) describe and constrain how data is stored and transmitted in messages. Thomas Gruber popularized the notion of ontology in the nineties when he wrote in a paper titled "A Translation Approach to Portable Ontology Specifications":

A conceptualization is an abstract, simplified view of the world that we wish to represent for some purpose. Every knowledge base, knowledge-based system, or knowledge-level agent is committed to some conceptualization, explicitly or implicitly.

An ontology is an explicit specification of a conceptualization. The term is borrowed from philosophy, where an ontology is a systematic account of Existence. For knowledge-based systems, what "exists" is exactly that which can be represented.

When people ask me to explain how ontologies are relevant to healthcare, I often use this quote from a report titled "Semantic Interoperability Deployment and Research Roadmap" by Alan Rector, an authority in the field of biomedical ontologies:
Ontologies are about the things being represented – patients, their diseases. They are about what is always true, whether or not it is known to the clinician. For example, all patients have a body temperature (possibly ambient if they are dead); however, the body temperature may not be known or recorded. It makes no sense to talk about a patient with a "missing" body temperature.

Data structures are about the artefacts in which information is recorded. Not every data structure about a patient need include a field for body temperature, and even if it does, that field may be missing for any given patient. It makes perfect sense to speak about a patient record with missing data for body temperature.

Hence, at the practical level, ontologies can help us verify the soundness of statements in messages based on our conceptualization of the world. Information models in healthcare often take the form of an XML schema, a Schematron schema, or a relational database schema. One distinguishing characteristic of ontologies is that they are based on an Open World Assumption (OWA), captured by the AAA slogan: Anyone can say Anything about Any topic. Statements that are not included in an ontology are considered unknown as opposed to false. In contrast, information models of data structures such as XML messages and relational databases are based on a Closed World Assumption (CWA), which holds that any statement not known by the message or database to be true is false (this is also referred to as "negation as failure" or NF).

The OWA principle recognizes that our understanding of the world is incomplete and evolving, and that new knowledge can be discovered and added at any time. To return to Alan Rector's example, one cannot assume that, because there is no mention of a patient's body temperature in an electronic health record message, the patient does not have a body temperature. Another distinguishing characteristic of ontologies is the Non-unique Naming Assumption, as opposed to the Unique Name Assumption (UNA) in CWA-based systems: people do use different labels to represent the same concept. This discussion of OWA vs. CWA is not just academic. The reality is that data about a patient can exist in multiple systems, organizations, jurisdictions, and even countries using different vocabularies and XML data structures. Concepts such as the longitudinal or lifelong health record and medication reconciliation will soon reveal the limits of healthcare systems based on a CWA.

OWL2, a W3C Recommendation, is an expressive ontology language that provides reasoning and inferencing capabilities to software applications. Logical axioms specify restrictions through property domains and ranges. OWL2 also supports negation and disjunction. OWL2 reasoning capabilities can be enhanced with a rule language such as the Semantic Web Rule Language (SWRL). Given the complexity and scale of medical knowledge today, the use of ontology-based reasoning will become essential in applications such as medical terminologies, clinical knowledge management for automated decision support, and even automatic verification of the accuracy of messages exchanged between healthcare applications.
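
As a minimal sketch of what ontology-based reasoning buys an application, here is Apache Jena's built-in OWL rule reasoner inferring class membership over a tiny invented clinical ontology:

    import org.apache.jena.ontology.Individual;
    import org.apache.jena.ontology.OntClass;
    import org.apache.jena.ontology.OntModel;
    import org.apache.jena.ontology.OntModelSpec;
    import org.apache.jena.rdf.model.ModelFactory;

    // Sketch: subsumption reasoning. Only the rdf:type MyocardialInfarction
    // fact is asserted; membership in the broader class is inferred.
    public class OwlReasoningSketch {
        public static void main(String[] args) {
            OntModel m = ModelFactory.createOntologyModel(OntModelSpec.OWL_MEM_MICRO_RULE_INF);
            String ns = "http://example.org/clinical#";

            OntClass disorder = m.createClass(ns + "CardiovascularDisorder");
            OntClass mi = m.createClass(ns + "MyocardialInfarction");
            mi.addSuperClass(disorder);

            Individual episode = m.createIndividual(ns + "episode42", mi);

            // true by inference, not by an asserted triple
            System.out.println(episode.hasOntClass(disorder)); // prints: true
        }
    }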

Unfortunately, ontologies are not widely used in software engineering today. They are not well understood by the majority of developers. Undergraduate computer science programs don't usually teach ontologies. There is an urgent need to educate a new generation of ontology-savvy healthcare application developers.


Model Consistency

For obvious reasons, healthcare applications require a high degree of model quality and consistency. This is not always possible or easy to do with traditional approaches such as object-oriented design (the HL7 RIM is based on the UML) and data structures such as XML and relational database schemas.

A clear and clean separation of concerns is needed between the semantic model (the ontology) and the information model (the model of how the data is structured in an XML message or the health application's data store). The ontology can be used to verify that the content of a message is accurate in regard to our conceptualization of the world, while the information model is used to validate the data structure in the data stores and XML messages exchanged with other applications.

The HL7 RIM is definitely not an ontology and has been plagued by consistency issues. Furthermore, a consequence of the RIM model refinement process used to derive XML message exchange schemas from the RIM is that data structure concerns have leaked into what was touted as the semantic model. This lack of separation of concerns has led to an unwieldy information model and very complex XML message structures (in the CDA and other V3 messages) that are difficult to learn and implement in software applications. The GreenCDA is a possible answer to the message structure simplification challenge (see my previous post on the Greening of the HL7 CDA). However, it is not enough to solve the semantic interoperability challenge.


Ontologies and Medical Terminologies

In a paper titled "Why Do It the Hard Way? The Case for an Expressive Description Logic for SNOMED", Alan Rector and Sebastian Brandt argued in favour of using the OWL ontology language for SNOMED which is currently based on a Description Logic semantics known as EL++. The availability of computing power (particularly the elasticity and massive scalability of the cloud), reasoners, and tools have now made such a migration possible.

In a recent paper published in the Journal of the American Medical Informatics Association (JAMIA) and titled "Getting the foot out of the pelvis: modeling problems affecting use of SNOMED CT hierarchies in practical applications" (subscription required), Alan Rector, Sam Brandt, and Thomas Schneider used an OWL representation of SNOMED CT to unearth errors in SNOMED CT hierarchies for such common conditions as myocardial infarction, diabetes, and hypertension. This has significant practical implications for the use and interpretation of SNOMED codes in electronic health records (EHRs), post-coordination, and queries in software applications.

ICD-11 is being developed using OWL to allow consistency checking and linking to other biomedical terminologies and ontologies.

In addition to OWL, the Simple Knowledge Organization System (SKOS) specification can also be used to represent thesauri, classification schemes, taxonomies, controlled vocabularies, and other concept schemes.


Overlap between the HL7 RIM and SNOMED CT

HL7 V3 messages like the CDA typically carry codes from SNOMED CT and other terminologies such as CPT, ICD-9, and LOINC. However, in certain cases such as family history, an observation can be expressed through a single SNOMED CT code or by using the RIM. To ensure model consistency, HL7 has released an implementation guide on using SNOMED CT in HL7 Version 3 documents such as the HL7 CDA (I refer to this implementation guide as TermInfo). In addition, HITSP C80 specifies vocabularies and terminologies to be used in various sections of a C32 document.

However, these guidelines have been difficult to enforce in practice due to the lack of automated validation tools. In a paper recently published in the Journal of Biomedical Semantics titled "Semantic validation of the use of SNOMED CT in HL7 clinical documents", Stijn Heymans, Matthew McKennirey, and Joshua Phillips described an approach using OWL ontologies to automatically validate TermInfo guidelines. The approach consisted of using the OWL representation of SNOMED CT, lifting (with XSLT) CDA XML instances into OWL individuals based on a CDA OWL ontology, and expressing TermInfo guidelines as OWL integrity constraints. The latter were validated with the Pellet Integrity Constraint Validator (Pellet-ICV).


Clarifying the relationship ("interface") between Ontologies, Coding Systems, and Information Models

I mentioned the need for a clean separation of concerns between the ontology and the information model. So what is the relationship between ontologies, coding systems (like SNOMED CT), and information models? I have long been intrigued by that question. In a paper titled "Binding Ontologies & Coding systems to Electronic Health Records and Messages", Alan Rector, Rahil Qamar, and Tom Marley write:

We contend that codes are also data structures – or more precisely symbols to be used in data structures – and that the model of codes is also at the level of the information model.

Although coding systems are derived from ontologies, we also need a separation of concerns between the coding system and the ontology. Remember that ontologies are based on an Open World Assumption (essentially the AAA slogan: Anyone can say Anything about Any topic). Coding systems, in contrast, contain an enumerated list of codes to choose from.

In the same paper, the authors propose a code binding interface based on OWL DL between the model of meaning (i.e., the ontology), the model of codes (i.e., the terminology), and the information model.

To summarize our findings so far:

  1. We first create an ontology to describe our conceptualization (or understanding) of the world.
  2. We derive an enumerated list of codes, called a code system (itself a data structure), from the ontology.
  3. We use the codes in EHR application databases and messages, which are data structures.
  4. We can validate the binding between the ontology, the information model, and the code system (using the approach proposed by Alan Rector, Rahil Qamar, and Tom Marley).


Ontology Alignment


An ontology represents a specific world view that reflects the perspective of its origin (application, domain, people, or organization). Alignment consists of mapping concepts across ontologies. For example, in translational medicine, there could be a need to map an ontology used in biomedical research to an ontology used for clinical purposes. Several techniques can be used to achieve ontology alignment between two ontologies, including:

  • Mapping each ontology to a third shared ontology called a foundational ontology.
  • Mapping the two ontologies directly.

OWL facilitates ontology alignment through constructs such as owl:sameAs, owl:equivalentClass, and owl:equivalentProperty. These OWL constructs can be complemented with rule-based mappings expressed in SWRL or RIF. XSLT and SPARQL can also be useful in ontology alignment.
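
As a sketch, here is how such alignment axioms can be asserted with the Apache Jena API; both ontology namespaces are invented for illustration:

    import org.apache.jena.rdf.model.Model;
    import org.apache.jena.rdf.model.ModelFactory;
    import org.apache.jena.rdf.model.Resource;
    import org.apache.jena.vocabulary.OWL;

    // Sketch: aligning a research ontology with a clinical ontology by
    // asserting an OWL equivalence axiom between two classes.
    public class OntologyAlignmentSketch {
        public static void main(String[] args) {
            Model m = ModelFactory.createDefaultModel();

            Resource researchMi = m.createResource("http://example.org/research#MI");
            Resource clinicalMi = m.createResource("http://example.org/clinical#MyocardialInfarction");

            // A reasoner will now treat instances of one class as instances of the other.
            m.add(researchMi, OWL.equivalentClass, clinicalMi);

            m.write(System.out, "TURTLE");
        }
    }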


Clinical Knowledge Management (CKM)

Ontologies, as a knowledge representation formalism, are well suited for modeling the medical knowledge contained in Clinical Practice Guidelines (CPGs) and Care Pathways (CPs). This enables automated reasoning and the execution of those guidelines against patient data at the point of care.

Several ontology-based approaches to modeling CPGs and CPs have been proposed in the past, including PROforma, HELEN, EON, GLIF, and SAGE. However, the lack of tooling has been a major impediment to wide adoption of those standards. OWL has the advantage of being a widely implemented W3C Recommendation with available open source as well as commercial tools.


Ontologies and Enterprise Master Data Management (MDM)

As healthcare enterprises become larger and more integrated (through the ACO model, for example), there will be a need to consistently define and manage core business entities such as "patient", "provider", "payor", "care delivery", and "claim" across systems and business processes (e.g., research, clinical, reporting, and financial). The goal of Master Data Management (MDM) is to address those challenges.

One area of particular interest to MDM is the naming, meaning, equivalency, and relationships between those core business entities. Ontology constructs such as owl:sameAs, owl:equivalentClass, and owl:equivalentProperty can help establish common semantics across the enterprise when the same business entity is called by different names in different systems and business processes.


Linked Open Data (LOD)

Ontologies can help in building silo-busting applications that need to link individual data items (as opposed to web pages) over the web in order to perform entity correlation (or entity resolution). A datum can be a row in a relational database, and technologies exist to provide an RDF view over a relational database table (see R2RML: RDB to RDF Mapping Language). The RDF view itself can be defined in terms of an OWL ontology or RDFS vocabulary. Hence, LOD can integrate data across health applications and organizations by providing a semantic layer on top of existing applications.

The Linked Data design pattern is based on an open world assumption; it uses dereferenceable HTTP URIs for identifying and accessing data items, RDF for describing metadata about those items, and semantic links to describe the relationships between those items. Other standards used in LOD applications include RDFS (for describing RDF vocabularies) and SPARQL (for querying RDF graphs). A practical application of LOD in healthcare is the Clinical Quality Linked Data project on health.data.gov.


Metadata and the PCAST Report

The Office of the National Coordinator for Health Information Technology (ONC) recently released an Advance Notice of Proposed Rulemaking (ANPRM) on Metadata Standards to Support Nationwide Electronic Health Information Exchange. The ANPRM was driven by the PCAST Report released in December 2010.

Specifically, the ANPRM called for public comments on patient identity, provenance, and privacy. There are existing ontologies related to identity, provenance, and privacy that can be at least partially reused (ontology reuse is a recommended best practice to avoid the difficulties of ontology alignment). An example is the Provenance Vocabulary Core Ontology. Modeling metadata in healthcare using ontologies will enable reasoning, data integration through Linked Open Data mechanisms, and federated SPARQL queries. Please note that metadata expressed in XML syntax can be lifted into RDF (using techniques like XSLT or XQuery) to provide the same benefits.

Wednesday, July 13, 2011

Service-Oriented Clinical Decision Support in the Cloud

This is the presentation I gave today at the 2011 SOA in Healthcare Conference in Herndon, VA.

Thursday, March 24, 2011

How Checklists Can Enhance Clinical Decision Support (CDS)

I have been reading "The Checklist Manifesto", a book by Dr. Atul Gawande on the effectiveness of checklists in healthcare delivery. Another paper, recently published in the Milbank Quarterly and entitled "Counterheroism, Common Knowledge, and Ergonomics: Concepts from Aviation That Could Improve Patient Safety", suggests that beyond checklists, proven aviation safety practices such as Cockpit Resource Management (CRM), joint safety briefings, and first-names-only rules could help improve patient safety.

I first became aware of the importance of checklists while I was being trained as a Flight Engineer. I spent a lot of time studying them carefully as an aviation student. Checklists are used during normal, abnormal, and emergency situations and pilots go through practical exercises in flight simulators to use them correctly. Let's not mince words: aviation as we know it today would not be possible without checklists.

In a study entitled "Missed and Delayed Diagnoses in the Emergency Department: A Study of Closed Malpractice Claims From 4 Liability Insurers", researchers found that:
The leading breakdowns in the diagnostic process were failure to order an appropriate diagnostic test (58% of errors), failure to perform an adequate medical history or physical examination (42%), incorrect interpretation of a diagnostic test (37%), and failure to order an appropriate consultation (33%). The leading contributing factors to the missed diagnoses were cognitive factors (96%), patient-related factors (34%), lack of appropriate supervision (30%), inadequate handoffs (24%), and excessive workload (23%).

Checklists can serve as cognitive aids that help clinicians do their job safely. While the idea of using checklists and standard operating procedures has been fully embraced by aviation professionals for more than 70 years, it is only now making inroads into the field of medicine, particularly in high-pressure environments like intensive care units. The use of checklists in medicine has already shown the potential to save patients' lives and reduce human errors. However, the main challenge remains the acceptance of checklists by clinicians concerned about "cookbook medicine".

Checklists are just cognitive aids and the presence of an experienced and competent professional will always make a big difference in critical situations. As Captain Sullenberger (the airline pilot who successfully ditched US Airways Flight 1549 in the Hudson River in New York City, on January 15, 2009) said, "One way of looking at this might be that for 42 years, I've been making small, regular deposits in this bank of experience: education and training. And on January 15 the balance was sufficient so that I could make a very large withdrawal."

On modern airplanes, Electronic Centralised Aircraft Monitor (ECAM) and Engine Indicating and Crew Alerting System (EICAS) systems monitor aircraft systems and engines and display messages in case of failure, as well as recommended remedial actions in the form of checklists. The National Transportation Safety Board (NTSB) accident report on US Airways Flight 1549 indicates that the First Officer "was able to promptly locate the [Engine Dual Failure checklist] procedure listed on the back cover of the [Quick Reference Handbook] QRH, turn to the appropriate page, and start executing the checklist."

In medicine, factors such as comorbidity can complicate the design of effective CDS. However, with the explosion of medical knowledge and evidence-based guidelines, CDS will become an essential tool in healthcare delivery. The design, development, implementation, and use of CDS are knowledge-intensive and require an effective collaborative knowledge management strategy. The challenge will be to integrate checklists into the different CDS modalities such as context-sensitive Infobuttons, order sets, alerts, reminders, data entry and visualization, and clinical workflows.

For example, the evaluation results (in the form of recommendations) of a CDS rule can be presented to the clinician as an electronic checklist. This in turn can be tied directly to quality measures in the era of Meaningful Use, Pay-for-Performance, and Accountable Care Organizations (ACOs). An interesting example would be a checklist that prompts clinicians to generate detailed discharge instructions to satisfy quality measures for patients with heart failure or acute myocardial infarction, as sketched below.
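
A minimal sketch of that idea, with invented item texts that only paraphrase typical heart failure discharge measures:

    import java.util.LinkedHashMap;
    import java.util.Map;

    // Hypothetical sketch: CDS rule recommendations rendered as a checklist
    // whose completion status can feed a quality measure.
    public class DischargeChecklist {
        public static void main(String[] args) {
            Map<String, Boolean> checklist = new LinkedHashMap<>();
            checklist.put("Written discharge instructions provided", false);
            checklist.put("Activity level and diet discussed", false);
            checklist.put("Weight monitoring explained", false);
            checklist.put("Follow-up appointment scheduled", false);

            // The clinician checks off items at the point of care...
            checklist.put("Written discharge instructions provided", true);

            boolean measureSatisfied = checklist.values().stream().allMatch(done -> done);
            System.out.println("Quality measure satisfied: " + measureSatisfied);
        }
    }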

There is an important human factors aspect to the design and use of cockpit checklists and flight-deck procedures. This was the subject of advanced research at the NASA Ames Research Center more than twenty years ago, and the results have been widely disseminated and implemented in the aviation industry.

In an article entitled "The Checklist" published in the New Yorker, Atul Gawande wrote:
"But consider: there are hundreds, perhaps thousands, of things doctors do that are at least as dangerous and prone to human failure as putting central lines into I.C.U. patients. It’s true of cardiac care, stroke treatment, H.I.V. treatment, and surgery of all kinds. It’s also true of diagnosis, whether one is trying to identify cancer or infection or a heart attack. All have steps that are worth putting on a checklist and testing in routine care. The question—still unanswered—is whether medical culture will embrace the opportunity."

Peter Pronovost, an intensivist at Johns Hopkins Hospital and a pioneer in the use of checklists in medicine, implemented a checklist at 127 Michigan intensive care units (ICUs) to reduce catheter-related bloodstream infections (CRBSI). The project was so successful that it is estimated the approach could significantly reduce the 28,000 deaths and 3 billion dollars in costs caused by these hospital-acquired infections.

The HL7 Clinical Decision Support (CDS) workgroup is working on standards for the vMR (Virtual Medical Record), Infobuttons, and order sets. There is also an effort at the OMG to publish a Clinical Decision Support Services specification for service-oriented CDS capabilities. The Flight Operations Interest Group (FOIG) of the Air Transport Association (ATA) is developing a data model and XML schema for flight deck procedures and checklists. Developing a shareable content model for checklists in medicine could be an interesting idea.

Sunday, February 27, 2011

The Greening of the HL7 CDA

I attended the HIMSS 2011 Conference this week in Orlando, FL. The GreenCDA was one of the big themes at the HL7 booth. The goal of the HL7 GreenCDA project is to provide a simple intermediary XML representation of the CDA to facilitate quick learning and ease of use for developers building healthcare data exchange solutions. Using the GreenCDA should not require prior knowledge of the HL7 Reference Information Model (RIM) and the associated model refinement process.

Developers should be able to generate code from the GreenCDA XML schema using data binding tools in any programming language of their choice. It should also be possible to create a round-trip transformation between the GreenCDA and the CDA. These requirements also apply to CDA implementations such as the HITSP C32. The GreenCDA will be available as an HL7 Implementation Guide, and the HL7 Structured Documents Working Group recently issued a GreenCDA wire format position statement.

In a previous post entitled "XML Processing in Healthcare Applications", I described some of the issues with the HL7 CDA and HITSP C32 XML structure and suggested some ideas on dealing with the complexity of the CDA schema and C32 generation process. In this post, I will share some thoughts on what can be done to ensure that the GreenCDA lives up to its full potential as the answer to the simplification challenge in healthcare data exchange standards.

XML Schemas In the Software Development Lifecycle

The XML schema is an important part of the service contract in Service-Oriented Architecture (SOA). Service contracts also include the WSDL and WS-Policy documents. Using the recommended contract-first approach to web services development, developers generate client as well as server code using various tools and APIs in their native programming language and framework. Even when not using a pre-existing industry XML schema, the contract-first approach allows developers to decouple the service contract from platform-specific idiosyncrasies and adhere to cross-platform interoperability standards such as the WS-I Basic Profile.

On the Java platform, JAX-WS and JAXB allow developers to generate code from the WSDL and XML schema with tools like wsimport or Apache CXF's wsdl2java.

On the .NET platform, the Windows Communication Foundation (WCF) and Visual Studio provide data binding tools out of the box, such as svcutil. There is also an open source tool called WSCF.blue specifically designed to facilitate contract-first web services development on the .NET platform.

The GreenCDA XML schema could also be used in support of the "Canonical Data Model" enterprise integration pattern. Enterprise data architects typically extend industry XML schema components to satisfy custom needs.

Finally, the PCAST Report released in December 2010 recommended a universal exchange language that is "structured as individual data elements, together with metadata that provide an annotation for each data element". The report suggests that the metadata attached to each of these data elements

"...would include (i) enough identifying information about the patient to allow the data to be located (not necessarily a universal patient identifier), (ii) privacy protection information—who may access the mammograms, either identified or de-identified, and for what purposes, (iii) the provenance of the data—the date, time, type of equipment used, personnel (physician, nurse, or technician), and so forth."

Put together, these requirements argue in favor of a GreenCDA XML schema that supports the following:

  • Reusability
  • Extensibility
  • A well-defined versioning strategy
  • Seamless code generation in a variety of programming languages and development frameworks
  • A metadata facility per the PCAST recommendations.


Designing for Reuse and Extensibility

I suggest that the GreenCDA should only use global and named simple and complex types to facilitate reuse and extensibility. In other words, anonymous type definitions should be avoided. Extensibility is typically implemented through the <xsd:extension> element. Reuse can also be achieved by assembling logically related schema components into separate schema documents and using the <xsd:include> and <xsd:import> constructs.

Common XML schema components (also called core components) such as HL7 datatypes as well as person, address, and organization should be in a separate schema file, ideally under a different namespace than the target namespace of the GreenCDA itself.


Component Naming and Documentation

It would be nice to have different naming conventions for types vs. elements and attributes. Also, schema component names should be spelled out for readability. A component name like "ivlTs" is not obvious to someone who is not familiar with HL7 datatypes.

Each type, element, or attribute should have a required <xsd:annotation> child element which describes the semantics of the component in its child <xsd:documentation> element. In other words, all schema components should be documented.


Support for Data Binding Tools

Certain features of the XML Schema language such as mixed content models, <xsd:choice>, and dynamic type substitution with xsi:type are not well supported by various XML databinding tools. The need to use these constructs to accurately express the GreenCDA XML data structure should be balanced against the ability to seamlessly generate code from the GreenCDA XML schema using various XML databinding tools.
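
As a sketch of the data binding workflow that the GreenCDA schema should enable, here is JAXB round-tripping an instance document; PatientSummary is a hypothetical class standing in for one generated by the xjc compiler from a GreenCDA-style schema, and the file name is a placeholder:

    import javax.xml.bind.JAXBContext;
    import javax.xml.bind.Unmarshaller;
    import java.io.File;

    // Sketch: unmarshalling an instance document into JAXB-generated classes.
    // PatientSummary is hypothetical (generated by xjc from the schema).
    public class GreenCdaBindingSketch {
        public static void main(String[] args) throws Exception {
            JAXBContext ctx = JAXBContext.newInstance(PatientSummary.class);

            Unmarshaller u = ctx.createUnmarshaller();
            PatientSummary summary = (PatientSummary) u.unmarshal(new File("greencda-sample.xml"));

            // Work with plain typed objects instead of navigating raw XML.
            System.out.println(summary.getPatient().getName());

            ctx.createMarshaller().marshal(summary, System.out);
        }
    }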

Before the GreenCDA is released for production use, I suggest at least two open source reference implementations in two different development platforms (such as Java and .NET) covering the end-to-end web services development cycle using the specific tooling provided by the respective platforms.


What Can Be Learned From the National Information Exchange Model (NIEM)

The ONC Standards and Interoperability Framework is leveraging the NIEM from a process perspective. However, I believe there is much to be learned from the design of the NIEM as an XML data exchange standard. This does not imply that the GreenCDA should use the NIEM Core. It simply means that the healthcare domain can leverage certain NIEM design principles that are not only backed by advanced research (at Georgia Tech Research Institute) in XML schema modeling, but are also proven by the numerous government agencies using the NIEM.

The NIEM embodies recognized XML Schema design patterns in its Naming and Design Rules (NDR). The NIEM provides a Schematron-based tool to automatically validate XML schemas against the rules defined in the NDR. For example, the Schematron schema can enforce component naming conventions or the requirement to document every schema component.

The PCAST Report says:
"We think that a universal exchange language must facilitate the exchange of metadata tagged elements at a more atomic and disaggregated level, so that their varied assembly into documents or reports can itself be a robust, entrepreneurial marketplace of applications."

The NIEM defines an extensible metadata facility for adding metadata to any data element in the spirit of the PCAST recommendations. The NIEM itself supports the exchange of "data items" at any level of granularity. These XML Schema design patterns are universal and can be applied to any domain, including the healthcare domain.

Thursday, February 3, 2011

A Therapeutic Layered Cake

With all the talk about the PCAST Report, I've been doing some systems thinking on semantic interoperability in healthcare IT. Trying to put all the pieces together, I remembered Tim Berners-Lee's "Semantic Web Layer Cake".


The Semantic Web Layer Cake has gone through several iterations over the years (see James Hendler's presentation on the subject). However, I think it can still be very helpful in visualizing a unified framework for addressing the challenges of semantic interoperability in healthcare IT.

As we move to Stage 2 of Meaningful Use, I believe Clinical Decision Support (CDS) will take center stage. Beyond currently used XML-based data structures (such as HL7 v3 messages), this will put an increased emphasis on medical terminologies, ontologies, and knowledge representation in OWL. For example, ICD-11 is being developed using OWL to allow consistency checking and linking to other biomedical terminologies and ontologies. Equally important to knowledge representation, but not shown in the layer cake above is the Simple Knowledge Organization System (SKOS) specification.

In a report entitled "Semantic Interoperability Deployment and Research Roadmap", Alan Rector summarized the difference between the notions of ontology, knowledge representation, and data model:

  • Ontology – A representation of what is universally true, including what is true by definition

  • Knowledge Representation or "Background knowledge resource" – a representation of what is generally true, or widely known to be true in some specific instance. In general, the knowledge representation is formulated in terms of and indexed by the Ontology.

  • Information model or Data model – a model of how information is structured in a given software system, message, or electronic health record. In general, the data structures carry codes for the ontology as their content.

Clinical guidelines are published in the form of narrative text, sometimes with an evaluation algorithm. The translation of those guidelines into an executable representation is a complex and costly process. Several formalisms and standards have been proposed such as the Arden Syntax, GLIF, GELLO, and GEM. However, none of these standards has been widely adopted. Developed with inputs from the Business Rules, Logic Programming, and Semantic Web communities, the W3C Rule Interchange Format (RIF) can help with the interchange of executable Clinical Decision Support (CDS) rules in addition to adding reasoning capabilities to patient records. This example shows how decision support rules could be exchanged between two rules engines (Drools and Jess) using the RIF PRD syntax, a standard XML serialization format for production rule languages.

Existing patient records marked up in XML HITSP C32 or ASTM CCR can be lifted into RDF statements (with XSLT or XQuery for example) and queried using SPARQL.

Proof, trust, and cryptography are currently being addressed by various standards and specifications in the healthcare industry, notably the OASIS Cross-Enterprise Security and Privacy Authorization (XSPA) profiles of XACML, SAML, and WS-Trust.

On the User Interface side, I see HTML5 giving both Flex and Silverlight a run for their money in the next few years. This will be driven in part by the demand for mobile health (mHealth).

Saturday, January 22, 2011

XML Processing in Healthcare Applications

Meaningful Use certification requires the ability to create patient summaries in either C32 or CCR format. One of the most frequently asked questions on the HL7 Structured Documents mailing list relates to processing the CDA XML schema with data binding tools such as JAXB or Castor. Initially, people are not able to generate Java classes with JAXB at all. After some changes to the schema, JAXB finally works and creates hundreds of classes which are not easy to work with and maintain. Then someone suggests using the Model-Driven Health Tools (MDHT) CDA tools, which are Java-based. You face additional headaches if you're not developing on the Java platform.

In a paper presented at the Balisage 2009 conference, a team of engineers who implemented the "Laika" C32 compliance testing tool described the issues with the CDA and C32 XML structure:

  • Repeated use of overly abstract data structures: The HL7 CDA defines a number of very generic objects that are used to represent information in a given document. Differing information, such as medications and conditions, is represented using the same XML elements with very subtle changes in their nesting and attributes. This makes a CDA document difficult to process.

  • Underspecified implementation, including lack of a normative schema: While there is an XML schema for the HL7 CDA, a final schema does not exist for the HITSP C32 or other CDA-based documents due to their use of attributes for selecting templates. Thus, defining schemas for these documents is impossible. As a result, CDA-based constructs such as HITSP C32 cannot be automatically validated by XML parsers; standard object mapping tools, such as XML Beans or JAXB, cannot be used.

  • Ambiguous data types: Data can be represented in multiple ways in a CDA document. Consumers of CDA documents must, therefore, write software that handles any of the numerous permutations of these data types. This leads to bloated software, or more likely, software that does not implement the full specification and experiences interoperability problems when it receives data in an unexpected format.

  • Steep and long learning curve: Mastery of the CDA and its many specifications and constructs takes an experienced software engineer many months to achieve. Once learned, it is very cumbersome to employ in robust software applications and services. These difficulties drive up the cost and time to develop and maintain health care software, thus reducing the pace of innovation.

In a previous post entitled "The Future of Healthcare Data Exchange Standards", I suggested some ideas on how to develop standard XML schemas that support the software development process as opposed to hindering it. Since we're not there yet, in this post I will suggest some ideas on dealing with the complexity of the CDA schema and C32 generation process.

The key is to leverage the power of XML-related technologies such as XPath2, XSLT2, XQuery, XProc, ISO Schematron, and even XML Schema 1.1 (for assertions or business rule constraints) to simplify the task. First, generate a simple, perhaps flat, XML representation (let's call it simpleC32) of the patient summary from your domain objects or database (through a data transfer object or DTO, for example). That simpleC32 contains all the content needed to populate the C32 templates and generate a valid C32 document. You can create your own XML schema for your simpleC32 and use it for validation and data binding.

Once you have a valid simpleC32 document, you can use XSLT2 to transform the patient summary from your simpleC32 representation into a C32 document that can be validated against the NIST Meaningful Use C32 Validator. This is roughly the idea behind the GreenCDA project; use it as an inspiration for creating a simple representation of the C32. You can even use the GreenCDA XML schema as your simpleC32. But don't hesitate to create your own simpleC32 if the GreenCDA does not work for you, because the target is still the C32, and the idea here is to have an intermediary representation (an Adapter) to make your life easier. It is also an approach that allows you to isolate your domain model and prevent the complexity of the C32 data model from leaking into your domain layer (see my previous post on the concept of an Anti-Corruption Layer in Domain-Driven Design).
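
A sketch of that transformation step, invoking an XSLT 2.0 stylesheet from Java with Saxon (the stylesheet and file names are placeholders):

    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.stream.StreamResult;
    import javax.xml.transform.stream.StreamSource;
    import java.io.File;

    // Sketch: transforming the intermediary simpleC32 representation into a
    // full C32 document. Saxon is used because the stylesheet is XSLT 2.0.
    public class SimpleC32ToC32 {
        public static void main(String[] args) throws Exception {
            TransformerFactory factory = new net.sf.saxon.TransformerFactoryImpl();
            Transformer t = factory.newTransformer(
                    new StreamSource(new File("simplec32-to-c32.xsl")));

            t.transform(new StreamSource(new File("patient-simplec32.xml")),
                        new StreamResult(new File("patient-c32.xml")));
            // The output can then be checked against the NIST Meaningful Use C32 validator.
        }
    }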

Why is this approach not used more often? Some developers who code with imperative programming languages (such as Java, C#, or JavaScript) are not comfortable with declarative programming in languages like XSLT2 and XQuery. I've recently seen a Java developer use JAXB to create hundreds of classes and thousands of hard-to-maintain lines of code for a simple transformation from the CDA to a different target XML schema.

The basic difference between declarative (and functional) programming languages and imperative languages is that the former specify the "what" (the intent) as opposed to the "how" (the algorithm). However, declarative programming with XSLT2 and XQuery can be mastered through training and practice: see my previous posts entitled "In Defense of XSLT", "Why XProc Rocks", and "Putting XQuery to Work in Healthcare".

While Java and C# are general-purpose languages, XSLT2, XQuery, and XProc are based on the XQuery 1.0 and XPath 2.0 Data Model (XDM) and are specifically designed for manipulating XML documents. This is particularly helpful when dealing with a complex and deep structure such as the HL7 CDA and other HL7 V3 messages. These XML-centric processing languages use XPath2 to navigate the XML tree. In general, consider using them in the following cases:

  • Applications that require dealing with a complex industry data exchange XML schema which is not easy to process with your databinding and other development tools. In that case, create an intermediary simple XML representation and map it to the industry data exchange XML schema using XSLT2 or XQuery (XQuery is not just for querying native XML databases; it is also a powerful language for processing XML documents).

  • Applications that require translation from an XML schema to another target XML schema (for example a mapping from the HL7 CCD to the ASTM CCR or from the C32 to XHTML).

  • Applications that require translation from an XML representation to a non-XML representation and round-trip (for example HL7 v2.x to HL7 V3, C32 XML to JSON, or C32 to a non-XML serialization of RDF).

  • Consider using XProc if you need to chain multiple XML processing steps, such as: query a data source with XQuery, expand XIncludes, validate against an XML schema, validate against a Schematron schema, transform with XSLT2, generate a PDF document with XSL-FO, and so on.

The Universal Exchange Language proposed by the PCAST Report could be an opportunity to address the issues listed above.