
Sunday, November 2, 2014

Toward a Reference Architecture for Intelligent Systems in Clinical Care

A Software Architecture for Precision Medicine


Intelligent systems in clinical care leverage the latest innovations in machine learning, real-time data stream mining, visual analytics, natural language processing, ontologies, production rule systems, and cloud computing to provide clinicians with the best knowledge and information at the point of care for effective clinical decision making. In this post, I propose a unified open reference architecture that combines all these technologies into a hybrid cognitive system for clinical decision support. Truly intelligent systems are capable of reasoning. The goal is not to replace clinicians, but to provide them with cognitive support during clinical decision making. Furthermore, Intelligent Personal Assistants (IPAs) such as Apple's Siri, Google's Google Now, and Microsoft's Cortana have raised our expectations of how intelligent systems should interact with users through voice and natural language.

In the strict sense of the term, a reference architecture should be abstracted away from any concrete technology implementation. However, to make the proposed approach easier to understand, I take the liberty of explaining how available open source software can be used to realize the intent of the architecture. There is an urgent need for an open and interoperable architecture that can be deployed across devices and platforms. Unfortunately, this is not the case today with solutions like Apple's HealthKit and ResearchKit.

The specific open source software mentioned in this post can be substituted with other tools which provide similar capabilities. The following diagram is a depiction of the architecture (click to enlarge).

 

Clinical Data Sources


Clinical data sources are represented on the left of the architecture diagram. Examples include electronic medical record systems (EMR) commonly used in routine clinical care, clinical genome databases, genome variant knowledge bases, medical imaging databases, data from medical devices and wearable sensors, and unstructured data sources such as biomedical literature databases. The approach implements the Lambda Architecture enabling both batch and real-time data stream processing and mining.


Predictive Modeling, Real-Time Data Stream Mining, and Big Data Genomics


The back-end provides various tools and frameworks for advanced analytics and decision management. The analytics workbench includes tools for creating predictive models and for data stream mining. The decision management workbench includes a production rule system (providing seamless integration with clinical events and processes) and an ontology editor.

The incoming clinical data will likely meet the Big Data criteria of volume, velocity, and variety (this is particularly true for physiological time series from wearable sensors). Therefore, specialized frameworks for large-scale cluster computing like Apache Spark are used to analyze and process the data. Statistical computing and Machine Learning tools like R are used here as well. The goal is knowledge and pattern discovery using Machine Learning techniques like Decision Trees, k-Means Clustering, Logistic Regression, Support Vector Machines (SVMs), Bayesian Networks, Neural Networks, and the more recent Deep Learning techniques. The latter hold great promise in applications such as Natural Language Processing (NLP), medical image analysis, and speech recognition.

These Machine Learning algorithms can support diagnosis, prognosis, simulation, anomaly detection, care alerting, and care planning. For example, anomaly detection can be performed at scale using the k-means clustering machine learning algorithm in Apache Spark. In addition, Apache Spark allows the implementation of the Lambda Architecture and can also be used for genome Big Data analysis at scale.
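As an illustration, the following Scala sketch shows how k-means clustering in Spark MLlib could be used to flag anomalous physiological readings: observations that lie far from their nearest cluster center are treated as candidate anomalies. The input path, feature layout, number of clusters, and distance cutoff are hypothetical choices, not prescriptions.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.clustering.KMeans
import org.apache.spark.mllib.linalg.Vectors

object VitalSignAnomalyDetection {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("VitalSignAnomalyDetection"))

    // Each line is assumed to hold comma-separated vital signs,
    // e.g. heart rate, systolic BP, SpO2 (hypothetical file and format).
    val vitals = sc.textFile("hdfs:///data/vitals.csv")
      .map(line => Vectors.dense(line.split(',').map(_.toDouble)))
      .cache()

    // Cluster the observations; k and the iteration count are illustrative.
    val model = KMeans.train(vitals, 5, 20)

    // Flag points that are far from their nearest cluster center as anomalies.
    val distances = vitals.map { v =>
      val center = model.clusterCenters(model.predict(v))
      (v, math.sqrt(Vectors.sqdist(v, center)))
    }
    val threshold = distances.map(_._2).top(100).last // crude cutoff: the 100 most distant points
    val anomalies = distances.filter(_._2 >= threshold)

    anomalies.take(10).foreach(println)
    sc.stop()
  }
}
```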

In another post titled How Good is Your Crystal Ball?: Utility, Methodology, and Validity of Clinical Prediction Models, I discuss quantitative measures of performance for clinical prediction models.


Visual Analytics


Visual Analytics tools like D3.js, rCharts, plotly, googleVis, ggplot2, and ggvis can help obtain deep insights for effective understanding, reasoning, and decision making through the visual exploration of massive, complex, and often ambiguous data. Of particular interest is the Visual Analytics of real-time data streams like physiological time series. As a multidisciplinary field, Visual Analytics combines several disciplines such as human perception and cognition, interactive graphic design, statistical computing, data mining, spatio-temporal data analysis, and even Art. For example, similar to Minard's map of the Russian Campaign of 1812-1813 (see graphic below), Visual Analytics can help in comparing different interventions and care pathways and their respective clinical outcomes over a certain period of time by displaying causes, variables, comparisons, and explanations.





Production Rule System, Ontology Reasoning, and NLP


The architecture also includes a production rule engine and an ontology editor (Drools and Protégé respectively). This is done in order to leverage existing clinical domain knowledge available from clinical practice guidelines (CPGs) and biomedical ontologies like SNOMED CT.  This approach complements machine learning algorithms' probabilistic approach to clinical decision making under uncertainty. The production rule system can translate CPGs into executable rules which are fully integrated with clinical processes (workflows) and events. The ontologies can provide automated reasoning capabilities for decision support.
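The following minimal sketch shows how such a production rule session could be driven from application code using the Drools (KIE) API. The PatientSummary fact, the session name "cdsSession", and the rule shown in the comment are illustrative assumptions, not excerpts from an actual clinical practice guideline.

```scala
import org.kie.api.KieServices
import scala.beans.BeanProperty

// Hypothetical fact class representing a minimal patient summary.
// (In a real system this would come from the vMR / EHR domain model.)
case class PatientSummary(@BeanProperty id: String,
                          @BeanProperty age: Int,
                          @BeanProperty hba1c: Double,
                          @BeanProperty var recommendation: String = "")

object GuidelineRules {
  // An illustrative DRL rule that a knowledge engineer might derive from a CPG:
  //
  //   rule "Elevated HbA1c"
  //   when
  //     $p : PatientSummary( hba1c > 9.0 )
  //   then
  //     $p.setRecommendation("Intensify glycemic management per guideline");
  //   end

  def main(args: Array[String]): Unit = {
    val kieServices = KieServices.Factory.get()
    // Assumes a kmodule.xml on the classpath declaring a session named "cdsSession".
    val container = kieServices.getKieClasspathContainer
    val session = container.newKieSession("cdsSession")

    val patient = PatientSummary(id = "example-123", age = 54, hba1c = 9.4)
    session.insert(patient) // assert the patient facts into working memory
    session.fireAllRules()  // forward-chaining evaluation of the guideline rules
    println(patient.recommendation)

    session.dispose()
  }
}
```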

NLP includes capabilities such as:
  • Text classification, text clustering, document and passage retrieval, text summarization, and more advanced clinical question answering (CQA) capabilities which can be useful for satisfying clinicians' information needs at the point of care; and
  • Named entity recognition (NER) for extracting concepts from clinical notes.
The data tier supports the efficient storage of large amounts of time series data and is implemented with tools like Cassandra and HBase. The system can run in the cloud, for example using the Amazon Elastic Compute Cloud (EC2). For real-time processing of distributed data streams, cloud-based solutions like Amazon Kinesis and AWS Lambda can be used.
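As a small example of the data tier, the sketch below uses the DataStax Java driver from Scala to store physiological time series in Cassandra, partitioning by patient and day so that range queries over a single day stay on one partition. The contact point, keyspace, table, and sample values are placeholders.

```scala
import com.datastax.driver.core.Cluster

object VitalsTimeSeriesStore {
  def main(args: Array[String]): Unit = {
    // Contact point, keyspace, and table names are illustrative.
    val cluster = Cluster.builder().addContactPoint("127.0.0.1").build()
    val session = cluster.connect()

    session.execute(
      """CREATE KEYSPACE IF NOT EXISTS vitals
        |WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}""".stripMargin)

    // A common time-series layout: partition by patient and day,
    // cluster rows by observation time so range scans stay on one partition.
    session.execute(
      """CREATE TABLE IF NOT EXISTS vitals.heart_rate (
        |  patient_id text, day text, observed_at timestamp, bpm int,
        |  PRIMARY KEY ((patient_id, day), observed_at)
        |)""".stripMargin)

    session.execute(
      "INSERT INTO vitals.heart_rate (patient_id, day, observed_at, bpm) " +
        "VALUES ('example-123', '2014-11-02', '2014-11-02 10:00:00+0000', 72)")

    cluster.close()
  }
}
```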

 

Clinical Decision Services


The clinical decision services provide intelligence at the point of care, typically using deployed predictive models, clinical rules, text mining outputs, and ontology reasoners. For example, Machine Learning models can be exported in the Predictive Model Markup Language (PMML) format for run-time scoring based on the clinical data of individual patients, enabling what is referred to as Personalized Medicine. Clinical decision services include:

  • Diagnosis and prognosis
  • Simulation
  • Anomaly detection 
  • Data visualization
  • Information retrieval (e.g., clinical question answering)
  • Alerts and reminders
  • Support for care planning processes.
The clinical decision services can be deployed in the cloud as well. Other clinical systems can consume these services through a SOAP or REST-based web service interface (using the HL7 vMR and DSS specifications for interoperability) and single sign-on (SSO) standards like SAML2 and OpenID Connect.
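To make the PMML-based scoring concrete, here is a minimal sketch assuming the open source JPMML-Evaluator library (the 1.4.x API; class names vary across releases). The model file name and the input fields are hypothetical and would have to match whatever model is actually exported from R, Spark, or another tool.

```scala
import java.io.File
import scala.collection.JavaConverters._

import org.dmg.pmml.FieldName
import org.jpmml.evaluator.{EvaluatorUtil, FieldValue, LoadingModelEvaluatorBuilder}

object PmmlScoringService {
  def main(args: Array[String]): Unit = {
    // "risk-model.pmml" is a placeholder for a model exported from R, Spark, etc.
    val evaluator = new LoadingModelEvaluatorBuilder()
      .load(new File("risk-model.pmml"))
      .build()
    evaluator.verify()

    // Hypothetical input record for one patient; field names must match the PMML model.
    val inputRecord: Map[String, Any] = Map("age" -> 54, "systolic_bp" -> 142, "hba1c" -> 9.4)

    // Prepare each model input field from the raw record, then score.
    val arguments: java.util.Map[FieldName, FieldValue] =
      evaluator.getInputFields.asScala.map { field =>
        val name = field.getName
        name -> field.prepare(inputRecord.getOrElse(name.getValue, null).asInstanceOf[AnyRef])
      }.toMap.asJava

    val results = evaluator.evaluate(arguments)
    println(EvaluatorUtil.decodeAll(results))
  }
}
```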


Intelligent Personal Assistants (IPAs)


Clinical decision services can also be delivered to patients and clinicians through IPAs. IPAs can accept inputs in the form of voice, images, and the user's context, and respond in natural language. IPAs are also expanding to wearable technologies such as smart watches and glasses. The accuracy of speech recognition, natural language processing, and computer vision is improving rapidly with the adoption of Deep Learning techniques and tools. Accelerated hardware technologies like GPUs and FPGAs are improving the performance and reducing the cost of deploying these systems at scale.


Hexagonal, Reactive, and Secure Architecture


Intelligent Health IT systems are not just capable of discovering knowledge and patterns in data. They are also scalable, resilient, responsive, and secure. To achieve these objectives, several architectural patterns have emerged during the last few years:

  • Domain Driven Design (DDD) puts the emphasis on the core domain and domain logic and recommends a layered architecture (typically user interface, application, domain, and infrastructure) with each layer having well defined responsibilities and interfaces for interacting with other layers. Models exist within "bounded contexts". These "bounded contexts" communicate with each other typically through messaging and web services using HL7 standards for interoperability.

  • The Hexagonal Architecture defines "ports and adapters" as a way to design, develop, and test an application in a way that is independent of the various clients, devices, transport protocols (HTTP, REST, SOAP, MQTT, etc.), and even databases that could be used to consume its services in the future. This is particularly important in the era of the Internet of Things in healthcare.

  • Microservices consist of decomposing large monolithic applications into smaller services, following well-established principles of service-oriented design and single responsibility to achieve modularity, maintainability, scalability, and ease of deployment (for example, using Docker).

  • CQRS/ES: Command Query Responsibility Segregation (CQRS) and Event Sourcing (ES) are two architectural patterns based on event-driven messaging and an Event Store, separating commands (the write side) from queries (the read side) and relying on the principle of Eventual Consistency. CQRS/ES can be implemented in combination with microservices to deliver new capabilities such as temporal queries, behavioral analysis, complex audit logs, and real-time notifications and alerts.

  • Functional Programming: Functional Programming languages like Scala have several benefits that are particularly important for applying Machine Learning algorithms on large data sets. Pure functions in Scala, like functions in mathematics, have no side effects, which provides referential transparency. Machine Learning algorithms are in fact based on Linear Algebra and Calculus. Scala supports higher-order functions as well, and its immutable values greatly simplify concurrency. For all these reasons, Machine Learning libraries like Apache Mahout have embraced Scala, moving away from the Java MapReduce paradigm.

  • Reactive Architecture: The Reactive Manifesto makes the case for a new breed of applications called "Reactive Applications". According to the manifesto, the Reactive Application architecture allows developers to build "systems that are event-driven, scalable, resilient, and responsive."  Leading frameworks that support Reactive Programming include Akka and RxJava. The latter is a library for composing asynchronous and event-based programs using observable sequences. RxJava is a Java port (with a Scala adaptor) of the original Rx (Reactive Extensions) for .NET created by Erik Meijer.

    Based on the Actor Model and built in Scala, Akka is a framework for building highly concurrent, asynchronous, distributed, and fault tolerant event-driven applications on the JVM. Akka offers location transparency, fault tolerance, asynchronous message passing, and a non-deterministic share-nothing architecture. Akka Cluster provides a fault-tolerant decentralized peer-to-peer based cluster membership service with no single point of failure or single point of bottleneck.

    Also built with Scala, Apache Kafka is a scalable message broker which provides high throughput, fault tolerance, built-in partitioning, and replication for processing real-time data streams. In the reference architecture, the ingestion layer is implemented with Akka and Apache Kafka (a minimal sketch follows this list).

  • Web Application Security: special attention is given to security across all layers, notably the proper implementation of authentication, authorization, encryption, and audit logging. The implementation of security is also driven by deep knowledge of application security patterns, threat modeling, and enforcing security best practices (e.g., OWASP Top Ten and CWE/SANS Top 25 Most Dangerous Software Errors) as part of the continuous delivery process.
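As a minimal sketch of the ingestion layer mentioned above, the following Scala code uses a classic Akka actor to receive sensor readings and publish them to an Apache Kafka topic. The VitalSignReading message type, topic name, and broker address are illustrative assumptions.

```scala
import java.util.Properties

import akka.actor.{Actor, ActorSystem, Props}
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

// Hypothetical message type for a single sensor observation.
case class VitalSignReading(patientId: String, metric: String, value: Double, timestamp: Long)

// An actor that receives readings (e.g., from device gateways) and publishes
// them to a Kafka topic for downstream stream mining and archival.
class IngestionActor(topic: String) extends Actor {
  private val props = new Properties()
  props.put("bootstrap.servers", "localhost:9092") // illustrative broker address
  props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
  props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
  private val producer = new KafkaProducer[String, String](props)

  def receive: Receive = {
    case r: VitalSignReading =>
      val payload = s"${r.patientId},${r.metric},${r.value},${r.timestamp}"
      producer.send(new ProducerRecord[String, String](topic, r.patientId, payload))
  }

  override def postStop(): Unit = producer.close()
}

object IngestionApp extends App {
  val system = ActorSystem("ingestion")
  val ingester = system.actorOf(Props(new IngestionActor("vital-signs")), "ingester")
  ingester ! VitalSignReading("example-123", "heart_rate", 72.0, System.currentTimeMillis())
}
```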

An Interface that Works across Devices and Platforms


The front-end uses a Mobile First approach and a Single Page Application (SPA) architecture with JavaScript-based frameworks like AngularJS to create very responsive user experiences. It also allows us to bring the following software engineering best practices to the front-end:

  • Dependency Injection
  • Test-Driven Development (Jasmine, Karma, PhantomJS)
  • Package Management (Bower or npm)
  • Build system and Continuous Integration (Grunt or Gulp.js)
  • Static Code Analysis (JSLint and JSHint), and 
  • End-to-End Testing (Protractor). 
For mobile devices, Apache Cordova can be used to access native functions when desired. The main goal is to provide a user interface that works across devices and platforms such as iOS, Android, and Windows Phone.

Interoperability


Interoperability will always be a key requirement in clinical systems. Interoperability is needed between all players in the healthcare ecosystem including providers, payers, labs, knowledge artifact developers, quality measure developers, and public health agencies like the CDC. The necessary standards exist today and are implementation-ready. However, only health IT buyers have the leverage to demand interoperability from their vendors.

Standards related to clinical decision support (CDS) include:

  • The HL7 Fast Healthcare Interoperability Resources (FHIR)
  • The HL7 virtual Medical Record (vMR)
  • The HL7 Decision Support Services (DSS) specification
  • The HL7 CDS Knowledge Artifact specification
  • The DMG Predictive Model Markup Language (PMML) specification.

Overcoming Barriers to Adoption


In a previous post, I discussed a practical approach to addressing challenges to the adoption of clinical decision support (CDS) systems.


Monday, August 25, 2014

Ontologies for Addiction and Mental Disease: Enabling Translational Research and Clinical Decision Support

In a previous post titled Why do we need ontologies in healthcare applications, I elaborated on what ontologies are and why they are different from information models and data structures like relational database schemas and XML schemas commonly used in healthcare informatics applications. In this post, I discuss two interesting applications of ontology engineering related to addiction and mental disease treatment. The first is the use of ontologies for achieving semantic interoperability in translational research. The second is the use of ontologies for modeling complex medical knowledge in clinical practice guidelines (CPGs) for the purpose of automated reasoning during execution in clinical decision support (CDS) systems at the point of care.

Why is Semantic Interoperability needed in biomedical translational research?


In order to accelerate the discovery of new effective therapeutics for mental health and addiction treatment, there is a need to integrate data across disciplines spanning biomedical research and clinical care delivery [1]. For example, linking data across disciplines can facilitate a better understanding of treatment response variability among patients in addiction treatment. These disciplines include:

  • Genetics, the study of genes.
  • Chemistry, the study of chemical compounds including substances of abuse like heroin.
  • Neuroscience, the study of the nervous system and the brain (addiction is a chronic disease of the brain).
  • Psychiatry, which focuses on the diagnosis, treatment, and prevention of addiction and mental disorders.

Each of these disciplines has its own terminology or controlled vocabularies. In the clinical domain for example, DSM-5 and RxNorm are used for documenting clinical care. In biomedical research, several ontologies have been developed over the last few years including:
  • The Gene Ontology (GO)
  • The Chemical Entities of Biological Interest Ontology (CHEBI)
  • NeuroLex, an OWL ontology covering major domains of neuroscience: anatomy, cell, subcellular, molecule, function, and dysfunction.

To facilitate semantic interoperability between these ontologies, there are best practices established by the Open Biomedical Ontologies (OBO) community. An example of such a best practice is the use of an upper-level ontology called the Basic Formal Ontology (BFO), which acts as a common foundational ontology upon which new ontologies can be created. OBO ontologies and principles are available on the OBO Foundry web site.

Among the ontologies available on the OBO Foundry is the Mental Functioning Ontology (MF) [2, 3]. The MF is being developed as a collaboration between the University of Geneva in Switzerland and the University at Buffalo in the United States. The project also includes a Mental Disease Ontology (MD) which extends the MF and the Ontology for General Medical Science (OGMS). The Basic Formal Ontology (BFO) is an upper-level ontology for both the MF and the OGMS. The picture below is a view of the class hierarchy of the MD showing details of the class "Paranoid Schizophrenia" in the right pane of the window of the beta release of Protégé 5, an open source ontology development environment (click on the image to enlarge it).

The following is a tree view of the "Mental Disease Course" class (click on the image to enlarge it):



Ontology constructs defined by the OWL 2 language can help establish common semantics (meaning) and relationships between entities across domains. These constructs support automated inferencing over relationships between entities, such as equivalence (e.g., owl:sameAs and owl:equivalentClass) and subsumption (e.g., rdfs:subClassOf).

In addition, publishing data sources following Linked Open Data (LOD) principles and semantic search using federated SPARQL queries can help answer new research questions. Another application is semantic annotation for natural language processing (NLP) applications.
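For example, a federated or remote SPARQL query can be issued programmatically with Apache Jena (the 3.x APIs are assumed here). The endpoint URL and the class IRI in the query are placeholders; a real query would target SPARQL endpoints exposing OBO ontologies or other Linked Open Data sets.

```scala
import org.apache.jena.query.QueryExecutionFactory

object FederatedOntologyQuery {
  def main(args: Array[String]): Unit = {
    // Endpoint URL and class IRI are illustrative placeholders.
    val endpoint = "http://example.org/sparql"
    val query =
      """PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
        |SELECT ?subclass ?label WHERE {
        |  ?subclass rdfs:subClassOf <http://example.org/ontology#MentalDisease> .
        |  ?subclass rdfs:label ?label .
        |}""".stripMargin

    val exec = QueryExecutionFactory.sparqlService(endpoint, query)
    try {
      val results = exec.execSelect()
      while (results.hasNext) {
        val solution = results.next()
        println(s"${solution.get("subclass")}  ${solution.get("label")}")
      }
    } finally {
      exec.close()
    }
  }
}
```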

 

Ontologies as knowledge representation formalism for clinical decision support (CDS)


As knowledge representation formalism, ontologies are well suited for modeling complex medical knowledge and can facilitate reasoning during the automated execution of clinical practice guidelines (CPGs) and Care Pathways (CPs) based on patient data at the point of care. Several approaches to modelling CPGs and CPs have been proposed in the past including PROforma, HELEN, EON, GLIF, PRODIGY, and SAGE. However, the lack of free and open source tooling has been a major impediment to a wide adoption of these knowledge representation formalisms. OWL has the advantage of being a widely implemented W3C Recommendation with available mature open source  tools.

In practice, the medical knowledge contained in CPGs can be manually translated into IF-THEN statements in most programming languages. Executable CDS rules (like other complex types of business rules) can be implemented with a production rule engine using forward chaining. This is the approach taken by OpenCDS and some large scale CDS implementations in real world healthcare delivery settings. This allows CDS software developers to externalize the medical knowledge contained in clinical guidelines in the form of declarative rules as opposed to embedding that knowledge in procedural code. Many viable open source business rule management systems (BRMS) are available today and provide capabilities such as a rule authoring user interface, a rules repository, and a testing environment.

However, production rule systems have a limitation: they do not scale, because they require writing a rule for each clinical concept code (there are more than 311,000 active concepts in SNOMED CT alone). An alternative is to exploit the class hierarchy in an ontology so that subclasses of a given superclass can inherit the clinical rules that are applicable to the superclass (this is called subsumption). In addition to subsumption, an OWL ontology also supports reasoning with description logic (DL) axioms [4].

An ontology designed for a clinical decision support (CDS) system can integrate the clinical rules from a CPG, a domain ontology like the Mental Disease (MD) ontology, and the patient medical record from an EHR database in order to provide inferences in the form of treatment recommendations at the point of care. The OWL API [5] facilitates the integration of ontologies into software applications. It supports inferencing using reasoners like Pellet and HermiT. OWL 2 reasoning capabilities can be enhanced with rules represented in SWRL (Semantic Web Rule Language), which is implemented by reasoners like Pellet as well as the Protégé OWL development environment. In addition to inferencing, another benefit of an OWL-based approach is transparency: the CDS system can provide an explanation or justification of how it arrives at its treatment recommendations.
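The following sketch illustrates this integration pattern with the OWL API and the HermiT reasoner from Scala: load an ontology, classify it, and retrieve the inferred subclasses of a class of interest. The ontology file name and class IRI are placeholders.

```scala
import java.io.File
import scala.collection.JavaConverters._

import org.semanticweb.HermiT.ReasonerFactory
import org.semanticweb.owlapi.apibinding.OWLManager
import org.semanticweb.owlapi.model.IRI

object OntologyReasoningExample {
  def main(args: Array[String]): Unit = {
    val manager = OWLManager.createOWLOntologyManager()
    // "mental-disease.owl" and the class IRI below are placeholders.
    val ontology = manager.loadOntologyFromOntologyDocument(new File("mental-disease.owl"))
    val dataFactory = manager.getOWLDataFactory

    val mentalDisease = dataFactory.getOWLClass(IRI.create("http://example.org/ontology#MentalDisease"))

    // Classify the ontology with HermiT and retrieve all inferred subclasses,
    // i.e., the subsumption hierarchy that clinical rules can be attached to.
    val reasoner = new ReasonerFactory().createReasoner(ontology)
    val subclasses = reasoner.getSubClasses(mentalDisease, false).getFlattened.asScala

    subclasses.foreach(cls => println(cls.getIRI))
    reasoner.dispose()
  }
}
```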

Nonetheless, these approaches are not mutually exclusive: a production rule system can be integrated with business processes, ontologies, and predictive analytics models. Predictive analytics models provide a probabilistic approach to treatment recommendations to assist in the clinical decision making process.

References


[1]  Janna Hastings, Werner Ceusters, Mark Jensen, Kevin Mulligan and Barry Smith. Representing mental functioning: Ontologies for mental health and disease. Proceedings of the Mental Functioning Ontologies workshop of ICBO 2012, Graz, Austria.

[2]  Ceusters, W. and Smith, B. (2010a). Foundations for a realist ontology of mental disease. Journal of Biomedical Semantics, 1(1), 10.

[3] Hastings, J., Smith, B., Ceusters, W., and Mulligan, K. (2012). The mental functioning ontology. http://code.google.com/p/mental-functioning-ontology/, last accessed August 24, 2014

[4] Sesen MB, Peake MD, Banares-Alcantara R, Tse D, Kadir T, Stanley R, Gleeson F, Brady M. 2014 Lung Cancer Assistant: a hybrid clinical decision support application for lung cancer care. J. R. Soc. Interface 11: 20140534.

[5] Matthew Horridge, Sean Bechhofer. The OWL API: A Java API for OWL Ontologies Semantic Web Journal 2(1), Special Issue on Semantic Web Tools and Systems, pp. 11-21, 2011.

Sunday, August 17, 2014

Natural Language Processing (NLP) for Clinical Decision Support: A Practical Approach

A significant portion of the electronic documentation of clinical care is captured in the form of unstructured narrative text like psychotherapy and progress notes. Despite the big push to adopt structured data entry (as required by the Meaningful Use incentive program for example), many clinicians still like to document care using free narrative text. The advantage of using narrative text as opposed to coded entries is that narrative text can tell the story of the patient and the care provided particularly in complex cases. My opinion is that free narrative text should be used to complement coded entries when necessary to capture relevant information.

Furthermore, medical knowledge is expanding very rapidly. For example, PubMed has more than 24 million citations for biomedical literature from MEDLINE, life science journals, and online books. It is impossible for the human brain to keep up with that amount of knowledge. These unstructured sources of knowledge contain the scientific evidence that is required for effective clinical decision making in what is referred to as Evidence-Based Medicine (EBM).

In this blog, I discuss two practical applications of Natural Language Processing (NLP). The first is the use of NLP tools and techniques to automatically extract clinical concepts and other insight from clinical notes for the purpose of providing treatment recommendations in Clinical Decision Support (CDS) systems. The second is the use of text analytics techniques like clustering and summarization for Clinical Question Answering (CQA).

The emphasis of this post is on a practical approach using freely available and mature open source tools as opposed to an academic or theoretical approach. For a theoretical treatment of the subject, please refer to the book Speech and Language Processing by Daniel Jurafsky and James Martin.


Clinical NLP with Apache cTAKES


Based on the Apache Unstructured Information Management Architecture (UIMA) framework and the Apache OpenNLP natural language processing toolkit, Apache cTAKES provides a modular architecture utilizing both rule-based and machine learning techniques for information extraction from clinical notes. cTAKES can extract named entities (clinical concepts) from clinical notes in plain text or HL7 CDA format and map these entities to various dictionaries including the following Unified Medical Language System (UMLS) semantic types: diseases/disorders, signs/symptoms, anatomical sites, procedures, and medications.

cTAKES includes the following key components which can be assembled to create processing pipelines:

  • Sentence boundary detector based on the OpenNLP Maximum Entropy (ME) sentence detector.
  • Tokenizer
  • Normalizer using the National Library of Medicine's Lexical Variant Generation (LVG) tool
  • Part-of-speech (POS) tagger
  • Shallow parser
  • Named Entity Recognition (NER) annotator using dictionary look-up to UMLS concepts and semantic types. The Drug NER can extract drug entities and their attributes such as dosage, strength, route, etc.
  • Assertion module which determines the subject of the statement (e.g., is the subject of the statement the patient or a parent of the patient) and whether a named entity or event is negated (e.g., does the presence of the word "depression" in the text imply that the patient has depression).
Apache cTAKES 3.2 has added YTEX, a set of extensions developed at Yale University which provides integration with MetaMap, semantic similarity, export to Machine Learning packages like Weka and R, and feature engineering.
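As a minimal illustration, the sketch below assembles and runs the default cTAKES pipeline from Scala using the uimaFIT helpers and the ClinicalPipelineFactory class that ships with cTAKES 3.2, then prints the clinical concepts found in a toy note. Treat it as a sketch under those assumptions; real deployments configure dictionaries and UMLS access explicitly.

```scala
import scala.collection.JavaConverters._

import org.apache.ctakes.clinical.pipeline.ClinicalPipelineFactory
import org.apache.ctakes.typesystem.`type`.textsem.IdentifiedAnnotation
import org.apache.uima.fit.factory.JCasFactory
import org.apache.uima.fit.pipeline.SimplePipeline
import org.apache.uima.fit.util.JCasUtil

object ClinicalNoteProcessor {
  def main(args: Array[String]): Unit = {
    // A toy clinical note; real input would be plain text or HL7 CDA documents.
    val note = "Patient reports worsening depression. Sertraline 50 mg daily was started."

    val jcas = JCasFactory.createJCas()
    jcas.setDocumentText(note)

    // Assemble and run the default cTAKES pipeline (sentence detection, tokenization,
    // POS tagging, chunking, dictionary lookup, assertion, etc.).
    val pipeline = ClinicalPipelineFactory.getDefaultPipeline
    SimplePipeline.runPipeline(jcas, pipeline)

    // Print the clinical concepts (named entities) found in the note.
    JCasUtil.select(jcas, classOf[IdentifiedAnnotation]).asScala.foreach { annotation =>
      println(s"${annotation.getCoveredText} [${annotation.getClass.getSimpleName}]")
    }
  }
}
```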

The following diagram from the Apache cTAKES Wiki provides an overview of the cTAKES components and their dependencies (click to enlarge):


Massively Parallel Clinical Text Analytics in the Cloud with GATECloud


The General Architecture for Text Engineering (GATE) is a mature, comprehensive, and open source text analytics platform. GATE is a family of tools which includes:

  • GATE Developer: an integrated development environment (IDE) for language processing components with a comprehensive set of available plugins called CREOLE (Collection of REusable Objects for Language Engineering). 
  • GATE Embedded: an object library for embedding services developed with GATE Developer into third-party applications.
  • GATE Teamware: a collaborative semantic annotation environment based on a workflow engine for creating manually annotated corpora for applying machine learning algorithms. 
  • GATE Mímir: the "Multi-paradigm Information Management Index and Repository" which supports a multi-paradigm approach to index and search over text, ontologies, and semantic metadata.
  • GATE Cloud: a massively parallel clinical text analytics platform (Platform as a Service or PaaS) built on the Amazon AWS Cloud.
What makes GATE particularly attractive is the recent addition of GATECloud.net PaaS which can boost the productivity of people involved in large scale text analytics tasks.

 

Clustering, Classification, Text Summarization, and Clinical Question Answering (CQA)

 

Clustering, an unsupervised machine learning approach, can be used to organize large volumes of medical literature into groups (clusters) based on some similarity measure (such as the Euclidean distance). Clustering can be applied at the document, search result, and word/topic levels. Carrot2 and Apache Mahout are open source projects that provide several methods for document clustering. For example, the Latent Dirichlet Allocation learning algorithm in Apache Mahout automatically clusters words into topics and documents into mixtures of topics. Other clustering algorithms in Apache Mahout include: Canopy, Mean-Shift, Spectral, K-Means, and Fuzzy K-Means. Apache Mahout is part of the Hadoop ecosystem and can therefore scale to very large volumes of unstructured text.

Document classification consists of assigning a predefined set of labels to documents. This can be achieved through supervised machine learning algorithms. Apache Mahout implements the Naive Bayes classifier.
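Mahout's Naive Bayes trainer is usually driven from the command line over a Hadoop cluster, so as a compact code illustration of the same idea, here is a sketch using the Naive Bayes implementation in Spark MLlib instead, with a tiny hypothetical training set of labeled abstracts and a simple bag-of-words feature encoding.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.classification.NaiveBayes
import org.apache.spark.mllib.feature.HashingTF
import org.apache.spark.mllib.regression.LabeledPoint

object DocumentClassifier {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("DocumentClassifier"))
    val tf = new HashingTF()

    // Hypothetical labeled training set: 1.0 = relevant to depression treatment, 0.0 = not.
    val trainingDocs = Seq(
      (1.0, "sertraline effective in major depressive disorder"),
      (1.0, "cognitive behavioral therapy for depression outcomes"),
      (0.0, "surgical management of hip fracture in the elderly")
    )

    // Convert each document into a bag-of-words term-frequency vector.
    val training = sc.parallelize(trainingDocs).map { case (label, text) =>
      LabeledPoint(label, tf.transform(text.toLowerCase.split("\\s+").toSeq))
    }

    // Train with additive (Laplace) smoothing of 1.0.
    val model = NaiveBayes.train(training, 1.0)

    val testDoc = "randomized trial of sertraline versus placebo for depression"
    val prediction = model.predict(tf.transform(testDoc.toLowerCase.split("\\s+").toSeq))
    println(s"predicted label: $prediction")
    sc.stop()
  }
}
```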

Text summarization techniques can be used to present succinct and clinically relevant evidence to clinicians at the point of care. MEAD (http://www.summarization.com/mead/) is an open source project that implements multiple summarization algorithms. In the biomedical domain, SemRep is a program that extracts semantic predications (subject-relation-object triples) from biomedical free text. The subject and object arguments of each predication are concepts from the UMLS Metathesaurus and the relation is from the UMLS Semantic Network (e.g., TREATS, CO-OCCURS_WITH). SemRep-based summarization provides a short summary of these concepts and their semantic relations.

AskHermes (Help clinicians to Extract and aRticulate Multimedia information for answering clinical quEstionS) is a project that attempts to implement these techniques in the clinical domain. It allows clinicians to enter questions in natural language and uses the following unstructured information sources: MEDLINE abstracts, PubMed Central full-text articles, eMedicine documents, clinical guidelines, and Wikipedia articles.

The processing pipeline in AskHermes includes the following steps: Question Analysis, Related Questions Extraction, Information Retrieval, Summarization, and Answer Presentation. AskHermes performs question classification using MMTx (MetaMap Technology Transfer) to map keywords to UMLS concepts and semantic types. Classification is achieved through supervised machine learning algorithms such as Support Vector Machines (SVMs) and conditional random fields (CRFs). Summarization and answer presentation are based on clustering techniques. AskHermes is powered by open source components including: JBoss Seam, Weka, Mallet, Carrot2, Lucene/Solr, and WordNet (a lexical database for the English language).

Thursday, August 15, 2013

Health IT Innovations for Care Coordination

The Business Case


According to an article by Bodenheimer et al. published in the January/February 2009 issue of Health Affairs and titled Confronting The Growing Burden Of Chronic Disease: Can The U.S. Health Care Workforce Do The Job?:

In 2005, 133 million Americans were living with at least one chronic condition. In 2020, this number is expected to grow to 157 million. In 2005, sixty-three million people had multiple chronic illnesses, and that number will reach eighty-one million in 2020.

Patients with co-morbidities are typically treated by multiple clinicians working for different healthcare organizations. Care Coordination is necessary for the effective treatment of these patients and for reducing costs. Effective Care Coordination can reduce the number of redundant tests and procedures, hospital admissions and readmissions, medical errors, and patient safety issues related to the lack of medication reconciliation.

According to a paper by Dennison and Hughes published in the Journal of Cardiovascular Nursing and titled Progress in Prevention: Imperative to Improve Care Transitions for Cardiovascular Patients, direct communication between the hospital and the primary care setting occurred only 3 percent of the time. According to the same paper, a discharge summary was available at the first post-discharge visit only 12 percent of the time, and availability remained poor at 4 weeks post-discharge, when only 51 percent of practitioners had a summary. The paper concluded that this deficit affected quality of care in 25 percent of follow-up visits.

Health Information Exchanges (HIEs) and emerging delivery models like the Accountable Care Organization (ACO) and the Patient-Centered Medical Home (PCMH) were designed to promote care coordination. However, according to an article by Furukawa et al. published in the August 2013 issue of Health Affairs and titled Hospital Electronic Health Information Exchange Grew Substantially In 2008–12:

In 2012, 51 percent of hospitals exchanged clinical information with unaffiliated ambulatory care providers, but only 36 percent exchanged information with other hospitals outside the organization. . . . In 2012 more than half of hospitals exchanged laboratory results or radiology reports, but only about one-third of them exchanged clinical care summaries or medication lists with outside providers.                      


Furthermore, the financial sustainability of many HIEs remains an issue. According to another article by Adler-Milstein et al. published in the same issue of Health Affairs and titled Operational Health Information Exchanges Show Substantial Growth, But Long-Term Funding Remains A Concern, "74 percent of health information exchange efforts report struggling to develop a sustainable business model".  

There are other obstacles to care coordination including the existing fee-for-service healthcare delivery model (as opposed to a value-based model), the lack of interoperability between healthcare information systems, and the lack of adoption of effective collaboration tools.

According to a report by the Institute of Medicine (IOM) titled The Healthcare Imperative: Lowering Costs and Improving Outcomes, a program designed to improve care coordination could result in national annual savings of $240.1 billion.

What Can We Learn From High Risk Operations in Other Industries?


Similar breakdowns in communication during shift handovers have also been observed in risky operating environments, sometimes with devastating consequences. In the aerospace industry, human factors research and training have played an important role in successfully addressing the issue. A research paper by Parke and Mishkin titled Best Practices in Shift Handover Communication: Mars Exploration Rover Surface Operations included the following recommendations:

  • Two-way Communication, Preferably Face-to-Face. . . . Two-way communication enables the incoming worker to ask questions and rephrase the material to be handed over, so as to expose these differences [in mental model].


  • Face-to-Face Handovers with Written Support. Face-to-face handovers are improved if they are supported by structured written material—e.g., a checklist of items to convey, and/or a position log to review. 


  • Content of Handover Captures Intent. Handover communication works best if it captures problems, hypotheses, and intent, rather than simply lists what occurred.
While the logistics of healthcare delivery do not always permit physical face-to-face communication between clinicians during transitions of care, the web has seen an explosion in online collaboration tools. Innovative organizations have embraced these technologies, giving rise to a new breed of enterprise software known as Enterprise 2.0 or Social Enterprise Software. This new breed of software is not only social, but also mobile and cloud-based.

Care Coordination in the Health Enterprise 2.0


  • Collaborative Authoring of a Longitudinal Care Plan. From a content perspective, the Care Plan is the backbone of Care Coordination. The Care Plan should be comprehensive and standardized (similar to the checklist in aerospace operations). It should include problems, medications, orders, results, care goals (taking into consideration the patient's wishes and values), care team members and their responsibilities, and actual patient outcomes (e.g., functional status). Clinical Decision Support (CDS) tools can be used to dynamically generate a basic Care Plan based on the patient's specific clinical data. This basic Care Plan can be used by members of the care team to build a more elaborate Longitudinal Care Plan. CDS tools can also automatically generate alerts and reminders for the care team.


  • Communication and Collaboration using Enterprise 2.0 Software. These tools should be used to enable collaboration between all members of the care team, which includes not only clinicians, but also non-clinician caregivers, and the patient herself. Beyond email, these tools allow conversations and knowledge sharing through instant messaging, video conferencing (for virtual two-way face-to-face communication), content management, file syncing (allowing the longitudinal care plan to be synchronized and shared among all members of the care team), search, and enterprise social networking (because clinical work is a social activity like most human activities). A provider directory should make it easy for users to find a specific provider and all their contact information based on search criteria such as location, specialty, knowledge, experience, and telephone number.


  • Light Weight Standards and Protocols for Content, Transport, Security, and Privacy. The foundation standards are: REST, JSON, OAuth2, and OpenID Connect. An emerging approach that could really help put patients in control of the privacy of their electronic medical record is the OAuth2.0-based User-Managed Access (UMA) Protocol of the Kantara Initiative (see my previous post titled Patient Privacy at Web Scale). Initiatives like the ONC-sponsored RESTful Health Exchange (RHEX) project and the HL7 Fast Healthcare Interoperability Resources (FHIR) hold great promise.


  • Case Management Tools. These are typically used by Nurse Practitioners (Case Managers) in Medical Homes, a concept popularized by the Patient-Centered Medical Home healthcare delivery model, to coordinate care. These tools integrate various capabilities such as risk stratification (using predictive modeling) to identify at-risk patients, content management (check-in, check-out, versioning), workflows (human tasks), communication, a business rule engine, and case reporting/analytics capabilities.

Sunday, April 14, 2013

Addressing Challenges to the Adoption of Clinical Decision Support (CDS) Systems: A Practical Approach

Laptop and stethoscope by jfcherry is licensed under CC BY-SA 2.0
This post has been updated on February 15, 2015.

Despite its potential to improve the quality of care, CDS is not widely used in health care delivery today. In technology marketing parlance, CDS has not crossed the chasm. There are several issues that need to be addressed including:

  • Clinicians' acceptance of the concept of automated execution of evidence-based clinical practice guidelines.

  • Integration into clinical workflows and care protocols.

  • Usability and human factors issues including alert fatigue and the human factors that influence alert acceptance.

  • With the expanding use of clinical prediction models for diagnosis and prognosis, there is a need for a better understanding of the probabilistic approach to clinical decision making under uncertainty.

  • Standardization to enable the interoperability and reuse of CDS knowledge artifacts and executable clinical guidelines.

  • The challenges associated with the automatic concurrent execution of multiple clinical practice guidelines for patients with co-morbidities.

  • Integration with modern information retrieval capabilities which allow clinicians to access up-to-date biomedical literature. These capabilities include text mining, Natural Language Processing (NLP), and more advanced Clinical Question Answering (CQA) tools. CQA allows clinicians to ask clinical questions in natural language and extracts answers from very large amounts of unstructured sources of medical knowledge. PubMed has more than 23 million citations for biomedical literature from MEDLINE, life science journals, and online books. The typical Clinical Practice Guideline (CPG) is 50 to 150 pages long. It is impossible for the human brain to keep up with that amount of knowledge.

  • The use of mathematical simulations in CDS to explore and compare the outcomes of various treatment alternatives.

  • Integration of genomics to enable personalized medicine as the cost of whole-genome sequencing (WGS) continues to fall.

  • Integration of outcomes research in the context of a shift to a value-based healthcare delivery model. This can be achieved by incorporating the results of Comparative Effectiveness Research (CER) and Patient-Centered Outcomes Research (PCOR) into CDS systems. Increasingly, outcomes research will be performed using observational studies (based on real world clinical data) which are recognized as complementary to randomized controlled trials (RCTs) for discovering what works and what doesn't work in practice. This is a form of Practice-Based Evidence (PBE) that is necessary to close the evidence loop.

  • Support for a shared decision making process which takes into account the values, goals, and wishes of the patient.

  • The use of Visual Analytics in CDS to facilitate analytical reasoning over very large amounts of structured and unstructured data sources.

  • Finally, the challenges associated with developing hybrid decision support systems which seamlessly integrate all the technologies mentioned above including: machine learning predictive algorithms, real-time data stream mining, visual analytics, ontology reasoning, and text mining.

In response to a paper titled Grand challenges in clinical decision support by Sittig et al. [1], Fox et al. [2] outlined four theoretical foundations for the design and implementation of CDS systems: decision theory, theories of knowledge representation, process design, and organizational modeling. The practical approach discussed in this post is grounded in those four theoretical foundations.


CDS Interoperability


The complexity and cost inherent in capturing the medical knowledge in clinical guidelines and translating that knowledge into executable code remain a major impediment to the widespread adoption of CDS software. Therefore, there is a need for standards for the interchange and reuse of CDS knowledge artifacts and executable clinical guidelines.

Different formalisms, methodologies, and architectures have been proposed over the years for representing the medical knowledge in clinical guidelines. Examples include but are not limited to the following:

  • The Arden Syntax
  • GLIF (Guideline Interchange Format)
  • GELLO (Guideline Expression Language Object-Oriented)
  • GEM (Guidelines Element Model)
  • The Web Ontology Language (OWL)
  • PROforma
  • EON
  • PRODIGY
  • Asbru
  • GUIDE
  • SAGE.
More recently, HL7 has published the Clinical Decision Support (CDS) Knowledge Artifact Specification which provides guidance on shareable CDS knowledge artifacts including event-condition-action rules, order sets, and documentation templates.

The HL7 Context-Aware Knowledge Retrieval (Infobutton) specifications provide a standard mechanism for clinical information systems to request context-specific clinical knowledge to satisfy clinicians and patients' information needs at the point of care.

Enabling the interoperability of executable clinical guidelines requires a standardized domain model for representing the medical information of patients and other contextual clinical information. The HL7 Virtual Medical Record (vMR) is a standardized domain model for representing the inputs and outputs of CDS systems. The ability to transform an HL7 CCDA document into an HL7 vMR document means that EHR systems that are Meaningful Use Stage 2 certified can consume these standard-compliant decision support services.

Because of the complexity and cost of developing CDS software, CDS capabilities can be exposed as a set of services (part of a service-oriented architecture [16]) which can be consumed by other clinical systems such as EHR and Computerized Physician Order Entry (CPOE) systems. When deployed in the cloud, these CDS services can be shared by several health care providers to reduce costs. The HL7 Decision Support Service (DSS) specification defines REST and SOAP web service interfaces, using the vMR as the message payload, for accessing interoperable decision support services.

In practice, executable CDS rules (like other complex types of business rules) can be implemented with a production rule system using forward chaining. This is the approach taken by OpenCDS and some other large scale CDS implementations in real-world healthcare delivery settings. This allows CDS software developers to externalize the medical knowledge contained in clinical practice guidelines in the form of declarative rules as opposed to embedding that knowledge in procedural code. Many viable open source business rule management systems (BRMS) are available today and provide capabilities such as a rule authoring user interface, a rules repository, and a testing environment. Furthermore, a rule execution environment can be integrated with business processes, ontologies, and predictive analytics models (more on that later).

The W3C Rule Interchange Format (RIF) specification is a possible solution to the interchange of executable CDS rules. The RIF Production Rule Dialect (RIF PRD) is designed as a common XML serialization syntax for multiple rule languages to enable rule interchange between different BRMS. For example, RIF-PRD would allow the exchange of executable rules between existing BRMS like JBoss Drools, IBM ILOG JRules, and Jess. RIF is currently a W3C Recommendation and is backed by several BRMS vendors. Cosentino et al. [3] described a model-driven approach for interoperability between IBM's proprietary ILOG rule language and RIF.


Seamless Integration into Clinical Workflows and Care Pathways


One of the main complaints against CDS systems is that they are not well integrated into clinical workflows and care protocols. Existing business process management standards like the Business Process Model and Notation (BPMN) can provide a proven, practical, and adaptable approach to integrating CDS rules with clinical pathways and protocols. Some existing open source and commercial BRMS already provide an integration of business rules and business processes out of the box, and there are well-known patterns for integrating rules and processes [4, 5, 6] in business applications.

In 2014, the Object Management Group (OMG) released the Decision Model and Notation (DMN) specification which defines various constructs for modeling decision logic. The combination of BPMN and DMN [7, 8] provides a practical approach for modeling the decisions in clinical practice guidelines while integrating these decisions with clinical workflows. BPMN and DMN also support the modeling of decisions and processes that span functional and organizational boundaries.


Human Factors in the Use of Clinical Decision Support Systems


We need to do a better job of understanding the human factors that influence alert acceptance by clinicians in CDS. We also need clear and proven usability guidelines (backed by scientific research) that can be implemented by CDS software developers. Several research projects have sought to understand why clinicians accept or ignore alerts in medication-related CDS [9, 10]. Zachariah et al. [11] developed and validated an Instrument for Evaluating Human-Factors Principles in Medication-Related Decision Support Alerts (I-MeDeSA). I-MeDeSA measures CDS alerts against nine human factors principles, including alarm philosophy, placement, visibility, prioritization, color learnability and confusability, text-based information, proximity of task components being displayed, and corrective actions.

The British National Health Service (NHS) Common User Interface (CUI) Program has created standards and guidance in support of the usability of clinical applications, with inputs from user interface design specialists, usability experts, and hundreds of clinicians with diverse backgrounds in using health information technology. The program is based on a rigorous development process which includes research, design, prototyping, review, usability testing, and patient safety assessment by clinicians. In the US, the National Institute of Standards and Technology (NIST) has also published guidance on the usability of clinical applications.

Studies have also shown that like in aviation, checklists can provide cognitive support to clinicians in the decision making process.


Integrating Genomic Data with CDS


The costs of whole-genome sequencing (WGS) and whole-exome sequencing (WES) continue to fall. Increasingly, both WGS and WES will be used in clinical practice for inherited disease risk assessment and pharmacogenomic findings [21]. There is a need for a modern CDS architecture that can support and facilitate the introduction and use of WGS and WES in clinical practice.

In a paper titled Technical desiderata for the integration of genomic data with clinical decision support [14], Welch et al. proposed technical requirements for the integration of genomic data with clinical decision support.  In another paper titled A proposed clinical decision support architecture capable of supporting whole genome sequence information [15], Welch et al. proposed the following clinical decision support architecture for supporting whole genome sequence information (click to enlarge):

Proposed service-oriented architecture (SOA) for whole genome sequence (WGS)-enabled CDS by Brandon M. Welch, Salvador Rodriguez Loya, Karen Eilbeck, and Kensaku Kawamoto is licensed under CC BY 3.0

The proposed architecture includes the following components: the genome variant knowledge base, the genome database, the CDS knowledge base, a CDS controller and the electronic health record (EHR). The authors suggest separating the genome data from the EHR data.


Practice-Based Evidence (PBE) needed for closing the evidence loop


Prospective randomized controlled trials (RCTs) are still considered the gold standard in evidence-based medicine. Although they can control for biases, RCTs are costly, time consuming, and must be performed under carefully controlled conditions.

The retrospective analysis of existing clinical data sources is increasingly recognized as complementary to RCTs for discovering what works and what doesn't work in real world clinical practice [23]. These retrospective studies will allow the creation of clinical prediction models which can provide personalized absolute risk and treatment outcome predictions for patients. They also facilitate what has been referred to as "rapid learning health care." [24]

Toward Data-Driven Clinical Decision Support (CDS)


William Osler (1849-1919) [19] famously said that "Medicine is a science of uncertainty and an art of probability."

The use of clinical prediction models for diagnosis and prognosis is becoming common practice in clinical care. These models can predict the health risks of patients based on their individual health data. Clinical Prediction Models provide absolute risk and treatment outcome prediction for conditions such as diabetes, kidney disease, cancer, cardiovascular disease, and depression.  These models are built with statistical learning techniques and introduce new challenges related to their probabilistic approach to clinical decision making under uncertainty [20]. In his book titled Super Crunchers: Why Thinking-By-Numbers Is The New Way To be Smart, Ian Ayres wrote:

Traditional experts make better decisions when they are provided with the results of statistical prediction. Those who cling to the authority of traditional experts tend to embrace the idea of combining the two forms of knowledge by giving the experts 'statistical support'. The purveyors of diagnostic software are careful to underscore that its purpose is only to provide support and suggestions. They want the ultimate decision and discretion to lie with the doctor. [12]

Furthermore, in order to leverage existing clinical domain knowledge from clinical practice guidelines and biomedical ontologies [22], machine learning algorithms' probabilistic approach to decision making under uncertainty must be complemented by technologies like production rule systems and ontology reasoners. Sesen et al. [18] designed a hybrid CDS for lung cancer care based on probabilistic reasoning with a Bayesian Network model and guideline-based recommendations using a domain ontology and an ontology reasoner.

Fox et al. [2] proposed an argumentation approach based on the construction, summarization, and prioritization of arguments for and against each generated candidate decision. These arguments can be qualitative or quantitative in nature. On the importance of presenting an evidence-based rationale in CDS systems, Fox et al. wrote:

In short, to improve usability of clinical user interfaces we advocate basing design around a firm theoretical understanding of the clinician’s perspective on the medical logic in a decision, the qualitative as well as quantitative aspects of the decision, and providing an evidence-based rationale for all recommendations offered by a CDS. [2]
In a paper titled A canonical theory of dynamic decision-making [13], Fox et al. proposed a canonical theory of dynamic decision-making and presented the PROforma clinical guideline modeling language as an instance of the canonical theory.

Clinical prediction model presentation techniques include traditional score charts, nomograms, and clinical rules [17]. However, Clinical Prediction Models are easier to use and maintain when deployed as scoring services (part of a service-oriented software architecture) and integrated into CDS systems. The scoring service can be deployed in the cloud to allow integration with multiple client clinical systems [20]. The Predictive Model Markup Language (PMML) specification published by the Data Mining Group (DMG) supports the interoperable deployment of predictive models in heterogeneous software environments.
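To make the notion of a scoring service concrete, here is a minimal sketch of a logistic regression prediction model exposed as a simple Scala function. The intercept and coefficients are made up for illustration; in practice they would come from a published, validated model or a PMML export of a fitted model.

```scala
object RiskScore {
  // Hypothetical model: intercept and weights are illustrative, not from a real study.
  private val intercept = -5.0
  private val weights = Map("age" -> 0.04, "systolicBp" -> 0.02, "smoker" -> 0.7)

  /** Absolute predicted risk from a logistic regression model:
    * p = 1 / (1 + exp(-(b0 + b1*x1 + ... + bn*xn)))
    */
  def predictedRisk(predictors: Map[String, Double]): Double = {
    val linear = intercept + weights.map { case (name, beta) =>
      beta * predictors.getOrElse(name, 0.0)
    }.sum
    1.0 / (1.0 + math.exp(-linear))
  }

  def main(args: Array[String]): Unit = {
    val risk = predictedRisk(Map("age" -> 64.0, "systolicBp" -> 150.0, "smoker" -> 1.0))
    println(f"Predicted risk: ${risk * 100}%.1f%%")
  }
}
```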

Visual Analytics or data visualization techniques can also play an important role in the effective presentation of Clinical Prediction Models to nonstatisticians particularly in the context of shared decision making.


Concurrent execution of multiple guidelines for patients with co-morbidities


According to the Medicare 2012 chart book, over two-thirds of beneficiaries have two or more chronic conditions, and 14% have six or more chronic conditions [25].

In Grand Challenges in Clinical Decision Support [1], Sittig et al. wrote:
The challenge is to create mechanisms to identify and eliminate redundant, contraindicated, potentially discordant, or mutually exclusive guideline-based recommendations for patients presenting with co-morbid conditions.
Michalowski et al. [26] proposed a mitigation framework based on a first-order logic (FOL) approach.


A CDS Architecture for the era of Precision Medicine


I proposed a scalable CDS architecture for Precision Medicine in another post titled Toward a Reference Architecture for Intelligent Systems in Clinical Care.

 

References


[1] Sittig DF, Wright A, Osheroff JA, Middleton B, Teich JM, Ash JS, et al. Grand challenges in clinical decision support. J Biomed Inform 2008;41(2):387-92.

[2] Fox, J., Glasspool, D.W., Patkar, V., Austin, M., Black, L., South, M., et al. (2010). Delivering clinical decision support services: there is nothing as practical as a good theory. J. Biomed. Inform. 43, 831–843

[3] Valerio Cosentino, Marcos Didonet del Fabro, Adil El Ghali. A model driven approach for bridging ILOG Rule Language and RIF. RuleML, Aug 2012, Montpellier, France.

[4] Mauricio Salatino. (Processes & Rules) OR (Rules & Processes) 1/X. http://salaboy.com/2012/07/19/processes-rules-or-rules-processes-1x/. Retrieved February 15, 2015.

[5] Mauricio Salatino. (Processes & Rules) OR (Rules & Processes) 2/X. http://salaboy.com/2012/07/28/processes-rules-or-rules-processes-2x/. Retrieved February 15, 2015.

[6] Mauricio Salatino. (Processes & Rules) OR (Rules & Processes) 3/X. http://salaboy.com/2012/07/29/processes-rules-or-rules-processes-3x/. Retrieved February 15, 2015.

[7] Sylvie Dan. Modeling Clinical Rules with the Decision Modeling and Notation (DMN) Specification. http://sylviedanba.blogspot.com/2014/05/modeling-clinical-rules-with-decision.html. Retrieved February 15, 2015.

[8] Dennis Andrzejewski, Eberhard Beck, Laura Tetzlaff. The transparent representation of medical decision structures based on the example of breast cancer treatment. 9th International Conference on Health Informatics.

[9] Phansalkar S, Zachariah M, Seidling HM, Mendes C, Volk L, Bates DW. Evaluation of medication alerts in electronic health records for compliance with human factors principles. J Am Med Inform Assoc. 2014 Oct;21(e2):e332-40. doi: 10.1136/amiajnl-2013-002279.

[10] Seidling HM, Phansalkar S, Seger DL, et al. Factors influencing alert acceptance: a novel approach for predicting the success of clinical decision support. J Am Med Inform Assoc 2011;18:479–84.

[11] Zachariah M, Phansalkar S, Seidling HM, et al. Development and preliminary evidence for the validity of an instrument assessing implementation of human-factors principles in medication-related decision-support systems--I-MeDeSA. J Am Med Inform Assoc 2011;18(Suppl 1):i62–72.

[12] Ayres I. Super Crunchers: Why Thinking-by-Numbers Is the New Way to Be Smart. Bantam, 2007.

[13] Fox J., Cooper R. P., Glasspool D. W. (2013). A canonical theory of dynamic decision-making. Front. Psychol. 4:150 10.3389/fpsyg.2013.00150.

[14] Welch, B.M.; Eilbeck, K.; Del Fiol, G.; Meyer, L.; Kawamoto, K. Technical desiderata for the integration of genomic data with clinical decision support. 2014.

[15] Welch BM, Loya SR, Eilbeck K, Kawamoto K. A proposed clinical decision support architecture capable of supporting whole genome sequence information. J Pers Med. 2014 Apr 4;4(2):176-99. doi: 10.3390/jpm4020176.

[16] Loya SR, Kawamoto K, Chatwin C, Huser V. Service oriented architecture for clinical decision support: a systematic review and future directions. J Med Syst. 2014 Dec;38(12):140. doi: 10.1007/s10916-014-0140-z.

[17] Ewout W. Steyerberg. Clinical Prediction Models. A Practical Approach to Development, Validation, and Updating. New York: Springer, 2010.

[18] Sesen MB, Peake MD, Banares-Alcantara R, Tse D, Kadir T, Stanley R, Gleeson F, Brady M. 2014 Lung Cancer Assistant: a hybrid clinical decision support application for lung cancer care. J. R. Soc. Interface 11: 20140534. http://dx.doi.org/10.1098/rsif.2014.0534

[19] Bean RB, Bean WB. Sir William Osler: aphorisms from his bedside teachings and writings. New York; 1950.

[20] Joel Amoussou. How good is your crystal ball?: Utility, Methodology, and Validity of Clinical Prediction Models. http://efasoft.blogspot.com/2015/01/how-good-is-your-crystal-ball-utility.html. Retrieved February 15, 2015.

[21] Dewey FE, Grove ME, Pan C, et al. Clinical Interpretation and Implications of Whole-Genome Sequencing. JAMA. 2014;311(10):1035-1045. doi:10.1001/jama.2014.1717.

[22] Joel Amoussou. Ontologies for Addiction and Mental Disease: Enabling Translational Research and Clinical Decision Support. http://efasoft.blogspot.com/2014/08/ontologies-for-addiction-and-mental.html. Retrieved February 2015.

[23] Dekker ALAJ, Gulliford SL, Ebert MA, Orton CG. Future radiotherapy practice will be based on evidence from retrospective interrogation of linked clinical data sources rather than prospective randomized controlled clinical trials. Medical Physics 2014;41:030601. DOI: 10.1118/1.4832139.

[24] Lambin, Philippe et al. 'Rapid Learning health care in oncology' – An approach towards decision support systems enabling customised radiotherapy. Radiotherapy and Oncology, Volume 109, Issue 1, 159–164.

[25] Centers for Medicare & Medicaid Services. Chronic Conditions Among Medicare Beneficiaries, Chartbook: 2012 Edition. http://www.cms.gov/Research-Statistics-Data-and-Systems/Statistics-Trends-and-Reports/Chronic-Conditions/Downloads/2012Chartbook.pdf. Accessed Feb. 15, 2015.

[26] Szymon Wilk, Martin Michalowski, Xing Tan, Wojtek Michalowski: Using First-Order Logic to Represent Clinical Practice Guidelines and to Mitigate Adverse Interactions. KR4HC@VSL 2014: 45-61.



Sunday, March 10, 2013

How Not to Build A Big Ball of Mud, Part 2

In a previous post entitled How not to build a big ball of mud, I described the complexity of modern software systems and the challenges faced today by software developers and architects. Domain-Driven Design (DDD) is a proven pattern language that can foster a disciplined approach to software development. DDD was first introduced by Eric Evans nine years ago in the seminal book Domain-Driven Design: Tackling Complexity in the Heart of Software. Over the last nine years, a community of practice has emerged around DDD and many lessons have been learned in applying DDD to real-world complex software development projects. During that time, software complexity has also increased significantly. Changes in the field of software development during the last few years include:

  • The proliferation of client devices, which requires a Responsive Web Design (RWD) approach. RWD is made possible by open web standards like HTML5, CSS3, and JavaScript, which have displaced proprietary user interface technologies like Flex and Silverlight. RWD frameworks like Twitter Bootstrap and JavaScript libraries like jQuery have become very popular with developers. The demands put on JavaScript on the client side have created the need for JavaScript MVC frameworks like AngularJS and EmberJS.

  • The importance of the user experience in a competitive online marketplace. Performing usability testing early in the software development life cycle (using wireframes or mockups) to test design ideas and obtain early feedback from future users is extremely valuable for creating the right solution. Metrics such as the System Usability Scale (SUS) can be used to assess the results of usability testing.

  • The prevalence of REST, JSON, OAuth2, and Web APIs for achieving web scale.

  • The emergence of Polyglot Persistence or the use of different persistence mechanisms such as relational, document, and graph databases within the same application. Developers are discovering that modeling data for NoSQL databases has many benefits, but also its own peculiarities.

  • The demands for quality and faster time-to-market have led to new techniques like test automation and continuous delivery.

The open source community has responded to these challenges by creating many frameworks which are supposed to facilitate the work of developing software. Software developers spend a considerable amount of time researching, learning, and integrating these various frameworks to build a system. Some of these frameworks can indeed be very helpful when used properly. However, DDD puts a big emphasis on understanding the domain. Here is what I learned from applying DDD over the last few years:


  • DDD is a significant intellectual investment, but with a potential for big rewards. To be successful in applying DDD, one must take the time to understand and digest the underlying principles, from the building blocks (entities, aggregates, value objects, modules, domain events, services, repositories, and factories) to the strategic aspects of applying DDD. For example, understanding the difference between an aggregate, a value object, and an entity is essential. Learning the right approach to designing aggregates is also very important as this can significantly impact transactions and performance. I highly recommend reading the recently published Implementing Domain-Driven Design by Vaughn Vernon. The book provides a contemporary approach to applying DDD. For example, it covers important topics in applying DDD to modern software systems such as: sub-domains, domain events, event stores and event sourcing, rules for aggregate design, transactions, eventual consistency, REST, NoSQL, and enterprise application integration with concrete examples. (A minimal sketch of a few of these building blocks appears after this list.)

  • Proper application layering (user interface, application, domain, and infrastructure), understanding the responsibility of each layer (for example, an anemic domain model and a fat application layer are anti-patterns), and coding to interfaces are all essential. DDD is object-oriented (OO) design done right. The SOLID principles of OO design are still applicable.

  • Determine if DDD is right for your project. Most of my work during the last few years has been in the healthcare domain. The HL7 CCDA and the Virtual Medical Record (vMR) define an information model for Electronic Healthcare Records (EHR) and Clinical Decision Support (CDS) systems respectively. Interoperability is an important and challenging issue in healthcare. DDD concepts such as "Strategic Design", "Context Map", "Bounded Context", and "Published Language" are very helpful in addressing and navigating this type of complexity.

  • As I mentioned earlier, DDD puts a big emphasis on understanding the domain. Developers applying DDD should be prepared to dedicate a considerable amount of time to learning about the domain, for example by collaborating with and carefully listening to domain experts and by reading as much as they can about the domain. This is also the key to creating a rich domain model with behavior (as opposed to an anemic one). I found that simply reading industry standards and regulations is a great way to understand a domain. So understanding the domain is not just the responsibility of the Business Analyst. The code is the expression of the domain, so the coder needs to understand the domain in order to express it with code.

  • Some developers blame popular frameworks for encouraging anemic domain models. I found that a lack of understanding of the domain and its business rules is a major contributing factor to anemia in the domain model. A rule engine like Drools can help externalize these business rules in the form of declarative rules that can be maintained by domain experts through a DSL, spreadsheet, or web-based user interface.

  • There are opportunities in using recent ideas like Event Sourcing and Command Query Responsibility Segregation (CQRS). These opportunities include scalability, true audit trails, data mining, temporal queries, and application integration. However, being pragmatic can help avoid unnecessary complexity.

  • I recommend exploring tools that are specifically designed to support a DDD or Model-Driven Development (MDD) approach. Apache Isis, Roma Meta Framework, Tynamo, and Naked Objects are examples of such tools. These tools can automatically generate all the layers of an application based on the specification of a domain model. By doing so, these tools allow you to really focus your time and attention on exploring and understanding the domain as opposed to framework and infrastructure concerns. For architects, these tools can serve as design pattern automation, constraining the development process to conform to DDD principles and patterns. I believe this is part of a larger trend in automating software development which also includes the essential practice of test automation. We software developers like to automate the job of other people. However, many tasks that we perform (including coding itself) are still very manual. Aspect-Oriented Programming (AOP) (using AspectJ for example) can also be used to enable this type of design pattern automation through compile-time weaving.

  • Check my previous post for 20 techniques for achieving software excellence.
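The following is a minimal sketch, in Python and with a hypothetical Prescription domain, of a few of the DDD building blocks mentioned above: a value object, an entity serving as an aggregate root, and a domain event. It is an illustration only, not a recommendation for a particular framework.

from dataclasses import dataclass
from datetime import datetime
from typing import List
import uuid

@dataclass(frozen=True)
class Dosage:                      # value object: immutable, compared by value
    amount_mg: int
    frequency_per_day: int

@dataclass(frozen=True)
class PrescriptionIssued:          # domain event: a fact that already happened
    prescription_id: str
    occurred_on: datetime

class Prescription:                # entity and aggregate root: identity plus invariants
    def __init__(self, medication: str, dosage: Dosage):
        self.id = str(uuid.uuid4())
        self.medication = medication
        self.dosage = dosage
        self.events: List[PrescriptionIssued] = []

    def issue(self):
        # Invariants would be enforced here before the event is recorded.
        self.events.append(PrescriptionIssued(self.id, datetime.utcnow()))

rx = Prescription("lisinopril", Dosage(amount_mg=10, frequency_per_day=1))
rx.issue()
print(rx.id, rx.events[0].occurred_on)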

Saturday, May 5, 2012

How to Add Arbitrary Metadata to Any Element of an HL7 CDA Document

There has been a lot of buzz lately about metadata tagging in the health IT community. In this post, I describe an approach to annotating HL7 CDA documents (or any other XML documents) without actually editing the document that is being annotated. Metadata tagging is just an example of annotation. The underlying principle of this approach is that Anyone can say Anything about Anything (the AAA slogan), which is well known in the Semantic Web community. In other words, anyone (e.g., patient, caregiver, physician, provider organization) should have the ability to add arbitrary metadata to any element of a CDA document. For the sake of "Separation of Concerns", which is a fundamental principle in software engineering, the metadata should be kept out of the CDA document. The benefits of keeping the metadata or annotations out of the CDA document include:
  • Reuse of the same metadata by distinct elements from potentially multiple clinical documents.
  • The ability to update the metadata without affecting the target CDA documents.
  • The ability for any individual, organization, or community of interest (e.g., privacy or medical device manufacturers) to create a metadata vocabulary without going through the process of modifying the normative CDA specification (or one of its derivatives like the CCD, the C32, or the Consolidated CDA) or the XDS metadata specifications.

History and Current Status of Metadata Standards in Health IT


The CDA specification defines some metadata in the header of a CDA document. In addition, the XD* family of specifications (XDS, XDR, and XDM) also defines a comprehensive set of metadata to be used in cross enterprise document exchange. NIEM is being used currently in several health IT projects. In a previous post titled "Toward a Universal Exchange Language for Healthcare", I described how the NIEM metadata approach could be adapted to the healthcare domain.

The President's Council of Advisors on Science and Technology (PCAST) published a report in December 2010 entitled: "Realizing the Full Potential of Health Information Technology to Improve Healthcare for Americans: The Path Forward". To describe the proposed approach to metadata tagging, the report provides an example based on the exchange of mammograms:
"The physician would be able to securely search for, retrieve, and display these privacy protected data elements in much the way that web surfers retrieve results from a search engine when they type in a simple query.
What enables this result is the metadata attached to each of these data elements (mammograms), which would include (i) enough identifying information about the patient to allow the data to be located (not necessarily a universal patient identifier), (ii) privacy protection information-who may access the mammograms, either identified or de­identified, and for what purposes, (iii) the provenance of the data-the date, time, type of equipment used, personnel (physician, nurse, or technician), and so forth."
The HIT Standards Committee (HITSC) Metadata Tiger Team made specific recommendations to the ONC in June 2011. These recommendations included the use of:

  • Policy Pointers: URLs that point to external policy documents affecting the tagged data element.
  • Content Metadata: the actual metadata with datatype (clinical category) and sensitivity (e.g., substance abuse and mental health).
  • Use of the HL7 CDA R2 with headers.

Based on those recommendations, the ONC published a Notice of Proposed Rule Making (NPRM) in August 2011 to receive comments on proposed metadata standards.

The Data Segmentation Working Group of the ONC Standards and Interoperability Framework is currently working on metadata tagging for compliance with privacy policies and consent directives.


The Annotea Protocol


The capability to add arbitrary metadata to documents without modifying them has been available in the Semantic Web for at least a decade. Indeed, it is hard to talk about metadata without a reference to the Semantic Web. I will use the W3C Annotea Protocol (which is implemented by the Amaya open source project) to demonstrate this capability. I will also show that this approach does not require the use of the Resource Description Framework (RDF) format and related Semantic Web technologies like OWL and SPARQL. The approach can be adapted to alternative representation formats such as XML, JSON, or the Atom syndication format. Let's assume that I need to add metadata tags to the CDA document below. The CDA document has only one problem entry for substance abuse disorder (SNOMED CT code 66214007), and my goal is to attach privacy metadata prohibiting the disclosure of that information (the most relevant elements are the problem entry and the value element carrying the SNOMED CT code):

<ClinicalDocument>
.....
<component>
<structuredBody>
<component>
<!--Problems-->
<section>
<templateId root="2.16.840.1.113883.3.88.11.83.103"
    assigningAuthorityName="HITSP/C83"/>
<templateId root="1.3.6.1.4.1.19376.1.5.3.1.3.6"
    assigningAuthorityName="IHE PCC"/>
<templateId root="2.16.840.1.113883.10.20.1.11" assigningAuthorityName="HL7 CCD"/>
<!--Problems section template-->
<code code="11450-4" codeSystem="2.16.840.1.113883.6.1" codeSystemName="LOINC"
    displayName="Problem list"/>
<title>Problems</title>
<text>...</text>
<entry typeCode="DRIV">
<act classCode="ACT" moodCode="EVN">
    <templateId root="2.16.840.1.113883.3.88.11.83.7"
        assigningAuthorityName="HITSP C83"/>
    <templateId root="2.16.840.1.113883.10.20.1.27"
        assigningAuthorityName="CCD"/>
    <templateId root="1.3.6.1.4.1.19376.1.5.3.1.4.5.1"
        assigningAuthorityName="IHE PCC"/>
    <templateId root="1.3.6.1.4.1.19376.1.5.3.1.4.5.2"
        assigningAuthorityName="IHE PCC"/>
    <!-- Problem act template -->
    <id root="6a2fa88d-4174-4909-aece-db44b60a3abb"/>
    <code nullFlavor="NA"/>
    <statusCode code="completed"/>
    <effectiveTime>
        <low value="1950"/>
        <high nullFlavor="UNK"/>
    </effectiveTime>
    <performer typeCode="PRF">
        <assignedEntity>
            <id extension="PseudoMD-2" root="2.16.840.1.113883.3.72.5.2"/>
            <addr/>
            <telecom/>
        </assignedEntity>
    </performer>
    <entryRelationship typeCode="SUBJ" inversionInd="false">
        <observation classCode="OBS" moodCode="EVN">
            <templateId root="2.16.840.1.113883.10.20.1.28"
                assigningAuthorityName="CCD"/>
            <templateId root="1.3.6.1.4.1.19376.1.5.3.1.4.5"
                assigningAuthorityName="IHE PCC"/>
            <!--Problem observation template - NOT episode template-->
            <id root="d11275e7-67ae-11db-bd13-0800200c9a66"/>
            <code code="64572001" displayName="Condition"
                codeSystem="2.16.840.1.113883.6.96"
                codeSystemName="SNOMED-CT"/>
            <text>
                <reference value="#PROBSUMMARY_1"/>
            </text>
            <statusCode code="completed"/>
            <effectiveTime>
                <low value="1950"/>
            </effectiveTime>
            <value  displayName="Substance Abuse Disorder" code="66214007" codeSystemName="SNOMED" codeSystem="2.16.840.1.113883.6.96"/>
            <entryRelationship typeCode="REFR">
                <observation classCode="OBS" moodCode="EVN">
                    <templateId root="2.16.840.1.113883.10.20.1.50"/>
                    <!-- Problem status observation template -->
                    <code code="33999-4" codeSystem="2.16.840.1.113883.6.1"
                        displayName="Status"/>
                    <statusCode code="completed"/>
                    <value  code="55561003"
                        codeSystem="2.16.840.1.113883.6.96"
                        displayName="Active">
                        <originalText>
                        <reference value="#PROBSTATUS_1"/>
                        </originalText>
                    </value>
                </observation>
            </entryRelationship>
        </observation>
    </entryRelationship>
</act>
</entry>
</section>
</component>
</structuredBody>
</component>
</ClinicalDocument>




The following is a separate annotation document containing some metadata pointing to the Substance Abuse Disorder entry in the target CDA document:

<r:RDF xmlns:r="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:a="http://www.w3.org/2000/10/annotation-ns#"
    xmlns:d="http://purl.org/dc/elements/1.1/">
    <r:Description>
        <r:type r:resource="http://www.w3.org/2000/10/annotation-ns#Annotation"/>
        <r:type r:resource="http://www.w3.org/2000/10/annotationType#Metadata"/>
        <a:annotates r:resource="http://hospitalx.com/ehrs/cda.xml"/>
        <a:context>http://hospitalx.com/ehrs/cda.xml#xpointer(/ClinicalDocument/component/structuredBody/component[1]/section[1]/entry[1])</a:context>
        <d:title>Sample Metadata Tagging</d:title>
        <d:creator>Bob Smith</d:creator>
        <a:created>2011-10-14T12:10Z</a:created>
        <d:date>2011-10-14T12:10Z</d:date>
        <a:body>Do Not Disclose</a:body>
    </r:Description>
</r:RDF>

Please note a few interesting facts about the annotation document:

  • As explained by the original specification: "The Annotea protocol works without modifying the original document; that is, there is no requirement that the user have write access to the Web page being annotated."
  • The annotation itself has metadata using the well known Dublin Core metadata specification to specify who created this annotation and when.
  • The document being annotated is cda.xml located at http://hospitalx.com/ehrs/cda.xml. This is described by the element <a:annotates r:resource="http://hospitalx.com/ehrs/cda.xml"/>.
  • The specific element that is being annotated within the target CDA document is specified by the context element: <a:context>http://hospitalx.com/ehrs/cda.xml#xpointer(/ClinicalDocument/component/structuredBody/component[1]/section[1]/entry[1])</a:context> using XPointer, a specification described by the W3C as "the language to be used as the basis for a fragment identifier for any URI reference that locates a resource whose Internet media type is one of text/xml, application/xml, text/xml-external-parsed-entity, or application/xml-external-parsed-entity."
  • The XPath expression /ClinicalDocument/component/structuredBody/component[1]/section[1]/entry[1] within the XPointer is used to target the entry element in the CDA document.
  • Using XPath (1.0 or 2.0) allows us to address any element (or node) in an XML document. For example, this XPath //value[@code='66214007']/ancestor::entry will point to any entry element which contains a value element with an attribute code='66214007' (essentially targeting all entry elements which contain a Substance Abuse Observation). The combination of XPath, XPointer, and standard medical terminology codes gives the ability to attach any annotation or metadata to any element having interoperable semantics.
  • The body element contains the actual annotation: <a:body>Do Not Disclose</a:body>. However, the body of the annotation can also be located outside of the annotation (e.g., in a shared metadata registry) in which case the body element will be marked up as in the following example: <a:body r:resource="http://metadataregistry.com/myconsentdirectives.xml"/>

Alternative Representations

 

As mentioned before, for those who for one reason or another don't want to use RDF and related Semantic Web technologies, the annotation can be easily converted to a pure XML (as opposed to RDF/XML), JSON, or Atom representation. The original Annotea Protocol describes a RESTful protocol which includes the following operations: posting, querying, downloading, updating, and deleting annotations. The Atom Publishing Protocol (APP) is a newer RESTful protocol that is well adapted to the Atom syndication format.
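For example, here is a sketch of the same annotation expressed as plain JSON; the property names below are an illustrative mapping of the Annotea and Dublin Core terms, not a standardized vocabulary.

import json

annotation = {
    "type": ["Annotation", "Metadata"],
    "annotates": "http://hospitalx.com/ehrs/cda.xml",
    "context": "/ClinicalDocument/component/structuredBody/component[1]/section[1]/entry[1]",
    "title": "Sample Metadata Tagging",
    "creator": "Bob Smith",
    "created": "2011-10-14T12:10Z",
    "body": "Do Not Disclose",
}
print(json.dumps(annotation, indent=2))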


Processing Annotations with XPointer


How the annotations are processed and consumed is limited only by the requirements of a specific application and the imagination of the developers writing it. For example, an application can read both the annotation document and the target CDA document and overlay the annotations on top of the entries in the CDA document while displaying the latter in a web browser. Another example is the enforcement of privacy policies and preferences prior to exchanging the CDA document. One practical issue is how to process the XPointer fragment identifiers. XPointer uses XPath, which is a well-established XML addressing mechanism supported by many XML processing APIs across programming languages. For those of you who use XSLT2 to process CDA documents, there is the open source XPointer Framework for XSLT2 for use with the Saxon XSLT2 engine.
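As a minimal sketch (in Python, assuming the lxml package is installed and that the CDA document and annotation shown above are saved locally as cda.xml and annotation.rdf), an application could extract the XPath from the XPointer context and apply it to the CDA document to locate the annotated entry.

from lxml import etree

ANNOTATION_NS = "http://www.w3.org/2000/10/annotation-ns#"

annotation = etree.parse("annotation.rdf")
context = annotation.findtext(".//{%s}context" % ANNOTATION_NS)
# e.g., "http://hospitalx.com/ehrs/cda.xml#xpointer(/ClinicalDocument/...)"
xpath_expr = context.split("#xpointer(", 1)[1].rstrip(")")

cda = etree.parse("cda.xml")
annotated_entries = cda.xpath(xpath_expr)
print(len(annotated_entries), "entry element(s) matched")

Note that a production CDA document declares the HL7 v3 namespace (urn:hl7-org:v3), which would require namespace-aware XPath expressions; the snippet above omits namespaces to match the simplified example.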

Thursday, March 24, 2011

How Checklists Can Enhance Clinical Decision Support (CDS)

I have been reading "The Checklist Manifesto", a book written by Dr. Atul Gawande on the effectiveness of checklists in healthcare delivery. Another paper recently published in the Milbank Quarterly entitled "Counterheroism, Common Knowledge, and Ergonomics: Concepts from Aviation That Could Improve Patient Safety" suggests that beyond checklists, proven aviation safety practices such as Cockpit Resource Management (CRM), Joint safety briefings, and first-names-only rules could help improve patient safety.

I first became aware of the importance of checklists while I was being trained as a Flight Engineer. I spent a lot of time studying them carefully as an aviation student. Checklists are used during normal, abnormal, and emergency situations and pilots go through practical exercises in flight simulators to use them correctly. Let's not mince words: aviation as we know it today would not be possible without checklists.

In a study entitled "Missed and Delayed Diagnoses in the Emergency Department: A Study of Closed Malpractice Claims From 4 Liability Insurers", researchers found that:
The leading breakdowns in the diagnostic process were failure to order an appropriate diagnostic test (58% of errors), failure to perform an adequate medical history or physical examination (42%), incorrect interpretation of a diagnostic test (37%), and failure to order an appropriate consultation (33%). The leading contributing factors to the missed diagnoses were cognitive factors (96%), patient-related factors (34%), lack of appropriate supervision (30%), inadequate handoffs (24%), and excessive workload (23%).

Checklists can serve as cognitive aids in helping clinicians do their job safely. While the idea of using checklists and standard operating procedures has been fully embraced and adopted by aviation professionals for more than 70 years, it is only now making inroads into the field of medicine, particularly in high-pressure environments like intensive care units. The use of checklists in medicine has already shown the potential to save patients' lives and reduce human errors. However, the main challenge remains the acceptance of checklists by clinicians concerned about "Cookbook Medicine".

Checklists are just cognitive aids and the presence of an experienced and competent professional will always make a big difference in critical situations. As Captain Sullenberger (the airline pilot who successfully ditched US Airways Flight 1549 in the Hudson River in New York City, on January 15, 2009) said, "One way of looking at this might be that for 42 years, I've been making small, regular deposits in this bank of experience: education and training. And on January 15 the balance was sufficient so that I could make a very large withdrawal."

On modern airplanes, Electronic Centralised Aircraft Monitor (ECAM) systems or Engine Indicating and Crew Alerting Systems (EICAS) monitor aircraft systems and engines and display messages in case of failure, as well as recommended remedial actions in the form of checklists. The National Transportation Safety Board (NTSB) accident report on US Airways Flight 1549 indicates that the First Officer "was able to promptly locate the [Engine Dual Failure checklist] procedure listed on the back cover of the [Quick Reference Handbook] QRH, turn to the appropriate page, and start executing the checklist."

In medicine, factors such as comorbidity can complicate the design of effective CDS. However, with the explosion of medical knowledge and evidence-based guidelines, CDS will become an essential tool in healthcare delivery. The design, development, implementation, and use of CDS are knowledge-intensive and require an effective collaborative knowledge management strategy. The challenge will be to integrate checklists into the different CDS modalities such as context-sensitive Infobuttons, order sets, alerts, reminders, data entry and visualization, and clinical workflows.

For example, the evaluation results (in the form of recommendations) of a CDS rule can be presented to the clinician as an electronic checklist. This in turn can be tied directly to quality measures in the era of Meaningful Use, Pay-For-Performance, and Accountable Care Organizations (ACOs). An interesting example would be a checklist that prompts clinicians to generate detailed discharge instructions to satisfy quality measures for patients with heart failure or acute myocardial infarction.
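Here is a toy sketch (with purely illustrative condition codes and checklist items) of how a CDS rule's recommendations might be surfaced as an electronic checklist at discharge.

def discharge_checklist(patient):
    """Turn CDS rule recommendations into an electronic checklist."""
    items = []
    if "heart_failure" in patient["conditions"]:
        items += [
            "Provide detailed discharge instructions (diet, activity, weight monitoring)",
            "Confirm follow-up appointment is scheduled",
            "Reconcile discharge medications",
        ]
    return [{"item": text, "done": False} for text in items]

patient = {"conditions": ["heart_failure"]}
for entry in discharge_checklist(patient):
    print("[ ]", entry["item"])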

There is an important Human Factors aspect to the design and use of cockpit checklists and flight-deck procedures. This was the subject of advanced research at NASA Ames Research Center more than twenty years ago, and the results have been widely disseminated and implemented in the aviation industry.

In an article entitled "The Checklist" published in the New Yorker, Atul Gawande wrote:
"But consider: there are hundreds, perhaps thousands, of things doctors do that are at least as dangerous and prone to human failure as putting central lines into I.C.U. patients. It’s true of cardiac care, stroke treatment, H.I.V. treatment, and surgery of all kinds. It’s also true of diagnosis, whether one is trying to identify cancer or infection or a heart attack. All have steps that are worth putting on a checklist and testing in routine care. The question—still unanswered—is whether medical culture will embrace the opportunity."

Peter Pronovost, an intensivist at Johns Hopkins Hospital and a pioneer in the use of checklists in medicine, implemented a checklist at 127 Michigan intensive care units (ICUs) to reduce catheter-related blood stream infections (CRBSI). The project was so successful that it is estimated that the approach could significantly reduce the 28,000 deaths and 3 billion dollars in costs caused by these hospital-acquired infections.

The HL7 Clinical Decision Support (CDS) workgroup is working on standards for the vMR (Virtual Medical Record), Infobuttons, and Order Sets. There is also an effort at the OMG to publish a Clinical Decision Support Services specification for service-oriented CDS capabilities. The Flight Operation Interests Group (FOIG) of the Air Transport Association (ATA) is developing a data model and XML Schema for flight deck procedures and checklists. Developing a shareable content model for checklists in medicine could be an interesting idea.