Monday, February 6, 2012

Toward Intelligent Health IT (iHIT) Systems: Getting Out of the Box

In this post, I describe a new type of application that I refer to as iHIT. iHIT stands for Intelligent Health IT.

The Architecture of Traditional Health IT systems

Traditional software architectures for health IT systems typically include the following:

  • Dependency Injection (DI)

  • Object Relational Mapping (ORM)

  • An architectural pattern for the presentation layer such as the Model View Controller (MVC) pattern

  • HTML5, CSS3, and a JavaScript library like JQuery/Mobile

  • Other architectural patterns including GoF Design Patterns, SOLID Principles, and Domain Driven Design (DDD)

  • Structured Query Language (SQL)

  • Enterprise Integration Patterns (EIPs) implemented through an Enterprise Service Bus (ESB) using HL7 messages as the "Published Language"

  • REST or SOAP-based web services.

An entire generation of developers has been trained in these techniques. They represent proven best practices accumulated over several decades of object-oriented design and relational data management. Although pervasive in today's clinical systems, these applications lack basic intelligent features such as the ability to capture and execute expert knowledge, make inferences, or make predictions about the future based on the analysis of historical data. Some of these systems actually look like glorified data entry systems.

With the availability and explosion of medical knowledge and real world observational EHR data, these intelligent features will become important in assisting clinicians in the medical decision making process at the point of care by reducing their cognitive load.

Intelligent Health IT (iHIT) Systems

iHIT systems process huge quantities of both structured and unstructured data to provide clinicians with specific recommendations. iHIT systems play an important role in translating Comparative Effectiveness Research (CER) findings into clinical practice. Comparative effectiveness Research (CER), an emerging trend in Evidence Based Medicine (EBM), has been defined by the Federal Coordinating Council for CER as "the conduct and synthesis of research comparing the benefits and harms of different interventions and strategies to prevent, diagnose, treat and monitor health conditions in 'real world' settings." For example, based on the clinical profile of a patient, CER can help determine the best treatment option for breast cancer among the various options available such as: chemotherapy, radiation therapy, and surgery (Masectomy and Lumpectomy).

The following are examples of key characteristics displayed by iHIT systems:

  • The ability to analyze patient data as well as very large historical observational data sets in order to make probability-based predictions about the future and recommend specific actions that can yield the best clinical outcomes given the clinical profile of a patient.

  • The ability to capture and execute expert knowledge such as the medical knowledge contained in Clinical Practice Guidelines (CPGs). This includes the ability to mediate between different CPGs to arrive at a specific recommendation by merging and reconciling the medical knowledge in multiple CPGs as is the case with patients with comorbidities.

  • The ability to perform automated reasoning by inferring new implicit clinical facts from existing explicit facts and by exploiting semantic relationships between concepts and entities.

  • The ability to retrieve knowledge from unstructured data sources such as the biomedical research literature from sources like PubMed in order to answer clinical questions sometimes posed in natural language.

  • The ability to learn over time (and hence become smarter) as the amount of processed data continues its exponential growth.

  • Very fast response time to queries over very large data sets.


Sounds like Artificial Intelligence (AI)? I believe we are indeed witnessing the resurgence of AI and even the ideas of the Semantic Web in the healthcare industry. As healthcare costs and quality become national priorities for many countries around the world, the boundaries of computing will continue to be pushed further. Actually, some of the underlying principles of intelligent systems were originally developed decades and even centuries ago in the field of biomedical research. Williams Osler (1849-1919) famously said:

Medicine is a science of uncertainty and an art of probability.

Technologically advanced and competitive industries like financial services (e.g., credit eligibility and fraud detection), online retail (e.g., recommendation engine), and logistics (e.g., delivery route optimization) have adopted some of these technologies. Health IT developers now need to embrace them as well. This will require thinking out of the box.


The Ingredients of iHIT Systems

iHIT systems represent not one, but the integration of many different technologies. Mathematical Models, Statistical Analysis, and Machine Learning algorithms play an important role in iHIT systems. Examples include:

  • Logistic Regression models

  • Decision Trees

  • Association Rules

  • Bayesian Network

  • Neural Networks

  • Random Forests

  • Time Series for temporal reasoning

  • k-means Clustering

  • Support Vector Machines (SVM)

  • Probabilistic Graphical Models (PGMs) based on methods such as Bayesian networks and Markov Networks for making clinical decisions under uncertainty.

These algorithms can be used not only for making therapeutic predictions (e.g., the future hospitalization risk of a patient with Asthma), but also for dividing a population into subgroups based on the clinical profile of patients in order to achieve the best treatment outcomes.

Clinical Practice Guidelines (CPGs) are usually-based on Systematic Reviews (SRs) of Randomized Controlled Trials (RCTs) which are essentially scientific experiments. According to a report titled "Clinical Practice Guidelines (CPGs) We Can Trust" which was published last year by the Institute Of Medicine (IoM):

However, even when studies are considered to have high internal validity, they may not be generalizable to or valid for the patient population of guideline relevance. Randomized trials commonly have an under representation of important subgroups, including those with comorbidities, older persons, racial and ethnic minorities, and low-income, less educated, or low-literacy patients. Many RCTs and observational studies fail to include such "typical patients" in their samples; even when they do, there may not be sufficient numbers of such patients to assess them separately or the subgroups may not be properly analyzed for differences in outcomes.

On the other hand, observational studies using statistical analysis and machine learning algorithms operate on large real world observational data and can therefore provide feedback on the effectiveness of the actual use of different therapeutic interventions. Although very costly, RCTs are still considered the strongest form of evidence in EBM. Despite their inherent methodological challenges (lack of randomization leading to possible bias and confounding), observational studies are increasingly recognized as complementary to RCTs and an important tool in clinical decision making and health policy. iHIT systems play an important role in translating Comparative Effectiveness Research (CER) findings into clinical practice in the form of clinical decision support (CDS) interventions at the point of care.

iHIT systems also use business rules engines to capture and execute expert knowledge such as the medical knowledge contained in Clinical Practice Guidelines (CPGs). Examples include rules engines based on forward chaining inference, also known as production rule systems. These rules engines can be combined with Complex Event Processing (CEP) and Business Process Management (BPM) for intelligent decision making.

iHIT systems support ontologies such as those represented by the web ontology language (OWL) providing reasoning capabilities as well as the ability to navigate semantic relationships between concepts and entities.

More advanced iHIT systems have Natural Language Processing (NLP) and Automatic Speech Recognition (ASR) capabilities in order to answer clinical questions posed in natural language. They rely on Information Retrieval techniques like probabilistic methods for scoring the relevance of a document given a query and the application of supervised machine learning classification methods such as decision trees, Naive Bayes, K-Nearest Neighbors (kNN), and Support Vector Machines (SVM).

In some cases, the responsibilities of an iHIT system are performed by Intelligent Agents which are autonomous entities capable of observing the clinical environment and acting upon those observations.

For scalability and performance, iHIT systems often sit on NoSQL databases and run on massively parallel computing platforms like Apache Hadoop while leveraging the elasticity of the cloud.

Integrating these technologies is the main challenge posed by iHIT systems. An example is the integration between statistical and machine learning models, business rules, ontologies, and more traditional forms of computing such as object-oriented programming. Various solutions to these challenges have been proposed and implemented.

Human-Centered Design

Finally, iHIT systems fully embrace a human-centered design approach. They provide a seamless integration between automated decision logic and clinical workflows. They provide the clinician with detailed explanations of the rationale behind the actions they recommend. In addition, they use techniques like Visual Analytics to enhance human cognitive abilities in order to facilitate analytical reasoning over very large data sets.