Adventures in Computing: Intelligent Systems

Showing posts with label Intelligent Systems. Show all posts

Monday, August 25, 2014

Ontologies for Addiction and Mental Disease: Enabling Translational Research and Clinical Decision Support

In a previous post titled Why do we need ontologies in healthcare applications, I elaborated on what ontologies are and why they are different from information models of data structures like relational database schemas and XML schemas commonly used in healthcare informatics applications. In this post, I discuss two interesting applications of ontology engineering related to addiction and mental disease treatment. The first is the use of ontologies for achieving semantic interoperability in translational research. The second is the use of ontologies for modeling complex medical knowledge in clinical practice guidelines (CPGs) for the purpose of automated reasoning during execution in clinical decision support systems (CDS) at the point of care.

Why Semantic Interoperability is needed in biomedical translational research?

In order to accelerate the discovery of new effective therapeutics for mental health and addiction treatment, there is a need to integrate data across disciplines spanning biomedical research and clinical care delivery [1]. For example, linking data across disciplines can facilitate a better understanding of treatment response variability among patients in addiction treatment. These disciplines include:

Genetics, the study of genes.
Chemistry, the study of chemical compounds including substances of abuse like heroin.
Neuroscience, the study of the nervous system and the brain (addiction is a chronic disease of the brain)
Psychiatry which is focused on the diagnosis, treatment, and prevention of addiction and mental disorders.

Each of these disciplines has its own terminology or controlled vocabularies. In the clinical domain for example, DSM5 and RrxNorm are used for documenting clinical care. In biomedical research, several ontologies have been developed over the last few years including:

The Gene Ontology (GO)
The Chemical Entities of Biological Interest Ontology (CHEBI)
NeuroLex, an OWL ontology covering major domains of neuroscience: anatomy, cell, subcellular, molecule, function, and dysfunction.

To facilitate semantic interoperability between these ontologies, there are best practices established by the Open Biomedical Ontology (OBO) community. An example of best practice is the use of an upper-level ontology called the Basic Formal Ontology (BFO) which acts as a common foundational ontology upon which new ontologies can be created. OBO ontologies and principles are available on the OBO Foundry web site.

Among the ontologies available on the OBO Foundry is the Mental Functioning Ontology (MF) [2, 3]. The MF is being developed as a collaboration between the University of Geneva in Switzerland and the University at Buffalo in the United States. The project also includes a Mental Disease Ontology (MD) which extends the MF and the Ontology for General Medical Science (OGMS). The Basic Formal Ontology (BFO) is an upper-level ontology for both the MF and the OGMS. The picture below is a view of the class hierarchy of the MD showing details of the class "Paranoid Schizophrenia" in the right pane of the windows of the beta release of Protege 5, an open source ontology development environment (click on the image to enlarge it).

The following is a tree view of the "Mental Disease Course" class (click on the image to enlarge it):

Ontology constructs defined by the OWL2 language can help establish common semantics (meaning) and relationships between entities across domains. These constructs provide automated inferencing capabilities such as equivalence (e.g., owl:sameAs and owl:equivalentClass) and subsumption (e.g., rdfs:subClassOf) relationships between entities.

In addition, publishing data sources following Linked Open Data (LOD) principles and semantic search using federated SPARQL queries can help answer new research questions. Another application is semantic annotation for natural language processing (NLP) applications.

Ontologies as knowledge representation formalism for clinical decision support (CDS)

As knowledge representation formalism, ontologies are well suited for modeling complex medical knowledge and can facilitate reasoning during the automated execution of clinical practice guidelines (CPGs) and Care Pathways (CPs) based on patient data at the point of care. Several approaches to modelling CPGs and CPs have been proposed in the past including PROforma, HELEN, EON, GLIF, PRODIGY, and SAGE. However, the lack of free and open source tooling has been a major impediment to a wide adoption of these knowledge representation formalisms. OWL has the advantage of being a widely implemented W3C Recommendation with available mature open source tools.

In practice, the medical knowledge contained in CPGs can be manually translated into IF-THEN statements in most programming languages. Executable CDS rules (like other complex types of business rules) can be implemented with a production rule engine using forward chaining. This is the approach taken by OpenCDS and some large scale CDS implementations in real world healthcare delivery settings. This allows CDS software developers to externalize the medical knowledge contained in clinical guidelines in the form of declarative rules as opposed to embedding that knowledge in procedural code. Many viable open source business rule management systems (BRMS) are available today and provide capabilities such as a rule authoring user interface, a rules repository, and a testing environment.

However, production rule systems have a limitation. They do not scale because they require writing a rule for each clinical concept code (there are more than 311,000 active concepts in SNOMED CT alone). An alternative is to exploit the class hierarchy in an ontology so that subclasses of a given superclass can inherit the clinical rules that are applicable to the superclass (this is called subsumption). In addition to subsumption, an OWL ontology also support reasoning with description logic (DL) axioms [4].

An ontology designed for a clinical decision support (CDS) system can integrate the clinical rules from a CPG, a domain ontology like the Mental Disorder (MD) ontology, and the patient medical record from an EHR database in order to provide inferences in the form of treatment recommendations at the point of care. The OWL API [5] facilitates the integration of ontologies into software applications. It supports inferencing using reasoners like Pellet and HermiT. OWL2 reasoning capabilites can be enhanced with rules represented in SWRL (Semantic Web Rule Language) which is implemented by reasoners like Pellet as well as the Protege OWL development environement. In addition to inferencing, another benefit of an OWL-based approach is transparency: the CDS system can provide an explanation or justification of how it arrives at the treatment recommendations.

Nonetheless, these approaches are not mutually exclusive: a production rule system can be integrated with business processes, ontologies, and predictive analytics models. Predictive analytics models provide a probabilistic approach to treatment recommendations to assist in the clinical decision making process.

References

[1] Janna Hastings, Werner Ceusters, Mark Jensen, Kevin Mulligan and Barry Smith. Representing mental functioning: Ontologies for mental health and disease. Proceedings of the Mental Functioning Ontologies workshop of ICBO 2012, Graz, Austria.

[2] Ceusters, W. and Smith, B. (2010a). Foundations for a realist ontology of mental disease. Journal of Biomedical Semantics, 1(1), 10.

[3] Hastings, J., Smith, B., Ceusters, W., and Mulligan, K. (2012). The mental functioning ontology. http://code.google.com/p/mental-functioning-ontology/, last accessed August 24, 2014

[4] Sesen MB, Peake MD, Banares-Alcantara R, Tse D, Kadir T, Stanley R, Gleeson F, Brady M. 2014 Lung Cancer Assistant: a hybrid clinical decision support application for lung cancer care. J. R. Soc. Interface 11: 20140534.

[5] Matthew Horridge, Sean Bechhofer. The OWL API: A Java API for OWL Ontologies Semantic Web Journal 2(1), Special Issue on Semantic Web Tools and Systems, pp. 11-21, 2011.

Sunday, August 17, 2014

Natural Language Processing (NLP) for Clinical Decision Support: A Practical Approach

A significant portion of the electronic documentation of clinical care is captured in the form of unstructured narrative text like psychotherapy and progress notes. Despite the big push to adopt structured data entry (as required by the Meaningful Use incentive program for example), many clinicians still like to document care using free narrative text. The advantage of using narrative text as opposed to coded entries is that narrative text can tell the story of the patient and the care provided particularly in complex cases. My opinion is that free narrative text should be used to complement coded entries when necessary to capture relevant information.

Furthermore, medical knowledge is expanding very rapidly. For example, PubMed has more than 24 millions citations for biomedical literature from MEDLINE, life science journals, and online books. It is impossible for the human brain to keep up with that amount of knowledge. These unstructured sources of knowledge contain the scientific evidence that is required for effective clinical decision making in what is referred to as Evidence-Based Medicine (EBM).

In this blog, I discuss two practical applications of Natural Language Processing (NLP). The first is the use of NLP tools and techniques to automatically extract clinical concepts and other insight from clinical notes for the purpose of providing treatment recommendations in Clinical Decision Support (CDS) systems. The second is the use of text analytics techniques like clustering and summarization for Clinical Question Answering (CQA).

The emphasis of this post is on a practical approach using freely available and mature open source tools as opposed to an academic or theoretical approach. For a theoretical treatment of the subject, please refer to the book Speech and Language Processing by Daniel Jurafsky and James Martin.

Clinical NLP with Apache cTAKES

Based on the Apache Unstructured Information Management Architecture (UIMA) framework and the Apache OpenNLP natural language processing toolkit, Apache cTAKES provides a modular architecture utilizing both rule-based and machine learning techniques for information extraction from clinical notes. cTAKES can extract named entities (clinical concepts) from clinical notes in plain text or HL7 CDA format and map these entities to various dictionaries including the following Unified Medical Language System (UMLS) semantic types: diseases/disorders, signs/symptoms, anatomical sites, procedures, and medications.

cTAKES includes the following key components which can be assembled to create processing pipelines:

Sentence boundary detector based on the OpenNLP Maximum Entropy (ME) sentence detector.
Tokenizor
Normalizer using the National Library of Medicine's Lexical Variant Generation (LVG) tool
Part-of-speech (POS) tagger
Shallow parser
Named Entity Recognition (NER) annotator using dictionary look-up to UMLS concepts and semantic types. The Drug NER can extract drug entities and their attributes such as dosage, strength, route, etc.
Assertion module which determines the subject of the statement (e.g., is the subject of the statement the patient or a parent of the patient) and whether a named entity or event is negated (e.g., does the presence of the word "depression" in the text implies that the patient has depression).

Apache cTAKES 3.2 has added YTEX, a set of extensions developed at Yale University which provide integration with MetaMap, semantic similarity, export to Machine Learning packages like Weka and R, and feature engineering.

The following diagram from the Apache cTAKES Wiki provides an overview of these components and their dependencies (click to enlarge):

Source: Apache cTAKES wiki

Massively Parallel Clinical Text Analytics in the Cloud with GATECloud

The General Architecture for Text Engineering (GATE) is a mature, comprehensive, and open source text analytics platform. GATE is a family of tools which includes:

GATE Developer: an integrated development environment (IDE) for language processing components with a comprehensive set of available plugins called CREOLE (Collection of REusable Objects for Language Engineering).
GATE Embedded: an object library for embedding services developed with GATE Developer into third-party applications.
GATE Teamware: a collaborative semantic annotation environment based on a workflow engine for creating manually annotated corpora for applying machine learning algorithms.
GATE Mímir: the "Multi-paradigm Information Management Index and Repository" which supports a multi-paradigm approach to index and search over text, ontologies, and semantic metadata.
GATE Cloud: a massively parallel clinical text analytics platform (Platform as a Service or PaaS) built on the Amazon AWS Cloud.

What makes GATE particularly attractive is the recent addition of GATECloud.net PaaS which can boost the productivity of people involved in large scale text analytics tasks.

Clustering, Classification, Text Summarization, and Clinical Question Answering (CQA)

An unsupervised machine learning approach called Clustering can be used to classify large volumes of medical literature into groups (clusters) based on some similarity measure (such as the Euclidean distance). Clustering can be applied at the document, search result, and word/topic levels. Carrot2 and Apache Mahout are open source projects that provide several methods for document clustering. For example, the Latent Dirichlet Allocation learning algorithm in Apache Mahout automatically clusters words into topics and documents into mixtures of topics. Other clustering algorithms in Apache Mahout include: Canopy, Mean-Shift, Spectral, K-Means and Fuzzy K-Means. Apache Mahout is part of the Hadoop ecosystem and can therefore scale to very large volumes of unstructured text.

Document classification essentially consists in assigning predefined set of labels to documents. This can be achieved through supervised machine learning algorithms. Apache Mahout implements the Naive Bayes classifier.

Text summarization techniques can be used to present succinct and clinically relevant evidence to clinicians at the point of care. MEAD (http://www.summarization.com/mead/) is an open source project that implements multiple summarization algorithms. In the biomedical domain, SemRep is a program that extracts semantic predications (subject-relation-object triples) from biomedical free text. Subject and object arguments of each predication are concepts from the UMLS Metathesaurus and the relation is from the UMLS Semantic Network (e.g., TREATS, Co-OCCURS_WITH). The SemRep summarization provides a short summary of these concepts and their semantic relations.

AskHermes (Help clinicians to Extract and aRrticulate Multimedia information for answering clinical quEstionS) is a project that attempts to implement these techniques in the clinical domain. It allows clinicians to enter questions in natural language and uses the following unstructured information sources: MEDLINE abstracts, PubMed Central full-text articles, eMedicine documents, clinical guidelines, and Wikipedia articles.

The processing pipeline in AskHermes includes the following: Question Analysis, Related Questions Extraction, Information Retrieval, Summarization and Answer Presentation. AskHermes performs question classification using MMTx (MetaMap Technology Transfer) to map keywords to UMLS concepts and semantic types. Classification is achieved through supervised machine learning algorithms such as Support Vector Machine (SVM) and conditional random fields (CFRs). Summarization and answer presentation are based on clustering techniques. AskHermes is powered by open source components including: JBoss Seam, Weka, Mallet , Carrot2 , Lucene/Solr, and WordNet (a lexical database for the English language).

Sunday, June 9, 2013

Essential IT Capabilities Of An Accountable Care Organization (ACO)

The Certification Commission for Health Information Technology (CCHIT) recently published a document entitled A Health IT Framework for Accountable Care. The document identifies the following key processes and functions necessary to meet the objectives of an ACO:

Care Coordination
Cohort Management
Patient and Caregiver Relationship Management
Clinician Engagement
Financial Management
Reporting
Knowledge Management.

The key to success is a shift to a data-driven healthcare delivery. The following is my assessment of the most critical IT capabilities for ACO success:

Comprehensive and standardized care documentation in the form of electronic health records including as a minimum: patients' signs and symptoms, diagnostic tests, diagnoses, allergies, social and familiy history, medications, lab results, care plans, interventions, and actual outcomes. Disease-specific Documentation Templates can support the effective use of automated Clinical Decision Support (CDS). Comprehensive electronic documentation is the foundation of accountability and quality improvement.

Care coordination through the secure electronic exchange and the collaborative authoring of the patient's medical record and care plan (this is referred to as clinical information reconciliation in the CCHIT Framework). This also requires health IT interoperability standards that are easy to use and designed following rigorous and well-defined software engineering practices. Unfortunately, this has not always been the case, resulting in standards that are actually obstacles to interoperability as opposed to enablers of interoperability. Case Management tools used by Medical Homes (a concept popularized by the Patient-Centered Medical Home model) can greatly facilitate collaboration and Care Coordination.

Patients' access to and ownership of their electronic health records including the ability to edit, correct, and update their records. Patient portals can be used to increase patients' health literacy with health education resources. Decision aids comparing the benefits and harms of various interventions (Comparative Effectiveness Research) should be available to patients. Patients' health behavior change remains one of the greatest challenges in Healthcare Transformation. mHealth tools have demonstrated their ability to support Patient Activation.

Secure communication between patients and their providers. Patients should have the ability to specify with whom, for what purpose, and the kind of medical information they want to share. Patients should have access to an audit trail of all access events to their medical records just as consumers of financial services can obtain their credit record and determine who has inquired about their credit score.

Clinical Decision Support (CDS) as well as other forms of cognitive aids such as Electronic Checklists, Data Visualization, Order Sets, Infobuttons, and more advanced Clinical Question Answering (CQA) capabilities (see my previous post entitled Automated Clinical Question Answering: The Next Frontier in Healthcare Informatics). The unaided mind (as Dr. Lawrence Weed, the father of the Problem-Oriented Medical Record calls it) is no longer able to cope with the large amounts of data and knowledge required in clinical decision making today. CDS should be used to implement clinical practice guidelines (CPGs) and other forms of Evidence-Based Medicine (EBM).

However, the delivery of care should also take into account the unique clinical characteristics of individual patients (e.g., co-morbidities and social history) as well as their preferences, wishes, and values. Standardized Clinical Assessment And Management Plans (SCAMPs) promote care standardization while taking into account patient preferences and the professional judgment of the clinician. CDS should be well integrated with clinical workflows (see my my previous post entitled Addressing Challenges to the Adoption of Clinical Decision Support (CDS) Systems).

Predictive risk modeling to identity at-risk populations and provide them with pro-active care including early screening and prevention. For example, predictive risk modeling can help identify patients at risk of hospital re-admission, an important ACO quality measure.

Outcomes measurement with an emphasis on patient outcomes in addition to existing process measures. Examples of patient outcome measures include: mortality, functional status, and time to recovery.

Clinical Knowledge Management (CKM) to disseminate knowledge throughout the system in order to support a learning health system. The Institute of Medicine (IOM) released a report titled Digital Infrastructure for the Learning Health System: The Foundation for Continuous Improvement in Health and Health Care. The report describes the learning health system as:

"delivery of best practice guidance at the point of choice, continuous learning and feedback in both health and health care, and seamless, ongoing communication among participants, all facilitated through the application of IT."

Applications of Human Factors research to enable the effective use of technology in clinical settings. Examples include: implementation of usability guidelines to reduce Alert Fatigue in Clinical Decision Support (CDS), Checklists, and Visual Analytics. There are many lessons to be learned from other mission-critical industries that have adopted automation. Following several incidents and accidents related to the introduction of the Glass Cockpit about 25 years ago, Human Factors training known as Cockpit Resource Management (CRM) is now standard practice in the aviation industry.

Sunday, February 24, 2013

State of the Semantic Web in the Clinical Domain

In a previous post entitled Why Do We Need Ontologies in Healthcare Applications, I explained the important difference between ontologies, coding systems, and information models of data structures. I also outlined the benefits of using Semantic Web technologies like RDF, RDFS, OWL, SWRL, R2RML, SPARQL, SKOS, and Linked Open Data (LOD). These benefits include:

Reasoning and inferencing which are essential characteristics of intelligent Health IT Systems (iHIT)
Model consistency checking
Open World Assumption (OWA) and Non-Unique Naming Assumption enabling the integration of heterogeneous data sources and knowledge bases using Linked Open Data (LOD) principles. This integration can be accomplished by providing an RDF view over existing relational databases using R2RML (RDB to RDF Mapping Language) and by performing SPARQL federated queries. Intelligent queries can retrieve inferred facts using SPARQL 1.1 Entailment Regimes.
Linking to other biomedical ontologies like SNOMED and the Translational Medicine Ontology
Clinical Knowledge Management (CKM) using OWL to model and execute Clinical Practice Guidelines (CPGs) and Care Pathways (CPs).

Semantic Web in Clinical and Translational Research

The following are papers on how Semantic Web technologies are being used to realize these benefits in the healthcare domain:

A semantic-web oriented representation of the clinical element model for secondary use of electronic health records data in the Journal of the American Medical Informatics Association (JAMIA)
Applying semantic web technologies for phenome-wide scan using an electronic health record linked Biobank in the Journal of Biomedical Semantics
Ontology-based Modeling of Clinical Practice Guidelines: A Clinical Decision Support System for Breast Cancer Follow-up Interventions at Primary Care Settings

Apache Stanbol

I recently came across Apache Stanbol, a new Apache project which is described as "a set of reusable components for semantic content management". What I really like about Apache Stanbol is that it not only works on unstructured data sources, but also integrates a number of other popular Apache open source software which can be used to add a semantic layer to modern RESTful content-oriented applications. These components include:

Apache Tika for text and metadata extraction from a variety of commonly used document formats
Apache OpenNLP for natural language processing and named entity recognition (NER)
Apache Solr for document store and semantic search
Apache Jena as the RDF and Semantic Web framework.

Other open source components like Apache Mahout (a scalable Machine Learning library) can be integrated to provide document recommendation and clustering services.

The Content Enhancers in Stanbol can perform named entity recognition (NER) and link text annotations to external datasets such as DBPedia. In the clinical domain, these enhancers can be used to extract entities from medical records, journal articles, and clinical guidelines. These entities can then be linked to other clinical data sources such as drug and disease databases using Linked Data techniques.

Apache Stanbol also provides Reasoners based on Jena RDFS, OWL, and OWLMini Reasoners as well as the HermiT OWL Reasoner. These reasoners can perform consistency checking and classification. Stanbol supports Inference Rules in the following formats: SWRL, Jena Rules, and SPARQL (by converting Stanbol Rules into SPARQL CONSTRUCTs).

Sunday, February 17, 2013

Automated Clinical Question Answering: The Next Frontier in Healthcare Informatics

In a previous post, I predicted that 2013 will be the year I ntelligent Health IT Systems (iHIT) go mainstream. I based my prediction on a number of factors, notably the transformation of healthcare to a value-based delivery system driven by the latest scientific evidence (evidence-based practice and practice-based evidence).

Last week, IBM together with health insurer WellPoint Inc., and New York’s Memorial Sloan-Kettering Cancer Center announced the commercialization of Watson (the supercomputer which beat human champions in "Jeopardy!" on February 16, 2011) for question answering (QA) in the clinical domain. The following are some interesting facts released by IBM as part of this announcement:

The supercomputer has ingested 1,500 lung cancer cases from Sloan-Kettering records, plus 2 million pages of text from journals, textbooks and treatment guidelines. This is what I called Big Data in medicine.
In 2012, Watson became 240 percent faster and 75 percent smaller so it can run on a single server. No surprise here and I expect this trend to continue.

The following YouTube video entitled Oncology Diagnosis and Treatment explains how IBM envisions using Watson for Clinical Question Answering (CQA):

The User Experience in the Watson Demo

Clinical questions can be posed in natural language (spoken or typed in by the clinician using a keyboard).
The sources used for answering clinical questions include both structured (EMR databases) and unstructured information (journal articles, clinical guidelines, etc.).
Personalized medicine: the proposed interventions are driven by the data in the patient's medical record and the system can prompt the clinician for additional information on the patient if necessary. The displayed evidence and recommendations are updated to reflect changes in the patient's clinical data.
Human Factors: the clinician is always in the loop. She can ask Watson how it arrives at a specific care recommendation and can even remove a specific evidence (if deemed irrelevant or not appropriate).
The use of confidence scoring and evidence highlighting.
Patient-centeredness and shared decision making: the treatment plans take into account the values, goals, and wishes of the patient (patient preferences). Treatment options are discussed with the patient.
Comparative effectiveness is used to compare the benefits and harms of different interventions.
Information is displayed using data visualization (dashboard) to help meet key performance indicators in the context of a value-based payment model.

The Science Behind Watson

The real question is how do we make intelligent health IT systems like Watson widely available to all patients. A landmark report published by the Institute of Medicine in 2001 and titled Crossing the Quality Chasm - A New Health System for the 21st Century contained the following recommendation:

Patients should receive care based on the best available scientific knowledge. Care should not vary illogically from clinician to clinician or from place to place.

For the scientifically (and Artificial Intelligence) inclined, the following are some pointers on the science behind Watson:

The picture below represents a high level architecture of Watson (click on the image to enlarge it).

AskHermes and MiPACQ

IBM Watson is not the only effort to develop automated CQA capabilities. Some earlier CQA efforts used the PICO framework (Problem/Population, Intervention, Comparison, Outcome) to facilitate processing. More recent efforts have focused on the use of clinical questions posed in natural language.

AskHermes (Help clinicians to Extract and aRrticulate Multimedia information for answering clinical quEstionS) allows clinicians to enter questions in natural language and uses the following unstructured information sources: MEDLINE abstracts, PubMed Central full-text articles, eMedicine documents, clinical guidelines, and Wikipedia articles.

The processing pipeline in AskHermes includes the following: Question Analysis, Related Questions Extraction, Information Retrieval, Summarization and Answer Presentation. AskHermes performs question classification using MMTx (MetaMap Technology Transfer) to map keywords to UMLS concepts and semantic types. Classification is also achieved through supervised machine learning algorithms such as Support Vector Machine (SVM) and conditional random fields (CFRs). Summarization and answer presentation are based on clustering techniques.

MiPACQ (Multi-source Integrated Platform for Answering Clinical Questions) is based on Natural Language Processing (NLP) and Information Retrieval (IR) and utilizes data sources such as Electronic Medical Record (EMR) databases and online medical encyclopedia like Medpedia. MiPACQ uses a processing pipeline based on UIMA (Unstructured Information Management Architecture) and machine learning-based as well as rule-based scoring. NLP capabilities are provided by ClearTK and cTakes (clinical Text Analysis and Knowledge Extraction System).

The Road Ahead

Automated Clinical Question Answering (CQA) is really hard. However, that is the future of computing: intelligent machines we can have meaningful conversations with. CQA is a multidisciplinary field which combines disciplines like statistical computing, information retrieval, natural language processing, machine learning, rule engines, semantic web technologies, knowledge representation and reasoning, visual analytics, and massively parallel computing. There are several open source projects that provide the building blocks. Many EHR software today are glorified data entry systems. We need to move to the next level and that will require technical leadership.

Sunday, December 30, 2012

Prediction for 2013: Intelligent Health IT Systems (iHIT) Go Mainstream

iHIT systems represent an evolution of clinical decision support (CDS) systems. Traditionally, CDS systems have provided functionalities such as Alerts and Reminders, Order Sets, Infobuttons, and Documentation Templates. iHIT systems go beyond these basic functionalities and are poised to go mainstream in 2013. This evolution is enabled by recent developments in both computing and healthcare. Notably in computing:

The emergence of Big Data and massively parallel computing platforms like Hadoop.
The entrance of the following disciplines into the mainstream of computing: Machine Learning (a branch of Artificial Intelligence), Statistical Computing, Visual Analytics, Natural Language Processing, Information Retrieval, Rule engines, and Semantic Web Technologies (like RDF, OWL, SPARQL, and SWRL). These disciplines have been around for many years, but have been largely confined into Academia, very large organizations, and niche markets.
The availability of open source tools, platforms, and resources to support the technologies mentioned above. Examples include: R (a statistical engine), Apache Hadoop, Apache Mahout, Apache Jena, Apache Stanbol, Apache OpenNLP, and Apache UIMA. The number of books, courses, and conferences dedicated to these topics has increased dramatically over the last two years signalling an entrance into the mainstream.

In addition, the healthcare industry itself is currently going through a significant transformation from a business model based on the number of patients treated to a value-based payment model. The Accountable Care Organization (ACO) is an example of this new model. This model puts an increased emphasis on meeting certain quality and performance metrics driven by the latest scientific evidence (this is called Evidence Based Practice or EBP).

Although very costly, Randomized Control Trials (RCTs) are considered the strongest form of evidence in EBP. Despite their inherent methodological challenges (lack of randomization leading to possible bias and confounding), observational studies (using real world data) are increasingly recognized as complementary to RCTs and an important tool in clinical decision making and health policy. According to a report titled "Clinical Practice Guidelines (CPGs) We Can Trust" published by the Institute Of Medicine (IoM):

"Randomized trials commonly have an under representation of important subgroups, including those with comorbidities, older persons, racial and ethnic minorities, and low-income, less educated, or low-literacy patients."

Investments into Comparative Effectiveness Research (CER) are increasing as well. CER, an emerging trend in Evidence Based Practice (EBP), has been defined by the Federal Coordinating Council for CER as "the conduct and synthesis of research comparing the benefits and harms of different interventions and strategies to prevent, diagnose, treat and monitor health conditions in 'real world' settings." CER is important not only for discovering what works and what doesn't work in practice, but also for an informed shared decision making process between the patient and her provider.

The use of predictive risk models for personalized medicine is becoming a common practice. These models can predict the health risks of patients based on their individual health profiles (including genetic profiles). These models often take the form of logistic regression models. Examples include models for predicting cardiovascular disease, ICU mortality, and hospital readmission (an important ACO performance measure).

Thanks to the Meaningful Use incentive program, adoption of electronic health record (EHR) systems by providers is rapidly increasing. This translates into the availability of huge amount of EHR data which can be harvested to provide Practice Based Evidence (PBE) necessary to close the evidence loop. PBE is the key to a learning health system. The Institute of Medicine (IOM) released a report last year titled "Digital Infrastructure for the Learning Health System: The Foundation for Continuous Improvement in Health and Health Care". The report describes a learning health system as:

"...delivery of best practice guidance at the point of choice, continuous learning and feedback in both health and health care, and seamless, ongoing communication among participants, all facilitated through the application of IT."

Both EBP and PBE will require not only rigorous scientific methodologies, but also a computing platform suitable for the era of Big Data in medicine. As Williams Osler (1849-1919) famously said:

"Medicine is a science of uncertainty and an art of probability."

Lastly, to be successful, the emergence of iHIT systems will require a human-centered design approach. This will be facilitated by the use of techniques that can enhance human cognitive abilities. Examples are: Electronic Checklists (an approach that originates from the aviation industry and has been proven to save lives in healthcare delivery as well) and Visual Analytics.

Happy New Year to You and Your Family!

Adventures in Computing

Monday, August 25, 2014

Ontologies for Addiction and Mental Disease: Enabling Translational Research and Clinical Decision Support

Why Semantic Interoperability is needed in biomedical translational research?