Clinical Data Sources
Clinical data sources are represented at the top right of the architecture diagram. Examples include electronic medical record systems commonly used in routine clinical care, data from medical devices and wearable sensors, and unstructured data sources such as biomedical literature databases like PubMed. The architecture supports both batch processing and real-time streaming, which makes it possible to implement the Lambda Architecture.
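To make the batch-plus-streaming idea concrete, here is a minimal sketch of the Lambda Architecture's three layers in plain Python. The event shape (`patient_id`), the function names, and the "count of observations per patient" view are all assumptions chosen for illustration; a real deployment would use cluster frameworks rather than in-memory counters.

```python
from collections import Counter


def build_batch_view(master_dataset):
    """Batch layer: recompute the view from the full, append-only dataset."""
    return Counter(event["patient_id"] for event in master_dataset)


class SpeedLayer:
    """Speed layer: incrementally tracks events that arrived after the last batch run."""

    def __init__(self):
        self.realtime_view = Counter()

    def on_event(self, event):
        self.realtime_view[event["patient_id"]] += 1


def query(patient_id, batch_view, speed_layer):
    """Serving layer: answer queries by merging the batch and real-time views."""
    return batch_view[patient_id] + speed_layer.realtime_view[patient_id]


# Hypothetical data: three historical observations, then one streamed in.
history = [{"patient_id": "p1"}, {"patient_id": "p1"}, {"patient_id": "p2"}]
batch_view = build_batch_view(history)
speed_layer = SpeedLayer()
speed_layer.on_event({"patient_id": "p1"})
```

Once the next batch run has absorbed the streamed events, the real-time view can be discarded, which keeps the speed layer small.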
Intelligent Development Environment
The Intelligent Development Environment (IDE) provides various tools and frameworks for advanced analytics. Incoming clinical data are likely to meet the Big Data criteria of volume, velocity, and variety; this is particularly true of sensor data. Therefore, specialized frameworks for large-scale cluster computing such as Hadoop, Mahout, and Spark are used to process and analyze the data. Statistical computing and Machine Learning tools like R are used here as well. The goal is to discover knowledge and patterns using Machine Learning methods such as Decision Trees, k-Means Clustering, Logistic Regression, Bayesian Networks, Neural Networks, and the more recent Deep Learning techniques, which hold great promise in applications such as Natural Language Processing (NLP), medical image analysis, and speech recognition. These Machine Learning algorithms can support care alerting, diagnosis, care planning, prediction, and anomaly detection. For example, anomaly detection can be performed at scale using the k-means clustering algorithm in Apache Spark.
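The k-means approach to anomaly detection can be sketched as follows. This is a single-machine illustration in plain Python, not Spark code: it fits centroids on historical readings assumed to be normal, then flags new readings that lie far from every centroid. The (heart rate, SpO2) readings, the choice of k = 2, and the distance threshold are all hypothetical.

```python
import math
import random


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm; returns a list of k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster; keep the old
        # centroid if a cluster ended up empty.
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids


def find_anomalies(points, centroids, threshold):
    """Return the points whose distance to the nearest centroid exceeds the threshold."""
    return [p for p in points if min(math.dist(p, c) for c in centroids) > threshold]


# Hypothetical (heart rate, SpO2) readings: a resting group and an active group.
normal_readings = [
    (68, 98), (70, 97), (72, 98), (71, 99), (69, 98),
    (118, 96), (120, 95), (122, 96), (121, 97), (119, 96),
]
centroids = kmeans(normal_readings, k=2)

new_readings = [(71, 98), (119, 96), (190, 80)]
flagged = find_anomalies(new_readings, centroids, threshold=15)
```

In practice the features would be scaled before clustering, and the threshold would be chosen from the distribution of distances observed in the training data.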
Visual Analytics tools like D3.js, rCharts, ggplot2, and ggvis can also help obtain deep insight for effective understanding, reasoning, and decision making through the visual exploration of massive, complex, and often ambiguous data. As a multidisciplinary field, Visual Analytics combines several disciplines such as human perception and cognition, interactive graphic design, statistical computing, data mining, spatio-temporal data analysis, and even Art. For example, similar to Minard's map of the Russian Campaign of 1812-1813 (see graphic below), Visual Analytics can help compare different interventions and care pathways and their respective clinical outcomes over a given period of time by vividly showing causes, variables, comparisons, and explanations.
The Intelligent Development Environment also features tools for Production Rules authoring (for translating clinical practice guidelines into computer-executable rules), Ontology engineering (for the purpose of automated reasoning), and Text Analytics. The latter includes capabilities such as:
- Text classification, text clustering, text summarization, and clinical question answering (CQA), which can be useful for satisfying clinicians' information needs at the point of care; and
- Named entity recognition (NER) for extracting concepts from clinical notes.
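As a rough illustration of what NER over clinical notes produces, the sketch below uses a tiny hand-written lexicon and regular expressions. This is a deliberately simplified stand-in: production systems typically rely on trained models and standard terminologies, and the lexicon, labels, and sample note here are hypothetical.

```python
import re

# Hypothetical mini-lexicon mapping surface terms to entity labels.
LEXICON = {
    "type 2 diabetes": "PROBLEM",
    "hypertension": "PROBLEM",
    "metformin": "MEDICATION",
    "lisinopril": "MEDICATION",
}


def extract_entities(note, lexicon=LEXICON):
    """Return (start, end, text, label) tuples sorted by position in the note."""
    entities = []
    for term, label in lexicon.items():
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, note, flags=re.IGNORECASE):
            entities.append((match.start(), match.end(), match.group(0), label))
    return sorted(entities)


note = "Patient with Type 2 diabetes and hypertension, started on metformin."
entities = extract_entities(note)
```

A dictionary lookup like this misses abbreviations, misspellings, and negation ("no evidence of hypertension"), which is precisely where statistical NER models earn their keep.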
Run-Time Services
The Run-Time Services provide intelligence at the point of care, typically using deployed predictive models, clinical rules, text analytics outputs, and ontologies developed in the IDE. For example, Machine Learning models can be exported in Predictive Model Markup Language (PMML) format for run-time scoring based on the clinical data of individual patients, enabling what is referred to as Personalized Medicine. Other services include:
- Clinical Rules and Business Process Execution services enabling the seamless integration of clinical rules and clinical workflows;
- Ontology Reasoning services; and
- Alerts and Notifications services (for example, through SMS messaging).
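The interplay between run-time scoring and clinical rules might look like the sketch below. A plain dictionary stands in for a model exported from the IDE (no PMML parsing is attempted here), and the coefficients, feature names, rule names, and thresholds are invented for illustration and have no clinical validity.

```python
import math

# Hypothetical logistic regression parameters standing in for an exported model.
EXPORTED_MODEL = {
    "intercept": -6.0,
    "coefficients": {"age": 0.04, "systolic_bp": 0.02, "smoker": 0.7},
}


def predict_risk(model, patient):
    """Score one patient: logistic function applied to the linear predictor."""
    z = model["intercept"] + sum(
        weight * patient[feature] for feature, weight in model["coefficients"].items()
    )
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical production rules expressed as (alert name, condition) pairs.
RULES = [
    ("High predicted risk", lambda patient, risk: risk >= 0.5),
    ("Elevated systolic blood pressure", lambda patient, risk: patient["systolic_bp"] >= 140),
]


def evaluate(model, patient, rules=RULES):
    """Return the predicted risk and the names of all rules that fire."""
    risk = predict_risk(model, patient)
    alerts = [name for name, condition in rules if condition(patient, risk)]
    return risk, alerts


risk, alerts = evaluate(EXPORTED_MODEL, {"age": 70, "systolic_bp": 160, "smoker": 1})
```

The alert names returned by `evaluate` are what an Alerts and Notifications service would then deliver, for example by SMS.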
Hexagonal, Reactive, and Secure
Intelligent Health IT systems are not just capable of discovering knowledge and patterns in data. They are also scalable, resilient, responsive, and secure. To achieve these objectives, several architectural patterns have emerged during the last few years:
- Domain Driven Design (DDD) recommends a layered architecture (typically user interface, application, domain, and infrastructure) with each layer having well defined responsibilities and interfaces for interacting with other layers. Models exist within "bounded contexts". These "bounded contexts" communicate with each other typically through messaging and web services using HL7 standards for interoperability.
- The Hexagonal Architecture defines "ports and adapters" as a way to design, develop, and test an application independently of the various clients, devices, transport protocols, and even databases that could be used to consume its services in the future. This is particularly important in the era of the Internet of Things.
- Microservices decompose large monolithic applications into smaller services, following the established principles of service-oriented design and single responsibility, to achieve modularity, maintainability, scalability, and ease of deployment (for example, using Docker).
- Functional Programming: Functional Programming languages like Scala have several benefits that are particularly important for applying Machine Learning algorithms to large data sets. Like functions in mathematics, pure functions in Scala have no side effects, which provides Referential Transparency. This is a natural fit, since Machine Learning algorithms are based on Linear Algebra and Calculus. Scala supports higher-order functions as well. Values declared with val are immutable, which greatly simplifies concurrency. For all those reasons, Machine Learning libraries like Apache Mahout and MLlib in Spark have embraced Scala, moving away from the Java MapReduce paradigm.
- Reactive Architecture: The Reactive Manifesto makes the case for a new breed of applications called "Reactive Applications". According to the manifesto, the Reactive Application architecture allows developers to build "systems that are event-driven, scalable, resilient, and responsive." Leading frameworks that support Reactive Programming include Akka and RxJava. The latter is a library for composing asynchronous and event-based programs using observable sequences. RxJava is a Java port (with a Scala adaptor) of the original Rx (Reactive Extensions) for .NET created by Erik Meijer. Based on the Actor Model, Akka is a framework for building highly concurrent, asynchronous, distributed, and fault tolerant event-driven applications on the JVM. The Scala-based Play web application development framework has an embedded Java NIO (New I/O) non-blocking server based on JBoss Netty, supports asynchronous responses (based on the concepts of "Future" and "Promise"), reactive programming with Akka, caching, iteratees (for processing large streams of data), and real-time push-based technologies like WebSockets and Server-Sent Events. Node.js is another example of an event-driven, non-blocking I/O, and asynchronous architecture designed for scalability.
- Web Application Security: special attention is given to the OWASP Top Ten, threat modeling, enforcing secure coding guidelines, static analysis, and penetration testing.
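Of the patterns listed above, "ports and adapters" is perhaps the easiest to show in a few lines. In the sketch below, the application core depends only on a port (an abstract interface), and an in-memory adapter is plugged in for testing; a database-backed adapter could replace it without touching the core. All class and method names are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Protocol


@dataclass(frozen=True)
class Observation:
    patient_id: str
    code: str
    value: float


class ObservationRepository(Protocol):
    """Port: what the application core needs from any storage technology."""

    def save(self, observation: Observation) -> None: ...

    def latest(self, patient_id: str, code: str) -> Optional[Observation]: ...


class InMemoryObservationRepository:
    """Adapter: an in-memory implementation of the port, handy for tests."""

    def __init__(self) -> None:
        self._items: List[Observation] = []

    def save(self, observation: Observation) -> None:
        self._items.append(observation)

    def latest(self, patient_id: str, code: str) -> Optional[Observation]:
        for observation in reversed(self._items):
            if observation.patient_id == patient_id and observation.code == code:
                return observation
        return None


class VitalsService:
    """Application core: knows the port, not the concrete adapter behind it."""

    def __init__(self, repository: ObservationRepository) -> None:
        self._repository = repository

    def record(self, patient_id: str, code: str, value: float) -> Observation:
        observation = Observation(patient_id, code, value)
        self._repository.save(observation)
        return observation

    def latest_value(self, patient_id: str, code: str) -> Optional[float]:
        observation = self._repository.latest(patient_id, code)
        return observation.value if observation is not None else None


service = VitalsService(InMemoryObservationRepository())
service.record("p1", "heart_rate", 72.0)
service.record("p1", "heart_rate", 88.0)
```

The same idea applies on the inbound side: a REST controller, a message consumer, or a device gateway would each be an adapter that calls `VitalsService`.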
Single Page Application
- Dependency Injection
- Test-Driven Development (Jasmine, Karma, PhantomJS)
- Package Management (Bower or npm)
- Build system and Continuous Integration (Grunt or Gulp.js)
- Static Code Analysis (JSLint and JSHint), and
- End-to-End Testing (Protractor).
Interoperability
Interoperability will always be a key requirement in clinical systems. Interoperability is needed between all players in the healthcare ecosystem including providers, payers, labs, knowledge artifact developers, quality measure developers, and public health agencies like the CDC. Interoperability standards exist today and are implementation-ready. However, only health IT buyers have the leverage to demand interoperability from their vendors.
Standards related to clinical decision support (CDS) include:
- The HL7 Fast Healthcare Interoperability Resources (FHIR)
- The HL7 virtual Medical Record (vMR)
- The HL7 Decision Support Services (DSS) specification
- The HL7 CDS Knowledge Artifact specification
- The DMG Predictive Model Markup Language (PMML) specification.
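To give a feel for what exchanging data through one of these standards involves, the sketch below parses a small FHIR Observation resource in JSON and pulls out the fields a decision support service would typically need. The resource is a hand-written, abbreviated example (the patient reference and values are made up), and the `summarize_observation` helper is hypothetical, not part of any FHIR library.

```python
import json

# Abbreviated, hand-written FHIR Observation (heart rate) for illustration only.
OBSERVATION_JSON = """
{
  "resourceType": "Observation",
  "status": "final",
  "code": {
    "coding": [
      {"system": "http://loinc.org", "code": "8867-4", "display": "Heart rate"}
    ]
  },
  "subject": {"reference": "Patient/example"},
  "valueQuantity": {
    "value": 72,
    "unit": "beats/minute",
    "system": "http://unitsofmeasure.org",
    "code": "/min"
  }
}
"""


def summarize_observation(resource):
    """Extract the patient reference, first coding, and quantity from an Observation."""
    if resource.get("resourceType") != "Observation":
        raise ValueError("expected a FHIR Observation resource")
    coding = resource["code"]["coding"][0]
    quantity = resource["valueQuantity"]
    return {
        "patient": resource["subject"]["reference"],
        "code": coding["code"],
        "display": coding.get("display"),
        "value": quantity["value"],
        "unit": quantity["unit"],
    }


summary = summarize_observation(json.loads(OBSERVATION_JSON))
```

Because the observation is identified by a standard code rather than a local label, a rule or predictive model written against that code can run unchanged against data from any conformant source.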