Sunday, February 2, 2014

Building business sustainability: why Agile needs Lean Startup principles?

In his book on leadership titled On Becoming a Leader, Warren Bennis wrote: "Managers do things right while leaders do the right thing". This quote can help explain why Agile needs Lean Startup principles.

Toward Product Leadership


Lean Startup is about Product Leadership. In business, the ultimate goal of the enterprise is to eventually generate revenue and profit and that requires having enough customers who are willing to pay for a product or service. This is the key to sustainability in a free market system. The concept of the Lean Startup was first introduced by Eric Ries in his book titled: The Lean Startup: How Today's Entrepreneurs Use Continuous Innovation to Create Radically Successful Businesses. Steve Blank wrote an article in the Harvard Business Review last year titled: Why the Lean Start-Up Changes Everything.

Agile is about the management of Product Development. When applied properly using techniques such as Test Automation and Continuous Delivery, Agile (and its various flavors like XP, Scrum, or Kanban) is a proven methodology for successful software delivery. A purely ceremonial approach (daily standup, sprint planning, review, retrospective, story pointing, etc.) may not yield best results.

However, having the best software developers in town and building a well-designed and well-tested software on time and under budget will not necessarily translate into market success and business growth and sustainability. So what is the missing piece? How do we ensure that Agile is delivering a product that users are willing to buy? How do we know that the software itself and the many features that we work hard to implement every sprint are actually going to be needed, paid for, and used by our customers?

How Agile projects can become big failed experiments


Agile promotes the use of cross-functional teams that include business users, subject matter experts (SMEs), software developers, and testers. There is also the role of Product Owner in Agile teams. The Product Owner and the business users help the team define and prioritize backlog items. The issue is that most of the time, neither the Product Owner nor the business users are the people who are actually going to sign a check or use their credit card to buy the product. This is the case when the product will be marketed and sold to the market at large. Implementing the wrong design and building the wrong product can be very costly to the enterprise. So the result is that the design and the features that are being implemented by the development team are just assumptions and hypotheses (by so called experienced experts and product visionaries) that have never been validated. Not surprisingly, many Agile projects have become big failed experiments.


Untested assumptions and hypotheses


What we call vision and strategy are often just untested assumptions and hypotheses. Yet, we invest significant time and resources pursuing those ideas. A deep understanding (acquired through lengthy industry experience) of the business, customers' pain points, regulatory environment, and the competitive landscape will not always produce the correct assumptions about a product. This is because the pace of change has accelerated dramatically over the last two decades.

Traditional management processes and tools like strategic planning and the balanced scorecard do not always provide a framework for validating those assumptions. Even newer management techniques like the Blue Ocean Strategy taught by business schools to MBA candidates contain significant elements of risk and uncertainty when confronted with the brutal reality of the marketplace.

This reminds me of my days in aviation training. Aviation operations are characterized by a high level of planning. The Flight Plan contains details about the departure time, route, estimated time enroute, fuel onboard, cruising altitude, airspeed, destination, alternate airports, etc. However, pilots are also trained to respond effectively to uncertainty. Examples of these critical decision points are the well known "Go/No Go" decision during takeoff and the "Go-Around" during the final approach to landing. According to the Flight Safety Foundation, a lack of go-arounds is the number one risk factor in approach and landing accidents and the number one cause of runway excursions.

 

Applying Lean Startup engineering Principles


A software development culture that tolerates risk-taking and failure as a source of rapid learning and growth is actually a good thing. The question is how do we perform fail-safe experiments early and often to validate our assumptions and hypotheses about customers' pain points and the business model (how money is made), quickly learn from those experiments, and pivot to new ideas and new experiments until we get the product right? The traditional retrospective in Agile usually involves discussion about what went wrong and how we can improve with a focus on the activities performed by team members. The concept of pivot in Lean Startup engineering is different. The pivot is about being responsive to customer feedback and demands in order to build a sustainable and resilient product and business. The pivot has significant implications on the architecture, design, and development of a product. As Peter Senge once said:

The only sustainable competitive advantage is an organization's ability to learn faster than the competition.

The Lean Startup recipe is to create Minimum Viable Products (MVPs) that are tested early and often with future customers of the product. An MVP can be created through rapid prototyping or by incrementally delivering new product features through Continuous Delivery, while leveraging cloud-based capabilities such as Platform as a Service (PaaS) to remain lean. Testing MVPs requires the team (including software developers) to get out of their cubicles or workstations and meet with customers face-to-face whenever possible to obtain direct feedback and validation. MVPs can also be tested through A/B testing or end user usability testing. Actionable metrics like the System Usability Scale (SUS) should be collected during these fail-safe experiments and subsequently analyzed.

These fail-safe experiments allows the Product Owner and team to refine the product vision and business model through validated learning. Lean Startup principles are not just for startups. They can also make a big difference in established enterprises where resource constraints combined with market competition and uncertainty can render the traditional strategic planning exercise completely useless.

Sunday, January 12, 2014

Navigating in Scala Land

In a previous post titled Toward Polyglot Programming on the JVM, I described my preliminary  exploration of the other languages and frameworks on the JVM including Groovy, Gradle, Grails, Scala, Akka, Clojure, and the Play Framework. I made the switch from Maven to Gradle, a Groovy-based build language that combines the best of Ant and Maven. I was seduced by the scaffolding capabilities of Grails, but decided to make the jump to Scala. So I have been navigating in Scala land recently. In this post, I described my journey.

I am still learning, but I can tell you that I never had so much fun learning a new programming language. It probably has something to do with the use of pure mathematical functions in Scala. I did spend a year studying pure mathematics at the University of Abomey-Calavi after graduating from high school. My first exposure to functional programming was with XSLT and XQuery and I very much enjoyed programming without side effects when using those languages. XSLT 3.0 is a fully-fledged functional programming language with support for functions as first class values and high-order functions. XQuery 3.0 is a typed functional language for processing and querying XML data.

Scala is a complex and ambitious language as it supports both object-oriented and functional programming. When I started to learn Scala, I took the wrong directions several times and had to make several U-turns. So the following steps have been effective for me:

  • The Coursera class Functional Programming Principles in Scala taught by Martin Odersky (the designer of Scala) is a good place to start. It explains the motivations behind Scala and emphasizes its mathematical and functional nature without trying to map pre-existing knowledge (of Java or Python, or any other language) to Scala. This is a refreshing approach because Scala is a different language although it can interoperate with Java. A good companion to this course is the book Programming in Scala: A Comprehensive Step-by-Step Guide, 2nd Edition by Martin Odersky, Lex Spoon and Bill Venners.

  • If you're a Java programmer moving to Scala, then the book Scala for the Impatient by Cay S. Horstmann would be a good reference. 

  • If you're interested in building a web application in Scala, then I would recommend the book Play for Scala by Peter Hilton, Erik Bakker and Francisco Canedo. Scala and Play come with their own ecosystem of tools. This includes a Scala-based build system called sbt (simple build tool), testing tools (like Spec2 and ScalaTest), IDEs (Eclipse-based Scala IDE and IntelliJ), database drivers (like ReactiveMongo and Slick), and authentication/authorization (SecureSocial and Deadbolt). Play has first-class support for JSON and REST, supports asynchronous responses (based on the concepts of "Future" and "Promise"), reactive programming with Akka, caching, iteratees (for processing large streams of data), and real-time push-based technologies like WebSockets and Server-Sent Events.

  • A this point, if you decide to dive into the deep waters of Scala, you might want to consider learning Reactive Programming and purely functional data structures. 

  • There is a second Scala course at Coursera titled Principles of Reactive Programming taught by Martin Odersky, Erik Meijer, and Roland Kuhn. In Scala land, there are currently two leading frameworks that support Reactive Programming: Akka and RxJava. The latter is a library for composing asynchronous and event-based programs using observable sequences. RxJava is a Java port (with a Scala adaptor) of the original Rx (Reactive Extensions) for .NET created by Erik Meijer. Based on the Actor Model, Akka is a framework for building highly concurrent, asynchronous, distributed, and fault tolerant event-driven applications on the JVM (it supports both Java and Scala).

  • For purely functional data structures, there is a Scala-based library called Scalaz. The book Functional Programming in Scala by Paul Chiusano and RĂșnar Bjarnason is a good resource for exploring Scalaz.

  • Readers of this blog probably know that I am a proponent of Domain Driven Design (DDD) in building complex software systems. So I have been investigated how DDD principles can be implemented with a functional and reactive approach. Vaughn Vernon recently presented a podcast on Reactive DDD with Scala and Akka.

  • For me, Big Data is real, not just a buzzword. I believe in analyzing humongous amounts of data to find hidden patterns and obtain insight for solving complex problems. Dean Wampler called copious data, the killer app for functional programming. Scalding by Twitter can be used for writing MapReduce jobs in Scala.

Sunday, December 29, 2013

Improving the quality of mental health and substance use treatment: how can Informatics help?


According to the 2012 National Survey on Drug Use and Health:

  • In 2012, an estimated 43.7 million adults aged 18 or older in the United States had any mental illness (AMI) in the past year. This represents 18.6 percent of all adults in this country.

  • Among the 43.7 million adults aged 18 or older in 2012 with AMI in the past year, 19.2 percent (8.4 million adults) met criteria for a substance use disorder (i.e., illicit drug or alcohol dependence or abuse).

  • Among the 43.7 million adults aged 18 or older with AMI in 2012, 17.9 million (41.0 percent) received mental health services in the past year.

  • In 2012, an estimated 9.0 million adults (3.9 percent) aged 18 or older had serious thoughts of suicide in the past year.
Mental health and substance use are often associated with other issues such as:

  • Co-morbidity involving other chronic diseases like HIV, hepatitis, diabetes, and cardiovascular disease.

  • Overdose and emergency care utilization.

  • Social issues like incarceration, violence, homelessness, and unemployment.
It is now well established that addiction is a chronic disease of the brain and should be treated as such from a clinical and health and social policy standpoint.


The regulatory framework

  • The Affordable Care Act (ACA) requires non-grandfathered health plans in the individual and small group markets to provide essential health benefits (EHBs) including mental health and substance use disorder benefits.  

  • Starting in 2014, insurers can no longer deny coverage because of a pre-existing mental health condition.

  • The ACA requires health plans to cover recommended evidence-based prevention and screening services including depression screening for adults and adolescents and behavioral assessments for children.

  • On November 8, 2013, HHS and the Departments of Labor and Treasury released the final rules implementing the Paul Wellstone and Pete Domenici Mental Health Parity and Addiction Equity Act of 2008 (MHPAEA).

Technologies in support of a Collaborative Care Model


There is sufficient evidence to support the efficacy of the collaborative care model in the treatment of chronic mental health and substance use conditions. An example is the Chronic Care Management (CCM) approach based on the following principles: 
  • Coordinated care involving a multi-disciplinary care team.

  • Longitudinal care plan as the backbone of care coordination.

  • Co-location of primary care and mental health and substance use specialists.

  • Case management by a Nurse Care Manager. 
This approach has been popularized by the Patient-Centered Medical Home (PCMH) model of coordinated care. It will require a new breed of clinical collaboration tools and capabilities such as:
  • Conversations and knowledge sharing through instant messaging and video conferencing for virtual two-way face-to-face communication.

  • Clinical content management and case management tools.

  • File sharing and syncing allowing the longitudinal care plan to be synchronized and shared among all members of the care team.

  • Light-weight and simple clinical data exchange standards and protocols for content, transport, security, and privacy.

 

Implementing Clinical Practice Guidelines (CPGs) with Clinical Decision Support (CDS) systems

 

Clinical Decision Support (CDS) can help address key challenges in mental health and substance use treatment such as:

  • Shortages and high turnover in the addiction treatment workforce.

  • Insufficient or lack of adequate clinician education in mental health and addiction medicine.

  • Lack of implementation of available evidence-based clinical practice guideline (CPGs) in mental health and addiction medicine.
CPGs can be translated into executable CDS rules using business rule engines. The complexity and costs inherent in capturing the medical knowledge in clinical guidelines and translating that knowledge into executable code remains an impediment to the widespread adoption of CDS software. Therefore, there is a need for standards for the sharing and interchange of CDS knowledge artifacts and executable clinical guidelines. The ONC Health eDecision Initiative has published specifications to support the interoperability of CDS knowledge artifacts and services.

Learning what works and what does not work in clinical practice is important for building a learning health system. This can be achieved by incorporating the results of Comparative Effectiveness Research (CER) and Patient-Centered Outcome Research (PCOR) into CDS systems. Increasingly, outcomes research will be performed using observational studies (based on real world clinical data) which are recognized as complementary to randomized control trials (RCTs). This is a form of Practice-Based Evidence (PBE) that is necessary to close the evidence loop.

The typical Clinical Practice Guideline (CPG) is 50 to 150 pages long. Clinical Decision Support (CDS) should also include other forms of cognitive aid such as Electronic Checklists, Data Visualization, Order Sets, and Infobuttons.

Genomics of Addiction and Personalized Medicine


Advances in genomics and pharmacogenomics are helping researchers understand treatment response variability among patients in addiction treatment. Clinical Decision Support (CDS) systems can also be used to provide cognitive support to clinicians in providing genetically guided treatment interventions.


Quality Measurement for Mental Health and Substance Use Treatment


An important implication of the shift from a fee-for-service to a value-based healthcare delivery model is that existing process measures and the regulatory requirements to report them are no longer sufficient. An example of process measure is the requirement to use a standardized assessment tool like PHQ9 for depression screening.

Patient-reported outcomes (PROs) and patient-centered measures include essential metrics such as mortality, functional status, time to recovery, severity of side effects, and remission (depression remission at six and twelve months). These measures should take into account the values, goals, and wishes of the patient. Therefore patient-centered outcomes should also include the patient's own evaluation of the care received. For Accountable Care Organizations (ACOs), an important performance measure is the 30-day hospital readmission for individuals hospitalized for a mental health or substance use condition.

Another issue to be addressed is the lack of data elements in Electronic Medical Record (EMR) systems for capturing, reporting, and analyzing PROs. This is the key to accountability and quality improvement in mental health and substance use treatment.


Using Natural Language Processing (NLP) for the automated processing of clinical narratives


Electronic documentation in mental health and substance use treatment is often captured in the form of narrative text such as psychotherapy notes. Natural Language Processing (NLP) and machine learning tools and techniques (such as named entity recognition or NER) can be used to extract clinical concepts and other insight from clinical notes.

Another area of interest is Clinical Question Answering (CQA) that would allow clinicians to ask questions in natural language and extract clinical answers from very large amounts of unstructured sources of medical knowledge. PubMed has more than 23 millions citations for biomedical literature from MEDLINE, life science journals, and online books. It is impossible for the human brain to keep up with that amount of knowledge.



Computer-Based Cognitive Behavioral Therapy (CCBT) and mHealth


According to a report published last year by the California HealthCare Foundation and titled The Online Couch: Mental Health Care on the Web:

"Computer-based cognitive behavioral therapy (CCBT) cost-effectively leverages the Internet for coaching patterns in self-driven or provider-assisted programs. Technological advances have enabled computer systems designed to replicate aspects of cognitive behavior therapy for a growing range of mental health issues".
An example of a successful nationwide adoption of CCBT is the online behavioral therapy site Beating the Blues in the United Kingdom which has been proven to help patients suffering from anxiety and mild to moderate depression. Beating the Blues has been recommended for use in the NHS by the National Institute for Health and Clinical Excellence (NICE).

In addition, there is growing evidence to support the efficacy of mobile health (mHealth) technologies for supporting patent engagement and activation in health behavior change (e.g., smoking cessation).

 

Informed Patient Consent and Privacy


Because of the stigma associated with mental health and substance use, it is important to give patients control over the sharing of the medical record. Patients consent should be obtained about what type information is shared, with whom, and for what purpose. The patient should also have access to an audit trail of all data exchange-related events. Current paper-based consent processes are inefficient and lack accountability. Web-based consent management applications facilitate the capture and automated enforcement of patient consent directives (see my previous post titled Patient privacy at web scale).

Sunday, November 10, 2013

Toward Polyglot Programming on the JVM

In my previous post titled Treating Javascript as a first class language, I wrote about how the Java Virtual Machine (JVM) is evolving with new languages and frameworks like Groovy, Grails, Scala, Akka, and the Play Framework. In this post, I report on my experience in learning and evaluating these emerging technologies and their roles in the Java ecosystem.

A KangaRoo on the JVM


On a previous project, I used Spring Roo to jumpstart the software development process. Spring Roo was created by Ben Alex, an Australian engineer who is also the creator of Spring Security. Spring Roo was a big productivity boost and generated a significant amount of code and configuration based on the specification of the domain model. Spring Roo automatically generated the following:

  • The domain entities with support for JPA annotations.
  • Repository and service layers. In addition to JPA, Spring Roo also supports NoSQL persistence for MongoDB based on the Spring Data repository abstraction.
  • A web layer with Spring MVC controllers and JSP views with support for Tiles-based layout, theming, and localization. The JSP views were subsequently replaced with a combination of Thymeleaf (a next generation server-side HTML5 template engine) and Twitter Boostrap to support a Responsive Web Design (RWD) approach. Roo also supports GWT and JSF.
  • REST and JSON remoting for all domain types.
  • Basic configuration for Spring Security, Spring Web Flow, Spring Integration, JMS, Email, and Apache Solr.
  • Entity mocking, automatic generation of test data ("Data on Demand"),  in-container integration testing, and end-to-end Selenium integration tests.
  • A Maven build file for the project and full integration with Spring STS.
  • Deployment to Cloud Foundry.
Roo also supports other features such as database reverse engineering and Ajax . Another benefit of using Roo is that it helped enforce Spring best practices and other architectural concerns such as proper application layering.

For my future projects, I am looking forward to taking developer's productivity and innovation to the next level. There are several criteria in my mind:

  • Being able to do more with less. This means being able to write code that is concise, expressive, requires less configuration and boilerplate coding, and is easier to understand and maintain (particularly for difficult concerns like concurrency which is a key factor in scalability).
  • Interoperability with the Java language and being able to run on the JVM, so that I can take advantage of the larger and rich Java ecosystem of tools and frameworks.
  • Lastly, my interest in responsive, massively scalable, and fault-tolerant systems has picked up recently.


Getting Groovy


Maven has been a very powerful build system for several projects that I have worked on. My goal now is to support continuous delivery pipelines as a pattern for achieving high quality software. Large open source projects like Hibernate, Spring, and Android have already moved to Gradle. Gradle builds are written in a Groovy DSL and are more concise than Maven POM files which are based on a more verbose XML syntax. Gradle supports Java, Groovy, and Scala out-of-the box. It also has other benefits like incremental builds, multi-project builds, and plugins for other essential development tools like Eclipse, Jenkins, SonarQube, Ivy, and Artifactory.

Grails is a full-stack framework based on Groovy, leveraging its concise syntax (which includes Closures), dynamic language programming, metaprogramming, and DSL support. The core principle of Grails is "convention over configuration". Grails also integrates well with existing and popular Java projects like Spring Security, Hibernate, and Sitemesh. Roo generates code at development time and makes use of AOP. Grails on the other hand generates code at run-time, allowing the developer to do more with less code. The scaffolding mechanism is very similar in Roo and Grails.

Grails has its own view technology called Groovy Server Pages (GSP) and its own ORM implementation called Grails Object Relational Mapping (GORM) which uses Hibernate under the hood. There is also decent support for REST/JSON and URL routing to controller actions. This makes it easy to use Grails together with Javascript MVC frameworks like AngularJS in creating more responsive user experiences based on the Single Page Application (SPA) architectural pattern.

There are many factors that can influence the decision to use Roo vs. Grails (e.g., the learning curve associated with Groovy and Grails for a traditional Java team). There is also a new high-productivity framework called Spring Boot that is emerging as part of the soon to be released Spring Framework 4.0.


Becoming Reactive


I am also interested in massively scalable and fault-tolerant systems. This is no longer a requirement solely for big internet players like Google, Twitter, Yahoo, and LinkedIn that need to scale to millions of users. These requirements (including response time and up time) are also essential in mission-critical applications such as healthcare.

The recently published "Reactive Manifesto" makes the case for a new breed of applications called "Reactive Applications". According to the manifesto, the Reactive Application architecture allows developers to build "systems that are event-driven, scalable, resilient, and responsive." That is the premise of the other two prominent languages on the JVM: Scala and Clojure. They are based on a different programming paradigm (than traditional OOP) called Functional Programming that is becoming very popular in the multi-core era.

Twitter uses Scala and has open-sourced some of their internal Scala resources like "Effective Scala" and "Scala School". One interesting framework based on Scala is Akka, a concurrency framework built on the Actor Model.

The Play Framework 2 is a full-stack web application framework based on Scala which is currently used by LinkedIn (which has over 225 millions registered users worldwide). In addition to its elegant design, Play's unique benefits include:

  • An embedded Java NIO (New I/O) non-blocking server based on JBoss Netty, providing the ability to call collaborating services asynchronously without relying on thread pools to handle I/O. This new breed of servers is called "Evented Servers" (NodeJS is another implementation) as opposed to the old "Threaded Servers". Older frameworks like Spring MVC use a threaded and synchronous approach which is more difficult to scale.
  • The ability to make changes to the source code and just refresh the browser page to see the changes (this is called hot reload).
  • Type-safe Scala templates (errors are displayed in the browser during development).
  • Integrated support for Akka which provides (among other benefits) fault-tolerance, the ability to quickly recover from failure.
  • Asynchronous responses (based on the concepts of "Future" and "Promise" also found in AngularJS), caching, iteratees (for processing large streams of data), and support for real-time push-based technologies like WebSockets and Server-Sent Events.
The biggest challenge in moving to Scala is that the move to Functional Programming can be a significant learning curve for developers with a traditional OOP background in Java. Functional Programming is not new. Languages like Lisp and Haskell are functional programming languages. More recently, XML processing languages like XSLT and XQuery have adopted functional programming ideas.


Bringing Clojure to the JVM


Clojure is a dialect of LISP and a dynamically-type functional programming language which compiles to JVM bytecode. Clojure supports multithreaded programming and immutable data structures. One interesting application of Clojure is Incanter, a statistical computing and data visualization environment enabling big data analysis on the JVM.

Sunday, October 20, 2013

Treating Javascript as a first-class language

With the emergence of the Single Page Application (SPA) architecture as an approach to creating more fluid and responsive user experiences in the browser, Javascript is gaining prominence as a platform for modern application development. Paypal, a large online payment service, announced recently that it has achieved significant performance and productivity gains by shifting its server-side development from Java to Javascript. From a software architecture and development perspective, what do expressions like "Javascript as a first-class language" or "Javascript as a platform" actually mean?

Let's consider a well-established first-class language and platform like Java. By the way, I still consider Java a strong and safe bet for developing applications. What makes Java strong is not just the language, but the rich ecosystem of free and open source tools and frameworks built around it (e.g., Eclipse, Tomcat, JBoss Application Server, Drools, Maven, Jenkins, Solr, Hibernate, Spring, Hadoop to name just a few). The JVM is evolving with new languages and frameworks like Groovy, Grails, Clojure, Scala, Akka, and the Play Framework which aim to enhance developer's productivity. It is also well-known that big internet companies like Twitter have achieved significant gains in performance, scalability, and other architectural concerns by shifting a lot of back-end code from Ruby on Rails to the JVM. There are a number of architectural patterns and software development practices that have been adopted over the years in successfully building quality Java applications. These include:

  • Design patterns such as the Gang of Four (GoF), Dependency Injection, Model View Controller (MVC), Enterprise Integration Patterns (EIP), Domain Driven Design (DDD), and modularity patterns like those based on OSGi.
  • Test-Driven Development (TDD) using tools like JUnit, TestNG, Mockito (mocking), Cucumber-JVM (for behavior-driven development or BDD), and Selenium (for automated end-to-end testing).
  • Build tools like Maven and Gradle.
  • Static analysis with tools like FindBugs, Checkstyle, PMD, and Sonar.
  • Continuous integration and delivery with tools like Jenkins.
  • Performance testing with JMeter.
  • Web application vulnerability testing with Burp.

As we move to a rich client application paradigm based on Javascript and the Single Page Application (SPA) architecture, it is clear that Javascript can no longer be considered a toy language for front-end developers and so we need to bring the same engineering discipline to Javascript. As I said previously, the JVM remains my platform of choice for back-end development. For example, I find that AngularJS (a client-side Javascript MVC framework) works well with Spring back-end capabilities (like Spring Security and REST support in Spring MVC, HATEOAS, or Grail). However, I also keep an eye on server-side Javascript frameworks like Node.js.

The good news is that the community is coming up with patterns, tools, and practices that are helping elevate Javascript to the status of first-class language. The following is a list of patterns and tools that I find interesting and promising so far:
  • Javascript design patterns including the application of the GoFs to Javascript. The MVC and Dependency Injection patterns are both implemented in AngularJS, my favorite Javascript MVC framework. There are also modularity patterns like Asyncronous Module Definition (AMD) supported by RequireJS.
  • Functional programming support in Javascript (e.g., higher-order functions and closures) is emerging as a best practice in writing quality Javascript code. 
  • Behavior-Driven Development (BDD) testing with Jasmine.
  • Static analysis with Javascript code quality tools like JSLint and JSHint.
  • Build with Grunt, a Javascript task runner.
  • Karma, a test runner for Javascript.
  • Protractor, an end-to-end test framework built on top of Selenium WedDriverJS.
  • Single Page Applications are subject to common web application vulnerabilities like Cookie Snooping, Cross-Site Scripting (XSS), Cross-Site Request Forgery (CSRF), and JSON Injection. Security is mainly the responsibility of the server, although client-side frameworks like AngularJS also provide some features to enhance the security of Single Page Applications.

Thursday, August 15, 2013

Health IT Innovations for Care Coordination

The Business Case


According to an article by Bodenheimer et al. published in the January/February 2009 issue of Health Affairs and titled Confronting The Growing Burden Of Chronic Disease: Can The U.S. Health Care Workforce Do The Job?:

In 2005, 133 million americans were living with at least one chronic condition. In 2020, this number is expected to grow to 157 million. In 2005, sixty-three million people had multiple chronic illnesses, and that number will reach eighty-one million in 2020. 

Patients with co-morbidities are typically treated by multiple clinicians working for different healthcare organizations. Care Coordination is necessary for the effective treatment of these patients and reducing costs. Effective Care Coordination can reduce the number of redundant tests and procedures, hospital admissions and readmissions, medical errors, and patient safety issues related to the lack of medication reconciliation. 

According to a paper by Dennison and Hugues published in the Journal of Cardiovascular Nursing and titled Progress in Prevention Imperative to Improve Care Transitions for Cardiovascular Patients, direct communication between the hospital and primary care setting occurred only 3 percent of the time. According to the same paper, at discharge, a summary was provided only 12 percent of the time, and this occurrence remained poor at 4 weeks post-discharge, with only 51 percent of practitioners providing a summary. The paper concluded that this standard affected quality of care in 25 percent of follow-up visits.

Health Information Exchanges (HIEs) and emerging delivery models like the Accountable Care Organization (ACO) and the Patient-Centered Medical Home (PCMH) were designed to promote care coordination. However, according to an article by Furukawa et al. published in the August 2013 issue of Health Affairs and titled Hospital Electronic Health Information Exchange Grew Substantially In 2008–12:

In 2012, 51 percent of hospitals exchanged clinical information with unaffiliated ambulatory care providers, but only 36 percent exchanged information with other hospitals outside the organization. . . . In 2012 more than half of hospitals exchanged laboratory results or radiology reports, but only about one-third of them exchanged clinical care summaries or medication lists with outside providers.                      


Furthermore, the financial sustainability of many HIEs remains an issue. According to another article by Adler-Milstein et al. published in the same issue of Health Affairs and titled Operational Health Information Exchanges Show Substantial Growth, But Long-Term Funding Remains A Concern, "74 percent of health information exchange efforts report struggling to develop a sustainable business model".  

There are other obstacles to care coordination including the existing fee-for-service healthcare delivery model (as opposed to a value-based model), the lack of interoperability between healthcare information systems, and the lack of adoption of effective collaboration tools.

According to a report by the Institute of Medicine (IOM) titled  The Healthcare Imperative: Lowering Costs and Improving Outcomes, a program designed to improve care coordination could result in national annual savings of $240.1 billions.

What Can We Learn From High Risk Operations in Other Industries?


Similar breakdowns in communication during shift handovers have also been observed in risky operating environments, sometimes with devastating consequences. In the aerospace industry, human factors research and training have played an important role in successfully addressing the issue. A research paper by Parke and Mishkin titled Best Practices in Shift Handover Communication: Mars Exploration Rover Surface Operations included the following recommendations:

  • Two-way Communication, Preferably Face-to-Face. . . . Two-way communication enables the incoming worker to ask questions and rephrase the material to be handed over, so as to expose these differences [in mental model].


  • Face-to-Face Handovers with Written Support. Face-to-face handovers are improved if they are supported by structured written material—e.g., a checklist of items to convey, and/or a position log to review. 


  • Content of Handover Captures Intent. Handover communication works best if it captures problems, hypotheses, and intent, rather than simply lists what occurred.
While the logistics of healthcare delivery does not always permit physical face-to-face communication between clinicians during transitions of care, the web has seen an explosion in online collaboration tools. Innovative organizations have embraced these technologies giving rise to a new breed of enterprise software known as Enterprise 2.0 or Social Enterprise Software. This new breed of software is not only social, but also mobile, and cloud-based.

Care Coordination in the Health Enterprise 2.0


  • Collaborative Authoring of a Longitudinal Care Plan. From a content perspective, the Care Plan is the backbone of Care Coordination. The Care Plan should be comprehensive and standardized (similar to the checklist in aerospace operations). It should include problems, medications, orders, results, care goals (taking into consideration the patient's wishes and values), care team members and their responsibilities, and actual patient outcomes (e.g., functional status). Clinical Decision Support (CDS) tools can be used to dynamically generate a basic Care Plan based on the patient's specific clinical data. This basic Care Plan can be used by members of the care team to build a more elaborate Longitudinal Care Plan. CDS tools can also automatically generate alerts and reminders for the care team.


  • Communication and Collaboration using Enterprise 2.0 Software.  These tools should be used to enable collaboration between all members of the care team which include not only clinicians, but also non-clinician caregivers, and the patient herself. Beyond email, these tools allow conversations and knowledge sharing through instant messaging, video conferencing (for virtual two-way face-to-face communication), content management, file syncing (allowing the longitudinal care plan to be synchronized and shared among all members of the care team), search, and enterprise social networking (because clinical work is a social activity like most human activities). A providers directory should make it easy for users to find a specific provider and all their contact information based on search criteria such as location, specialty, knowledge, experience, and telephone number.


  • Light Weight Standards and Protocols for Content, Transport, Security, and Privacy. The foundation standards are: REST, JSON, OAuth2, and OpenID Connect. An emerging approach that could really help put patients in control of the privacy of their electronic medical record is the OAuth2.0-based User-Managed Access (UMA) Protocol of the Kantara Initiative (see my previous post titled Patient Privacy at Web Scale). Initiatives like the ONC-sponsored RESTful Health Exchange (RHEX) project and the HL7 Fast Healthcare Interoperability Resources (FHIR) hold great promise.


  • Case Management Tools. They are typically used by Nurse Practionners (Case Managers) in Medical Homes, a concept popularized by the Patient-Centered Medical Home healthcare delivery model to coordinate care. These tools integrate various capabilities such as risk stratification (using predictive modeling) to identify at-risk patients, content management (check-in, check-out, versioning), workflows (human tasks), communication, business rule engine, and case reporting/analytics capabilities.

Sunday, June 9, 2013

Essential IT Capabilities Of An Accountable Care Organization (ACO)

The Certification Commission for Health Information Technology (CCHIT) recently published a document entitled A Health IT Framework for Accountable Care. The document identifies the following key processes and functions necessary to meet the objectives of an ACO:

  • Care Coordination
  • Cohort Management
  • Patient and Caregiver Relationship Management
  • Clinician Engagement
  • Financial Management
  • Reporting
  • Knowledge Management.

The key to success is a shift to a data-driven healthcare delivery. The following is my assessment of the most critical IT capabilities for ACO success:

  • Comprehensive and standardized care documentation in the form of electronic health records including as a minimum: patients' signs and symptoms, diagnostic tests, diagnoses, allergies, social and familiy history, medications, lab results, care plans, interventions, and actual outcomes. Disease-specific Documentation Templates can support the effective use of automated Clinical Decision Support (CDS). Comprehensive electronic documentation is the foundation of accountability and quality improvement.

  • Care coordination through the secure electronic exchange and the collaborative authoring of the patient's medical record and care plan (this is referred to as clinical information reconciliation in the CCHIT Framework). This also requires health IT interoperability standards that are easy to use and designed following rigorous and well-defined software engineering practices. Unfortunately, this has not always been the case, resulting in standards that are actually obstacles to interoperability as opposed to enablers of interoperability. Case Management tools used by Medical Homes (a concept popularized by the Patient-Centered Medical Home model) can greatly facilitate collaboration and Care Coordination.

  • Patients' access to and ownership of their electronic health records including the ability to edit, correct, and update their records. Patient portals can be used to increase patients' health literacy with health education resources. Decision aids comparing the benefits and harms of various interventions (Comparative Effectiveness Research) should be available to patients. Patients' health behavior change remains one of the greatest challenges in Healthcare Transformation. mHealth tools have demonstrated their ability to support Patient Activation.

  • Secure communication between patients and their providers. Patients should have the ability to specify with whom, for what purpose, and the kind of medical information they want to share. Patients should have access to an audit trail of all access events to their medical records just as consumers of financial services can obtain their credit record and determine who has inquired about their credit score.

  • Clinical Decision Support (CDS) as well as other forms of cognitive aids such as Electronic Checklists, Data Visualization, Order Sets, Infobuttons, and more advanced Clinical Question Answering (CQA) capabilities (see my previous post entitled Automated Clinical Question Answering: The Next Frontier in Healthcare Informatics). The unaided mind (as Dr. Lawrence Weed, the father of the Problem-Oriented Medical Record calls it) is no longer able to cope with the large amounts of data and knowledge required in clinical decision making today. CDS should be used to implement clinical practice guidelines (CPGs) and other forms of Evidence-Based Medicine (EBM).

    However, the delivery of care should also take into account the unique clinical characteristics of individual patients (e.g., co-morbidities and social history) as well as their preferences, wishes, and values. Standardized Clinical Assessment And Management Plans (SCAMPs) promote care standardization while taking into account patient preferences and the professional judgment of the clinician. CDS should be well integrated with clinical workflows (see my my previous post entitled Addressing Challenges to the Adoption of Clinical Decision Support (CDS) Systems).

  • Predictive risk modeling to identity at-risk populations and provide them with pro-active care including early screening and prevention. For example, predictive risk modeling can help identify patients at risk of hospital re-admission, an important ACO quality measure.

  • Outcomes measurement with an emphasis on patient outcomes in addition to existing process measures. Examples of patient outcome measures include: mortality, functional status, and time to recovery.

  • Clinical Knowledge Management (CKM) to disseminate knowledge throughout the system in order to support a learning health system. The Institute of Medicine (IOM) released a report titled  Digital Infrastructure for the Learning Health System: The Foundation for Continuous Improvement in Health and Health Care. The report describes the learning health system as:

    "delivery of best practice guidance at the point of choice, continuous learning and feedback in both health and health care, and seamless, ongoing communication among participants, all facilitated through the application of IT."

  • Applications of Human Factors research to enable the effective use of technology in clinical settings. Examples include: implementation of usability guidelines to reduce Alert Fatigue in Clinical Decision Support (CDS), Checklists, and Visual Analytics. There are many lessons to be learned from other mission-critical industries that have adopted automation. Following several incidents and accidents related to the introduction of the Glass Cockpit about 25 years ago, Human Factors training known as Cockpit Resource Management (CRM) is now standard practice in the aviation industry.

Sunday, April 28, 2013

How I Make Technology Decisions

The open source community has responded to the increasing complexity of software systems by creating many frameworks which are supposed to facilitate the work of developing software. Software developers spend a considerable amount of time researching, learning, and integrating these frameworks to build new software products. Selecting the wrong technology can cost an organization millions of dollars. In this post, I describe my approach to selecting these frameworks. I also discuss the frameworks that have made it to my software development toolbox.

Understanding the Business


The first step is to build a strong understanding of the following:

  • The business goals and challenges of the organization. For example, the healthcare industry is currently shifting to a value-based payment model in an increasingly tightening regulatory environment. Healthcare organizations are looking for a computing infrastructure that support new demands such as the Accountable Care Organization (ACO) model, patient-centered outcomes, patient engagement, care coordination, quality measures, bundled payments, and Patient-Centered Medical Homes (PCMH).

  • The intended buyers and users of the system and their concerns. For example, what are their pain points? which devices are they using? and what are their security and privacy concerns?

  • The standards and regulations of the industry.

  • The competitive landscape in the industry. To build a system that is relevant, it is important to have some ideas about the following: what is the competition? what are the current capabilities of their systems? what is on their road map? and what are customers saying about their products. This knowledge can help shape a Blue Ocean Strategy.

  • Emerging trends in technologies.

This type of knowledge comes with industry experience and a habit of continuously paying attention to these issues. For example, on a daily basis, I read industry news as well as scientific and technical publications. As a member of the American Medical Informatics Association (AMIA), I receive the latest issue of the Journal of the American Medical Informatics Association (JAMIA) which allows me to access cutting-edge research in medical informatics. I speak at industry conferences when possible and this allows me not only to hone my presentation skills, but also attend all sessions for free or at a discounted price. For the latest in software development, I turn to publications like InfoQ, DZone, and TechCrunch.

To better understand the users and their needs and concerns, I perform early usability testing (using sketches, wireframes, or mockups) to test design ideas and obtain feedback before actual development starts. For generating innovative design ideas, I recommend the following book: Universal Methods of Design: 100 Ways to Research Complex Problems, Develop Innovative Ideas, and Design Effective Solutions by Bruce Hanington and Bella Martin.

 

Architecting the Solution


Armed with a solid understanding of the business and technological landscape as well as the domain, I can start creating a solution architecture. Software development projects can be chaotic. Based on my experience working on many software development projects across industries, I found that Domain Driven Design (DDD) can help foster a disciplined approach to software development. For more on my experience with DDD, see my previous post entitled How Not to Build A Big Ball of Mud, Part 2.

Frameworks evolve over time. So, I make sure that the architecture is framework-agnostic and focused on supporting the domain. This allows me to retrofit the system in the future with new frameworks as they emerge.


 

Due Diligence


Software development is a rapidly evolving field. I keep my eyes on the radar and try not to drink the vendors Kool-Aid. For example, not all vendors have a good track record in supporting standards, interoperability, and cross-platform solutions.

The ThoughtWorks Technology Radar is an excellent source of information and analysis on emerging trends in software. Its contributors include software thought leaders like Martin Fowler and Rebecca Parson. I also look at surveys of the developers community to determine the popularity, community size, and usage statistics of competing frameworks and tools. Sites like InfoQ often conduct these types of surveys like the recent InfoQ survey on Top JavaScript MVC Frameworks. I also like Matt Raible's Comparing JVM Web Frameworks.

I value the opinion of recognized experts in the field of interest. I read their books, blogs, and watch their presentations. Before formulating my own position, I make sure that I read expert opinions on opposing sides of the argument. For example, in deciding on a pure Java EE vs. Spring Framework approach, I read arguments by experts on both sides (experts like Arun Gupta, Java EE Evangelist at Oracle and Adrian Colyer, CTO at SpringSource).

Finally, consider a peer review of the architecture using a methodology like the Architecture Tradeoff Analysis Method (ATAM). Simply going through the exercise of explaining the architecture to stakeholders and receiving feedback can significantly help in improving it.


Rapid Prototyping 

 

It's generally a good idea to create a rapid prototype to quickly learn and demonstrate the capabilities and value of the framework to the business. This can also generate excitement in the development team, particularly if the framework can enhance the productivity of developers and make their life easier.

 

The Frameworks I've Selected


The Spring Framework

I am a big fan of the Spring Framework. I believe it is really designed to support the need of developers from a productivity standpoint. In addition to dependency injection (DI), Aspect Oriented Programming (AOP), and Spring MVC, I like the Spring Data repository abstraction for JPA, MongoDB, Neo4J, and Hadoop. Spring supports Polyglot Persistence and Big Data today. I use Spring Roo for rapid application development and this allows me to focus on modeling the domain. I use the Roo scaffolding feature to generate a lot of Spring configuration and Java code for the domain, repository (Roo supports JPA and MongDB), service, and web layers (Roo supports Spring MVC, JSF, and GWT). Spring also support for unit and integration testing with the recent release of Spring MVC Test.

I use Spring Security which allows me to use AOP and annotations to secure methods and supports advanced features like Remenber Me and regular expressions for URLs. I think that JAAS is too low-level. Spring Security allows me to meet all OWASP Top Ten requirements (see my previous post entitled  Application-Level Security in Health IT Systems: A Roadmap).

Spring Social makes it easy to connect a Spring application to social network sites like Facebook, Twitter, and LinkedIn using the OAuth2 protocol. From a tooling standpoint, Spring STS supports many Spring features and I can deploy directly to Cloud Foundry from Spring STS. I look forward to evaluating Grails and the Play Framework which use convention over configuration and are built on Groovy and Scala respectively.

Thymeleaf, Twitter Boostrap, and JQuery

I use Twitter Boostrap because it is based on HTML5, CSS3, JQuery, LESS, and also supports a Responsive Web Design (RWD) approach. The size of the components library and the community is quite impressive.

Thymeleaf is an HTML5 templating engine and a replacement for traditional JSP. It is well integrated with Spring MVC and supports a clear division of labor between back-end and front-end developers. Twitter Boostrap and Thymeleaf work well together.


AngularJS

For Single Page Applications (SPA) my definitive choice is AngularJS. It provides everything I need including a clean MVC pattern implementation, directives, view routing, Deep Linking (for bookmarking), dependency injection, two-way databinding, and BDD-style unit testing with Jasmine. AngularJS has its own dedicated debugging tool called Batarang. There are also several learning resources (including books) on AngularJS.

Check this page comparing the performance of AngulaJS vs. KnockoutJS. This is a survey of the popularity of  Top JavaScript MVC Frameworks.

 

D3.js 

D3.js is my favorite for data visualization in data-intensive applications. It is based on HTML5, SVG, and Javascript. For simple charting and plotting, I use jqPlot which is based on JQuery. See my previous post entitled Visual Analytics for Clinical Decision Making.

 

I use R for statistical computing, data analysis, and predictive analytics. See my previous post entitled Statistical Computing and Data Mining with R.


Development Tools


My development tools include: Git (Distributed Version Control), Maven or Gradle (build), Jenkins (Continuous Integration), Artifactory (Repository Manager), and Sonar (source code quality management). My testing toolkit includes Mockito, DBUnit, Cucumber JVM, JMeter, and Selenium.

Sunday, April 14, 2013

Addressing Challenges to the Adoption of Clinical Decision Support (CDS) Systems


Despite its potential to improve the quality of care, CDS is not widely used in health care delivery today. In tech marketing parlance, CDS has not crossed the chasm. There are several issues that need to be addressed including:

  • Clinician acceptance of the concept of automated execution of evidence-based clinical guidelines.

  • Seamless integration into clinical workflows.

  • Usability including alert fatigue and understanding the human factors that influence alert acceptance.

  • Standardization to enable the interoperability of CDS knowledge artifacts and executable clinical guidelines.

  • Integration with Natural Language Processing (NLP), for example to allow clinicians to ask clinical questions in natural language (Clinical Question Answering or CQA) and to extract clinical answers from very large amounts of unstructured sources of medical knowledge. PubMed has more than 23 millions citations for biomedical literature from MEDLINE, life science journals, and online books. The typical Clinical Practice Guideline (CPG) is 50 to 150 pages long. It is impossible for the human brain to keep up with that amount of knowledge.

  • Integration of predictive risk models based on the analysis of historical clinical data to enable targeted interventions for specific at-risk populations.

  • Integration of genomics to enable personalized medicine. Genomics significantly increases the number of data types that can affect clinical decision making.

  • The use of mathematical simulation in CDS to explore and compare the outcomes of various treatment alternatives.

  • Integration of outcomes research in the context of a shift to a value-based healthcare delivery model. This can be achieved by incorporating the results of Comparative Effectiveness Research (CER) and Patient-Centered Outcome Research (PCOR) into CDS systems. Increasingly, outcomes research will be performed using observational studies (based on real world clinical data) which are recognized as complementary to randomized control trials (RCTs) for discovering what works and what doesn't work in practice. This is a form of Practice-Based Evidence (PBE) that is necessary to close the evidence loop.

  • Support for a shared decision making process that takes into account the values, goals, and wishes of the patient.

  • The use of Visual Analytics in CDS to facilitate analytical reasoning over very large amount of structured and unstructured data sources.

In this post, I share my thoughts on CDS interoperability, integration with clinical workflows, usability, Visual Analytics, and natural language processing (NLP).

 

Interoperable and Executable Clinical Guidelines


The complexity and cost inherent in capturing the medical knowledge in clinical guidelines and translating that knowledge into executable code remains an impediment to the widespread adoption of CDS software. Therefore, there is a need for standards for the sharing and interchange of CDS knowledge artifacts and executable clinical guidelines.

Different formalisms, methodologies, and architectures have been proposed over the years for representing the medical knowledge in clinical guidelines. Examples include, but are not limited to the following:

  • The Arden Syntax
  • GLIF (Guideline Interchange Format)
  • GELLO (Guideline Expression Language Object-Oriented)
  • GEM (Guidelines Element Model)
  • The Web Ontology Language (OWL)
  • PROforma
  • EON
  • PRODIGY
  • Asbru
  • GUIDE
  • SAGE.

More recently, the ONC Health eDecision Initiative has published the following specifications:



Because of the complexity and cost of developing CDS software, CDS software capabilities can be exposed as a set of services (part of a Service Oriented Architecture) that can be consumed by other client health IT systems such as EHR and Computerized Physician Order Entry (CPOE) systems. To reduce costs, these CDS software services can be shared by several health care providers. Consensus was recently achieved on the ONC Health eDecision CDS Guidance Service Use Case.

Enabling the interoperability of executable clinical guidelines requires a standardized domain model for  representing the medical information of patients and other contextual clinical information. The HL7 virtual Medical Record (vMR) is a standardized domain model for representing the inputs and outputs of CDS systems.

In practice, executable CDS rules (like other complex types of business rules) can be implemented with a business rule engine using forward chaining. This is the approach taken by OpenCDS and some large scale CDS implementations in real world healthcare delivery settings. This allows CDS software developers to externalize the medical knowledge contained in clinical guidelines in the form of declarative rules as opposed to embedding that knowledge in procedural code. Many viable open source business rule management systems (BRMS) are available today and provide capabilities such as a rule authoring user interface, a rules repository, and a testing environment. Furthermore, a rule execution environment can be easily integrated with business processes (see the section below on clinical workflow integration), ontologies, and predictive analytics models.

The W3C Rule Interchange Format (RIF) specification is a possible solution to the interchange of executable CDS rules. The RIF Production Rule Dialect (PRD) is designed as a common XML serialization syntax for multiple rule languages to enable rule interchange between different BRMS. For example, RIF-PRD would allow the exchange of executable rules between existing BRMS like JBoss Drools, IBM ILOG JRules, and Jess. RIF is currently a W3C Recommendation and is backed by several BRMS vendors. A paper entitled "A model driven approach for bridging ILOG Rule Language and RIF" presented by Valerio Cosentino, Marcos Didonet Del Fabro, and Adil El Ghali at the RuleML 2012 conference describes an application of RIF to rule interoperability.

 

Seamless Integration into Clinical Workflows and Care Pathways


One of the main complaints against CDS systems is that they are not well integrated into clinical workflows. Existing business process management standards like the Business Process Modeling Notation (BPMN) can provide a proven, practical, and adaptable approach to the integration of CDS rules and clinical pathways. Some existing open source and commercial BRMS already provide an integration of business rules and business processes out-of-the box and there are well-known patterns for integrating rules and processes in business applications.

 

Usability of Clinical Decision Support Systems


We need to do a better job in understanding the human factors that influence alert acceptance by clinicians in CDS. We also need clear and proven usability guidelines (backed by scientific research) that can be implemented by CDS software developers. Several research projects have sought to understand why clinicians accept or ignore CDS recommendations (for example, see this study by Seidling et al.: Factors influencing alert acceptance: a novel approach for predicting the success of clinical decision support).

The British National Health Service (NHS) Common User Interface (CUI) Program has created standards and guidance in support of the usability of clinical applications with inputs from user interface design specialists, usability experts, and hundreds of clinicians with a diversity of background in using health information technology. The program is based on a rigorous development process which includes: research, design, prototyping, review, usability testing, and patient safety assessment by clinicians.

In the US, the National Institute of Standards and Technology (NIST) has also published some guidance on the usability of clinical applications.


Toward Data-Driven Clinical Decision Support


The use of predictive risk models for personalized medicine is becoming a common practice. These models can predict the health risks of patients based on their individual health profiles (including genetic profiles). These models often take the form of logistic regression models. Examples include models for predicting cardiovascular disease, breast cancer, diabetes, ICU mortality, hospitalization, and hospital readmission (an important ACO performance measure). Developing accurate predictive models requires a lot of real world clinical data (millions of patient records satisfying the three Vs of Big Data: Volume, Velocity, and Variability). Developing these models requires the use of Statistical Computing and Machine Learning techniques for data analysis (see my previous post titled Statistical Computing and Data Mining with R).

Thanks to the Meaningful Use incentive program, adoption of electronic health record (EHR) systems by providers is rapidly increasing. This translates into the availability of huge amount of EHR data which can be harvested to provide the Practice Based Evidence (PBE) necessary to close the evidence loop. PBE is the key to a learning health system. The Institute of Medicine (IOM) released a report last year titled "Digital Infrastructure for the Learning Health System: The Foundation for Continuous Improvement in Health and Health Care". The report describes a learning health system as:

"...delivery of best practice guidance at the point of choice, continuous learning and feedback in both health and health care, and seamless, ongoing communication among participants, all facilitated through the application of IT."
PBE will require not only rigorous scientific methodologies, but also a computing platform suitable for the era of Big Data in medicine. As Williams Osler (1849-1919) famously said:
"Medicine is a science of uncertainty and an art of probability."

Visual Analytics is an emerging scientific discipline that can help clinicians in decision making in the era of Big Data and data-driven medicine. The goal of Visual Analytics is to obtain deep insight for effective understanding, reasoning, and decision making through the visual exploration of massive, complex, and often ambiguous data (see my previous post titled Visual Analytics for Clinical Decision Making).

Developing data-driven CDS capabilities will introduce a new type of technical challenge: a seamless integration between business rules, predictive models, and ontologies (see this study by Sottara et al.: Enhancing a production rule engine with predictive models using PMML).

Integration with Natural Language Processing (NLP)


It is not practical to expect that all medical knowledge required in clinical decision making will be manually translated into IF-THEN statements to be executed by rule engines. A vast amount of rapidly increasing medical knowledge exists in the form of narrative text in clinical guidelines and other unstructured sources of medical knowledge like textbooks, research papers, and academic journals.

For an in-depth discussion on the topic of NLP in Clinical Decision Support, see my previous post titled Automated Clinical Question Answering: The Next Frontier in Healthcare Informatics.

Sunday, March 24, 2013

Statistical Computing and Machine Learning with R

The use of predictive risk models for personalized medicine is becoming a common practice in healthcare delivery. These models can predict the health risk of patients based on their individual health profiles. Examples include models for predicting breast cancer, stroke, cardiovascular disease, Alzheimer's disease, chronic kidney disease, diabetes, hypertension, and operative mortality for patients undergoing cardiac surgery. These predictive models are created through data analysis using statistical computing.

Predictive risk modeling can be used to identity at-risk populations and provide them with pro-active care including early screening and prevention. For example, predictive risk modeling can help identify patients at risk of hospital re-admission, an important Accountable Care Organization (ACO) quality measure.

Another important challenge in healthcare is to discover what works and what does not work in clinical practice. Comparative Effectiveness Research (CER), an emerging trend in Evidence Based Practice (EBP), has been defined by the Federal Coordinating Council for CER as "the conduct and synthesis of research comparing the benefits and harms of different interventions and strategies to prevent, diagnose, treat, and monitor health conditions in 'real world' settings."

Despite their inherent methodological challenges (lack of randomization leading to possible bias and confounding), observational studies (using real world clinical data) are increasingly recognized as complementary to Randomized Control Trials (RCTs) and an important tool in clinical decision making and health policy.

Statistical Computing and Machine Learning are essential components of intelligent health IT systems. Over the last few years, the free and open source R Project for Statistical Computing has emerged as one the most popular tools for data analysis. This poll by kdnuggets.com shows the breakdown in popularity of various data mining and analytic tools.

R supports several Machine Learning algorithms including:

  • Nearest Neighbor
  • Naive Bayes
  • Decision Trees
  • Classification Rules
  • Logistic Regression
  • Regression Trees
  • Model Trees
  • Neural Networks
  • Support Vector Machines
  • Association Rules
  • k-Means Clustering
A technique called "Ensemble Methods" which consists in combining multiple models into one can be used to achieve a higher level of accuracy than its components. There are also R packages for niche methods like the Latent Class Causal Analysis (LCCA) Package for R. LCA is used in behavioral health research.

The following are very useful resources for doing statistical computing and data mining with R:
 
  • RStudio: an Integrated Development Environment (IDE) for R

  • ggplot2: statistical graphics and plotting system for R

  • sqldf: a package for manipulating R data frames using SQL

  • RMySQL: R interface to the MySQL database

  • RMongo: MongoDB Database interface for R

  • RHIPE: Big Data analysis using R and Hadoop. RHIPE stands for R and Hadoop Integrated Programming Environment. This approach is referred to as D&R (Divide and Recombine) Analysis of Large Complex Data (see this tech report on D&R from the RHIPE team)

  • RHadoop:  Big Data analysis using R and Hadoop. This tool provides Hadoop MapReduce functionality in R

  • Rattle: A Graphical User Interface for Data Mining using R. This tool can export predictive models in Predictive Model Markup Language (PMML) format.

Sunday, March 10, 2013

How Not to Build A Big Ball of Mud, Part 2

In a previous post entitled How not to build a big  ball of mud, I described the complexity of modern software systems and the challenges faced today by software developers and architects. Domain Driven Design (DDD) is a proven pattern language that can foster a disciplined approach to software development. DDD was first introduced by Eric Evans nine years ago in a seminal book entitled: Domain-Driven Design: Tackling Complexity in the Heart of Software. Over the last 9 years, a community of practice has emerged around DDD and many lessons have been learned in applying DDD to real world complex software development projects. During that time, software complexity has also increased significantly. Changes in the field of software development during the last few years include:

  • The proliferation of client devices which requires a Responsive Web Design (RWD) approach. RWD is made possible by open web standards like HTML5, CSS3, and Javascript which have displaced proprietary user interface technologies like Flex and Silverlight. RWD frameworks like Twitter Boostrap and Javascript Libraries like JQuery have become very popular with developers. The demands put on Javscript on the client side have created the need for Javascript MVC frameworks like AngularJS and EmberJS.

  • The importance of the user experience in a competitive online marketplace. Performing usability testing early in the software development life cycle (using wireframes or mockups) to test design ideas and obtain early feedback from future users is extremely valuable for creating the right solution. Metrics such as the System Usability Scale (SUS) can be used to assess the results of usability testing.

  • The prevalence of REST, JSON, OAuth2, and Web APIs for achieving web scale.

  • The emergence of Polyglot Persistence or the use of different persistence mechanisms such as relational, document, and graph databases within the same application. Developers are discovering that modeling data for NoSQL databases has many benefits, but also its own peculiarities.

  • The demands for quality and faster time-to-market have led to new techniques like test automation and continuous delivery.

The open source community has responded to these challenges by creating many frameworks which are supposed to facilitate the work of developing software. Software developers spend a considerable amount of time researching, learning, and integrating these various frameworks to build a system. Some of these frameworks can indeed be very helpful when used properly. However, DDD puts a big emphasis on understanding the domain. Here is what I learned from applying DDD over the last few years:


  • DDD is a significant intellectual investment, but with a potential for big rewards. To be successful in applying DDD, one must take the time to understand and digest the underlying principles, from the building blocks (entities, aggregates, value objects, modules, domain events, services, repositories, and factories) to the strategic aspects of applying DDD. For example, understanding the difference between an aggregate, a value object, and an entity is essential. Learning the right approach to designing aggregates is also very important as this can significantly impact transactions and performance. I highly recommend reading the recently published Implementing Domain Driven Design by Vaughn Vernon. The book provides a contemporary approach to applying DDD. For example, it covers important topics in applying DDD to modern software systems such as:  sub-domains, domain events, event stores and event sourcing, rules for aggregate design, transactions, eventual consistency, REST, NoSQL, and enterprise application integration with concrete examples.

  • Proper application layering (user interface, application, domain, and infrastructure), understanding the responsibility of each layer (for example, an anemic domain model and a fat application layer are anti-pattern), and coding to interfaces. DDD is object-oriented (OO) design done right. The SOLID Principles of OO design are still applicable.

  • Determine if DDD is right for your project. Most of my work during the last few years has been in the healthcare domain. The HL7 CCDA and the Virtual Medical Record (vMR) define an information model for Electronic Healthcare Records (EHR) and Clinical Decision Support (CDS) systems respectively. Interoperability is an important and challenging issue in healthcare. DDD concepts such as "Strategic Design", "Context Map", "Bounded Context", and "Published Language" are very helpful in addressing and navigating this type of complexity.

  • As I mentioned earlier, DDD puts a big emphasis on understanding the domain. Developers applying DDD should be prepared to dedicate a considerable amount of time to learning about the domain, for example by collaborating and carefully listening to domain experts and by reading as much as they can about the domain. This is also the key to creating a rich domain model with behavior (as opposed to an anemic one). I found that simply reading industry standards and regulations is a great way to understand a domain. So understanding the domain is not just the responsibility of the Business Analyst. The code is the expression of the domain, so the coder needs to understand the domain in order to express it with code.

  • Some developers blame popular frameworks for encouraging anemic domain models. I found that a lack of understanding of the domain and its business rules is a major contributing factor to anemia in the domain model. A rule engine like Drools can help externalize these business rules in the form of declarative rules that can be maintained by domain experts through a DSL, spreadsheet, or web-based user interface.

  • There are opportunities in using recent ideas like Event Sourcing and the Command Query Responsibility Segregation (CQRS). These opportunities include: scalability, true audit trails, data mining, temporal queries, application integration. However, being pragmatic can help avoid unnecessary complexity.

  • I recommend exploring tools that are specifically designed to support a DDD or Model-Driven Development (MDD) approach. Apache Isis, Roma Meta Framework, Tynamo, and Naked Objects are examples of such tools. These tools can automatically generate all the layers of an application based on the specification of a domain model. By doing so, these tools allow you to really focus your time and attention on exploring and understanding the domain as opposed to framework and infrastructure concerns. For architects, these tools can serve as design pattern automation, constraining the development process to conform to DDD principles and patterns. I believe this is part of a larger trend in automating software development which also includes the essential practice of test automation. We software developers like to automate the job of other people. However, many tasks that we perform (including coding itself) are still very manual. Aspect-Oriented Programming (AOP) (using AspectJ for example) can also be used to enable this type of design pattern automation through compile-time weaving.

  • Check my previous post for 20 techniques for achieving software excellence.

Sunday, February 24, 2013

State of the Semantic Web in the Clinical Domain

In a previous post entitled Why Do We Need Ontologies in Healthcare Applications, I explained the important difference between ontologies, coding systems, and information models of data structures. I also outlined the benefits of using Semantic Web technologies like RDF, RDFS, OWL, SWRL, R2RML, SPARQL, SKOS, and Linked Open Data (LOD). These benefits include:

  • Reasoning and inferencing which are essential characteristics of intelligent Health IT Systems (iHIT)
  • Model consistency checking
  • Open World Assumption (OWA) and Non-Unique Naming Assumption enabling the integration of heterogeneous data sources and knowledge bases using Linked Open Data (LOD) principles. This integration can be accomplished by providing an RDF view over existing relational databases using R2RML (RDB to RDF Mapping Language) and by performing SPARQL federated queries. Intelligent queries can retrieve inferred facts using SPARQL 1.1 Entailment Regimes.
  • Linking to other biomedical ontologies like SNOMED and the Translational Medicine Ontology
  • Clinical Knowledge Management (CKM) using OWL to model and execute Clinical Practice Guidelines (CPGs) and Care Pathways (CPs).

Semantic Web in Clinical and Translational Research


The following are papers on how Semantic Web technologies are being used to realize these benefits in the healthcare domain:

Apache Stanbol


I recently came across Apache Stanbol, a new Apache project which is described as "a set of reusable components for semantic content management". What I really like about Apache Stanbol is that it not only works on unstructured data sources, but also integrates a number of other popular Apache open source software which can be used to add a semantic layer to modern RESTful content-oriented applications. These components include:

  • Apache Tika for text and metadata extraction from a variety of commonly used document formats
  • Apache OpenNLP for natural language processing and named entity recognition (NER)
  • Apache Solr for document store and semantic search
  • Apache Jena as the RDF and Semantic Web framework.
Other open source components like Apache Mahout (a scalable Machine Learning library) can be integrated to provide document recommendation and clustering services.

The Content Enhancers in Stanbol can perform named entity recognition (NER) and link text annotations to external datasets such as DBPedia.  In the clinical domain, these enhancers can be used to extract entities from medical records, journal articles, and clinical guidelines. These entities can then be linked to other clinical data sources such as drug and disease databases using Linked Data techniques.

Apache Stanbol also provides Reasoners based on Jena RDFS, OWL, and OWLMini Reasoners as well as the HermiT OWL Reasoner. These reasoners can perform consistency checking and classification. Stanbol supports Inference Rules in the following formats: SWRL, Jena Rules, and SPARQL (by converting Stanbol Rules into SPARQL CONSTRUCTs).

Sunday, February 17, 2013

Automated Clinical Question Answering: The Next Frontier in Healthcare Informatics

In a previous post, I predicted that 2013 will be the year Intelligent Health IT Systems (iHIT) go mainstream.  I based my prediction on a number of factors, notably the transformation of healthcare to a value-based delivery system driven by the latest scientific evidence (evidence-based practice and practice-based evidence).

Last week, IBM together with health insurer WellPoint Inc., and New York’s Memorial Sloan-Kettering Cancer Center announced the commercialization of Watson (the supercomputer which beat human champions in "Jeopardy!" on February 16, 2011) for question answering (QA) in the clinical domain. The following are some interesting facts released by IBM as part of this announcement:

  • The supercomputer has ingested 1,500 lung cancer cases from Sloan-Kettering records, plus 2 million pages of text from journals, textbooks and treatment guidelines. This is what I called Big Data in medicine.
  • In 2012, Watson became 240 percent faster and 75 percent smaller so it can run on a single server. No surprise here and I expect this trend to continue.

The following YouTube video entitled Oncology Diagnosis and Treatment explains how IBM envisions using Watson for Clinical Question Answering (CQA):



The User Experience in the Watson Demo

 

  • Clinical questions can be posed in natural language (spoken or typed in by the clinician using a keyboard).
  • The sources used for answering clinical questions include both structured (EMR databases) and unstructured information (journal articles, clinical guidelines, etc.).
  • Personalized medicine: the proposed interventions are driven by the data in the patient's medical record and the system can prompt the clinician for additional information on the patient if necessary. The displayed evidence and recommendations are updated to reflect changes in the patient's clinical data.
  • Human Factors: the clinician is always in the loop. She can ask Watson how it arrives at a specific care recommendation and can even remove a specific evidence (if deemed irrelevant or not appropriate).
  • The use of confidence scoring and evidence highlighting.
  • Patient-centeredness and shared decision making: the treatment plans take into account the values, goals, and wishes of the patient (patient preferences). Treatment options are discussed with the patient.
  • Comparative effectiveness is used to compare the benefits and harms of different interventions.
  • Information is displayed using data visualization (dashboard) to help meet key performance indicators in the context of a value-based payment model.


The Science Behind Watson


The real question is how do we make intelligent health IT systems like Watson widely available to all patients. A landmark report published by the Institute of Medicine in 2001 and titled Crossing the Quality Chasm - A New Health System for the 21st Century contained the following recommendation:

Patients should receive care based on the best available scientific knowledge. Care should not vary illogically from clinician to clinician or from place to place.

For the scientifically (and Artificial Intelligence) inclined, the following are some pointers on the science behind Watson:


The picture below represents a high level architecture of Watson (click on the image to enlarge it).


DeepQA



AskHermes and MiPACQ


IBM Watson is not the only effort to develop automated CQA capabilities.  Some earlier CQA efforts used the PICO framework (Problem/Population, Intervention, Comparison, Outcome) to facilitate processing. More recent efforts have focused on the use of clinical questions posed in natural language.

AskHermes (Help clinicians to Extract and aRrticulate Multimedia information for answering clinical quEstionS) allows clinicians to enter questions in natural language and uses the following unstructured information sources: MEDLINE abstracts, PubMed Central full-text articles, eMedicine documents, clinical guidelines, and Wikipedia articles.

The processing pipeline in AskHermes includes the following: Question Analysis, Related Questions Extraction, Information Retrieval, Summarization and Answer Presentation. AskHermes performs question classification using MMTx (MetaMap Technology Transfer) to map keywords to UMLS concepts  and semantic types. Classification is also achieved through supervised machine learning algorithms such as Support Vector Machine (SVM) and conditional random fields (CFRs). Summarization and answer presentation are based on clustering techniques.

MiPACQ (Multi-source Integrated Platform for Answering Clinical Questions) is based on Natural Language Processing (NLP) and Information Retrieval (IR) and utilizes data sources such as Electronic Medical Record (EMR) databases and online medical encyclopedia like Medpedia. MiPACQ uses a processing pipeline based on UIMA (Unstructured Information Management Architecture) and machine learning-based as well as rule-based scoring. NLP capabilities are provided by ClearTK and cTakes (clinical Text Analysis and Knowledge Extraction System).



The Road Ahead


Automated Clinical Question Answering (CQA) is really hard. However, that is the future of computing: intelligent machines we can have meaningful conversations with. CQA is a multidisciplinary field which combines disciplines like statistical computing, information retrieval, natural language processing, machine learning, rule engines, semantic web technologies, knowledge representation and reasoning, visual analytics, and massively parallel computing. There are several open source projects that provide the building blocks. Many EHR software today are glorified data entry systems. We need to move to the next level and that will require technical leadership.