Tuesday, August 11, 2009

Adding Semantics to SOA

What can Semantic Web technologies such as RDF, OWL, SKOS, SWRL, and SPARQL bring to Web Services? One of the most difficult challenges of SOA is data model transformation. This problem occurs when services don't share a canonical XML schema. XML transformation languages such as XSLT and XQuery are typically used for data mediation in such circumstances.

While it is relatively easy to write these mappings, the real difficulty lies in mapping concepts across domains. This is particularly important in B2B scenarios involving multiple trading partners. In addition to proprietary data models, it is not uncommon to have multiple competing XML standards in the same vertical. In general, these data interoperability issues can be syntactic, structural, or semantic in nature. Many SOA projects can trace their failure to these data integration issues.

This is where Semantic Web technologies can add significant value to SOA. Semantic Annotations for WSDL and XML Schema (SAWSDL) is a W3C Recommendation that defines the following extension attributes, which can be added to WSDL and XML Schema components:

  • The modelReference extension attribute associates a WSDL or XML Schema component with a concept in a semantic model such as an OWL ontology. The semantic representation is not restricted to OWL (for example, it could be a SKOS concept). The modelReference extension attribute is used to annotate XML Schema type definitions, element and attribute declarations, as well as WSDL interfaces, operations, and faults.
  • The liftingSchemaMapping and loweringSchemaMapping extension attributes typically point to an XSLT or XQuery mapping file for transforming between XML instances and ontology instances.
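To make the three attributes concrete, here is a minimal sketch that reads them from an annotated schema using only the Python standard library. The schema fragment, the `example.org` ontology and mapping URIs, and the `sawsdl_annotations` helper are all hypothetical and exist only for illustration; the SAWSDL namespace URI is the one defined by the W3C Recommendation.

```python
import xml.etree.ElementTree as ET

SAWSDL_NS = "http://www.w3.org/ns/sawsdl"
XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema fragment; the ontology and mapping URIs are made up.
SCHEMA = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:sawsdl="http://www.w3.org/ns/sawsdl">
  <xs:element name="PurchaseOrder" type="xs:string"
      sawsdl:modelReference="http://example.org/ontology#PurchaseOrder"
      sawsdl:liftingSchemaMapping="http://example.org/mappings/po-lift.xslt"
      sawsdl:loweringSchemaMapping="http://example.org/mappings/po-lower.xslt"/>
  <xs:element name="Comment" type="xs:string"/>
</xs:schema>
"""

ATTRIBUTES = ("modelReference", "liftingSchemaMapping", "loweringSchemaMapping")


def sawsdl_annotations(schema_text):
    """Map each annotated xs:element name to its SAWSDL attribute values."""
    root = ET.fromstring(schema_text)
    annotated = {}
    for element in root.iter(f"{{{XSD_NS}}}element"):
        found = {}
        for attr in ATTRIBUTES:
            value = element.get(f"{{{SAWSDL_NS}}}{attr}")
            if value:
                # Each attribute holds a whitespace-separated list of URIs.
                found[attr] = value.split()
        if found:
            annotated[element.get("name")] = found
    return annotated


if __name__ == "__main__":
    for name, attrs in sawsdl_annotations(SCHEMA).items():
        print(name, attrs)
```

Because each attribute value is a list of URIs, the helper returns lists rather than single strings, which is also how one component can carry several references or mappings.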

A typical example of how SAWSDL might be used is in an electronic commerce network where trading partners use various standards such as EDI, UBL, ebXML, and RosettaNet. In this case, the modelReference extension attribute can be used to map a WSDL or XML Schema component to a concept in a common foundational ontology such as one based on the Suggested Upper Merged Ontology (SUMO). In addition, lifting and lowering XSLT transforms are attached to XML Schema components in the SAWSDL with liftingSchemaMapping and loweringSchemaMapping extension attributes respectively. Note that any number of those transforms can be associated with a given XML schema component.

Traditionally, when dealing with multiple services (often across organizational boundaries), an Enterprise Service Bus (ESB) provides mediation services such as business process orchestration, business rules processing, data format and data model transformation, message routing, and protocol bridging. Semantic mediation services can be added as a new type of ESB service. The SAWSDL4J API defines an object model that allows SOA developers to access and manipulate SAWSDL annotations.

Ontologies have been developed for some existing e-commerce standards such as EDI X12, RosettaNet, and ebXML. When required, ontology alignment can be achieved with OWL constructs such as subClassOf, equivalentClass, and equivalentProperty.
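As a rough illustration of what such alignment buys you, the sketch below stores a few alignment axioms as plain tuples and follows them transitively. The prefixed class names (`x12:`, `ubl:`, `rn:`, `core:`) and the `superclasses` helper are hypothetical, and this hand-rolled traversal is only a stand-in for what an OWL reasoner would do over a real ontology.

```python
from collections import defaultdict

OWL_EQUIVALENT = "owl:equivalentClass"
RDFS_SUBCLASS = "rdfs:subClassOf"

# Hypothetical alignment axioms between three trading-partner vocabularies.
AXIOMS = [
    ("x12:PurchaseOrder", OWL_EQUIVALENT, "ubl:Order"),
    ("rn:PurchaseOrderRequest", RDFS_SUBCLASS, "ubl:Order"),
    ("ubl:Order", RDFS_SUBCLASS, "core:BusinessDocument"),
]


def superclasses(cls, axioms):
    """Return every class that `cls` is equivalent to or a subclass of."""
    edges = defaultdict(set)
    for subject, predicate, obj in axioms:
        if predicate == RDFS_SUBCLASS:
            edges[subject].add(obj)
        elif predicate == OWL_EQUIVALENT:
            # Equivalence holds in both directions.
            edges[subject].add(obj)
            edges[obj].add(subject)
    seen = set()
    stack = [cls]
    while stack:
        for neighbour in edges[stack.pop()]:
            if neighbour != cls and neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen


if __name__ == "__main__":
    print(sorted(superclasses("rn:PurchaseOrderRequest", AXIOMS)))
```

With these axioms, a message typed with the RosettaNet-style class can be recognized as an order in both of the other vocabularies without any pairwise mapping between them.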

Semantic annotations provided by SAWSDL can also be leveraged in orchestrating business processes using the Business Process Execution Language (BPEL). To facilitate service discovery in SOA registries and repositories, interface definitions in WSDL documents can be associated with a service taxonomy defined in SKOS. In addition, once an XML message is lifted to an ontology instance, the data in the message becomes available to Semantic Web tools like OWL and SWRL reasoners and SPARQL query engines.
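The lifting step can be pictured with a small sketch. In practice the mapping would be an XSLT or XQuery file referenced by liftingSchemaMapping and the result would be queried with SPARQL; here a plain Python function plays the role of the lifting transform and a simple pattern match plays the role of the query engine. The message layout, the `example.org` URIs, and the `lift` and `match` helpers are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

EX = "http://example.org/ontology#"  # hypothetical ontology namespace

# Hypothetical XML message as it might arrive from a trading partner.
MESSAGE = """\
<Order id="PO-1001">
  <Buyer>Acme Corp</Buyer>
  <Total currency="USD">250.00</Total>
</Order>
"""


def lift(xml_text):
    """Stand-in for a lifting transform: XML instance -> set of triples."""
    order = ET.fromstring(xml_text)
    subject = "http://example.org/orders/" + order.get("id")
    total = order.find("Total")
    return {
        (subject, "rdf:type", EX + "PurchaseOrder"),
        (subject, EX + "buyerName", order.findtext("Buyer")),
        (subject, EX + "totalAmount", total.text),
        (subject, EX + "currency", total.get("currency")),
    }


def match(triples, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


if __name__ == "__main__":
    graph = lift(MESSAGE)
    # Roughly: SELECT ?order ?name WHERE { ?order ex:buyerName ?name }
    for triple in match(graph, p=EX + "buyerName"):
        print(triple)
```

Once the data is in triple form, the same pattern works regardless of which partner's XML format the message originally used, which is the point of lifting to a shared ontology.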


Uche Ogbuji said...

I dunno, Joel,

I tend to think the problem is at a higher level, and that SAWSDL alone is just throwing bits at the problem.

I think what's needed is a gradation of business/service modeling, from informal description of the problem space to *lightweight* annotations within information-bearing (i.e. protocol) formats. The problem is that if your protocol formats are heavyweight (i.e. WSDL, OWL, XSD etc.) you haven't solved the problem that so much in SOA raises a complexity barrier to its fundamental goal of harmonizing business and tech.

For a flavor of my personal preferences in how to mix Semantics into SOA (emphasis on simplicity and pragmatics) see my STC09 slides:


I warmed up to that in my contribution to the PwC report:


Joel Amoussou said...

The design goals of SAWSDL were actually modest. SAWSDL is a simple standardized mechanism for adding semantic annotations (in the same spirit as GRDDL and RDFa) to an already established model/representation which in this case is XML Schema and WSDL. SAWSDL allows you to do this at a granular level by mapping WSDL and XSD components to concepts in an ontology.

Akara is certainly an innovative approach since it supports RESTful web services, RDF, and pipelining for data transformation (do you support XProc?). However, organizations have invested billions of dollars in SOA-based systems (particularly ESBs) during the last five to eight years. So the question is how do you add a semantic layer to existing SOA applications in order to solve specific problems without disrupting these sometimes already fragile systems.

Anonymous said...

I apologize for the newbie question, but you all appear to be talking about the right topics. With regard to Transparency and Open Government, what about the situation where you have a huge government entity that has terabytes of contract/procurement information it currently makes available in flat ASCII files? Wouldn't it make more sense (i.e., be more accessible to more people in more ways) to make that data available as XML-coded information and let the semantic web, as a community, take it and create products for distribution? (NOTE: I don't claim any real understanding about how the XML thing works when you are talking about exporting hundreds of thousands of procurement actions per month from a legacy database system.)

Joel Amoussou said...

It's actually a good thing for the SemWeb community to take care of lifting some of the XML datasets into RDF and Linked Data. This is already happening with data.gov datasets (see http://data-gov.tw.rpi.edu/wiki/Main_Page).

But I also believe that in the long term, the government should embrace Linked Data principles.