Comments
Richard Davies wrote: The UK has a good crop of technology pioneers in cloud computing - for example ElasticHosts, FlexiScale, Flexiant, OnApp - and also some strong government initiatives such as G-Cloud. We will have to see whether this kind of technical leadership converts into swift mass-market adoption or not.
Cloud Expo on Google News


2008 West
DIAMOND SPONSOR:
Data Direct
SOA, WOA and Cloud Computing: The New Frontier for Data Services
PLATINUM SPONSORS:
Red Hat
The Opening of Virtualization
GOLD SPONSORS:
Appsense
User Environment Management – The Third Layer of the Desktop
Cordys
Cloud Computing for Business Agility
EMC
CMIS: A Multi-Vendor Proposal for a Service-Based Content Management Interoperability Standard
Freedom OSS
Practical SOA” Max Yankelevich
Intel
Architecting an Enterprise Service Router (ESR) – A Cost-Effective Way to Scale SOA Across the Enterprise
Sensedia
Return on Assests: Bringing Visibility to your SOA Strategy
Symantec
Managing Hybrid Endpoint Environments
VMWare
Game-Changing Technology for Enterprise Clouds and Applications
Click For 2008 West
Event Webcasts

2008 West
PLATINUM SPONSORS:
Appcelerator
Get ‘Rich’ Quick: Rapid Prototyping for RIA with ZERO Server Code
Keynote Systems
Designing for and Managing Performance in the New Frontier of Rich Internet Applications
GOLD SPONSORS:
ICEsoft
How Can AJAX Improve Homeland Security?
Isomorphic
Beyond Widgets: What a RIA Platform Should Offer
Oracle
REAs: Rich Enterprise Applications
Click For 2008 Event Webcasts
SYS-CON.TV
Top Links You Must Click On


Managing Data Integrity in SOA and SaaS Based Environments
Techniques for managing transactions in the cloud

Data integrity is one of the most critical elements in any system. Data integrity is easily achieved in a standalone system with a single database. Data integrity in such a system is maintained via database constraints and transactions. Transactions should follow ACID (Atomicity, Consistency, Isolation, Durability) properties to ensure data integrity. Most databases support ACID transactions and can preserve data integrity.

Next in the complexity chain are distributed systems. In a distributed system, there are multiple databases and multiple applications. In order to maintain data integrity in a distributed system, transactions across multiple data sources need to be handled correctly in a fail-safe manner. This is usually done via a central global transaction manager. Each application in the distributed system should be able to participate in the global transaction via a resource manager. This is achieved using a 2-phase commit protocol as per the XA standard. Most databases and custom applications have the ability to participate in a global transaction. Many packaged applications can also participate in a global transaction via EAI adapters. In reality, in most environments, some of the applications may support participation in a global transaction via 2-phase commit, some may support only single phase commit transactions and some may not support any transaction capability at all.

Moving further up in complexity are distributed systems with a mix of on-premise and partner applications. In this case, not all applications in the system are under the control of the organization and partner application interface may not support XA. B2B integration standards such as EDI and ebXML are the primary methods of ensuring reliability and data integrity across partner systems.

Enter the world of SOA and Cloud computing, and the problem of data integrity gets magnified even more, as there is a mix of on-prem and SaaS applications exposed as services. SaaS applications are multi-tenant applications hosted by a third party. SaaS applications usually expose their functionality via XML based APIs over HTTP protocol. SOAP and REST based web services are the most common methods of implementing these APIs. Also, in SOA based environments, many on-prem applications expose their functionality via SOAP and REST web services as well. One of the biggest challenges with web services is transaction management. At the protocol level, HTTP doesn’t support transactions or guaranteed delivery, so the only option is to implement these at the API level. Although there are standards available for managing data integrity with web services such as WS-Transaction and WS-Reliability, these standards are not yet mature and not many vendors have implemented these. Most SaaS vendors expose their web services APIs without any support for transactions. Also, each SaaS application may have different levels of availability and SLA (Service Level Agreement), which further complicates management of transactions and data integrity across multiple SaaS applications. There are several techniques that can be applied to ensure data integrity in such environments.

Let’s take a simple scenario of new customer creation at a company. This company uses 2 SaaS vendors, one for Marketing and one for CRM. In addition, there is an on-premise ERP application. When a new customer places an order, the customer information needs to be sent to the Marketing service (for marketing campaigns), CRM service (for customer management) and ERP application (for order fulfillment). Both Marketing and CRM applications expose their customer creation APIs via SOAP web services over HTTP, whereas the ERP application exposes customer creation via a database API. Here is the sequence of operations in this transaction:

1. Customer creation in Marketing via SOAP web service (Doesn't support transaction)

2. Customer creation in CRM via SOAP web service (Doesn’t support transaction)

3. Customer creation in ERP via database insert (Supports transaction)

In order to maintain data integrity across the 3 applications, either all the steps should get successfully executed or none of them should get executed. In the above sequence of operations, if step 1 succeeds but step 2 fails, step 1 can’t be rolled back. If step 1 and 2 succeed but step 3 fails, steps 1 and 2 can’t be rolled back. So we have a data integrity issue at hand in various failure scenarios and customer record will exist in some systems but not in others. This is usually not acceptable in any production environment. So what can be done to handle this problem? There are several techniques that can be applied in this scenario:

Technique 1: Perform the operations that support transactions before the operations that don’t support transactions

In our example, step 3 should be moved to the beginning as follows:

1. Customer creation in ERP via database insert (Supports transaction)

2. Customer creation in Marketing via SOAP web service (Doesn't support transaction)

3. Customer creation in CRM via SOAP web service (Doesn’t support transaction)

With this change in the sequence of operations, if step 1 succeeds but step 2 fails, step 1 can just be rolled back. We still have a problem if step 1 and 2 succeed but step 3 fails. This is where the following techniques come in handy.

Technique 2: Use compensating transactions

In our new sequence as per technique 1, if steps 1 and 2 succeed but step 3 fails, rollback step 1 and issue a compensating transaction for step 2. Compensating transaction in this case will be to delete the customer. Of course, for this to work, the Marketing SaaS application needs to provide a “delete customer” API which should be requested before signing up with this SaaS vendor.

Technique 3: Break the transaction into multiple decoupled transactions

In our example, step 3 can be executed in a separate asynchronous transaction using a queue. Queue can be implemented using database or some messaging technology such as JMS. In either case, both write and read of messages from queue will support transactions. Here is the sequence of operations with this change:

First transaction:

1. Customer creation in ERP via database insert (Supports transaction)

2. Post message to a queue for customer creation in CRM (Supports transaction)

3. Customer creation in Marketing via SOAP web service (Doesn't support transaction)

In the above sequence, if step 2 fails, step 1 can be rolled back and if step3 fails, steps 1 and 2 can be rolled back. Note that posting message to queue is done before customer creation in Marketing to make sure the step that doesn’t support transaction is executed last (as per Technique 1).

Second transaction:

1. Queue listener retrieves message from queue (Supports transaction)

2. Customer creation in CRM via SOAP web service (Doesn't support transaction)

In the above sequence, if step 2 fails, step 1 can be rolled back.

So by breaking a transaction into multiple smaller transactions separated by queues, we are able to achieve data integrity.

Technique 4: Execute the transaction as a long-running transaction

If all the steps of the transaction are orchestrated as separate tasks of a long-running process using a state machine or BPM (Business process management) tool, then failure at any step will result in the process not progressing to the next step. Retries can be introduced at every step to ensure that every step is successful before the whole process is finished. This is the most reliable technique of all the techniques discussed but this can also introduce latency as the process can take a long-time to finish if any application or service is down for a long-time. This solution introduces more complexity into the environment and may not be acceptable in all situations but this is also the most reliable way to design distributed transactions in services based environments.

By applying the techniques discussed in this article, most failure scenarios can be handled effectively so that data integrity is not compromised. These techniques can be applied to any distributed system but are most useful (and almost mandatory) in SOA and SaaS based environments where interfaces are exposed via web services.

About Vinay Singla
Vinay Singla is a senior technology professional with extensive experience in the SaaS and SOA space.

Enterprise Open Source Magazine Latest Stories . . .
Apache Deltacloud, the Red Hat-contributed ReSTful API that abstracts differences between clouds so services on any cloud can be managed – provided of course there’s a driver – has graduated from the Apache Foundation’s incubator and is now a full-fledged Top-Level Project (TLP). The...
With Cloud Expo 2012 New York (10th Cloud Expo) just four months away, what better time to start introducing you in greater detail to the distinguished individuals in our incredible Speaker Faculty for the technical and strategy sessions at the conference... We have technical and st...
AMD said late Tuesday that its chief sales officer Emilio Ghilardi had left the company and that CEO and president Rory Read is going to do his job while a replacement is sought. AMD didn’t say why Ghilardi left but it’s assumed Read wants his own people. Read is relatively new to th...
During the lifespan of M3 (Monitis Monitor Manager) there has always been something lacking – timers. M3 execution procedure was outlined in this previous article. The execution mentioned in the latter was a one-time-execution, whereas server monitoring requires periodic invocati...
Red Hat is putting its bought-in Gluster scale-out NAS storage technology, acquired in October, on the Amazon cloud. It’s styled Red Hat Virtual Storage Appliance for Amazon Web Services and other clouds are supposed to follow in short order.
A new episode of the screencast series is now available at the OpenNebula YouTube Channel. This screencast demonstrates the new easily-customizable self-service portal for cloud consumers. Its aim is to offer a simplified access to shared infrastructure for non-IT end users. The scree...
Subscribe to the World's Most Powerful Newsletters
Subscribe to Our Rss Feeds & Get Your SYS-CON News Live!
Click to Add our RSS Feeds to the Service of Your Choice:
Google Reader or Homepage Add to My Yahoo! Subscribe with Bloglines Subscribe in NewsGator Online
myFeedster Add to My AOL Subscribe in Rojo Add 'Hugg' to Newsburst from CNET News.com Kinja Digest View Additional SYS-CON Feeds
Publish Your Article! Please send it to editorial(at)sys-con.com!

Advertise on this site! Contact advertising(at)sys-con.com! 201 802-3021


SYS-CON Featured Whitepapers
ADS BY GOOGLE