SOA and Data Integration
The marriage of data integration and SOA could end up in divorce

First, the history. Data integration is the name vendors have adopted to replace the ETL (extract, transform, load), data cleansing, and data warehousing tools of days gone by.

These tools actually pre-date the notion of EAI (enterprise application integration), and were really the first sets of technology designed to deal with data and the use of that data for decision support (business intelligence now). They would extract large amounts of data from one or several operational data stores, clean the data, roll up the data, and put it in another data store, typically the data mart or data warehouse, for analysis. From there somebody would "mine" the data to extract relevant information, such as productivity over time, sales effectiveness over years, profit by division, you get the idea. It was a very powerful notion for its time, and it is very powerful technology today.
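To make the extract, clean, roll-up, and load steps concrete, here is a minimal sketch of one batch run. It is an illustration only, not any vendor's product: the `SALES_ROWS` records, the field names, and the dictionary standing in for the data warehouse are all assumptions.

```python
from collections import defaultdict

# Hypothetical operational records; the field names are illustrative only.
SALES_ROWS = [
    {"division": "east", "amount": "120.50"},
    {"division": "west", "amount": " 80.00 "},
    {"division": "east", "amount": "n/a"},  # dirty row, dropped during cleansing
    {"division": "west", "amount": "19.50"},
]


def extract(store):
    """Extract: read every row from an operational data store."""
    return list(store)


def clean(rows):
    """Cleanse: keep only rows whose amount parses as a number."""
    cleaned = []
    for row in rows:
        try:
            amount = float(row["amount"].strip())
        except ValueError:
            continue
        cleaned.append({"division": row["division"], "amount": amount})
    return cleaned


def roll_up(rows):
    """Roll up: total the amount per division."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["division"]] += row["amount"]
    return dict(totals)


def load(warehouse, totals):
    """Load: write the aggregates into the target store (a dict here)."""
    warehouse.update(totals)
    return warehouse


def run_batch(store, warehouse):
    """One batch run: extract, clean, roll up, load."""
    return load(warehouse, roll_up(clean(extract(store))))
```

Calling `run_batch(SALES_ROWS, {})` fills the stand-in warehouse with one total per division. The point to notice is that the warehouse is only as fresh as the last batch run.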

When I was first working on the notion of EAI, I used these tools, and thought they had value. However, I also understood that the value of integration was really about the movement of information in real-time, system-to-system, in patterns that resemble actual business processing (sales to inventory to accounting, etc.), and while there was some business intelligence there it was really in the domain of real-time monitoring (BAM). Thus, you really had two different threads of technology born...ETL (now data integration) and EAI (now integration or SOA).

Enter SOA, and the hype around it, and everyone looking to link to it. All data integration vendors, and EAI vendors for that matter, are repositioning, retooling, and remarketing their technology as a "SOA solution." So, where is the actual fit for data integration?

Unlike SOA, which can support real-time data movement, data integration (typically) provides adequate business information (data replication) without up-to-the-minute access to information. In many cases, the data is weeks, even months, old, and the data mart or data warehouse is updated through antiquated batch extract-aggregate-and-load processes. Indeed, this is the way integration is done today, using a data integration product or in most cases no product at all. When I was doing research for my last book, for instance, I found that FTP was still the primary form of data integration.

Evolving Away, and to Data
Things are changing, fortunately. SOA, and the technology that comes with it, lets data warehouse architects and developers move information...no matter where it comes from or where it's going...as quickly as they want to move it. As a result, it's not unheard of to have all participating databases and services in a SOA solution getting new data constantly, thus providing more value to those using the source and target systems existing in the SOA...including those who use them as a data warehouse or data mart.

Therefore, the rise of SOA will also lead to the rise of real-time data warehouse solutions, or could replace the notion of data warehousing altogether. For instance, we could put the data behind services (data services or abstract data services), with users leveraging up-to-the-minute information to make better business decisions using BI tools or BAM, or just snapping them into composite applications.

As mentioned, data integration products may become unnecessary over time as SOA architects craft services that abstract both operational and aggregated data, in some cases leveraging aggregated data through abstraction layers rather than replicating and changing it. This approach, if possible for your domain, is much less expensive and less complex. In essence, the SOA becomes the place to leverage services that can deal with the data layer(s) through many types of abstraction services, services that can be mixed and matched in composites to create solution instances for the SOA. SOA's value is in bringing all of these things together as a platform for business solutions.
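A minimal sketch of the data service idea follows, assuming an in-memory list stands in for the operational store. The `OrderDataService` class, its method names, and the record fields are hypothetical. The service exposes both an operational view and an aggregated view over the same live data, with no replicated copy.

```python
class OrderDataService:
    """Hypothetical data service in front of a live operational store."""

    def __init__(self, operational_store):
        # Keep a reference to the store, not a copy of its rows.
        self._store = operational_store

    def orders_for(self, customer):
        """Operational view: the current orders for one customer."""
        return [o for o in self._store if o["customer"] == customer]

    def revenue_by_customer(self):
        """Aggregated view, computed at call time rather than in a batch."""
        totals = {}
        for order in self._store:
            name = order["customer"]
            totals[name] = totals.get(name, 0) + order["total"]
        return totals


# Usage: a new order is visible to the aggregated view immediately.
store = [{"customer": "acme", "total": 100}]
service = OrderDataService(store)
store.append({"customer": "acme", "total": 50})
print(service.revenue_by_customer())
```

Computing aggregates on demand trades query-time work for freshness, which is why this only fits some domains. A large store might still need a precomputed aggregate behind the same service interface.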

Needs Coupling...Okay, Some Coupling
For a technology to truly be a SOA technology, or so I argue, it has to support the notion of coupling, as well as cohesion, and not just one or the other. This is where some data integration products fall down.

Coupling, in the context of application integration and SOA, is the binding of applications together so that they are dependent on each other, sharing the same services, methods, interfaces, and perhaps data. This is the core notion of SOA where the applications are bound by shared services, versus the simple exchange of information (using services or not).

Of course, the degree of coupling that occurs is really dependent on the SOA architect, and how she or he binds source and target systems together. In some instances systems are tightly coupled, meaning they're dependent on each other. In other instances, they are loosely coupled, meaning that they're more independent. It doesn't matter if you're doing this through Web Services or other mechanisms; you're typically going to have to make these architectural tradeoffs within the notion of coupling.
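The tradeoff can be sketched in a few lines. Everything here is an assumption made for illustration: the class names, the credit limit rule, and the in-memory queue standing in for a messaging layer. Two applications are bound to one shared service (tighter coupling), while a third only consumes messages (looser coupling).

```python
from collections import deque


class CreditCheckService:
    """Hypothetical shared service; the approval rule is illustrative only."""

    def __init__(self, limit):
        self.limit = limit

    def approve(self, amount):
        return amount <= self.limit


class SalesApp:
    """Tightly coupled: depends directly on the shared service's behavior."""

    def __init__(self, credit_service):
        self.credit_service = credit_service

    def place_order(self, amount):
        return "accepted" if self.credit_service.approve(amount) else "rejected"


class AccountingApp:
    """Bound to the same service, so a rule change affects both applications."""

    def __init__(self, credit_service):
        self.credit_service = credit_service

    def can_invoice(self, amount):
        return self.credit_service.approve(amount)


class InventoryApp:
    """Loosely coupled: only consumes order messages from a queue."""

    def __init__(self):
        self.reserved = 0

    def drain(self, queue):
        # Process whatever messages have arrived; no knowledge of the sender.
        while queue:
            self.reserved += queue.popleft()["quantity"]
```

Changing `limit` on the one `CreditCheckService` instance changes the behavior of both `SalesApp` and `AccountingApp` at once, which is the benefit and the risk of sharing behavior. `InventoryApp` keeps working as long as the message shape stays the same.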

There are, of course, more pros and cons of coupling that should be considered in the context of the problem you're looking to solve. On the pros side you have the ability to bind systems by sharing behavior, and bound data, versus simply sharing information. This provides the integration solution set with the ability to share services that could be redundant to the integrated systems, thus reducing development costs. This is the reason we leverage SOAs.

Then there's the ability to tightly couple processes as well as shared behavior. This means that process integration engines, layered on top of SOA solutions, are better at binding actual behavior (functions, methods, services) than at simply moving information from place to place.

The problem is that many data integration solutions are more about information/data than about sharing services, so they're a hard fit for many SOAs. ESBs have a similar issue, though it is not as obvious. As a result, the marriage between data integration and SOA could end up in divorce if coupling is a requirement. Again, generally speaking.

About David Linthicum
Dave Linthicum is the CTO of Blue Mountain Labs, and an internationally known cloud computing and SOA expert. He is a sought-after consultant, speaker, and blogger. In his career, Dave has formed or enhanced many of the ideas behind modern distributed computing including EAI, B2B Application Integration, and SOA, approaches and technologies in wide use today. In addition, he is the Editor-in-Chief of SYS-CON's Virtualization Journal. For the last 10 years, he has focused on the technology and strategies around cloud computing, including working with several cloud computing startups. His industry experience includes tenure as CTO and CEO of several successful software and cloud computing companies, and upper-level management positions in Fortune 500 companies. In addition, he was an associate professor of computer science for eight years, and continues to lecture at major technical colleges and universities, including the University of Virginia and Arizona State University. He keynotes at many leading technology conferences, and has several well-read columns and blogs. Linthicum has authored 10 books, including the ground-breaking "Enterprise Application Integration" and "B2B Application Integration." You can reach him at david@bluemountainlabs.com.
