Turbo-Charging Applications with Mid-Tier Distributed Caching
Fast and predictable data access

Caching Topologies
Depending on data usage patterns such as data volatility, frequency of update, and expiration requirements, many different topologies or configurations must be available for use. For example, for relatively small volumes of read-only or rarely updated data, a brute force "replicate everywhere" topology may work. In contrast, large amounts of volatile data (which may grow) may require a topology that will dynamically spread the load over the members in the cluster and repartition when new members are added. A combination of these topologies could also be used, which would provide the benefits of both in-memory access and the ability to grow and load balance the data across the cluster.

The key here is that the developer shouldn't have to code the clustering, replication, data backup, or parallel processing logic required to support the different topology types. The developer should code to a standard API and concentrate on writing business logic. The topology underneath should be changeable declaratively, through configuration, without any changes to the code written against that API.
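As a rough illustration of that separation, the sketch below simulates a grid inside a single JVM. Every name in it (Topology, ReplicatedTopology, PartitionedTopology, ToyGrid) is hypothetical and is not part of any vendor's API. Application code calls only put and get; whether an entry is copied to every member or owned by a single member is decided by the topology object passed in at construction, which stands in for declarative configuration. Repartitioning when members join is deliberately left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical strategy: which simulated members hold a copy of a key. */
interface Topology {
    List<Integer> ownersOf(Object key, int memberCount);
}

/** "Replicate everywhere": every member holds every entry. */
class ReplicatedTopology implements Topology {
    public List<Integer> ownersOf(Object key, int memberCount) {
        List<Integer> owners = new ArrayList<>();
        for (int i = 0; i < memberCount; i++) {
            owners.add(i);
        }
        return owners;
    }
}

/** Partitioned: exactly one member owns each key, chosen by its hash. */
class PartitionedTopology implements Topology {
    public List<Integer> ownersOf(Object key, int memberCount) {
        return Collections.singletonList(Math.floorMod(key.hashCode(), memberCount));
    }
}

/** Toy in-process "grid": one HashMap per simulated member. */
class ToyGrid<K, V> {
    private final List<Map<K, V>> members = new ArrayList<>();
    private final Topology topology;

    ToyGrid(int memberCount, Topology topology) {
        this.topology = topology;
        for (int i = 0; i < memberCount; i++) {
            members.add(new HashMap<>());
        }
    }

    /** Writes the entry to every member the topology names as an owner. */
    void put(K key, V value) {
        for (int member : topology.ownersOf(key, members.size())) {
            members.get(member).put(key, value);
        }
    }

    /** Reads from the first owner; the caller never sees which member that is. */
    V get(K key) {
        int owner = topology.ownersOf(key, members.size()).get(0);
        return members.get(owner).get(key);
    }

    /** Total number of stored copies across all members. */
    int storedCopies() {
        int copies = 0;
        for (Map<K, V> member : members) {
            copies += member.size();
        }
        return copies;
    }
}
```

With three members and two entries, the replicated grid stores six copies and the partitioned grid stores two, yet the calling code is identical in both cases.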

Data Source Integration
When using a mid-tier data grid there are a number of usage patterns for data. Some data will be populated directly from the applications themselves. However, for applications that require data to be cached, there should be a consistent way of loading data from back-end data sources in the case of a cache-miss - that is, when the data being queried isn't available in the grid but does exist in a back-end data store. The developer shouldn't have to write code to deal with it.

Vendors with robust solutions in this space frequently implement them using approaches that let the data source plug transparently into the grid. For example, in the case of Oracle Coherence, loading directly from the database is done declaratively by attaching a CacheStore implementation to the cache configuration. Developers can either implement a standard interface that calls the back-end data store for queries and updates, or use out-of-the-box integration with persistence solutions such as JBoss Hibernate or Oracle TopLink.

When either of these methods is used, if the data doesn't exist in the data grid, the solution will automatically delegate the data request to the CacheStore implementation, which then retrieves it from the back-end stores.
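The read-through behaviour described here can be sketched in plain Java. The BackingStore interface below is a hypothetical stand-in for the plug-in point and is not the Coherence CacheStore interface; ReadThroughCache and CountingStore are likewise illustrative names. The point is that the caller only ever calls get, and the miss handling lives in the cache.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical plug-in point, loosely modelled on the idea of a cache store. */
interface BackingStore<K, V> {
    V load(K key);              // called by the cache on a miss
    void store(K key, V value); // called by the cache on a write
}

/** Cache that delegates misses to the backing store on the caller's behalf. */
class ReadThroughCache<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();
    private final BackingStore<K, V> store;

    ReadThroughCache(BackingStore<K, V> store) {
        this.store = store;
    }

    /**
     * Returns the cached value, loading it from the backing store on a miss.
     * A null result from the store is returned but not cached.
     */
    V get(K key) {
        return entries.computeIfAbsent(key, store::load);
    }

    /** Writes to the backing store first, then to the cache. */
    void put(K key, V value) {
        store.store(key, value);
        entries.put(key, value);
    }
}

/** In-memory stand-in for a database that counts how often it is read. */
class CountingStore implements BackingStore<String, String> {
    final Map<String, String> rows = new HashMap<>();
    int loads = 0;

    public String load(String key) {
        loads++;
        return rows.get(key);
    }

    public void store(String key, String value) {
        rows.put(key, value);
    }
}
```

Reading the same key twice through ReadThroughCache reaches CountingStore only once; the second read is served from memory.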

The ability to refresh data objects in the data grid through time-triggered or other expiry mechanisms is especially useful for those who use the data grid as a system of record and the official place for accessing data. With a formal mechanism such as this built into the solution, the infrastructure itself enforces the expiry and eviction policies defined by an administrator and refreshes the data grid accordingly. Ideally, the solution shouldn't require customers to poll their back-end systems for data changes or to schedule jobs that refresh the data grid; such approaches are neither scalable nor manageable.
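One simple form of time-triggered refresh is a time-to-live check on read. The sketch below is a minimal illustration under assumed names (ExpiringCache, Slot, and the constructor parameters are all hypothetical); a real data grid would typically take the expiry period from configuration rather than a constructor argument. The clock is passed in so that the behaviour can be exercised without waiting.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Cache whose entries are reloaded once they are older than a time-to-live. */
class ExpiringCache<K, V> {

    /** A cached value together with the time at which it was loaded. */
    private static final class Slot<T> {
        final T value;
        final long loadedAt;

        Slot(T value, long loadedAt) {
            this.value = value;
            this.loadedAt = loadedAt;
        }
    }

    private final Map<K, Slot<V>> slots = new HashMap<>();
    private final Function<K, V> loader; // reads from the back-end store
    private final long ttlMillis;        // expiry policy; assumed to come from an administrator
    private final LongSupplier clock;    // current time in milliseconds

    ExpiringCache(Function<K, V> loader, long ttlMillis, LongSupplier clock) {
        this.loader = loader;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Returns the cached value, reloading it if it is missing or expired. */
    synchronized V get(K key) {
        long now = clock.getAsLong();
        Slot<V> slot = slots.get(key);
        if (slot == null || now - slot.loadedAt >= ttlMillis) {
            slot = new Slot<>(loader.apply(key), now);
            slots.put(key, slot);
        }
        return slot.value;
    }
}
```

In production code the clock argument would normally be System::currentTimeMillis. The caller never polls the back end or schedules a refresh job; the policy is applied inside the cache.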

Sending the Processing to the Data
The advantage of using a distributed cache or data grid topology is that processing as well as data can be scaled when adding more resources to the grid. In a traditional use case in which we need to read data and do processing on it within a Map (for example, giving a raise to employees), we may have used something similar to the following (ignoring error handling, etc.):

// Iterate over the map's values directly and raise each salary by 10%.
for (Employee emp : map.values()) {
    emp.setSalary(emp.getSalary() * 1.1);
}

This (which could be written dozens of ways, of course) would achieve the desired result. However, if the Map weren't local to the Java process but held on, or distributed across, other servers, there would be a lot of network traffic to and from the client. The processing would also be sequential (that is, one entry processed at a time), and rewriting it to run in parallel over multiple JVMs, with the coordination that concurrent processing demands, would require a considerable amount of work.

Taking advantage of grid processing and the ability of the data caching topology to load balance and partition data across multiple servers, it makes sense to send the processing to where the data is, rather than bringing the data to the client for processing. A common approach (this example is specific to Oracle Coherence) is to deploy code in the grid that performs the logic local to the nodes in the grid, rather than requiring the programmer to bring all the data to the client.

The following example shows how this approach could be used to raise the salary of all employees. First create a class to process the data:

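    import com.tangosol.util.InvocableMap;
    import com.tangosol.util.processor.AbstractProcessor;

    public class RaiseSalary extends AbstractProcessor {
        public RaiseSalary() {
        }

        // Runs on the grid member that owns the entry.
        public Object process(InvocableMap.Entry entry) {
            Employee emp = (Employee) entry.getValue();
            emp.setSalary(emp.getSalary() * 1.10);
            entry.setValue(emp); // write the updated object back into the cache
            return null;
        }
    }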

Now invoke this across the Map (data grid):

    empCache.invokeAll(AlwaysFilter.INSTANCE, new RaiseSalary());

Sending the processing to the data dramatically improves the performance of tasks such as this because now the compute activity is parallelized across the entire grid.

Figure 1 illustrates the benefits of sending the processing to the data.

With multiple nodes in the grid and data distributed across them, the processing model scales well and takes advantage of the processing capabilities of each node. Also, the fact that data doesn't need to be shipped back and forth between the client and server significantly increases the scalability and performance of such a system. By comparison, the traditional client-side approach outlined in the first example performs poorly and scales badly as the volume of data grows.
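The same idea can be mimicked inside a single JVM, which may help make the model concrete. The sketch below is not Coherence code: PartitionedStore and processAll are hypothetical names, each partition is an ordinary HashMap standing in for a grid node, and a thread pool stands in for the nodes' own processors. The function is handed to every partition and applied to the entries that live there, so no entry is copied out to the caller.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.UnaryOperator;

/** Toy partitioned store: one map and one worker thread per simulated node. */
class PartitionedStore<K, V> {
    private final List<Map<K, V>> partitions = new ArrayList<>();
    private final ExecutorService workers;

    PartitionedStore(int partitionCount) {
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new HashMap<>());
        }
        workers = Executors.newFixedThreadPool(partitionCount);
    }

    /** Picks the partition that owns a key from the key's hash. */
    private Map<K, V> partitionFor(K key) {
        return partitions.get(Math.floorMod(key.hashCode(), partitions.size()));
    }

    void put(K key, V value) {
        partitionFor(key).put(key, value);
    }

    V get(K key) {
        return partitionFor(key).get(key);
    }

    /**
     * Sends the processor to every partition and waits for all of them.
     * Each partition updates its own entries in place on its own thread.
     */
    void processAll(UnaryOperator<V> processor) {
        List<Future<?>> pending = new ArrayList<>();
        for (Map<K, V> partition : partitions) {
            pending.add(workers.submit(() -> {
                partition.replaceAll((key, value) -> processor.apply(value));
            }));
        }
        for (Future<?> task : pending) {
            try {
                task.get();
            } catch (InterruptedException | ExecutionException e) {
                throw new IllegalStateException("partition task failed", e);
            }
        }
    }

    /** Stops the worker threads so the JVM can exit. */
    void shutdown() {
        workers.shutdown();
    }
}
```

A caller would load the store with put, call processAll(salary -> salary * 1.10) once, and then call shutdown. Adding partitions adds workers, which is the in-process analogue of adding nodes to the grid.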


About Tim Middleton
Tim Middleton is a solution architect with Oracle in Perth, Western Australia. He has over 17 years of experience in the IT industry. During this time he has been involved in the design and implementation of many large and leading-edge technology projects within the government and private sectors. His focus is on providing middleware solutions around SOA, with an emphasis on architectures that are highly available, scalable and reliable. Tim also has extensive development experience with J2EE and application server-based solutions, as well as many years' experience as a DBA.
