Feature Turbo-Charging Applications with Mid-Tier Distributed Caching
Fast and predictable data access
By: Tim Middleton
Feb. 21, 2008 12:00 PM
Caching Topologies
The key here is that the developer shouldn't have to code the clustering, replication, data backup, or parallel processing logic required to support the different topology types. The developer should code to a standard API and concentrate on writing business logic. The underlying topology should be changeable declaratively, through configuration alone, without any changes to the code already written against that API.
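As a rough illustration of coding to a standard API while the topology is chosen by configuration, the sketch below writes the business logic against java.util.Map only and picks the implementation from a configuration value. The SketchCacheFactory class, the cache.topology property, and the "local" and "concurrent" names are all hypothetical stand-ins; a real data grid would offer its own factory and topology names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical factory standing in for a data grid's cache factory.
final class SketchCacheFactory {
    // Chooses a Map implementation from a configuration value.
    // The topology names here are illustrative, not real product settings.
    static <K, V> Map<K, V> getCache(String topology) {
        switch (topology) {
            case "local":
                return new HashMap<>();
            case "concurrent":
                return new ConcurrentHashMap<>();
            default:
                throw new IllegalArgumentException("Unknown topology: " + topology);
        }
    }
}

final class TopologyDemo {
    // Business logic depends only on the Map interface,
    // so it is unaffected by which implementation backs the cache.
    static int countLongNames(Map<Integer, String> names) {
        int count = 0;
        for (String name : names.values()) {
            if (name.length() > 3) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Assumed property name; switch topology with -Dcache.topology=concurrent
        String topology = System.getProperty("cache.topology", "local");
        Map<Integer, String> cache = SketchCacheFactory.getCache(topology);
        cache.put(1, "Ann");
        cache.put(2, "Robert");
        System.out.println(countLongNames(cache));
    }
}
```

Changing the property swaps the implementation without touching countLongNames, which is the separation the paragraph above argues for.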
Data Source Integration
Vendors with robust solutions in this space frequently implement them using approaches that let the data source plug transparently into the grid. For example, in the case of Oracle Coherence, loading directly from the database is done declaratively by attaching a CacheStore implementation in the deployment configuration. Developers can either implement a standard interface that calls the back-end data store for query and update, or use out-of-the-box integration with persistence solutions such as JBoss Hibernate or Oracle TopLink. With either method, if the data doesn't exist in the data grid, the solution will automatically delegate the data request to the CacheStore implementation, which then retrieves it from the back-end stores.
The capability of refreshing the data objects in the data grid based on time-triggered or other data expiry mechanisms is especially useful for those who use the data grid as a system of record and the official place for accessing data. Having a formal mechanism such as this built into the solution lets the infrastructure enforce expiry and other data eviction policies, refreshing the data grid according to policies defined by an administrator. Ideally the solution shouldn't require customers to poll their back-end system for changes in data or to schedule jobs to refresh the data grid; such approaches are neither scalable nor manageable.
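The read-through behavior described above can be sketched in plain Java. This is a minimal illustration under stated assumptions, not Coherence's actual CacheStore interface: BackingStore, ReadThroughCache, and CountingStore are hypothetical names, and the "database" is an in-memory map that counts how often it is asked to load.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a cache store plug-in (not a real product interface).
interface BackingStore<K, V> {
    V load(K key);               // read from the back-end data source
    void store(K key, V value);  // write to the back-end data source
}

// Cache that delegates misses to the store (read-through)
// and pushes updates to it (write-through).
final class ReadThroughCache<K, V> {
    private final Map<K, V> entries = new HashMap<>();
    private final BackingStore<K, V> store;

    ReadThroughCache(BackingStore<K, V> store) {
        this.store = store;
    }

    V get(K key) {
        V value = entries.get(key);
        if (value == null) {
            // Cache miss: ask the back-end store and remember the answer.
            value = store.load(key);
            if (value != null) {
                entries.put(key, value);
            }
        }
        return value;
    }

    void put(K key, V value) {
        store.store(key, value);
        entries.put(key, value);
    }
}

final class ReadThroughDemo {
    // In-memory "database" that counts load calls.
    static final class CountingStore implements BackingStore<String, String> {
        final Map<String, String> rows = new HashMap<>();
        int loads = 0;

        public String load(String key) {
            loads++;
            return rows.get(key);
        }

        public void store(String key, String value) {
            rows.put(key, value);
        }
    }

    public static void main(String[] args) {
        CountingStore db = new CountingStore();
        db.rows.put("emp:1", "Alice");
        ReadThroughCache<String, String> cache = new ReadThroughCache<>(db);
        System.out.println(cache.get("emp:1")); // first read goes to the store
        System.out.println(cache.get("emp:1")); // second read is served from the cache
        System.out.println("loads = " + db.loads);
    }
}
```

The calling code only ever talks to the cache; whether a value came from memory or from the back-end store is invisible to it. Expiry and eviction are left out of the sketch to keep it short.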
Sending the Processing to the Data
Iterator<Employee> iter = map.values().iterator();
This approach (which could be written dozens of ways, of course) would achieve the desired result. But if the Map weren't local to the Java process, that is, if it were held on another server, there would be a lot of network traffic to and from the client. The processing would also be serial (one entry processed at a time), and rewriting it to run in parallel over multiple JVMs, taking into consideration the coordination of the concurrent processing, would require a considerable amount of work.
Taking advantage of grid processing and the ability of the data caching topology to load balance and partition data across multiple servers, it makes sense to send the processing to where the data is, rather than bringing the data to the client for processing. A common approach (this example is specific to Oracle Coherence) is to deploy code in the grid that performs the logic local to the nodes in the grid, rather than requiring the programmer to bring all the data to the client. The example shows how this approach could be used to raise the salary of all employees. First create a class to process the data:
public class RaiseSalary extends AbstractProcessor {
Now invoke this across the Map (data grid):
empCache.invokeAll(AlwaysFilter.INSTANCE, new RaiseSalary());
Sending the processing to the data dramatically improves the performance of tasks such as this because the compute activity is now parallelized across the entire grid. Figure 1 illustrates the benefits of sending the processing to the data. With multiple nodes in the grid and the data partitioned across those nodes, the processing model scales well and takes advantage of the processing capabilities of each node. In addition, because data doesn't need to be shipped back and forth between the client and server, the scalability and performance of such a system increase significantly. By contrast, the traditional non-grid approach of iterating over the data on the client would perform poorly and scale badly.
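To make the idea concrete without depending on any grid product, the following sketch simulates a partitioned data set in a single JVM and runs a processor against every partition in parallel. It is not the Coherence API: SimulatedGrid, SketchProcessor, RaiseSalarySketch, the Employee class, and the 10 percent raise are all assumptions made for illustration, and each in-memory partition merely stands in for a grid node.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical employee type; salary is held in whole currency units.
final class Employee {
    private long salary;

    Employee(long salary) { this.salary = salary; }

    long getSalary() { return salary; }

    void setSalary(long salary) { this.salary = salary; }
}

// Hypothetical processor contract: logic that runs next to the data.
interface SketchProcessor<K, V> {
    void process(Map.Entry<K, V> entry);
}

// Raises one employee's salary by a fixed percentage (assumed for the example).
final class RaiseSalarySketch implements SketchProcessor<Integer, Employee> {
    private final int percent;

    RaiseSalarySketch(int percent) { this.percent = percent; }

    public void process(Map.Entry<Integer, Employee> entry) {
        Employee e = entry.getValue();
        e.setSalary(e.getSalary() + e.getSalary() * percent / 100);
    }
}

// Simulated grid: each partition stands in for one node's share of the data.
final class SimulatedGrid<K, V> {
    private final List<Map<K, V>> partitions = new ArrayList<>();

    SimulatedGrid(int partitionCount) {
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new HashMap<>());
        }
    }

    private Map<K, V> partitionFor(K key) {
        return partitions.get(Math.floorMod(key.hashCode(), partitions.size()));
    }

    void put(K key, V value) { partitionFor(key).put(key, value); }

    V get(K key) { return partitionFor(key).get(key); }

    // Runs the processor over every partition, one task per partition,
    // so the work proceeds in parallel and no entry leaves its partition.
    void invokeAll(SketchProcessor<K, V> processor) {
        ExecutorService pool = Executors.newFixedThreadPool(partitions.size());
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (Map<K, V> partition : partitions) {
                pending.add(pool.submit(() -> partition.entrySet().forEach(processor::process)));
            }
            for (Future<?> task : pending) {
                task.get(); // wait for each partition to finish
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while processing", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Processor failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}

final class GridDemo {
    public static void main(String[] args) {
        SimulatedGrid<Integer, Employee> grid = new SimulatedGrid<>(3);
        for (int id = 1; id <= 6; id++) {
            grid.put(id, new Employee(1000L * id));
        }
        grid.invokeAll(new RaiseSalarySketch(10));
        for (int id = 1; id <= 6; id++) {
            System.out.println(id + " -> " + grid.get(id).getSalary());
        }
    }
}
```

The client submits one small processor object and receives no employee data back; in a real grid that is where the savings in network traffic come from, while the per-partition tasks account for the parallelism.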