Java Feature — Concurrent Queries
A pattern for improving database query performance

In the ConcurrentQueryThreadImpl class, the runQuery() method first checks whether any previously submitted query threads have finished and need to be reaped. This matters because the list of running threads is capped so that too many queries can't run at once and overload the database server, so finished threads are processed and taken off the list first to make room for new ones. If there's room on the running threads list and there are queued queries waiting to be submitted (e.g., queries that previously had to wait because the list was full), they are submitted before the query passed to runQuery(), which then goes onto the end of the queue. Otherwise, if there's room on the running threads list and no queued queries, the caller's query is submitted immediately.
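
The bookkeeping described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the article's listing: BoundedQueryRunner, runningCount(), and queuedCount() are hypothetical names, and plain Runnable tasks stand in for SQL queries.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical sketch: caps the number of running threads and queues the overflow. */
class BoundedQueryRunner {
    private final int maxRunning;
    private final List<Thread> running = new ArrayList<>();
    private final Queue<Runnable> queued = new ArrayDeque<>();

    BoundedQueryRunner(int maxRunning) {
        if (maxRunning < 1) {
            throw new IllegalArgumentException("maxRunning must be at least 1");
        }
        this.maxRunning = maxRunning;
    }

    /** Reaps finished threads, then starts waiting work in arrival order. */
    synchronized void runQuery(Runnable query) {
        reapFinished();
        queued.add(query);       // the caller's query goes behind anything already waiting
        startQueuedWhileRoom();  // so older queued queries are started first
    }

    /** Blocks until every running and queued query has finished. */
    synchronized void waitForAllQueriesToComplete() {
        while (!running.isEmpty() || !queued.isEmpty()) {
            if (!running.isEmpty()) {
                try {
                    running.get(0).join();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting", e);
                }
            }
            reapFinished();
            startQueuedWhileRoom();
        }
    }

    synchronized int runningCount() { return running.size(); }

    synchronized int queuedCount() { return queued.size(); }

    /** Drops threads that have finished from the running list. */
    private void reapFinished() {
        running.removeIf(t -> !t.isAlive());
    }

    /** Starts queued work until the running list is full or the queue is empty. */
    private void startQueuedWhileRoom() {
        while (running.size() < maxRunning && !queued.isEmpty()) {
            Thread t = new Thread(queued.remove());
            t.start();
            running.add(t);
        }
    }
}
```

Adding the caller's query to the tail of the queue before draining it gives the ordering described in the text without a separate code path for the "room and no backlog" case.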

The ConcurrentQueryThreadImpl class contains a private QueryThread class that extends Thread. This class starts a new thread, runs the SQL query, and holds onto the results (or an SQLException, if one occurred) until the ConcurrentQueryThreadImpl processes the results and removes the thread from the list. See Listing 5.

Once the ConcurrentQueryThreadImpl notices that the QueryThread is finished, it calls the processResults() method of the CanResolveAConcurrentQuery interface reference that the domain object has implemented, marks the processed object as reaped via the same interface, and removes the QueryThread from the list of running threads. Besides the getInstance() method that gives visibility into the singleton class, the public user interface for this class simply consists of the runQuery() and waitForAllQueriesToComplete() methods.
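
As a rough illustration of a thread that holds its outcome until it is reaped, consider the sketch below. It is not Listing 5: SqlWork, ResultsHandler, HoldingQueryThread, and reapIfFinished() are hypothetical names, the work is a generic callback rather than a real JDBC call, and reaping is assumed to happen from a single coordinating thread.

```java
import java.sql.SQLException;

/** Hypothetical stand-in for "run one SQL query"; a real class would use JDBC here. */
interface SqlWork<T> {
    T run() throws SQLException;
}

/** Hypothetical callback, loosely modeled on the processResults() step in the text. */
interface ResultsHandler<T> {
    void processResults(T rows, SQLException failure);
}

/** Runs the work on its own thread and holds the outcome until it is reaped. */
class HoldingQueryThread<T> extends Thread {
    private final SqlWork<T> work;
    private final ResultsHandler<T> handler;
    private volatile T rows;
    private volatile SQLException failure;
    private boolean reaped;  // touched only by the coordinating thread (assumption)

    HoldingQueryThread(SqlWork<T> work, ResultsHandler<T> handler) {
        this.work = work;
        this.handler = handler;
    }

    @Override
    public void run() {
        try {
            rows = work.run();
        } catch (SQLException e) {
            failure = e;  // kept for the reaper instead of propagating on this thread
        }
    }

    /** Hands the outcome to the handler once the thread has terminated. */
    boolean reapIfFinished() {
        if (reaped) {
            return true;
        }
        if (getState() != Thread.State.TERMINATED) {
            return false;  // not started yet, or still running
        }
        handler.processResults(rows, failure);
        reaped = true;
        return true;
    }
}
```

A coordinator would call reapIfFinished() on each entry in its running list and remove the entries for which it returns true.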

A Variation Using Thread Pools
In situations where concurrent queries are used extensively, there might be some uneasiness about starting a new thread for each query and having it exit when that query completes. In such cases, I'd recommend using a thread pool that runs Callable tasks, available in the java.util.concurrent package. This approach has advantages over plain threads: a) the threads are pooled and reused, b) a task can throw an exception, and c) a task can return a result. As an exercise, I've implemented a callable thread pool version of the QueryThread class that the ConcurrentQueryThreadImpl class can use to run queries. This class, a private class named QueryThreadPool, implements the Callable interface, instantiates a thread pool sized to the maximum number of queries we want running at once, and puts the thread's main unit of work inside the call() method. The source for the QueryThreadPool class is in Listing 6.
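
For readers unfamiliar with the java.util.concurrent pieces involved, here is a minimal sketch of the general technique. It is not the QueryThreadPool from Listing 6: PooledQuerySketch and runAll() are hypothetical names, and the Callable tasks stand in for SQL queries.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch: runs Callable "queries" on a fixed-size pool. */
class PooledQuerySketch {

    /** Runs every query with at most maxConcurrent in flight; results keep submission order. */
    static <T> List<T> runAll(int maxConcurrent, List<Callable<T>> queries) {
        ExecutorService pool = Executors.newFixedThreadPool(maxConcurrent);
        try {
            List<Future<T>> futures = new ArrayList<>();
            for (Callable<T> query : queries) {
                futures.add(pool.submit(query));  // extra tasks wait in the pool's own queue
            }
            List<T> results = new ArrayList<>();
            for (Future<T> future : futures) {
                try {
                    results.add(future.get());    // blocks until this query has finished
                } catch (ExecutionException e) {
                    // the exception thrown inside call() arrives here as the cause
                    throw new IllegalStateException("query failed", e.getCause());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting", e);
                }
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

With a fixed pool, the cap on concurrent queries and the waiting queue are both handled by the executor, so the caller only has to collect the Future objects.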

To make it easier to switch between the two threading models, a simple interface was extracted from the original QueryThread implementation named IsAConcurrentQueryThreadRunner, mandating the following methods: getResultSet(), getSQLException(), and isAlive(). See IsAConcurrentQueryThreadRunner.java below.

package net.sourceforge.concurrentQuery.article.concurrent;

import java.sql.ResultSet;
import java.sql.SQLException;

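public interface IsAConcurrentQueryThreadRunner {

    // Results of the query, held until the thread is reaped.
    public ResultSet getResultSet();
    // The SQLException raised by the query, if one occurred.
    public SQLException getSQLException();
    // Whether the query is still running.
    public boolean isAlive();
}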

This interface is used on the ConcurrentHashMap lists that hold the references to the running query threads. Now it's possible to switch between the two threading models by changing a few references from QueryThread to QueryThreadPool and vice versa. Of course, a factory that creates the threading model based on a properties file would be more efficient, but that is outside the immediate scope of our discussion. The entire source for the ConcurrentQueryThreadImpl class is in Listing 7.

A Second, More Robust Implementation
To demonstrate use further, I've put together a more elaborate implementation of this pattern that builds a large object from real database queries. This database has one table that lists cities with large populations, their districts (or states), and the countries in which they reside. For this example, I have built a single object that contains a list of countries that have more than 75 cities. Each country in the CountryList object contains a list of its districts, and each district contains a list of its cities. All of this is in one big object. Once it's built, the results are printed. Below is partial output from printing the CountryList object.

=== stuff deleted ===

Country Code: USA
     District: Alabama
         city name: Birmingham, population: 242820
         city name: Huntsville, population: 158216
         city name: Mobile, population: 198915
         city name: Montgomery, population: 201568
     District: Alaska
         city name: Anchorage, population: 260283

=== stuff deleted ===

Once built, this object contains 11 countries, 350 districts each associated with its country, and 2,233 cities each associated with its district. I've implemented the solution using a concurrent query pattern that uses a factory to create a concurrent query object with the desired threading model (normal threads, callable thread pool, or runnable thread pool). Then I created a factory broker singleton class that reads the threading model, JDBC settings, and the number of connections from a properties file and invokes the proper factory to create the concurrent query object. If I use one connection, thus simulating a serialized approach, it takes about 30 seconds on average to construct this object (not including the time needed to print the results). If I use two connections concurrently, constructing the object takes only about 7.7 seconds. Using three connections gets the time down to 5.2 seconds. Your mileage may vary, and you will eventually hit a point of diminishing returns where adding more concurrent connections won't improve performance.
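
The settings half of such a factory broker might look roughly like the sketch below. Everything named here is an assumption made for illustration: the ThreadingModel enum, the ConcurrentQuerySettings class, and the concurrentquery.* property keys are hypothetical and are not taken from the sourceforge.net project.

```java
import java.util.Locale;
import java.util.Properties;

/** Hypothetical names for the three threading models mentioned in the text. */
enum ThreadingModel { THREAD, CALLABLE_POOL, RUNNABLE_POOL }

/** Hypothetical holder for the values a factory broker would read from a properties file. */
final class ConcurrentQuerySettings {
    final ThreadingModel model;
    final int maxConnections;
    final String jdbcUrl;  // may be null if the property is absent

    private ConcurrentQuerySettings(ThreadingModel model, int maxConnections, String jdbcUrl) {
        this.model = model;
        this.maxConnections = maxConnections;
        this.jdbcUrl = jdbcUrl;
    }

    /** Reads the settings, defaulting to plain threads and a single connection. */
    static ConcurrentQuerySettings from(Properties props) {
        String rawModel = props.getProperty("concurrentquery.threadingModel", "THREAD");
        ThreadingModel model = ThreadingModel.valueOf(rawModel.trim().toUpperCase(Locale.ROOT));
        int max = Integer.parseInt(props.getProperty("concurrentquery.maxConnections", "1").trim());
        if (max < 1) {
            throw new IllegalArgumentException("maxConnections must be at least 1");
        }
        return new ConcurrentQuerySettings(model, max, props.getProperty("concurrentquery.jdbcUrl"));
    }

    /** One connection means queries run one after another, as in the serialized timing above. */
    boolean isSerialized() {
        return maxConnections == 1;
    }
}
```

A broker could load a Properties object from a file with Properties.load(), pass it to from(), and then choose a factory by switching on the model field.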

Consider the CountryList domain class in Listing 8 that accepts an argument for the number of cities, builds a list of countries that have more than that number of cities, and then constructs a list of the districts in each country.

Note that the processResultSet method is defined in the ResolvableFromConcurrentQuery interface. Also, the DistrictList class, which is instantiated by the CountryList object, is a domain object that participates in concurrent queries and will invoke a CityList object, yet another concurrent query domain object. All of this happens using threaded queries, with queued queries held on lists to manage them. Notice too that in this implementation I've chosen to have the domain objects explicitly call the resolve() method of the ConcurrentQuery object rather than build a notification into the interface, as the previous implementation did with the isReaped() method. The resolve() method waits for all the running threads and queued queries to complete before continuing. The tradeoff is whether it's more feasible to have each getter in the domain object check that it has been reaped, or to have the domain objects explicitly wait to be resolved.

So, in general, a concurrent query implementation will likely have a mechanism to invoke a SQL query without waiting for the SQL results, and a way to ensure that an object is properly built before it's used: either by having the business logic explicitly wait for all results to finish after invoking some concurrent queries, or by having the domain object itself recognize that it hasn't processed its SQL results and wait for them.
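
The two styles can be contrasted in a few lines. The sketch below swaps in CompletableFuture from java.util.concurrent for the article's own thread lists, purely to keep the example short; LazyCityList, getCities(), resolve(), and isResolved() are hypothetical names, and a Supplier stands in for the SQL query.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

/** Hypothetical domain object whose data is loaded in the background. */
class LazyCityList {
    private final CompletableFuture<List<String>> pending;

    /** Starts the (stand-in) query immediately, without waiting for its results. */
    LazyCityList(Supplier<List<String>> query) {
        this.pending = CompletableFuture.supplyAsync(query);
    }

    /** Style 1: the getter itself waits if the results are not in yet. */
    List<String> getCities() {
        return pending.join();
    }

    /** Style 2: the business logic waits explicitly before using the object. */
    void resolve() {
        pending.join();
    }

    /** True once the results have arrived. */
    boolean isResolved() {
        return pending.isDone();
    }
}
```

With the first style every getter pays for a check, but callers cannot forget to wait; with the second the getters stay trivial, but the caller is responsible for calling resolve() at the right moment.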

When To Use Concurrent Queries
I wouldn't propose using a concurrent query pattern as a general rule for all database access because of resource constraints, but I believe there are many applications that could benefit from occasional use. This pattern fits most easily with POJOs that already build, execute, and process the results of their own SQL queries. The following are characteristics of applications that might benefit:

  • Database and server resources are adequate and the database server isn't already under duress.
  • Your application is already using JDBC queries.
  • Your application controls when queries are run and when the results are processed (e.g., not using an external tool for building, managing, and running queries).
  • You're not having issues with the number of connections available to the database server.
If so, then it might be feasible to implement this pattern. Remember, you can always configure the number of queries allowed to run concurrently to one, essentially running your application as a regular serialized JDBC query/result model, if resource constraints become an issue.

Conclusion
Such a simple pattern can be implemented in a few hours and the results might help a project over some bumpy performance issues. A few items worth noting that didn't seem to fit in anywhere else:

  • Concurrent queries don't have to be implemented using threads. Since most database servers are multithreaded themselves, they usually return control to the client after a query has been parsed and submitted, while the database server works on the query. If you hold the connection, you can check for the results later without having to use threads (e.g., set a timeout to zero and check for a result). Of course, the threading approach is pretty efficient and I personally like that model better. While it's entirely possible to use JDBC and hold the connection without immediately processing the result, the de facto standard for Java/JDBC development, up to this point, has been to submit queries and process results in one operation. But when using a language or platform whose threading package isn't trustworthy, this pattern can be implemented without threads. In a previous project, I implemented a variation of this pattern using C and ODBC without threads.
  • If you access a singleton concurrent query implementation from threaded clients then you might need to synchronize methods or blocks strategically in the concurrent query singleton.
  • I've never implemented this pattern with objects that insert, update or delete data, but I suppose it could be done. I've never implemented this pattern to participate in a transaction, but that too should be possible.
  • Besides building query-intensive objects faster, another potential use for this pattern could be in improving front-end user response time by pre-fetching data. For instance, suppose that after a user logs in to your application, their likely next choice would be to pull a list of active orders, view a list of products, or view their account settings. Concurrent queries could be used to build objects for all three potential choices immediately after the user logs in. By the time the user decides which option to choose, the domain objects would be immediately available, or at least closer to being available than if construction had started only after the user made a choice. Of course, an expiration date on the object would be in order in case the user takes 30 minutes to make a choice. Sure, you might end up building an object that you don't use, but I've had several instances where the perceived user response time was more valuable than the application resources. I don't like fast food restaurants that have my burger made before I actually order it, but I'm not as picky about my data.
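
The pre-fetching idea in the last item, including the expiration, could be sketched as below. This is an illustration under stated assumptions rather than part of the sample project: Prefetched is a hypothetical class, CompletableFuture is used in place of the article's own thread handling, a Supplier stands in for the queries that build the domain object, and the clock is passed in so the expiry can be demonstrated without waiting.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical holder that starts loading at construction and reloads once the data is too old. */
class Prefetched<T> {
    private final Supplier<T> loader;
    private final long maxAgeMillis;
    private final LongSupplier clock;  // e.g. System::currentTimeMillis
    private CompletableFuture<T> pending;
    private long startedAtMillis;

    Prefetched(Supplier<T> loader, long maxAgeMillis, LongSupplier clock) {
        this.loader = loader;
        this.maxAgeMillis = maxAgeMillis;
        this.clock = clock;
        startLoad();  // begin building the object before anyone asks for it
    }

    /** Returns the prefetched value, reloading first if it has expired. */
    synchronized T get() {
        if (clock.getAsLong() - startedAtMillis > maxAgeMillis) {
            startLoad();
        }
        return pending.join();  // returns at once if the load already finished
    }

    /** Records the start time and kicks off a background load. */
    private void startLoad() {
        startedAtMillis = clock.getAsLong();
        pending = CompletableFuture.supplyAsync(loader);
    }
}
```

An application might create one such holder per likely next choice right after login, for example new Prefetched<>(loader, 60_000, System::currentTimeMillis), and call get() only for the option the user actually picks.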
For More Information
All of the sources found here, plus the source for the implementation of the list of countries example, are available on sourceforge.net. Since concurrent query is more of a pattern than a packaged solution, the project on sourceforge.net is just a sample implementation intended for perusal. Sources are available for download from http://sourceforge.net/projects/concurrentquery.

The SleepyObject (ant target: run-example1) and ConcurrentSleepyObject (ant target: run-example2) are found in the article package and use a Postgres database. View the readme for instructions on creating the sleep function in Postgres. Other database servers might have a built-in function (e.g., waitfor in MS SQL) that could be substituted.

The country list example (ant target: run-ModelDriver) uses a MySQL database server. The DDL and data to create the city table are included, and instructions for loading them are also in the readme file.

About Andy Pardue
Andy Pardue is a senior software developer who has specialized in the medical software industry for over 15 years, 11 years as a telecommuter from his home office in Mesquite, Texas. He can be reached at: andypardue@gmail.com.

Does this sound familiar? You have a domain object, perhaps for reporting purposes, that's built from a ton of JDBC queries and it takes too long to load. Nothing else happens until this object is built, so it's become a bottleneck. Even worse, each of the queries is actually well tuned, so there isn't much to gain from modifying the queries themselves; there are just too many of them. You don't want to change (or can't change) your data model, so what can be done to alleviate this problem short of a major redesign? There are several options, like caching, lazy loading, and resource pooling. Another worthy option would be to implement a variation of the concurrent query pattern.

