Better Scaling with New I/O

With J2SE Version 1.4, Java finally has a scalable I/O API. Not that the old API was an absolute failure (Java's tremendous success in the application server market refutes this), but some of the old API's properties led to drastic restrictions. The worst one was the blocking I/O.

To write data over a socket, you have to call the write() method of an associated OutputStream. This call returns only after all the bytes you passed in have been written. If the send buffers are full and the connection is slow, this might take a while. If your program operates with only a single thread, other connections have to wait, even if they're ready to process write() calls. To work around this problem, you have to associate a thread with each socket. This way one thread can work while another one is blocked by an I/O-related task.

Threads aren't as heavyweight as real processes. But, depending on the underlying platform, they're not resource savers either. Each thread uses a certain amount of memory and, apart from that, many threads imply many thread-context switches, which aren't cheap.

Java needed a new API to separate the all-too-happy marriage of socket and thread. This finally happened with the new I/O API (java.nio.*).

In this article I show how you can write a simple Web server with both the new and the old API. Since HTTP, the Web's protocol, is not as trivial as it used to be, I'll implement only a few simple core features. Therefore, the programs shown here are neither secure nor protocol-conforming.

Old School Httpd
Let's look at the old-school HTTP server first (see Listing 1). (Listings 1-5 can be downloaded from www.sys-con.com/java/sourcec.cfm.) Since I need only a single class for this implementation, it's quickly explained. In the main() method, a ServerSocket is instantiated and bound to port 8080. Of course, you'd usually bind a Web server to port 80, but on Unix systems you can only do this with superuser rights. Fortunately, not everyone has them, which is why I chose to use port 8080.

public static void main(String[] args) throws IOException {
    // args[0]: number of Httpd threads that share the server socket
    ServerSocket serverSocket = new ServerSocket(8080);
    for (int i = 0; i < Integer.parseInt(args[0]); i++) {
        new Httpd(serverSocket);
    }
}
Then a number of Httpd objects are created and initialized with the shared ServerSocket. In the Httpd's constructor, I make sure all instances have a meaningful name, set a default protocol, and start the server by executing the start() method of its superclass Thread. This leads to an asynchronous call to the run() method, in which an infinite loop is located.

In this infinite loop, the ServerSocket's blocking accept() method is called. When a client connects to port 8080 of the server, the accept() method will return a socket object. Associated with each socket are an Input- and an OutputStream. Both are used in the following call to the handleRequest() method. In this method the client's request is read, checked, and an appropriate response is sent back. If it's a legitimate request, the requested file is sent back using sendFile(). If it's not, the client will receive a corresponding error message (sendError()). To keep things simple, I won't discuss the specifics of the protocol.

while (true) {
    ...
    socket = serverSocket.accept();
    ...
    handleRequest();
    ...
    socket.close();
}
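The worker loop can be sketched in a self-contained form. This is a minimal illustration, not Listing 1: the class name BlockingWorker, the fixed "hello" body, and the daemon flag are assumptions made for the example, and it uses StandardCharsets and try-with-resources, which need a JDK newer than the J2SE 1.4 discussed here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the Httpd class: one blocking worker per instance.
final class BlockingWorker extends Thread {
    private final ServerSocket serverSocket; // shared by all workers

    BlockingWorker(ServerSocket serverSocket, int number) {
        super("BlockingWorker-" + number);
        this.serverSocket = serverSocket;
        setDaemon(true); // assumption: let the JVM exit when main() returns
        start();         // run() now executes asynchronously
    }

    @Override
    public void run() {
        while (!serverSocket.isClosed()) {
            // accept() blocks until a client connects to the shared socket
            try (Socket socket = serverSocket.accept()) {
                socket.setTcpNoDelay(true);
                handleRequest(socket);
            } catch (IOException e) {
                // a failed connection (or a closed server socket) must not kill the worker
            }
        }
    }

    // Reads the request line and answers with a fixed body instead of a file.
    private static void handleRequest(Socket socket) throws IOException {
        BufferedReader in = new BufferedReader(
                new InputStreamReader(socket.getInputStream(), StandardCharsets.US_ASCII));
        String requestLine = in.readLine();
        boolean ok = requestLine != null && requestLine.startsWith("GET ");
        String response = ok
                ? "HTTP/1.0 200 OK\r\n\r\nhello\n"
                : "HTTP/1.0 400 Bad Request\r\n\r\n";
        OutputStream out = socket.getOutputStream();
        out.write(response.getBytes(StandardCharsets.US_ASCII)); // blocks until written
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        ServerSocket serverSocket = new ServerSocket(8080);
        for (int i = 0; i < 4; i++) {
            new BlockingWorker(serverSocket, i);
        }
        System.in.read(); // keep the daemon workers alive until a key is pressed
    }
}
```

Each worker is tied up for the whole lifetime of the connection it accepted, which is exactly the property the rest of the article sets out to remove.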
Now let's think about this implementation for a second. Does it perform well? On the whole, yes. Certainly I could optimize the request parsing, as the StringTokenizer doesn't have a reputation for being extremely fast. But at least I turned off the TCP send delay (Nagle's algorithm), which is unsuitable for short connections, and the sending of the file is buffered. But even more important, all threads operate independently of each other. The native, and therefore fast, accept() method decides which thread accepts a new connection. Apart from the ServerSocket object, the threads don't share any resources that might need to be synchronized. This solution is fast but, unfortunately, not very scalable, as threads are definitely a limited resource.

Nonblocking Httpd
Let's look at another solution that uses the new I/O package. It's a bit more complicated and requires the cooperation of different threads. It consists of four classes (see Figure 1):

  1. NIOHttpd (see Listing 2)
  2. Acceptor (see Listing 3)
  3. Connection (see Listing 4)
  4. ConnectionSelector (see Listing 5)
NIOHttpd basically launches the server. Just as in Httpd, a server socket is bound to port 8080. The important difference is that this time I use a java.nio.channels.ServerSocketChannel instead of a ServerSocket. I need to open the channel with a factory method before binding it explicitly to the port using the bind() method. Then I instantiate a ConnectionSelector and an Acceptor. In doing so, the ConnectionSelector is registered with the Acceptor. In addition, the Acceptor is provided with the ServerSocketChannel.

public static void main(String[] args) throws IOException {
    ServerSocketChannel ssc = ServerSocketChannel.open();
    ssc.socket().bind(new InetSocketAddress(8080));
    ConnectionSelector cs = new ConnectionSelector();
    new Acceptor(ssc, cs);
}
Figure 2 depicts the concurrent execution of the Acceptor and ConnectionSelector threads. To understand the interaction between the two threads, let's first take a closer look at the Acceptor. Its task is to accept incoming connections and register them with the ConnectionSelector. Already in the constructor, the superclass's start() method is called as the required infinite loop is in the run() method. In this loop a blocking accept() method is called that will eventually return a socket object - almost exactly as in Httpd. But this time it's a ServerSocketChannel's accept() method, not a ServerSocket's. Finally, with the obtained socketChannel as an argument, a connection object is created and registered with the ConnectionSelector using its queue() method.

while (true) {
    ...
    socketChannel = serverSocketChannel.accept();
    connectionSelector.queue(new Connection(socketChannel));
    ...
}
To summarize: the Acceptor does nothing but accept connections and register them with a ConnectionSelector in an endless loop.

Like Acceptor, the ConnectionSelector is also a thread. In its constructor a queue is instantiated and a java.nio.channels.Selector is opened using the factory method Selector.open(). The Selector is probably the most important part of the server. It allows me to register connections and to obtain a list of those connections that are ready for reading or writing.

After the start() method is called in the constructor, the endless loop in run() is executed. In this loop I call the Selector's select() method. This method blocks until either one of the registered connections is ready for I/O operations or the Selector's wakeup() method is called.

while (true) {
    ...
    int i = selector.select();
    registerQueuedConnections();
    ...
    // handle connections...
}
It's crucial to understand that while the ConnectionSelector thread executes select(), no Acceptor thread can register connections with the Selector, because the corresponding methods are synchronized. Therefore I use a queue, to which the Acceptor thread adds connections as needed.

public void queue(Connection connection) {
    synchronized (queue) {
        queue.add(connection);
    }
    selector.wakeup();
}
Right after queuing a connection, the Acceptor calls the Selector's wakeup() method. This causes the ConnectionSelector thread to resume execution and return from the blocking select() call. Since the Selector is not blocked anymore, the ConnectionSelector can now register the connection from the queue. It happens the following way in registerQueuedConnections():

if (!queue.isEmpty()) {
    synchronized (queue) {
        while (!queue.isEmpty()) {
            Connection connection =
                (Connection)queue.remove(queue.size()-1);
            connection.register(selector);
        }
    }
}
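The handoff between queue() and the selecting thread can be tried in isolation. The following is a minimal sketch under stated assumptions: the class name SelectorHandoff is hypothetical, Runnable tasks stand in for Connection objects, and the syntax targets a current JDK rather than J2SE 1.4.

```java
import java.io.IOException;
import java.nio.channels.ClosedSelectorException;
import java.nio.channels.Selector;
import java.util.ArrayList;
import java.util.List;

// Hypothetical miniature of the ConnectionSelector handoff: work is queued by
// a foreign thread and drained by the thread that owns the Selector.
final class SelectorHandoff extends Thread {
    private final Selector selector = Selector.open();
    private final List<Runnable> queue = new ArrayList<>();

    SelectorHandoff() throws IOException {
        super("SelectorHandoff");
        setDaemon(true);
        start();
    }

    // Called by other threads (the Acceptor in the article).
    void queue(Runnable task) {
        synchronized (queue) {
            queue.add(task);
        }
        selector.wakeup(); // makes a blocked (or the next) select() return
    }

    @Override
    public void run() {
        try {
            while (selector.isOpen()) {
                selector.select(); // blocks until I/O readiness or wakeup()
                drainQueue();
            }
        } catch (IOException | ClosedSelectorException e) {
            // selector closed or broken: end the thread
        }
    }

    private void drainQueue() {
        List<Runnable> pending;
        synchronized (queue) {
            pending = new ArrayList<>(queue); // copy under the lock
            queue.clear();
        }
        for (Runnable task : pending) {
            task.run();                       // run outside of it
        }
    }
}
```

Because wakeup() also affects the next select() call when no selection is in progress, a task queued just before the selecting thread blocks is not lost.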
Selector Registration Using Keys
At this point I have to focus on the Connection's register() method. Until now I've talked about a connection that's registered with a Selector. This is a bit simplified. In fact, a java.nio.channels.SocketChannel object is registered with a Selector, and only for specific I/O operations. The registration returns a java.nio.channels.SelectionKey. This key can be associated with an arbitrary object using its attach() method. To get from a key back to its connection, I attach the Connection object to the key. By doing so I can indirectly obtain a Connection from the Selector.

public void register(Selector selector)
        throws IOException {
    key = socketChannel.register(selector,
        SelectionKey.OP_READ);
    key.attach(this);
}
Getting back to the ConnectionSelector, the select() method's return value indicates how many connections are ready for I/O operations. If the return value is zero, I skip the rest and return to the select() call. Otherwise, I iterate over the selection keys, which I obtain as a Set by calling selectedKeys(). From the keys I get the previously attached Connection objects and call their readRequest() or writeResponse() methods. Which method is actually called depends on whether the connection was registered for read or write operations.
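The walk over the selected keys can be sketched without a network connection. In this minimal example, which is not taken from Listing 5, a java.nio.channels.Pipe stands in for a client socket and a String stands in for the attached Connection; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class SelectedKeysDemo {
    // Runs one select() pass and returns the attachments of all readable keys.
    static List<Object> readableAttachments(Selector selector, long timeoutMillis)
            throws IOException {
        List<Object> ready = new ArrayList<>();
        if (selector.select(timeoutMillis) == 0) {
            return ready; // nothing ready: the caller goes back to select()
        }
        Iterator<SelectionKey> it = selector.selectedKeys().iterator();
        while (it.hasNext()) {
            SelectionKey key = it.next();
            it.remove(); // handled keys must be removed from the selected set by hand
            if (key.isValid() && key.isReadable()) {
                ready.add(key.attachment());
            }
        }
        return ready;
    }

    public static void main(String[] args) throws IOException {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open(); // stand-in for a client connection
            pipe.source().configureBlocking(false);
            SelectionKey key = pipe.source().register(selector, SelectionKey.OP_READ);
            key.attach("connection-1"); // the article attaches a Connection here

            pipe.sink().write(ByteBuffer.wrap(new byte[] {42}));
            System.out.println(readableAttachments(selector, 1000));
            pipe.sink().close();
            pipe.source().close();
        }
    }
}
```

A real server would dispatch on key.isReadable() and key.isWritable() and call the matching method of the attached object instead of collecting attachments in a list.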

This eventually brings me back to the Connection class. It represents the connection and handles all the protocol's specifics. In its constructor the provided SocketChannel is set to nonblocking mode. This is essential for the server. Then a couple of default values are set and the buffer requestLineBuffer is allocated. As the allocation of direct buffers is somewhat expensive and I'm using a new buffer for each connection, I use java.nio.ByteBuffer.allocate() instead of ByteBuffer.allocateDirect(). If I reuse the buffer, a direct buffer could prove to be more efficient.

public Connection(SocketChannel socketChannel)
        throws IOException {
    this.socketChannel = socketChannel;
    ...
    socketChannel.configureBlocking(false);
    requestLineBuffer = ByteBuffer.allocate(512);
    ...
}
After all initializations are done and the SocketChannel is ready for reading, the readRequest() method is called by the ConnectionSelector. Using socketChannel.read(requestLineBuffer), all available bytes are read into the buffer. If the full line can't be read yet, I return to the calling ConnectionSelector and thus allow another connection to take over. However, once the whole line has been read, it's time to interpret the request just as I did in Httpd. If it's a legitimate request, I create a java.nio.channels.FileChannel for the requested file and call the method prepareForResponse().
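The "return and try again later" behavior hinges on detecting whether a complete line has arrived. A minimal sketch of that check follows; the class RequestLineReader and its method are hypothetical, and in a server the put() calls of the example would be replaced by socketChannel.read(requestLineBuffer).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class RequestLineReader {
    // Returns the request line without its terminator once a '\n' has arrived,
    // or null if more bytes are needed. The buffer is expected to be in fill
    // mode: its position equals the number of bytes received so far.
    static String tryReadLine(ByteBuffer buffer) {
        for (int i = 0; i < buffer.position(); i++) {
            if (buffer.get(i) == '\n') { // absolute get: position stays untouched
                int end = (i > 0 && buffer.get(i - 1) == '\r') ? i - 1 : i;
                byte[] line = new byte[end];
                ByteBuffer view = buffer.duplicate(); // shares content, own position/limit
                view.position(0);
                view.limit(end);
                view.get(line);
                return new String(line, StandardCharsets.US_ASCII);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ByteBuffer requestLineBuffer = ByteBuffer.allocate(512);
        // first chunk: the line is still incomplete
        requestLineBuffer.put("GET /index".getBytes(StandardCharsets.US_ASCII));
        System.out.println(tryReadLine(requestLineBuffer));
        // second chunk completes the line
        requestLineBuffer.put(".html HTTP/1.0\r\n".getBytes(StandardCharsets.US_ASCII));
        System.out.println(tryReadLine(requestLineBuffer));
    }
}
```

Since the check never blocks, the selecting thread can move on to another connection whenever it returns null.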

private void prepareForResponse() throws IOException {
    StringBuffer responseLine = new StringBuffer(128);
    ...
    responseLineBuffer = ByteBuffer.wrap(
        responseLine.toString().getBytes("ASCII")
    );
    key.interestOps(SelectionKey.OP_WRITE);
    key.selector().wakeup();
}
prepareForResponse() builds the response line and (if necessary) headers as well as error messages, and writes this data to responseLineBuffer. This ByteBuffer is a thin wrapper around a byte array that was created using the factory method ByteBuffer.wrap(byte[]). After generating the data that I want to write, I need to notify the ConnectionSelector that from now on I want to write data rather than read it. This is achieved by calling the selection key's method interestOps(SelectionKey.OP_WRITE). To guarantee that the selector quickly realizes the connection's change of interest, I call its wakeup() method.
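The two steps, wrapping the response bytes and switching the key's interest set, can be observed in a small standalone sketch. The class InterestSwitchDemo and its helper methods are hypothetical, and the SocketChannel is deliberately never connected, because only the key's bookkeeping is inspected.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;

final class InterestSwitchDemo {
    // Wraps the status line in a ByteBuffer; wrap() does not copy the array.
    static ByteBuffer responseLine(int status, String reason) {
        String line = "HTTP/1.0 " + status + " " + reason + "\r\n\r\n";
        return ByteBuffer.wrap(line.getBytes(StandardCharsets.US_ASCII));
    }

    // Replaces the key's interest set so that later selections report writability.
    static void switchToWrite(SelectionKey key) {
        key.interestOps(SelectionKey.OP_WRITE);
        key.selector().wakeup();
    }

    public static void main(String[] args) throws IOException {
        try (Selector selector = Selector.open();
             SocketChannel channel = SocketChannel.open()) { // never connected
            channel.configureBlocking(false);
            SelectionKey key = channel.register(selector, SelectionKey.OP_READ);
            switchToWrite(key);
            System.out.println(key.interestOps() == SelectionKey.OP_WRITE);
            System.out.println(responseLine(200, "OK").remaining());
        }
    }
}
```

Note that interestOps(int) replaces the whole interest set, so after the call the key is no longer selected for reading.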

Now the ConnectionSelector calls the connection's writeResponse() method. First, the responseLineBuffer is written to the socket channel. If the entire content of the buffer can be written, and if I still have to send the requested file, I call the transferTo() method of the FileChannel that I opened before. transferTo() potentially transfers data very efficiently from a file to a channel. How efficiently depends on the underlying operating system. In any case, only as many bytes are transferred as can be written to the target channel without blocking. To be on the safe side and to ensure fairness between connections, I set an upper limit of 64KB.
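A capped transferTo() call can be sketched as follows. This is an illustration under assumptions, not the code of Listing 4: the class ChunkedTransfer is hypothetical, and an in-memory channel replaces the socket channel so the example runs without a network. It uses java.nio.file, which needs a JDK newer than J2SE 1.4.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class ChunkedTransfer {
    static final long MAX_CHUNK = 64 * 1024; // upper limit per call

    // Sends at most MAX_CHUNK bytes starting at 'position' and returns the new
    // position. transferTo() may send fewer bytes than requested.
    static long transferChunk(FileChannel file, long position, WritableByteChannel target)
            throws IOException {
        long remaining = file.size() - position;
        long sent = file.transferTo(position, Math.min(MAX_CHUNK, remaining), target);
        return position + sent;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("chunked", ".bin");
        Files.write(tmp, new byte[150_000]);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (FileChannel file = FileChannel.open(tmp, StandardOpenOption.READ);
             WritableByteChannel target = Channels.newChannel(sink)) {
            long position = 0;
            int calls = 0;
            while (position < file.size()) {
                position = transferChunk(file, position, target);
                calls++;
            }
            System.out.println(calls + " calls, " + sink.size() + " bytes");
        } finally {
            Files.delete(tmp);
        }
    }
}
```

In the server there would be no inner loop: each time the selector reports the connection as writable, one chunk is sent, the position is remembered, and control returns to the selector so other connections get their turn.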

If all data is transferred, close() does the clean-up work. Here, the deregistering of the Connection is important. This is achieved by calling the selection key's cancel() method.

public void close() {
    ...
    if (key != null) key.cancel();
    ...
}
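What cancel() does to the Selector's bookkeeping can be observed directly. In this minimal sketch the class CancelDemo is hypothetical and a Pipe again stands in for the socket channel: the key becomes invalid at once, but the Selector drops it only during its next selection operation.

```java
import java.io.IOException;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

final class CancelDemo {
    // Cancels the key and reports how many keys the selector tracks before
    // and after the next selection operation.
    static int[] cancelAndCount(Selector selector, SelectionKey key) throws IOException {
        key.cancel();                      // key.isValid() is false from here on
        int before = selector.keys().size();
        selector.selectNow();              // cancelled keys are deregistered here
        int after = selector.keys().size();
        return new int[] {before, after};
    }

    public static void main(String[] args) throws IOException {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            pipe.source().configureBlocking(false);
            SelectionKey key = pipe.source().register(selector, SelectionKey.OP_READ);
            int[] counts = cancelAndCount(selector, key);
            System.out.println(key.isValid() + " " + counts[0] + " " + counts[1]);
            pipe.sink().close();
            pipe.source().close();
        }
    }
}
```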
Again I wonder: How does this realization perform? And again I can answer: it performs well.

In principle, one Acceptor and one ConnectionSelector are sufficient to keep an arbitrary number of connections open. Thus this realization shines in the category of scalability. But as the two threads have to communicate through the synchronized queue() method, they might block each other. There are two ways out of this dilemma:

  1. A better realization of the queue
  2. Multiple Acceptor/ConnectionSelector pairs
One solution could be realized by using a LinkedQueue (see Concurrent Programming in Java by Doug Lea). This data structure is synchronized with two independent locks - one for the head and one for the tail. This ensures that adding and removing threads don't block each other. Only if the queue is empty is there a possibility of mutual blocking, but this can be avoided with an extra check.
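A queue with separate head and tail locks can be sketched in a few lines. The class below is a hypothetical illustration in the spirit of the LinkedQueue mentioned above, not Doug Lea's code; it keeps a dummy node at the head so that producers and consumers never touch the same field under different locks.

```java
// Hypothetical two-lock queue: put() and poll() use independent locks.
final class TwoLockQueue<E> {
    private static final class Node<E> {
        E value;
        volatile Node<E> next; // written by put(), read by poll()
        Node(E value) { this.value = value; }
    }

    private final Object putLock = new Object();
    private final Object pollLock = new Object();
    private Node<E> head = new Node<>(null); // dummy node, guarded by pollLock
    private Node<E> tail = head;             // guarded by putLock

    void put(E value) {
        Node<E> node = new Node<>(value);
        synchronized (putLock) {
            tail.next = node; // volatile write publishes the node to poll()
            tail = node;
        }
    }

    // Returns the oldest element, or null if the queue is empty.
    E poll() {
        synchronized (pollLock) {
            Node<E> first = head.next;
            if (first == null) {
                return null;
            }
            E value = first.value;
            first.value = null; // 'first' becomes the new dummy node
            head = first;
            return value;
        }
    }

    public static void main(String[] args) {
        TwoLockQueue<String> queue = new TwoLockQueue<>();
        queue.put("a");
        queue.put("b");
        System.out.println(queue.poll() + queue.poll() + queue.poll());
    }
}
```

On Java 5 and later, java.util.concurrent.ConcurrentLinkedQueue is a ready-made non-blocking alternative; it was not yet available in the J2SE 1.4 discussed here.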

In comparison to this elegant approach, my second solution qualifies for the "brute force" category. The load is balanced across multiple Acceptor/ConnectionSelector pairs; the synchronization problem isn't solved, but it is somewhat reduced. Unfortunately, this causes additional costs for context switches. Even so, fewer threads are needed than in Httpd.

One disadvantage of NIOHttpd, in comparison to Httpd, is that for each request a new Connection object with buffers is created. The garbage collector has to clean these up, which burns additional CPU cycles. How large these extra costs are depends on the VM. However, Sun doesn't tire of emphasizing that with Hotspot, short-lived objects are not a problem anymore.

Comparative Number Games
How much better does NIOHttpd scale than Httpd? Let's play with a couple of numbers, but before I go in medias res, be warned: the formulas and the numbers I'm going to find are highly speculative. Only the performance of the concepts is estimated. Important context variables like thread synchronization, context switches, paging, hard disk speed, and caches are not considered.

First I estimate how long it takes to process r simultaneous requests for files with size s bytes, if the client bandwidth is b bytes/second. In the case of Httpd, this obviously depends directly on the number of threads t, as only t requests can be processed at a time. I assume that a corresponding formula looks like Formula 1. c is the constant cost for parsing, etc., that has to be paid for every request. In addition, I assume I can read data faster from the disk than I can write it to the socket, my bandwidth is greater than the sum of the clients' bandwidth, and the CPU is not fully utilized. Therefore the server-side bandwidth, caches, and hard disk speed are not part of the equation.

However, NIOHttpd is not dependent on t. The transfer time l depends mostly on the client bandwidth b, the size of the file s, and the previously mentioned constant costs c. This leads to Formula 2, which estimates the minimum transfer time for NIOHttpd.

The quotient d (see Formula 3) is of interest since it measures the relationship of the performances of NIOHttpd and Httpd.

After closer examination (...and some rows of data), it becomes apparent that if s, b, t, and c are constant, d converges toward a limit. This limit can be easily calculated using Formula 4, which gives the limit of d for r -> ∞.
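Formulas 1 to 4 are not reproduced in this copy of the text. The following is one possible reconstruction, offered as an assumption rather than as the original notation: it uses only the symbols defined above (r, s, b, t, c, l, d), treats the per-request cost c as paid serially, and lets Httpd move the file data in rounds of t requests. With c = 10 ms, t = 100, s = 1,000,000 bytes, and b = 8,000 bytes/second it yields 1/d = 126 in the limit.

```latex
\begin{align}
l_{\mathrm{Httpd}} &\approx c\,r + \frac{s}{b}\left\lceil \frac{r}{t} \right\rceil
  && \text{(assumed Formula 1)} \\
l_{\mathrm{NIOHttpd}} &\approx c\,r + \frac{s}{b}
  && \text{(assumed Formula 2)} \\
d &= \frac{l_{\mathrm{NIOHttpd}}}{l_{\mathrm{Httpd}}}
  && \text{(assumed Formula 3)} \\
\lim_{r \to \infty} d &= \frac{c}{c + \dfrac{s}{b\,t}} = \frac{c\,t}{c\,t + \dfrac{s}{b}}
  && \text{(assumed Formula 4)}
\end{align}
```

Under this reading, d equals 1 as long as r does not exceed t, and a longer connection time s/b pushes the limit of d toward zero.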

Thus, besides the number of threads and constant costs, the connection's length s/b has tremendous influence on d. The longer the connection exists, the smaller d is, and the greater the advantage of NIOHttpd over Httpd. Table 1 and Figure 3 show that NIOHttpd can be 126 times faster than Httpd, given that c = 10 ms, t = 100, s = 1 MB, and b = 8 KB/s. NIOHttpd has a big advantage if the connection stays open for a long time. If the connection is short, e.g., in a local 100 Mbit/s network, the advantage is only 10%, provided the files are large. If the files are small, the difference won't be detectable.
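The figure of 126 can be checked numerically under an assumed model. The sketch below is not taken from the article's formulas, which are missing from this copy; it assumes that Httpd needs about c*r + (s/b)*ceil(r/t) seconds and NIOHttpd about c*r + s/b seconds, with d as their quotient. The class and method names are hypothetical.

```java
final class ScalingEstimate {
    // Assumed model for Httpd: serial cost c per request, file data in rounds of t.
    static double httpdSeconds(long r, int t, double c, double s, double b) {
        long rounds = (r + t - 1) / t; // ceil(r / t)
        return c * r + (s / b) * rounds;
    }

    // Assumed model for NIOHttpd: all transfers overlap.
    static double nioSeconds(long r, double c, double s, double b) {
        return c * r + s / b;
    }

    // Quotient d for r simultaneous requests.
    static double d(long r, int t, double c, double s, double b) {
        return nioSeconds(r, c, s, b) / httpdSeconds(r, t, c, s, b);
    }

    // Limit of d for r -> infinity under the assumed model.
    static double limitOfD(int t, double c, double s, double b) {
        return (c * t) / (c * t + s / b);
    }

    public static void main(String[] args) {
        int t = 100;          // threads
        double c = 0.010;     // seconds per request
        double s = 1_000_000; // bytes per file
        double b = 8_000;     // client bandwidth in bytes/second
        for (long r : new long[] {100, 1_000, 10_000, 1_000_000}) {
            System.out.printf("r=%d  d=%.5f%n", r, d(r, t, c, s, b));
        }
        System.out.printf("limit: d=%.5f, speedup=%.1f%n",
                limitOfD(t, c, s, b), 1 / limitOfD(t, c, s, b));
    }
}
```

With these inputs the model's limit corresponds to a factor of 126, which matches the number quoted above; that agreement supports the reconstruction but does not prove it is the original one.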

In these calculations it's assumed that the constant costs of NIOHttpd and Httpd are about the same and no new costs are introduced by the different ways the servers have been implemented. As mentioned before, this comparison only holds under ideal conditions.

This is sufficient, however, to give you the idea that either concept might be beneficial. It should be noted that most Web files are small, but HTTP 1.1 clients try to keep the connection open as long as possible (with a keep-alive or persistent connection). Often, connections that will never again transfer any data are kept open. In a server with one thread per connection this leads to an incredible waste of resources. So, especially for HTTP servers, scalability can be increased dramatically by using the new I/O API.

Conclusion
With the new I/O API you can build highly scalable servers. In comparison to the old API, it's a bit more complex and requires a better understanding of multithreading and synchronization. Also, the documentation needs improvement. But if you've gotten over these hurdles, the new API proves to be a useful and necessary improvement of the Java 2 platform.

References

  • HTTP 1.1: www.w3.org/Protocols/rfc2616/rfc2616.html
  • Lea, D. (1999). Concurrent Programming in Java: Design Principles and Patterns. Second Edition. Addison-Wesley. http://gee.cs.oswego.edu/dl/cpj

About Hendrik Schreiber
Hendrik Schreiber develops data synchronization solutions utilizing SyncML and J2ME/J2EE for Nexthaus in Raleigh, North Carolina. He is also co-author and author of two German Java-related books, published by Addison-Wesley.
