Caching & WeakReferences
By: Lynn Monson
Aug. 1, 1998 12:00 AM
Java brought garbage collection to mainstream programming. Never before have commercial software developers been so aware of the need for, and benefit of, using a collector. Even so, the benefits of garbage collection in Java are far from being completely realized. As larger and more complex applications are built in Java, it's becoming apparent that more flexible memory management schemes are both needed and possible. In this article I explore a cache you can build on JDK 1.2 beta 3 that uses WeakReferences to cooperate with Java's garbage collector. The cache uses WeakReferences to take advantage of available system memory without clogging it with unfreeable objects. When the garbage collector reclaims memory, the cache frees those objects that are not in use, resulting in a better-performing application. Objects can be cached in memory without triggering memory thrashing.
References and WeakReferences
The basic idea is to put data inside a java.lang.ref.Reference class (or subclass) instead of referring to the data directly through a variable. The data can be retrieved as needed from the Reference class, but the application doesn't keep a permanent finger on the data. Under certain conditions, if the garbage collector determines that only the Reference object is currently using the data, it may free the data. This can happen even if the application is still using the Reference object that wraps the data. To illustrate, consider the following code:
    Reference myReference = new WeakReference( new String("some string") );
In this example the application has created a WeakReference, a subclass of the base Reference class. Inside the WeakReference is stored the string "some string." The application refers to the WeakReference through the variable myReference, but has no direct access to the string. Whenever the string is needed, the application calls the get() method on the Reference object as follows:
    String myString = (String) myReference.get(); // returns "some string"
The application can then use the string. For References to do their job, however, the application needs to release its hold on the string object by either exiting the scope where "myString" exists or explicitly setting the variable to null:
    myString = null;
When memory gets tight, the garbage collector may determine that the only reference to the string "some string" is through the Reference object "myReference". When this occurs, the string may be freed even though the Reference object that has stored the string is still in use. This is in stark contrast to the way memory management happens for any other class in Java, where transitive references are sufficient to keep an object from being freed. Reference objects are special, and are specifically handled by the garbage collector.
The garbage collector doesn't free a Reference's interior object directly, but instead invokes the clear() method on the Reference. Invoking this method is the signal to the Reference that its interior data will be freed. Once cleared, the Reference object will return null from the get() method. An application can detect that the interior data has been freed as seen here:
    String myString = (String)myReference.get();
    if ( myString == null ) {
        // the referent has been cleared
    }
It's worth noting some awkwardness in the Reference terminology. The Java language has a language construct, called a reference, that should not be confused with the Reference class. The language construct is the way a variable "refers" to its data, while the class is a first-order entity in the system. The Reference class is used to "wrap up" and manipulate the concept of a language reference, a process known as reification. The interior data managed by the Reference class is called the referent of the class, meaning the thing to which the Reference class refers.
The specific conditions under which a Reference class is cleared vary from one subclass of Reference to another. Some subclasses are silently cleared while others are not, allowing an application to take action. The basic idea, however, stays the same.
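The get()/clear() life cycle described in this section can be sketched in a few lines. This is only an illustration, written with the generic syntax of later Java releases rather than the JDK 1.2 beta API; the class name WeakReferenceDemo and the helper describe() are hypothetical, and clear() is called by hand to stand in for the garbage collector.

```java
import java.lang.ref.Reference;
import java.lang.ref.WeakReference;

// Hypothetical demo class; not part of the article's listings.
class WeakReferenceDemo {

    // Reports whether the Reference still holds its referent.
    static String describe(Reference<String> ref) {
        // Call get() once and keep the result in a local variable,
        // so the referent cannot disappear between the test and the use.
        String value = ref.get();
        return (value == null) ? "cleared" : "live: " + value;
    }

    public static void main(String[] args) {
        // new String(...) creates a distinct object instead of wrapping the literal itself.
        String strong = new String("some string");
        Reference<String> myReference = new WeakReference<>(strong);

        System.out.println(describe(myReference));
        System.out.println("still holding: " + strong);

        // Assumption for the demo: clear the Reference by hand to imitate the collector.
        strong = null;
        myReference.clear();
        System.out.println(describe(myReference));
    }
}
```

In a real application the clearing is done by the collector at a time of its choosing, so code should always be prepared for get() to return null.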
References and ReferenceQueues
A Reference can be associated with a ReferenceQueue, into which the garbage collector places the Reference when it sees that the interior object is no longer in use. The ReferenceQueue itself is monitored in two different ways. An application can poll for items in the queue or can block waiting for something to enter the queue. The latter is particularly useful in multithreaded applications when there are auxiliary system resources associated with Reference objects. An application dedicates a thread to monitoring the ReferenceQueue; when an object is placed into the queue, the thread frees up whatever resources are associated with the reference.
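The two monitoring styles, polling and blocking, might look like the sketch below. It uses the ReferenceQueue API as it exists in released versions of Java (poll() and remove(timeout)), which may differ in detail from JDK 1.2 beta 3. The class and method names are hypothetical, and enqueue() is called by hand so the demonstration does not depend on when the collector runs.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

// Hypothetical demo class; not part of the article's listings.
class ReferenceQueueDemo {

    // Polling: returns immediately, true if a Reference was waiting in the queue.
    static boolean pollOnce(ReferenceQueue<Object> queue) {
        Reference<?> ref = queue.poll();   // null when the queue is empty
        return ref != null;
    }

    // Blocking: waits up to timeoutMillis for a Reference to arrive.
    static boolean waitOnce(ReferenceQueue<Object> queue, long timeoutMillis) {
        try {
            Reference<?> ref = queue.remove(timeoutMillis); // null if the timeout elapses
            return ref != null;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();             // preserve the interrupt status
            return false;
        }
    }

    public static void main(String[] args) {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object referent = new Object();
        WeakReference<Object> ref = new WeakReference<>(referent, queue);

        System.out.println("poll before enqueue: " + pollOnce(queue));

        // Assumption for the demo: enqueue by hand instead of waiting for the collector.
        ref.enqueue();
        System.out.println("blocking wait after enqueue: " + waitOnce(queue, 1000));
        System.out.println("referent object: " + referent);
    }
}
```

A dedicated monitoring thread would simply call the blocking form in a loop and release any auxiliary resources each time a Reference arrives.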
Caching
To minimize memory consumption, a simple cache will limit the number of entries it can hold. A sophisticated cache will take advantage of available memory by using a variable number of cache entries. When memory gets tight, when items are not in use, or when memory is being reclaimed, the sophisticated cache releases some (or all) of its cache items. In this way the cache can use available memory without creating sandbars that the memory manager has to work around. Our cache will release items when they are not in use and the garbage collector is reclaiming memory.
We will base our cache on JDK 1.2's collection classes. Our cache will implement the java.util.Map interface, which offers a simple put()/get() interface. An object is added via put() and retrieved via get(). We don't want the user of the cache to see any use of Reference objects, so we define the put/get methods to accept the cached objects directly. Listing 1 shows the skeletal implementation of the cache. This simple implementation will store and retrieve cached objects under an application's control. As you can see from the listing, the class is a trivial subclass of Hashtable.
This simple cache does not cooperate with the memory manager. All cached objects are kept in memory until the cache is explicitly cleared. Worse yet, the cache grows in size with each new item put in it. There is no size limit.
To correct these deficiencies, we introduce WeakReferences into our cache implementation. When an application stores an item in the cache, we won't store it directly. Instead, we put it inside a WeakReference and store that instead. When the garbage collector runs, it is free to collect the cached data if it is not in use. Listing 2 shows the new implementation. As you can see, introducing References into the design has changed the implementation substantially. The class is no longer a subclass of Hashtable; it keeps a private copy of all cached data and does some Reference manipulation.
These changes are necessary because of our requirement that users of the cache be shielded from the use of Reference objects.
To ensure that no Reference objects surface outside the cache, we have to guarantee that all of the access points into and out of the cache are protected. Objects passed into the cache are immediately wrapped by a WeakReference. The WeakReference is stored, but is always stripped off before an object is handed out of the cache. To accomplish this, the new cache implementation makes four core changes:
That's all we need for the basics. The AbstractMap class drives all other operations from the values returned by the entries() method. With that in place, consumers of the cache see a simple put/get interface and can call the other abstract Map methods such as containsKey and containsValue. Additionally, the cache can be enumerated over, compared for equality with other caches, searched, etc.
The only operations not supported by the cache are the collection-based deletion operations. For clarity, I've left that code out of these examples, but implementing them is a small extension; we simply change the collection returned from the entries() call to forward its modifications to the private Hashtable of the cache. All in all, we inherit a pretty complete system from the base collection classes.
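As a rough illustration of the wrap-on-put, unwrap-on-get design described above, here is a minimal sketch. It is not the article's Listing 2: the class name WeakCache is hypothetical, it uses generics and HashMap from later Java releases, and it overrides entrySet(), which is the name AbstractMap uses in released versions of the JDK for what this article calls entries().

```java
import java.lang.ref.WeakReference;
import java.util.AbstractMap;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a WeakReference-backed cache; not the article's Listing 2.
class WeakCache<K, V> extends AbstractMap<K, V> {

    // Private storage: values are reachable from here only through WeakReferences.
    private final Map<K, WeakReference<V>> store = new HashMap<>();

    @Override
    public V put(K key, V value) {
        // Wrap the value on the way in.
        WeakReference<V> old = store.put(key, new WeakReference<>(value));
        return (old == null) ? null : old.get();
    }

    @Override
    public V get(Object key) {
        // Unwrap on the way out; null means absent or already collected.
        WeakReference<V> ref = store.get(key);
        return (ref == null) ? null : ref.get();
    }

    @Override
    public Set<Map.Entry<K, V>> entrySet() {
        // Build a snapshot of the entries whose referents are still alive.
        Set<Map.Entry<K, V>> live = new HashSet<>();
        for (Map.Entry<K, WeakReference<V>> e : store.entrySet()) {
            V value = e.getValue().get();   // one get(); 'value' now holds a strong reference
            if (value != null) {
                live.add(new AbstractMap.SimpleImmutableEntry<>(e.getKey(), value));
            }
        }
        // Read-only view: deletion through the collection is not supported in this sketch.
        return Collections.unmodifiableSet(live);
    }

    public static void main(String[] args) {
        WeakCache<String, Object> cache = new WeakCache<>();
        Object item = new Object();
        cache.put("Key 1", item);
        System.out.println("contains Key 1: " + cache.containsKey("Key 1"));
        System.out.println("same object back: " + (cache.get("Key 1") == item));
    }
}
```

Callers never see a WeakReference, and containsKey, containsValue, size and equals come from AbstractMap. One simplification to be aware of: keys whose values have been collected stay in the private map until they are overwritten.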
Race Conditions
In our cache, once the entries() code has determined that a given Reference contains a non-null referent, a direct Java reference to that interior object is maintained and stored. This prevents the garbage collector from freeing the interior object. To clarify this just a little, consider the following code:
    if ( myReference.get() != null )
        myHashtable.put( key, myReference.get() );
This example has a bug. The problem is that the garbage collector may run between the first call to myReference.get() and the second. It's possible that the Reference is cleared in the process. The bug won't happen often, but when it does, it will be hard to find. The example can be corrected as follows:
    Object o = myReference.get();
    if ( o != null )
        myHashtable.put( key, o );
In this example the object managed by myReference is directly referred to. This prevents any race conditions from freeing the data before it is put into myHashtable.
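The corrected pattern can be packaged as a small helper, sketched below under a few assumptions: the class SafeCopy and the method putIfAlive() are hypothetical names, the code uses generics from later Java releases, and clear() is called by hand to imitate the collector. The point it illustrates is that get() is called exactly once and the result is held in a local variable.

```java
import java.lang.ref.Reference;
import java.lang.ref.WeakReference;
import java.util.Hashtable;

// Hypothetical helper class; not part of the article's listings.
class SafeCopy {

    // Stores the referent in the table only if it is still alive.
    // Returns true when a value was stored.
    static <V> boolean putIfAlive(Hashtable<String, V> table, String key,
                                  Reference<? extends V> ref) {
        V value = ref.get();       // the only call to get(); 'value' is now a strong reference
        if (value == null) {
            return false;          // already cleared; Hashtable.put would reject a null value
        }
        table.put(key, value);     // safe: the object cannot be freed while 'value' refers to it
        return true;
    }

    public static void main(String[] args) {
        Hashtable<String, String> myHashtable = new Hashtable<>();
        String strong = new String("some string");
        Reference<String> myReference = new WeakReference<>(strong);

        System.out.println("stored while alive: " + putIfAlive(myHashtable, "Key 1", myReference));

        // Assumption for the demo: clear by hand instead of waiting for the collector.
        myReference.clear();
        System.out.println("stored after clear: " + putIfAlive(myHashtable, "Key 2", myReference));
        System.out.println("table contents: " + myHashtable);
    }
}
```

Because Hashtable.put throws a NullPointerException for a null value, the two-call version of this code fails with an exception only in the rare case where the collector runs between the calls.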
To Queue or Not to Queue
Testing the Cache
The CacheTest program is implemented by spinning off a thread for each of several tasks. A thread that periodically adds items to the cache is started. As each item is added to the cache, another thread keeps a direct reference to the item for a brief time. Another thread periodically runs the garbage collector and prints out whatever is still left in the cache. The implementation of these tasks is found in the source files CacheItemGenerator.java, TimedReference.java and TimedGarbageCollector.java.
After choosing a set of parameters, press the start button and the machine will be set in motion. After each run of the garbage collector, CacheTest dumps the contents of the cache to standard output. You can identify which items are still in the cache by their names. Cache items are named sequentially, "Key 1, Key 2," etc. Listing 3 shows a sample run where all timeouts are set to one second. The CacheTest threads can be halted at any time by pressing the stop button. This allows you to reconfigure the cache parameters and run another test.
You should adapt the test program to try caching your own classes. This is easy to do by changing the options in the first pulldown menu. When CacheTest is running, it uses the default class loader to load whatever class is identified in the pulldown. The only requirement is that the class has a default constructor.
Other Kinds of References
To address that need, the cache can switch from using WeakReferences to SoftReferences. SoftReferences are cleared by the garbage collector only when their interior objects are not in use and when memory is running low. Additionally, the SoftReferences are subject to a least recently used algorithm. This meets the above objective. With beta 3 of JDK 1.2, however, I've seen some variability and problems with SoftReferences. These could be my own bugs, but because SoftReferences are debuggable only in low-memory situations, I haven't bothered to track down the problem. Caveat emptor.
If you have very particular cache needs, you may also want to investigate using a GuardedReference. A GuardedReference is not cleared by the garbage collector. Instead, the GuardedReferences are put into a ReferenceQueue when the garbage collector sees that the interior object is not in use. It's up to the application developer to pull the objects from the queue and clear them.
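Switching the cache from WeakReferences to SoftReferences mostly comes down to changing the one place where values are wrapped. The sketch below shows that idea in isolation; the class ReferenceChoice and the method wrap() are hypothetical names, and the code uses the generic java.lang.ref classes of later Java releases rather than the beta 3 API.

```java
import java.lang.ref.Reference;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

// Hypothetical helper class; not part of the article's listings.
class ReferenceChoice {

    // Wraps a value in the chosen kind of Reference.
    // soft == true  : collector may keep the referent until memory runs low
    // soft == false : collector may clear the referent as soon as it is unused
    static <V> Reference<V> wrap(V value, boolean soft) {
        if (soft) {
            return new SoftReference<V>(value);
        }
        return new WeakReference<V>(value);
    }

    public static void main(String[] args) {
        Object item = new Object();
        Reference<Object> softly = wrap(item, true);
        Reference<Object> weakly = wrap(item, false);

        // Both hand back the same object while 'item' is still strongly referenced.
        System.out.println("soft wrapper holds item: " + (softly.get() == item));
        System.out.println("weak wrapper holds item: " + (weakly.get() == item));
    }
}
```

Since callers of the cache only ever see the unwrapped objects, a change like this stays invisible to them; only the timing of when entries disappear is affected.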
Summary