Concurrency the Java 1.5 way
I attended a talk tonight on Java concurrency presented by Stuart Halloway at the Northern Virginia JUG that provided a refresher on the java.util.concurrent package. Stuart is one of the founders of Relevance, author of Component Development for the Java Platform, a frequent speaker at technical symposiums, co-author of Rails for Java Developers, and a great technical speaker.
Stuart spent the first few minutes telling us why he now focuses more on Ruby and Rails than on Java. Paraphrasing the title of a Java book written by his Relevance partner Justin Gehtland and Bruce Tate, Stuart says, “We can go even ‘betterer,’ ‘fasterer’ and ‘lighterer’ with some other technologies” like Rails and the Streamlined framework. However, when multithreading and concurrency are needed, Java way outshines the current state of Ruby and Rails, he said.
When considering multi-threading in order to increase the speed of a process, it is important to consider whether the slowness is due to the application being suspended while waiting for an external resource (e.g. a database, user input, disk), or whether the process is suspended while waiting for free CPU cycles. If the process is waiting for an external resource, Stuart said, the language and the number of CPUs won’t matter much. “Java, assembly language and PHP all wait at the same speed,” he said.
Stuart’s talk covered:
- Threads
- Tasks and scheduling
- Locking
- Concurrent collections
- Alternatives to threads
The key point to remember about Java threads is they share code, data, resources, and heap storage. They contain their own instruction pointer and stack. Threading isn’t often needed in server-side programming because components like EJBs and JEE container services abstract the multi-threading away from the developer. But threading is often needed when you need to:
- Keep a user interface responsive (think Swing)
- Take advantage of multiple processors in compute-heavy applications
- Simplify code that would otherwise need to keep checking whether other tasks need to be performed (in effect, implementing its own task-scheduling loop)
Before Java 1.5, Thread objects were the main way to achieve concurrency in the Java language. Developers would write a class that implements Runnable and pass an instance to a Thread. Two shortcomings of the Runnable interface are that its single method, run, doesn’t return anything and isn’t declared to throw an exception to indicate that anything went wrong. “It’s completely wrong,” Stuart said.
Java 1.5 introduced higher-level classes that allow more abstraction away from Thread objects. It introduced the Callable interface, whose call method does return something and is declared to throw an Exception. Programmers write Callable classes and pass instances to one of the three kinds of ExecutorService obtained from the Executors factory class, or perhaps from an external library. The ExecutorServices provided by the Executors factory offer single-threaded execution and execution by two types of thread pool: a cached, expandable thread pool or a fixed-size thread pool. When you give a Callable to an ExecutorService, you get back a Future object that will hold the result of the Callable’s execution. The result is either an object or an exception; the exception is thrown, wrapped in an ExecutionException, when you call get on the Future.
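As a rough illustration of the Callable, ExecutorService, and Future flow described above, here is a minimal sketch. It is not the code Stuart showed; the class names CallableDemo and SumTask and the summing task itself are made up for the example.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class CallableDemo {

    // Hypothetical task: sums the integers 1..n and returns the total.
    static class SumTask implements Callable<Long> {
        private final int n;

        SumTask(int n) {
            this.n = n;
        }

        @Override
        public Long call() throws Exception {
            if (n < 0) {
                throw new IllegalArgumentException("n must not be negative");
            }
            long sum = 0;
            for (int i = 1; i <= n; i++) {
                sum += i;
            }
            return sum;
        }
    }

    // Submits the task to a fixed-size pool and waits for its result.
    static long sumTo(int n) throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<Long> future = pool.submit(new SumTask(n));
            // get() blocks until the task finishes; if call() threw,
            // get() throws an ExecutionException wrapping that exception.
            return future.get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sumTo(100)); // prints 5050
        try {
            sumTo(-1);
        } catch (ExecutionException e) {
            // prints "task failed: n must not be negative"
            System.out.println("task failed: " + e.getCause().getMessage());
        }
    }
}
```

Swapping Executors.newFixedThreadPool(2) for Executors.newCachedThreadPool() or Executors.newSingleThreadExecutor() selects one of the other two kinds of ExecutorService without changing the rest of the code.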
Stuart demonstrated code that exercised the new threading classes and showed how to use them. The code and the slides from his presentation are available online.
The Need for Locking
You don’t need locks if you’re just telling separate tasks to run concurrently. You need locking code when multiple threads access the same data at the same time. Java provides lock support with the:
- synchronized keyword and blocks
- Java 1.5 Lock interface objects, which offer an improvement over a straight synchronized block because you can tell the code to give up its attempt to acquire a lock after a timeout period expires.
- ReadWriteLock interface, which offers separate locks depending on whether the process needs to read the data or alter it.
If you want, you can tweak how the ReadWriteLock operates, such as defining whether readers or writers get lock priority.
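To make the timeout and read/write ideas concrete, here is a small sketch using ReentrantLock and ReentrantReadWriteLock from java.util.concurrent.locks. The LockDemo class, its fields, and its method names are invented for illustration. The fairness flag passed to ReentrantReadWriteLock is the standard tuning knob I know of that comes closest to the priority tweak mentioned above; I am assuming, not asserting, that it is the kind of thing Stuart had in mind.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class LockDemo {
    private final Lock lock = new ReentrantLock();
    private int counter = 0;

    // "true" requests the fair policy: waiting threads are served in
    // roughly arrival order (an assumption about what "priority" meant).
    private final ReadWriteLock rwLock = new ReentrantReadWriteLock(true);
    private String setting = "initial";

    // Waits up to timeoutMillis for the lock; returns false if it gave up.
    boolean incrementWithTimeout(long timeoutMillis) throws InterruptedException {
        if (!lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
            return false;
        }
        try {
            counter++;
            return true;
        } finally {
            lock.unlock();
        }
    }

    // Many readers may hold the read lock at the same time.
    String readSetting() {
        rwLock.readLock().lock();
        try {
            return setting;
        } finally {
            rwLock.readLock().unlock();
        }
    }

    // The write lock is exclusive: no other reader or writer may hold a lock.
    void writeSetting(String newValue) {
        rwLock.writeLock().lock();
        try {
            setting = newValue;
        } finally {
            rwLock.writeLock().unlock();
        }
    }

    // Holds the lock in a second thread, then tries to acquire it with a
    // 50 ms timeout from the calling thread. Returns whether that succeeded.
    static boolean attemptWhileLockIsHeld() throws InterruptedException {
        final LockDemo demo = new LockDemo();
        final CountDownLatch held = new CountDownLatch(1);
        final CountDownLatch release = new CountDownLatch(1);
        Thread holder = new Thread(() -> {
            demo.lock.lock();
            try {
                held.countDown();  // signal that the lock is now held
                release.await();   // keep holding until told to let go
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                demo.lock.unlock();
            }
        });
        holder.start();
        held.await();
        boolean acquired = demo.incrementWithTimeout(50);
        release.countDown();
        holder.join();
        return acquired;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(attemptWhileLockIsHeld()); // prints false
    }
}
```

A plain synchronized block would have waited indefinitely in attemptWhileLockIsHeld; tryLock with a timeout lets the caller give up and do something else.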
Concurrent Collection Options
Strategies for concurrent collections and their implications:
- Do nothing: Fast and simple, but not thread-safe.
- Fail-fast iterators (introduced in Java 1.2): Fast, not thread-safe. Misuse of concurrent access probably will cause a fast failure. Fail-fast uses optimistic locking: it assumes everyone can access a shared resource and uses clean-up code if something goes wrong with multi-threaded writes. Java collections implement the fail-fast strategy with version numbers that iterators check to see whether the collection has changed.
- Lock the entire collection: Simple, slow, might be thread-safe (like Hashtable).
- Lock partial collection: Complex, maybe faster, maybe thread-safe.
- Copy on write: Fast read access, may read stale data. When you write to the collection, you get a new copy, so your write can proceed. Iterators held by reading threads point to the older copy, so their data can be stale.
- Immutable: Fast, simple, thread-safe, cannot change objects.
- Application-controlled locking: Difficult, allows any combination of the above strategies.
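The fail-fast behavior is easy to see even without a second thread. The sketch below modifies an ArrayList while iterating over it; the iterator notices that the collection’s version number changed and throws a ConcurrentModificationException. The class and method names are hypothetical, and a single thread is used only to keep the demonstration deterministic.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

class FailFastDemo {

    // Returns true if the iterator detected the change and failed fast.
    static boolean modifyWhileIterating() {
        List<String> names = new ArrayList<String>();
        names.add("a");
        names.add("b");
        names.add("c");
        try {
            for (String name : names) {
                if (name.equals("a")) {
                    // A structural change behind the iterator's back.
                    names.add("d");
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            // The iterator's next call saw a changed version number.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(modifyWhileIterating()); // prints true
    }
}
```

With real multi-threaded access the check is best-effort only, which is why the list above says misuse “probably” fails fast rather than always.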
Java Collections Design Choices
Collections and strategies
- Legacy (pre-Java 1.2): Lock entire collection
- Collections (1.2) API: Lock none, fail-fast iterators
- Synchronized wrappers (1.2): Lock entire collection
- ConcurrentHashMap: Lock partial collection. Uses “lock striping” so that threads can work on different buckets of the hash at the same time.
- CopyOnWriteArrayList: Copy-on-write. Very expensive for big arrays that are written to regularly, because every write to the collection copies it again. Only advantageous if data is read-mostly.
- String: immutable
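Two of these design choices can be sketched in a few lines. The first method shows the stale read that copy-on-write allows: an iterator obtained before a write keeps walking the old copy. The second shows ConcurrentHashMap’s atomic putIfAbsent, which needs no external lock. The class and method names are made up for the example.

```java
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CopyOnWriteArrayList;

class CollectionChoicesDemo {

    // Counts how many elements an iterator sees when the list is
    // written to after the iterator was created.
    static int elementsSeenByOldIterator() {
        List<String> list = new CopyOnWriteArrayList<String>();
        list.add("a");
        list.add("b");
        Iterator<String> oldView = list.iterator(); // snapshot of [a, b]
        list.add("c"); // the write copies the backing array
        int seen = 0;
        while (oldView.hasNext()) {
            oldView.next();
            seen++;
        }
        return seen; // the snapshot still has 2 elements; the list has 3
    }

    // putIfAbsent only stores a value when the key has none yet, and does
    // so atomically, so the second call leaves the first value in place.
    static int firstWriterWins() {
        ConcurrentMap<String, Integer> hits = new ConcurrentHashMap<String, Integer>();
        hits.putIfAbsent("home", 1);
        hits.putIfAbsent("home", 2);
        return hits.get("home");
    }

    public static void main(String[] args) {
        System.out.println(elementsSeenByOldIterator()); // prints 2
        System.out.println(firstWriterWins());           // prints 1
    }
}
```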
Alternatives to Threads
Alternatives, pros and cons:
- Container-managed threads: Pro: simple. Con: inflexible. Like J2EE containers. You write applications as if you are the only user of the object. Scales well because most data on the server side is in the database, and the database controls concurrency.
- Non-blocking I/O: Do work when it is available. Pro: do multiple operations and get notified when they are done. Con: as complicated as using threads. For example, the java.nio (1.4) package. Oriented around blocking waits. Tends to get ignored when you’re coding on the server side.
- Use multiple processes: Pro: simple. Con: inflexible. When you need to perform more work, start more processes.
- Event-driven code: Con: as complex as threads.
- Do nothing: Pro: simple. Con: slow (but performance might not matter for the application).
“Probably more time has been wasted by optimizing code that doesn’t need to be optimized.”
Stuart also discussed the double-checked locking Java anti-pattern and why it is a problem. Heck, the perils surrounding the use of double-checked locking in Java have been known since what, 1997, when I think Java Developer’s Journal published an article on it. But I’ve seen wickedly smart developers insert this potentially evil anti-pattern into their code out of ignorance of the subtle problem. I’m glad Stuart mentioned it as a reminder.
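For readers who haven’t run into it, here is a sketch of the idiom and one well-known safe alternative. This is my own illustration, not code from the talk, and the class names are hypothetical. The problem with the broken form is that, without volatile, another thread can see a non-null reference to an object whose fields have not yet been written. Under the Java 5 memory model, declaring the field volatile repairs the idiom; the holder-class approach below avoids explicit locking altogether.

```java
class DoubleCheckedLockingDemo {

    // Hypothetical expensive resource with a non-final field.
    static class Resource {
        private String name;

        Resource() {
            name = "expensive";
        }

        String getName() {
            return name;
        }
    }

    // BROKEN: do not copy. The unsynchronized first check may observe a
    // non-null reference before the Resource's fields are visible.
    static class BrokenLazy {
        private static Resource instance; // not volatile

        static Resource get() {
            if (instance == null) {                 // first check, no lock
                synchronized (BrokenLazy.class) {
                    if (instance == null) {         // second check, locked
                        instance = new Resource();
                    }
                }
            }
            return instance;
        }
    }

    // Safe alternative: the JVM initializes Holder lazily, exactly once,
    // the first time get() touches it, with the needed synchronization.
    static class SafeLazy {
        private static class Holder {
            static final Resource INSTANCE = new Resource();
        }

        static Resource get() {
            return Holder.INSTANCE;
        }
    }

    public static void main(String[] args) {
        System.out.println(SafeLazy.get().getName());         // prints expensive
        System.out.println(SafeLazy.get() == SafeLazy.get()); // prints true
    }
}
```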
For Java developers interested in learning more about programming using concurrency, Stuart recommended Java Concurrency in Practice by Brian Goetz. The book mixes academic rigor on threading with practical implications for Java developers, he said.
Stuart also will be in town Wednesday night to speak at the Northern Virginia Ruby User’s Group. He’ll be talking about the Streamlined framework for rapidly developing CRUD applications in Rails.