Using Java shutdown hooks to process records cleanly

I recently had to write a quick standalone Java application to manipulate data so it could be imported to Excel. I wanted to ensure only full records would be output even if the user hit Control-C during processing.

I knew Java had the notion of a "shutdown hook" you could register with the Java virtual machine to allow code to clean up and close resources during a "Control-C" interrupt. After I wrote the application, I wrote this little demo application to show one way shutdown hooks can be used to help ensure records are processed atomically and any resources are cleaned up before the JVM exits.

Java shutdown hooks allow code to perform some processing after Control-C is pressed, or while the VM is shutting down for another controllable reason, including a call to System.exit. A shutdown hook is a Thread that has been created but not started. The JVM starts the thread during the shutdown process so it can perform any cleanup work needed before the JVM exits. The JVM starts all registered shutdown hooks in an unspecified order, lets them run concurrently, and waits for them to complete before continuing with other shutdown actions, such as running object finalizers. It's a nice feature added in Java 1.3. It is certainly not as powerful as being able to trap all operating-system signals and run code to handle them, but at least it works across all platforms.
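To make that concrete, here is a minimal sketch of registering a hook, separate from the demo application. The names HookRegistrationSketch and buildHook are made up for illustration, and the lambda assumes Java 8 or later; on older JVMs an anonymous Thread subclass does the same job.

```java
// Minimal sketch; HookRegistrationSketch and buildHook are illustrative names.
final class HookRegistrationSketch {

    /** Builds a hook: an ordinary Thread that is created but NOT started. */
    static Thread buildHook(Runnable cleanup) {
        return new Thread(cleanup, "cleanup-hook");
    }

    public static void main(String[] args) throws InterruptedException {
        Thread hook = buildHook(
            () -> System.err.println("Shutdown hook: cleaning up"));

        // The JVM, not our code, starts this thread when shutdown begins.
        Runtime.getRuntime().addShutdownHook(hook);

        System.out.println("Working; press Control-C to trigger the hook early");
        Thread.sleep(3000); // Stand-in for real work.
        System.out.println("Finished normally");
        // The hook runs on normal exit too, unless it is removed first.
    }
}
```

If you run it and wait, the hook's message appears after "Finished normally"; if you hit Control-C during the sleep, the hook still gets its chance to run.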

In the case of this demo application, the shutdown hook gets started after the user presses Control-C, then mostly just waits until the record currently being processed finishes. The shutdown hook sets a flag (an instance variable) to tell the main processing loop that a JVM shutdown is in progress -- that way, it can stop processing cleanly.

The main processing loop checks after processing each record to see whether the JVM is being shut down abnormally, such as the user hitting Control-C or the user logging off without quitting the application. If a shutdown is in progress, it breaks out of the processing loop to avoid having the next record corrupted when the operating system decides the application isn't responding quickly enough to the Control-C and kills the JVM outright.

That is one of the unknowns of shutdown hooks: How long can cleanup code run before the operating system decides the application is an uncooperative rogue process that must be killed? Generally, I wouldn't risk running for more than a few seconds before Windows intervenes, or before a Unix user intervenes with "kill -9".

This demo application is pretty simple. It appends "ing" to each string in a StringBuffer array. To simulate slow, multi-step processing, that is, something that could get interrupted, the application adds each character individually -- i, n, g -- and takes about a second per record to do so.

Here is the complete application: CleanShutdownDemo.java. When you run the application and hit Control-C before the program completes normally, it'll dump the state of its data so you can see whether the data got "corrupted" by some record getting only a partial "ing" added. I'll summarize what the methods are doing, below.

Once you download and compile it, there are two ways to run it: the clean way and the dirty way. If you run this program with the command-line argument "clean", it ensures "ing" is added cleanly to the current string being processed, even if the user hits Control-C during processing. If you run the program without an argument or with an argument other than "clean", it runs in "dirty" mode in which a Control-C leaves the currently processed string in an uncertain state. That is, it could:
  • have been left unprocessed
  • have had an "i" appended
  • have had an "in" appended
  • have been fully processed by having an "ing" appended
Here are some sample runs. First, the "dirty" way:
> java -cp . com.mcqueeney.demo.CleanShutdownDemo dirty
Registering shutdown hook
Starting processing
About to add 'i' to show...added
About to add 'n' to showi...added
About to add 'g' to showin...added
About to add 'i' to cod...added
About to add 'n' to codi...added
About to add 'g' to codin...added
About to add 'i' to thread...added
About to add 'n' to threadi...added
About to add 'g' to threadin...added
About to add 'i' to blogg...added
About to add 'n' to bloggi^C    <-- HIT CONTROL-C
Record 0: showing               <-- THESE LINES PRINTED BY SHUTDOWN HOOK
Record 1: coding
Record 2: threading
Record 3: bloggi
Record 4: vacation
You'll see that I hit ^C after "i" was added to "blogg" but before the "n" got added. The data dump at the end shows that blogg was corrupted by having only an "i" added. That's the purpose of the "clean" mode, to ensure records get a full "ing" added, or they are left completely unprocessed, in their original state.

Now, for the "clean" way. Running the program with the argument "clean" ensures no record is left in an indeterminate state:
> java -cp . com.mcqueeney.demo.CleanShutdownDemo clean
Registering shutdown hook
Starting processing
About to add 'i' to show...added
About to add 'n' to showi...added
About to add 'g' to showin...added
About to add 'i' to cod...added
About to add 'n' to codi...added
About to add 'g' to codin...added
About to add 'i' to thread...added
About to add 'n' to threadi...added
About to add 'g' to threadin...added
About to add 'i' to blogg^C       <-- HIT CONTROL-C
Shutdown hook: Waiting to exit
...added
About to add 'n' to bloggi...added
About to add 'g' to bloggin...added
Process interrupted. Shutting down after processing 4 records.
Unregistering shutdown hook
Finished comparison. Processed 4 records.
Record 0: showing
Record 1: coding
Record 2: threading
Record 3: blogging
Record 4: vacation
You'll see that after hitting ^C, the main thread continues executing. The addIngToString method was able to complete adding "ing" to the word "blogg" even after ^C was hit. The main thread was able to complete its work because the shutdown hook (thread) was running and waiting for the method to complete before returning. The shutdown hook in this demo application "guards" against the JVM stopping the main thread prematurely.

The shutdown hook code gets added in the registerShutdownHook method:
private void registerShutdownHook() {
    System.out.println("Registering shutdown hook");
    this.shutdownThread = new Thread("myhook") {
        public void run() {
            // For demonstration purposes: Don't give chance to
            // shutdown unless flag is set. Just show data.
            if (!runningInCleanMode) {
                showData();
                return;
            }
            synchronized(this) {
                if (!readyToExit) {
                    isVMShuttingDown = true;
                    System.out.println("Shutdown hook: Waiting to exit");
                    try {
                        // Wait up to 1.5 secs for a record to be processed.
                        wait(1500);
                    } catch (InterruptedException ignore) {
                    }
                    if (!readyToExit) {
                        System.out.println(
                            "Main processing interrupted." +
                            " Data corruption possible."
                        );
                    }
                }
            }
            showData(); // To demo current state of data.
        }
        /**
         * For demo purposes: Show data to see whether it is "corrupted"
         */
        private void showData() {
            for (int i = 0, j = dataToProcess.length; i < j; i++) {
                System.err.println(
                    "Record " + i + ": " + dataToProcess[i]
                );
            }
        }
    };
    Runtime.getRuntime().addShutdownHook(this.shutdownThread);
}
This method creates (but does not start) a new thread, then registers that thread as a shutdown hook with the Java runtime. The shutdown thread uses two boolean instance variables to communicate with the main thread: readyToExit and isVMShuttingDown. If the main thread has already set readyToExit to true, the shutdown hook knows the main thread has finished its processing and doesn't need to be "guarded" against JVM shutdown.

If readyToExit is false, the run method sets the isVMShuttingDown flag to true to tell the main thread that it had better finish what it's doing (processing the current string record) and then exit -- to avoid being killed by the operating system in mid-record. After setting that flag, it waits up to 1.5 seconds for the main thread to finish. If the main thread hasn't set the readyToExit flag by the time the wait ends, the shutdown hook thread prints a warning to say the data might indeed get corrupted.
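The same handshake can be sketched as a small standalone helper. This is not code from CleanShutdownDemo: ExitGate and its method names are hypothetical, and I've made two changes on purpose. It waits in a loop so a spurious or early wakeup can't end the wait before the deadline, and it uses a private lock object instead of the hook thread itself as the monitor.

```java
// Sketch only: ExitGate is a hypothetical helper, not a class from CleanShutdownDemo.
final class ExitGate {
    private final Object lock = new Object();
    private boolean readyToExit = false;     // Guarded by lock.
    private boolean vmShuttingDown = false;  // Guarded by lock.

    /** Main thread: call between records to see whether to stop. */
    boolean isVmShuttingDown() {
        synchronized (lock) {
            return vmShuttingDown;
        }
    }

    /** Main thread: call once no record is half-processed. */
    void signalReadyToExit() {
        synchronized (lock) {
            readyToExit = true;
            lock.notifyAll(); // Wake the hook if it is waiting.
        }
    }

    /**
     * Hook thread: ask the main thread to stop, then wait up to
     * timeoutMillis. Returns true if the main thread signalled in time.
     */
    boolean requestStopAndWait(long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        synchronized (lock) {
            vmShuttingDown = true;
            try {
                long remaining = timeoutMillis;
                // Re-check the flag after every wakeup.
                while (!readyToExit && remaining > 0) {
                    lock.wait(remaining);
                    remaining = deadline - System.currentTimeMillis();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return readyToExit;
        }
    }
}
```

A hook would call requestStopAndWait(1500) and print the "Data corruption possible" warning when it returns false.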

Most of the other code in the shutdown hook thread is there for demonstration purposes: to alter behavior depending on whether we are running in "clean" or "dirty" mode, and to print the data being processed in the showData method so we can see for ourselves whether the data has been "corrupted." In fact, the only reason the dataToProcess variable is an instance variable is so the shutdown thread can see it for demonstration purposes.

The other method that cooperates with the shutdown thread is startProcessing. This method checks whether a JVM shutdown is in progress after fully processing one record (which the addIngToString method performs).
public void startProcessing(StringBuffer[] records) {
    this.dataToProcess = records; // Store for demo purposes.
    System.out.println("Starting processing");
    int recordsProcessed = 0;
    try {
        for (int i = 0, j = records.length; i < j; i++) {
            // Process this next record but don't let ^C interrupt
            // unless it takes more than 1.5 seconds.
            addIngToString(records[i]);
            ++recordsProcessed;
            // Don't continue if VM is trying to shut down.
            if (this.isVMShuttingDown) {
                System.out.println(
                    "Process interrupted. Shutting down after processing " +
                    recordsProcessed + " records."
                );
                signalReadyToExit();
                break;
            }
        } // end for.
        // Tell shutdown hook we're done then unregister it.
        signalReadyToExit(); // In case shutdown thread already running.
        unregisterShutdownHook();
        System.out.println(
            "Finished comparison. Processed " + recordsProcessed +
            " records."
        );
    } catch (RuntimeException rte) {
        System.err.println(
            "Got unexpected runtime exception: " + rte.getMessage()
        );
        throw rte;
    } finally {
        // Cleanup, assuming we had resources to close.
    }
}
You can see that after each call to addIngToString, the "if" statement:
    // Don't continue if VM is trying to shut down.
    if (this.isVMShuttingDown) {
        System.out.println(
            "Process interrupted. Shutting down after processing " +
            recordsProcessed + " records."
        );
        signalReadyToExit();
        break;
    }
checks to see if the shutdown thread has warned us that the JVM is being shut down early for some reason. If so, it calls signalReadyToExit to tell the shutdown thread that we've acknowledged the JVM shutdown and we are NOT in the middle of processing a record, so it is OK to exit and allow the JVM to complete shutdown processing. It then breaks out of the "for" loop to skip processing further records.
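The startProcessing method also calls unregisterShutdownHook, which isn't listed in this article. One thing any such method has to handle is that Runtime.removeShutdownHook throws IllegalStateException once the JVM has begun shutting down, which is exactly the situation after a Control-C. Here is a sketch of one way to cope with that; the names HookUnregisterSketch and unregisterQuietly are my own placeholders, not the demo's actual code.

```java
// Sketch only: HookUnregisterSketch and unregisterQuietly are illustrative names.
final class HookUnregisterSketch {

    /**
     * Tries to remove a registered hook. Returns true if it was removed,
     * false if it was never registered or the JVM is already shutting down.
     */
    static boolean unregisterQuietly(Thread hook) {
        try {
            return Runtime.getRuntime().removeShutdownHook(hook);
        } catch (IllegalStateException shutdownInProgress) {
            // Shutdown has begun, so the hook can no longer be removed.
            return false;
        }
    }
}
```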

As a side note, the check of the isVMShuttingDown variable should be in a synchronized block for complete thread safety, especially on a multi-CPU system. The Java memory model guarantees that one thread sees another thread's write to shared data only if the two threads are properly synchronized, for example by reading and writing the variable while holding the same lock. I left it out of the above code for ease of reading.
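For a single boolean flag like this one, declaring the field volatile is a lighter-weight way to get the same visibility without a lock. The sketch below assumes the Java 5 or later memory model, and ShutdownFlag is an illustrative name rather than part of the demo.

```java
// Sketch only: ShutdownFlag is an illustrative name, not a class from the demo.
final class ShutdownFlag {
    // volatile: a write by the hook thread is visible to any later
    // read of this field by the main thread.
    private volatile boolean vmShuttingDown = false;

    /** Hook thread: record that a JVM shutdown has started. */
    void markShuttingDown() {
        vmShuttingDown = true;
    }

    /** Main thread: check between records whether to stop. */
    boolean isVmShuttingDown() {
        return vmShuttingDown;
    }
}
```

This covers visibility of the flag only. The wait and notify calls between the two threads still need the lock.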

The last method we'll call out for special attention is the one we are trying to run atomically. The addIngToString method simulates important processing that we want to run to completion even if a JVM shutdown interrupts it.
private void addIngToString(StringBuffer buffer) {
    char[] toAppend = { 'i', 'n', 'g' };
    for (int i = 0, j = toAppend.length; i < j; i++) {
        System.out.print(
            "About to add '" + toAppend[i] + "' to " + buffer
        );
        sleep(333); // Sleep to simulate long processing
        buffer.append(toAppend[i]);
        System.out.println("...added");
    }
}
This method slowly adds "ing" to the given string buffer argument. The sleep method performs the obvious Thread.sleep call.
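The sleep helper isn't listed here either. A wrapper along these lines would do; this is a guess at its shape, and SleepSketch is a placeholder name. It restores the interrupt flag rather than swallowing the exception, so callers can still find out that the thread was interrupted.

```java
// Sketch only: a guess at the helper's shape; SleepSketch is a placeholder name.
final class SleepSketch {

    /** Sleeps for the given time. Returns false if interrupted. */
    static boolean sleep(long millis) {
        try {
            Thread.sleep(millis);
            return true;
        } catch (InterruptedException e) {
            // Restore the interrupt status for the caller to inspect.
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```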

Although this demonstration program has a lot of code, most of it is there for verbosity. The extra code allows us to watch what is happening and be able to set the two different runtime modes to see how the data can get "corrupted" if not being guarded by the shutdown hook. The actual code to protect against data corruption is quite small: 14 real lines of code in the shutdown hook and about six lines in the main application to set and check the shutdown flags being set and watched by the shutdown hook.

Once you get used to the threading issues, adding shutdown hooks to applications for cleaner handling of JVM exits is easy, and helps insulate applications against unexpected interruptions.