Let’s imagine a system that sometimes needs to copy a file to a few locations, but in a way where responsiveness is critical. In other words, if for some reason a file system is overloaded and we are unable to write our file in less than a second, it should give up.
ExecutorService
is a very convenient tool for the job. You can easily use it for executing several tasks in parallel (each writing to a different file system). Yuo also can tell it to give up after some timeout, and it will interrupt them for you. Perfect, just what we need.
The scaffolding looks like this:
void testCopy() throws Exception {
ThreadPoolExecutor exec = (ThreadPoolExecutor) Executors
.newCachedThreadPool();
final long start = System.currentTimeMillis();
Callable<Object> task = new Callable<Object>() {
@Override
public Object call() throws Exception {
try {
copy("a.bin", "b.bin");
} catch (Exception e) {
e.printStackTrace();
}
System.out.println("Call really finished after: "
+ (System.currentTimeMillis() - start));
return null;
}
};
Collection<Callable<Object>> taskWrapper = Arrays.asList(task);
List<Future<Object>> futures = exec.invokeAll(taskWrapper, 50,
TimeUnit.MILLISECONDS);
System.out.println("invokeAll finished after: "
+ (System.currentTimeMillis() - start));
System.out.println("Future.isCancelled? "
+ futures.get(0).isCancelled());
Thread.sleep(20);
System.out.println("Threads still active: " + exec.getActiveCount());
}
To simulate response to timeouts on a healthy system with low load, I use a 100 MB file and very short timeout. The task always times out, there is no way my system can copy 100 MB in 50 ms.
I expect the following results:
invokeAll
finished after about 50 ms.
Future.isCancelled?
is true.
- Active thread count is 0. The sleep is there to eliminate some edge cases. Long story short, it gives the copy function some time to detect the interruption.
- Call really finishes after about 50 ms. This is very important, I definitely do not want the IO operations to continue after the task is cancelled. Under higher load that would breed way too many threads stuck in bogus IO.
Just in case, those tests were run on the 1.6 JVM from Oracle on 64-bit Windows 7.
Solution 1: Stream Copy
The first attempt is probably the straightforward – use a loop with a buffer and classic IO, like this:
private void copy(String in, String out) throws Exception {
FileInputStream fin = new FileInputStream(in);
FileOutputStream fout = new FileOutputStream(out);
byte[] buf = new byte[4096];
int read;
while ((read = fin.read(buf)) > -1) {
fout.write(buf, 0, read);
}
fin.close();
fout.close();
}
That’s what all popular stream copying libraries do, including IOUtils
from Apache Commons and ByteStreams
from Guava.
It also fails miserably:
invokeAll finished after: 53
Future.isCancelled? true
Threads still active: 1
Call really finished after: 338
The reason is fairly obvious: there is no check for thread interrupted status in the loop or anywhere, so the thread continues normally.
Solution 2: Stream Copy with Check for Interruption
Let’s fix that! One way to do it is:
while ((read = fin.read(buf)) > -1) {
fout.write(buf, 0, read);
if (Thread.interrupted()) {
throw new IOException("Thread interrupted, cancelling");
}
}
Now that works as expected, printing:
invokeAll finished after: 52
java.io.IOException: Thread interrupted, cancelling
at TransferTest.copyInterruptingStream(TransferTest.java:75)
at TransferTest.access$0(TransferTest.java:66)
at TransferTest$1.call(TransferTest.java:25)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)Future.isCancelled? true
at java.lang.Thread.run(Thread.java:662)
Call really finished after: 53
Threads still active: 0
Nice, but I find it unsatisfactory. It looks dirty and I’m not particularly happy with having this code around my IO lib. There must be a better way, which brings us to…
Solution 3: NIO with transfer
NIO has this nice feature that it actually respects thread interruptions. If you try to read from or write to a channel after the thread has been interrupted, you get a ClosedByInterruptException
.
That’s just what I need. For some reason I also read this answer at StackOverflow, saying:
“Don’t use a buffer if you don’t need to. Why copy to memory if your target is another disk or a NIC? With larger files, the latency incured is non-trivial. (…) Use FileChannel.transferTo()
or FileChannel.transferFrom()
. The key advantage here is that the JVM uses the OS’s access to DMA (Direct Memory Access), if present. (This is implementation dependent, but modern Sun and IBM versions on general purpose CPUs are good to go.) What happens is the data goes straight to/from disc, to the bus, and then to the destination…by passing any circuit through RAM or the CPU.”
Great, let’s do it!
private void copy(String in, String out) throws Exception {
FileChannel fin = new FileInputStream(in).getChannel();
FileChannel fout = new FileOutputStream(out).getChannel();
fout.transferFrom(fin, 0, new File(in).length());
fin.close();
fout.close();
}
Output:
invokeAll finished after: 52
Future.isCancelled? true
Threads still active: 1
java.nio.channels.ClosedByInterruptException
at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
at sun.nio.ch.FileChannelImpl.size(FileChannelImpl.java:304)
at sun.nio.ch.FileChannelImpl.transferFrom(FileChannelImpl.java:587)
at TransferTest.copyNioTransfer(TransferTest.java:91)
at TransferTest.access$0(TransferTest.java:87)
at TransferTest$1.call(TransferTest.java:27)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Call really finished after: 146
All I do is a trivial call to transferFrom
. It’s very concise, and promises so much support from hardware and OS… But wait a moment, why did it take 146 ms? I mean, 146 milliseconds is much faster than 338 ms in the first test, but I expected it to terminate after around 50 ms.
Let’s repeat the test on a bigger file, something around 1.5 GB:
invokeAll finished after: 9012
Future.isCancelled? true
Threads still active: 1
java.nio.channels.ClosedByInterruptException
at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
(...)
Call really finished after: 9170
How awful is that? This is probably the worst thing that could happen:
- The task was not interrupted in a timely manner. 9 seconds is way too long, I expected around 50 millis.
invokeAll
was blocked for the entire time of the operation – 9 seconds. What the hell?
Solution 4 – NIO with Buffering
It turns out I do need some buffering. Let’s try with this one:
private void copyNioBuffered(String in, String out) throws Exception {
FileChannel fin = new FileInputStream(in).getChannel();
FileChannel fout = new FileOutputStream(out).getChannel();
ByteBuffer buff = ByteBuffer.allocate(4096);
while (fin.read(buff) != -1 || buff.position() > 0) {
buff.flip();
fout.write(buff);
buff.compact();
}
fin.close();
fout.close();
}
Output:
invokeAll finished after: 52
Future.isCancelled? true
java.nio.channels.ClosedByInterruptException
at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:203)
at TransferTest.copyNioBuffered(TransferTest.java:105)
at TransferTest.access$0(TransferTest.java:98)
at TransferTest$1.call(TransferTest.java:29)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Call really finished after: 55
Threads still active: 0
Now that’s exactly what I needed. It respects interruptions by itself, so I don’t need those tedious checks all over my IO utility.
Quirks: Different types of channels
If my IO utility is only used for copying files that it gets by name, like this:
static public void copy(String source, String destination)
… then it’s fairly easy to rewrite the method for NIO.
But what if it’s a more generic signature that operates on streams?
static public void copy(InputStream source, OutputStream destination)
NIO has a little Channels
utility with very useful methods like:
public static ReadableByteChannel newChannel(InputStream in)
public static WritableByteChannel newChannel(OutputStream out)
So it almost seems like we could wrap our streams using this helper and benefit from interruptible NIO API. Until we look at the source:
public static WritableByteChannel newChannel(final OutputStream out) {
if (out == null) {
throw new NullPointerException();
}
if (out instanceof FileOutputStream &&
FileOutputStream.class.equals(out.getClass())) {
return ((FileOutputStream)out).getChannel();
}
return new WritableByteChannelImpl(out);
}
private static class WritableByteChannelImpl
extends AbstractInterruptibleChannel // Not really interruptible
implements WritableByteChannel
{
// ... Ignores interrupts completely
Watch out! If your streams are file streams, they will be interruptible. Otherwise you’re out of luck – it’s just a dumb wrapper, more like an adapter for API compatibility. Assumptions kill, always check the source.