Monthly Archives: June 2012

Testing Spring & Hibernate Without XML

I’m very keen on the improvements in Spring 3 that finally let you move away from XML into plain Java configuration with proper support from the IDE and compiler. It doesn’t change the fact that Spring is a huge suite and sometimes finding the thing you need can take a while.

XML-free unit tests around Hibernate are one such thing. I knew it was possible, but it took me more than 5 minutes to find all the pieces, so here I am writing it down.

I am going to initialize all my beans in a @Configuration class like this:

@Configuration
@EnableTransactionManagement
public class TestRepositoryConfig {
	@Bean
	public DataSource dataSource() {
		return new EmbeddedDatabaseBuilder().setType(EmbeddedDatabaseType.H2)
				.setName("Nuts").build();
	}

	@Bean
	public LocalSessionFactoryBean sessionFactoryBean() {
		LocalSessionFactoryBean result = new LocalSessionFactoryBean();
		result.setDataSource(dataSource());
		result.setPackagesToScan(new String[] { "pl.squirrel.testnoxml.entity" });

		Properties properties = new Properties();
		properties.setProperty("hibernate.hbm2ddl.auto", "create-drop");
		result.setHibernateProperties(properties);
		return result;
	}

	@Bean
	public SessionFactory sessionFactory() {
		return sessionFactoryBean().getObject();
	}

	@Bean
	public HibernateTransactionManager transactionManager() {
		HibernateTransactionManager man = new HibernateTransactionManager();
		man.setSessionFactory(sessionFactory());
		return man;
	}

	@Bean
	public OrderRepository orderRepo() {
		return new OrderRepository();
	}
}

… and my test can look like this:

@RunWith(SpringJUnit4ClassRunner.class)
@TransactionConfiguration(defaultRollback = true)
@ContextConfiguration(classes = { TestRepositoryConfig.class })
@Transactional
public class OrderRepositoryTest {
	@Autowired
	private OrderRepository repo;

	@Autowired
	private SessionFactory sessionFactory;

	@Test
	public void testPersistOrderWithItems() {
		Session s = sessionFactory.getCurrentSession();

		Product chestnut = new Product("Chestnut", "2.50");
		s.save(chestnut);
		Product hazelnut = new Product("Hazelnut", "5.59");
		s.save(hazelnut);

		Order order = new Order();
		order.addLine(chestnut, 20);
		order.addLine(hazelnut, 150);

		repo.saveOrder(order);
		s.flush();

		Order persistent = (Order) s.createCriteria(Order.class).uniqueResult();
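		// assertNotSame compares object references, so with boxed numbers it
		// passes even for id 0; compare the values instead
		Assert.assertTrue(persistent.getId() != 0);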
		Assert.assertEquals(new OrderLine(chestnut, 20), persistent
				.getOrderLines().get(0));
		Assert.assertEquals(new OrderLine(hazelnut, 150), persistent
				.getOrderLines().get(1));
	}
}

There are a few details worth noting here, though:

  1. I marked the test @Transactional so that I can access the Session directly. In this scenario, @EnableTransactionManagement on the @Configuration seems to have no effect, as the test is wrapped in a transaction anyway.
  2. If the test is not marked as @Transactional (sensible when it only uses @Transactional components), the transaction seems to always be committed regardless of @TransactionConfiguration settings.
  3. If the test is marked as @Transactional, @TransactionConfiguration seems to be applied by default. Even if it’s omitted, the transaction will be rolled back at the end of the test, and if you want it committed you need @TransactionConfiguration(defaultRollback=false).
  4. This probably goes without saying, but the @Configuration for tests is probably different from production. Here it uses an embedded H2 database; for a real application I would use a test database on the same engine as production.

That’s it, just those two Java classes. No XML or twisted dependencies. Take a look at my github repository for complete code.

IO vs. NIO – Interruptions, Timeouts and Buffers

Let’s imagine a system that sometimes needs to copy a file to a few locations, but in a way where responsiveness is critical. In other words, if for some reason a file system is overloaded and we are unable to write our file in less than a second, it should give up.

ExecutorService is a very convenient tool for the job. You can easily use it for executing several tasks in parallel (each writing to a different file system). You can also tell it to give up after some timeout, and it will interrupt the tasks for you. Perfect, just what we need.

The scaffolding looks like this:

void testCopy() throws Exception {
	ThreadPoolExecutor exec = (ThreadPoolExecutor) Executors
			.newCachedThreadPool();
	final long start = System.currentTimeMillis();
	Callable<Object> task = new Callable<Object>() {
		@Override
		public Object call() throws Exception {
			try {
				copy("a.bin", "b.bin");
			} catch (Exception e) {
				e.printStackTrace();
			}
			System.out.println("Call really finished after: "
					+ (System.currentTimeMillis() - start));
			return null;
		}
	};
	Collection<Callable<Object>> taskWrapper = Arrays.asList(task);
	List<Future<Object>> futures = exec.invokeAll(taskWrapper, 50,
			TimeUnit.MILLISECONDS);
	System.out.println("invokeAll finished after: "
			+ (System.currentTimeMillis() - start));
	System.out.println("Future.isCancelled? "
			+ futures.get(0).isCancelled());
	Thread.sleep(20);
	System.out.println("Threads still active: " + exec.getActiveCount());
}

To simulate the response to timeouts on a healthy system with low load, I use a 100 MB file and a very short timeout. The task always times out: there is no way my system can copy 100 MB in 50 ms.

I expect the following results:

  1. invokeAll finished after about 50 ms.
  2. Future.isCancelled? is true.
  3. Active thread count is 0. The sleep is there to eliminate some edge cases. Long story short, it gives the copy function some time to detect the interruption.
  4. Call really finishes after about 50 ms. This is very important, I definitely do not want the IO operations to continue after the task is cancelled. Under higher load that would breed way too many threads stuck in bogus IO.

Just in case, those tests were run on the 1.6 JVM from Oracle on 64-bit Windows 7.

Solution 1: Stream Copy

The first attempt is probably the most straightforward: use a loop with a buffer and classic IO, like this:

private void copy(String in, String out) throws Exception {
	FileInputStream fin = new FileInputStream(in);
	try {
		FileOutputStream fout = new FileOutputStream(out);
		try {
			byte[] buf = new byte[4096];
			int read;
			// read() returns -1 at the end of the stream
			while ((read = fin.read(buf)) > -1) {
				fout.write(buf, 0, read);
			}
		} finally {
			fout.close();
		}
	} finally {
		// Close the streams even if the copy fails halfway
		fin.close();
	}
}

That’s what all popular stream copying libraries do, including IOUtils from Apache Commons and ByteStreams from Guava.

It also fails miserably:

invokeAll finished after: 53
Future.isCancelled? true
Threads still active: 1
Call really finished after: 338

The reason is fairly obvious: there is no check for thread interrupted status in the loop or anywhere, so the thread continues normally.

Solution 2: Stream Copy with Check for Interruption

Let’s fix that! One way to do it is:

while ((read = fin.read(buf)) > -1) {
	fout.write(buf, 0, read);
	if (Thread.interrupted()) {
		throw new IOException("Thread interrupted, cancelling");
	}
}

Now that works as expected, printing:

invokeAll finished after: 52
java.io.IOException: Thread interrupted, cancelling
	at TransferTest.copyInterruptingStream(TransferTest.java:75)
	at TransferTest.access$0(TransferTest.java:66)
	at TransferTest$1.call(TransferTest.java:25)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)Future.isCancelled? true
	at java.lang.Thread.run(Thread.java:662)

Call really finished after: 53
Threads still active: 0

Nice, but I find it unsatisfactory. It looks dirty and I’m not particularly happy with having this code around my IO lib. There must be a better way, which brings us to…

Solution 3: NIO with transfer

NIO has this nice feature that it actually respects thread interruptions. If you try to read from or write to a channel after the thread has been interrupted, you get a ClosedByInterruptException.
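
To see this behaviour in isolation, here is a minimal sketch. The class and method names are made up for illustration and are not part of the test code above. It sets the interrupt flag on the current thread, the way a timed-out executor would on its worker, and then attempts a write on a FileChannel.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.FileChannel;

// Hypothetical demo class, not part of the original test code.
class InterruptedChannelDemo {

	/**
	 * Returns true if a write attempted with the interrupt flag set is
	 * rejected with ClosedByInterruptException and the channel ends up closed.
	 */
	static boolean writeFailsWhenInterrupted() throws IOException {
		File tmp = File.createTempFile("interrupt-demo", ".bin");
		tmp.deleteOnExit();
		FileOutputStream stream = new FileOutputStream(tmp);
		FileChannel channel = stream.getChannel();
		try {
			// Simulate what the executor does to its worker on timeout
			Thread.currentThread().interrupt();
			channel.write(ByteBuffer.wrap(new byte[] { 1, 2, 3 }));
			return false; // the write went through, interruption was ignored
		} catch (ClosedByInterruptException e) {
			// The channel noticed the flag and closed itself
			return !channel.isOpen();
		} finally {
			Thread.interrupted(); // clear the flag so later code is not affected
			stream.close();
		}
	}

	public static void main(String[] args) throws IOException {
		System.out.println("Write rejected: " + writeFailsWhenInterrupted());
	}
}
```

Note that the channel is closed as a side effect, so it cannot be reused after the interruption.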

That’s just what I need. For some reason I also read this answer at StackOverflow, saying:

“Don’t use a buffer if you don’t need to. Why copy to memory if your target is another disk or a NIC? With larger files, the latency incured is non-trivial. (…) Use FileChannel.transferTo() or FileChannel.transferFrom(). The key advantage here is that the JVM uses the OS’s access to DMA (Direct Memory Access), if present. (This is implementation dependent, but modern Sun and IBM versions on general purpose CPUs are good to go.) What happens is the data goes straight to/from disc, to the bus, and then to the destination…by passing any circuit through RAM or the CPU.”

Great, let’s do it!

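private void copy(String in, String out) throws Exception {
	FileChannel fin = new FileInputStream(in).getChannel();
	try {
		FileChannel fout = new FileOutputStream(out).getChannel();
		try {
			// Ask the OS to move the whole file in a single call
			fout.transferFrom(fin, 0, new File(in).length());
		} finally {
			fout.close();
		}
	} finally {
		// Closing the channel also closes the underlying stream
		fin.close();
	}
}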

Output:

invokeAll finished after: 52
Future.isCancelled? true
Threads still active: 1
java.nio.channels.ClosedByInterruptException
	at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
	at sun.nio.ch.FileChannelImpl.size(FileChannelImpl.java:304)
	at sun.nio.ch.FileChannelImpl.transferFrom(FileChannelImpl.java:587)
	at TransferTest.copyNioTransfer(TransferTest.java:91)
	at TransferTest.access$0(TransferTest.java:87)
	at TransferTest$1.call(TransferTest.java:27)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
Call really finished after: 146

All I do is a trivial call to transferFrom. It’s very concise, and promises so much support from hardware and OS… But wait a moment, why did it take 146 ms? I mean, 146 milliseconds is much faster than 338 ms in the first test, but I expected it to terminate after around 50 ms.

Let’s repeat the test on a bigger file, something around 1.5 GB:

invokeAll finished after: 9012
Future.isCancelled? true
Threads still active: 1
java.nio.channels.ClosedByInterruptException
	at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
	(...)
Call really finished after: 9170

How awful is that? This is probably the worst thing that could happen:

  • The task was not interrupted in a timely manner. 9 seconds is way too long, I expected around 50 millis.
  • invokeAll was blocked for the entire time of the operation – 9 seconds. What the hell?
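
A possible middle ground, not measured in the tests above, is to keep transferFrom but call it in bounded chunks, so that no single call covers the whole file. The sketch below assumes that each call checks the interrupt flag when it starts, as the stack trace above suggests. The class name and the 1 MB chunk size are arbitrary choices made for illustration.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.channels.FileChannel;

// Hypothetical helper, not part of the original test code.
class ChunkedTransfer {
	// Arbitrary chunk size; smaller chunks mean more frequent interrupt checks
	static final long CHUNK = 1024 * 1024;

	static void copy(File in, File out) throws IOException {
		FileInputStream fis = new FileInputStream(in);
		try {
			FileOutputStream fos = new FileOutputStream(out);
			try {
				FileChannel src = fis.getChannel();
				FileChannel dst = fos.getChannel();
				long size = src.size();
				long position = 0;
				while (position < size) {
					// transferFrom may move fewer bytes than requested,
					// so advance by what was actually transferred
					long moved = dst.transferFrom(src, position,
							Math.min(CHUNK, size - position));
					if (moved <= 0) {
						break; // source shrank or nothing could be moved
					}
					position += moved;
				}
			} finally {
				fos.close();
			}
		} finally {
			fis.close();
		}
	}
}
```

Whether this brings the cancellation time close to the timeout depends on how long a single chunk takes on the given system, so it would need the same kind of measurement as the other solutions.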

Solution 4 – NIO with Buffering

It turns out I do need some buffering. Let’s try with this one:

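private void copyNioBuffered(String in, String out) throws Exception {
	FileChannel fin = new FileInputStream(in).getChannel();
	try {
		FileChannel fout = new FileOutputStream(out).getChannel();
		try {
			ByteBuffer buff = ByteBuffer.allocate(4096);
			// Loop until the input is exhausted and the buffer is drained
			while (fin.read(buff) != -1 || buff.position() > 0) {
				buff.flip(); // switch the buffer from filling to draining
				fout.write(buff);
				buff.compact(); // keep unwritten bytes, go back to filling
			}
		} finally {
			fout.close();
		}
	} finally {
		// Closing the channel also closes the underlying stream
		fin.close();
	}
}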

Output:

invokeAll finished after: 52
Future.isCancelled? true
java.nio.channels.ClosedByInterruptException
	at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
	at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:203)
	at TransferTest.copyNioBuffered(TransferTest.java:105)
	at TransferTest.access$0(TransferTest.java:98)
	at TransferTest$1.call(TransferTest.java:29)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
Call really finished after: 55
Threads still active: 0

Now that’s exactly what I needed. It respects interruptions by itself, so I don’t need those tedious checks all over my IO utility.

Quirks: Different types of channels

If my IO utility is only used for copying files that it gets by name, like this:

static public void copy(String source, String destination)

… then it’s fairly easy to rewrite the method for NIO.

But what if it’s a more generic signature that operates on streams?

static public void copy(InputStream source, OutputStream destination)

NIO has a little Channels utility with very useful methods like:

public static ReadableByteChannel newChannel(InputStream in)
public static WritableByteChannel newChannel(OutputStream out)

So it almost seems like we could wrap our streams using this helper and benefit from interruptible NIO API. Until we look at the source:

public static WritableByteChannel newChannel(final OutputStream out) {
	if (out == null) {
	    throw new NullPointerException();
	}

	if (out instanceof FileOutputStream &&
		FileOutputStream.class.equals(out.getClass())) {
		return ((FileOutputStream)out).getChannel();
	}

	return new WritableByteChannelImpl(out);
}

private static class WritableByteChannelImpl
	extends AbstractInterruptibleChannel	// Not really interruptible
	implements WritableByteChannel
{
// ... Ignores interrupts completely

Watch out! If your streams are file streams, they will be interruptible. Otherwise you’re out of luck – it’s just a dumb wrapper, more like an adapter for API compatibility. Assumptions kill, always check the source.
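
For arbitrary streams, one fallback is the manual check from Solution 2, kept in a single utility method. The following is only a sketch: the class name is hypothetical, and the choice of InterruptedIOException (a standard java.io exception) instead of a plain IOException is an assumption about what callers would prefer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InterruptedIOException;
import java.io.OutputStream;

// Hypothetical utility, not part of the original code.
class InterruptibleStreams {

	/** Copies source to destination, giving up between chunks when interrupted. */
	static long copy(InputStream source, OutputStream destination)
			throws IOException {
		byte[] buf = new byte[4096];
		long total = 0;
		int read;
		while ((read = source.read(buf)) != -1) {
			destination.write(buf, 0, read);
			total += read;
			// isInterrupted() leaves the flag set, so the caller can still see
			// that the thread was interrupted
			if (Thread.currentThread().isInterrupted()) {
				throw new InterruptedIOException(
						"Interrupted after " + total + " bytes");
			}
		}
		return total;
	}

	public static void main(String[] args) throws IOException {
		ByteArrayOutputStream out = new ByteArrayOutputStream();
		long copied = copy(new ByteArrayInputStream(new byte[10000]), out);
		System.out.println("Copied bytes: " + copied);
	}
}
```

This only reacts between chunks. A read or write that blocks inside the stream itself, for example on a socket, will still not notice the interruption.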

Learning to Fail

Back at university, when I dealt with a lot of low-level problem solving and very basic libraries and constructs, I learned to pay attention to what can possibly go wrong. A lot. Implementing reliable, hang-proof communication over plain sockets? I remember it today, a trivial loop of “core logic” and a ton of guards around it.

Now I suspect I am not the only person who got used to all the convenient higher-level abstractions so much that he began to forget this approach. Thing is, real software is a little bit more complex, and the fact that our libraries kind of deal with most low-level problems for us doesn’t mean there are no reasons to fail.

Software

As I’m reading “Release It!” by Michael T. Nygard, I keep nodding in agreement: Been there, done this, suffered that. I’ve just started, but it’s already shown quite a few interesting examples of failure and error handling.

Michael describes a spectacular outage of airline software. Its experienced designers expected many kinds of failures and avoided many obvious issues. There was a nice layered architecture, with proper redundancy on every level from clients and terminals, through servers, to the database. All was well, yet during routine database maintenance the entire system just hung. It did not kill anyone, but delayed flights and serious financial losses have an impact too.

The root cause turned out to be one swallowed exception on the servers talking to the database, thrown by the JDBC driver when the virtual IP of the database server was remapped. If you don’t have proper handling for such situations, one such leak can lock the entire server as all of its threads wait for the connection or for each other. Since there were no proper timeouts anywhere in the server or above, eventually everything hung.

Now it’s easy to say: It’s obvious, thou shalt not swallow exceptions, you moron, and walk on. Or is it?

The thing is, an unexpected or improperly handled error can always happen. In hardware. Or a third party component. Or core library of your programming language. Or even you or your colleague can screw up and fail to predict something. It. Just. Happens.

Real Life

Let’s take a look at two examples from real life.

Everyone gets in the car thinking: I’m an awesome driver, accidents happen but not to me. Yet somehow we are grateful for having airbags, carefully designed crumple zones, and all kinds of automatic systems that prevent or mitigate effects of accidents.

If you were offered two cars at the same cost, which would you choose? One is in pimp-my-ride style with extremely comfortable seats, sat TV, bright pink wheels and whatever unessential features. But it breaks down every so often based on its mood or the moon cycle, and would certainly kill you if you hit a hedgehog. The other is just comfortable enough, completely boring, no cool features to show off at all. But it will serve you 500,000 kilometers without a single breakdown and save your life when you hit a tree. Obvious, right?

Another example. My brother-in-law happens to be a construction manager at a pretty big power plant. He recently took me on a trip and explained some basics on how it works, and one thing really struck me.

The power station consists of a dozen separate generation units and is designed to survive all kinds of failures. I was impressed, and still am, that in the power plant business it’s normal to say stuff like: If this block goes dark, this and this happens, that one takes over, whatever. No big deal. Let’s put it in perspective. A damn complicated piece of engineering that can detect any potentially dangerous conditions, alarm, shut down and fail over just like that. From small and trivial things like variations in pressure or temperature, through to conditions that could blow the whole thing up. And it is so reliable that when people talk about such conditions, rare and severe as they are, they say it in the same tone as “in case of rain the picnic will be held at Ms. Johnson’s”.

Software Again

In his “After the Disaster” post, Uncle Bob asked: “How many times per day do you put your life in the hands of an ‘if’ statement written by some twenty-two year old at three in the morning, while strung out on vodka and redbull?”

I wish it was a rhetorical question.

We are pressed hard to focus on adding shiny new features, as fast as possible. That’s what makes our bosses and their bosses shine and what brings money to the table. But not only them, even we (the developers) naturally take most pride in all those features and find them the most exciting part of our work.

Remember that we’re here to serve. While pumping out features is fun, remember that those people simply rely on you. Even if you don’t directly cause death or injury, your outages can still affect lives. Think more like a car or power station designer; your position is really closer to theirs than to a lone hippie who’s building a little wobbly shack for himself.

When an outage happens and also causes financial loss, you will be to blame. If that reasoning does not work, do it for yourself – pay attention now to avoid pain in future, be it regular panic calls at 3 AM or your boss yelling at you.

More Stuff

Michael T. Nygard ends that airline example with very valuable advice. Obvious as it may seem, it feels different if you realize it and engrave it deep in your mind. Expect failure everywhere, and plan for it. Even if your tools handle some failures, they can’t do everything for you. Even if you have at least two of each thing (no single point of failure), you can still suffer from bad design. Be paranoid. Place crumple zones on every integration point with other systems, and even different components of your system, in order to prevent cracks from propagating. Optimistic monoliths fail hard.
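
As one small illustration of such a crumple zone, a call to another system can be wrapped in a timeout with a fallback, so that a hung dependency costs one bounded wait instead of a stuck thread. This is a sketch under assumptions rather than a pattern taken from the book: the class, the method name and the fallback strategy are hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper illustrating a timeout guard around an integration point.
class GuardedCall {

	static <T> T callWithTimeout(ExecutorService pool, Callable<T> call,
			long timeoutMillis, T fallback) throws InterruptedException {
		Future<T> future = pool.submit(call);
		try {
			return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
		} catch (TimeoutException e) {
			future.cancel(true); // interrupt the worker instead of waiting forever
			return fallback;
		} catch (ExecutionException e) {
			// Report the failure instead of swallowing it, then degrade
			System.err.println("Guarded call failed: " + e.getCause());
			return fallback;
		}
	}

	public static void main(String[] args) throws Exception {
		ExecutorService pool = Executors.newCachedThreadPool();
		try {
			String result = callWithTimeout(pool, new Callable<String>() {
				public String call() throws Exception {
					Thread.sleep(5000); // stands in for a hung remote call
					return "real answer";
				}
			}, 100, "fallback");
			System.out.println(result);
		} finally {
			pool.shutdownNow();
		}
	}
}
```

The guard only helps if the guarded code actually responds to interruption; otherwise the worker thread stays busy even though the caller has moved on.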

Want something more concrete? Go read “Release It!”, it’s full of great and concrete examples. There’s a reason why it fits in a book and not in a blog post.