Tag Archives: Code Quality

Validating Class/Package Dependencies with Classycle

Classycle is a very nice analyzer and checker of class and package dependencies.

It lets you define package groups (components, layers) and express unwanted dependencies such as cycles, or dependencies between particular packages. For example you can specify that you want no package cycles and no dependencies from com.foo.domain.* on com.foo.api.*. All using a very human-friendly, concise format.
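For illustration, a definition file along these lines would capture both of those rules (a sketch from memory; the set names and packages are made up):

[domain] = com.foo.domain.*
[api] = com.foo.api.*

check absenceOfPackageCycles > 1 in com.foo.*
check [domain] independentOf [api]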

Then you kick off the analyzer (it comes with an Ant task and a standalone command line tool) and it produces a report with violations.

There are a number of other tools out there: JDepend, Sonar, JArchitect and so on. So why Classycle?

  • It’s free (BSD license).
  • It’s fast.
  • It’s powerful and expressive. The rules take just a few lines of easily readable text.
  • It integrates with build tools very well. We have it running as part of the build script, for every build. It’s really just another automated test. Thanks to that the project structure is probably the cleanest one I’ve worked with so far.

Gradle Plugin

Thanks to its Ant task, Classycle is very easy to integrate with Gradle, with one caveat: the official build is not in Maven Central, and the only build that is there does not include the Ant task.

Gradle itself uses Classycle via a script plugin, buried somewhere in its project structure. They published Classycle to their own repository, but it’s an older version that doesn’t support Java 8.

Inspired by that, we wrote our own plugin to make Classycle available to everyone with minimum effort. It’s available on the Gradle Plugin Portal and on GitHub.

In order to use it, all you need is:

  • Add the plugin to your project:
    plugins { id "pl.squirrel.classycle" version "1.1" }
    
  • Create a Classycle definition file for each source set you want covered, in src/test/resources/classycle-${sourceSet.name}.txt:

    show allResults
    
    {package} = com.example
    check absenceOfPackageCycles > 1 in ${package}.*
    
  • Congratulations, that’s all it takes to integrate Classycle with your Gradle build! Now you have the following tasks:
    # For each source set that has the dependency definition file:
    classycleMain, classycleTest, ... 
    
    # Analyze all source sets in one hit:
    classycle
    
    # Also part of the check task:
    check
    

See Plugin Portal and GitHub for more information. Happy validating!

Domain Modeling: Naive OO Hurts

I read a post recently on two ways to model the data of a business domain. My memory tells me it was by Ayende Rahien, but I can’t find it on his blog.

One way is full-blown object-relational mapping. Entities reference each other directly, and the O/R mapper automatically loads data for you as you traverse the object graph. To obtain Product for an OrderLine, you just call line.getProduct() and are good to go. Convenient and deceptively transparent, but can easily hurt performance if you aren’t careful enough.

The other way is what that post may have called a document-oriented mapping. Each entity has its ID and its own data. It may have some nested entities if it’s an aggregate root (in domain-driven design terminology). In this case, OrderLine only has productId, and if you want to get the product you have to call ProductRepository.getProduct(line.getProductId()). It’s a bit less convenient and requires more ceremony, but thanks to its explicitness it also is much easier to optimize or avoid performance pitfalls.
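In code, the difference boils down to something like this (a rough sketch with made-up names):

class OrderLine {
    // Full-blown ORM: a direct object reference; the mapper loads it
    // transparently when someone calls getProduct()
    private Product product;

    // Document/aggregate style: only the identifier is stored; callers
    // must ask a repository explicitly for the Product
    private long productId;
}

In practice you would pick one of the two fields, of course; the point is what the reference looks like and who does the loading.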

So much for the aforementioned post. I recently had an opportunity to reflect more on this matter with a real-world example.

The Case

The light dawned when I set out to create a side project for a fairly large system that has some 200+ Hibernate mappings and about 300 tables. I knew I only needed some 5 core tables, but for the sake of consistency and avoiding duplication I wanted to reuse mappings from the big system.

I knew there could be more dependencies on things I didn’t need, and I did not have a tool to generate a dependency graph. So I just included the first mapping, watched Hibernate errors for unmapped entities, added the missing mappings, checked the error log again… And so on, until Hibernate was happy and knew all the referenced classes.

When I finished, the absolutely minimal and necessary “core” in my side project had 110 mappings.

As I was adding them, I saw that most of them were pretty far from the core and from my needs. They corresponded to little subsystems somewhere on the rim.

It felt like running a strong magnet over a messy workplace full of all kinds of metal things when all I needed was two nails.

Pain Points

It turns out that such object orientation brings more pain than good. Having unnecessary dependencies in a spin-off that reuses the core is just one pain point; there are more.

It also makes my side project slower and more resource-hungry – I have to map 100+ entities and support them in my second-level cache. When I load some of the core entities, I also pull in many things I don’t need: numerous fields used only in narrow contexts, even entire eagerly-loaded entities. At all times there is too much data floating around.

Such a model also makes development much slower. The build and tests take longer, because there are many more tables to generate, mappings to scan, and so on.

It’s also slower for another reason: if a domain class references 20 other classes, how does a developer know which are important and which are not? In any case, it leads to very long and somewhat unpleasant classes. What should be the core becomes a gigantic black hole sucking in the entire universe. When an unaware newbie goes near, most of the time he will either sink trying to understand everything, or simply break something – unaware of all the links in his context, unable to understand all the relationships present in the class. Actually, even seniors can be deceived into making such mistakes.

The list is probably much longer.

Solution?

There are two issues here.

How did that happen?

I’m writing a piece of code that’s pretty distant from the core, but could really use those two new attributes on this core entity. What is the fastest way? Obvious: Add two new fields to the entity. Done.

I need to add a bunch of new entities for a new use case, strongly related to a core entity. The shortest path? Easy, just reference a few entities from the core. When I need those new objects and I already have the old core entity, Hibernate will do the job of loading the new entities for me as I call the getters. Done.

It sounds natural, and I can see how I could have made such mistakes a few years ago, but the trend could have been stopped or even reversed. With proper code reviews and retrospectives, the team might have found a better way earlier. With some slack and good will, it might even have refactored the existing code.

Is there a better way to do it?

Let’s go back to the opening section on two ways to map domain classes: “Full-blown ORM” vs. document/aggregate style.

Today I believe full-blown ORM may be a good thing for a fairly small project with a few closely related use cases. As soon as we branch out new, bigger chunks of functionality and introduce more objects, they should become their own aggregates. They should never be referenced from the core, even though they themselves may orbit around it and have a direct link to the core. The same is true for the attributes of core entities: if something is needed in a faraway use case, don’t spoil the core mapping with a new field. Introduce a new entity if necessary.
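A sketch of what that might look like (class names are made up): the satellite aggregate points at the core by ID, while the core entity stays untouched.

// Core entity: knows nothing about faraway use cases
class Order {
    private long id;
    // no back-references, no fields added for remote subsystems
}

// Satellite aggregate: orbits the core and links to it by ID
class Invoice {
    private long id;
    private long orderId; // resolved explicitly via a repository
}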

In other words, learn from domain-driven design. If you haven’t read the book by Eric Evans yet, go do it now. It’s likely the most worthwhile and influential software book I’ve read to date.

“Release It!”

A while ago I wrote a post on Learning to Fail, inspired largely by Michael T. Nygard’s book “Release It!”. Now it’s time to review the book itself.

As the subtitle says, the book is all about designing and deploying production-ready software. It opens with a great introduction on why that really matters: because software is often critical to the business; because its reliability and performance are really our job and a matter of professionalism; and finally (if that’s not enough), because its behavior in production has a huge impact on our quality of life as well – the difference between panic attacks and a phone ringing at 4 AM, or software Just Working by itself, letting you live a healthy life and do more fun stuff at work. That’s the center of mass here, by the way: more on development and operations, less on management and business.

The book is divided into four main areas. Each starts with a bit of theoretical introduction and/or an anecdote, followed by discussion of concrete phenomena, problems and solutions. Even though it might appear as a collection of patterns and antipatterns, it’s much more than that. Patterns and antipatterns are just a form, but it’s really about setting the focus for a few pages and naming the problem. Anyway, the “pattern” and “antipattern” concept is gone by the middle of the book.

The first part talks about stability, and how it’s impacted by error propagation, lack of timeouts, all kinds of poor error handling, weak links, etc. Then it shows solutions: how to stop errors from propagating; how to be paranoid, expect failure in every integration point (with third parties and not) and deal with it; how to fail fast. And so on.

The second part talks about capacity: dealing with load, understanding constraints and making predictions. Impact from seasonal phenomena or ad campaigns. Strange and non-obvious usage patterns – users hammering the “refresh” button, web scrapers, etc. Finally, dealing with those issues through proper use of caching, pooling, precomputing content and tuning garbage collection.

The third part is a grab bag of all kinds of design issues: networking, security, availability (understanding and defining requirements, followed by load balancing and clustering), testing and administration.

The last part is all about operations: logging, monitoring, transparency, releasing, that kind of stuff. How to organize it so that routine maintenance is less painful, monitoring lets us detect issues early, and finally, after or during an incident, we have enough information to diagnose it.

Some problems are discussed from a bird’s-eye view. Most are more down-to-earth, providing a detailed discussion of an issue along with a sketched solution and its strong and weak points. Finally, when applicable, the author rolls up his sleeves and talks about concrete code, SQL, heap dumps, scripting, etc.

The book is actually full of real war stories, anecdotes, code samples, tool descriptions, case studies, and all kinds of concrete content. There are a few larger stories that go like this: on this project the team did this, this and that in order to mitigate such and such risks. When marketing sent out an advert, or when the system was launched, or during routine maintenance, this and this broke and started causing problems. We took heap dumps, monitored traffic and contents, read or decompiled the code, and discovered problems here and there. Finally, we solved them with this and that. And then comes the detailed list of trouble spots and ways to mitigate them. It’s really a complete view – from business perspective and needs, down to nitpicking about a particular piece of code or discussing popular tools.

Apart from being a great collection of real problems and tricks, there is one longer-lasting, recurring theme that may be the most valuable lesson here. Michael T. Nygard regularly shows (and makes you feel it deep in your guts, especially if you’ve done some maintenance in production) that you really should expect failure everywhere, at all times. You should try to predict and mitigate as many issues as possible, as early as possible. You should be paranoid. More than that: embrace the fact that you will fail to predict everything, and design so that even random, unpredictable failures won’t take you down and will be easier to solve.

All the time it’s very concrete and complete. It also feels very professional, genuine and even inspiring.

Highly recommended.

Learning to Fail

Back at university, when I dealt with much low-level problem solving and very basic libraries and constructs, I learned to pay attention to what can possibly go wrong. A lot can. Implementing reliable, hang-proof communication over plain sockets? I remember it today: a trivial loop of “core logic” and a ton of guards around it.

Now I suspect I am not the only person who got so used to all the convenient higher-level abstractions that he began to forget this approach. The thing is, real software is a little more complex, and the fact that our libraries deal with most low-level problems for us doesn’t mean there are no reasons to fail.

Software

As I’m reading “Release It!” by Michael T. Nygard, I keep nodding in agreement: Been there, done this, suffered that. I’ve just started, but it’s already shown quite a few interesting examples of failure and error handling.

Michael describes a spectacular outage of an airline’s software. Its experienced designers expected many kinds of failures and avoided many obvious issues. There was a nice layered architecture, with proper redundancy on every level: clients and terminals, servers, the database. All was well, yet during routine database maintenance the entire system just hung. It did not kill anyone, but delayed flights and serious financial losses have an impact too.

The root cause turned out to be one swallowed exception on the servers talking to the database, thrown by the JDBC driver when the virtual IP of the database server was remapped. If you don’t have proper handling for such situations, one such leak can lock up the entire server as all of its threads wait for the connection or for each other. Since there were no proper timeouts anywhere in the server or above it, eventually everything hung.
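For illustration, here is a minimal sketch of the same kind of integration point written more defensively – bounded waits and no swallowed exceptions. The pool API and names here are hypothetical; setQueryTimeout is standard JDBC.

Connection conn = null;
try {
    // bound the wait -- never block a thread forever on a checkout
    conn = connectionPool.checkOut(5, TimeUnit.SECONDS);
    PreparedStatement stmt = conn.prepareStatement(QUERY);
    stmt.setQueryTimeout(5); // fail fast instead of hanging
    stmt.executeQuery();
} catch (SQLException e) {
    // propagate -- a silently swallowed exception here is exactly
    // what locked up those servers
    throw new IllegalStateException("Database call failed", e);
} finally {
    if (conn != null) {
        connectionPool.checkIn(conn);
    }
}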

Now it’s easy to say: it’s obvious, thou shalt not swallow exceptions, you moron – and walk on. Or is it?

The thing is, an unexpected or improperly handled error can always happen. In hardware. Or a third party component. Or core library of your programming language. Or even you or your colleague can screw up and fail to predict something. It. Just. Happens.

Real Life

Let’s take a look at two examples from real life.

Everyone gets in the car thinking: I’m an awesome driver, accidents happen but not to me. Yet somehow we are grateful for having airbags, carefully designed crumple zones, and all kinds of automatic systems that prevent or mitigate effects of accidents.

If you were offered two cars at the same price, which would you choose? One is in pimp-my-ride style, with extremely comfortable seats, satellite TV, bright pink wheels and all kinds of unessential features – but it breaks down every so often depending on its mood or the moon cycle, and would certainly kill you if you hit a hedgehog. The other is just comfortable enough and completely boring, with no cool features to show off at all – but it will serve you for 500,000 kilometers without a single breakdown and save your life when you hit a tree. Obvious, right?

Another example. My brother-in-law happens to be a construction manager at a pretty big power plant. He recently took me on a trip and explained some basics on how it works, and one thing really struck me.

The power station consists of a dozen separate generation units and is designed to survive all kinds of failures. I was impressed, and still am, that in the power plant business it’s normal to say things like: if this block goes dark, this and this happens, that one takes over, no big deal. Let’s put it in perspective: a damn complicated piece of engineering that can detect any potentially dangerous condition, alarm, shut down and fail over, just like that. From small and trivial things like variations in pressure or temperature, up to conditions that could blow the whole thing up. And it is so reliable that when people talk about such conditions, rare and severe as they are, they say it in the same tone as “in case of rain the picnic will be held at Ms. Johnson’s”.

Software Again

In his “After the Disaster” post, Uncle Bob asked: “How many times per day do you put your life in the hands of an ‘if’ statement written by some twenty-two year old at three in the morning, while strung out on vodka and redbull?”

I wish it was a rhetorical question.

We are pressed hard to focus on adding shiny new features, as fast as possible. That’s what makes our bosses and their bosses shine and what brings money to the table. But not only them, even we (the developers) naturally take most pride in all those features and find them the most exciting part of our work.

Remember that we’re here to serve. While pumping out features is fun, remember that people simply rely on you. Even if you don’t directly cause death or injury, your outages can still affect lives. Think more like a car or power station designer: your position is really closer to theirs than to a lone hippie building a little wobbly shack for himself.

When an outage happens and causes financial loss, you will be the one to blame. If that reasoning doesn’t work, do it for yourself – pay attention now to avoid pain in the future, be it regular panic calls at 3 AM or your boss yelling at you.

More Stuff

Michael T. Nygard ends the airline example with a very valuable piece of advice. Obvious as it may seem, it feels different once you realize it and engrave it deep in your mind. Expect failure everywhere, and plan for it. Even if your tools handle some failures, they can’t do everything for you. Even if you have at least two of each thing (no single point of failure), you can still suffer from bad design. Be paranoid. Place crumple zones at every integration point with other systems – and even between components of your own system – to prevent cracks from propagating. Optimistic monoliths fail hard.

Want something more concrete? Go read “Release It!”, it’s full of great and concrete examples. There’s a reason why it fits in a book and not in a blog post.

Culture Kills (or Wins)

This post is about one of the things that everyone is aware of to some degree. It feels familiar, but the picture becomes a lot sharper once you put it in a proper perspective.

The Project

There is an older project created a few years ago, perhaps in 2000 or 2005. At the time it was running a single service on a single server for 100 users. The architecture and tools were adequate: One project, one interface, one process. Some shortcuts – that’s fine, you need to get something out to get the money in. Now-standard tools and techniques like TDD, IoC / DI, app servers etc. were nowhere to be seen – either because they were not needed at that time, or they did not even exist in a reasonable form (Spring, Guice, JEE in 2000 or 2005?).

Five or ten years later, the load went up by a few orders of magnitude. So did the number of features. The codebase has been growing and growing, and new features have been added all the time.

Now, let’s consider two situations.

Bad Becomes Worse

What if the architecture and process remained the same from the very beginning? Single project (now many megabytes of source code). Single process with core logic driven by eight-KLOC-long abominations. Many interesting twists related to “temporary” shortcuts from the past. No IoC. No DI. No TDD. Libraries from 2000 or 2005. No agile.

It could be for different reasons. Maybe the developers are poor, careless souls who have not improved themselves in all those years and are not aware that different ways exist (or neglect them). Or maybe they know there are better ways, but are drowning in feature requests.

There is little rotation in the team. A few persistent core members remain, able to navigate this sea of spaghetti. They are still able to pump out new features – slower and slower, but still. That’s probably the only thing keeping the project alive. They neglect change because the thing still kind of works. Why fix something that is not broken? Why spend money on improving something that’s working?

Now we have a fairly fossilized team and project. I dare say in this shape it can only get worse. Even if the product itself was somewhat interesting and stable, would you like to change your job to join the team and work on it? Dealing with tons of legacy code, in a culture that fears change, with no modern tools at your disposal? With no way to learn anything?

Right.

Very good developers usually already have a job, and there is no way they would quit it for this. You will not be able to hire them unless you pay an insane amount of money. Even then, money is not as good a motivator as genuine passion and interest.

Who would do it? Only people with poor skills or little experience, the desperate, or those who don’t give a damn. They will take forever to get up to speed in this ever-growing mudball. And because they’re not top class, chances are the project won’t get any better.

We get a nasty negative feedback loop. Bad code and culture encourages more bad code. Mediocre new hires make it even worse. And so the spiral continues.

Good Gets Better

In the second scenario, the team has some caring craftsmen. They constantly read, learn, think, explore and experiment. They observe their product and process and improve them as they recognize more adequate tools and techniques. Some time along the path they broke it down into modules and instead of a monolithic mudball got an extensible service-oriented architecture. They understood inversion of control and brought it in together with a DI container, refactoring the god classes before they grew out of control. They got higher test coverage. In short, they constantly evaluate what and how they’re doing, how they can make it better, and put that to practice.

Now this team can hire pretty much anyone they like. They may decide to hire inexperienced people with the right attitude and train them. But they are also able to attract passionate people who are well above average and who will make things even better.

It creates a sweet positive feedback loop. Great culture never loses its edge, and it attracts people who can only make it better.

Quality Matters

That’s why quality and refactoring matter. It’s OK to take a shortcut to get something out. It’s OK to use basic tools for a basic job. But if you never improve, the project will stagnate and rot away.

Sure, fulfilling business needs is important. Having few bugs is important. Avoiding damage (to people or the business) is important. But in the long run if you just keep cranking out features and never retrospect or pay down technical debt, it will become a nasty ever-slowing grind. If you’re lucky, it will just get slower and require some babysitting in production and emergency bug fixing. If you’re less lucky, it will become inoperable and completely unmaintainable if some of the persistent spaghetti wranglers leave or are hit by a truck.

Are We Doomed?

To end this sermon on a positive note: while the feedback loops are strong, they are not unbreakable. Culture change is hard in either direction, but possible. If the “bad” team or its management realizes the situation in time and starts improving, they may be able to shift to the positive loop: introduce slack or retrospectives, start discussion, and improve slowly but regularly. And if for whatever reason you abandon good practices, letting leaders go or drowning them up to the neck in work, it will start a drift towards the negative loop.

Careful with that PrintWriter

Let’s assume we have a nice abstraction over email messages, simply called Email: something that can write itself to a Writer and provides some other functionality (e.g. operating on the inbox).

Now, take this piece of code:

public void saveEmail(Email email) {
	FileOutputStream fos = null;
	try {
		fos = new FileOutputStream("file.txt");
		PrintWriter pw = new PrintWriter(fos);
		email.writeContentTo(pw);
		pw.flush();
		email.deleteMessageFromInbox();
	} catch (IOException e) {
		log.error("Saving email", e);	
	} finally {
		IOUtils.closeQuietly(fos);
	}
}

Question: What happens if for whatever reason the PrintWriter cannot write to disk? Say, your drive just filled up and you can’t squeeze a byte in there?

To my surprise, nothing got logged, and the messages were removed from the inbox.

Unfortunately, I learned it the hard way. This code is only a piece of a larger subsystem, and it took me quite a while to track the problem down. Used to dealing mostly with streams, I skipped this area and checked everything else first. If PrintWriter behaved like a stream, it would throw an exception and deleteMessageFromInbox would be skipped.

All streams around files work like that, so why would this one be different? Well, PrintWriter is completely different. Eventually I checked its Javadoc and saw this:

“Methods in this class never throw I/O exceptions, although some of its constructors may. The client may inquire as to whether any errors have occurred by invoking checkError().”

At first I was furious to find it was all due to yet another inconsistency in core Java. Now I’m only puzzled and a bit angry. Most things in Java work the opposite way – they throw exceptions on every occasion and force you to deal with them explicitly. It’s not unusual to see 3 or 4 of them in a single signature.

One could say it’s all right there in the Javadoc. Correct. But still, it can be confusing to have two fundamentally different approaches to error handling in one corner of the library. If you get used to dealing with one side of it, you may expect the same from the other. Instead, it turns out you always have to look carefully at signatures and docs, because even in core Java there is no single convention.
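For completeness, one way to fix the original snippet is to ask the writer explicitly before touching the inbox:

PrintWriter pw = new PrintWriter(fos);
email.writeContentTo(pw);
pw.flush();
if (pw.checkError()) { // the only error signal PrintWriter gives us
    throw new IOException("Failed to write email to file.txt");
}
email.deleteMessageFromInbox(); // delete only once the write is known good

Inside the original saveEmail, the existing catch block would then log the failure, and the message would stay in the inbox.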

Related: try/catch/throw is an antipattern – interesting read, even if a bit too zealous.

Fill and Print an Array in Clojure

In a post in the “Extreme OO in Practice” series (in Polish), Koziołek used the following example: fill a 3x3x3 matrix with subsequent natural numbers and print it to stdout.

The example in Java is 27 lines of code (if you remove comments). It is later refactored into 55 lines that are supposed to be more readable, but personally I can’t help but gnash my teeth over it.

Anyway, shortly after reading that example, I thought of what it would look like in Clojure. Here it is:

(def array (take 3 (partition 3 (partition 3 (iterate inc 1)))))
(doall (map #(doall (map println %)) array))

That’s it. I know it’s unfair to compare this to Java and say that it’s shorter than your class and main() declaration. But it is, and it makes the full-time Java programmer in me weep. Anyway, I have two observations.

As I like to point out, Java is awfully verbose and unexpressive. Part of the story is strong typing and the nature of imperative programming. But then again, sometimes imperative programming is very clumsy.

Secondly, this solution in Clojure presents a completely different approach to the problem. In Java you would declare an array and then iterate over it in nested loops, incrementing the “current value” in every innermost iteration. In Clojure, you start with a lazy, infinite sequence of numbers, partition (group) it into blocks of three (the rows), then group those blocks into a 2D array, and take 3 of those 2D blocks as the final 3D matrix. The same goes for printing the array: you don’t iterate over individual cells, but naturally invoke a function over a sequence.
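For comparison, the imperative Java version boils down to something like this (a sketch, not the original 27-line example):

int[][][] array = new int[3][3][3];
int value = 1; // the "current value"
for (int i = 0; i < 3; i++)
    for (int j = 0; j < 3; j++)
        for (int k = 0; k < 3; k++)
            array[i][j][k] = value++;

for (int i = 0; i < 3; i++)
    for (int j = 0; j < 3; j++) {
        for (int k = 0; k < 3; k++)
            System.out.print(array[i][j][k] + " ");
        System.out.println();
    }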

The difference is subtle, but very important. Note how the Java code needs 4 variables for the filling (the “current value” and 3 indexes) and 3 indexes for the printing. There is a very clear distinction between the data and what is happening to it. In Clojure you don’t need to bother with such low-level details. Code is data. Higher-order functions naturally “serve as” data in the first line, and are used to iterate over this data in the second.

Java Does Not Need Closures. Not at All.

I hope that most Java developers are aware of the Swing threading policy. It makes sense, except that it’s so insanely hard to follow. And if you do follow it, I dare say there is no place with more ridiculous boilerplate.

Here are two examples that recently got me scratching my head, gnashing my teeth, and eventually weeping in disdain.

Case One: Display Download Progress

class Downloader {
    ProgressDisplay progress;
    int n; // bytes downloaded so far, updated by downloadChunk()

    void download() {
        progress.startDownload();
        while (hasMoreChunks()) {
            downloadChunk();
            progress.downloadedBytes(n);
        }
        progress.finishDownload();
    }
}

class ProgressDisplay extends JPanel {
    JLabel label;

    void startDownload() {
        SwingUtilities.invokeLater(new Runnable() {
            public void run() {
                label.setText("Download started");
            }
        });
    }

    void downloadedBytes(final int n) {
        SwingUtilities.invokeLater(new Runnable() {
            public void run() {
                label.setText("Downloaded bytes: " + n);
            }
        });
    }

    void finishDownload() {
        SwingUtilities.invokeLater(new Runnable() {
            public void run() {
                label.setText("Download finished");
            }
        });
    }
}

Solution? Easy! Just write your own JLabel wrapper to hide this mess, as sketched below. Then a lengthy JTable utility. Then a JProgressBar wrapper. Then…
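For illustration, the first of those wrappers might look something like this (a hypothetical sketch; the method name is made up):

class EdtLabel extends JLabel {
    // same as setText, but safe to call from any thread
    void setTextLater(final String text) {
        SwingUtilities.invokeLater(new Runnable() {
            public void run() {
                setText(text);
            }
        });
    }
}

Multiply that by every component and every property you touch from a background thread, and you get the mess this post is about.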

Case Two: Persist Component State

Task: save JTable state to disk. You’ve got to read the table state in the EDT, but writing to disk from the EDT is not the best idea.

Solution:

ExecutorService executor = Executors.newSingleThreadExecutor();

void saveState(final Table table) {
    SwingUtilities.invokeLater(new Runnable() {
        public void run() {
            // read the state in the EDT...
            final TableColumns state = dumpState(table);
            executor.execute(new Runnable() {
                public void run() {
                    // ...and write it to disk off the EDT
                    saveToDisk(table.getTableKey(), state);
                }
            });
        }
    });
}

Two inner classes, inheritance, all those publics and voids and parentheses. And there is no way to avoid that!

In a sane language, it could look like:

(defn save-state [table]
  (do-swing
    (let [state (dump-state table)]
      (future (save-to-disk table state)))))

All semantics. Little ceremony, almost no syntax. If you counted characters in both solutions, I guess the entire second snippet is about as long as the first one up to the invocation of invokeLater – before you even get to the actual logic. No strings attached: do-swing and future are provided by the language or “official” utils.

Bottom Line

I recently saw two presentations by Venkat Subramaniam on polyglot programming in which he mercilessly made fun of Java. In his (vaguely remembered) words: when you write such code all day, you come back home and weep. You can’t sleep, tormented by the thought of the abomination you created and will have to deal with tomorrow. If your children asked you what you do for a living, would you like them to see this?

Do Leave Failing Tests

Thou shalt do thy coding in small increments. No work item may be longer than one day. And if thy work is unfinished at the end of the day, cut it off, and cast it from thee. We have heard it, we believe it, and we are ready to die for it.

Always?

Here’s an example. Recently I spent over two weeks on a single development task. It had several moving parts that I had to create from scratch, carefully designing each of them and their interactions. Together they form a system that must be as robust as possible, otherwise our users won’t be able to use our application at all (hello, Web Start!). Finally, I tried to make it as transparent and as maintainable as possible.

I took the iterative approach. Create a spike or two. Lay out some high-level design and tests. Implement some detail. Another spike. Another detail. An a-ha moment – step back and rewrite. You know how it goes.

Now, it was an all-or-nothing piece of functionality. It was critical that everything fit together perfectly, with no holes. I just had to grasp the whole thing, even though it took days. Secondly, there were no parts that made much sense individually. Design and interfaces varied wildly, as is often the case at the young, unstable stage of development.

Sometimes your task is just like that. It simply is too big for one day or even one week, and too critical and cohesive to be partitioned.

Here’s some advice.

When you have to leave, what is the best moment to stop? It’s when you’re done with the part you were working on, right? And then, when you get back to it on Monday, you spend a few lazy, sleepy hours trying to figure out where you left off three days earlier? Wrong!

When you’re leaving, leave a failing test. Code that fails to compile. A bug that jumps out and tries to bite your head off as soon as you launch. When you return, you’ll see the fire and jump right into action. You will know exactly where you left off, and it won’t take long to figure out what to do next.

Thanks to Tynan for helping me realize it. Though his post is a general lifestyle advice, I think it can be applied to software engineering as well.

Two Worlds: Functional Thinking after OO

It goes without saying that functional programming is very different from object-oriented. While it may take a while to grasp, it turns out that functional programming can also lead to simpler, more robust, maintainable and testable solutions.

The First Encounter

In his classic “Clean Code”, Robert C. Martin describes a great (and widely adopted) approach to object-oriented programming. He says code should read from top to bottom like prose: start with the most high-level concepts and break them down into lower-level pieces. Then, when you work with such a source file, you can more easily follow the flow and understand how it works.

It occurred to me that this is quite different from the natural flow of functional programming in Clojure, where typically all users are below their dependencies. I was unsure whether (and how) Uncle Bob’s ideas apply on this ground. When I tweeted my doubts, his answer was: “You can forward declare”. I can, but still I was not convinced.

The light dawned when I came across Paul Graham’s essay titled “Programming Bottom-Up”. The way to go in functional programming is bottom-up, not just top-down.

Language for the Problem

Lisp (and Clojure) have minimal syntax – not much more than parentheses. When you look at Clojure source, it’s really dense, because there is no syntax obscuring the flow. What you see at the bottom of the file, though, appears to be a program in a language created just for this problem.

As Paul writes, “you don’t just write your program down toward the language, you also build the language up toward your program. As you’re writing a program you may think ‘I wish Lisp had such-and-such an operator.’ So you go and write it. Afterward you realize that using the new operator would simplify the design of another part of the program, and so on.”

Simplicity and Abstractness

Ending up with a language tailored to the problem is not the only nice feature of functional programming. Compared to object-oriented code, functional code tends to be a lot more abstract. There are no intricate webs of heavy, stateful objects that very often can’t talk to each other without adaptation and need a lot of care to weave together.

Data structures are very simple (if not minimalistic), which makes them easy to use with more functions. They are also immutable, so there is no risk of unexpected side effects. At the same time, because the data structures are so simple, functions turn out to be much more abstract and generic, and hence applicable in a broader context. Add closures and higher-order functions, and you get a very powerful engine with unlimited applications. Think how much you can do with map alone – and why.

Another way to look at code organization is in layers, or levels of abstraction. In object-oriented programming, it usually means that a bottom layer consists of objects which provide some functionality to the level above. Functional programming takes it one step further: “[each layer acts] as a sort of programming language for the one above.” And if there is a need to distinguish layers, it’s only because of levels of abstraction – much more rarely because of incompatibility, and never because of handcuffing encapsulation.

Maintainability and Code Organization

We all know that in object-oriented programming you should usually start with the purpose at hand and go for a very concrete, specialized design – “use before reuse” is the key phrase here. Then, when you need more flexibility, you decompose and make the code more abstract. Sometimes that is a difficult, time-consuming and bug-inducing effort. It’s not the case with functional programming: “Working bottom-up is also the best way to get reusable software. The essence of writing reusable software is to separate the general from the specific, and bottom-up programming inherently creates such a separation.”

Compared to OO, refactoring functional code is trivial. There is no state, and there are no dependencies or interactions to worry about – no way to induce unexpected side effects. Each function is a simple construct with a well-defined outcome for a given input. Thanks to that, it’s also much easier to test (and cover with automated test suites).
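A tiny illustration of that last point, sketched in Java for contrast (JUnit assumed, names made up): a pure function needs no setup, no mocks and no teardown.

// A pure function: the output depends only on the input, no state touched
static List<Integer> doubled(List<Integer> numbers) {
    List<Integer> result = new ArrayList<Integer>();
    for (Integer n : numbers) {
        result.add(n * 2);
    }
    return result;
}

// Its test is a single assertion -- no fixtures, nothing to stub
@Test
public void doublesEveryElement() {
    assertEquals(Arrays.asList(2, 4, 6), doubled(Arrays.asList(1, 2, 3)));
}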

The Curse of Structured Programming

Object-oriented programming originated from structured programming, and all developers inevitably have that background. It takes some training to learn how not to write 2,000-line classes and 500-line methods. It takes much more to learn how to avoid inheritance, program to an interface, compose your code of smaller units and cover them (and their interactions) with tests.

There are ways to make object-oriented programs more likely to succeed, have fewer bugs and be more maintainable. A ton of books have been written on this, including but not limited to the excellent and invaluable works of Robert C. Martin, Martin Fowler, Eric Evans, and so on, and so forth. That’s a lot of prose! It turns out that object-oriented programming is actually very difficult and needs a lot of craftsmanship and attention.

Functional programming is free from many of these issues. What’s more, learning it yields great returns for an object-oriented programmer. It can teach you a lot about algebraic thinking, but also about breaking code down into smaller pieces, intention-revealing names and interfaces, avoiding side effects, and probably many other things.