Validating Class/Package Dependencies with Classycle

Classycle is a very nice analyzer and checker of class and package dependencies in Java.

It lets you define package groups (components, layers) and express unwanted dependencies such as cycles or dependencies between particular packages. For example, you can specify that you want no package cycles and no dependencies from com.foo.domain.* on com.foo.api.* – all in a very human-friendly, concise format.
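
For illustration, those two rules might look roughly like this in Classycle’s dependency definition format (my sketch, modeled on the definition file shown later in this post – check the Classycle documentation for the exact syntax):

[api]    = com.foo.api.*
[domain] = com.foo.domain.*

check absenceOfPackageCycles > 1 in com.foo.*
check [domain] independentOf [api]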

Then you kick off the analyzer (it comes with an Ant task and a standalone command line tool) and it produces a report with violations.

There are a number of other tools out there: JDepend, Sonar, JArchitect and so on. So why Classycle?

  • It’s free (BSD license).
  • It’s fast.
  • It’s powerful and expressive. The rules take just a few lines of easily readable text.
  • It integrates with build tools very well. We have it running as part of the build script, for every build. It’s really just another automated test. Thanks to that the project structure is probably the cleanest one I’ve worked with so far.

Gradle Plugin

Thanks to having an Ant task, Classycle is very easy to integrate with Gradle, with one caveat: the official build is not in Maven Central, and the only build that is there does not include the Ant task.

Gradle itself uses Classycle via a script plugin, buried somewhere in its project structure. They published Classycle on their own repository, but it’s an older version that doesn’t support Java 8.

Inspired by that, we wrote our own plugin and made it available for everyone to use with minimum effort. It’s available on the Gradle Plugin Portal and on GitHub.

In order to use it, all you need to do is:

  • Add the plugin to your project:
    plugins { id "pl.squirrel.classycle" version "1.1" }
    
  • Create a Classycle definition file for each source set you want to have covered, in src/test/resources/classycle-${sourceSet.name}.txt:

    show allResults
    
    {package} = com.example
    check absenceOfPackageCycles > 1 in ${package}.*
    
  • Congratulations, that’s all it takes to integrate Classycle with your Gradle build! Now you have the following tasks:
    # For each source set that has the dependency definition file:
    classycleMain, classycleTest, ... 
    
    # Analyze all source sets in one hit:
    classycle
    
    # Also part of the check task:
    check
    

See Plugin Portal and GitHub for more information. Happy validating!

Walking Recursive Data Structures Using Java 8 Streams

The Streams API is a real gem in Java 8, and I keep finding more or less unexpected uses for it. I recently wrote about using it as a ForkJoinPool facade. Here’s another interesting example: Walking recursive data structures.

Without much ado, have a look at the code:

import static java.util.Arrays.asList;

import java.util.Collections;
import java.util.LinkedList;
import java.util.List;
import java.util.stream.Stream;

class Tree {
    private int value;
    private List<Tree> children = new LinkedList<>();

    public Tree(int value, List<Tree> children) {
        this.value = value;
        this.children.addAll(children);
    }

    public Tree(int value, Tree... children) {
        this(value, asList(children));
    }

    public int getValue() {
        return value;
    }

    public List<Tree> getChildren() {
        return Collections.unmodifiableList(children);
    }

    public Stream<Tree> flattened() {
        return Stream.concat(
                Stream.of(this),
                children.stream().flatMap(Tree::flattened));
    }
}

It’s pretty boring, except for the flattened() method at the end.

Let’s say we want to be able to find elements matching some criteria in the tree, or to find a particular element. One typical way to do it is a recursive function – but that adds some complexity and is likely to need a mutable argument (e.g. a set where you can append matching elements), as sketched below. Another approach is iteration with a stack or a queue. They work fine, but take a few lines of code and aren’t so easy to generalize.
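
For contrast, here is roughly what the recursive variant with a mutable argument looks like – a sketch reusing the Tree class above; collectMatching and its signature are my invention (it needs java.util.List and java.util.function.Predicate):

// Recursive walk with a mutable accumulator
static void collectMatching(Tree node, Predicate<Tree> criteria, List<Tree> out) {
    if (criteria.test(node)) {
        out.add(node); // the mutable argument accumulates the matches
    }
    for (Tree child : node.getChildren()) {
        collectMatching(child, criteria, out);
    }
}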

Here’s what we can do with this flattened function:
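
For the snippets below, assume t is a small tree built with the constructors above – the exact shape is made up for illustration:

Tree t = new Tree(1,
        new Tree(2, new Tree(4), new Tree(5)),
        new Tree(3, new Tree(13)));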

// Get all values in the tree:
t.flattened().map(Tree::getValue).collect(toList());

// Get even values:
t.flattened().map(Tree::getValue).filter(v -> v % 2 == 0).collect(toList());

// Sum of even values:
t.flattened().map(Tree::getValue).filter(v -> v % 2 == 0).reduce((a, b) -> a + b);

// Does it contain 13?
t.flattened().anyMatch(node -> node.getValue() == 13);

I think this solution is pretty slick and versatile. One line of code (split across three lines here for readability) is enough to flatten the tree to a straightforward stream that can be searched, filtered and whatnot.

It’s not perfect, though: it is not lazy, and flattened is called for each and every node in the tree every time. It could probably be improved using a Supplier. Anyway, it doesn’t matter for typical, reasonably small trees, especially in a business application sitting on a very tall stack of libraries. But for very large trees, very frequent execution and tight time constraints, the overhead might cause some trouble.

Java 8 Streams API as Friendly ForkJoinPool Facade

One of the features I love the most about Java 8 is the Streams API. It finally eliminates pretty much all loops from the code and lets you write code that is so much more expressive and focused.

Today I realized it can be used for something else: As a nice front-end for the ForkJoinPool.

Problem: Executors Boilerplate

Let’s say we want to run a number of tasks in parallel. Nothing fancy, let’s say each of them just prints out the name of the executing thread (so we can see it run in parallel). We want to resume execution after they’re all done.

If you want to run a bunch of tasks in parallel using an ExecutorService, you probably need to do something like the following:

ExecutorService executor = Executors.newCachedThreadPool();
for (int i = 0; i < 5; i++) {
    executor.submit(() -> System.out.println(Thread.currentThread()));
}
executor.shutdown();
try {
    executor.awaitTermination(1, TimeUnit.SECONDS);
} catch (InterruptedException ex) {
    // TODO handle...
}

Now, that is a lot of code! But we can do better.

Solution: Stream API

In the end I came up with this utility:

void doInParallelNTimes(int times, Runnable op) {
    IntStream.range(0, times).parallel().forEach(i -> op.run());
}

Reusable and all. Call it like:

doInParallelNTimes(5, () -> System.out.println(Thread.currentThread()));

Done.

This one prints out the following. Note that it’s actually using the main thread as well – since it’s held hostage anyway and cannot resume until execution finishes.

Thread[main,5,main]
Thread[ForkJoinPool.commonPool-worker-1,5,main]
Thread[main,5,main]
Thread[ForkJoinPool.commonPool-worker-3,5,main]
Thread[ForkJoinPool.commonPool-worker-2,5,main]

Another Example: Parallel Computation

Here’s another example. Instead of doing the same thing N times, we can use the stream API to process a number of different tasks in parallel. We can create (“seed”) a stream with any collection or set of values, have a function executed on them in parallel, and finally aggregate the results (collect to a collection, reduce to a single value etc.)

Let’s see how we could calculate the sum of the first 44 Fibonacci numbers (IntStream.range excludes the upper bound, so range(1, 45) covers 1 through 44):

import static java.util.concurrent.TimeUnit.MILLISECONDS;

import java.util.stream.IntStream;

import com.google.common.base.Stopwatch; // Guava

public class Tester {
    public static void main(String[] args) {
        Stopwatch stopwatch = Stopwatch.createStarted();
        IntStream.range(1, 45).parallel().map(Tester::fib).sum();
        System.out.println("Parallel took " + stopwatch.elapsed(MILLISECONDS) + " ms");

        stopwatch.reset();
        stopwatch.start();
        IntStream.range(1, 45).map(Tester::fib).sum();
        System.out.println("Sequential took " + stopwatch.elapsed(MILLISECONDS) + " ms");
    }

    private static int fib(int n) {
        if (n == 1 || n == 2) {
            return 1;
        } else {
            return fib(n - 1) + fib(n - 2);
        }
    }
}

Prints out:

Parallel took 3078 ms
Sequential took 7327 ms

It achieves a lot in a single line of code. First it creates a stream with descriptions of all the tasks that we want to run in parallel. Then it calls a function on all of them in parallel. Finally it returns the sum of all these results.

It’s not all that contrived. I can easily imagine creating a stream of arbitrary values (including rich Java objects) and executing a nontrivial operation on each of them. Whatever the tasks, orchestrating them would still look the same.
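
For instance, here’s a hypothetical sketch along those lines – Order and priceWithTax are made up for illustration, not taken from any real code:

import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class ParallelDemo {

    // A made-up domain object, purely for illustration
    static class Order {
        final double net;
        Order(double net) { this.net = net; }
        double priceWithTax() { return net * 1.23; } // pretend this is expensive
    }

    public static void main(String[] args) {
        List<Order> orders = Arrays.asList(new Order(100), new Order(200), new Order(300));

        // Same orchestration as before: seed a stream with rich objects,
        // run a function on them in parallel, aggregate the results
        List<Double> prices = orders.parallelStream()
                .map(Order::priceWithTax)
                .collect(Collectors.toList());

        System.out.println(prices);
    }
}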

When to do it?

I think this solution is pretty good for all the cases when you know the load upfront, and you want to fork execution to multiple threads and resume after they’re all done. I needed this for some test code, but it would probably work well in many other fork/join or divide-and-conquer scenarios.

Obviously it does not work if you want to run something in the background and resume execution immediately, or if you want to have a background executor running over a long period of time.

Human Error?

I’ve just watched Sidney Dekker’s “System Failure, Human Error: Who’s to Blame” talk from DevOpsDays Brisbane 2014. It’s a very nice and worthwhile talk, though there is some noise.

It covers a number of interesting stories and publications from the last 100 years of history related to failures and disasters, their causes and prevention.

Very quick summary from memory (but the video surely has more depth):

  • Shit happens. Why?
  • Due to human physical, mental or moral weaknesses – a claim from the early 20th century, still repeated today.
  • One approach (which the talk associates with MBA-style management): these weak and stupid people need to be told what to do by the more enlightened elites.
  • Bad apples – 20% of people are responsible for 80% of accidents. Just find them and hunt them down? No, because it’s impossible to account for the different conditions of every case. Maybe the 20% of bus drivers with the most accidents drive in the busy city center? Maybe the 20% of doctors with the most patient deaths are infant surgeons – how can we compare them to GPs?
  • Detailed step-by-step procedures and checklists are very rarely possible. When they are, though, they can be very valuable. This happens mostly in industries and cases backed by long and thorough research – think piloting airplanes and space shuttles, surgery etc.
  • Breakthrough: Maybe these humans are not to blame? Maybe the failures are really a result of bad design, conditions, routine, inconvenience?
  • Can disasters be predicted and prevented?
  • Look for deviations – “bad” things that are accepted or worked around until they become the norm.
  • Look for early signs of trouble.
  • Design so that it’s harder to do the wrong thing, and easier and more convenient to do the right thing.

A number of stories follow. Now, this is a talk from a DevOps conference, and there are many takeaways in that area. But it is clearly applicable outside DevOps, and even outside software development. It’s everywhere!

  • The most robust software is one that’s tolerant, self-healing and forgiving. Things will fail for technical reasons (because physics), and they will have bugs. Predict them when possible and put in countermeasures to isolate failures and recover from them. Don’t assume omnipotence, and don’t just blame others. See also the Systems that Run Forever Self-heal and Scale talk by Joe Armstrong and have a look at the awesome Release It! book by Michael Nygard.
  • Make it easy for your software to do the right thing. Don’t randomly spread config across 10 different places in 3 different engines. Don’t require anyone to stand on two toes of their left foot in the right phase of the moon just to do a deployment. Make it mostly run by itself and Just Work, with installation and configuration as straightforward as possible.
  • Make it hard to do the wrong thing. If you have a “kill switch” or “drop database” anywhere, put many guards around it. Maybe it shouldn’t even be enabled in production? Maybe it should require a special piece of config, some secret key, something very unlikely to happen by accident (a toy example follows this list)? Don’t just put in a red button and blame the operator for pressing it. We’re all on the same team and ultimately our goal is to help our clients and users win.
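
As a toy illustration of that last point, a destructive operation could be gated behind an explicit switch. Everything here – the class and the property name – is invented for the example:

public class DangerousOperations {

    // Hypothetical guard: the destructive path stays disabled unless it is
    // explicitly enabled for this environment, e.g. -Dadmin.dropDatabase=enabled
    public static void dropDatabase() {
        if (!"enabled".equals(System.getProperty("admin.dropDatabase"))) {
            throw new IllegalStateException(
                    "dropDatabase is disabled in this environment");
        }
        // ... the actual destructive work would go here
    }
}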

The same principles apply to user interface design. Don’t randomly put in a bunch of forms and expect the users to do the right thing. If they have a workflow, learn it and tailor the solution to it. Make it harder for end users to make mistakes – separate opposite actions in the GUI, and make the “negative” actions harder to execute.

Take Gmail’s compose window as an example (the screenshot is omitted here): the “Send” button is big and prominent, while the trash is small and far away. There is no way you could accidentally press one when you meant the other.

Actually, isn’t all that true for all the products that we really like to use, and the most successful ones?

“Mastering AngularJS Directives” (Book Review)

Unlike many general introduction books, “Mastering AngularJS Directives” by Josh Kurz takes a much more specialized approach. It assumes you know AngularJS fairly well and explores just one (but arguably the most complex) of its corners: directives.

It’s not a thick book and the table of contents looks just right: Basic introduction to directives, a simple example, and then digging deeper into integration of third party libraries, compilation, communication between directives, writing directives to watch live data for changes, and finally some optimization and code quality notes.

Unfortunately, the book is rather poorly written. It is confusing even to someone who has been using AngularJS professionally for over 1.5 years. The explanations tend to be short and often miss the point. You may see a difficult issue brought up, followed by a listing over 2 pages long, and finally be left with an unsatisfactory explanation of how it works or why you would do it this way. In some ways it just lacks focus.

There are some substantive errors too – calling JS objects “JSON notation”, claiming that singletons give you a new instance every time, and so on.

That said, even though it is a difficult read, it is not without value. I learned quite a few things myself, some of them mentioned directly and some between the lines. It’s one of the first attempts at a thorough introduction to directives, and it may still come in handy at times.

The bottom line – I am not sure I would recommend it to a friend. I liked “Mastering Web Application Development with AngularJS” by Paweł Kozłowski and Pete Bacon Darwin a lot better; even though it’s not dedicated to directives, it does a better job of explaining them.

Navigation and Routing with Om and Secretary

After some quick experiments with Secretary and Enfocus, I decided to dive headfirst into Om.

Since I’m kind of restarting my pet project all the time, the first thing I lay down is routing and navigation. This time I’ll implement it by combining Secretary with Om and a little Bootstrap.

One of the key features of Om is a strong separation of state, behavior and rendering. In a nutshell, state is defined in one place in an atom and is just, you know, state. You can manipulate it as you like without worrying about rendering. Finally, you install renderers on top of it without worrying about the behavior.

Let’s start with a bunch of imports. We’ll need Secretary and goog.History from Closure as well as some Om for rendering. I’ll also keep a reference to History so I don’t instantiate it over and over.

(ns demo.navigation
  (:require [secretary.core :as secretary :include-macros true :refer [defroute]]
            [goog.events :as events]
            [om.core :as om :include-macros true]
            [om.dom :as dom :include-macros true])
  (:import goog.History
           goog.History.EventType))

(def history (History.))

Now, the state. Each route has a name that will appear on the navigation bar and a path for routing.

(def navigation-state 
  (atom [{:name "Add" :path "/add"}
         {:name "Browse" :path "/browse"}]))

Time for some state manipulation. Enter Secretary and Closure history:

(defroute "/add" [] (js/console.log "Adding"))

(defroute "/browse" [] (js/console.log "Browsing"))

(defn refresh-navigation []
  (let [token (.getToken history)
        set-active (fn [nav]
                     (assoc nav :active (= (:path nav) token)))]
    (swap! navigation-state #(map set-active %))))

(defn on-navigate [event]
  (refresh-navigation)
  (secretary/dispatch! (.-token event)))

(doto history
  (goog.events/listen EventType/NAVIGATE on-navigate)
  (.setEnabled true))

It’s very similar to what I did before – two basic routes, gluing Secretary to Closure history with pretty much the same code as in the Secretary docs.

There’s one thing worth noting here. Every time the route changes, refresh-navigation will update the navigation-state atom. For each of the routes it will set the :active flag, making it true for the path we navigated to and false for all others. This will be used to render the right tab as active.

Now, somewhere in my HTML template I’ll put the div to hold my navigation bar:

<div id="navigation"></div>

Finally, let’s do the rendering in Om:

(defn navigation-item-view [{:keys [active path name]} owner]
  (reify
    om/IRender
    (render [this]
            (dom/li #js {:className (if active "active" "")}
                    (dom/a #js {:href (str "#" path)} name)))))

(defn navigation-view [app owner]
  (reify
    om/IRender
    (render [this]
            (apply dom/ul #js {:className "nav nav-tabs"}
                   (om/build-all navigation-item-view app)))))

(om/root navigation-view navigation-state
         {:target (. js/document (getElementById "navigation"))})

Let’s investigate it from the bottom.

om/root binds a component (navigation-view) to state (navigation-state) and installs it on the navigation element in DOM.

navigation-view itself is a composite (container) component. It creates a <ul class="nav nav-tabs"> containing a navigation-item-view for each route.

Finally, navigation-item-view renders <li class="active"><a href="#{path}">{name}</a></li> using the right pieces of information from the map representing a route.

That’s it. Like I said, state is as pure as it can be, routing doesn’t know anything about rendering, and rendering only cares about state. There is no explicit call to rerender anything anywhere. What’s more, Om is reportedly smart enough to figure out exactly what changed and keep the DOM changes to a minimum.

Side note – Om looks like a big thing to learn, especially since I don’t know React. But it’s quite approachable thanks to its incredibly good tutorial. It also made me switch from Eclipse with CounterClockWise to LightTable, giving me more productive fun than I can remember.

“Clojure Cookbook” by Luke VanderHart, Ryan Neufeld; O’Reilly Media

O’Reilly has just published a new book on Clojure, this time from the “cookbook” series. The book includes over 150 practical recipes on doing some common things in Clojure. Each recipe is self-contained and usually very small.

It starts with a detailed walkthrough of primitive and collection manipulations. Then it includes recipes on basic development tasks (REPL, using docs, running programs etc.), I/O, databases (two recipes on SQL, one for each of a handful of NoSQL databases, plus quite a few on Datomic), web applications with Ring, performance optimization, distributed computing (mostly Cascalog, some Storm) and testing.

In my opinion the book is very uneven. It’s very detailed about the primitives and basic collections, but at the same time it doesn’t do justice to state management (atoms, refs, agents) or concurrency. Yet it has two chapters on building a red-black tree. It is very detailed about Datomic, but barely scratches the surface of much more common tools like core.async, core.logic or core.match. It does not include anything about graphics or ClojureScript.

In short, it sometimes pays much attention to some uncommon problems or tools, while giving too little information on more popular pieces. I think the target audience is somewhere around intermediate. I don’t think it’s a good way to get started with the language, but it is a decent, handy survey of some areas of the landscape.

Careful With Native SQL in Hibernate

I really like Hibernate, but I also don’t know a tool that would be nearly as powerful and deceptive at the same time. I could write a book on surprises in production and cargo cult programming related to Hibernate alone. It’s more of an issue with the users than with the tool, but let’s not get too ranty.

So, here’s a recent example.

Problem

We need a background job that lists all files in a directory and inserts an entry for each of them to a table.

Naive Solution

The job used to be written in Bash and there is some direct SQL reading from the table. So, blinders on and let’s write some direct SQL!

// session is a Hibernate Session
for (String fileName : folder.list()) {
    SQLQuery sql = session.createSQLQuery(
            "insert into dir_contents values (?)");
    sql.setString(0, fileName);
    sql.executeUpdate();
}

Does it work? Sure it does.

Now, what happens if there are 10,000 files in the folder? What if you also have a not so elegant domain model, with way too many entity classes, thousands of instances and two levels of cache all in one context?

All of a sudden this trivial job takes 10 minutes to execute, all that time keeping 2 or 3 CPUs busy at 100%.

What, for just a bunch of inserts?

Easy Fix

The problem is that it’s Hibernate. It’s not just a dumb JDBC wrapper – it has a lot more going on. It’s trying to keep caches and session state up to date. If you run a bare SQL update, it has no idea what table(s) you are updating, what that depends on and how it affects everything else, so just in case it pretty much flushes everything.

If you do this 10,000 times in such a crowded environment, it adds up.

Here’s one way to fix it – rather than running 10,000 updates with flushes, execute everything in one block and flush once.

session.doWork(new Work() {
    @Override
    public void execute(Connection connection) throws SQLException {
        // try-with-resources so the statement is always closed
        try (PreparedStatement ps = connection
                .prepareStatement("insert into dir_contents values (?)")) {
            for (String fileName : folder.list()) {
                ps.setString(1, fileName);
                ps.executeUpdate();
            }
        }
    }
});

Other Solutions

Surprise, surprise:

  • Do use Hibernate. Create a real entity to represent DirContents and just use it like everything else. Then Hibernate knows what caches to flush and when, how to batch updates and so on (a sketch follows this list).
  • Don’t use Hibernate. Use plain old JDBC, MyBatis, or whatever else suits your stack or is there already.
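
For illustration, such an entity might look roughly like this. The mapping details – the generated id and the column name – are my assumptions, not taken from the original schema:

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.Table;

@Entity
@Table(name = "dir_contents")
public class DirContents {

    @Id
    @GeneratedValue
    private Long id; // assumes the table has (or gains) a generated id column

    @Column(name = "file_name")
    private String fileName; // hypothetical column name

    protected DirContents() {
        // no-arg constructor for Hibernate
    }

    public DirContents(String fileName) {
        this.fileName = fileName;
    }
}

Inserting an entry then becomes session.save(new DirContents(fileName)), and Hibernate gets to manage batching, flushing and cache invalidation on its own terms.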

Takeaway

Native SQL has its place, even if this example is not the best use case. Anyway, the point is: If you are using native SQL with Hibernate, mind the session state and caches.

ClojureScript Routing and Templating with Secretary and Enfocus

A good while ago I was looking for good ways to do client-side routing and templating in ClojureScript. I investigated using a bunch of JavaScript frameworks from ClojureScript, of which Angular probably gave the most promising results but still felt a bit dirty and heavy. I even implemented my own routing/templating mechanism based on Pedestal and goog.History, but something felt wrong still.

Things have changed, and today there’s a lot of buzz about React-based libraries like Reagent and Om. I suspect that React on the front end with a bunch of “native” ClojureScript libraries may be a better way to go.

Before I get there though, I want to revisit routing and templating. Let’s see how we can marry together two nice libraries: Secretary for routing and Enfocus for templating.

Let’s say our app has two screens which fill the entire page. There are no smaller “fragments” to compose the page from yet. We want to see one page when we navigate to /#/add and another at /#/browse. The “browse” page will be a little more advanced and support path parameters. For example, for /#/browse/Stuff we want to parse out “Stuff” and display a header with this word.

The main HTML could look like:

<!DOCTYPE html>
<html>
<body>
	<div class="container-fluid">
		<div id="view">Loading...</div>
	</div>

	<script src="js/main.js"></script>
</body>
</html>

Then we have two templates.

add.html:

<h1>Add things</h1>
<form>
  <!-- boring, omitted -->
</form>

browse.html:

<h1></h1>
<div>
  <!-- boring, omitted -->
</div>

Now, all we want to do is to fill the #view element on the main page with one of the templates when the location changes. The complete code for this is below.

(ns my.main
  (:require [secretary.core :as secretary :include-macros true :refer [defroute]]
            [goog.events :as events]
            [enfocus.core :as ef])
  (:require-macros [enfocus.macros :as em])
  (:import goog.History
           goog.History.EventType))

(em/deftemplate view-add "templates/add.html" [])

(em/deftemplate view-browse "templates/browse.html" [category]
  ["h1"] (ef/content category))

(defroute "/" []
  (.setToken (History.) "/add"))

(defroute "/add" []
  (ef/at 
    ["#view"] (ef/content (view-add))))

(defroute "/browse/:category" [category]
  (ef/at 
    ["#view"] (ef/content (view-browse category))))

(doto (History.)
  (goog.events/listen
    EventType/NAVIGATE 
    #(em/wait-for-load (secretary/dispatch! (.-token %))))
  (.setEnabled true))

What’s going on?

  1. We define two Enfocus templates. view-add is trivial and simply returns the entire template. view-browse is a bit more interesting: given a category name, it alters the template by replacing the content of the h1 tag with that name.
  2. Then we define Secretary routes to actually use those templates. All they do now is replace the content of the #view element with the template. In the case of the “browse” route, it passes the category name parsed from the path to the template.
  3. There is a default route that redirects from / to /add. It doesn’t lead to example.com/add, but only sets the fragment: example.com/#/add.
  4. Finally, we plug in Secretary to goog.History. I’m not sure why it’s not in the box, but it’s straightforward enough.
  5. Note that in the history handler there is the em/wait-for-load call. It’s necessary for Enfocus if you load templates with AJAX calls.

That’s it, very simple and straightforward.

Update: Fixed placement of em/wait-for-load, many thanks to Adrian!

“Version Control with Git, 2nd Edition” by Jon Loeliger, Matthew McCullough; O’Reilly Media

There are reasons why Git has become so popular, but the first encounter with it can be a bit overwhelming. Even if you kind of learn how to do basic things, it’s not uncommon to feel like you’re only scratching the surface. The typical reaction when something slightly less typical is needed often sounds like: “There be dragons!”

Here comes “Version Control with Git” by Jon Loeliger and Matthew McCullough.

It starts with a good explanation of the basic concepts of Git. It explains all the building blocks of Git and the internal organization of a repository. It slowly introduces the basic commands, each time explaining very well how a change is reflected in the repository or what a command really operates on.

Distribution, collaboration, merging and so on are introduced fairly late, but by that time the reader will have understood the core so well that everything just falls into place and is immediately understandable. Finally, it also shows some more arcane features and commands that are probably rarely used, but knowing they are there and having the book handy for when the time comes doesn’t hurt.

Last but not least, it explains common usage patterns as well as things that can be done outside the typical path, with appropriate warnings about possible negative impact.

This book is a must-read for all Git users. It’s usable on all levels, from absolute newbie to someone who feels fairly proficient with Git. I’ve been using Git daily for quite a while, and it really helped me understand what is going on. Everything is very accessible, with plenty of examples as small and practical as possible, as well as some images.