Let’s start with a puzzle. Create a little Leiningen project called careful, set :main careful.core in project.clj, and put this in careful/core.clj:
    (ns careful.core
      (:gen-class))

    (defn get-my-value []
      (println "Sleeping...")
      (Thread/sleep 5000)
      (println "Woke up")
      "Done")

    (def my-def (get-my-value))

    (defn -main [& args]
      (println "Hello, World!"))
Here’s the question: What happens when you compile this project?
And the answer is…
    $ time lein2 compile
    Compiling careful.core
    Sleeping...
    Woke up
    Compilation succeeded.

    real 0m7.098s
    user 0m5.276s
    sys 0m0.212s
Hey, my program actually printed something during compilation! And it took way too much time. I didn’t expect that.
Such an apparently innocuous def can get you in a lot of trouble. Sure, no one puts a sleep like that in real code, but what about:
- Code that computes something, perhaps something taking time or space?
- Code that loads something from the network?
I don’t yet understand why it is evaluated at compile time. But I can understand why using def in that way is not a good idea. It’s an old imperative habit. This may be a perfectly valid imperative program:
    public static void main(String[] args) {
        Data data = loadDataFromInternet();
        ProcessedData proc = process(data);
        generateReport(proc);
    }
You may be tempted to do it this way in Clojure:
    (def data (load-data-from-internet))
    (def proc (process data))

    (defn -main [& args]
      (generate-report proc))
… but that’s still an imperative style and it feels wrong.
It is also a real pain to test.
How about one of these equivalents?
    (defn -main [& args]
      (let [data (load-data-from-internet)]
        (generate-report (process data))))

    (defn -main [& args]
      (generate-report (process (load-data-from-internet))))
In the end, I arrived at the following conclusion. You should only use def for constants, some global parameters, dynamic variables, definitions of higher-order functions, that kind of static stuff. All logic and behavior belongs in functions.
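Here is a minimal sketch of that rule. It is not from the original project: call-count and greeting are names I made up for illustration, and wrapping the call in delay is just one common way to keep a top-level def while postponing the work until the value is first dereferenced.

```clojure
;; Hypothetical counter, only here to show when the function really runs.
(def call-count (atom 0))

(defn get-my-value []
  (swap! call-count inc)
  "Done")

;; A plain constant: cheap and free of side effects, so evaluating it
;; while compiling is harmless.
(def greeting "Hello, World!")

;; The delay object is created when this form is evaluated, but its body
;; does not run until the first deref. The result is then cached.
(def my-def (delay (get-my-value)))

(defn -main [& args]
  (println greeting)
  ;; First deref: this is where get-my-value actually runs.
  (println @my-def))
```

With this shape, loading or compiling the namespace should not call get-my-value at all; the work happens inside -main, where it is also easier to test.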
Update
As djork pointed out on Reddit, it’s because def creates a var in the current namespace with a specific value.
It makes some sense when you think of what defn looks like: it’s really a macro wrapping def (also pointed out by djork). And we do expect functions introduced by defn to be compiled, right? Even the docs clearly state that defn is the “same as (def name (fn [params* ] exprs*))”.
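You can check the claim about defn at the REPL. This is only a small sketch: expansion is a name I picked, and the exact printed form may differ between Clojure versions, so I only rely on the first two elements.

```clojure
;; Expand a defn form without evaluating it. The quote keeps the form as data.
(def expansion (macroexpand '(defn get-my-value [] "Done")))

;; Print the whole expansion to see the fn form that defn wraps.
(println expansion)

;; The head of the expanded form is the def special form,
;; followed by the name being defined.
(println (first expansion))
(println (second expansion))
```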
I still find it very confusing, though. I wonder if I’m just abusing the language.
Second Update
I came back to it later, and I may have finally understood.
This is a perfectly valid statement:
(def my-def (get-my-value))
But what about these?
    ; Unexpected argument:
    (def my-def (get-my-value))

    ; Type cast exception (vector to number)
    (def my-def-2 (+ 2 []))

    ; Whatever invalid statement
    (def my-def-2 (+ 2 +))
Should they throw a compile-time error? It makes sense, right?
Now, when you type this:
(def my-def (get-current-date))
At runtime, do you expect it to hold the state as of compile time, or as of run time? In other words, should it be the date of compilation, or “now” at the time of execution? The latter, right?
I can see why both evaluations (at compile and run time) are needed. Depending on your point of view, it’s either some sort of language fragility or the developer abusing the language. Either way, the conclusion stays the same: Careful with that def, Eugene.
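To make the difference concrete, here is a rough sketch with two hypothetical names, loaded-at and current-time. I use System/currentTimeMillis as a stand-in for the get-current-date function above, which is never actually defined in this post.

```clojure
;; Evaluated once each time the namespace is loaded. That happens during
;; compilation and again when the compiled program starts, so at run time
;; this holds the load time of the running process.
(def loaded-at (System/currentTimeMillis))

;; Evaluated on every call, so it always reflects "now".
(defn current-time []
  (System/currentTimeMillis))

(defn -main [& args]
  (println "Namespace loaded at:" loaded-at)
  (println "Current time:       " (current-time)))
```

The var keeps whatever value it got when its def form was last evaluated, while the function asks the clock again on each call.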
Discussion
Aside from this blog, there is an interesting discussion with more detail at Reddit. Thanks guys!
I think it runs at compile time because your function returns a literal string and doesn’t change any state so the compiler decides to optimize the program by removing it? maybe?
Greg – not really. It still is re-evaluated at runtime.
Clojure compilation acts exactly like the REPL: it reads each expression, compiles it, and executes it before moving on to the next.
It must execute each expression at compile time because Clojure has macros, so subsequent expressions may depend on it at compile-time. (If this expression defines a macro, it must be executed before compiling any expressions that use the macro.) There are several ways to handle this; the one Clojure takes is the simplest and most regular, but it can be surprising. (It usually isn’t noticeable because few definitions have side effects, so executing them at compile time is harmless.)
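The macro dependency described above can be sketched in two forms. The macro name unless and the var answer are made up for this illustration; the point is only that the second form cannot be compiled until the first one has been executed.

```clojure
;; Form 1: defines a macro. The compiler has to execute this definition
;; before it can expand any later use of the macro.
(defmacro unless [condition & body]
  `(if ~condition nil (do ~@body)))

;; Form 2: uses the macro. It is expanded at compile time using the
;; definition above, and then the def itself is evaluated, REPL-style.
(def answer (unless false 42))

(println answer)
```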