Tag Archives: MongoDB

Sequential ID Generation in Congomongo

By default MongoDB uses 12-byte BSON id for objects. For some reason I wanted to use an increasing sequence of integers.

The samples in documentation are in JSON and I was not sure how they translate to the Java driver. The JSON sample looks like:

function counter(name) {
    var ret = db.counters.findAndModify({query:{_id:name}, update:{$inc : {next:1}}, "new":true, upsert:true});
    // ret == { "_id" : "users", "next" : 1 }
    return ret.next;
}

db.users.insert({_id:counter("users"), name:"Sarah C."}) // _id : 1
db.users.insert({_id:counter("users"), name:"Bob D."}) // _id : 2

After some googling I found an implementation in Java. Just as I expected, it’s much longer and completely different.

public static String getNextId(DB db, String seq_name) {
    String sequence_collection = "seq"; // the name of the sequence collection
    String sequence_field = "seq"; // the name of the field which holds the sequence
 
    DBCollection seq = db.getCollection(sequence_collection); // get the collection (this will create it if needed)
 
    // this object represents your "query", its analogous to a WHERE clause in SQL
    DBObject query = new BasicDBObject();
    query.put("_id", seq_name); // where _id = the input sequence name
 
    // this object represents the "update" or the SET blah=blah in SQL
    DBObject change = new BasicDBObject(sequence_field, 1);
    DBObject update = new BasicDBObject("$inc", change); // the $inc here is a mongodb command for increment
 
    // Atomically updates the sequence field and returns the value for you
    DBObject res = seq.findAndModify(query, new BasicDBObject(), new BasicDBObject(), false, update, true, true);
    return res.get(sequence_field).toString();
}

Not much later I checked docs and source code for Congomongo and discovered fetch-and-modify. I rewrote the Java sample above to Clojure and later polished it using code from this commit by Krzysztof Magiera. In the end my sequence generator looks like this:

(defn next-seq [coll]
  (with-mongo db
    (:seq 
	  (fetch-and-modify :sequences {:_id coll} {:$inc {:seq 1}} :return-new? true :upsert? true))))
	
(with-mongo db 
  (insert! :books {:author "Adam Mickiewicz" :title "Dziady" :_id (next-seq :books)}))

The raw call to insert! could be wrapped in a function or macro to save some boilerplate if there are more collections. For instance:

(defn insert-with-id [coll el]
  (insert! coll (assoc el :_id (next-seq coll))))
  
(with-mongo db
  (insert-with-id :books {:author "Adam Mickiewicz" :title "Dziady"}))

In some circles this probably is common knowledge, but it took me a while to figure it all out.

Connection Management in MongoDB and CongoMongo

I decided to take the opportunity offered by Jacek Laskowski (in Polish) and take a closer look at interaction with MongoDB in Clojure. It has a nice, challenging learning curve as I haven’t done much practical work in Clojure and I’ve never actually dealt with Mongo before. Double win – learning two interesting things at a time.

The obvious choice for the integration is CongoMongo. It’s really easy to get it all set up and working. The official docs encourage you to simply do this:

(def conn (make-connection "mydb")

(set-connection! conn)

(insert! :robots {:name "robby"})

(fetch-one :robots)

; ... and so on

Easy. Too easy and comfortable. Coming from the old good and heavy JDBC/SQL I felt uneasy with the connection management. How does it work? Does it just open a connection and leave it dangling in the air the whole time? Might be good for a quick spike in REPL, but not for a real application which needs concurrency, is supposed to be running for days and weeks, and so on. How do you maintain it properly?

clojure.contrib.sql has with-connection. That opens the connection, runs something with it and then eventually closes it. CongoMongo has with-mongo, but all it does is bind the argument to *mongo-config* and execute body. Nothing is ever opened or closed.

That seemed insane and broken, until I took a step back and compared source of CongoMongo to documentation of underlying Java driver for MongoDB. The light dawned.

What make-connection really does is create an instance of Mongo and DB (if database name was provided). The result of this function is plain map: {:mongo #<Mongo>, :db #<DB>}.

Javadoc for Mongo say it’s a database connection with internal pooling. For most application, you should have 1 Mongo instance for the entire JVM. A page dedicated to Java Driver Concurrency explains it in more detail: The Mongo object maintains an internal pool of connections to the database. For every request to the DB (find, insert, etc) the java thread will obtain a connection from the pool, execute the operation, and release the connection..

At first I thought CongoMongo docs were misleading. The truth is, it’s just a wrapper for the Java driver. It’s fair for it to assume you know the basic principles of the underlying driver.

So what is called a “connection” here (the Mongo class) is in fact a much more sophisticated object. It maintains a connection pool and creates nice little DB objects for all the data handling, which in turn are smart enough to maintain the actual low-level connections for you. No ceremony, just gets out of the way as quickly as possible and lets you get the job done.

This is amazingly simple and elegant compared to JDBC / JEE / SQL. I guess I soon will be scratching my head over ACID, but at the moment I’m pleasantly surprised with the look of things.