I recently read a post on two ways to model the data of a business domain. My memory tells me it was Ayende Rahien, but I can’t find it on his blog.
One way is full-blown object-relational mapping. Entities reference each other directly, and the O/R mapper automatically loads data for you as you traverse the object graph. To obtain the Product for an OrderLine, you just call line.getProduct() and you’re good to go. It’s convenient and deceptively transparent, but it can easily hurt performance if you aren’t careful.
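As a sketch of that first style, with hypothetical classes: OrderLine holds a direct object reference, and a mapper like Hibernate would load the Product transparently when the getter is called.

```java
// Hypothetical classes illustrating the full-blown ORM style:
// OrderLine holds a direct object reference to Product.
class Product {
    private final String name;
    Product(String name) { this.name = name; }
    String getName() { return name; }
}

class OrderLine {
    private final Product product;  // direct link, traversed via the getter
    OrderLine(Product product) { this.product = product; }
    Product getProduct() { return product; }
}
```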
The other way is what that post may have called a document-oriented mapping. Each entity has its own ID and its own data. It may have some nested entities if it’s an aggregate root (in domain-driven design terminology). In this case, OrderLine only has a productId, and if you want the product you have to call ProductRepository.getProduct(line.getProductId()). It’s a bit less convenient and requires more ceremony, but thanks to its explicitness it is also much easier to optimize and to keep free of performance pitfalls.
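The document style can be sketched with equally hypothetical classes: OrderLine keeps only the product’s ID, and callers fetch the Product explicitly through a repository. The in-memory repository here is just a stand-in for illustration.

```java
// Document/aggregate style: OrderLine stores only the ID,
// and a repository is queried explicitly when the Product is needed.
import java.util.HashMap;
import java.util.Map;

class Product {
    final long id;
    final String name;
    Product(long id, String name) { this.id = id; this.name = name; }
}

class OrderLine {
    private final long productId;   // only the ID, no object graph
    OrderLine(long productId) { this.productId = productId; }
    long getProductId() { return productId; }
}

class ProductRepository {
    private final Map<Long, Product> byId = new HashMap<>();
    void save(Product p) { byId.put(p.id, p); }
    Product getProduct(long id) { return byId.get(id); }
}
```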
So much for the aforementioned post. I recently had an opportunity to reflect more on this matter with a real-world example.
The Case
The light dawned when I set out to create a side project for a fairly large system that has 200+ Hibernate mappings and about 300 tables. I knew I only needed some 5 core tables, but for the sake of consistency and to avoid duplication I wanted to reuse the mappings from the big system.
I knew there could be dependencies on things I didn’t need, and I did not have a tool to generate a dependency graph. So I just included the first mapping, watched Hibernate’s errors for unmapped entities, added the missing mappings, checked the error log again… and so on, until Hibernate knew all the referenced classes.
When I finished, the absolutely minimal and necessary “core” in my side project had 110 mappings.
As I was adding them, I saw that most of them were pretty far from the core and from my needs. They corresponded to little subsystems somewhere on the rim.
It felt like running a strong magnet over a messy workplace full of all kinds of metal things when all I needed was two nails.
Pain Points
It turns out that this style of object orientation causes more pain than good. Having unnecessary dependencies in a spin-off that reuses the core is just one pain point; there are more.
It also makes my side project slower and hungrier for resources – I have to map 100+ entities and support them all in my second-level cache. When I load some of the core entities, I also pull in many things I don’t need: numerous fields used only in narrow contexts, even entire eagerly loaded entities. At all times there is too much data floating around.
Such a model also makes development much slower. Builds and tests take longer, because there are many more tables to generate, mappings to scan, and so on.
It’s also slower for another reason: if a domain class references 20 other classes, how does a developer know which are important and which are not? It leads to very long and rather unpleasant classes. What should be the core becomes a gigantic black hole sucking in the entire universe. When an unaware newcomer goes near it, most of the time they will either sink trying to understand everything, or simply break something – unaware of all the links in their context, unable to understand all the links present in the class. Actually, even seniors can be deceived into such mistakes.
The list is probably much longer.
Solution?
There are two issues here.
How did that happen?
I’m writing a piece of code that’s pretty distant from the core, but could really use those two new attributes on this core entity. What is the fastest way? Obvious: Add two new fields to the entity. Done.
I need to add a bunch of new entities for a new use case that are strongly related to a core entity. The shortest path? Easy, just reference a few entities from the core. When I need those new objects and I already have the old core entity, Hibernate will load the new entities for me as I call the getters. Done.
Sounds natural, and I can see how I could have made such mistakes a few years ago, but the trend could have been stopped or even reversed. With proper code reviews and retrospectives, the team might have found a better way earlier. With some slack and good will, it might even have refactored the existing code.
Is there a better way to do it?
Let’s go back to the opening section on two ways to map domain classes: “Full-blown ORM” vs. document/aggregate style.
Today I believe full-blown ORM may be a good choice for a fairly small project with a few closely related use cases. As soon as we branch out into bigger chunks of functionality and introduce more objects, those should become their own aggregates. They should never be referenced from the core, even though they themselves may orbit around it and have a direct link to the core. The same is true for the attributes of core entities: if something is needed only in a faraway use case, don’t spoil the core mapping with a new field; introduce a new entity if necessary.
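A minimal sketch of that rule, with hypothetical names: the core Customer knows nothing about invoicing, while the Invoice aggregate orbits the core with a unidirectional, ID-based link.

```java
// The satellite points at the core via an ID; the core never points back.
class Customer {           // core entity: small and self-contained
    final long id;
    Customer(long id) { this.id = id; }
}

class Invoice {            // satellite aggregate in its own use case
    final long customerId; // unidirectional link to the core
    Invoice(long customerId) { this.customerId = customerId; }
}
```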
In other words, learn from domain-driven design. If you haven’t read the book by Eric Evans yet, go do it now. It’s likely the most worthwhile and influential software book I’ve read to date.
Very interesting article and some eye-opening opinions. For some time I have been thinking that in some cases the full ORM approach might not be the best solution, and now I’ve found some answers in your post.
Thanks for sharing!
As far as I understood DDD, you should model your domain objects according to the needs of your domain. In your side project you simply copied the object model from a different domain. What worked for them doesn’t necessarily have to work for you as well.
On a technical level, Hibernate allows you to map an object multiple times (that’s what the “entity”-attribute is for), so you could have provided your own – leaner – mappings for the root project’s object model leaving out the attributes and relations you don’t need.
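One possible shape of this suggestion, assuming classic hbm.xml mappings: the same table mapped twice, distinguished by the entity-name attribute, with the second mapping stripped down to what the side project needs. Table, column, and entity names here are made up for illustration.

```xml
<hibernate-mapping>
  <!-- Full mapping used by the big system -->
  <class entity-name="FullProduct" name="com.example.Product" table="PRODUCT">
    <id name="id" column="ID"/>
    <property name="name" column="NAME"/>
    <property name="description" column="DESCRIPTION"/>
    <!-- ... many more properties and relations ... -->
  </class>
  <!-- Lean second mapping over the same table for the side project -->
  <class entity-name="LeanProduct" name="com.example.Product" table="PRODUCT">
    <id name="id" column="ID"/>
    <property name="name" column="NAME"/>
  </class>
</hibernate-mapping>
```

The lean entity would then be loaded by name, e.g. session.get("LeanProduct", id).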
Tomasz – thanks for the comment.
Frisian –
I could use different mappings over the same tables (or views), but that’s more of a band-aid than a real cure. It introduces duplication: you have to learn and maintain two models (and keep them in sync), it may break some domain rules, and it still leaves the old project in a mess.
What I think should be done here (and the way I understand DDD) is that the core domain should be as small as possible and “self-contained”. Other use cases can reference it with _unidirectional_ links.
That way I could reuse the core entities “as is” in the side project. One consistent model everywhere. The core project would benefit from such simpler design as well.
Their domain model would IMHO only be “wrong” if they put multiple domains into it without clear boundaries. Some domains are more complex than others, so more objects are needed to model them adequately.
Entities are preferred to form hierarchies (aggregates), with only aggregate roots having relations to other aggregate roots. Still, the number of relations will probably be the same compared to a “non-DDD” domain model. After all, these relations exist because of a business need. And if it makes sense business-wise to have bidirectional relations, then the domain model should reflect that.
I don’t consider mapping the same domain more than once as a band-aid. Basically, that’s what CQRS is about, namely dividing the same domain into one for writing and one for reading.
What is “one domain”, then? One system can have several use cases that are fairly distinct. They’re all in its domain, but it does not mean the core has to “know” about them.
Sure, they’re all there because of business needs, but they’re different contexts. That’s the point with bounded contexts in DDD – to isolate them. The model still reflects business, is consistent etc., but each use case has its own drawer rather than all of them lying on the desk at all times. Feels like arguing 500-line methods vs. Clean Code, actually. :-)
In the last point, I wonder if there is some confusion between CQRS and DDD (separating commands from queries does not by itself introduce bounded contexts). I know, I know, the line is thin, and CQRS naturally encourages different “queries” over a consistently updated model. :-)
So the problem with the domain model you copied was that they didn’t respect the boundaries of bounded contexts in their domain model? That’s something completely different.
But even if they had, it would have lessened your pain only up to a point: inside a bounded context, relations between aggregate roots are fair game. Thus having to copy all the mappings for the bounded contexts you touch in your side project isn’t a sign of a design flaw per se.
Bounded contexts should be used with CQRS as well, though I have yet to see an example where this lent itself to different bounded contexts in the command part and the querying part.
On the data access level, the problem was that there were no bounded contexts anywhere in the code, and all entities were free to reference each other directly with Hibernate-mapped links.
On the code quality level, the core entities ended up having way too many fields, with usage scattered at random places. Even worse, every server class was free to call session.update(entity), and it did. That hurt a lot, especially when one set of fields was updated in more than one place (I would love to see events or commands here).
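A hedged sketch of what “commands” could look like here, with entirely hypothetical names: instead of every server class mutating fields and calling session.update(entity) itself, a change goes through one handler that owns both the mutation and the persistence call.

```java
// One place allowed to change and persist the email field; the persist
// callback stands in for something like Hibernate's session::update.
import java.util.function.Consumer;

class Customer {
    String email;
}

class ChangeEmailCommand {
    final String newEmail;
    ChangeEmailCommand(String newEmail) { this.newEmail = newEmail; }
}

class CustomerCommandHandler {
    private final Consumer<Customer> persist;

    CustomerCommandHandler(Consumer<Customer> persist) { this.persist = persist; }

    void handle(Customer customer, ChangeEmailCommand cmd) {
        customer.email = cmd.newEmail;  // the only code path updating this field
        persist.accept(customer);
    }
}
```

With all updates funneled through handlers like this, it becomes easy to find every place a field can change.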