Transactions and Units of Work in Web-Applications

Introduction

Thanks to modern O-R mapping tools, writing database-backed business applications has become much easier. A lot of tedious code has vanished; reading persistent entities from the database into main memory and writing changes back is about as easy as it can possibly get. The database is almost invisible - inserts, updates and deletes occur as side effects of simple state changes of in-memory objects. This facilitates proper o-o design by pushing the SQL statements down (and out of sight) into some framework. Obviously, this is a good thing. Unfortunately, it comes at a price: our shiny, pure-Java domain objects are, almost invisibly to the innocent programmer, tied to the database state via some fairly complex machinery (the O-R mapper).

After all, what ultimately matters in business application programming is what your program does to the database (think of corporate policies that restrict an application to a fixed set of static SQL statements). It is therefore essential to know exactly what you do to the DB when you call some method on your domain model. I will discuss this issue with special regard to web applications and Hibernate as the O-R mapper.

Top Level Design (Principles)

  • Domain classes shouldn't be cut into slices along technical layers if possible. That means architectures that make you write several technical classes for each class of the business domain, like PersonEJB, PersonVO and PersonDAO, are evil. They violate DRY. If you think you can't avoid them, write a generator or framework that automates their creation.
  • Domain objects shouldn't be in different "modes of operation"; in other words, they shouldn't have a technical lifecycle, at least not when used in normal application code (like, in Hibernate, objects detached from the session versus attached to the session). The goal is to make sure that the result of calling a method on a domain object depends only on the non-technical state of, at most, all the domain objects.

These two principles, along with the general design of current O-R mappers like Hibernate, lead to the following basic pattern:

  • Each unit of work follows the pattern:
    • Get hold of a set of entities (instances of persisted objects) via a persistence service. These populate the first-level cache (the Hibernate session) initially. They can also be regarded as the "working copy" of a part of the database. I'll use this term from now on to avoid misunderstandings due to synonyms.
    • Call methods on elements of the working copy, possibly pulling further entities from the database into the working copy.
    • Flush changes to the working copy by executing the corresponding SQL on the DB and commit the changes to the database.
    • Repeat the previous two steps as appropriate until the probability of the working copy becoming either stale or too large grows unacceptably.
    • Each unexpected exception occurring during the previous steps invalidates the working copy. All references to it must be released.
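
The steps above can be sketched in plain Java. This is only a toy model under stated assumptions, not Hibernate code: Account, FakeDatabase and WorkingCopy are hypothetical names invented for illustration, the "database" is an in-memory map, and flushing simply copies values back instead of issuing SQL.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical entity; the id stands in for the primary key. */
class Account {
    final long id;
    long balance;
    Account(long id, long balance) { this.id = id; this.balance = balance; }
}

/** Hypothetical stand-in for the database: one row (the balance) per account id. */
class FakeDatabase {
    final Map<Long, Long> rows = new HashMap<>();
}

/** Hypothetical working copy; it plays the role the Hibernate session plays in the text. */
class WorkingCopy {
    private final FakeDatabase db;
    private final Map<Long, Account> loaded = new HashMap<>();
    private boolean valid = true;

    WorkingCopy(FakeDatabase db) { this.db = db; }

    /** Pulls an entity into the working copy; at most one Java instance exists per id. */
    Account get(long id) {
        ensureValid();
        return loaded.computeIfAbsent(id, key -> {
            Long balance = db.rows.get(key);
            if (balance == null) throw new IllegalArgumentException("no row for id " + key);
            return new Account(key, balance);
        });
    }

    /** Writes all loaded state back; a real mapper would execute SQL and commit here. */
    void flushAndCommit() {
        ensureValid();
        for (Account a : loaded.values()) db.rows.put(a.id, a.balance);
    }

    /** Runs one step, then flushes; any unexpected exception invalidates the working copy. */
    void run(Consumer<WorkingCopy> step) {
        ensureValid();
        try {
            step.accept(this);
            flushAndCommit();
        } catch (RuntimeException e) {
            valid = false;   // the working copy must not be used again
            loaded.clear();  // release all references to its entities
            throw e;
        }
    }

    boolean isValid() { return valid; }

    private void ensureValid() {
        if (!valid) throw new IllegalStateException("working copy was invalidated");
    }
}

class WorkingCopyDemo {
    public static void main(String[] args) {
        FakeDatabase db = new FakeDatabase();
        db.rows.put(1L, 100L);
        db.rows.put(2L, 50L);

        WorkingCopy wc = new WorkingCopy(db);
        // One step: call methods on the working copy, then flush and commit.
        wc.run(w -> { w.get(1L).balance -= 30; w.get(2L).balance += 30; });
        System.out.println(db.rows);
    }
}
```

The point of the sketch is the failure path: once a step throws, the working copy refuses all further use, so the caller has to start over with a fresh one.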

With regard to this basic pattern it is particularly noteworthy that:

  • The lifecycle of the working copy is not a priori tied to some web-application state (neither request nor session). It usually lasts longer than a request but shorter than a web session. This is what the Hibernate guys call the session-per-conversation pattern.
  • Elements of the working copy live and die together. That means they are loaded into one Hibernate session - and they stay with this session until the GC collects them all. This avoids LazyInitializationExceptions (an entity survived its working copy, i.e. references a dead Hibernate session) and NonUniqueObjectExceptions (an entity attempts to join a working copy a second time - as a different Java instance).

A technically convenient, if somewhat unrelated, requirement is:

  • An application should, whenever possible, work on only one transactional resource: the database. This seems hard at first sight, since most applications need reliable (transactional) communication with their neighbours, but it can be achieved nicely by using messaging services embedded in the database (e.g. Oracle Advanced Queuing). Of course, JTA transaction management is an option here - but it's costly (license fees and runtime overhead) and not always easy to configure.

Conversations

A conversation is the lifetime of a working copy. The notion of conversations as something which can and should be independent of the usual web-application state objects, web session and HTTP request, emerged, as far as I know, in the Hibernate community (see http://hibernate.org/42.html).

Obviously most of the previous section's principles are about the correct implementation of conversations. A web application sticking to these principles should support conversations as a scope of object lifetime that lasts longer than a request but shorter than a web session.

Besides choosing the right longevity for a conversation, one has to decide on the cardinality of conversations. The JBoss Seam framework, for example, lets one user session have many active, concurrent conversations, spawned as the user opens new windows or tabs of the web browser. While this is technically impressive when compared to the rather rudimentary support for multi-windowed access to web apps in other web frameworks, I don't see a real need for it in most cases.
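
A conversation scope with exactly one active conversation per web session might look like the following sketch. All names here (Conversation, FakeWebSession, Conversations, the attribute key) are hypothetical, and a plain map stands in for the attribute store of a real web session; how a conversation is started and ended in a real framework is left open.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical conversation: owns one working copy (here just a map of loaded entities). */
class Conversation {
    final Map<String, Object> workingCopy = new HashMap<>();
    private boolean open = true;

    boolean isOpen() { return open; }

    /** Ends the conversation and releases all references to its working copy. */
    void end() {
        open = false;
        workingCopy.clear();
    }
}

/** Hypothetical stand-in for the web session: a plain attribute map. */
class FakeWebSession {
    final Map<String, Object> attributes = new HashMap<>();
}

class Conversations {
    private static final String KEY = "current.conversation"; // assumed attribute name

    /** Returns the session's single active conversation, starting a new one if needed. */
    static Conversation current(FakeWebSession session) {
        Conversation c = (Conversation) session.attributes.get(KEY);
        if (c == null || !c.isOpen()) {
            c = new Conversation();
            session.attributes.put(KEY, c);
        }
        return c;
    }

    /** Ends the current conversation; the web session itself stays alive. */
    static void end(FakeWebSession session) {
        Conversation c = (Conversation) session.attributes.remove(KEY);
        if (c != null) c.end();
    }
}
```

Every request handler would call Conversations.current(session) and therefore see the same working copy across several requests, until some handler calls Conversations.end(session), for example after a commit or an unexpected exception.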

Common Fallacies

I'll conclude with refutations of some arguments which are often brought up against the previously sketched design.

My Entities are POJOs, so they can and should be treated as just this.

The fact that a class can be compiled and instantiated using nothing but the standard Java SDK / runtime (that's what I take the term "POJO" to mean) does not imply that all of its behaviour is in its code, or more precisely in the code of the domain model which it is part of, combined with the specified Java runtime behaviour. Good O-R mappers like Hibernate (or indeed any of the more interesting recent Java frameworks) will do all sorts of things to it, including replacing instances with instances of generated proxy classes, manipulating their byte code, and adding technical state to their domain-specific state.
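
To illustrate how a framework can hand out something other than the class you wrote, here is a minimal sketch using the JDK's own java.lang.reflect.Proxy. It is an assumption-laden stand-in: Hibernate typically generates proxy subclasses at runtime rather than JDK interface proxies, and Customer, PlainCustomer and LazyProxies are hypothetical names.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.function.Supplier;

/** Hypothetical domain interface. */
interface Customer {
    String getName();
}

/** The "POJO" the application programmer wrote. */
class PlainCustomer implements Customer {
    private final String name;
    PlainCustomer(String name) { this.name = name; }
    @Override public String getName() { return name; }
}

class LazyProxies {
    /** Returns a Customer that loads its real state only when a method is first called. */
    static Customer lazy(Supplier<Customer> loader) {
        InvocationHandler handler = new InvocationHandler() {
            private Customer target; // technical state the domain class knows nothing about

            @Override
            public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
                if (target == null) {
                    target = loader.get(); // stands in for the database round trip
                }
                return method.invoke(target, args);
            }
        };
        return (Customer) Proxy.newProxyInstance(
                Customer.class.getClassLoader(),
                new Class<?>[] { Customer.class },
                handler);
    }
}
```

A caller holding the result of LazyProxies.lazy(...) sees a Customer, yet the object is not a PlainCustomer, and whether getName() succeeds depends on whether the loader can still reach its data source at that moment.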

Stateful services are evil/complex/don't perform well

Dialog applications need state somewhere. With web-applications, you can choose to keep it

  • In the browser. This is good for simple applications with lots of concurrent users but performs badly for complex applications, because with each browser event the complete context necessary to interpret it has to be transmitted in the HTTP request and deserialised into the corresponding backend objects.
  • On the server, in the web session and/or some stateful backend service. The advantage is less data in HTTP requests; the downside is that all active user sessions must fit into server memory.

These considerations make the choice easy for typical intra-/extranet business applications. They usually involve complex logic and are used by a limited number of concurrent users - who, when confronted with response times above half a second, tend to call for their old 3270 host terminals.

A web application should delegate business logic to a stateless backend, preferably using DTOs (dumb data transfer objects)

It's certainly a good thing to separate presentation logic from business logic. That does not, however, imply that there has to be a "physical border" between them, so that domain objects must be treated like remote objects. That's rubbish from the very first days of EJB 1.x, which stays alarmingly popular - maybe due to the procedural style that goes with it, which still appeals to a lot of people.