Tuesday, May 22, 2007

Comparing beans, or why equality is subjective

In my project I use Hibernate and DWR. Between my database and application I have to deal with Hibernate fiddling with my objects, and between my client (browser) and application I have to deal with DWR.

This all seems well and fine and generally it is. In fact kudos are in order.

However, there is no need to list all the key features again; you can check the products' websites for that. What I would like to describe is a fundamental problem of comparing objects that these frameworks do not solve.

The essence of the problem is that in a pure object-oriented environment all objects have identity (== in Java), but once you go beyond the boundaries of your environment that law no longer applies. In other words: once your application lets go of an object, that object is gone forever. An example makes this concrete.

Say my application has some simple user management. A User object can be persisted to the database and retrieved when a user logs in. The user can edit his profile (user properties) in the client, and since I'm using Ajax I don't want the user to have to reload and redo his changes if I restart my application.

In real life each user is unique, so I expect the same from my application, database and client. Sounds simple, but it isn't. First let's have a look at the Hibernate persistence layer. In a relational database there is no identity by default (it is impossible to distinguish two records with exactly the same column values). This is easily fixed by adding a primary key to the table. Hibernate can do this for you. You just have to put some sort of Id property on your POJO and tell Hibernate to use a sequence to fill in this property once the object is saved (eww).
Now there are three ways to obtain an object:
  1. create it yourself
  2. get it from Hibernate
  3. get it from somewhere else (the client)
In the first case the Id will not be initialized (obviously, because Hibernate is in charge of that). In the second case the Id will definitely be initialized, otherwise the primary key constraint would be violated. The third case can go either way. If the client got the object from the application in the first place, there is a good chance that it has been saved before. In that case the client had better keep track of the id. If the client is trying to create a new user, there will not be an id.
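The three cases can be sketched in plain Java. This is only an illustration: FakeStore is a hypothetical stand-in that mimics a sequence-backed save and is not Hibernate's API, and the User fields and the isTransient() helper are assumptions made for the example.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical POJO. In a real mapping the id would be the identifier
// property that Hibernate fills from a sequence when the object is saved.
class User {
    private Long id;                 // null until the object has been saved
    private final String username;

    User(String username) { this.username = username; }

    Long getId() { return id; }
    void setId(Long id) { this.id = id; }
    String getUsername() { return username; }

    // An object without an id has never been persisted.
    boolean isTransient() { return id == null; }
}

// Stand-in for the persistence layer (an assumption, NOT Hibernate).
class FakeStore {
    private final AtomicLong sequence = new AtomicLong(1);

    // Assigns the next sequence value to objects that have no id yet.
    User save(User user) {
        if (user.isTransient()) {
            user.setId(sequence.getAndIncrement());
        }
        return user;
    }
}

class ObtainDemo {
    public static void main(String[] args) {
        FakeStore store = new FakeStore();

        // Case 1: created by the application itself, so no id yet.
        User created = new User("alice");
        System.out.println(created.isTransient());       // true

        // Case 2: came out of the persistence layer, so the id is set.
        User saved = store.save(new User("bob"));
        System.out.println(saved.isTransient());         // false

        // Case 3a: rebuilt from client data, and the client kept the id.
        User knownFromClient = new User("bob");
        knownFromClient.setId(saved.getId());
        System.out.println(knownFromClient.isTransient()); // false

        // Case 3b: the client is creating a new user, so there is no id.
        User newFromClient = new User("carol");
        System.out.println(newFromClient.isTransient());   // true
    }
}
```

Note that in case 3a the object carries the right id but is still a different instance from the one the store returned, which is where the rest of this post is heading.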

Actually it is even a bit more complicated, as I might want to allow the application to pass objects to the client before they are saved. In this scenario the client and the application need to be able to negotiate identity between objects without polluting the id property that Hibernate needs to fill once the object gets saved. (If you get lost here it is probably a good idea to read up on Hibernate and the equals() method.)

Note that I have been talking about both identity and equality. Equality is functional identity (i.e. two distinct objects in the JVM can represent the same object from the user's perspective).

If a user loads his User object in the client and after some changes he hits the save button, the application needs to be able to see that the existing User with a certain identity should be saved (resulting in a database update). You could establish this behavior with a smart implementation of the equals() method.
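One possible reading of a "smart implementation" is to base equals() on a business key instead of the generated id, so that an object coming back from the client compares equal to its saved counterpart whether or not the id survived the round trip. The sketch below assumes that username is unique and never changes; that assumption, and the field names, are mine and only serve the example.

```java
// Hypothetical User whose equality is defined by a business key.
class User {
    private Long id;                 // generated on save; deliberately ignored by equals()
    private final String username;   // assumed unique and immutable business key
    private String email;

    User(String username) { this.username = username; }

    Long getId() { return id; }
    void setId(Long id) { this.id = id; }
    String getUsername() { return username; }
    String getEmail() { return email; }
    void setEmail(String email) { this.email = email; }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;               // same instance: technical identity
        if (!(other instanceof User)) return false;
        return username.equals(((User) other).username); // functional equality
    }

    @Override
    public int hashCode() {
        // Must agree with equals(), so it uses the same key.
        return username.hashCode();
    }
}

class EqualsDemo {
    public static void main(String[] args) {
        User fromDatabase = new User("alice");
        fromDatabase.setId(42L);

        User fromClient = new User("alice");          // arrived without an id

        System.out.println(fromDatabase == fromClient);      // false
        System.out.println(fromDatabase.equals(fromClient)); // true
    }
}
```

With this in place the application can recognise the incoming object as an existing user and treat the save as an update, but it still holds two separate instances.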

It is important to make a clear distinction between functional equality and technical identity. To illustrate, suppose the application is a friends network. One user is logged in and updating his details. Another user is logged in and adding the first user to his network. To do this he fills in the user form, and the client sends the new user object to the application. The application needs to check whether that user is equal to a known user to prevent duplication. There also need to be two different instances of the first user, and the identity of these user objects needs to be managed over the DWR pipes.

Now we have two objects in our application that are functionally equal, but technically distinct. This exposes the problem I'm trying to get at. Since the two objects are equal but not identical, either updates done in the first session do not show up in the second, or you have to constantly manage conflicting changes whenever either user saves the object.
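The situation can be reproduced without any framework. In this sketch the two variables stand in for the copies held by the two sessions; the class, the email property and the example addresses are hypothetical.

```java
// Hypothetical User, equal by username only.
class User {
    private final String username;
    private String email;

    User(String username, String email) {
        this.username = username;
        this.email = email;
    }

    String getEmail() { return email; }
    void setEmail(String email) { this.email = email; }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof User)) return false;
        return username.equals(((User) other).username);
    }

    @Override
    public int hashCode() { return username.hashCode(); }
}

class StaleCopyDemo {
    public static void main(String[] args) {
        // Two sessions each hold their own copy of the same user.
        User sessionOne = new User("alice", "alice@old.example");
        User sessionTwo = new User("alice", "alice@old.example");

        // The first session edits its copy.
        sessionOne.setEmail("alice@new.example");

        // Still functionally equal, still technically distinct...
        System.out.println(sessionOne.equals(sessionTwo)); // true
        System.out.println(sessionOne == sessionTwo);      // false

        // ...and the second session never sees the change.
        System.out.println(sessionTwo.getEmail());         // alice@old.example
    }
}
```

If the second session now saves its copy, the stale email goes along with it, which is exactly the conflict described above.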

Elegant solutions are most welcome.