The Referenced Object Problem

The most interesting insights to come out of Ubik so far show how hard it is to provide a consistent view across local objects and those in the persistence store. This problem is not unique to Ubik; it occurs in all object-oriented database applications. However, Ubik is in a position where a solution can be applied once and benefit all applications on the platform. Consider the following query:

Brand[Manufacturer.Name = 'Microsoft']

When checking the results of this query against a local transaction, Ubik will reevaluate any results based on objects of type Brand that have been modified locally. This is reasonably efficient so long as the number of modifications is not enormous (thousands are fine; millions of modifications may suffer from the lack of indexing). This will catch the cases where Brands that fit the query have been added or removed, as well as the cases where a Brand has been associated locally with a different Manufacturer than the one associated with it on the server.
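The re-evaluation described above can be sketched roughly as follows. This is only an illustration in Python, not Ubik's actual implementation: the `Brand` and `Manufacturer` classes, `merge_local_changes` and `is_microsoft_brand` are hypothetical names invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Manufacturer:
    id: int
    name: str


@dataclass(frozen=True)
class Brand:
    id: int
    name: str
    manufacturer: Manufacturer


def is_microsoft_brand(brand):
    # Stand-in for the predicate of Brand[Manufacturer.Name = 'Microsoft'].
    return brand.manufacturer.name == "Microsoft"


def merge_local_changes(server_results, modified, deleted_ids, predicate):
    """Re-evaluate only the locally modified Brands against the predicate."""
    merged = {b.id: b for b in server_results}
    for brand in modified:  # Brands added or changed in the local transaction
        if predicate(brand):
            merged[brand.id] = brand
        else:
            merged.pop(brand.id, None)
    for brand_id in deleted_ids:  # Brands deleted in the local transaction
        merged.pop(brand_id, None)
    return sorted(merged.values(), key=lambda b: b.id)


microsoft = Manufacturer(1, "Microsoft")
contoso = Manufacturer(2, "Contoso")

# What the server returned for the query.
server_results = [Brand(10, "Windows", microsoft), Brand(11, "Office", microsoft)]

# What the local transaction has changed.
modified = [
    Brand(11, "Office", contoso),  # re-associated locally: no longer matches
    Brand(12, "Xbox", microsoft),  # added locally: now matches
]

result = merge_local_changes(server_results, modified, [], is_microsoft_brand)
print([b.name for b in result])
```

The cost is proportional to the number of local modifications rather than to the number of Brands on the server, which is why this step stays cheap in the common case.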

There is another factor to consider, however: the Name property of a Manufacturer may have changed locally, affecting the result that should be returned. Consider the case where a Manufacturer not previously called 'Microsoft' has been given the name 'Microsoft' in the current transaction. Detecting such a condition is not hard, and the logic can be implemented reasonably efficiently. The problem is: once we've detected such a change, how do we then determine which Brand objects are affected?

This seemingly simple problem is surprisingly tough to solve. Effectively we could end up having to test all of the Brands in the system to see whether they are affected by the change (this is what the current Ubik implementation will do!)
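To make the difficulty concrete, here is a small Python sketch of the detection step and the brute-force fallback. The names (`REFERENCED_PROPERTIES`, `referenced_object_problem`, `full_local_execution`) are assumptions made for illustration and do not come from Ubik itself.

```python
from dataclasses import dataclass


@dataclass
class Manufacturer:
    id: int
    name: str


@dataclass
class Brand:
    id: int
    name: str
    manufacturer: Manufacturer


# Hypothetical description of what the query reads through a reference:
# Brand[Manufacturer.Name = 'Microsoft'] reads Manufacturer.name.
REFERENCED_PROPERTIES = {("Manufacturer", "name")}


def referenced_object_problem(local_changes):
    """local_changes: (type_name, property_name) pairs changed in the transaction."""
    return any(change in REFERENCED_PROPERTIES for change in local_changes)


def full_local_execution(all_brands):
    # Fallback: every Brand has to be tested, because nothing tells us
    # which Brands point at the Manufacturer that was renamed.
    return [b for b in all_brands if b.manufacturer.name == "Microsoft"]


contoso = Manufacturer(2, "Contoso")
all_brands = [Brand(20, "Widget", contoso), Brand(21, "Gadget", contoso)]

contoso.name = "Microsoft"  # local rename inside the transaction
local_changes = [("Manufacturer", "name")]

matches = []
if referenced_object_problem(local_changes):
    matches = full_local_execution(all_brands)
print([b.name for b in matches])
```

Detection is a cheap set lookup; the expensive part is the scan over every Brand that follows it.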

So far a few different options have been identified:

1.      Current solution - execute the entire query locally. This is usable because the scenario is quite rare, and when the type being queried has a limited number of instances there is not a huge performance problem. It will be unworkable in some situations, though.

2.      Raise an exception whenever this situation is detected - this would (hopefully) uncover any code triggering this scenario before it becomes a production problem. Not ideal, as this could cause hard-to-find production bugs.

2a.      Raise an exception as for (2), but provide an additional parameter to 'Select' that overrides this behaviour, either by falling back to (1) or by ignoring the problem (in some cases the developer may understand the context well enough to write correct code without worrying about the effects of the local changes on the query result). This option might buy some time while (4) is more fully explored and implemented.

3.      Use local execution as is currently being done, but use the rest of the query to restrict the number of items retrieved for local evaluation. Same problems as the first solution, though with better performance in many cases.

4.      Get a list of the affected referenced objects, execute a second query to determine which of the queried objects are affected by the changes to the referenced objects, and recalculate based on this. This may be more or less efficient depending on circumstances - sometimes the extra roundtrip will take more time than the local execution in (1). Requires transformation of the query, which is hard to implement. For example, if Manufacturer {a} was modified, execute Brand[Manufacturer = {a} ...] and then re-execute the query on the objects retrieved. Still on shaky ground, but there is some promise; this will be explored as a potential workaround.
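Option (4) can be sketched as follows. This is a simplified Python illustration under several assumptions: the server is simulated with in-memory lists, the names `server_query_by_name`, `server_query_by_manufacturers` and `select_brands` are hypothetical, and local changes to the Brands themselves are ignored to keep the example short.

```python
from dataclasses import dataclass


@dataclass
class Brand:
    id: int
    name: str
    manufacturer_id: int


# Stand-in for the persistence store, as the server currently sees it.
SERVER_BRANDS = [
    Brand(1, "Windows", 100),
    Brand(2, "Widget", 200),
    Brand(3, "Gadget", 200),
    Brand(4, "Other", 300),
]
SERVER_MANUFACTURER_NAMES = {100: "Microsoft", 200: "Contoso", 300: "Fabrikam"}


def server_query_by_name(name):
    # First roundtrip: the original query, evaluated against server state.
    return [b for b in SERVER_BRANDS
            if SERVER_MANUFACTURER_NAMES[b.manufacturer_id] == name]


def server_query_by_manufacturers(manufacturer_ids):
    # Second roundtrip: the transformed query Brand[Manufacturer = {a} ...].
    return [b for b in SERVER_BRANDS if b.manufacturer_id in manufacturer_ids]


def select_brands(name, local_names):
    """local_names maps the id of each locally modified Manufacturer to its new name."""
    results = {b.id: b for b in server_query_by_name(name)}
    if local_names:
        for brand in server_query_by_manufacturers(set(local_names)):
            # Re-execute the predicate using the in-transaction name.
            if local_names[brand.manufacturer_id] == name:
                results[brand.id] = brand
            else:
                results.pop(brand.id, None)
    return sorted(results.values(), key=lambda b: b.id)


# Locally, Manufacturer 200 was renamed to 'Microsoft' and 100 was renamed away.
found = select_brands("Microsoft", {200: "Microsoft", 100: "Northwind"})
print([b.name for b in found])
```

Only the Brands that reference a modified Manufacturer are fetched and re-tested, at the price of the second roundtrip mentioned in (4).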

Solved with help from Hibernate and Luke!

Although Ubik transactions are unable to take advantage of the transactional capabilities of the database, it is still possible to approach this problem in a way similar to the way it is solved by Hibernate (and possibly other ORMs). When the scenario above is detected, the modifications that are causing the problem can be sent to the server along with the query. These records can be written to the database in a temporary transaction that is rolled back once the query has executed. In this way, the query will return results that are correct with respect to the client's in-memory session state.
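The write-query-rollback idea can be illustrated with a minimal Python sketch. It uses the standard `sqlite3` module as a stand-in database; the table layout and the `query_with_pending_changes` helper are assumptions made for the example, not Ubik's schema or API.

```python
import sqlite3


def query_with_pending_changes(conn, pending_updates, query, params=()):
    """Apply the client's pending changes, run the query, then roll back."""
    try:
        for sql, args in pending_updates:
            conn.execute(sql, args)  # sqlite3 opens a transaction implicitly
        return conn.execute(query, params).fetchall()
    finally:
        conn.rollback()  # the temporary writes never become permanent


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE manufacturer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE brand (
        id INTEGER PRIMARY KEY,
        name TEXT,
        manufacturer_id INTEGER REFERENCES manufacturer(id)
    );
    INSERT INTO manufacturer VALUES (1, 'Microsoft'), (2, 'Contoso');
    INSERT INTO brand VALUES (1, 'Windows', 1), (2, 'Widget', 2);
""")
conn.commit()

QUERY = """
    SELECT b.name FROM brand b
    JOIN manufacturer m ON m.id = b.manufacturer_id
    WHERE m.name = ? ORDER BY b.id
"""

# The client renamed Manufacturer 2 to 'Microsoft' in its local transaction.
pending = [("UPDATE manufacturer SET name = ? WHERE id = ?", ("Microsoft", 2))]

with_local_state = query_with_pending_changes(conn, pending, QUERY, ("Microsoft",))
server_state = conn.execute(QUERY, ("Microsoft",)).fetchall()
print(with_local_state, server_state)
```

The first result reflects the client's pending rename, while the second shows that the stored data is unchanged after the rollback.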