At my work, we use a product that has a purely object oriented persistent storage (as opposed to an object/relational model, where relational data is mapped onto objects). I'm not sure that naming the specific product is important, but if you gathered all the object oriented persistent stores up and alphabetized them, you would get far through the list before you found it.
Some of my earliest introductions to object oriented software development described the flexibility and nimbleness of developing software in this model compared to previous methodologies. Encapsulation allows an implementation to change without changing the interface presented to an object's clients. Derivation allows you to extend behavior in subclasses without modifying existing code in the base classes. The common theme among the touted features was that change becomes easier, so development moves faster.
I suspect that, at least in our case, an object oriented persistent store negates this advantage by tying object classes to specific implementations at a specific point in time. Our application has two objects that act as repositories, or brokers, for a heterogeneous collection of objects. One manages "content" objects like stories, and the other manages "resource" objects like CSS files and the image and text pieces that make up the look and feel of the visual presentation. Both derive from the same class, and both are composed of things like "catalog" classes for efficient searching and storage classes for ease of access by key. The software has tools to import a directory of resources from the filesystem into the persistent store, or to export the persistent store into a tar archive for storage elsewhere (like the CVS revision control system).
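To make the shape of this concrete, here is a minimal sketch of that repository structure. All the names here (Catalog, RepositoryBase, and so on) are made up for illustration; the real product's classes are different.

```python
class Catalog:
    """Indexes objects for efficient searching (here: by type name only)."""
    def __init__(self):
        self._by_type = {}

    def index(self, key, obj):
        self._by_type.setdefault(type(obj).__name__, set()).add(key)

    def search(self, type_name):
        return self._by_type.get(type_name, set())


class RepositoryBase:
    """Common base class for both managers: keyed storage plus a catalog."""
    def __init__(self):
        self._storage = {}         # storage class: access by key
        self._catalog = Catalog()  # catalog class: searching

    def add(self, key, obj):
        self._storage[key] = obj
        self._catalog.index(key, obj)

    def get(self, key):
        return self._storage[key]


class ContentManager(RepositoryBase):
    """Broker for "content" objects like stories."""

class ResourceManager(RepositoryBase):
    """Broker for "resource" objects: CSS files, images, text pieces."""
```

The important part for what follows is that the managers hold on to whatever heterogeneous objects get stored in them, indefinitely.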
In our last software release, someone created a resource with a ridiculously obscure name, "2nd_nav_topstories_off" or something like that. The attempt to import the resources failed: there was already an item with that name in the repository, but any attempt to read or modify that item would also fail. What I wound up finding out was that the item had been put there earlier, and after it was stored, someone modified the repository code so that it would call .makeObsolete on objects being removed. Our image types were all updated to have that method, but this existing object was one of the basic image types (the ones that come with the app server) and knew nothing about it.
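The failure mode reduces to something like the following (class and function names are hypothetical; only the shape of the bug matches what happened):

```python
class StockImage:
    """A basic image type shipped with the app server; no makeObsolete()."""
    def __init__(self, name):
        self.name = name

class AppImage(StockImage):
    """Our application's image type, updated for the new protocol."""
    def makeObsolete(self):
        self.obsolete = True

def remove(repository, name):
    obj = repository[name]
    obj.makeObsolete()   # raises AttributeError for pre-change objects
    del repository[name]
    return obj

# An old stock-type object is already sitting in the store...
repo = {"2nd_nav_topstories_off": StockImage("2nd_nav_topstories_off")}

try:
    remove(repo, "2nd_nav_topstories_off")
except AttributeError as e:
    print(e)   # 'StockImage' object has no attribute 'makeObsolete'
```

Because the removal fails partway through, the stale object stays stuck in the repository, which is exactly the state we found ourselves in.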
To describe this in terms of a statically typed object oriented language: someone made internal changes so that the signature of a method effectively changed from "public Object getResource(Key k)" to "public Resource getResource(Key k)" (by treating the object as a Resource within the method), but not all of the data saved by the original version was one of our application's Resource types. Looking at this from the statically typed Java point of view, Java solves this problem by hashing the fields and type signatures of an object's class as it serializes the object, and checking the hash as it deserializes. It will throw an exception if you try to deserialize an object against a different version of the class than the one it was serialized with (with an escape hatch called the serialVersionUID field).
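That Java mechanism can be imitated in Python, as a sketch. This is not how pickle behaves by default, and the fingerprinting here is much cruder than Java's serialVersionUID computation; it just hashes the class's public attribute names:

```python
import hashlib
import pickle

def class_fingerprint(cls):
    """Hash the class's public attribute names, loosely like a serialVersionUID."""
    names = sorted(n for n in vars(cls) if not n.startswith("_"))
    return hashlib.sha1(",".join(names).encode()).hexdigest()

class Versioned:
    def __getstate__(self):
        # Record the fingerprint of the class as it exists at save time.
        state = dict(self.__dict__)
        state["_fingerprint"] = class_fingerprint(type(self))
        return state

    def __setstate__(self, state):
        # Refuse to load state saved against a different class definition.
        if state.pop("_fingerprint") != class_fingerprint(type(self)):
            raise pickle.UnpicklingError(
                "stored object predates the current class definition")
        self.__dict__.update(state)

class Resource(Versioned):
    def getData(self):
        return "data"

blob = pickle.dumps(Resource())       # stored while the class had only getData()

Resource.makeObsolete = lambda self: None   # the class changes after the fact

try:
    pickle.loads(blob)
except pickle.UnpicklingError as e:
    print(e)   # stored object predates the current class definition
```

With a check like this, the stale object would at least have failed loudly and early, instead of lurking until removal time.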
It's sort of odd that using a less dynamic language than Python wouldn't have allowed this to happen: all of the objects in the repository would likely have had to adhere to a specific interface or abstract superclass. Some people will say that the only solution is a dump and reload of the data, since that is what projects built on RDBMSs do. I'm not sure I buy that either. First, a "dump and reload" of an arbitrarily connected object graph is much more difficult than one of a tabular structure. Second, we had most of the dump and reload functionality built into the application, and it didn't help us. Third, it wasn't the data attributes of the objects that changed, or the interface of the repository objects, so I'm not sure how a dump and reload would have helped here.

What would have prevented this? I guess a bunch of things. If the "import from the filesystem" code cleared the repository, the older objects would have been erased. That is almost what the writers intended to do, and I guess the only argument I have is that if a system makes it easy for errors like this to happen, the errors will occur. A fixup script to run before deploying new code? That could have worked too, although it would be somewhat annoying to keep track of. When mulling this over with a co-worker, the idea came up of saving older versions of the code along with the current version, with small adaptors written from one version to the next.
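That adaptor idea could be sketched like this. Everything here is hypothetical (the version numbers, the renamed field, the registry), but it shows the chain-of-small-upgrades shape: each adaptor only knows how to move stored state forward by one version.

```python
ADAPTORS = {}

def adaptor(from_version):
    """Register a function that upgrades state from one version to the next."""
    def register(fn):
        ADAPTORS[from_version] = fn
        return fn
    return register

CURRENT_VERSION = 3

def upgrade(state):
    """Run the chain of adaptors until the stored state is current."""
    while state.get("version", 1) < CURRENT_VERSION:
        state = ADAPTORS[state.get("version", 1)](state)
    return state

@adaptor(1)
def v1_to_v2(state):
    # Pretend v2 renamed 'title' to 'name' (made-up change, for illustration).
    state["name"] = state.pop("title")
    state["version"] = 2
    return state

@adaptor(2)
def v2_to_v3(state):
    # Pretend v3 added an 'obsolete' flag, so makeObsolete() has somewhere to write.
    state["obsolete"] = False
    state["version"] = 3
    return state

old = {"title": "2nd_nav_topstories_off"}   # state as stored by version 1 code
print(upgrade(old))   # {'name': '2nd_nav_topstories_off', 'version': 3, 'obsolete': False}
```

The appeal is that old objects never need to be hunted down ahead of time; they get upgraded lazily, whenever they are next loaded.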
One final thing I should mention is that this incident broke the Open/Closed Principle, and some people might say that the result was inevitable (or at least reasonably likely). My concern isn't so much that the bug occurred in a persistent object store, but that the repercussions were so much larger than they would have been in systems that don't have one.