* Rucksack: a flexible, lightweight, open source persistence library
* Arthur Lemmens, alemmens@xs4all.nl, 2006-04-30
* (talk given at the ECLM 2006 in Hamburg)

1. Introduction
2. Serialization
3. Object table/cache
4. Garbage collection
5. Transactions
6. Recovery
7. Indexing
8. Schemas
9. Questions

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

1. Introduction

* EXPERTISE

Maybe I should start by saying that I don't consider myself an expert on any of the subjects that I'm going to talk about today (except maybe Lisp programming). But I've been wanting a Lisp persistence library for years, and nobody else was writing it for me (or, if they were, the results lacked features that I considered essential). So I finally decided to ignore the fact that I didn't know anything about things like database implementation, transactions or garbage collectors and just do it. So here we are...

* FEATURES

Rucksack is a persistence library for Common Lisp. It's a bit similar to systems like AllegroCache or PLOB, but it's also different in some important respects. Here are some of its features:

PICTURE
- Common Lisp only
- 99% portable
- persistent conses, vectors, CLOS objects, ...
- object cache
- parallel transactions
- incremental garbage collector
- schema evolution
- use MOP to automatically deal with slot changes
- btree indexing for class instances and slot values
- flexible architecture
- not finished
END

- It's all written in Common Lisp.
- It's almost all portable, except for process locks and some MOP magic (I write and test it with LispWorks).
- It tries to provide persistent equivalents of Lisp's classical data structures, including persistent conses, persistent vectors and persistent CLOS instances.
- It has an object-based cache; changes to persistent objects are written to disk (serialized) during a transaction commit.
- It supports parallel transactions.
- It has an incremental garbage collector.
- It supports schema evolution, in the sense that it can deal with changes to persistent class definitions in a way that's similar to what CL provides with UPDATE-INSTANCE-FOR-REDEFINED-CLASS.
- It uses the MOP to automatically deal with slot changes.
- It provides indexing for instances and slot values. Btrees are included; user-defined indexes are also possible.
- I try hard to keep it flexible and readable, so it should be relatively easy to adapt to your own needs.
- It's not finished yet. I've written code for almost everything mentioned above, but there are still quite a few loose ends and it needs some heavy testing.

* THIS TALK

PICTURE
- serialization                       How do I get it on disk?
- object ids                          How do I know where I put it?
- cache                               Déjà vu.
- garbage collection                  How do I get rid of this mess?
- parallel transactions               Have my cake and eat it too.
- failure recovery                    What if somebody pulls the plug?
- automatic slot and class indexing   How do I find it back?
- schema evolution                    What if my class definition changes?
END

I'm going to present Rucksack in about the same order as I developed it. This means we'll start at the bottom, with things like serialization, object ids, caching and garbage collection. Then we'll move on to headache stuff like parallel transactions, recovering from failure, using the MOP for automatic slot and class indexing, and schema evolution.

* THE JUNGLE

Writing a persistence library is like hacking your way through a jungle. At each point there are difficult decisions to make, and making the best decision is almost impossible.
In fact, what's best will often depend on the application. Garbage collection is a nice example. One week I thought I should write a mark-and-sweep collector, the next week I thought that a copying collector would be better. Or maybe some kind of mixture? I ended up writing half a copying collector, then changed my mind and wrote a mark-and-sweep collector. Now I'm having doubts again.

But this is nothing new for most Common Lisp programmers. We're used to the fact that there is no single best programming paradigm, no single best answer. Instead, Common Lisp provides a flexible programming framework that you can adapt to the problem at hand. I try to do the same for Rucksack.

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

2. Objects on disk: serialization

* INTRO

The serializer is the low-level layer that takes care of writing objects to disk and reading them back again.

* SHARED OBJECTS

One of the things that can complicate serialization is that the serializer must deal with shared objects in such a way that they can be reconstructed correctly by the 'deserializer'. This can have quite a large effect on the memory usage of a serializer (there's a reason why Common Lisp has the *print-circle* flag).

Fortunately, we don't need to worry about shared objects when we're serializing in Rucksack: Rucksack only respects the object identity of objects that are explicitly declared to be persistent objects. (Erm, well, that's not entirely true: there are some corner cases like symbols and packages, but I'm not going to go into that now.)

* WHY NOT WRITE AND READ

If you know that you don't need to worry about shared objects, serializing objects to disk is easy. In principle, you could just use Common Lisp's WRITE function to write the object to disk. Then you can use READ to deserialize the object when you need it again.

This is possible, but it would be slow. Lisp's syntax for representing data was designed to be writable and readable by humans. The serializer doesn't have to worry about human readability, so it can make decisions that allow for smaller representations and much faster reading. In my experience, the speed difference can easily be one or two orders of magnitude. Two examples:

- 'Container objects' like vectors and lists are prefixed by the number of elements they contain. This means that the deserializer can pre-allocate a container of exactly the right size.

- The serializer prefixes every object by its type. This is not necessarily equivalent to a Common Lisp type, but gives enough information to the deserializer so that it can prepare itself for what's coming.

* MARKERS

CODE

(defconstant +minus-one+ #x09)
(defconstant +zero+      #x0A)
(defconstant +one+       #x0B)
(defconstant +two+       #x0C)

(defconstant +object+                   #x70)
(defconstant +unbound-slot+             #x71)
(defconstant +shared-object-definition+ #x72)
(defconstant +shared-object-reference+  #x73)

(defconstant +hash-table+ #x80)

(defconstant +pathname+ #x90)

(defconstant +array+ #xA0)

END

The markers above are hard-wired constants, and I define them explicitly. This may look a bit un-lispy; I've seen other serialization libraries where the marker numbers automatically roll out of some macro. I don't do that. I do it the old-fashioned way because I want a well-defined file format for Rucksack; there should not be any implementation or platform dependencies in Rucksack's file format.
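For illustration, here's roughly what a method pair for one-dimensional vectors could look like, using both tricks at once: a type marker first, then a length prefix, then the elements. This is only a sketch, not Rucksack's actual array code (which also has to handle element types, fill pointers and multi-dimensional arrays); the +VECTOR+ marker is invented for this example, and DESERIALIZE-CONTENTS is the generic function we'll meet in a moment.

CODE

;; Hypothetical marker, invented for this example.
(defconstant +vector+ #xA1)

(defmethod serialize ((vector vector) stream)
  ;; Write a type marker first, then the length, then the elements.
  ;; (A real serializer would have more specific methods for strings
  ;; and other special kinds of vectors.)
  (serialize-marker +vector+ stream)
  (serialize (length vector) stream)
  (loop for element across vector
        do (serialize element stream)))

(defmethod deserialize-contents ((marker (eql +vector+)) stream)
  ;; Because the length comes first, we can pre-allocate a vector of
  ;; exactly the right size before filling it.
  (let* ((length (deserialize stream))
         (vector (make-array length)))
    (dotimes (i length)
      (setf (aref vector i) (deserialize stream)))
    vector))

END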
CODE FRAGMENT: DESERIALIZE

(defun deserialize (serializer &optional (eof-error-p t) (eof-value nil))
  "Reads the next object from the serializer stream.  Signals an
end-of-file error or returns EOF-VALUE when the end of the stream is
reached."
  (let ((marker (read-next-marker serializer)))
    (if marker
        (deserialize-contents marker serializer)
        ;; End of file
        (if eof-error-p
            (error 'end-of-file :stream serializer)
            eof-value))))

END

The top-level DESERIALIZE function just reads a marker and then calls the generic function DESERIALIZE-CONTENTS. DESERIALIZE-CONTENTS has a different method for each marker. For example:

* SERIALIZING A HASH TABLE

CODE

(defmethod serialize ((hash-table hash-table) stream)
  (serialize-marker +hash-table+ stream)
  ;; Hash-table-test is guaranteed to return a symbol (for the
  ;; standardized hash-table test functions), so that's nicely
  ;; portable.
  (serialize (hash-table-test hash-table) stream)
  (serialize (hash-table-size hash-table) stream)
  (serialize (hash-table-rehash-size hash-table) stream)
  (serialize (hash-table-rehash-threshold hash-table) stream)
  (serialize (hash-table-count hash-table) stream)
  (maphash (lambda (key value)
             (serialize key stream)
             (serialize value stream))
           hash-table))

END

* DESERIALIZING IT AGAIN

CODE

(defmethod deserialize-contents ((marker (eql +hash-table+)) stream)
  (let* ((test (deserialize stream))
         (size (deserialize stream))
         (rehash-size (deserialize stream))
         (rehash-threshold (deserialize stream))
         (count (deserialize stream)))
    (let ((table (make-hash-table :test test
                                  :size size
                                  :rehash-size rehash-size
                                  :rehash-threshold rehash-threshold)))
      (loop repeat count
            do (let* ((key (deserialize stream))
                      (value (deserialize stream)))
                 (setf (gethash key table) value)))
      table)))

END

* OBJECTS THAT CAN'T BE SERIALIZED

Some Lisp objects can't be serialized portably: structs and function objects are the most obvious examples. I think that not serializing those is a small price to pay for portability, but I suppose there are exceptions.

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

3. FINDING/UPDATING OBJECTS: OBJECT TABLE, CACHE

* INTRO

That was easy. So now we're able to save normal Lisp objects to disk. And we can even load them back later. This means we're doing fine for settings where we can dump the entire world from time to time, and load it back when we need it. For many applications, this is all that's needed. And a good serializer can be orders of magnitude faster than using WRITE/READ or writing MAKE-LOAD-FORM methods, so we're already ahead of the game.

* SERIALIZING PERSISTENT OBJECTS

Things get more interesting when we need to serialize persistent objects. For persistent objects we must make sure that we respect object identity, for example. And we must save some kind of representation of the object's class, so we can recreate it correctly. And we must save all slot values, so we need some simple MOP magic to find all slots.

* OBJECT IDENTITY

Let's look at object identity first. Suppose we have a simple persistent family:

CODE

(let* ((jane (make-instance 'person :name "Jane"))
       (dick (make-instance 'person :name "Dick" :child jane))
       (mary (make-instance 'person :name "Mary" :child jane)))
  (make-instance 'family
                 ;; Try to be politically correct.
                 :parent-1 dick
                 :parent-2 mary))

END

Now JANE is a 'shared object': it is (or 'she is') referenced twice. But we don't want to save her *twice*. When serializing either DICK or MARY, we just save a *reference* to JANE.
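Here's a rough sketch of what saving such a reference could look like, reusing the +OBJECT+ marker from the table above. It's not Rucksack's actual code: the real version must also arrange for JANE's own contents to be saved (once) and has to cooperate with the transaction machinery. PERSISTENT-OBJECT and its OBJECT-ID reader are assumed names here, and PROXY is the class we'll meet in a moment.

CODE

(defmethod serialize ((object persistent-object) stream)
  ;; Don't write the object's contents here: just a marker plus the
  ;; object id.  The contents are saved separately, at most once.
  (serialize-marker +object+ stream)
  (serialize (object-id object) stream))

(defmethod deserialize-contents ((marker (eql +object+)) stream)
  ;; Don't load the referenced object yet either: return a proxy
  ;; that will be dereferenced when one of its slots is accessed.
  (make-instance 'proxy :object-id (deserialize stream)))

END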
When we *deserialize* DICK (or MARY) at a later point, we don't deserialize JANE either. Instead we fill the CHILD slot of DICK with a *proxy*. Only when the application tries to read DICK's CHILD slot will the JANE object be loaded into memory by the deserializer.

* SLOT-VALUE-USING-CLASS

We use the MetaObject Protocol to detect whenever a persistent slot is being accessed. Here's the method that makes sure that proxies are automatically dereferenced at the right moment:

CODE

(defmethod slot-value-using-class :around ((class persistent-class) object slot)
  ;; Automatically dereference proxies.
  (declare (ignore class slot))
  (maybe-dereference-proxy (call-next-method)))

END

We have similar methods on (SETF SLOT-VALUE-USING-CLASS) to hook into slot *writes* and on INITIALIZE-INSTANCE to do the right thing when a new persistent object is created in memory.

* PROXIES

Here's the definition of a proxy in Rucksack:

CODE

(defclass proxy ()
  ((object-id :initarg :object-id :reader object-id)
   (rucksack :initform (current-rucksack) :initarg :rucksack :reader rucksack))
  (:documentation "Proxies are some kind of in-memory forwarding pointer
to data in the cache.  They are never saved on disk."))

END

Instead of a class like this, we could also have used plain object ids (no-nonsense raw integers) to represent the objects. This would be more efficient, but it has two problems:

1. It would become quite difficult to work with more than one rucksack at a time, because you'd need to keep track of which object id belongs to which rucksack in your application code.

2. You lose 'type information': you can't distinguish an object id from an ordinary integer, because from the outside an object id is just an integer. This means that the application programmer would have to dereference object ids by hand, instead of having it done automatically as with proxies (unless you force a static distinction between slots that always contain object ids and slots that contain other values; but such a rigid distinction wouldn't really fit with Lisp's dynamic programming style).

* UPDATING PERSISTENT OBJECTS

One question that I had to answer for Rucksack is: how are slot values of persistent objects updated on disk? I've made a big choice that has a strong influence on the rest of Rucksack: objects on disk are never overwritten (as long as they can be referenced). Instead of serializing a new version of an object into the same disk space as an old version, the new version is serialized into some new, freshly allocated space on disk. I'll return to this decision later, when I talk about parallel transactions and recovering from crashes.

* CHOICE: OBJECT TABLE, DIRECT POINTERS, BTREES?

Another choice that a persistence library has to make is: given an object ID, how do I find the corresponding object? How do I find the disk position of the object, so my deserializer can reconstruct it in memory when I need it? For Rucksack, I've considered three possibilities:

1. Use the disk position itself as the object ID. So an object ID would be pretty much like a normal pointer in memory.

2. Use an 'object vector': the vector is indexed by object identifiers, and each vector element contains the disk position of the object.

3. Use a more complicated indexing scheme (btrees, for example).

In fact, I'm *still* considering these possibilities. At the moment, Rucksack uses the second scheme.
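To make the second scheme concrete, here's a sketch of an object table as a file of fixed-width entries, where entry N holds the heap position of the object with id N. All names are invented for this example, and the real table also reserves a few bits per entry for the garbage collector (more about that later).

CODE

(defconstant +object-table-entry-size+ 8) ; bytes per entry

;; Both functions assume TABLE-STREAM is a file stream with
;; element-type (unsigned-byte 8), opened for both input and output.

(defun read-object-position (table-stream object-id)
  "Return the heap position of the object with id OBJECT-ID."
  (file-position table-stream (* object-id +object-table-entry-size+))
  (loop with position = 0
        repeat +object-table-entry-size+
        do (setf position (+ (* position 256) (read-byte table-stream)))
        finally (return position)))

(defun write-object-position (table-stream object-id position)
  "Record that OBJECT-ID's latest version starts at POSITION in the heap."
  (file-position table-stream (* object-id +object-table-entry-size+))
  (loop for shift from (* 8 (1- +object-table-entry-size+)) downto 0 by 8
        do (write-byte (ldb (byte 8 shift) position) table-stream)))

END

Updating an object then means serializing the new version into freshly allocated heap space and overwriting the object's single table entry; everybody who refers to the object by its id automatically sees the new version.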
Each rucksack directory has a file called 'heap', which contains serialized versions of all objects, and a file called 'objects', which contains a straightforward mapping from object IDs to object positions on disk.

* CACHE

So the object table and the serializer give us a way to reconstruct an object, given its ID. But reconstructing an object from its ID is an expensive operation. We may want to cache this. Keeping a cache of recently reconstructed objects may speed up our programs a lot.

* GETTING AN OBJECT FROM THE CACHE

Here's a code fragment from an old version of Rucksack that shows the basic working of the cache:

CODE FRAGMENT

(defmethod cache-get-object (object-id (cache standard-cache))
  (let ((result (or (gethash object-id (objects cache))
                    (gethash object-id (dirty-objects cache))
                    (let ((object (load-cached-object object-id cache)))
                      ;; Add to in-memory cache.
                      (when (cache-full-p cache)
                        (make-room-in-cache cache))
                      (setf (gethash object-id (objects cache)) object)
                      object))))
    ;; Put it (back) in front of the queue.
    (add-to-queue object-id cache)
    result))

END

So we keep a hash table of objects that have been loaded from disk and a separate hash table of objects that have been changed since they were loaded. If we can't find an object in one of those hash tables, we load it from disk.

* LOWER-LEVEL CACHES

I've never written a relational database or even looked at the code for one, but I've tried to read some of the database literature. From that literature I get the impression that relational databases usually do their caching on a much lower level than what I just described. For example, here's a quote from "Database Systems: The Complete Book" by García-Molina, Ullman & Widom:

  A DBMS will manage disk blocks itself, rather than relying on the
  operating system's file manager to move blocks between main and
  secondary memory.

I haven't taken that route for Rucksack, and I've decided to ignore issues like the size of disk blocks entirely, mostly because:

- I hope that the operating system will take care of this;
- I want to keep Rucksack as simple as possible;
- I want Rucksack to be written in portable Common Lisp, which doesn't know the concept of disk blocks.

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

4. GARBAGE COLLECTION

* DELETE-INSTANCE: YES OR NO?

One question that persistence libraries need to answer is: do we provide a way to delete a persistent object, so that it won't be 'loadable' anymore and its disk space can be reused?

This question is very similar to the old manual versus automatic memory management question. Of course, Lispers already knew the answer to that question before I was born: manual memory management is a sin. And it isn't even a pleasant one.

Somehow the question looks different when we're dealing with persistence. But as soon as you introduce a function that takes an object ID and removes the corresponding object from the persistent store in such a way that it can't be loaded again, you basically introduce 'dangling pointers' again. Welcome back to 'pointer hell'!

So... While I'm uncertain about many of the decisions I've made for Rucksack, I'm pretty sure about this one: there will be no DELETE-INSTANCE. Every object id that's visible to the programmer will always refer to a valid persistent object, so there will be no 'dangling pointers'. When necessary, a garbage collector will take care of reclaiming unused disk space.
To ensure that an object will never be visible again, you can remove it from all objects that contain a reference to it. This is exactly how we normally deal with non-persistent objects in Lisp.

(I believe Martin Cracauer may tell us that garbage collection is a decadent luxury and that real programmers don't need it. And he would probably be right too, *if* you *absolutely* need to squeeze maximum performance out of your system. But Rucksack is designed with slightly more modest performance goals, and slightly less modest usability goals.)

* THE GARBAGE COLLECTOR

I don't have time to discuss all details of the memory manager and the garbage collector, so I'll briefly describe the basics. I've considered both copying and mark-and-sweep collectors.

One advantage of a copying collector is that it automatically defragments the heap, and that allocating space for a new object can be trivial: you just serialize it to the end of the heap file. The biggest problem I see with a copying collector is that it becomes inefficient if you have many big objects that don't contain references to other objects (for example the contents of binary files that we want to keep in the database): these objects will need to be copied during each collection.

An application I wrote a while ago is a simple publication archive that fits this model: not much 'structured data', but quite a few binary files, with sizes ranging from a few hundred KB to about 100 MB. When I was writing my copying collector, I began to imagine my collector spending 99% of its time just copying these files. Now I'm usually not the kind of programmer who tries to squeeze the last bit of performance out of his programs, but this looked painful even to me.

So I decided to try my luck with a mark-and-sweep collector. Basically it works like this:

1. BEGIN

At the beginning of a garbage collection cycle, all objects are marked 'dead'. If you use the object vector that I described earlier, this is very easy to do. Each element of the object vector reserves a few bits for the garbage collector. One of these bits is the so-called mark bit. So we just loop across the vector, clearing the mark bit.

2. MARK

Now all live objects are traced, beginning with the so-called root set. Each object that can be reached from the root set is marked 'alive' again. All other objects will remain marked 'dead', which is exactly what we need.

3. SWEEP

Now the entire heap is swept, object after object, from left to right. The disk space of each object that is marked 'dead' is returned to a free list.

* MAKING IT INCREMENTAL

This works well and is not too difficult to implement. But for big databases it can take a while (minutes, maybe even hours) to run. If you're writing web servers or other programs that continually need to be able to handle user requests, this is unacceptable. So I made it incremental.

To make a garbage collector incremental, you need to do two things:

1. The first thing is that the collector must be able to do its work in many small steps. (The most obvious moment to run such a small step is just after some disk space has been allocated for a new object.) Each small step must know where it should continue the work of the previous small step. In other words, all the state that's normally kept in local variables on the stack must now be registered in slots of the garbage collector object.
CODE FRAGMENT

(defclass mark-and-sweep-heap (garbage-collector free-list-heap serializer)
  ((state :initform :ready
          :type (member :starting
                        :marking-object-table
                        :scanning
                        :sweeping-heap
                        :sweeping-object-table
                        :finishing
                        :ready)
          :accessor state)
   ;; Some counters that keep track of the amount of work done by
   ;; the garbage collector.
   (nr-object-bytes-marked :initform 0 :accessor nr-object-bytes-marked)
   (nr-heap-bytes-scanned :initform 0 :accessor nr-heap-bytes-scanned)
   (nr-heap-bytes-sweeped :initform 0 :accessor nr-heap-bytes-sweeped)
   (nr-object-bytes-sweeped :initform 0 :accessor nr-object-bytes-sweeped)))

END

As you can see in the code fragment, there isn't really that much state to keep track of.

2. The second thing is that the garbage collector needs to know how much work it should do at each step. An effective way is to divide the size of the free heap space by the amount of heap that still needs to be garbage collected, and to try to keep that ratio more or less constant: whenever an allocation shrinks the free space by some fraction, the collector does the same fraction of the remaining collection work. In formula form you get something like:

CODE FRAGMENT

  New / Free = DoNow / ToDo

so:

  DoNow = (New / Free) * ToDo

  New   : number of new bytes allocated at this step
  Free  : number of free bytes in the heap
  DoNow : number of bytes to collect at this step
  ToDo  : number of bytes still to collect during this garbage collection

END

Fortunately, you don't need a degree in mathematics to understand this formula.
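In code, that computation might look something like the following sketch. The names GC-WORK-FOR-ALLOCATION, NR-FREE-BYTES and NR-BYTES-TO-COLLECT are invented for this example; the real collector does something very similar after each allocation.

CODE FRAGMENT

(defun gc-work-for-allocation (heap nr-allocated-bytes)
  "Return the number of bytes the collector should process now, given
that NR-ALLOCATED-BYTES have just been allocated."
  (let ((free (nr-free-bytes heap))           ; 'Free' in the formula
        (to-do (nr-bytes-to-collect heap)))   ; 'ToDo' in the formula
    (if (zerop free)
        ;; The heap is full: finish the whole collection right away.
        to-do
        ;; Otherwise: DoNow = (New / Free) * ToDo, rounded up.
        (ceiling (* to-do nr-allocated-bytes) free))))

END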
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

5. TRANSACTIONS

* ATOMICITY, ISOLATION, DURABILITY

Changes to data often need to occur in so-called 'atomic' groups. This means that either *all* of the changes should be executed, or none of them should be executed. Such an all-or-nothing group of changes is usually called a transaction. When an error occurs in the middle of a transaction, all changes that were executed by the transaction must be undone: this is called a 'rollback'. When the transaction has finished without any errors, all changes must be saved to disk: this is called a 'commit'.

Apart from atomicity ('all-or-nothingness') and durability (saving all changes), another important requirement for transactions is that they may not interfere with each other. As long as a transaction hasn't committed its changes, it must run as if it were the only transaction using the database. In other words, it must appear to be 'isolated' from all other transactions.

Bank accounts are a popular example when discussing transactions. Let's look at the following function to transfer money from one account to another.

PICTURE

(defun transfer-money (amount account-a account-b)
  "Transfer AMOUNT from ACCOUNT-A to ACCOUNT-B."
  (with-transaction ()
    ;; Check that there's enough money in the account.
    (unless (plusp (- (slot-value account-a 'balance) amount))
      (error "Not enough money."))
    ;; Subtract the amount from account A.
    (decf (slot-value account-a 'balance) amount)
    ;; Add the same amount to account B.
    (incf (slot-value account-b 'balance) amount)))

END

Now suppose two transactions T1 and T2 are trying to transfer 100 EURO from an account that contains 150 EURO. If the bank is not careful, we could get the following sequence of events:

PICTURE

1. T1: check that there's enough money -> OK
2. T2: check that there's enough money -> OK
3. T1: subtract 100 EURO from the account -> now it contains 50 EURO.
4. T2: subtract 100 EURO from the account -> OOPS.

END

This is one kind of error that can happen when two transactions are not isolated from each other.

* POSSIBLE SOLUTIONS

To guarantee that transactions run in isolation, we could:

- Let the application programmer worry about it and write locking code where necessary. I haven't chosen this solution. Keeping transactions isolated is a difficult problem where it's easy to make mistakes that are hard to detect.

- Use some kind of automatic locking. The best-known automatic locking strategy is called 'two-phase locking'. This basically means that a transaction acquires a lock for each object that it wants to change, and then doesn't release any of those locks until the entire transaction has finished. I haven't chosen this solution either. One potential problem with two-phase locking is that long-running transactions (for example a transaction that looks at all bank accounts to make a monthly report) can easily lock up all other transactions. Now there may be solutions for this too, but I felt these were getting too complicated.

* RUCKSACK'S SOLUTION

So what does Rucksack do? It uses so-called 'optimistic concurrency control' combined with 'multiple object versions'.

'Optimistic concurrency control' means that Rucksack doesn't use locks to keep its transactions isolated from each other. Instead it just rolls back a transaction when it is trying to change data that it shouldn't change, basically telling it to 'try again later'. Rolling back and retrying a transaction can be rather expensive; that's why this strategy is called 'optimistic': it assumes that this kind of transaction conflict happens rarely.

Rucksack tries to make transaction conflicts rarer by using multiple object versions. With multiple object versions, each transaction that modifies an object gets its own copy of that object. This means that an older transaction can stay in its own consistent little world and happily keep reading older versions of objects that are already being changed by younger transactions.

So one transaction can have its cake while another transaction is eating it ;-) That doesn't mean that both transactions can be eating the cake at the same time, of course. In that case, Rucksack will abort the second transaction that tries to eat the cake and give it an opportunity to retry later. If there's any cake left, of course.

* SAFETY NET

The fact that Rucksack uses optimistic concurrency control does not mean that you can't use manual or automatic locking on a higher level. It would be possible to use manual locking and treat Rucksack's transaction conflict detection mechanism as a sort of safety net, for example.

* DETECTING CONFLICTS

So how are transaction conflicts detected? Each transaction has a unique ID. For each version of each object, we register the ID of the transaction that created/modified the object. The ID also functions as a relative timestamp: transaction A is older than transaction B if its ID is less than the ID of transaction B. Each object creation/modification always happens in the context of a transaction.

A transaction conflict occurs when an old transaction tries to modify an object that was already modified by a younger transaction. (Actually, there only needs to be a serialization conflict when an old transaction tries to modify a *slot* that was modified by a younger transaction. But Rucksack detects conflicts at the object level, not the slot level.)
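In sketch form, the conflict check could look something like this. The names CHECK-FOR-CONFLICT and TRANSACTION-CONFLICT are assumptions for this example; (TRANSACTION-ID object) returns the id of the transaction that created this version of the object, just as in the cache code we'll see in a moment.

CODE FRAGMENT

(define-condition transaction-conflict (error)
  ((object-id :initarg :object-id)
   (transaction-id :initarg :transaction-id)))

(defun check-for-conflict (object transaction)
  "Signal an error when TRANSACTION wants to modify OBJECT, but a
younger transaction has already modified it."
  (when (< (transaction-id transaction)
           ;; The id of the youngest transaction that modified
           ;; this object.
           (transaction-id object))
    (error 'transaction-conflict
           :object-id (object-id object)
           :transaction-id (transaction-id transaction))))

END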
* TRANSACTIONS AND THE CACHE

Before we had transactions, the cache was relatively simple: it kept a set of 'clean' objects (objects that had been read from disk but had not been changed) and a set of 'dirty' objects (objects that had been modified and needed to be written back to disk). With multiversion transactions, this design needs to change. Here's a simplified version of Rucksack's class definition for CACHE:

PICTURE

(defclass standard-cache (cache)
  ((heap :initarg :heap :reader heap)
   (schema-table :initarg :schema-table :reader schema-table)
   (commit-file :reader commit-file)
   ;; Clean objects
   (objects :initarg :objects :reader objects)
   (highest-transaction-id :initarg :highest-transaction-id
                           :initform 0
                           :accessor highest-transaction-id)
   (transactions :initform (make-hash-table)
                 :reader transactions
                 :documentation "A mapping from transaction ids to
transactions.  Contains only open transactions, i.e. transactions that
haven't been rolled back or committed.")))

END

And here's the definition of TRANSACTION:

PICTURE

(defclass standard-transaction (transaction)
  ((id :initarg :id :reader transaction-id)
   (dirty-objects :initform (make-hash-table)
                  :reader dirty-objects
                  :documentation "A hash-table (from id to object)
containing all objects of which the slot changes have not been
written to file yet.")))

END

* GETTING OBJECTS FROM THE CACHE

So what happens when an object must be retrieved from the cache? With multiple object versions, a transaction is only allowed to see the version that it has modified itself. Or, if it hasn't modified the object, the version that was modified by the youngest transaction that's older than itself. For example, if the current transaction is #3 and the object has been modified by transactions #1 and #5, then the 'compatible' object version is the one that was modified by #1.

CODE FRAGMENT

(defmethod cache-get-object (object-id (cache standard-cache))
  (let ((transaction (current-transaction)))
    (or
     ;; Unmodified, already loaded and compatible with the
     ;; current transaction?  Fine, let's use it.
     (let ((object (gethash object-id (objects cache))))
       (and object
            (<= (transaction-id object) (transaction-id transaction))
            object))
     ;; Modified by an open transaction?  Try to find the
     ;; 'compatible' version.
     (find-object-version object-id transaction cache)
     ;; Not in memory at all?  Then load the compatible version
     ;; from disk.
     (multiple-value-bind (object most-recent-p)
         (load-object object-id transaction cache)
       (when most-recent-p
         ;; Add to in-memory cache if the loaded object is
         ;; the most recent version of the object.
         (when (cache-full-p cache)
           (make-room-in-cache cache))
         (setf (gethash object-id (objects cache)) object))
       object))))

END

* ROLLING BACK A TRANSACTION

When a transaction is rolled back, all side effects of the transaction must be undone. There must be no noticeable difference between a rolled-back transaction and a transaction that hasn't even started yet.

Rolling back is only possible for a transaction that hasn't been committed yet (and is not currently being committed). This makes rolling back easy: the transaction can basically just clear its dirty objects table to ensure that no changes to those objects will be written to disk.
So here's some more code:

PICTURE

(defmethod transaction-rollback-1 ((transaction standard-transaction)
                                   (cache standard-cache)
                                   (rucksack standard-rucksack))
  (clrhash (dirty-objects transaction))
  (queue-clear (dirty-queue transaction))
  (close-transaction cache transaction))

END

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

6. CRASHES AND TRANSACTION RECOVERY

* RECOVERY

The main problem for recovery is that a transaction commit may fail halfway. This would result in an inconsistent state. To preserve consistency, we need to undo the effects of the partially committed transaction. As usual, there are different ways of doing this. And, as usual, I've tried to find a strategy that's relatively simple to implement, integrates well with the rest of Rucksack and is a bit different from the most popular way of doing this.

Rucksack's recovery strategy is based upon two ideas:

1. The most important idea is that we can use the fact that Rucksack never actually overwrites objects but always creates new versions.

2. The second idea is that Rucksack saves a summary of what a transaction commit is going to do to a separate file, a so-called 'commit file'. After saving the summary of what it's going to do, it actually *does* it and then deletes the summary again.

If a transaction commit fails halfway, the recovery procedure can undo the partial effects of the failed transaction by using the information in the commit file.

* COMMIT FILE

The following code shows how commit files are created.

CODE

(defun create-commit-file (transaction cache)
  "Write object ids of all dirty objects to the commit file, so
recovery can do its job if this transaction never completes."
  (with-open-file (stream (commit-filename cache)
                          :direction :output
                          :if-exists :supersede
                          :if-does-not-exist :create)
    (serialize (transaction-id transaction) stream)
    (serialize (hash-table-count (dirty-objects transaction)) stream)
    (loop for object-id being the hash-key of (dirty-objects transaction)
          do (serialize object-id stream))))

END

So the commit file contains the transaction id of the transaction that's going to be committed *and* it contains the object ids of all objects that are going to be committed. Once the commit file is created successfully, the actual transaction commit will save all dirty objects to disk and then remove the commit file.

* RECOVERING

Whenever Rucksack is started, it checks to see if there's a commit file left. If there is, it knows that it is being started after a crash. It will then investigate all object ids that were saved in the commit file. For each object that contains a version created by the unfinished transaction, it will remove that version from the object version list. After that, it can resume normal operation.
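In outline, the recovery procedure might look like this. It's only a sketch: RECOVER and DELETE-OBJECT-VERSION are invented names, the latter standing for whatever removes one version from an object's version list.

CODE FRAGMENT

(defun recover (cache)
  "Undo the partial effects of a commit that never completed."
  (let ((filename (commit-filename cache)))
    (when (probe-file filename)
      ;; A commit file still exists, so the last commit didn't finish.
      (with-open-file (stream filename :direction :input)
        (let ((transaction-id (deserialize stream))
              (count (deserialize stream)))
          (loop repeat count
                do (let ((object-id (deserialize stream)))
                     ;; Remove the version (if any) that the
                     ;; unfinished transaction created.
                     (delete-object-version object-id transaction-id
                                            cache)))))
      ;; Everything is consistent again: get rid of the commit file.
      (delete-file filename))))

END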
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

7. INDEXING OBJECTS: BTREES, SLOT/CLASS INDEXES, MOP MAGIC

* INTRO

So... We know how to save persistent objects and load them back, we can cache the load operation for better performance, we can reclaim unused disk space when necessary, we can use transactions to keep changes isolated from changes in other processes, and we even stand a fair chance of recovering from failure. What more could we need?

Ah yes... Sometimes we want to find some objects we're interested in without traversing the whole database. In other words, we need a persistent indexing mechanism.

Unlike many other database or persistence libraries, Rucksack builds indexes on top of persistent objects, not the other way round. Maybe this costs us a bit in performance, because we can't use any low-level tricks for the indexing. But in return we get a flexible and easily extensible indexing mechanism. Here's how it works:

* MORE MOP MAGIC

We already needed some MOP magic to hook into SLOT-VALUE-USING-CLASS and INITIALIZE-INSTANCE for persistent objects. Now we need some more magic to add our own slot and class options for specifying indexes. This means we can specify something like the following:

PICTURE

(defclass person ()
  ((name :index (btree :key< string< :key= string=))
   (year-of-birth :index (btree :key< < :key= =))
   (age :persistence nil))
  (:indexed t)
  (:metaclass persistent-class))

END

You see an :INDEX slot option being used for the NAME and YEAR-OF-BIRTH slots, a :PERSISTENCE slot option for the AGE slot and an :INDEXED class option for the whole PERSON class.

Let's look at the class option first: by specifying :INDEXED T, we specify that the object IDs of all instances of this class (or one of its subclasses) will be added to an index. Then the generic function RUCKSACK-MAP-INSTANCES can be used to iterate over all instances of this class (or one of its subclasses, unless those subclasses have an :INDEXED NIL class option).

For slot options the situation is similar, except that you can use so-called index specs to specify explicitly what kind of indexing should be used. An index spec has a simple structure: it is either a symbol or a list. If it's a symbol, it's the name of an index class. If it's a list, the car of the list is the name of an index class, and the cdr contains a plist of initargs for the index. The index class can be an arbitrary class (including classes that you define yourself), as long as it's persistent and follows a simple indexing protocol. At the moment, Rucksack has only one index class: a no-nonsense, catch-as-catch-can implementation of btrees, written on top of persistent conses and persistent vectors.

* INDEXES AND GARBAGE COLLECTION

All class and slot indexes are automatically added to Rucksack's set of root objects, so indexed objects won't be removed by the garbage collector. All non-indexed objects are NOT part of the root set, so they will be garbage collected when they can't be reached from any of the indexes.

* QUERIES (if time left)

I think it should be relatively easy to build a simple query language on top of RUCKSACK-MAP-INSTANCES and RUCKSACK-MAP-SLOT-VALUES. This should make it possible to generate relatively efficient code for queries like:

PICTURE

(select 'event
        (lambda (event) ...)
        :where '(and (string= name "ECLM") (= year 2006))
        :order-by 'name)

END

I think there's quite a bit of literature on query optimization, and Lisp's fantastic code transformation and run-time compilation features may add some interesting new possibilities to the standard repertoire. But that's still open territory. At least for me ;-)
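Even without a query language, a naive SELECT is just a filtering map over the class index. Here's a sketch; I'm assuming RUCKSACK-MAP-INSTANCES takes the rucksack, a class name and a function, and a real query compiler would of course use the slot indexes instead of visiting every instance.

CODE FRAGMENT

(defun naive-select (class predicate)
  "Collect all indexed instances of CLASS that satisfy PREDICATE."
  (let ((result '()))
    (rucksack-map-instances (current-rucksack) class
                            (lambda (instance)
                              (when (funcall predicate instance)
                                (push instance result))))
    result))

;; The ECLM query above might then boil down to something like:
;;
;;   (naive-select 'event
;;                 (lambda (event)
;;                   (and (string= (slot-value event 'name) "ECLM")
;;                        (= (slot-value event 'year) 2006))))

END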
* MULTI-DIMENSIONAL INDEXES (if time left)

Another interesting Rucksack extension that I haven't written yet would be to define multi-dimensional indexes (i.e. indexes that look at the value of more than one slot). For example:

PICTURE

(defclass point ()
  ((x :type number)
   (y :type number))
  (;; Don't use standard indexing (by object id), because we're not
   ;; interested in EQL-ness for points.
   :indexed nil
   ;; Index points by their coordinates instead.
   :index ((x y) (2d-index :key< (< <) :key= (= =))))
  (:metaclass persistent-class))

END

I'm just making something up here; maybe another syntax would be needed for multi-dimensional indexing. My main point is that it should be relatively easy to create such extensions yourself and integrate them with the rest of Rucksack, without having to resort to low-level hacks.

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

8. SCHEMAS

* INTRO

Suppose we have some persistent objects that were created with the following class definition:

PICTURE

(defclass person ()
  ((name :initarg :name)
   (age :initarg :age))
  (:metaclass persistent-class))

END

Now we decide to change the class definition of PERSON to:

PICTURE

(defclass person ()
  ((name :initarg :name)
   (birth-date :initarg :birth-date))
  (:metaclass persistent-class))

END

* THE PROBLEM

What should happen when we try to load one of the old PERSON instances? We can't rely on UPDATE-INSTANCE-FOR-REDEFINED-CLASS, because as far as our Lisp implementation is concerned, the disk version of our persistent PERSON isn't an instance of any class at all: it's just some bytes in a file. And when we load the object from disk, we can't even call ALLOCATE-INSTANCE for the obsolete PERSON class, because we have no portable way of finding that obsolete class.

* VERSIONS OF CLASS DEFINITIONS

Rucksack's solution is to keep track of the schema of each persistent object. Each schema corresponds to *a version of* a class definition. The most important elements of a schema are:

PICTURE
- schema ID
- class name
- list of effective persistent slots
- version number
END

All schemas are kept in a schema table. They're indexed both by ID and by class name.

Whenever Rucksack saves a persistent object, it saves the schema ID that corresponds to the object's class. When a persistent object must be loaded from disk, Rucksack loads the schema ID and finds the corresponding schema. If the schema has an old version number, Rucksack calls a generic function after calling ALLOCATE-INSTANCE for the current class version. That generic function is very similar to UPDATE-INSTANCE-FOR-REDEFINED-CLASS: it takes a list of added slots, a list of discarded slots and a property list containing the slot names and values for slots that were discarded and had values. The default method for this function ignores the discarded slots, initializes added slots according to their initargs or initforms, and initializes shared slots (that did not change) with the values that were saved on disk.

* CREATING NEW SCHEMAS

When a class definition changes (e.g. because of DEFCLASS), this must be detected by Rucksack. (This can be done with the MOP by hooking into (RE)INITIALIZE-INSTANCE for persistent classes; we're doing that anyway to keep the class and slot indexes up to date.) Rucksack analyzes the new class definition; if it's different from the previous version, a new schema is added to the database. From that moment on, instances of the redefined class will be saved with the new schema number.
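To round off the PERSON example: a method on such a generic function might look roughly like this. The name UPDATE-PERSISTENT-INSTANCE-FOR-REDEFINED-CLASS and its exact lambda list are my guesses based on the description above (modeled on UPDATE-INSTANCE-FOR-REDEFINED-CLASS), and BIRTH-DATE is simplified to a plain year.

CODE

(defun current-year ()
  ;; GET-DECODED-TIME returns the year as its sixth value.
  (nth-value 5 (get-decoded-time)))

(defmethod update-persistent-instance-for-redefined-class
    ((person person) added-slots discarded-slots property-list &key)
  ;; Let the default method initialize the added and shared slots.
  (call-next-method)
  ;; Then compute a (very rough) BIRTH-DATE from the discarded AGE
  ;; value, if the old version of the object had one.
  (let ((age (getf property-list 'age)))
    (when age
      (setf (slot-value person 'birth-date)
            (- (current-year) age)))))

END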