Progress on BlitzDB’s Index Component
I recently gained some decent momentum on developing the indexing component of BlitzDB. Most of my time spent on BlitzDB for the last couple of weeks have been studying the indexing API and digging into how other engines have implemented it. I even referred back to MySQL 4.x to see how the BDB engine pulls off the Indexing API.
The actual coding wasn’t too bad thanks to Tokyo Cabinet’s awesome B+Tree API. I’ve been busier adding new tests and fixing silly bugs as they arise. I also implemented the Primary Key optimization that I blogged about a while back. As a result of all this, the following goodness has been added to BlitzDB’s Trunk.
- Index Lookup
- Forward Index Scan
- Reverse Index Scan
This means that BlitzDB is now equipped with both a Table Scanner and an Index Scanner which are two essential components for a general purpose storage engine. As much as I’d like to work on optimizing the code and adding features (like recovery), I’m going to take a break and spend the rest of the month working on testing and debugging. There’s no point in adding features if the base has notable flaws in it.
Challenges Encountered
Writing the Index Scanner itself is easy. The most difficult thing that slowed me down was developing the comparison function for index keys. The end result was a simple piece of code but I had to study various things before I could start writing any code.
- How to respect collation
- How keys are represented internally
- How types are represented internally
- How to write a custom comparison function for Tokyo Cabinet
- … and so on
I’ve also started using Evernote to jot down my spontaneous ideas on optimizing BlitzDB. I’ve made these notes public and they will most likely be updated while I’m commuting on the train.
There are much more that I’d like to write about like how I intend on developing the table recovery routine without simply using TC’s recovery mechanism but I shall restrain myself for another day.
