I’m a little behind in announcing this, but I’m going to be speaking at O’Reilly’s MySQL Conference this year. My presentation is a three-hour tutorial titled “Drizzle Storage Engine Development. Practical Example with BlitzDB”. Three hours is a long time, but I assure you that there will be a break.
This session isn’t solely about walking through Drizzle’s Storage Engine API. It will also cover performance topics such as B+Tree structure, memory handling and concurrency control. I will go through BlitzDB’s design concept and its internals as well, so needless to say I’ll talk a lot about Tokyo Cabinet and its internals too.
Hopefully those who come along will walk out of the tutorial standing far ahead of the start line. It should help you get started on reading the implementation of other storage engines in the MySQL ecosystem (MyISAM, InnoDB, PBXT, Federated and so forth). Better yet, you will start writing one.
Looking forward to seeing you there :)
Toru Maesaka drizzle, event, oss, travel conference, drizzle, mysql, travel
I recently gained some decent momentum on developing the indexing component of BlitzDB. Most of the time I’ve spent on BlitzDB over the last couple of weeks has gone into studying the indexing API and digging into how other engines have implemented it. I even went back to MySQL 4.x to see how the BDB engine pulls off the indexing API.
The actual coding wasn’t too bad thanks to Tokyo Cabinet’s awesome B+Tree API. I’ve been busier adding new tests and fixing silly bugs as they arise. I also implemented the Primary Key optimization that I blogged about a while back. As a result of all this, the following goodness has been added to BlitzDB’s trunk:
- Index Lookup
- Forward Index Scan
- Reverse Index Scan
This means that BlitzDB is now equipped with both a Table Scanner and an Index Scanner, which are two essential components of a general-purpose storage engine. As much as I’d like to work on optimizing the code and adding features (like recovery), I’m going to take a break and spend the rest of the month on testing and debugging. There’s no point in adding features if the base has notable flaws in it.
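To make the three index operations listed above concrete, here is a minimal sketch of what each one does against an ordered index. This is not BlitzDB’s code: `std::multimap` is only a stand-in for Tokyo Cabinet’s B+Tree, and the names `Index`, `index_lookup`, `forward_scan` and `reverse_scan` are hypothetical.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for an index: an ordered map from index key to row ID.
// A real engine would sit on a B+Tree; std::multimap only mimics the ordering.
using Index = std::multimap<std::string, std::string>;

// Index lookup: collect the row IDs stored under exactly this key.
std::vector<std::string> index_lookup(const Index &idx, const std::string &key) {
  std::vector<std::string> rows;
  auto range = idx.equal_range(key);
  for (auto it = range.first; it != range.second; ++it)
    rows.push_back(it->second);
  return rows;
}

// Forward index scan: start at the first key >= start and walk in
// ascending key order until the end of the index.
std::vector<std::string> forward_scan(const Index &idx, const std::string &start) {
  std::vector<std::string> rows;
  for (auto it = idx.lower_bound(start); it != idx.end(); ++it)
    rows.push_back(it->second);
  return rows;
}

// Reverse index scan: start at the last key <= start and walk in
// descending key order until the beginning of the index.
std::vector<std::string> reverse_scan(const Index &idx, const std::string &start) {
  std::vector<std::string> rows;
  // upper_bound() is one past the last key <= start; wrapping it in a
  // reverse iterator makes the first dereference land on that last key.
  Index::const_reverse_iterator it(idx.upper_bound(start));
  for (; it != idx.rend(); ++it)
    rows.push_back(it->second);
  return rows;
}
```

The point of the sketch is that all three operations are the same positioning step (find a key in sorted order) followed by zero or more steps in one direction, which is why an ordered structure such as a B+Tree suits them.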
Challenges Encountered
Writing the Index Scanner itself is easy. What slowed me down the most was developing the comparison function for index keys. The end result was a simple piece of code, but I had to study various things before I could start writing it:
- How to respect collation
- How keys are represented internally
- How types are represented internally
- How to write a custom comparison function for Tokyo Cabinet
- … and so on
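To illustrate why the items above matter, here is a rough sketch of two key comparison functions. They follow the callback shape that Tokyo Cabinet’s B+Tree API accepts through `tcbdbsetcmpfunc` (two key pointers with their sizes, plus an opaque pointer), but everything else is an assumption for illustration: the function names are hypothetical, the ASCII case folding is only a crude stand-in for real collation, and the integer comparator assumes each key is exactly one native 32-bit signed integer. The registration call is left out so that the sketch compiles without Tokyo Cabinet installed.

```cpp
#include <cctype>
#include <cstdint>
#include <cstring>

// Both functions return a negative value if a < b, zero if a == b and a
// positive value if a > b, which is what an ordered index expects.

// Hypothetical comparator for string keys. It folds ASCII case so that
// "Apple" and "apple" compare equal; real collations are far richer.
int compare_ci_keys(const char *aptr, int asiz,
                    const char *bptr, int bsiz, void *op) {
  (void)op;  // opaque user pointer, unused in this sketch
  int n = asiz < bsiz ? asiz : bsiz;
  for (int i = 0; i < n; i++) {
    int a = std::tolower(static_cast<unsigned char>(aptr[i]));
    int b = std::tolower(static_cast<unsigned char>(bptr[i]));
    if (a != b) return a < b ? -1 : 1;
  }
  // The shared prefix is equal, so the shorter key sorts first.
  return asiz - bsiz;
}

// Hypothetical comparator for integer keys. Assumes both keys hold exactly
// one native int32_t; a plain byte-wise comparison would misorder negative
// values, so the bytes are decoded before comparing.
int compare_int32_keys(const char *aptr, int asiz,
                       const char *bptr, int bsiz, void *op) {
  (void)asiz; (void)bsiz; (void)op;
  std::int32_t a, b;
  std::memcpy(&a, aptr, sizeof(a));
  std::memcpy(&b, bptr, sizeof(b));
  return (a > b) - (a < b);
}
```

The integer case shows why the internal representation of types has to be understood first: the bytes of -1 are all 0xFF, so a byte-wise comparison would sort it after 1.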
I’ve also started using Evernote to jot down my spontaneous ideas on optimizing BlitzDB. I’ve made these notes public and they will most likely be updated while I’m commuting on the train.
There is much more that I’d like to write about, such as how I intend to develop the table recovery routine without simply relying on TC’s recovery mechanism, but I shall save that for another day.
Toru Maesaka drizzle, oss blitzdb, drizzle, index
The concept of web services interoperating to broaden the ecosystem is a beautiful thing. I agree with this concept, but lately I’ve found myself frustrated with a certain subset of it. Where does my frustration come from? It comes from cross-posts of microblogging messages. To be more specific, it comes from seeing a significant number of Twitter updates in my Facebook news feed.
Living in Tokyo, I usually check social updates and RSS feeds on the train. It often begins with checking unread tweets, then firing up whatever I feel like checking next (usually Facebook, Google Reader or Mixi). What disappoints me here is having to wade through tweets on Facebook that I’ve already read on Twitter. Tokyo is a busy place, so it’s important to gain information efficiently.
On Facebook I have various types of connections, from childhood friends to acquaintances. Some of them are on Facebook and not on Twitter (and vice versa). I guess this has to do with user demographics, but the important thing here is that I’m gradually finding it hard to pick up content from my friends outside of IT. I say IT because, judging from my news feed, it’s mostly my friends in the IT industry who have set up cross-posting.
There is probably some sort of content balancing gimmick in the news feed code, but it doesn’t seem to work well against my friends who are heavy Japanese Twitter users. It’s a shame because my cross-posting friends are completely innocent and aren’t deliberately trying to cause noise. The last thing I want to do is defriend people just because they are innocently causing noise.
Further Thoughts
This is what user-specified content filters are for! If there were a ‘block updates from xxx service’ option, I would use it. Or perhaps I’m missing something and there is already a way to filter out content from certain web services in the news feed.
If there is such an option, I would love to be enlightened!
Toru Maesaka random, webservice interoperability, usability, web