There is a mutual understanding in the Drizzle community that the MySQL query cache works well for a small database but isn’t sufficient for relatively large scale usages. Does your application involve a lot of database updates? if so, you’ll probably face fragmentation issues in the query cache (though using the query cache isn’t suitable for use cases like this).
Caching is the key ingredient in boosting the performance of any software that requires significant amount of computation, hence it is something that can’t be overlooked. So how can we improve Drizzle?
The idea is to create a pluggable query cache subsystem that can work in a large scale environment. Drizzle, being a micro-kernel DBMS, it makes sense to make the cache component pluggable and let the DBA choose the caching solution of their choice. This is exactly what I’m working on at the moment and my first plugin will allow Drizzle to use memcached as its query cache.
For example, a DBA could hook up their memcached pool to Drizzle and use several gigabytes of fast cache space to cache their results.
Things to consider
- Does the DBA really want to cache results?
- Does the result construction take long enough to care?
- Do we want to specify a specific SQL statement to always cache?
- Do we want to enforce a certain table to be cached?
- Transactional Engines
If we can satisfy the above points and achieve modularity, I think its a total win. For those that like diagrams, here is the architecture that is on my mind at the moment:

Benefits of using memcached
memcached is proven to work and help scale web applications in a cost effective fashion by various players in the web industry. It is also fast. The time complexity of fetching a cached result from memcached is O(1), which is an order we all love. Furthermore, by using memcached, the fragmentation issue disappears since this is a problem that the memcached community had to face in the past and successfully overcame by developing the slab subsystem.
Want to scale? with consistent hashing enabled, you can greatly reduce the number of cache misses from adding/removing a node from a live pool. Got spare boxes lying around? hook them up and powerup Drizzle! Need support? both memcached and Drizzle community members are heartwarming people.
Other Solutions work Too!
The beauty of modularity is that you can create and use your own solution for your unique requirements. For example lets assume that there is a webshop that wants to keep the number of physical servers down (e.g. limited monetary/space resource).
To satisfy the requirement stated above, you could cache to a fantastically fast hash database, such as Tokyo Cabinet (much, much faster than BDB). If you haven’t heard of it, you should look at the incredible benchmark comparison). So, what I really wanted to say is that the microkernel property of Drizzle will open up a lot of new possibilities for your application and help you tackle the new requirements that seem to come out of no where.
Where from here?
Currently going through the UDF -> Plugin Architecture conversion done by Mark, and planning on basing the code on his logging plugin while its fantastically simple. My work will be done in:
- lp:~tmaesaka/drizzle/pluggable-qcache
I’ll hopefully have something decent to show soon, and I will keep people updated on my blog, IRC and the Mailing List (drizzle-discuss).
So that is all I have to say for now… If you have any suggestions, please do enlighten me :)
Toru Maesaka drizzle, memcached, oss drizzle, memcached, performance