Archive for the ‘oss’ tag
Open Source Conference in Malaysia
So I’ve been bugging Colin Charles to invite me over to Malaysia for the last couple of months, and what does he offer me? An opportunity to speak at an open source conference in Malaysia.
FOSS.my is a two day event (9th & 10th next month) and as stated on their conference homepage, the aim of this conference is to cover technical aspects of various OSS projects without any business/sales intervention for those that follow open source technology in South East Asia (and other regions too of course). The cool thing is that after telling mixi about this event, they liked the idea so much they decided to sponsor the event immediately. I didn’t really expect this but hey, awesome.
At this conference, I will be giving two talks. One will go over how mixi uses various OSS technologies to power the largest social networking service in Japan. The other will cover how the memcached internals work, along with the latest hot topics in the development community, like the upcoming binary protocol.
I’ve never been to Malaysia before so I’m totally looking forward to this trip.
Perl, Binary and Memcached
The last few days I’ve been working on updating the binary protocol test in the latest memcached development branch to comply with the latest binary protocol specification. Prior to this update, the test client was sending an invalid request to the server, which as a consequence made the test hang and never finish.
In brief, this is the big difference:
Previously, the CAS (compare and swap) value was treated as part of the extra header that is appended/serialized behind the request/response header. In the latest specification, the CAS value is a required 8 byte field in the 24 byte request/response header (the header size in the previous version was 16 bytes). Other than that, the rest were minor differences in the packet format of extra fields in certain commands. Easy work.
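To make the header change concrete, here is a rough sketch of packing a 24 byte request header with the CAS value as a fixed 8 byte field at the end. It is written in Python rather than the Perl used by the actual test suite. The field order follows my reading of the binary protocol specification, and the names `HEADER_FORMAT` and `pack_request_header` are made up for illustration; they do not come from the memcached source.

```python
import struct

# Network byte order, no padding:
#   magic(1) opcode(1) key_len(2) extras_len(1) data_type(1)
#   reserved(2) total_body_len(4) opaque(4) cas(8)  = 24 bytes
HEADER_FORMAT = "!BBHBBHIIQ"
REQUEST_MAGIC = 0x80


def pack_request_header(opcode, key_len, extras_len, body_len, opaque=0, cas=0):
    """Pack a request header (illustrative helper, not memcached code)."""
    return struct.pack(
        HEADER_FORMAT,
        REQUEST_MAGIC,
        opcode,
        key_len,
        extras_len,
        0,         # data type: raw bytes
        0,         # reserved
        body_len,  # extras + key + value lengths combined
        opaque,
        cas,       # always present, even when a command does not use it
    )


# Example: header for a request with a 5 byte key and no extras or value.
header = pack_request_header(opcode=0x00, key_len=5, extras_len=0, body_len=5, cas=42)
```

Because the CAS field now lives in the header itself, a client that still sends the old 16 byte header leaves the server waiting for bytes that never arrive, which is consistent with the hang described above.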
Here is the actual diff:
http://github.com/tmaesaka/memcached/commit/67b4da9eb855ebe7695a197320232b8d25692f84
As you can see, the test suite currently used by memcached is Perl based. This was fortunate for me since Perl is my second favorite language after C. I also made the code style more “perl-like” by fixing the indents. Although heh, I can see a Perl programmer arguing that the use of if/else blocks in the test is not best practice.
You know, fixing the test was pretty meaningful to me since it forced me to study the binary protocol specification, which I knew almost nothing about at the hackathon in Santa Clara, CA back in April. Hopefully I can make productive suggestions at the upcoming hackathon in Menlo Park, CA in October.
Mac OS X, Ubuntu and Drizzle
So admittedly, Mac OS X is currently not the friendliest platform for working on Drizzle, mostly due to library issues.
OS X has several weird hacks in it due to licensing issues (libreadline comes to mind first). Sure, MacPorts, Darwin Ports and so on could get around this problem, but should this be necessary? Personally I dislike resorting to these solutions. Fortunately I’ve been doing all my Drizzle work with Ubuntu on a dedicated server, so I’ve yet to come across any build related issues. However, it kind of sucks not to be able to take my Mac out to a cafe on the weekend and work there without connectivity.
So to make my life happier, I installed Ubuntu on my MacBook Pro (alongside OS X of course).
I came across a few problems, like a corrupted partition table, in the process of getting Ubuntu working, but the following Ubuntu threads helped greatly:
General Instructions
Boot related problems when using Hardy Heron (Ubuntu 8.04)
You know, getting Ubuntu running on my Mac was entertaining since just yesterday I was talking to Monty Taylor about his thoughts on how using a Mac is selling out. So what does this make me now?
Happy Hacking
Drizzle, out in the open
So I’ve been fortunate enough to participate in developing Drizzle, which is a microkernel fork of MySQL that you can read more about in Brian Aker’s blog post.
In brief, we are getting rid of components that we find unnecessary in MySQL by default, and instead making them optional by refactoring the server to be modular, aka microkernel. In other words, we are trying to develop a lean, fast, simple and extensible RDBMS that would fit well in mid and large scale web applications.
How? Well, take Query Cache for example. QC works well in a one-man database but it has very little (if any) effect when we start thinking big, especially in the web industry. So why bother keeping it? What would be better is if we could _optionally_ make Drizzle use a cluster of memcached for query caching, which would also allow many database instances to share a common cache. The same can be said about many other components, such as ACL and Stored Procedures. This is exactly why we are moving to a microkernel architecture. If you want something special, you should be able to customize the server in a relatively easy fashion and satisfy your requirements, rather than having to refactor the server code yourself.
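To illustrate the idea of an optional, shared query cache, here is a toy sketch in Python. None of this is Drizzle’s actual plugin API: `QueryCache`, `NoCache`, `SharedCache` and `Server` are hypothetical names, a plain dict stands in for the memcached cluster, and cache invalidation on writes is ignored entirely.

```python
import hashlib
from typing import Callable, Dict, Optional, Protocol


class QueryCache(Protocol):
    """Hypothetical plugin interface; not Drizzle's real API."""

    def get(self, key: str) -> Optional[list]: ...
    def set(self, key: str, rows: list) -> None: ...


class NoCache:
    """Default behaviour: the core server ships without a query cache."""

    def get(self, key: str) -> Optional[list]:
        return None

    def set(self, key: str, rows: list) -> None:
        pass


class SharedCache:
    """Stand-in for a memcached cluster: one store shared by many servers."""

    def __init__(self, store: Dict[str, list]):
        self._store = store

    def get(self, key: str) -> Optional[list]:
        return self._store.get(key)

    def set(self, key: str, rows: list) -> None:
        self._store[key] = rows


class Server:
    """Toy database instance that consults whichever cache was plugged in."""

    def __init__(self, execute: Callable[[str], list],
                 cache: Optional[QueryCache] = None):
        self._execute = execute
        self._cache = cache if cache is not None else NoCache()

    def query(self, sql: str) -> list:
        key = hashlib.sha1(sql.encode("utf-8")).hexdigest()
        rows = self._cache.get(key)
        if rows is None:  # miss: run the query and remember the result
            rows = self._execute(sql)
            self._cache.set(key, rows)
        return rows
```

Two `Server` instances built over the same store would answer a repeated query from the common cache, while a `Server` built without a cache simply runs every query. The point is only that the core stays small and the caching policy is swapped in from outside.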
Indeed, not everyone needs a microkernel database. In fact, I assume most people won’t. However, there are enough web developers and companies in that small portion of the pie who would love a microkernel database to solve the problems they are facing today. This is exactly why we don’t consider Drizzle to be a MySQL replacement.
If you’d like more information, do check out our project page on Launchpad and browse through the mailing list archive. Drizzle development is done in a true open source fashion by using open resources and tools like Bazaar and Launchpad. This means that everyone is free to come up with improvement suggestions/patches and submit them to the Drizzle community.
Drizzle has been very fun and I thank Brian for getting me involved in such a fun project.
Btw, I wrote a blog post on Drizzle in Japanese on the mixi engineering blog too.