<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>

<channel>
	<title>Toru Maesaka</title>
	<atom:link href="http://torum.net/feed/" rel="self" type="application/rss+xml" />
	<link>http://torum.net</link>
	<description>Web addict and a hackaholic based in Tokyo</description>
	<pubDate>Fri, 14 Nov 2008 03:28:14 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.6.1</generator>
	<language>en</language>
			<item>
		<title>Kuala Lumpur Conference Trip</title>
		<link>http://torum.net/2008/11/kuala-lumpur-experience/</link>
		<comments>http://torum.net/2008/11/kuala-lumpur-experience/#comments</comments>
		<pubDate>Thu, 13 Nov 2008 13:29:39 +0000</pubDate>
		<dc:creator>tmaesaka</dc:creator>
		
		<category><![CDATA[event]]></category>

		<category><![CDATA[travel]]></category>

		<category><![CDATA[foss]]></category>

		<category><![CDATA[malaysia]]></category>

		<guid isPermaLink="false">http://torum.net/?p=675</guid>
		<description><![CDATA[Despite almost missing my flight to Malaysia due to unfortunate reasons, I successfully managed to get on the plane and make it to Kuala Lumpur last Friday. My suggestion from this experience is to not arrive at the airport 30 minutes before your flight&#8230;
It was pretty cool to arrive in Kuala Lumpur on the same [...]]]></description>
			<content:encoded><![CDATA[<p>Despite almost missing my flight to Malaysia due to unfortunate reasons, I successfully managed to get on the plane and make it to Kuala Lumpur last Friday. My suggestion from this experience is to not arrive at the airport 30 minutes before your flight&#8230;</p>
<p>It was pretty cool to arrive in Kuala Lumpur on the day of <a href="http://en.wikipedia.org/wiki/Raja_Petra_Kamarudin">RPK</a>&#8217;s release, and tasting the delicious <a href="http://en.wikipedia.org/wiki/Ipoh_white_coffee">Ipoh white coffee</a> at the KL airport was a great start to <a href="http://torum.net/2008/10/oss-conference-malaysia/">my conference trip</a>.</p>
<p>The conference was fun and it was nice meeting new people in this region. As for my sessions, the interest in memcached was astonishing and it was great to get mixi&#8217;s name out there. Hopefully &#8220;slow&#8221; media sites in South East Asia (which shall remain nameless) will start utilizing memcached <img src='http://torum.net/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> </p>
<p>I also managed to get the word out for <a href="https://launchpad.net/drizzle">Drizzle</a> in this region which was also fun with lots of questions coming my way&#8230; the funniest was being asked &#8220;hang on, is that David Axmark?&#8221; with a totally surprised look.</p>
<p>Here are the <a href="http://docs.google.com/gb?export=download&#038;id=F.7f51182b-29e5-404d-9bbf-e5a43ccaf84a">slides from my memcached talk</a>, though my presentation style is <strong>fewer words on the slides and more talking</strong>, so you may not find them informative on their own. Other than that, here are my thoughts from the stay:</p>
<ul>
<li>People are friendly and helpful</li>
<li>Hotel prices there are awesome</li>
<li>Great to hear that the Malaysian Gov supports Open Source</li>
<li>You can get around with just English in Kuala Lumpur</li>
<li>Their curry didn&#8217;t seem hot at first but gradually kicked in</li>
<li>mixi loads much faster than facebook in KL (I still love you guys)</li>
</ul>
<p>For those that are interested, I&#8217;ve taken lots of photos and they are now <a href="http://www.flickr.com/photos/tmaesaka/sets/72157608878943376/">up on Flickr</a>. Thumbs up to the <a href="http://foss.my/organizing-team/">event organizers</a> and thank you for taking care of me while I was there <img src='http://torum.net/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /></p>
]]></content:encoded>
			<wfw:commentRss>http://torum.net/2008/11/kuala-lumpur-experience/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Studying the Malaysian Railway System</title>
		<link>http://torum.net/2008/11/studying-the-malaysian-railway-system/</link>
		<comments>http://torum.net/2008/11/studying-the-malaysian-railway-system/#comments</comments>
		<pubDate>Thu, 06 Nov 2008 15:08:07 +0000</pubDate>
		<dc:creator>tmaesaka</dc:creator>
		
		<category><![CDATA[event]]></category>

		<category><![CDATA[travel]]></category>

		<category><![CDATA[malaysia]]></category>

		<category><![CDATA[transport]]></category>

		<guid isPermaLink="false">http://torum.net/?p=672</guid>
		<description><![CDATA[I&#8217;m flying to Malaysia tomorrow to speak at FOSS.my and hey, until ten minutes ago I didn&#8217;t even bother finding out how to get to the hotel from the Kuala Lumpur International Airport.
Originally I was hoping for a courtesy shuttle from the airport to the hotel since the hotel looks fairly upscale or just grab [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m flying to Malaysia tomorrow to <a href="http://foss.my/schedule/toru-maesaka/">speak at FOSS.my</a> and hey, until ten minutes ago I didn&#8217;t even bother finding out how to get to the hotel from the Kuala Lumpur International Airport.</p>
<p>Originally I was hoping for a courtesy shuttle from the airport to the hotel, since the hotel looks fairly upscale, or to just grab a taxi. The problem is that there is no courtesy shuttle and I don&#8217;t know what to tell the taxi driver. Sure, a printout of Google Maps could help but I didn&#8217;t feel too comfortable about it.</p>
<p>So, after searching for ways to get to Mid Valley (where I&#8217;m staying), it turns out there&#8217;s a <a href="http://en.wikipedia.org/wiki/Mid_Valley_Komuter_station">fancy train station</a> there, which I can reach by first taking the <a href="http://en.wikipedia.org/wiki/KLIA_Ekspres">KLIA Ekspres</a> from the International Airport to <a href="http://en.wikipedia.org/wiki/KL_Sentral">KL Sentral</a> and then transferring to the <a href="http://en.wikipedia.org/wiki/Rawang-Seremban_Line">Rawang-Seremban Line</a>, where Mid Valley is the next station.</p>
<p>I think I&#8217;ll give this route a try tomorrow and needless to say, I&#8217;m going to print this entry and carry it with me. Programmers love to solve problems, and finding your way to the hotel in a country that you&#8217;ve never been to and whose language you can&#8217;t speak seems to be no exception. Totally looking forward to this mini adventure <img src='http://torum.net/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /></p>
]]></content:encoded>
			<wfw:commentRss>http://torum.net/2008/11/studying-the-malaysian-railway-system/feed/</wfw:commentRss>
		</item>
		<item>
		<title>ACM ICPC Asia Regional Contest in Aizu</title>
		<link>http://torum.net/2008/10/acm-icpc-in-aizu/</link>
		<comments>http://torum.net/2008/10/acm-icpc-in-aizu/#comments</comments>
		<pubDate>Tue, 28 Oct 2008 08:45:46 +0000</pubDate>
		<dc:creator>tmaesaka</dc:creator>
		
		<category><![CDATA[event]]></category>

		<category><![CDATA[acm]]></category>

		<category><![CDATA[icpc]]></category>

		<guid isPermaLink="false">http://torum.net/?p=669</guid>
		<description><![CDATA[The last couple of days have been really fun. I was at the University of Aizu visiting the ACM ICPC Asia Regional Contest representing mixi as a sponsor. It was really nice to be somewhere in Japan that&#8217;s not Tokyo for once, though the three-hour train ride wasn&#8217;t the most exciting transport in my life. Fortunately [...]]]></description>
			<content:encoded><![CDATA[<p>The last couple of days have been really fun. I was at the <a href="http://www.u-aizu.ac.jp/official/index_e.html">University of Aizu</a> visiting the <a href="http://sparth.u-aizu.ac.jp/icpc2008/?lang=en">ACM ICPC Asia Regional Contest</a> representing mixi as a sponsor. It was really nice to be somewhere in Japan that&#8217;s not Tokyo for once, though the three-hour train ride wasn&#8217;t the most exciting transport in my life. Fortunately I got a bit of programming done and had <a href="http://www.mariokart.com/mkds/">Mario Kart DS</a> with me <img src='http://torum.net/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p style="text-align: center;"><a title="ACM ICPC Asia Regionals by tmaesaka, on Flickr" href="http://www.flickr.com/photos/tmaesaka/2980177971/"><img class="aligncenter" src="http://farm4.static.flickr.com/3149/2980177971_0cbccd1565.jpg" alt="ACM ICPC Asia Regionals" width="500" height="334" /></a></p>
<p>Getting to mingle with Computer Science students from all over Japan at the closing party was awesome. I was never told about the speech that I had to make but I think the crowd liked my last minute &#8220;if you&#8217;re not then you should get involved with open source&#8221; speech. </p>
<p>Congratulations to all the contestants and thumbs up to all the staff!</p>
]]></content:encoded>
			<wfw:commentRss>http://torum.net/2008/10/acm-icpc-in-aizu/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Open Source Conference in Malaysia</title>
		<link>http://torum.net/2008/10/oss-conference-malaysia/</link>
		<comments>http://torum.net/2008/10/oss-conference-malaysia/#comments</comments>
		<pubDate>Thu, 23 Oct 2008 15:58:16 +0000</pubDate>
		<dc:creator>tmaesaka</dc:creator>
		
		<category><![CDATA[oss]]></category>

		<category><![CDATA[conference]]></category>

		<guid isPermaLink="false">http://torum.net/?p=626</guid>
		<description><![CDATA[ So I&#8217;ve been bugging Colin Charles to invite me over to Malaysia for the last couple of months and what does he offer me? An opportunity to speak at an open source conference in Malaysia  
FOSS.my is a two day event (9th &#038; 10th next month) and as stated on their conference homepage, [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://foss.my"><img class="alignleft" title="FOSS.my Logo" src="http://foss.my/wp-content/themes/w2_dnd/images/fossmy-logo.png" alt="" width="243" height="74"/></a> So I&#8217;ve been bugging <a href="http://www.bytebot.net/">Colin Charles</a> to invite me over to Malaysia for the last couple of months and what does he offer me? An opportunity to speak at an open source conference in Malaysia <img src='http://torum.net/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>FOSS.my is a two day event (9th &#038; 10th next month) and as stated on their <a href="http://foss.my">conference homepage</a>, the aim of this conference is to cover technical aspects of various OSS projects without any business/sales intervention for those that follow open source technology in South East Asia (and other regions too of course). The cool thing is that after telling <a href="http://mixi.jp">mixi</a> about this event, they liked the idea so much they decided to sponsor the event immediately. I didn&#8217;t really expect this but hey, awesome.</p>
<p>At this conference, <a href="http://foss.my/schedule/toru-maesaka/">I will be doing two talks</a>. One will go over how mixi uses various OSS technologies to power the largest social networking service in Japan. The other will cover how the memcached internals work and the latest hot topics in the development community, like the upcoming binary protocol.</p>
<p>I&#8217;ve never been to Malaysia before so I&#8217;m totally looking forward to this trip.</p>
]]></content:encoded>
			<wfw:commentRss>http://torum.net/2008/10/oss-conference-malaysia/feed/</wfw:commentRss>
		</item>
		<item>
		<title>memcached Hackathon #5 at Sun Microsystems</title>
		<link>http://torum.net/2008/10/memcached-hackathon-at-sun/</link>
		<comments>http://torum.net/2008/10/memcached-hackathon-at-sun/#comments</comments>
		<pubDate>Sun, 19 Oct 2008 21:06:29 +0000</pubDate>
		<dc:creator>tmaesaka</dc:creator>
		
		<category><![CDATA[memcached]]></category>

		<category><![CDATA[oss]]></category>

		<category><![CDATA[hackathon]]></category>

		<guid isPermaLink="false">http://torum.net/?p=525</guid>
		<description><![CDATA[Last week I was in the valley for the fifth memcached Hackathon at Sun Microsystems and visiting some friends at Six Apart HQ. The hackathon was so fun, we ended up leaving at 2am on a weeknight! Thanks to Matt Ingenthron and Sun Microsystems for organizing the event and providing food and space for this [...]]]></description>
			<content:encoded><![CDATA[<p>Last week I was in the valley for the fifth <a href="http://www.socialtext.net/memcached/index.cgi?hackathon">memcached Hackathon</a> at Sun Microsystems and visiting some friends at Six Apart HQ. The hackathon was so fun, we ended up leaving at 2am on a weeknight! Thanks to <a href="http://blogs.sun.com/mingenthron/">Matt Ingenthron</a> and Sun Microsystems for organizing the event and providing food and space for this hackathon <img src='http://torum.net/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>In the <a href="http://lists.danga.com/pipermail/memcached/2008-April/006734.html">previous hackathon</a>, we mostly exchanged ideas on the binary protocol and the storage engine interface. This time it was more code oriented and we reviewed and tested the progress everyone had made in the latest binary protocol tree. Unfortunately I couldn&#8217;t cover the whole hackathon but here is a summary of discussions from the <a href="http://www.socialtext.net/memcached/index.cgi?hackathon">agenda</a> that I was involved in:</p>
<h3><strong>Binary Protocol - Add an engine specific OPCODE </strong></h3>
<p>No disagreements here. An opcode is represented by a 1-byte unsigned integer, so the consensus was that we should dedicate anything over 127 (0x7F) to special operations.</p>
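<p>A minimal sketch of that convention in Python, assuming the reserved range is every value above 0x7F (that is, 0x80 through 0xFF). The function and constant names are hypothetical and not part of memcached:</p>

```python
# Hypothetical helper illustrating the proposed opcode split; not memcached code.
ENGINE_SPECIFIC_MIN = 0x80  # first value above 127 (0x7F)


def is_engine_specific(opcode: int) -> bool:
    """Return True if a 1-byte opcode falls in the range set aside for special operations."""
    if opcode not in range(0x100):
        raise ValueError("an opcode must fit in one unsigned byte")
    return opcode >= ENGINE_SPECIFIC_MIN


print(is_engine_specific(0x7F))  # False
print(is_engine_specific(0x80))  # True
```

<p>The appeal of splitting at the high bit is that a server can route a request with a single comparison, without a lookup table.</p>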
<h3><strong>Storage Interface</strong></h3>
<p>We didn&#8217;t get around to discussing the interface in depth since getting the binary protocol released has greater priority at the moment. <a href="http://blogs.sun.com/trond/">Trond</a> however showed me some of the interesting work that he has been doing which will hopefully be out in the open soon.</p>
<h3><strong>Test Framework</strong></h3>
<p>The issue here is that tests aren&#8217;t actively being written. The opinion voiced on this issue was that some people aren&#8217;t comfortable with Perl, and thus find the current Perl-based test system difficult to understand.</p>
<p>Switching to a different test framework in a different language is easy but the problem is that this is a never-ending story. People can easily start demanding other languages that they feel comfortable in (python, java, ruby, lua, &#8230;). We briefly discussed that the ideal model is to be able to add tests written in any language but we didn&#8217;t go into depth on how we would actually achieve this.</p>
<p>Personally, I have nothing against the current test framework (mind you, I like Perl), but if we were to switch, I think a purely C-based framework would be a good move. I say this because anyone who would open up the memcached package and edit it can most likely write C (is this too bold an assumption? heh).</p>
<h3><strong>Client Libraries</strong></h3>
<p>Unfortunately I couldn&#8217;t get around to participating in the client talk but client-side replication work was being done for <a href="http://tangent.org/552/libmemcached.html">libmemcached</a> and I heard from <a href="http://krow.livejournal.com">Brian Aker</a> that there was good progress.</p>
<p><a href="http://search.cpan.org/~hachi/">Jonathan (hachi)</a> reviewed my binary protocol patch for <a href="http://search.cpan.org/search?query=cache%3A%3Amemcached&#038;mode=all">Cache::Memcached</a> and found that some protocol negotiation assumptions I made in the code can be improved. He is also looking at optimizing the code by subclassing the patch (reduces the number of conditional selections, perl method calls and hash lookups).</p>
<h3><strong>Scaling on Highly Threaded Servers</strong></h3>
<p>We didn&#8217;t really discuss this in depth since we were busy reviewing and testing the server code, but as far as I know, we talked about how locking can be improved in memcached. Looking into and preparing for this is a good idea since we are entering a massively concurrent age. On the other hand, the guys from Facebook mentioned that they were getting sufficient throughput with the current locking scheme, which was awesome to hear. </p>
<p>The engine plugin rearchitecture should fit well with this project since we can interchange different versions of the slabber engine with different locking strategies and make them compete to be the next default memcached engine.</p>
<h3><strong>Conclusion</strong></h3>
<p>The hackathon was fun and we got a lot done in terms of finding things to improve on. It was great to catch up with guys that I communicate a lot with online and talk tech in person. It was awesome that <a href="http://brad.livejournal.com">Brad</a> turned up as well. As for code improvements, <a href="http://bleu.west.spy.net/~dustin/">Dustin&#8217;s</a> test code found an issue in the stats subsystem always returning a zero for an opaque value. A little bit of coding looked necessary to get around this problem since an opaque value is held by the connection structure, which the engine does not have access to (it shouldn&#8217;t) but I was bored on my flight back to Tokyo so this problem is now <a href="http://github.com/tmaesaka/memcached/commit/86159eb7b5c611d25a33eb1cb6c750c806350912">fixed and pushed</a> to my tree <img src='http://torum.net/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
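<p>For readers unfamiliar with the opaque value: it is a 4-byte field in the binary protocol header that the server is expected to copy back unchanged, so a client can match replies to requests. The Python sketch below assumes the 24-byte header layout as I understand it from the binary protocol description (magic, opcode, key length, extras length, data type, reserved or status, total body length, opaque, CAS). The helper names are hypothetical and the "server" is a toy, not memcached code:</p>

```python
import struct

# Assumed field order: magic, opcode, key length, extras length, data type,
# reserved (request) or status (response), total body length, opaque, CAS.
HEADER = struct.Struct("!BBHBBHIIQ")  # 24 bytes, network byte order

REQUEST_MAGIC = 0x80
RESPONSE_MAGIC = 0x81
OPCODE_STAT = 0x10


def build_request(opcode, opaque, key=b""):
    """Pack a request header followed by the key (no extras, no value)."""
    return HEADER.pack(REQUEST_MAGIC, opcode, len(key), 0, 0, 0, len(key), opaque, 0) + key


def echo_response(request):
    """Toy server: reply with an empty body and the request's opaque copied back."""
    _, opcode, _, _, _, _, _, opaque, _ = HEADER.unpack_from(request)
    return HEADER.pack(RESPONSE_MAGIC, opcode, 0, 0, 0, 0, 0, opaque, 0)


request = build_request(OPCODE_STAT, opaque=0xDEADBEEF)
response = echo_response(request)
print(hex(HEADER.unpack(response)[7]))  # opaque field of the response: 0xdeadbeef
```

<p>A response that always carries zero in that field, as the stats path did, would break any client that pipelines requests and relies on the opaque to tell the replies apart.</p>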
]]></content:encoded>
			<wfw:commentRss>http://torum.net/2008/10/memcached-hackathon-at-sun/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Rethinking the Query Cache for Drizzle</title>
		<link>http://torum.net/2008/10/rethink-query-cache-drizzle/</link>
		<comments>http://torum.net/2008/10/rethink-query-cache-drizzle/#comments</comments>
		<pubDate>Fri, 10 Oct 2008 07:54:02 +0000</pubDate>
		<dc:creator>tmaesaka</dc:creator>
		
		<category><![CDATA[drizzle]]></category>

		<category><![CDATA[memcached]]></category>

		<category><![CDATA[oss]]></category>

		<category><![CDATA[performance]]></category>

		<guid isPermaLink="false">http://torum.net/?p=423</guid>
		<description><![CDATA[There is a shared understanding in the Drizzle community that the MySQL query cache works well for a small database but isn&#8217;t sufficient for relatively large-scale use. Does your application involve a lot of database updates? If so, you&#8217;ll probably face fragmentation issues in the query cache (though using the query cache isn&#8217;t suitable [...]]]></description>
			<content:encoded><![CDATA[<p>There is a shared understanding in the Drizzle community that the MySQL query cache works well for a small database but isn&#8217;t sufficient for relatively large-scale use. Does your application involve a lot of database updates? If so, you&#8217;ll probably face fragmentation issues in the query cache (though using the query cache isn&#8217;t suitable for use cases like this).</p>
<p>Caching is the key ingredient in boosting the performance of any software that requires a significant amount of computation, hence it is something that can&#8217;t be overlooked. So how can we improve Drizzle?</p>
<p>The idea is to create a pluggable query cache subsystem that can work in a large-scale environment. Since Drizzle is a micro-kernel DBMS, it makes sense to make the cache component pluggable and let DBAs pick the caching solution of their choice. This is exactly what I&#8217;m working on at the moment and my first plugin will allow Drizzle to use memcached as its query cache.</p>
<p>For example, a DBA could hook up their memcached pool to Drizzle and use several gigabytes of fast cache space to cache their results.</p>
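<p>A rough sketch of what such a plugin boundary could look like, written in Python for brevity. Every name here is hypothetical; this is not Drizzle&#8217;s plugin API, and the dictionary-backed class merely stands in for a memcached-backed plugin:</p>

```python
import hashlib
from abc import ABC, abstractmethod


class QueryCache(ABC):
    """Hypothetical interface the server would call; not Drizzle's actual plugin API."""

    @abstractmethod
    def get(self, key): ...

    @abstractmethod
    def set(self, key, result): ...


class DictQueryCache(QueryCache):
    """In-process stand-in for a memcached-backed plugin."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)

    def set(self, key, result):
        self._store[key] = result


def cache_key(schema, sql):
    """Derive a fixed-length key from the schema name and the statement text."""
    return hashlib.sha1(f"{schema}:{sql}".encode("utf-8")).hexdigest()


def run_query(cache, schema, sql, execute):
    """Serve from the cache when possible, otherwise execute and store the result."""
    key = cache_key(schema, sql)
    cached = cache.get(key)
    if cached is not None:
        return cached
    result = execute(sql)
    cache.set(key, result)
    return result


calls = []

def fake_execute(sql):
    """Pretend storage engine that records how often it is hit."""
    calls.append(sql)
    return [("row", 1)]


cache = DictQueryCache()
run_query(cache, "test", "SELECT 1", fake_execute)
run_query(cache, "test", "SELECT 1", fake_execute)
print(len(calls))  # 1: the second call is served from the cache
```

<p>The sketch deliberately ignores invalidation when the underlying tables change, which is where most of the real design work lies.</p>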
<h3><strong>Things to consider</strong></h3>
<ul>
<li>Does the DBA really want to cache results?</li>
<li>Does the result construction take long enough to care?</li>
<li>Do we want to specify a specific SQL statement to always cache?</li>
<li>Do we want to enforce a certain table to be cached?</li>
<li>Transactional Engines</li>
</ul>
<p>If we can satisfy the above points and achieve modularity, I think it&#8217;s a total win. For those that like diagrams, here is the architecture that is on my mind at the moment:</p>
<p style="border: 0; text-align: center;"> <a title="Drizzle Query Cache Plugin Example by tmaesaka, on Flickr" href="http://www.flickr.com/photos/tmaesaka/2927969599/"><img class="aligncenter" src="http://farm4.static.flickr.com/3042/2927969599_11f887a2ec.jpg" border="0" alt="Drizzle Query Cache Plugin Example" width="500" height="343" /></a></p>
<h3><strong>Benefits of using memcached</strong></h3>
<p>memcached has been proven by various players in the web industry to work and to help scale web applications cost-effectively. It is also fast. The time complexity of fetching a cached result from memcached is O(1), which is an order we all love. Furthermore, by using memcached, the fragmentation issue disappears, since this is a problem that the memcached community had to face in the past and successfully overcame by developing the slab subsystem.</p>
<p>Want to scale? With consistent hashing enabled, you can greatly reduce the number of cache misses caused by adding or removing a node from a live pool. Got spare boxes lying around? Hook them up and power up Drizzle! Need support? Both memcached and Drizzle community members are warm-hearted people.</p>
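<p>To make the consistent hashing claim concrete, here is a toy hash ring in Python. It is a sketch of the general technique only; real clients such as libmemcached differ in their hash functions and point placement, and the node names and the number of points per node are made up:</p>

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring; an illustration, not any client's real algorithm."""

    def __init__(self, nodes, points_per_node=100):
        self._points_per_node = points_per_node
        self._ring = []  # sorted list of (hash, node) pairs
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(value):
        return int(hashlib.sha1(value.encode("utf-8")).hexdigest(), 16)

    def add(self, node):
        """Place several points for the node around the ring."""
        for i in range(self._points_per_node):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def node_for(self, key):
        """Pick the first point clockwise from the key's hash, wrapping at the end."""
        index = bisect.bisect(self._ring, (self._hash(key), "")) % len(self._ring)
        return self._ring[index][1]


keys = [f"query:{i}" for i in range(10000)]
ring = HashRing(["cache1", "cache2", "cache3"])
before = {key: ring.node_for(key) for key in keys}
ring.add("cache4")
moved = sum(1 for key in keys if ring.node_for(key) != before[key])
print(f"{moved / len(keys):.0%} of keys moved after adding a fourth node")
```

<p>Only the keys that now belong to the new node change owner, so roughly a quarter of the cache goes cold here. With plain modulo hashing, most keys would map to a different node after the same change.</p>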
<h3><strong>Other Solutions Work Too!</strong></h3>
<p>The beauty of modularity is that you can create and use your own solution for your unique requirements. For example, let&#8217;s assume that there is a webshop that wants to keep the number of physical servers down (e.g. limited monetary/space resource).</p>
<p>To satisfy the requirement stated above, you could cache to a fantastically fast hash database, such as Tokyo Cabinet (much, much faster than BDB). If you haven&#8217;t heard of it, you should look at the incredible <a href="http://tokyocabinet.sourceforge.net/benchmark.pdf">benchmark comparison</a>. So, what I really wanted to say is that the microkernel property of Drizzle will open up a lot of new possibilities for your application and help you tackle the new requirements that seem to come out of nowhere.</p>
<h3><strong>Where from here?</strong></h3>
<p>I&#8217;m currently going through the UDF -> Plugin Architecture conversion done by <a href="http://fallenpegasus.com/">Mark</a>, and planning on basing the code on his logging plugin since it&#8217;s fantastically simple. My work will be done in:</p>
<ul>
<li>lp:~tmaesaka/drizzle/pluggable-qcache</li>
</ul>
<p>I&#8217;ll hopefully have something decent to show soon, and I will keep people updated on my blog, <a href="irc://irc.freenode.net/drizzle">IRC</a> and the <a href="https://launchpad.net/~drizzle-discuss">Mailing List</a> (drizzle-discuss).</p>
<p>So that is all I have to say for now&#8230; If you have any suggestions, please do enlighten me <img src='http://torum.net/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /></p>
]]></content:encoded>
			<wfw:commentRss>http://torum.net/2008/10/rethink-query-cache-drizzle/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Affection towards Beautiful Typography</title>
		<link>http://torum.net/2008/10/beautiful-typography/</link>
		<comments>http://torum.net/2008/10/beautiful-typography/#comments</comments>
		<pubDate>Thu, 09 Oct 2008 02:57:55 +0000</pubDate>
		<dc:creator>tmaesaka</dc:creator>
		
		<category><![CDATA[webservice]]></category>

		<category><![CDATA[typography]]></category>

		<category><![CDATA[web]]></category>

		<guid isPermaLink="false">http://torum.net/?p=431</guid>
		<description><![CDATA[Came across a service called Wordle this morning that can create a cloud of beautiful text based on the data you provide. Here&#8217;s what it generated from my RSS feed:
  
For those that love beautiful typography, you should definitely try it out. The data you provide doesn&#8217;t have to be a feed. You can [...]]]></description>
			<content:encoded><![CDATA[<p>Came across a service called <a href="http://wordle.net">Wordle</a> this morning that can create a cloud of beautiful text based on the data you provide. Here&#8217;s what it generated from my RSS feed:</p>
<p><a href="http://wordle.net/gallery/wrdl/238239/blog" title="Wordle: blog"><img src="http://wordle.net/thumb/wrdl/238239/blog" style="padding:4px;border:1px solid #ddd"></a> <a href="http://wordle.net/gallery/wrdl/238233/blog" title="Wordle: blog"><img src="http://wordle.net/thumb/wrdl/238233/blog" style="padding:4px;border:1px solid #ddd"></a> <a href="http://wordle.net/gallery/wrdl/238232/blog" title="Wordle: blog"><img src="http://wordle.net/thumb/wrdl/238232/blog" style="padding:4px;border:1px solid #ddd"></a></p>
<p>For those that love beautiful typography, you should definitely try it out. The data you provide doesn&#8217;t have to be a feed. You can directly type in a bunch of text and get Wordle to generate a text cloud. </p>
<p>Here&#8217;s an idea: copy and paste the lyrics of your favorite song or poem at <a href="http://wordle.net/create">http://wordle.net/create</a> and see what you get <img src='http://torum.net/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /></p>
]]></content:encoded>
			<wfw:commentRss>http://torum.net/2008/10/beautiful-typography/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Memo: I Studied DTrace a Little</title>
		<link>http://torum.net/2008/10/learning-dtrace/</link>
		<comments>http://torum.net/2008/10/learning-dtrace/#comments</comments>
		<pubDate>Sun, 05 Oct 2008 14:57:29 +0000</pubDate>
		<dc:creator>tmaesaka</dc:creator>
		
		<category><![CDATA[japanese]]></category>

		<category><![CDATA[dtrace]]></category>

		<guid isPermaLink="false">http://torum.net/?p=305</guid>
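		<description><![CDATA[For some reason I was never interested enough in DTrace to look into it myself, but I have been hearing about it a lot lately, so I studied it a little. The fact that it ships as standard from Mac OS X (10.5) was also a big factor. To be honest, if it only ran on Solaris, I don&#8217;t think I would have taken an interest in DTrace.
A quick Google search shows that people have already written about it in detail on ITMedia and elsewhere, so this ends up being yet another DTrace entry, but please bear with me and treat it as my personal notes. While reading up on DTrace, I found DTrace by Example: Solving a Real World Problem, a document written by Paul van den Bogaard of Sun, very easy to follow.
DTrace in a nutshell
DTrace is a dynamic tracing facility that Sun developed for Solaris; it lets you trace system behavior at runtime, starting from the kernel level. Paul&#8217;s document describes it as a framework, and it is currently available on Mac OS X in addition to Solaris.
With DTrace, system administrators and developers can obtain all sorts of information about the program at hand and the OS beneath it, which helps with profiling and debugging a system. Keep in mind, though, that the single name DTrace covers a number of components:

The OS kernel
The D language
The dtrace command
The dtrace virtual machine
dtrace probes
dtrace providers
The various dtrace libraries

As far as I can tell, the nice thing about DTrace is that instead of returning a huge trace full of information the user (a system administrator, say) doesn&#8217;t care about, it lets you specify exactly what to trace and report, at a fine level of detail, using the D language (for example, return only the number of system calls issued and their execution time). In other words, you can write custom tracers and profilers in D.
How DTrace works
A trace script written in D is run with the dtrace command. The dtrace command takes the script and converts it into an intermediate form that the dtrace virtual machine built into the OS kernel can understand (much like bytecode in Java). The virtual machine then collects data according to the logic written in the user&#8217;s script.
How data is collected: probes and providers
DTrace has the concept of a probe, which is an instrumentation point inside the kernel. There are many kinds of probes, and each fires under a specific condition (a certain system call being issued, for example). So if you name a particular probe in a D script, you can collect the information reported when that probe fires.
Incidentally, I checked the total number of probes built into the Darwin kernel on my MacBook Pro (OS X 10.5.5, DTrace API version 1.2.2):
# dtrace -l  
&#8230; (omitted)
21652 plockstat4556 libSystem.B.dylib    pthread_rwlock_wrlock$UNIX2003 rw-error
21653 plockstat4556 libSystem.B.dylib             pthread_rwlock_unlock [...]]]></description>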
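			<content:encoded><![CDATA[<p>For some reason I was never interested enough in <a href="http://sdc.sun.co.jp/solaris/solaris10/dtrace/guide.html">DTrace</a> to look into it myself, but I have been hearing about it a lot lately, so I studied it a little. The fact that it ships as standard from Mac OS X (10.5) was also a big factor. To be honest, if it only ran on Solaris, I don&#8217;t think I would have taken an interest in DTrace.</p>
<p>A quick Google search shows that people have already <a href="http://www.itmedia.co.jp/enterprise/articles/0504/22/news030.html">written about it in detail</a> on ITMedia and elsewhere, so this ends up being yet another DTrace entry, but please bear with me and treat it as my personal notes. While reading up on DTrace, I found <a href="http://developers.sun.com/solaris/articles/dtrace_example.pdf">DTrace by Example: Solving a Real World Problem</a>, a document written by <a href="http://blogs.sun.com/paulvandenbogaard/">Paul van den Bogaard</a> of Sun, very easy to follow.</p>
<h3><strong>DTrace in a nutshell</strong></h3>
<p>DTrace is a dynamic tracing facility that Sun developed for Solaris; it lets you trace system behavior at runtime, starting from the kernel level. Paul&#8217;s document describes it as a framework, and it is currently available on Mac OS X in addition to Solaris.</p>
<p>With DTrace, system administrators and developers can obtain all sorts of information about the program at hand and the OS beneath it, which helps with profiling and debugging a system. Keep in mind, though, that the single name DTrace covers a number of components:</p>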
<ul>
<li>The OS kernel</li>
<li>The D language</li>
<li>The dtrace command</li>
<li>The dtrace virtual machine</li>
<li>dtrace probes</li>
<li>dtrace providers</li>
<li>The various dtrace libraries</li>
</ul>
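<p>As far as I can tell, the nice thing about DTrace is that instead of returning a huge trace full of information the user (a system administrator, say) doesn&#8217;t care about, it lets you specify exactly what to trace and report, at a fine level of detail, using the D language (for example, return only the number of system calls issued and their execution time). In other words, you can write custom tracers and profilers in D.</p>
<h3><strong>How DTrace works</strong></h3>
<p>A trace script written in D is run with the dtrace command. The dtrace command takes the script and converts it into an intermediate form that the dtrace virtual machine built into the OS kernel can understand (much like bytecode in Java). The virtual machine then collects data according to the logic written in the user&#8217;s script.</p>
<h3><strong>How data is collected: probes and providers</strong></h3>
<p>DTrace has the concept of a probe, which is an instrumentation point inside the kernel. There are many kinds of probes, and each fires under a specific condition (a certain system call being issued, for example). So if you name a particular probe in a D script, you can collect the information reported when that probe fires.</p>
<p>Incidentally, I checked the total number of probes built into the Darwin kernel on my MacBook Pro (OS X 10.5.5, DTrace API version 1.2.2):</p>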
<div style="width: 80%; font-size: 0.9em; background: #333; color: #fff; padding: 5px 0 2px 12px; margin-bottom: 20px;"><p># dtrace -l</p>
<p>&#8230; (omitted)</p>
<p>21652 plockstat4556 libSystem.B.dylib    pthread_rwlock_wrlock$UNIX2003 rw-error<br />
21653 plockstat4556 libSystem.B.dylib             pthread_rwlock_unlock rw-release<br />
21654 plockstat4556 libSystem.B.dylib    pthread_rwlock_unlock$UNIX2003 rw-release</p></div>
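<p>That turned up more than 20,000 probes. Quite a lot! There are also providers that create probes on the fly, so apparently the real number is even higher.</p>
<p>According to Sun KK&#8217;s <a href="http://docs.sun.com/app/docs/doc/819-6259?l=ja&amp;a=load">DTrace documentation</a> (in Japanese), the following information becomes available when a probe fires:</p>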
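<p>A listing that long is easier to digest when tallied per provider. The Python sketch below assumes that every probe row of <code>dtrace -l</code> starts with a numeric ID followed by the provider name, as in the three rows shown above, and that a trailing number on a provider name (such as plockstat4556) is a process ID. Both are assumptions made for illustration:</p>

```python
import re
from collections import Counter

# The three sample rows from this post, standing in for real `dtrace -l` output.
SAMPLE = """\
21652 plockstat4556 libSystem.B.dylib    pthread_rwlock_wrlock$UNIX2003 rw-error
21653 plockstat4556 libSystem.B.dylib             pthread_rwlock_unlock rw-release
21654 plockstat4556 libSystem.B.dylib    pthread_rwlock_unlock$UNIX2003 rw-release
"""


def count_probes(listing):
    """Count probes per provider, folding an assumed PID suffix into the base name."""
    counts = Counter()
    for line in listing.splitlines():
        fields = line.split()
        # Skip the header row and blank lines: a probe row starts with a numeric ID.
        if len(fields) >= 2 and fields[0].isdigit():
            provider = re.sub(r"\d+$", "", fields[1])
            counts[provider] += 1
    return counts


print(count_probes(SAMPLE))  # Counter({'plockstat': 3})
```

<p>On a real machine the listing would come from running the command itself, for example through <code>subprocess.run</code> with <code>capture_output=True</code>, which needs root privileges.</p>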
<ul>
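<li>All arguments passed to the function</li>
<li>All global variables in the kernel</li>
<li>A timestamp indicating when the function was called</li>
<li>A stack trace showing the section of code that called the function</li>
<li>The process that was running when the function was called</li>
<li>The thread that called the function</li>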
</ul>
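<p>The kernel module that actually enables probes inside the kernel is called a provider. There are various kinds of providers, and the probes related to a particular provider appear to be grouped under it.</p>
<p>After some googling, I found a list of providers and descriptions of each kind on the DTrace wiki and in the Solaris Dynamic Tracing Guide:</p>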
<p><a href="http://wikis.sun.com/display/DTrace/Providers">http://wikis.sun.com/display/DTrace/Providers</a><br />
<a href="http://docs.sun.com/app/docs/doc/817-6223">http://docs.sun.com/app/docs/doc/817-6223</a> (chapters 17 through 32)</p>
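<h3><strong>Writing D</strong></h3>
<p>I don&#8217;t know D well enough to teach it, and writing up the specification would go beyond the scope of a blog entry, so I&#8217;ll refrain; the basics are explained in <a href="http://docs.sun.com/app/docs/doc/819-6259/6n89m7rn6?l=ja&amp;a=view">chapter 3 of the DTrace User Guide</a>.</p>
<p>As an easy way out, you don&#8217;t have to grind out scripts yourself: <a href="http://www.opensolaris.org/os/community/dtrace/dtracetoolkit/">DTraceToolKit</a> is a rich collection of more than 200 scripts (thanks to kiyotaka for telling me about it). Quite a few of the scripts apparently don&#8217;t work on OS X, but the memory-related scripts I tried seem fine.</p>
<h3><strong>Trying DTrace on Mac OS X (one-liners)</strong></h3>
<p>Unless you are doing something tricky, you can try dtrace from the terminal without creating a script file.</p>
<p>For example, to report immediately whenever a process opens a file:</p>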

<div class="wp_syntax"><div class="code"><pre class="d d" style="font-family:monospace;">$ sudo dtrace <span style="color: #66cc66;">-</span>n <span style="color: #ff0000;">'syscall::open*:entry { printf(&quot;%s %s&quot;, execname, copyinstr(arg0)); }'</span></pre></div></div>

<p>To report immediately which processes call read(2):</p>

<div class="wp_syntax"><div class="code"><pre class="d d" style="font-family:monospace;">$ sudo dtrace <span style="color: #66cc66;">-</span>n <span style="color: #ff0000;">'syscall::read:entry { printf(&quot;%s&quot;, execname); }'</span>
&nbsp;
<span style="color: #808080; font-style: italic;">// OUTPUT:</span>
<span style="color: #808080; font-style: italic;">// CPU     ID                FUNCTION:NAME</span>
<span style="color: #808080; font-style: italic;">// 1       17602             read:entry mDNSResponder</span>
<span style="color: #808080; font-style: italic;">// 1       17602             read:entry mDNSResponder</span>
<span style="color: #808080; font-style: italic;">// 1       17602             read:entry EchoPod</span>
<span style="color: #808080; font-style: italic;">// 1       17602             read:entry EchoPod</span></pre></div></div>

<p>Use the -p option with a $target predicate to narrow the trace to a specific process (e.g., pid=6455):</p>

<div class="wp_syntax"><div class="code"><pre class="d d" style="font-family:monospace;">$ sudo dtrace <span style="color: #66cc66;">-</span>p <span style="color: #0000dd;">6455</span> <span style="color: #66cc66;">-</span>n <span style="color: #ff0000;">'syscall::read:entry /pid == $target/ { printf(&quot;%s&quot;, execname); }'</span></pre></div></div>

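<p>As you can see, once you learn a little D you can play around with all sorts of things, much like Perl one-liners.</p>
<h3><strong>Side note</strong></h3>
<p>memcached-1.2.6 has <a href="http://consoleninja.net/gitweb/gitweb.cgi?p=memcached.git;a=commitdiff;h=71fa535153a8fd2598044adc907d60cc246368e0">dtrace probes embedded</a> for nearly every command, and you can enable them by passing --enable-dtrace at configure time. In other words, you can write a top(1)-like program specialized for memcached in just a few lines.</p>
<p>However, the OS X implementation currently differs from the Solaris one, so the build fails on OS X (it does not recognize dtrace&#8217;s -G option). Sun engineers are actively working on this, so I expect it to build on OS X in the 1.3 series, and from now on I would like to use memcached as a sparring partner to sharpen my dtrace knowledge and skills.</p>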
]]></content:encoded>
			<wfw:commentRss>http://torum.net/2008/10/learning-dtrace/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Thoughts on UTF-8 over CJK charsets in Drizzle</title>
		<link>http://torum.net/2008/09/utf8-over-cjk-drizzle/</link>
		<comments>http://torum.net/2008/09/utf8-over-cjk-drizzle/#comments</comments>
		<pubDate>Sun, 28 Sep 2008 08:24:29 +0000</pubDate>
		<dc:creator>tmaesaka</dc:creator>
		
		<category><![CDATA[drizzle]]></category>

		<category><![CDATA[oss]]></category>

		<category><![CDATA[charset]]></category>

		<category><![CDATA[encoding]]></category>

		<category><![CDATA[internationalization]]></category>

		<guid isPermaLink="false">http://torum.net/?p=246</guid>
		<description><![CDATA[Internally, Drizzle will use UTF-8 everywhere and _only_ UTF-8. This is simply because UTF-8 is the choice of encoding within the Drizzle community at the moment. To me, this decision makes sense since UTF-8 is popular in the areas that Drizzle is targeting (Web and the Cloud). Limiting to UTF-8 also means that the Drizzle [...]]]></description>
			<content:encoded><![CDATA[<p>Internally, Drizzle will use UTF-8 everywhere and _only_ UTF-8. This is simply because UTF-8 is the encoding of choice within the Drizzle community at the moment. To me, this decision makes sense since UTF-8 is popular in the areas that Drizzle is targeting (Web and the Cloud). Limiting to UTF-8 also means that the Drizzle codebase would become cleaner and thus easier to maintain. However, there are arguments against it in the community, so this could change in the future.</p>
<p>So, what does this mean for those outside the regions that use Latin characters, specifically East Asia? Would this cause an uproar?</p>
<p>A few months ago, Brian Aker asked me about this, and after a brief discussion with <a href="http://jpipes.com/">Jay Pipes</a> a couple of days ago, I figured I should blog about it so I can keep it as a note for myself and hopefully get feedback from those who stumble across this entry. Here are my thoughts, based on my knowledge of the Japanese web industry:</p>
<h3><strong>Web Industry Standard in Japan</strong></h3>
<p>Looking at the web industry trend in Japan, UTF-8 is becoming the prominent encoding, despite the fact that UTF-8 requires more computation power and space than Japanese CJK charsets. For example, <a href="http://mixi.jp">mixi.jp</a> (one of the largest websites in Japan) still uses EUC-JP (CJK family) due to historical reasons but if you look at their newer features like video sharing, you can see that they&#8217;ve begun adopting UTF-8. <a href="http://www.yahoo.co.jp">Yahoo! JP</a>, <a href="http://cookpad.com">COOKPAD</a>, <a href="http://ja.wikipedia.org">ja.wikipedia</a> and <a href="http://www.livedoor.com">Livedoor</a> are great examples of large Japanese sites too.</p>
<p>The reasons UTF-8 is becoming popular in the .jp domain are, IMHO:</p>
<ul>
<li>The default encoding of XHTML is UTF-8/UTF-16</li>
<li>All browsers support UTF-8 nowadays (if yours doesn&#8217;t, you shouldn&#8217;t be using it)</li>
<li>Theoretically, more characters can be represented in UTF-8</li>
<li>Theoretically, existing ASCII functions can be used</li>
</ul>
<p>However, there are certainly cases where web developers might need to use their local encoding to support things like mobile devices (Shift-JIS in Japan). These unique requirements IMHO should be handled by the client: rather than making the DBMS responsible, you should convert the returned result to whatever encoding you like in the application layer before rendering it.</p>
<h3><strong>More overhead per character</strong></h3>
<p>Using UTF-8 means that there is going to be an estimated average overhead of 1 byte per character (an EUC-JP character is typically 2 bytes), so if you already have a lot of textual data in one of the CJK encodings, you&#8217;re definitely going to use more storage (the more data you have, the more significant the difference).</p>
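<p>To make the 1 byte estimate concrete, here is a minimal sketch in C that counts how many bytes UTF-8 needs for a single code point. The function names are made up for illustration, and the comparison assumes a character that EUC-JP stores in 2 bytes (most kana and kanji). For example, U+3042 (the hiragana &#8220;a&#8221;) takes 3 bytes in UTF-8, one more than in EUC-JP.</p>

```c
#include <stddef.h>
#include <stdint.h>

/* Number of bytes UTF-8 needs to encode one Unicode code point.
 * Returns 0 for surrogates and values outside the Unicode range. */
static size_t utf8_encoded_len(uint32_t cp)
{
    if (cp < 0x80)
        return 1;                       /* ASCII */
    if (cp < 0x800)
        return 2;
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;                       /* surrogates are not encodable */
    if (cp < 0x10000)
        return 3;                       /* most kana and kanji live here */
    if (cp <= 0x10FFFF)
        return 4;
    return 0;
}

/* Extra bytes compared with a 2-byte legacy encoding.
 * Assumption: the character is one that EUC-JP stores in 2 bytes. */
static size_t utf8_overhead_vs_two_byte(uint32_t cp)
{
    size_t n = utf8_encoded_len(cp);
    return n > 2 ? n - 2 : 0;
}
```

<p>Calling utf8_overhead_vs_two_byte(0x3042) gives 1, which is where the per-character estimate comes from. ASCII characters cost the same single byte in both encodings.</p>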
<p>Eating more space may seem significant, but to me what&#8217;s more significant is how much cheaper memory and storage media have become. If you begin facing problems due to having too much data, it&#8217;s probably time to consider horizontal partitioning anyway.</p>
<h3><strong>Conclusion</strong></h3>
<p>The topic discussed in this entry is very sensitive, and this is merely my personal opinion. Every encoding has its ups and downs like all things (they were designed for a purpose after all), and hence there are many people with different opinions. Satisfying everyone is difficult, but who knows? UTF-8 alone may satisfy the majority of the users that we are targeting. If it doesn&#8217;t, then I guess we&#8217;ll have to think again&#8230; We also need to look into internal sort performance if we go pure UTF-8.</p>
<p>The conclusion Jay and I came up with in our brief discussion was that providing a conversion tool in the Drizzle package could be a good start to get people jumping into the UTF-8 boat. There is no specific plan, nor have we decided to do this yet, but if we were to do it, I&#8217;m thinking that the tool can be something simple that uses <a href="http://www.gnu.org/software/libiconv/">GNU libiconv</a>.</p>
<p>Hey, there is always the brute solution of storing textual data of your choice in binary <img src='http://torum.net/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /></p>
]]></content:encoded>
			<wfw:commentRss>http://torum.net/2008/09/utf8-over-cjk-drizzle/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Why Stats was refactored in memcached-1.3</title>
		<link>http://torum.net/2008/09/memcached-stats-refactor/</link>
		<comments>http://torum.net/2008/09/memcached-stats-refactor/#comments</comments>
		<pubDate>Thu, 25 Sep 2008 09:43:49 +0000</pubDate>
		<dc:creator>tmaesaka</dc:creator>
		
		<category><![CDATA[memcached]]></category>

		<category><![CDATA[oss]]></category>

		<category><![CDATA[programming]]></category>

		<guid isPermaLink="false">http://torum.net/?p=68</guid>
		<description><![CDATA[For those that do not follow the active development of memcached, the current excitement in the community is the new binary protocol that will be introduced in the upcoming 1.3 series. If you&#8217;d like a quick and easy introduction on the binary protocol, you can see the slides from my presentation.
So, with such significant advances, [...]]]></description>
			<content:encoded><![CDATA[<p>For those that do not follow the active development of memcached, the current excitement in the community is the new binary protocol that will be introduced in the upcoming 1.3 series. If you&#8217;d like a quick and easy introduction on the binary protocol, you can see the <a href="http://www.slideshare.net/tmaesaka/memcached-binary-protocol-in-a-nutshell-presentation/">slides from my presentation</a>.</p>
<p>So, with such significant advances, the 1.3 codebase is obviously going to look a bit different from the 1.2 codebase, but even then the overall software architecture is the same. What&#8217;s significantly different, however, is how the stats operation is implemented. This is why I am writing this entry: to answer in advance the questions that people might have.</p>
<h3><strong>Background in a nutshell</strong></h3>
<p>Looking further ahead, beyond the binary protocol, the memcached community is aiming to achieve a pluggable engine architecture, which will allow memcached to satisfy unique requirements that people might have. These unique requirements can be things like persistent storage, data dumping, server-side replication, etc. All this fancy stuff obviously goes against the original motives of memcached, but I will save this discussion for another day, as it is not appropriate for this entry <img src='http://torum.net/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>Supporting third-party engines means that memcached must be able to send back <strong>engine specific</strong> stats to the client (most likely a system admin). To achieve this, memcached&#8217;s stats handling had to be made flexible by splitting the concept of &#8220;stats&#8221; into two segments, &#8220;core server stats&#8221; and &#8220;engine stats&#8221;, hence the refactoring.</p>
<h3><strong>The new approach</strong></h3>
<p>Previously, stats was done by incrementing/accumulating values inside the stats structure (defined in memcached.h). The actual increments were done mostly in the server code.</p>
<p>In the new approach, an engine does not have to depend on the stats structure, because that would limit the engine to this structure. Adding an opaque pointer to the structure for pointing to something engine specific could get around this problem, but let&#8217;s not go there&#8230; no, no, no.</p>
<p>All non-server stats are pushed out to the slabber code since this is the closest thing to an engine in memcached at the moment. In this model, if a client asks for something unique (e.g. &#8220;stats malloc&#8221;), then the server will query the engine for &#8220;malloc&#8221;. If the engine has no clue of what the client is asking for, then the server will simply return an error.</p>
<p>Likewise, if the client asks for non-specific stats (&#8220;stats\r\n&#8221; in the ASCII protocol), memcached will return the merged result of its own stats (core-server stats) and the general-purpose stats from the slabber (bytes written, number of gets/sets, etc.).</p>
<p>If you&#8217;d like to see the actual code, take a look at this branch:<br />
<a href="http://github.com/tmaesaka/memcached/commits/binprot" target="_blank">http://github.com/tmaesaka/memcached/commits/binprot</a></p>
<p>Make sure you check out the &#8220;binprot&#8221; branch.</p>
<h3><strong>Binary Stats is Packet-Per-Stat</strong></h3>
<p>Before I talk about how stats is implemented, I must mention that with the binary protocol, each piece of statistical information is returned in its own packet (as mentioned in the documentation). The key contains the name of the statistic and the value contains the associated value. The end of the transmission is signaled by a packet with no key and no value.</p>
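<p>To illustrate the packet-per-stat framing from the receiving side, here is a minimal sketch in C that walks a buffer of stat responses until it reaches the terminator. It assumes the 24-byte header layout of the published binary protocol description (key length at byte offset 2, total body length at offset 8, both big-endian). The function names are hypothetical and this is not code from the memcached tree.</p>

```c
#include <stddef.h>
#include <stdint.h>

#define BIN_HEADER_LEN 24   /* fixed header size of a binary protocol packet */

/* Read big-endian integers from the wire. */
static uint16_t read_be16(const unsigned char *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Count the stat packets that precede the terminator packet.
 * Returns -1 if the buffer is truncated or has no terminator. */
static int count_stat_packets(const unsigned char *buf, size_t len)
{
    size_t off = 0;
    int count = 0;

    while (len - off >= BIN_HEADER_LEN) {
        uint16_t klen = read_be16(buf + off + 2);   /* key length */
        uint32_t body = read_be32(buf + off + 8);   /* key + value bytes */

        if (klen == 0 && body == 0)
            return count;           /* empty packet: end of stats */
        if (body > len - off - BIN_HEADER_LEN)
            return -1;              /* body runs past the buffer */

        off += BIN_HEADER_LEN + body;
        count++;
    }
    return -1;
}
```

<p>A real client would also check the magic byte, opcode and status fields before trusting the lengths; the sketch skips those checks to keep the framing logic visible.</p>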
<h3><strong>How it works</strong></h3>
<p>So how does this work? The laziest solution is to push the responsibility for, and the implementation of, data formatting/serialization onto the engine, but this has a potential pitfall:</p>
<ul>
<li>Server Failure due to incorrect formatting/serialization by the engine.</li>
</ul>
<p>Instead, an engine is given a callback that it can use to format/serialize stats data for returning to the core server. This way we can reduce the likelihood of an engine returning something invalid to the core server (assuming that the implementer uses the callback of course). Specifically, the engine needs to implement the following function:</p>

<div class="wp_syntax"><div class="code"><pre class="c c" style="font-family:monospace;"><span style="color: #993333;">char</span> <span style="color: #339933;">*</span>get_stats<span style="color: #009900;">&#40;</span><span style="color: #993333;">const</span> <span style="color: #993333;">char</span> <span style="color: #339933;">*</span>stat_name<span style="color: #339933;">,</span> uint32_t <span style="color: #009900;">&#40;</span><span style="color: #339933;">*</span>add_stats<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#40;</span>
                <span style="color: #993333;">char</span> <span style="color: #339933;">*</span>buf<span style="color: #339933;">,</span> <span style="color: #993333;">const</span> <span style="color: #993333;">char</span> <span style="color: #339933;">*</span>key<span style="color: #339933;">,</span> <span style="color: #993333;">const</span> uint16_t klen<span style="color: #339933;">,</span> 
                <span style="color: #993333;">const</span> <span style="color: #993333;">char</span> <span style="color: #339933;">*</span>val<span style="color: #339933;">,</span> <span style="color: #993333;">const</span> uint32_t vlen<span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span> <span style="color: #993333;">int</span> <span style="color: #339933;">*</span>buflen<span style="color: #009900;">&#41;</span>;</pre></div></div>

<p>and notice the callback:</p>

<div class="wp_syntax"><div class="code"><pre class="c c" style="font-family:monospace;">uint32_t <span style="color: #009900;">&#40;</span><span style="color: #339933;">*</span>add_stats<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#40;</span><span style="color: #993333;">char</span> <span style="color: #339933;">*</span>buf<span style="color: #339933;">,</span> <span style="color: #993333;">const</span> <span style="color: #993333;">char</span> <span style="color: #339933;">*</span>key<span style="color: #339933;">,</span> <span style="color: #993333;">const</span> uint16_t klen<span style="color: #339933;">,</span>
                      <span style="color: #993333;">const</span> <span style="color: #993333;">char</span> <span style="color: #339933;">*</span>val<span style="color: #339933;">,</span> <span style="color: #993333;">const</span> uint32_t vlen<span style="color: #009900;">&#41;</span>;</pre></div></div>

<p>where the buf argument is the buffer that the entry will be serialized to, the key argument should be the name of the statistic (e.g. &#8220;bytes&#8221;) and the val argument should contain the associated value (e.g. &#8220;1024&#8221;). The remaining klen and vlen arguments should be the lengths of the key and value (e.g. strlen(&quot;bytes&quot;)).</p>
<p>This callback returns the number of bytes it appended to the provided buffer, which the engine can use to advance the write pointer for further appending. Just make sure you allocate enough memory in advance (each append has a 24-byte overhead for the binary protocol). </p>
<p>Another thing to mention is that the engine does not have to worry about whether the returned data is for the ASCII or the binary protocol, since memcached will give the engine the appropriate callback (with logic that corresponds to the protocol type).</p>
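<p>As a rough illustration of what a binary-protocol flavour of this callback might do, here is a minimal sketch in C. It uses the add_stats signature shown above, but the function name, the constants and the body are my own assumptions, based on the published binary protocol header layout (response magic 0x81, stat opcode 0x10). The code in the binprot branch may well differ.</p>

```c
#include <stdint.h>
#include <string.h>

#define BIN_HEADER_LEN 24     /* fixed binary protocol header size */
#define BIN_RES_MAGIC  0x81   /* assumed response magic byte */
#define BIN_CMD_STAT   0x10   /* assumed stat opcode */

/* Hypothetical callback: serialize one stat as header + key + value.
 * The caller must guarantee room for 24 + klen + vlen bytes in buf.
 * Returns the number of bytes appended. */
static uint32_t sketch_add_stats_binary(char *buf, const char *key,
                                        const uint16_t klen,
                                        const char *val,
                                        const uint32_t vlen)
{
    unsigned char *hdr = (unsigned char *)buf;
    uint32_t body = (uint32_t)klen + vlen;

    memset(hdr, 0, BIN_HEADER_LEN);         /* extras, status, opaque, cas = 0 */
    hdr[0] = BIN_RES_MAGIC;
    hdr[1] = BIN_CMD_STAT;
    hdr[2] = (unsigned char)(klen >> 8);    /* key length, big-endian */
    hdr[3] = (unsigned char)(klen & 0xff);
    hdr[8]  = (unsigned char)(body >> 24);  /* total body length, big-endian */
    hdr[9]  = (unsigned char)(body >> 16);
    hdr[10] = (unsigned char)(body >> 8);
    hdr[11] = (unsigned char)(body & 0xff);

    if (klen > 0)
        memcpy(buf + BIN_HEADER_LEN, key, klen);
    if (vlen > 0)
        memcpy(buf + BIN_HEADER_LEN + klen, val, vlen);

    return BIN_HEADER_LEN + body;
}
```

<p>With key &#8220;bytes&#8221; and value &#8220;1024&#8221; this appends 33 bytes (24 + 5 + 4), and calling it with no key and no value appends just the 24-byte terminator packet.</p>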
<p>Once the engine populates the buffer with data that it would like to report, it can then simply return it to the core server, where it will be sent back to the client.</p>
<p>So, get_stats() could look something like this:</p>

<div class="wp_syntax"><div class="code"><pre class="c c" style="font-family:monospace;"><span style="color: #808080; font-style: italic;">/* assume, foo_key = &quot;hello&quot; and foo_val = &quot;world&quot; */</span>
<span style="color: #993333;">char</span> <span style="color: #339933;">*</span>buf<span style="color: #339933;">,</span> <span style="color: #339933;">*</span>ptr;
uint32_t nbytes <span style="color: #339933;">=</span> <span style="color:#800080;">0</span>;
&nbsp;
<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #009900;">&#40;</span>buf <span style="color: #339933;">=</span> malloc<span style="color: #009900;">&#40;</span>num_of_bytes<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">==</span> <span style="color: #000000; font-weight: bold;">NULL</span><span style="color: #009900;">&#41;</span>
    <span style="color: #b1b100;">return</span> <span style="color: #000000; font-weight: bold;">NULL</span>;
&nbsp;
ptr <span style="color: #339933;">=</span> buf;
nbytes <span style="color: #339933;">=</span> add_stats<span style="color: #009900;">&#40;</span>ptr<span style="color: #339933;">,</span> foo_key<span style="color: #339933;">,</span> strlen<span style="color: #009900;">&#40;</span>foo_key<span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span>
                   foo_val<span style="color: #339933;">,</span> strlen<span style="color: #009900;">&#40;</span>foo_val<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span>;
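<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #339933;">!</span>nbytes<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    free<span style="color: #009900;">&#40;</span>buf<span style="color: #009900;">&#41;</span>; <span style="color: #808080; font-style: italic;">/* do not leak the buffer on failure */</span>
    <span style="color: #b1b100;">return</span> <span style="color: #000000; font-weight: bold;">NULL</span>;
<span style="color: #009900;">&#125;</span>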
ptr <span style="color: #339933;">+=</span> nbytes;
<span style="color: #339933;">*</span>buflen <span style="color: #339933;">+=</span> nbytes;
&nbsp;
...
&nbsp;
<span style="color: #339933;">*</span>buflen <span style="color: #339933;">+=</span> add_stats<span style="color: #009900;">&#40;</span>ptr<span style="color: #339933;">,</span> <span style="color: #000000; font-weight: bold;">NULL</span><span style="color: #339933;">,</span> <span style="color:#800080;">0</span><span style="color: #339933;">,</span> <span style="color: #000000; font-weight: bold;">NULL</span><span style="color: #339933;">,</span> <span style="color:#800080;">0</span><span style="color: #009900;">&#41;</span>; <span style="color: #808080; font-style: italic;">/* seal with terminator */</span>
<span style="color: #b1b100;">return</span> buf;</pre></div></div>

<p>That&#8217;s it! Minimal coding is required from the engine implementer.</p>
<h3><strong>Good and the not so Good</strong></h3>
<p>Like all things, stats over the binary protocol has its ups and downs. The good thing about the packet-per-row approach is that the client library should be easier to write, especially for languages that aren&#8217;t so string friendly (e.g. C compared to Perl). I&#8217;ve already heard that it made libmemcached&#8217;s life happier.</p>
<p>The downside, however, is the network cost of binary stats compared to the ASCII protocol. Because a packet must be created for each stat, the total number of bytes to transmit over the wire can be relatively large. For example, if you want to return ten stat rows to the client, then the number of bytes to transmit is:</p>
<p style="text-align: center;"><strong>&#8220;264 bytes (sum of packet headers, including terminator) + size of each key and value&#8221;</strong></p>
<p>whereas with the ASCII protocol it would be just:</p>
<p style="text-align: center;"><strong>&#8220;size of each key/value + 20 bytes (sum of CRLF) + 5 bytes (terminator)&#8221;<br />
</strong></p>
<p>Sure, the size difference may look trivial and you may not issue the stats command much but some system admins might care&#8230;</p>
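<p>For the curious, the two estimates above can be written down as a small sketch in C. The function names are made up, and payload stands for the combined size of all keys and values. The ASCII figure follows the formula above as is, so it leaves out the &#8220;STAT &#8221; prefix and the separating space that each ASCII row also carries.</p>

```c
#include <stddef.h>

#define BIN_HEADER_LEN 24   /* binary protocol header size per packet */

/* Binary protocol: one header per stat row plus one for the terminator. */
static size_t binary_stats_bytes(size_t rows, size_t payload)
{
    return (rows + 1) * BIN_HEADER_LEN + payload;
}

/* ASCII protocol, per the estimate above: a CRLF per row plus the
 * 5-byte "END\r\n" terminator. */
static size_t ascii_stats_bytes(size_t rows, size_t payload)
{
    return payload + rows * 2 + 5;
}
```

<p>For ten rows the fixed cost is 264 bytes for binary against 25 bytes for ASCII, before any keys or values are counted.</p>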
<h3><strong>Conclusion</strong></h3>
<p>As you can see, a decent amount of thought has been put into the 1.3 series by the memcached community, and as a result, memcached will keep getting better. It will stay as simple as it has always been, and at the same time it will hopefully be able to do new things by accepting external engines in the future. </p>
<p>The stats code refactoring is a small (but important) step towards this goal <img src='http://torum.net/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /></p>
]]></content:encoded>
			<wfw:commentRss>http://torum.net/2008/09/memcached-stats-refactor/feed/</wfw:commentRss>
		</item>
	</channel>
</rss>
