<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Toru Maesaka &#187; pthread</title>
	<atom:link href="http://torum.net/tag/pthread/feed/" rel="self" type="application/rss+xml" />
	<link>http://torum.net</link>
	<description>Hackaholic and a Web Addict based in Tokyo</description>
	<lastBuildDate>Tue, 28 Feb 2012 10:52:29 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.2</generator>
		<item>
		<title>BlitzLock and RWLOCK Comparison</title>
		<link>http://torum.net/2009/11/blitzlock-and-rwlock-comparison/</link>
		<comments>http://torum.net/2009/11/blitzlock-and-rwlock-comparison/#comments</comments>
		<pubDate>Tue, 24 Nov 2009 12:47:11 +0000</pubDate>
		<dc:creator>Toru Maesaka</dc:creator>
				<category><![CDATA[drizzle]]></category>
		<category><![CDATA[oss]]></category>
		<category><![CDATA[blitzdb]]></category>
		<category><![CDATA[concurrency]]></category>
		<category><![CDATA[locking]]></category>
		<category><![CDATA[parallelism]]></category>
		<category><![CDATA[pthread]]></category>

		<guid isPermaLink="false">http://torum.net/?p=2308</guid>
		<description><![CDATA[Prompted by Jay Pipes, I thought it would be nice to test and publish how BlitzLock performs against what I originally intended to use for BlitzDB (pthread&#8217;s rwlock). So, I asked my colleagues in the operations group at Mixi to test BlitzLock on nice hardware that I don&#8217;t have access to. They [...]]]></description>
			<content:encoded><![CDATA[<p>Prompted by <a href="http://www.jpipes.com/">Jay Pipes</a>, I thought it would be nice to test and publish how BlitzLock performs against what I originally intended to use for BlitzDB (pthread&#8217;s rwlock). So, I asked my colleagues in the operations group at Mixi to test BlitzLock on nice hardware that I don&#8217;t have access to. They kindly accepted and ran the BlitzLock sandbox on a 16-core machine running Fedora.</p>
<p>If you haven&#8217;t read my <a href="http://torum.net/2009/11/blitzdb-and-tc-concurrency-model/">previous entry on BlitzLock</a> and why I started writing it, you should. This entry won&#8217;t make sense otherwise.</p>
<h3>Disclaimer</h3>
<p>Before I go any further, please remember that I&#8217;m not trying to say BlitzLock is better than pthread&#8217;s rwlock. My interest is in writing a lock mechanism that is optimized for Tokyo Cabinet (TC). What I wanted from this test was to see whether BlitzLock has enough potential for me to keep working on it.</p>
<h3>Method</h3>
<p>There were three kinds of workloads: &#8220;Read Oriented&#8221;, &#8220;Write Oriented&#8221; and &#8220;Neutral&#8221;. In the Read Oriented test, each call a thread makes has a 70% probability of being a read routine. The Write Oriented test is the opposite: there is a 70% chance that the call will change the table state. In the Neutral test, read and update calls are equally likely. The seed value for the random number generator was identical for all tests.</p>
<p>Each worker sleeps for 10 milliseconds in the critical section and another 10 milliseconds right after it releases the lock. This was done to encourage context switching. Each test ran for 60 seconds.</p>
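<p>For reference, the worker loop described above can be sketched roughly as follows. This is not the sandbox code: it uses pthread&#8217;s rwlock as the lock under test, and the names (Worker, pick_read, kReadPercent), the per-thread seeds, the thread count and the shortened run time are all assumptions made for illustration.</p>
<pre>#include &lt;pthread.h&gt;
#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;time.h&gt;
#include &lt;unistd.h&gt;

// Hypothetical settings: 70 = Read Oriented, 30 = Write Oriented, 50 = Neutral.
static const int kReadPercent = 70;
static const int kRunSeconds = 1;  // the real tests ran for 60 seconds
static const int kThreads = 4;     // assumed thread count

static pthread_rwlock_t table_lock = PTHREAD_RWLOCK_INITIALIZER;

struct Worker {
  unsigned int seed;  // fixed per thread so every run sees the same mix
  long reads;
  long updates;
};

// Decides whether the next call is a read, with read_percent% probability.
static bool pick_read(unsigned int *seed, int read_percent) {
  return (rand_r(seed) % 100) &lt; read_percent;
}

static void *work(void *arg) {
  Worker *w = static_cast&lt;Worker *&gt;(arg);
  const time_t deadline = time(NULL) + kRunSeconds;
  while (time(NULL) &lt; deadline) {
    if (pick_read(&amp;w-&gt;seed, kReadPercent)) {
      pthread_rwlock_rdlock(&amp;table_lock);
      usleep(10000);  // 10 ms inside the critical section
      w-&gt;reads++;
    } else {
      pthread_rwlock_wrlock(&amp;table_lock);
      usleep(10000);  // 10 ms inside the critical section
      w-&gt;updates++;
    }
    pthread_rwlock_unlock(&amp;table_lock);
    usleep(10000);  // 10 ms after releasing the lock
  }
  return NULL;
}

int main() {
  pthread_t tids[kThreads];
  Worker workers[kThreads];
  for (int i = 0; i &lt; kThreads; i++) {
    workers[i].seed = 42 + i;  // assumed seeding scheme
    workers[i].reads = 0;
    workers[i].updates = 0;
    pthread_create(&amp;tids[i], NULL, work, &amp;workers[i]);
  }
  long reads = 0, updates = 0;
  for (int i = 0; i &lt; kThreads; i++) {
    pthread_join(tids[i], NULL);
    reads += workers[i].reads;
    updates += workers[i].updates;
  }
  printf("reads=%ld updates=%ld\n", reads, updates);
  return 0;
}</pre>
<p>Changing kReadPercent to 30 or 50 would give the Write Oriented and Neutral mixes respectively.</p>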
<p>You can obtain the standalone BlitzLock sandbox <a href="http://torum.net/code/cc/blitzlock.cc">from here</a>. I&#8217;ll upload a test-friendly version that accepts startup options soon (I _really_ need to tidy it up).</p>
<h3>Results</h3>
<p>Below is the result of a load emulation in which there were significantly more read calls than updates.</p>
<p align="center"><a href="http://www.flickr.com/photos/tmaesaka/4130661036/" title="BlitzLock Benchmark (1) by tmaesaka, on Flickr"><img src="http://farm3.static.flickr.com/2729/4130661036_c0c5965bfb.jpg" width="500" height="306" alt="BlitzLock Benchmark (1)" /></a></p>
<p>As seen above, BlitzLock scales the workload nicely without starving update threads. This is important since one of the concerns with the current implementation of BlitzLock is starvation (covered later). I think the read/write ratio above is the sort of ratio that is typically seen in the web industry, and it is the one I&#8217;m most concerned with. So how about a write-intensive application? The next graph shows the result when there are significantly more update operations than reads.</p>
<p align="center"><a href="http://www.flickr.com/photos/tmaesaka/4130695000/" title="BlitzLock Benchmark (2) by tmaesaka, on Flickr"><img src="http://farm3.static.flickr.com/2605/4130695000_20303186f3.jpg" width="500" height="302" alt="BlitzLock Benchmark (2)" /></a></p>
<p>As seen above, BlitzLock scales update tasks nicely without neglecting readers. Compared to the first graph, we&#8217;re seeing the opposite result for update and scanner threads, which is expected due to the <a href="http://torum.net/2009/11/blitzdb-and-tc-concurrency-model/">nature of BlitzLock</a>. This is exactly what I was hoping for. The next graph shows the result when read and update operations are equally likely.</p>
<p align="center"><a href="http://www.flickr.com/photos/tmaesaka/4130023819/" title="BlitzLock Benchmark (3) by tmaesaka, on Flickr"><img src="http://farm3.static.flickr.com/2551/4130023819_19f6e8eba8.jpg" width="500" height="303" alt="BlitzLock Benchmark (3)" /></a></p>
<p>As seen above, the throughput evens out for both read and update operations. I was expecting pthread&#8217;s rwlock to show noticeably lower update throughput than read throughput (since it&#8217;s a single-writer lock), but the two turned out to be even. I&#8217;m not quite sure how I should interpret this, but I guess writer locks had greater priority than reader locks in the environment the test was run in. Nevertheless, this &#8220;even out&#8221; characteristic is something I welcome.</p>
<h3>From Here and Weaknesses</h3>
<p>I&#8217;m convinced that I should keep working on BlitzLock and use it as the default locking mechanism for BlitzDB. Ideally I should code BlitzDB to be able to switch between various locking mechanisms. This would make my life much easier when someone decides to write a locking mechanism that is better than BlitzLock for my use-case.</p>
<p>Thanks to <a href="http://pbxt.blogspot.com/">Paul McCullagh</a>&#8217;s feedback, I&#8217;ve come to realize that BlitzLock was broadcasting more often than it needed to. Functionally it still works, but I should be able to save CPU usage by applying Paul&#8217;s feedback (thanks Paul!). There is also the potential lock starvation problem (when certain types of threads hog the lock) that I need to investigate further. If it&#8217;s going to cause noticeable issues, I&#8217;ll have to add a condition to BlitzLock saying &#8220;only a certain number of threads can obtain a certain lock at once&#8221;.</p>
<p>There is still one more minor piece of scheduling logic that I need to add to BlitzLock, but once I get that done (along with testing), I can integrate BlitzLock into BlitzDB and see how it performs (I can then hack on indexing!).</p>
<p>Yep, there&#8217;s still quite a bit to do but I&#8217;m having fun :)</p>
]]></content:encoded>
			<wfw:commentRss>http://torum.net/2009/11/blitzlock-and-rwlock-comparison/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>BlitzDB and Tokyo Cabinet Concurrency Model</title>
		<link>http://torum.net/2009/11/blitzdb-and-tc-concurrency-model/</link>
		<comments>http://torum.net/2009/11/blitzdb-and-tc-concurrency-model/#comments</comments>
		<pubDate>Thu, 19 Nov 2009 13:29:35 +0000</pubDate>
		<dc:creator>Toru Maesaka</dc:creator>
				<category><![CDATA[drizzle]]></category>
		<category><![CDATA[oss]]></category>
		<category><![CDATA[blitzdb]]></category>
		<category><![CDATA[concurrency]]></category>
		<category><![CDATA[locking]]></category>
		<category><![CDATA[parallelism]]></category>
		<category><![CDATA[pthread]]></category>
		<category><![CDATA[tokyocabinet]]></category>

		<guid isPermaLink="false">http://torum.net/?p=2307</guid>
		<description><![CDATA[Yesterday I sat in front of a whiteboard for a few hours with Mikio, the author of Tokyo Cabinet, discussing/debating what the optimal concurrency model would be for BlitzDB. I think we came to a pretty good conclusion so I&#8217;m going to note it in this entry. But before I go any further, allow me to [...]]]></description>
			<content:encoded><![CDATA[<p>Yesterday I sat in front of a whiteboard for a few hours with Mikio, the author of Tokyo Cabinet, discussing/debating what the optimal concurrency model would be for BlitzDB. I think we came to a pretty good conclusion so I&#8217;m going to note it in this entry. But before I go any further, allow me to go over Tokyo Cabinet&#8217;s concurrency model.</p>
<h3>Tokyo Cabinet&#8217;s Concurrency Model</h3>
<p>Tokyo Cabinet is <a href="http://www.google.com/search?rls=en&#038;q=tokyo+cabinet+single+writer">often described</a> as &#8220;single writer, multi reader&#8221; but this is <strong>not quite</strong> true. At the time of this blog entry, the statement holds true for TC&#8217;s B+Tree database, but TC&#8217;s hash database can actually allow multiple writers to update and/or delete records concurrently.</p>
<p>If you look at the entry point of tchdbput(), you will notice that it actually obtains a reader&#8217;s lock (in terms of <a href="http://en.wikipedia.org/wiki/Readers-writer_lock">rwlock</a>). TCHDB then hashes the provided key and obtains the index of the bucket to which the record of interest belongs. Given the bucket/block to work on, TC then looks at the 8 most significant bits of the hash value and attempts to obtain a granular update lock from a set of 256 mutexes (2 ^ 8 = 256). So, things are still concurrent at this stage, though there is <em>some</em> chance of a collision that would block a thread.</p>
<p>If a record already exists, TC will go on and happily update that block, but if the record is new (as in the key doesn&#8217;t exist), TC will lock the tail block of the database and write the new record there. So, only writing a new record is treated as a single-writer operation and the rest can be processed concurrently. This is why I said it&#8217;s <strong>not quite</strong> true.</p>
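<p>The slot selection described above can be sketched roughly like this. It is not TC&#8217;s code: the hash function is a stand-in (32-bit FNV-1a), and the names slots, hash_key and slot_for are made up for illustration. It only shows how the top 8 bits of a hash value can pick one of 256 mutexes.</p>
<pre>#include &lt;pthread.h&gt;
#include &lt;stdint.h&gt;
#include &lt;stdio.h&gt;
#include &lt;string.h&gt;

// Hypothetical names; this only illustrates the lock striping idea.
static const int kNumSlots = 256;  // 2 ^ 8
static pthread_mutex_t slots[kNumSlots];

// Stand-in hash (32-bit FNV-1a). TC uses its own hash function.
static uint32_t hash_key(const char *key, size_t len) {
  uint32_t h = 2166136261u;
  for (size_t i = 0; i &lt; len; i++) {
    h ^= (unsigned char)key[i];
    h *= 16777619u;
  }
  return h;
}

// The 8 most significant bits of a 32-bit hash select one of 256 slots.
static int slot_for(uint32_t hash) {
  return (int)(hash &gt;&gt; 24);
}

int main() {
  for (int i = 0; i &lt; kNumSlots; i++) pthread_mutex_init(&amp;slots[i], NULL);

  const char *key = "drizzle";
  const int slot = slot_for(hash_key(key, strlen(key)));

  // Two keys block each other only if they land in the same slot.
  pthread_mutex_lock(&amp;slots[slot]);
  // ... update the existing record here ...
  pthread_mutex_unlock(&amp;slots[slot]);

  printf("key '%s' maps to slot %d\n", key, slot);

  for (int i = 0; i &lt; kNumSlots; i++) pthread_mutex_destroy(&amp;slots[i]);
  return 0;
}</pre>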
<h3>BlitzDB&#8217;s Concurrency Model</h3>
<p>With the above in mind, this is what BlitzDB&#8217;s concurrency model looks like:</p>
<ol>
<li>SELECT queries can run concurrently.</li>
<li>SELECT queries are blocked when UPDATE, REPLACE and/or DELETE queries are being processed.</li>
<li>UPDATE, REPLACE, DELETE queries can run concurrently.</li>
<li>INSERT is never blocked by BlitzDB; it is scheduled by TC.</li>
</ol>
<p>In an ideal world, I would allow Drizzle&#8217;s worker threads to _directly_ interact with TC and let TC handle thread synchronization. This would make my life fantastically easy but unfortunately life isn&#8217;t so easy.</p>
<p>For example, if a record is deleted while BlitzDB&#8217;s table scan is occurring, the table scanner will stop scanning at the position where the deleted key existed. I would not have this problem if I used TC&#8217;s native iterator, but my table scan implementation uses TC&#8217;s <a href="http://torum.net/2009/10/iterating-tokyo-cabinet-in-parallel/">hidden API</a> that won&#8217;t babysit me in this regard. In return I gain maximum concurrent read throughput from TC, which was a tradeoff I happily accepted.</p>
<p>So, there are several little gotchas like this that force me to implement concurrency control in BlitzDB. Here&#8217;s how I&#8217;m planning on doing it (with demo code!).</p>
<h3>Implementation (with demo code)</h3>
<p>In the past I&#8217;ve gone through several experimental stages with BlitzDB where I used pthread&#8217;s rwlock to control concurrency. The short answer is, &#8220;IT WORKS!&#8221;. However, it was not taking full advantage of TC&#8217;s concurrency model.</p>
<p>For example, I did not want to protect UPDATE queries with a writer&#8217;s lock since it would block other UPDATE/DELETE queries. So why not protect them with a reader&#8217;s lock? The issue here is that any query that can change the state of the table cannot be processed while a scanner is running (which btw is protected by a reader&#8217;s lock). Furthermore, a non-index based update/delete means that the scanner _is_ running so there&#8217;s a problem there too.</p>
<p>What I need is a scheduler that allows multiple INSERT/UPDATE/REPLACE/DELETE queries to run when the scanner is not running. On the other hand, the scheduler must allow multiple scanners to run when UPDATE/REPLACE/DELETE queries aren&#8217;t being processed _BUT_ still let INSERT queries come through to TC.</p>
<p>Implementing the above is probably possible by using multiple mutexes but it would bring complexity to the codebase and possible deadlocks that can be difficult to debug. So we decided to learn from pthread&#8217;s rwlock implementation and write an original lock mechanism similar to rwlock but something that allows us to write our own rules for scheduling.</p>
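<p>To give a feel for the kind of rules involved, below is a minimal sketch of such a lock built from one mutex and one condition variable. It is not the sandbox code: the class name GroupLock and its method names are made up for illustration, INSERT is assumed to bypass the lock entirely, and nothing here prevents one group from starving the other.</p>
<pre>#include &lt;pthread.h&gt;
#include &lt;stdio.h&gt;

// Hypothetical sketch: scanners share the lock, updaters share the lock,
// but the two groups never hold it at the same time.
class GroupLock {
 public:
  GroupLock() : scanners_(0), updaters_(0) {
    pthread_mutex_init(&amp;mutex_, NULL);
    pthread_cond_init(&amp;cond_, NULL);
  }
  ~GroupLock() {
    pthread_cond_destroy(&amp;cond_);
    pthread_mutex_destroy(&amp;mutex_);
  }
  // SELECT (table scan): wait until no updater holds the lock.
  void scan_begin() { enter(&amp;scanners_, &amp;updaters_); }
  void scan_end() { leave(&amp;scanners_); }
  // UPDATE/REPLACE/DELETE: wait until no scanner holds the lock.
  void update_begin() { enter(&amp;updaters_, &amp;scanners_); }
  void update_end() { leave(&amp;updaters_); }

 private:
  void enter(int *mine, int *other) {
    pthread_mutex_lock(&amp;mutex_);
    while (*other &gt; 0) pthread_cond_wait(&amp;cond_, &amp;mutex_);
    (*mine)++;
    pthread_mutex_unlock(&amp;mutex_);
  }
  void leave(int *mine) {
    pthread_mutex_lock(&amp;mutex_);
    // Only the last thread of a group needs to wake the other group.
    if (--(*mine) == 0) pthread_cond_broadcast(&amp;cond_);
    pthread_mutex_unlock(&amp;mutex_);
  }
  pthread_mutex_t mutex_;
  pthread_cond_t cond_;
  int scanners_;
  int updaters_;
};

static GroupLock table_lock;
static pthread_mutex_t stats_mutex = PTHREAD_MUTEX_INITIALIZER;
static int active[2] = {0, 0};  // [0] = scanners, [1] = updaters in the lock
static int violations = 0;

static void *run(void *arg) {
  const int kind = *static_cast&lt;int *&gt;(arg);  // 0 = scanner, 1 = updater
  for (int i = 0; i &lt; 1000; i++) {
    if (kind == 0) table_lock.scan_begin(); else table_lock.update_begin();
    pthread_mutex_lock(&amp;stats_mutex);
    active[kind]++;
    if (active[1 - kind] &gt; 0) violations++;  // groups must never overlap
    pthread_mutex_unlock(&amp;stats_mutex);
    pthread_mutex_lock(&amp;stats_mutex);
    active[kind]--;
    pthread_mutex_unlock(&amp;stats_mutex);
    if (kind == 0) table_lock.scan_end(); else table_lock.update_end();
  }
  return NULL;
}

int main() {
  pthread_t tids[6];
  int kinds[6] = {0, 0, 0, 1, 1, 1};
  for (int i = 0; i &lt; 6; i++) pthread_create(&amp;tids[i], NULL, run, &amp;kinds[i]);
  for (int i = 0; i &lt; 6; i++) pthread_join(tids[i], NULL);
  printf("violations: %d\n", violations);
  return 0;
}</pre>
<p>The violation counter is only a sanity check that a scanner and an updater are never inside the lock together. A plain rwlock cannot express this rule because its writer side admits only one thread at a time.</p>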
<p>Here&#8217;s my first attempt at a standalone sandbox of the model:</p>
<ul>
<li><a href="http://torum.net/code/cc/blitzlock.cc">http://torum.net/code/cc/blitzlock.cc</a></li>
</ul>
<p>You can compile and run it to get a grasp of how threads are coordinated:</p>
<pre>$ wget http://torum.net/code/cc/blitzlock.cc
$ g++ -Wall -pedantic blitzlock.cc -lpthread &#038;&#038; ./a.out</pre>
<p>If you got the program running and are wondering what the output means, think of the &#8220;updater&#8221; as a thread that performs either UPDATE, REPLACE or DELETE.</p>
<p>There is much more that I&#8217;d love to go on about, but I think I&#8217;ve bloated this entry enough so I will save my urge for another day :)</p>
]]></content:encoded>
			<wfw:commentRss>http://torum.net/2009/11/blitzdb-and-tc-concurrency-model/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
	</channel>
</rss>

