Drizzle, BlitzDB and HTON_STATS_RECORDS_IS_EXACT
Recently I enabled HTON_STATS_RECORDS_IS_EXACT in BlitzDB to let the optimizer know that BlitzDB can instantaneously return the number of rows in a specified table. As a result, the Drizzle kernel can call the Cursor::info() function directly to get the row count. For users, this means that SELECT COUNT statements can be executed in O(1) time. So it’s a great thing in general.
Something Broke
After I enabled HTON_STATS_RECORDS_IS_EXACT, I noticed that issuing a SELECT statement on a table with 1 row would no longer return a result set. Weird indeed! After investigating with GDB, I noticed that when HTON_STATS_RECORDS_IS_EXACT is enabled, rnd_next() is only called once instead of twice on a table with 1 row (the second call is there to find EOF). This makes sense because the kernel knows that there is only 1 row and therefore doesn’t need to keep scanning for EOF. However, this made me scratch my head, since it shouldn’t break BlitzDB’s table scanner.
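To make the difference in call counts concrete, here is a toy model in C++. It is only a sketch: ToyCursor, scan_until_eof() and scan_exact() are names made up for illustration and are not part of Drizzle or BlitzDB. The only thing borrowed from the real engine API is the idea of an rnd_next() call that either yields a row or reports the end of the table.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for an engine cursor; NOT the real Drizzle Cursor API.
struct ToyCursor {
  std::vector<int> rows;     // table contents, one int per row
  std::size_t pos = 0;       // current scan position
  int rnd_next_calls = 0;    // how many times the "kernel" called rnd_next()

  explicit ToyCursor(std::vector<int> r) : rows(std::move(r)) {}

  // Returns true and writes a row, or returns false at end of table.
  bool rnd_next(int &out) {
    ++rnd_next_calls;
    if (pos >= rows.size()) return false;
    out = rows[pos++];
    return true;
  }
};

// No trusted row count: keep calling rnd_next() until it reports EOF.
inline std::vector<int> scan_until_eof(ToyCursor &c) {
  std::vector<int> result;
  int row = 0;
  while (c.rnd_next(row)) result.push_back(row);
  return result;
}

// Exact row count known up front: fetch that many rows and stop,
// so the extra call that would only discover EOF never happens.
inline std::vector<int> scan_exact(ToyCursor &c, std::size_t exact_rows) {
  std::vector<int> result;
  int row = 0;
  for (std::size_t i = 0; i < exact_rows && c.rnd_next(row); ++i)
    result.push_back(row);
  return result;
}

// Small self-check for a one-row table.
inline void scan_demo() {
  ToyCursor a(std::vector<int>{42});
  assert(scan_until_eof(a).size() == 1 && a.rnd_next_calls == 2);
  ToyCursor b(std::vector<int>{42});
  assert(scan_exact(b, 1).size() == 1 && b.rnd_next_calls == 1);
}
```

Both scans return the same single row; the exact-count version just gets there with one rnd_next() call instead of two, which matches what I saw in GDB.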
Remedy
Logically, I was confident that BlitzDB’s table scanner was functioning properly, so I decided to look at what was going on beyond the engine API. It turns out that join_read_system() in sql_select.cc looks at the table->status value and treats it as an error if the value isn’t 0. What do you know? I realized that I wasn’t assigning anything to the status variable. In fairness, I didn’t know that I was meant to update an internal structure; you’d think that engine developers aren’t meant to touch those. It’s not mentioned in the Engine Documentation at MySQL Forge either. Nevertheless, the important thing is that it works now. Oh, and SELECT COUNT is fast now too.
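The shape of the bug can be sketched with another toy model. Everything here is hypothetical: ToyTable, ToyEngine and read_single_row() are illustrative names, not the real Drizzle structures, and the assumption that status starts out non-zero is mine. The point is only that a read which succeeds, but leaves status untouched, looks like "no row" to a caller that checks status.

```cpp
#include <cassert>

// Hypothetical, simplified model; NOT Drizzle's real table structure.
struct ToyTable {
  int status = 1;  // assumption: non-zero means "no valid row in the buffer"
  int record = 0;  // stand-in for the row buffer
};

// Hypothetical engine with a switch to reproduce the bug.
struct ToyEngine {
  bool update_status;  // false reproduces the bug, true models the fix
  int stored_row;      // the single row held by this one-row "table"

  // Returns 0 on success, in the style of a storage engine read call.
  int rnd_next(ToyTable &table) const {
    table.record = stored_row;
    if (update_status) table.status = 0;  // the assignment I had left out
    return 0;
  }
};

// Models a kernel routine that reads the only row of a one-row table
// and then inspects the status field instead of asking for EOF.
inline bool read_single_row(const ToyEngine &engine, ToyTable &table, int &out) {
  if (engine.rnd_next(table) != 0) return false;
  if (table.status != 0) return false;  // row is treated as missing
  out = table.record;
  return true;
}

// Small self-check: same data, different outcome depending on status.
inline void status_demo() {
  int value = 0;
  ToyTable t1;
  assert(!read_single_row(ToyEngine{false, 7}, t1, value));
  ToyTable t2;
  assert(read_single_row(ToyEngine{true, 7}, t2, value) && value == 7);
}
```

In this model the engine copies the row correctly in both cases; only the engine that also sets status to 0 gets its row back to the caller.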
Eye Opener
This experience, along with other occasions where I had to read the kernel’s source, made me think that it would be nice to provide comprehensive, up-to-date documentation on how to develop storage engines for Drizzle in the future (once the API becomes stable). Needless to say, this would be coordinated within the Drizzle community. I’m not a license person, but hopefully it would be released under a free license too.

It would be great for this to be done. I think that only Drizzle is able and willing to make improvements like that.
Hi!
Agreed. It would be even better if we could distribute it in a Kindle-friendly format :) I’m not sure about Drizzle being the only community though. Folks at Monty Program seem dedicated to things like this too.
Personally, my motivation for things like this is to save people from having to follow the same frustrating path as me.
IMHO, Cursor::info() is, well, a complete turd-pile of a method. I’d love to get rid of it and use something like:
Cursor *cursor= engine->getCursor(READ_CONSISTENT);
if (engine->supportsExactRowCount())
{
cursor->getRowCount();
}
Much cleaner, IMHO.
-jay
Jay,
That would be neat too. As long as there’s a way for the kernel to look up this information directly, I have no complaints whatsoever :)