Journal of Storage Engine Development on Drizzle
I’ve decided to start a series of blog entries on the not-so-obvious things I’ve run into while working on my new project. By archiving these findings, I hope to help anyone who looks into developing a storage engine for the MySQL family in the future.
Accumulating these bits of knowledge will also be useful for me, since I can refer back to them when I forget something. Once I have written enough entries, I plan to summarize them and make the summary available on the Drizzle Wiki. If MySQL is interested in updating its engine documentation, I would be more than happy to help there too.
So to begin with, I’ll describe something trivial that I stumbled across while trying to catch an error on duplicate primary key insertion to the data table.
Background
In brief, the database kernel does not care if the INSERT query contains a duplicate primary key for a given table or not. It is the storage engine’s job to tell the kernel that the request was invalid due to key collision. If a storage engine fails to do this, the kernel will acknowledge that the query was successful (given that no other errors were thrown) and will keep doing what it needs to do.
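To make that contract concrete, here is a toy sketch. Everything in it is made up for illustration (ToyTable, both write_row_* functions and kernel_reports_success are hypothetical names, not Drizzle code); only the numeric value of HA_ERR_FOUND_DUPP_KEY comes from drizzled/base.h. It assumes the kernel looks at nothing but the integer status the engine hands back.

```cpp
#include <string>
#include <unordered_map>

// Numeric value as defined in drizzled/base.h.
constexpr int HA_ERR_FOUND_DUPP_KEY = 121;

// Hypothetical stand-in for an engine's table: primary key -> row.
using ToyTable = std::unordered_map<int, std::string>;

// Broken engine: silently overwrites the old row and always reports success.
int write_row_silent(ToyTable &table, int key, const std::string &row) {
    table[key] = row;
    return 0;
}

// Correct engine: refuses the duplicate and tells the caller about it.
int write_row_checked(ToyTable &table, int key, const std::string &row) {
    // emplace() leaves an existing entry untouched and reports that via .second.
    const bool inserted = table.emplace(key, row).second;
    return inserted ? 0 : HA_ERR_FOUND_DUPP_KEY;
}

// Hypothetical kernel: trusts the status code and nothing else.
bool kernel_reports_success(int engine_status) {
    return engine_status == 0;
}
```

With write_row_silent(), inserting key 1 twice looks like two successful queries to this toy kernel, even though the first row is gone. With write_row_checked(), the second insert comes back as 121 and the first row survives.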
Mechanics
Data insertion is handled inside the write_row() function that your engine must implement. The return value of this function is an integer that represents the status of the work it has done. After looking through the possible error statuses in “drizzled/base.h”, I immediately found this:
#define HA_ERR_FOUND_DUPP_KEY 121 /* Dupplicate key on write */
I also looked through MyISAM and InnoDB to confirm that this was indeed the correct error status to return on duplicate primary key. Here is the snippet of my row insertion at the time:
/* TC's tchdbputkeep will not insert a row to the table if there was a collision */
if (tchdbputkeep(data_table, primary_key, primary_key_length,
                 buf, table->s->reclength) == false) {
  my_errno = HA_ERR_GENERIC;
  /* check for primary key collision */
  if (tchdbecode(data_table) == TCEKEEP)
    my_errno = HA_ERR_FOUND_DUPP_KEY;
  return my_errno;
}
At first glance this seems right, but the error I was getting at the command line prompt always differed from the one MyISAM and InnoDB produce, even though I was returning the same error status. Specifically, this is what I was getting:
ERROR 1022 (23000): Can't write; duplicate key in table 't1'
whereas I was getting this error on other engines:
ERROR 1062 (23000): Duplicate entry '1' for key 'PRIMARY'
At this stage I couldn’t make sense of what I was doing wrong but it turned out that the solution was pretty simple.
Solution
After talking to Stewart Smith about my issue in #drizzle @ freenode, it turned out that I am supposed to keep track in write_row() of which key the duplicate was found on, and report it to the kernel via the info() function.
You can do this by setting the errkey integer variable to the key number that is used internally by the kernel. So, recording the internal primary key number with this line in write_row():
share->errkey = table->s->primary_key;
and adding the following code to info():
if (flag & HA_STATUS_ERRKEY) {
  errkey = share->errkey;
}
happily fixed the issue I was experiencing. Yay.
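For the record, here is a toy model of the handshake as I understand it. It is a simplification and not the actual Drizzle code: ToyEngine, ToyShare, NO_KEY, kernel_error_code() and the flag bit chosen for HA_STATUS_ERRKEY are placeholders for this sketch. The assumption is that the kernel, on seeing HA_ERR_FOUND_DUPP_KEY, asks info() for the offending key and falls back to the vaguer error 1022 when the engine does not supply one.

```cpp
// Value from drizzled/base.h; the flag bit and sentinel below are
// placeholders chosen for this sketch only.
constexpr int HA_ERR_FOUND_DUPP_KEY = 121;
constexpr unsigned HA_STATUS_ERRKEY = 32;
constexpr unsigned NO_KEY = static_cast<unsigned>(-1);

// Hypothetical per-table shared state, mirroring share->errkey above.
struct ToyShare {
    unsigned errkey = NO_KEY;
};

class ToyEngine {
public:
    explicit ToyEngine(bool reports_errkey) : reports_errkey_(reports_errkey) {}

    // Pretend the underlying store just hit a collision on key number 0.
    int write_row_duplicate() {
        share_.errkey = 0;  // remember which key collided
        return HA_ERR_FOUND_DUPP_KEY;
    }

    // Hand the remembered key number over when the kernel asks for it.
    void info(unsigned flag) {
        if (reports_errkey_ && (flag & HA_STATUS_ERRKEY)) {
            errkey = share_.errkey;
        }
    }

    unsigned errkey = NO_KEY;  // what the kernel reads after calling info()

private:
    ToyShare share_;
    bool reports_errkey_;
};

// Hypothetical kernel side: choose the client-visible error number.
int kernel_error_code(ToyEngine &engine, int status) {
    if (status != HA_ERR_FOUND_DUPP_KEY) {
        return status;
    }
    engine.info(HA_STATUS_ERRKEY);
    // Without a key number, only the generic duplicate-key error is possible.
    return engine.errkey == NO_KEY ? 1022 : 1062;
}
```

In this model, an engine built with reports_errkey set to false ends up with 1022, which is the state I was stuck in, and one built with true ends up with 1062, which matches what the other engines print.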
I guess reading the section on info() in the documentation gives a hint that this is where you supply the key number on a key error, but frankly, it is really easy to miss since its importance isn’t emphasized.
Anyhow, that’s all I have to say in the first entry of this series, and hopefully I’ll write something more interesting in the upcoming entries. Until then, happy hacking ;)
