Drizzle Storage Engine Dev: Determining Query Type

November 25th, 2009

Determining what kind of SQL query is requested at the handler level is pretty important for BlitzDB, since the strategy is to obtain the most suitable lock for a given request. Unfortunately, there is no intuitive way to get this information. So I took a peek into InnoDB’s source code and found my solution (open source saves the day, as usual).

Solution

In Drizzle, there is a function called session_sql_command(Session *session) which returns an integer that corresponds to one of the command type constants (which are accessible from the engine). Ideally I would like to call this function from anywhere in the engine, but since it requires a Session object as an argument, I could only call it from store_lock(), which receives one.

My solution was to add a member variable to the handler class and assign the command type to it in store_lock(). This works because store_lock() is called before any other API function. The concern is that store_lock() is planned for removal in the future.
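To make the idea concrete, here is a minimal sketch of caching the command type in store_lock(). It is not BlitzDB's actual code: Session, the SQLCOM_* enum, the body of session_sql_command(), the class name ha_blitz_sketch, and the one-argument store_lock() signature are all stand-ins invented so that the snippet compiles on its own. The real Drizzle definitions are different and richer.

```cpp
#include <cassert>

// Stand-ins for Drizzle types so the sketch compiles on its own.
// The real constants and their values are defined by drizzled.
enum SqlCommand { SQLCOM_SELECT, SQLCOM_UPDATE, SQLCOM_INSERT, SQLCOM_DELETE };

struct Session {
  int sql_command;  // hypothetical field; the real Session is far richer
};

// Same name as the accessor mentioned in the post; this body is only a stub.
int session_sql_command(Session *session) { return session->sql_command; }

class ha_blitz_sketch {
public:
  ha_blitz_sketch() : sql_command_type(-1) {}

  // Simplified signature (assumption): the real store_lock() takes more
  // arguments, but the Session is the only one this sketch needs.
  void store_lock(Session *session) {
    assert(session != nullptr);
    sql_command_type = session_sql_command(session);
  }

  // Later API calls read the cached value instead of needing a Session.
  int cached_command() const { return sql_command_type; }

private:
  int sql_command_type;  // -1 until store_lock() has run
};
```

The -1 sentinel is a choice made for this sketch; it lets later calls detect that store_lock() has not run yet.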

Now I can do things like:

int ha_blitz::rnd_init(bool drizzled_will_scan) {
  /* sql_command_type was assigned earlier, in store_lock() */
  if (sql_command_type == SQLCOM_UPDATE) {
    /* get the most suitable lock type for this task */
  } else if (sql_command_type == SQLCOM_SELECT) {
    /* get the most suitable lock type for this task */
  }
  ...
}
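What "the most suitable lock type" means is up to the engine. As a rough illustration only, the sketch below maps a cached command type to a lock mode. The LockMode enum, the choose_lock() helper, the stand-in SQLCOM_* values, and the policy itself (shared for reads, exclusive for everything else) are assumptions, not BlitzDB's real locking scheme.

```cpp
#include <cassert>

// Hypothetical command constants and lock modes, for illustration only.
enum SqlCommand { SQLCOM_SELECT, SQLCOM_UPDATE, SQLCOM_INSERT, SQLCOM_DELETE };
enum LockMode { LOCK_SHARED, LOCK_EXCLUSIVE };

// Assumed policy: reads share the table, everything else takes it exclusively.
// Commands not listed here fall through to the exclusive lock as a safe default.
LockMode choose_lock(int sql_command_type) {
  // A negative value means the command type was never recorded
  // (a convention of this sketch, not of Drizzle).
  assert(sql_command_type >= 0);
  switch (sql_command_type) {
    case SQLCOM_SELECT:
      return LOCK_SHARED;
    case SQLCOM_UPDATE:
    case SQLCOM_INSERT:
    case SQLCOM_DELETE:
    default:
      return LOCK_EXCLUSIVE;
  }
}
```

Defaulting to the exclusive lock trades some concurrency for safety when an unexpected command type shows up.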

Personal Request to Drizzle

Although I would like to see store_lock() disappear from the storage engine API, I would like storage engines (technically worker threads) to have the ability to gather meta information on the query before any real work is done.

My request is for store_lock() to become something along the lines of gather_information(), which gives the handler (or worker threads) a chance to gather information about the query. Needless to say, drizzled must call this function before any other API calls are made.
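One way to picture the request is sketched below. Nothing here exists in Drizzle: QueryInfo, CursorSketch, BlitzCursorSketch, and run_scan() are hypothetical names, and run_scan() merely stands in for the kernel to show the ordering guarantee being asked for.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical description of the query, handed to the engine up front.
struct QueryInfo {
  int command_type;  // e.g. one of the SQLCOM_* constants
};

// Hypothetical engine-side interface; not the actual Drizzle API.
class CursorSketch {
public:
  virtual ~CursorSketch() {}
  virtual void gather_information(const QueryInfo &info) = 0;
  virtual int rnd_init(bool scan) = 0;
};

class BlitzCursorSketch : public CursorSketch {
public:
  BlitzCursorSketch() : command_type(-1) {}

  void gather_information(const QueryInfo &info) override {
    command_type = info.command_type;
    calls.push_back("gather_information");
  }

  int rnd_init(bool) override {
    // Relies on the guarantee that gather_information() ran first.
    assert(command_type >= 0);
    calls.push_back("rnd_init");
    return 0;
  }

  int command_type;                // -1 until gather_information() runs
  std::vector<std::string> calls;  // call order, recorded for inspection
};

// Stand-in for the kernel: the hook is invoked before any other API call.
int run_scan(CursorSketch &cursor, const QueryInfo &info) {
  cursor.gather_information(info);
  return cursor.rnd_init(true);
}
```

The point of the sketch is the ordering in run_scan(): the engine can trust in rnd_init() that the query information has already been delivered.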

Toru Maesaka (drizzle, oss)

  1. November 25th, 2009 at 17:37 | #1

    Hi!

    Why not use the info call?

    store_lock() is no longer called for anything but non-temp engines in certain cases. It will be going away completely sometime in the next few weeks.

    Cheers,
    -Brian

  2. November 25th, 2009 at 17:47 | #2

    How would you like the API to be? What about having a GPB message that is returned to the engine and describes the query in question? Before I code something up, though, I need to know the information that you would most like to see in the message sent from the kernel. The command type and what other attributes?

    Cheers!

    Jay

  3. November 26th, 2009 at 03:08 | #3

    Hi Brian,

    Glad to hear store_lock() is going away soon! Goes well with my philosophy that engines should have full responsibility for concurrency control.

    As for info(), I don’t think this will work since the session object needs to be accessible to get meaningful information. The reason I did what I did in store_lock() is:

    (1) Session object is accessible through the argument.
    (2) It is guaranteed to run before any other calls.

    I think Jay’s suggestion is spot-on.

  4. November 26th, 2009 at 03:15 | #4

    Hi Jay,

    I think using protobuf would be fantastic. It would allow us to keep adding attributes to it as Drizzle matures without screwing up storage engines.

    Right now, I can only think of “command_type” for attributes but I think that would be enough to start with since we can keep adding what we want to the .proto definition as we come up with useful attributes.

    I think existing and upcoming new engines would benefit quite a bit from this model of providing query attributes via protobuf. Definitely helps as a “hint” for engines to optimize the task. Opposite of what engines do for the optimizer in info().

  5. November 27th, 2009 at 00:59 | #5

    I’d prefer not to use a protobuf message here… they’re very much for data serialisation, not for passing things around inside a program (especially in performance critical parts).

    As we improve optimizer and execution engine parts of code, we should end up with a much nicer interface for engines to look at.

    However, as a short term solution, it should be possible for engines to look at the Statement while running Cursor methods.

  6. November 27th, 2009 at 01:00 | #6

    (also, this is another good example on why we should merge plugins… then we’d immediately see the BlitzDB usage and have to think about it on a global scale, not just leave it for Toru to find and fix the world :)

  7. November 28th, 2009 at 05:44 | #7

    Hi Stewart,

    Awesome. Right now, I’m happy as long as there’s a way for me to get statement information so I’m glad to hear that it might be possible to pull this off. It would be even better if this could be done before Brian deletes store_lock() from the API. Guess I should take a look at how to pull this off in drizzled’s code too :)

    I’d love to get BlitzDB merged with the trunk but it’s not quite there yet. I’m nearly done rewriting the base of it with a new design and locking mechanism so we could talk about it once that’s done… Ideally I want you guys to test it when I support at least one secondary index.
