Monday, September 17, 2012

libcouchbase is async, but my app isn't...

In the previous example I showed you how to hook libcouchbase into your own event loop, but not everyone wants to use it asynchronously. The best part of an asynchronous library is that it's pretty easy to build a synchronous library on top of it. All you need to do is call the function and wait for it to complete, but in libcouchbase I went the extra mile. You can use it fully asynchronously, fully synchronously, or somewhere in between.

The easiest way to use libcouchbase synchronously is to switch its internals to the syncmode setting by calling:

lcb_behavior_set_syncmode(instance, LCB_SYNCHRONOUS);

One "drawback" with the syncmode setting is that you can't execute a chunk of commands concurrently. Let's look at the following example:

lcb_store(instance, NULL, 15, store_commands);
lcb_get(instance, NULL, 2, get_commands);

With syncmode enabled all of the store commands must have been executed and the response received before any of the get commands is sent (even if they don't hit the same servers). It would improve the latency of your application if you could shuffle as much data as possible over the network before blocking to wait for the result. That's when you want to set the syncmode to LCB_ASYNCHRONOUS (the default) and use lcb_wait() instead:

lcb_store(instance, NULL, 15, store_commands);
lcb_get(instance, NULL, 2, get_commands);
lcb_wait(instance);

When lcb_wait() returns we know that all of the above commands have been executed (and the appropriate callbacks called).
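The difference between the two modes can be sketched with a toy model that uses no libcouchbase code at all. Everything below (toy_client, toy_schedule, toy_wait, toy_run) is a hypothetical stand-in invented for illustration; it only counts how often the caller has to block for the store-then-get sequence shown above.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a client instance; NOT a libcouchbase type. */
enum toy_mode { TOY_ASYNCHRONOUS, TOY_SYNCHRONOUS };

struct toy_client {
    enum toy_mode mode;
    size_t pending;        /* commands scheduled but not yet answered */
    size_t completed;      /* commands whose callback has run */
    size_t blocking_waits; /* how many times the caller had to block */
};

/* Block until every pending command has been answered. */
void toy_wait(struct toy_client *c)
{
    assert(c != NULL);
    if (c->pending == 0) {
        return; /* nothing outstanding, nothing to block on */
    }
    c->completed += c->pending;
    c->pending = 0;
    c->blocking_waits++;
}

/* Schedule a batch; in synchronous mode this also waits for the batch. */
void toy_schedule(struct toy_client *c, size_t ncommands)
{
    assert(c != NULL);
    c->pending += ncommands;
    if (c->mode == TOY_SYNCHRONOUS) {
        toy_wait(c);
    }
}

/* Run the store(15) + get(2) sequence and report the number of blocking waits. */
size_t toy_run(enum toy_mode mode)
{
    struct toy_client c = { mode, 0, 0, 0 };

    toy_schedule(&c, 15); /* stands in for lcb_store(instance, NULL, 15, ...) */
    toy_schedule(&c, 2);  /* stands in for lcb_get(instance, NULL, 2, ...) */
    toy_wait(&c);         /* stands in for lcb_wait(instance) */
    return c.blocking_waits;
}
```

In this model the synchronous run blocks once per call (twice in total), while the asynchronous run schedules all 17 commands and blocks a single time in the final wait.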

So let's whip up a small program that shows you how to use the syncmode setting. To keep the example as small as possible I'm going to skip error recovery etc. Let's jump straight to the main() function:

int main(void)
{
    lcb_error_t error;
    lcb_t instance = create_instance();

    if ((error = lcb_connect(instance)) != LCB_SUCCESS) {
        fprintf(stderr, "Failed to connect to cluster: %s\n",
                lcb_strerror(instance, error));
        exit(EXIT_FAILURE);
    }

    set_key(instance);
    get_key(instance);

    lcb_destroy(instance);
    exit(EXIT_SUCCESS);
}

As you can see, the program is pretty simple. First we create a libcouchbase instance and connect to the cluster, then we set and get a key before we clean up and exit.

Let's look inside create_instance:

static lcb_t create_instance(void)
{
    lcb_t instance;
    struct lcb_create_st copt;
    lcb_error_t error;

    memset(&copt, 0, sizeof(copt));
    copt.v.v0.host = "myserver:8091";

    if ((error = lcb_create(&instance, &copt)) != LCB_SUCCESS) {
        fprintf(stderr, "Failed to create libcouchbase instance: %s\n",
                lcb_strerror(NULL, error));
        exit(EXIT_FAILURE);
    }

    lcb_behavior_set_syncmode(instance, LCB_SYNCHRONOUS);
    lcb_set_error_callback(instance, error_handler);
    lcb_set_store_callback(instance, store_handler);
    lcb_set_get_callback(instance, get_handler);

    return instance;
}

This looks pretty much like how we normally do things, except for the call to lcb_behavior_set_syncmode(). From that line on, all of the lcb_ functions will block until we've received a reply from the server for the requested operation. You might be curious what my handler functions look like; they are pretty boring:

static void error_handler(lcb_t instance, lcb_error_t err, const char *info)
{
    fprintf(stderr, "FATAL! an error occurred: %s (%s)\n",
            lcb_strerror(instance, err), info ? info : "none");
    exit(EXIT_FAILURE);
}

static void store_handler(lcb_t instance, const void *cookie,
                          lcb_storage_t operation, lcb_error_t error,
                          const lcb_store_resp_t *resp)
{
    (void)cookie; (void)operation;

    if (error != LCB_SUCCESS) {
        fprintf(stderr, "Failed to store the key on the server: %s\n",
                lcb_strerror(instance, error));
        exit(EXIT_FAILURE);
    }

    if (resp->version == 0) {
        fprintf(stdout, "Successfully stored \"");
        fwrite(resp->v.v0.key, 1, resp->v.v0.nkey, stdout);
        fprintf(stdout, "\"\n");
    }
}

static void get_handler(lcb_t instance, const void *cookie,
                        lcb_error_t error, const lcb_get_resp_t *resp)
{
    (void)cookie;

    if (error != LCB_SUCCESS) {
        fprintf(stderr, "Failed to read the key from the server: %s\n",
                lcb_strerror(instance, error));
        exit(EXIT_FAILURE);
    }

    /* Validate that I read the correct key and value back */
    if (resp->version != 0) {
        fprintf(stderr,
                "WARNING: I don't support this version of libcouchbase\n");
        exit(EXIT_FAILURE);
    }

    fprintf(stdout, "I received \"");
    fwrite(resp->v.v0.key, 1, resp->v.v0.nkey, stdout);
    fprintf(stdout, "\" with the value: [");
    fwrite(resp->v.v0.bytes, 1, resp->v.v0.nbytes, stdout);
    fprintf(stdout, "]\n");
}

If we jump back to our main function the next instructions to execute would be:

    if ((error = lcb_connect(instance)) != LCB_SUCCESS) {
        fprintf(stderr, "Failed to connect to cluster: %s\n",
                lcb_strerror(instance, error));
        exit(EXIT_FAILURE);
    }

Now that we've enabled LCB_SYNCHRONOUS mode in libcouchbase, this is a blocking call (I fixed that bug today, September 17th). When it returns, we know the topology of the Couchbase cluster and are ready to store our object there. That happens in the set_key function, which looks like this:

static const char * const key = "mykey";
static const char * const value = "myvalue";

static void set_key(lcb_t instance)
{
    lcb_store_cmd_t cmd;
    lcb_error_t error;
    const lcb_store_cmd_t * const commands[] = { &cmd };

    memset(&cmd, 0, sizeof(cmd));
    cmd.v.v0.key = key;
    cmd.v.v0.nkey = strlen(key);
    cmd.v.v0.bytes = value;
    cmd.v.v0.nbytes = strlen(value);
    cmd.v.v0.operation = LCB_SET;

    if ((error = lcb_store(instance, NULL, 1, commands)) != LCB_SUCCESS) {
        fprintf(stderr, "Failed to store key: %s\n",
                lcb_strerror(instance, error));
        exit(EXIT_FAILURE);
    }
}


This looks exactly like what we would have written in an asynchronous application, and there is one very important detail: the return code of the function is not the return code of the actual operation! It means exactly what it always has: whether we encountered a problem initiating the operation. It is the callback function that tells you the result of the operation from the server!
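To make the two levels of error reporting concrete, here is a minimal sketch that does not use libcouchbase itself. The names toy_status, op_result, record_result and toy_get are hypothetical stand-ins: toy_get plays the role of a blocking lcb_get, and the callback records the outcome in a caller-owned struct reached through the cookie.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical status codes; stand-ins for lcb_error_t values. */
enum toy_status { TOY_SUCCESS, TOY_BAD_ARGUMENT, TOY_KEY_NOT_FOUND };

/* Caller-owned record the callback fills in; reached through the cookie. */
struct op_result {
    int called;             /* set to 1 once the callback has run */
    enum toy_status status; /* outcome reported by the "server" */
};

typedef void (*toy_callback)(const void *cookie, enum toy_status status);

/* Callback: records the outcome of the operation itself. */
void record_result(const void *cookie, enum toy_status status)
{
    /* The cookie is const for the library's sake; the caller owns the data. */
    struct op_result *res = (struct op_result *)cookie;

    assert(res != NULL);
    res->called = 1;
    res->status = status;
}

/*
 * Toy blocking "get". The return value only says whether the request could
 * be initiated; the outcome of the operation is delivered to the callback.
 */
enum toy_status toy_get(const char *key, const void *cookie, toy_callback cb)
{
    if (key == NULL || cb == NULL) {
        return TOY_BAD_ARGUMENT; /* never sent, so the callback is not invoked */
    }
    /* Pretend the server only knows the key "mykey". */
    cb(cookie, strcmp(key, "mykey") == 0 ? TOY_SUCCESS : TOY_KEY_NOT_FOUND);
    return TOY_SUCCESS;          /* the request itself was initiated fine */
}
```

Asking this toy for an unknown key returns TOY_SUCCESS from toy_get, because the request went out, while the op_result filled in by the callback carries TOY_KEY_NOT_FOUND. A caller that only checks the return value would miss the failure.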

To make the example complete let's look at the get_key function as well:

static void get_key(lcb_t instance)
{
    lcb_get_cmd_t cmd;
    lcb_error_t error;
    const lcb_get_cmd_t * const commands[] = { &cmd };

    memset(&cmd, 0, sizeof(cmd));
    cmd.v.v0.key = key;
    cmd.v.v0.nkey = strlen(key);

    if ((error = lcb_get(instance, NULL, 1, commands)) != LCB_SUCCESS) {
        fprintf(stderr, "Failed to get key: %s\n",
                lcb_strerror(instance, error));
        exit(EXIT_FAILURE);
    }
}

Pretty simple and straightforward.

Happy hacking!

9 comments:

  1. It is good to have this level of discussion about how this works, but it is far from a complete explanation of how one would typically actually use a database get method that returns the requested data to a callback function. Two questions:

    1) How hard would it be for Couchbase to provide a Synchronous get method in C which actually returns the result and data to the initiating method rather than a callback?

    2) Is there an example anywhere that shows how someone might actually use a get callback to put data into a real object, and do something useful with it? Writing data to stdout is a long way from showing how someone would write some code to really use the callbacks in a useful way. I am imagining that the cookie needs to be used to somehow communicate what the incoming data is and/or what to do with it. Presumably, when someone asks for data, it's because they want to store it someplace and/or use it for something, and the calling thread that wanted to do that, can only put in a future request and leave notes of some sort to itself to continue doing what it had in mind, after the callback returns with the needed data. Is there any example or discussion that comes close, or does every C Couchbase developer need to invent their own system for doing this?

  2. I have the exact same question. The example with print statements is way too simplistic to be useful at all. I'm sad that this hasn't seen a response

  3. Hi, If you want a more complex example you could always look at the PHP driver at https://github.com/couchbase/php-ext-couchbase It is using libcouchbase to implement a driver for PHP. There you'll find code to serialize an object from PHP and store it in couchbase and then deserialize it upon retrieval.

    If you have a good idea for an app I could implement as an example (one that won't take forever to write), I'll be happy to post it here.

  4. What would be most useful is if you could just take the simple example given and actually show a way to pass data back from the callback. We've actually managed to get this working with cookies, at least we believe we have, but it took quite a lot of experimentation to figure out something that would work. Also, it's not entirely clear what a cookie is, and how lcb_set_cookie relates to the cookie that is passed to, say, lcb_get cookie parameter (if they are indeed related).

    Replies
    1. There are two ways of passing stuff to the callbacks. You can tie a piece of user-specific data to the instance by using lcb_set_cookie and lcb_get_cookie. Each instance may only hold a _single_ cookie at any given time; setting a new one causes the old one to be forgotten. The other way of passing user data to a callback is to provide a command cookie to the operation you're calling. These cookies are just an address libcouchbase remembers for you. libcouchbase will not try to preserve the value there, so it's up to you to provide whatever you want and ensure that it is something meaningful (for example, passing the address of a stack variable and then returning from the function is not a good idea, because then you don't know what's in that area anymore).

      Did you try to read the man pages for lcb_get_cookie and then the man pages for lcb_get and lcb_set_get_callback ? (or for any of the other operations?). Could you please let me know what you think is missing from those pages so that I can improve them instead?

    2. Yes, read those man pages. So let me see if I understand correctly:

      Cookie is a synonym for a pointer. This pointer must point to pre-allocated memory.

      lcb_set_cookie() allows you to register a pointer to some buffer that will be one-per-instance.

      So, to clarify, this cookie is NOT the cookie that is optionally passed in to the callback. That cookie is passed via the second parameter to, say, lcb_get for instance, this will be passed to the cookie parameter of the callback, correct?

      The const'ness of the cookie parameters also seems to be a bit of an issue, since most stdlib functions to copy data into buffers use non-const pointers. But it's only a warning, so no big deal. Just making sure that we're not missing something here and misusing them.

      If there was an example showing the "appropriate" way to shuffle data out of the callback, I think that would solve a lot of the confusion for people.

      Thanks!

    3. Cookie is a synonym for "whatever you want" that fits into a pointer (64 bits on a 64-bit platform). It may be a pointer, or it may be a counter or whatever. The constness there is to indicate to you as a user that the library won't modify the data in any way.

      So you have a couple of choices on how you want to do this. You could do stuff like:

      static void my_get_callback(lcb_t, const void *cookie, lcb_error_t error, const lcb_get_resp_t *resp)
      {
      int *counter = (int *)cookie;
      /* do something with counter */
      }

      and use it like (in c++):

      int numCallbacks = 0;
      lcb_get_cmd_t *cmds[] = { &cmd };
      lcb_get(instance, &numCallbacks, 1, cmds);

      or you could let the cookie be a pointer to a class/structure you want to initialize with the content of what you're using like:

      static void my_get_callback(lcb_t, const void *cookie, lcb_error_t error, const lcb_get_resp_t *resp)
      {
      User *user = (User *)cookie;
      user->initializeFromStorageFormat( ... );
      }

      and pass a pointer to a newly created object as the command cookie. Alternatively you could set the instance cookie to some data structure whose layout you know and get it back from there.
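One way to read the "pointer to a structure" approach is sketched below without any libcouchbase code. The names value_buffer, copy_value_callback and toy_get_value are hypothetical, and the sketch assumes the response bytes are only valid while the callback runs, which is why the callback copies them into memory the caller owns.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Caller-owned destination for the value; passed as the command cookie. */
struct value_buffer {
    char data[64];  /* fixed capacity keeps the sketch allocation-free */
    size_t length;  /* number of bytes copied into data */
    int truncated;  /* 1 if the value did not fit into data */
};

/* Hypothetical callback: copies the response bytes into the cookie's buffer. */
void copy_value_callback(const void *cookie, const void *bytes, size_t nbytes)
{
    /* The cookie is const for the library's sake; the caller owns the data. */
    struct value_buffer *out = (struct value_buffer *)cookie;

    assert(out != NULL);
    out->truncated = nbytes > sizeof(out->data);
    out->length = out->truncated ? sizeof(out->data) : nbytes;
    memcpy(out->data, bytes, out->length);
}

/*
 * Toy stand-in for a blocking get. The response lives in a temporary buffer
 * that is scrubbed as soon as the callback returns, to mimic memory that the
 * caller must not keep pointing at.
 */
void toy_get_value(const void *cookie,
                   void (*cb)(const void *, const void *, size_t))
{
    char response[] = "myvalue";

    cb(cookie, response, strlen(response));
    memset(response, 0, sizeof(response));
}
```

After toy_get_value(&buf, copy_value_callback) returns, buf holds its own copy of the seven bytes of "myvalue". Because buf outlives the call, this avoids the stack-variable lifetime problem described earlier in this thread.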

    5. Awesome, thank you for the reply. So using the example from:

      http://www.couchbase.com/develop/c/current

      I have modified it as follows:

      https://gist.github.com/nathanejohnson/5413512

      Does this look like a good way to return data / document body from the callback?

      Thanks!
