Thursday, August 30, 2012

libcouchbase overhauling..

When I laid out the requirements for libcouchbase, one of my goals was that it should be binary compatible, and we've managed to keep that promise pretty well. We've extended the API in a number of ways, but always in a compatible way. We've been able to build client libraries for other languages such as Ruby, PHP, Perl and node.js (and possibly others I've forgotten) on top of libcouchbase, and life has been good.

We've been talking about how great it would be if we could store a "datatype" field on each item you're storing, so that different clients could "do smart things"™ with the objects. Unfortunately I didn't design the API to be extensible enough to make that easy. I would either have to break the API, or extend it with more specialized methods. Ideally I wouldn't break the API, but extending it with all sorts of specialized functions would make it a nightmare to use.

Given that Couchbase server 2.0 is going to give you a lot of cool new stuff, it is a good idea to change the API now, before people start using all of this goodness and the transition to a new API becomes harder than it has to be.

The first major change you'll notice is that we've shortened all of the names from libcouchbase to lcb. The header file is still libcouchbase/couchbase.h and you should still link with -lcouchbase. This change isn't very hard to work around: simply search for libcouchbase_ and replace it with lcb_.

We've also removed a lot of the entry points (all of the _by_key methods), and renamed some of the methods to make the names clearer (get_locked instead of getl etc). Adapting to those changes shouldn't be that hard either. The change that will require some effort is that we've changed the signature of almost all of the functions and the callbacks. I've tried really hard to make the API consistent so that once you know how it works you can "guess" the signature for the function you're going to use. The common form is:

    lcb_error_t lcb_operationname(lcb_t instance,
                                  const void *cookie,
                                  int number,
                                  const lcb_operation_cmd_t * const * commands);

What?? I have to send in an array of pointers to commands?? You might think that sucks and I'm only trying to make your life as a user miserable, but there is actually a good reason for not using an array of objects. To explain that, let's start by looking at the internals of these lcb_operation_cmd_t structures. In order to allow us to extend the functionality in the future without breaking binary compatibility, we're using versioned structs. Let's take a look at the operation struct you would use to retrieve an object:

    typedef struct {
        int version;
        union {
            struct {
                const void *key;
                lcb_size_t nkey;
                lcb_time_t exptime;
            } v0;
        } v;
    } lcb_get_cmd_t;

The version member in the structure tells us how we should interpret the rest of the structure. Given that the size of this struct may vary over time, we can't pass in an array of such objects (well, we could, but the internals of the function would have a harder time figuring out the offset of the next object, since it would need to know the alignment and struct packing you used).
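To make the version dispatch concrete, here is a minimal, self-contained sketch. It does not use libcouchbase at all: demo_cmd_t and demo_total_key_length are hypothetical names, and the v1 layout with its datatype field is invented purely for illustration. It only shows how a function that receives an array of pointers can read the version member of each command and pick the matching layout, without ever computing the stride between objects.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical versioned command, modelled on lcb_get_cmd_t.
 * The v1 layout is an invented extension, not a real libcouchbase one. */
typedef struct {
    int version;
    union {
        struct {
            const void *key;
            size_t nkey;
        } v0;
        struct {
            const void *key;
            size_t nkey;
            int datatype; /* invented field for the sketch */
        } v1;
    } v;
} demo_cmd_t;

/* Walk an array of pointers to commands and sum the key lengths.
 * Each command is interpreted according to its own version member.
 * Returns -1 if a command carries a version this code does not know. */
long demo_total_key_length(size_t num, const demo_cmd_t *const *cmds)
{
    long total = 0;
    size_t i;

    assert(cmds != NULL || num == 0);
    for (i = 0; i < num; ++i) {
        switch (cmds[i]->version) {
        case 0:
            total += (long)cmds[i]->v.v0.nkey;
            break;
        case 1:
            total += (long)cmds[i]->v.v1.nkey;
            break;
        default:
            return -1; /* unknown layout: refuse to guess */
        }
    }
    return total;
}
```

Because the function only ever follows the pointers it was handed, a caller can mix v0 and v1 commands in the same call, and the function never needs to know how large the caller believed the struct to be.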

So let's show you a working example of how you would use the new API to retrieve a value from your cache.

In C99 I would typically write something like this:

    const lcb_get_cmd_t c = { .version = 0,
                              .v.v0 = {
                                .key = "mykey",
                                .nkey = 5
                              }
                            };
    const lcb_get_cmd_t* cmds[] = { [0] = &c };
    lcb_get(instance, NULL, 1, cmds);

We've added constructors for the objects in C++, so all you need there is:

    const lcb_get_cmd_t c("mykey", 5);
    const lcb_get_cmd_t* cmds[] = { &c };
    lcb_get(instance, NULL, 1, cmds);

We're also using the same method in the callbacks (I've tried to make them as consistent as possible, just like the commands), so the get callback now looks like:

    void get_callback(lcb_t instance,
                      const void *cookie,
                      lcb_error_t error,
                      const lcb_get_resp_t *resp)

A typical implementation could look like:

    static void get_callback(lcb_t instance,
                             const void *cookie,
                             lcb_error_t error,
                             const lcb_get_resp_t *resp)
    {
        if (error == LCB_SUCCESS) {
            if (resp->version != 0) {
                // I don't know what this object looks like
                // Do some proper error handling
                return;
            }

            // Use the information in resp->v.v0

        } else {
            // Do error handling for misses/errors
        }
    }
We've tried to update the documentation in the header file with more information and examples for each of the entry points. Please don't hesitate to drop me an email (or ask questions on #libcouchbase on IRC).

We're in the middle of updating the clients to use the new API and writing more and more automated tests!

Happy hacking!


  1. Any chance that libcouchbase works against a stock memcached server (assuming I don't call things that memcached doesn't support like getl)? I'm coding in C++ on Windows and the libmemcached folks just don't seem to be interested in supporting a Windows version. My original plan was to use couchbase but Management shot that idea down. I know that you did a Windows port years ago but it is horribly out of date.

    1. It is supposed to work already through the "compatibility" layer. I haven't "overhauled" this interface (yet), so we might change this as well. Right now the following should work:

      struct lcb_memcached_st memcached;
      lcb_t instance;

      memset(&memcached, 0, sizeof(memcached));
      memcached.serverlist = "localhost:11211;localhost:11212";

      /* Call outside assert() so it still runs when NDEBUG is defined */
      lcb_error_t err = lcb_create_compat(LCB_MEMCACHED_CLUSTER, &memcached, &instance, NULL);
      assert(err == LCB_SUCCESS);

  2. If multiple lcb_get_cmd_t are passed to lcb_get, will get_callback be called only once with all of the values, or can it be called multiple times, each time with part of the values?

    1. You'll receive one callback per lcb_get_cmd_t you passed along.

  3. What will happen to an existing key in the cluster if I set the exptime to zero or to a non-zero value? In particular, I wonder whether the expiry is updated when exptime is set to a non-zero value, and whether the existing expiry is kept when exptime is set to zero.
