Thursday, December 16, 2010

libmembase - a C interface to Membase

Membase is "on the wire" compatible with any memcached server if you connect to the standard memcached port (registered by myself back in 2009), so you should be able to access membase with any "memcapable" client. Backing this port is our membase proxy named moxi; behind the scenes it does the SASL authentication and proxies your requests to the membase server that holds the item you want. One of the things that sets Membase apart from Memcached is that we store each item in a given vbucket, and each vbucket is mapped to a server. When you grow or shrink the cluster, membase moves vbuckets to new servers.
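To make the two-step lookup concrete, here is a minimal sketch of the idea: a key hashes to a vbucket, and a separate table says which server currently owns that vbucket. Everything named `sketch_*` is made up for illustration, the hash is a stand-in (FNV-1a) and not necessarily what Membase uses, and the round-robin "rebalance" is only an assumption about how a table might be filled.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical size for illustration; must be a power of two for the mask. */
#define SKETCH_NUM_VBUCKETS 8

/* Stand-in hash (FNV-1a). This is an assumption, NOT Membase's real hash. */
static uint32_t sketch_hash(const void *key, size_t nkey)
{
    const unsigned char *p = key;
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < nkey; ++i) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* Step 1: a key always lands in the same vbucket, whatever the cluster size. */
static uint16_t sketch_vbucket_for_key(const void *key, size_t nkey)
{
    return (uint16_t)(sketch_hash(key, nkey) & (SKETCH_NUM_VBUCKETS - 1));
}

/* Step 2: a table maps each vbucket to the index of the server owning it. */
typedef struct {
    int server_for_vbucket[SKETCH_NUM_VBUCKETS];
} sketch_vbucket_map;

/* Returns the server index for a key, or -1 if no map was given. */
static int sketch_server_for_key(const sketch_vbucket_map *map,
                                 const void *key, size_t nkey)
{
    if (map == NULL) {
        return -1;
    }
    return map->server_for_vbucket[sketch_vbucket_for_key(key, nkey)];
}

/* Toy "rebalance": spread the vbuckets round-robin over num_servers.
 * Returns 0 on success, -1 on bad arguments. */
static int sketch_rebalance(sketch_vbucket_map *map, int num_servers)
{
    if (map == NULL || num_servers <= 0) {
        return -1;
    }
    for (int i = 0; i < SKETCH_NUM_VBUCKETS; ++i) {
        map->server_for_vbucket[i] = i % num_servers;
    }
    return 0;
}
```

The point of the indirection is that growing or shrinking the cluster only rewrites the vbucket-to-server table; the key-to-vbucket step never changes, so a client just needs a fresh copy of the table.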

There is no such thing as a free lunch, so accessing membase through moxi "costs" more than talking directly to the individual nodes yourself. We like to refer to clients that talk directly to the nodes as "smart clients". As a developer on Memcached I need to test various things, so I went ahead and hacked together a quick prototype of such a library to ease my testing. Initially I wanted to extend libmemcached with this functionality, but that seemed to be a big (and risky) change I didn't have the guts to make at the time.

The library is currently far from production quality and supports only a minimal set of features. So why announce it now? Well, I don't think I'll find the time to implement everything myself, so I'm hoping that people will join me in adding features to the library when they need something that isn't there...

I've designed the library to be 100% callback based and integrated with libevent, making it easy for you to plug it into your application.

So let's say you want to create a TAP stream and listen to all of the modifications that happen in your cluster. All you need to do is:

   struct event_base *evbase = event_init();

   libmembase_t instance = libmembase_create(host, username, passwd, bucket, evbase);
   libmembase_connect(instance);

   libmembase_tap_filter_t filter;
   libmembase_callback_t callbacks = {
      .tap_mutation = tap_mutation
   };
   libmembase_set_callbacks(instance, &callbacks);
   libmembase_tap_cluster(instance, filter, true);

Then you would implement the tap callback function as:

static void tap_mutation(libmembase_t instance, const void *key, size_t nkey, const void *data, size_t nbytes, uint32_t flags, uint32_t exp, const void *es, size_t nes)
{
   // Do whatever you want with the object
}


And that's all you need to do to tap your entire cluster :-) Let's extend the example to tap multiple buckets from the same code.

   struct event_base *evbase = event_init();

   libmembase_t instance1 = libmembase_create(host, username, passwd, bucket1, evbase);
   libmembase_t instance2 = libmembase_create(host, username, passwd, bucket2, evbase);
   libmembase_connect(instance1);
   libmembase_connect(instance2);

   libmembase_tap_filter_t filter;
   libmembase_callback_t callbacks = {
      .tap_mutation = tap_mutation
   };
   libmembase_set_callbacks(instance1, &callbacks);
   libmembase_set_callbacks(instance2, &callbacks);
   libmembase_tap_cluster(instance1, filter, false);
   libmembase_tap_cluster(instance2, filter, false);

   event_base_loop(evbase, 0);

The instance handle is passed to the callback function so you should be able to tell which bucket each mutation event belongs to.
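One way to do that is to keep a small table from handle to bucket name and consult it inside the callback. Here is a minimal sketch of such a table. Note that `sketch_handle_t` is only a stand-in for `libmembase_t` (assumed here to be a pointer-like handle that can be compared for equality), and every `sketch_*` name is invented for the example rather than being part of libmembase.

```c
#include <stddef.h>

/* Stand-in for the opaque libmembase_t handle (an assumption: the real
 * type comes from the library header and is compared by value here). */
typedef const void *sketch_handle_t;

#define SKETCH_MAX_INSTANCES 8

typedef struct {
    sketch_handle_t handle;   /* handle returned when the instance was created */
    const char *bucket;       /* name of the bucket it was created for */
} sketch_entry;

typedef struct {
    sketch_entry entries[SKETCH_MAX_INSTANCES];
    size_t count;
} sketch_registry;

/* Remember which bucket a handle belongs to.
 * Returns 0 on success, -1 if the arguments are bad or the table is full. */
static int sketch_register(sketch_registry *reg, sketch_handle_t handle,
                           const char *bucket)
{
    if (reg == NULL || bucket == NULL || reg->count == SKETCH_MAX_INSTANCES) {
        return -1;
    }
    reg->entries[reg->count].handle = handle;
    reg->entries[reg->count].bucket = bucket;
    reg->count++;
    return 0;
}

/* Look up the bucket for the handle a callback received.
 * Returns NULL if the handle was never registered. */
static const char *sketch_bucket_for(const sketch_registry *reg,
                                     sketch_handle_t handle)
{
    if (reg == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < reg->count; ++i) {
        if (reg->entries[i].handle == handle) {
            return reg->entries[i].bucket;
        }
    }
    return NULL;
}
```

You would register each handle right after creating it, and call the lookup from within the mutation callback using the instance argument it was given.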

As I said, all of the functions in the API are callback based, so if you want to retrieve an object you have to register a callback for get before calling libmembase_mget. For example:

    libmembase_callback_t callbacks = {
        .get = get_callback
    };
    libmembase_set_callbacks(instance, &callbacks);
    libmembase_mget(instance, num_keys, (const void * const *)keys, nkey);

    // If you don't want to run your own event loop, you can call the following method
    // that will run all spooled commands and wait for their replies before breaking out
    // of the event loop
    libmembase_execute(instance);

The signature for the get callback looks like:
void get_callback(libmembase_t instance, libmembase_error_t error, const void *key, size_t nkey, const void *bytes, size_t nbytes, uint32_t flags, uint64_t cas)
{
   // do whatever you want...
}

So what is missing from the library right now?
  • Proper error handling. Right now I'm using asserts and abort() to handle error situations, causing your application to crash... you don't want that in production ;-)
  • Timeouts. Right now it will only time out on TCP timeouts.
  • A lot of operations! I'm only supporting get/add/replace/set...
  • Fetch replicas..
  • Gracefully handle change in the vbucket list
  • +++

Do you feel like hacking on some of them?
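As a starting point for the first item on the list, here is a rough sketch of what "return an error instead of asserting" could look like when validating the 24-byte header of a memcached binary protocol response. This is not code from libmembase; the `sketch_*` names and error codes are made up, and only the header layout (magic 0x81, big-endian fields) comes from the memcached binary protocol.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical error codes; the library's real error type may differ. */
typedef enum {
    SKETCH_SUCCESS = 0,
    SKETCH_EINVAL,     /* NULL argument */
    SKETCH_ESHORT,     /* header not fully received yet, wait for more data */
    SKETCH_EPROTOCOL   /* bytes do not look like a valid response header */
} sketch_error_t;

typedef struct {
    uint8_t opcode;
    uint16_t keylen;
    uint8_t extlen;
    uint16_t status;
    uint32_t bodylen;
    uint32_t opaque;
    uint64_t cas;
} sketch_response_header;

#define SKETCH_HEADER_SIZE 24
#define SKETCH_RESPONSE_MAGIC 0x81

/* All multi-byte header fields are sent in network byte order. */
static uint16_t sketch_read_u16(const uint8_t *p)
{
    return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
}

static uint32_t sketch_read_u32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

static uint64_t sketch_read_u64(const uint8_t *p)
{
    return ((uint64_t)sketch_read_u32(p) << 32) | sketch_read_u32(p + 4);
}

/* Decode a response header, reporting problems to the caller
 * instead of calling assert() or abort(). */
static sketch_error_t sketch_parse_response_header(const uint8_t *buf,
                                                   size_t nbytes,
                                                   sketch_response_header *out)
{
    if (buf == NULL || out == NULL) {
        return SKETCH_EINVAL;
    }
    if (nbytes < SKETCH_HEADER_SIZE) {
        return SKETCH_ESHORT;
    }
    if (buf[0] != SKETCH_RESPONSE_MAGIC) {
        return SKETCH_EPROTOCOL;
    }
    out->opcode = buf[1];
    out->keylen = sketch_read_u16(buf + 2);
    out->extlen = buf[4];
    out->status = sketch_read_u16(buf + 6);
    out->bodylen = sketch_read_u32(buf + 8);
    out->opaque = sketch_read_u32(buf + 12);
    out->cas = sketch_read_u64(buf + 16);
    /* The body must at least hold the extras and the key. */
    if ((uint32_t)out->keylen + out->extlen > out->bodylen) {
        return SKETCH_EPROTOCOL;
    }
    return SKETCH_SUCCESS;
}
```

The caller can then hand the error code to the user's callback and decide whether to wait for more data, drop the connection, or retry, rather than taking the whole application down.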

12 comments:

  1. Hi, I am trying to use your C API to tap the cluster, but I get a segmentation fault that points to "Illegal Syntax on stream header". I am using the examples from the source code, and other operations (storing or getting data) work fine. Any input on this problem?

  2. What does the stream look like??

  3. Hi Trond,
    Does TAP work with the C API? I used to get a segmentation fault. If it does, which is the latest code I should download?

  4. It is supposed to work, but I haven't tried it lately... The latest code is always available from: https://github.com/couchbase/libcouchbase

  5. How do you install the library in Linux?
    ./configure
    make install
    But I don't see a configure file; there is only a configure.ac.

    Thanks

  6. ./config/autorun.sh && ./configure && make install

  7. I get an error saying it could not find m4/version.m4, but there is a pandora_version.m4.
    I am wondering if I should change the configure.ac file to use that.

  8. Hi Trond,
    I am still having trouble installing couchbase.
    When I run ./configure I get this error:

    configure: error: Failed to locate memcached/vbucket.h

    I have libmemcached, libvbucket and libevent2 all installed. There is a libmemcached/memcached/vbucket.h in the include directory, but I can't get configure to point to it.

    How can I get around this error? Thanks.

  9. Hi Trond,
    I was able to install all the libraries needed for libcouchbase.
    Do you have any updated example for tap? Looks like the one you have here is out of date.

    Thanks

  10. You should probably look at: http://trondn.blogspot.com/2011/10/libcouchbase-explore-full-features-of.html

    Cheers,

    Trond

  11. Hi Trond,

    I have looked at the site multiple times, but you have not updated it to use libcouchbase instead of libmembase. I am actually using your code examples, like memdump.c. I get this error:
    "undefined symbol: libcouchbase_create_io_ops"
    Am I missing anything else?
    Here is the code sample:
    libcouchbase_io_opt_st *io = NULL;
    io = libcouchbase_create_io_ops(LIBCOUCHBASE_IO_OPS_DEFAULT, NULL, NULL);

    if (io == NULL) {
        fprintf(stderr, "Failed to create IO instance\n");
        return 1;
    }

    libcouchbase_t instance = libcouchbase_create(host, username,
                                                  passwd, bucket, io);
    if (instance == NULL) {
        fprintf(stderr, "Failed to create libcouchbase instance\n");
        return 1;
    }

  12. More... if I skip the call to libcouchbase_create_io_ops(LIBCOUCHBASE_IO_OPS_DEFAULT, NULL, NULL), libcouchbase_create works fine, but I get a segmentation fault when calling libcouchbase_connect.

    It's hard to tell where the problem is.
