Tuesday, December 11, 2012

IIS, PHP and Windows

In this short blog post I’m going to show you how to use PHP to access your Couchbase cluster on your Windows machine. I have a limited set of hardware at home, so I used a laptop with Microsoft Windows 7 Home Premium to test this example. Unfortunately for you that laptop came with a Norwegian version of Windows, so I can't tell you the exact names of the various menu items.

The first thing we need to do is to install Microsoft Internet Information Services (I'm going to refer to this as IIS) on your computer. You do so by opening "enable/disable windows features" and checking the box for "Internet Information Services". Expand that selection, locate and expand "Webservices", and then locate and expand something like "application development" (sorry for not having an English copy of Windows ;)). In this category check the CGI checkbox. At the level right beneath Internet Information Services you should have an entry called something like "tools for web management", and in this subcategory you should check "IIS-management console".


So let's go ahead and verify that the IIS installation works, and that it doesn’t support PHP already (that would render this blog post pretty useless, right ;)). Go ahead and put the following into c:\inetpub\wwwroot\test.php:

   <?php
   phpinfo();
   ?>


If you try to connect to http://localhost/test.php, you should get a page with an error message. If you get something else your IIS installation may already be configured (and hopefully we'll make it work).


The next thing we need to do is to install PHP and the Couchbase extension. At the time of writing we don’t have a binary package you can install, but it isn’t really hard to build it yourself. If you follow the steps I outlined in http://trondn.blogspot.no/2012/11/building-php-extension-for-couchbase-on.html you should be up’n’running in no time. I followed those instructions when I wrote this example, but I used a slightly different configure command:


configure --disable-all --enable-cli --with-couchbase=shared --enable-json --enable-cgi --with-config-file-scan-dir=C:\php\conf.d


You should now create C:\php\conf.d\couchbase.ini with the following content:



extension=php_couchbase.dll


Now that we’ve got IIS and PHP installed it's time to look at the magic that glues them together. Open up the IIS Service Management Console (as I’ve already said I've got a Norwegian Windows, so I may have used the wrong translation there) and select “Handler Mappings”, and in the actions pane (upper right corner) select “Add Module Mapping”. Fill in the following details starting from the top: “*.php”, “FastCgiModule”, “C:\php\php-cgi.exe” and whatever you want as the description. Be sure to answer “Yes” in the next dialog, which asks if it should create a FastCGI application for this executable (if not it won’t work). You may also want to open the “default document” setting and add index.php as an alternative. By doing that IIS will look for a file named index.php if the URL specifies a directory.

With this in place it’s time to retry our test page. Tune your browser back to http://localhost/test.php, and this time you should see a nice page with a lot of info. Scroll down and you should locate a section with information about the Couchbase extension.

Yay!!! We made that work, but being able to run PHP from IIS isn’t that much fun all by itself. Let’s see if we can get the Couchbase beersample for PHP working!

Go ahead and download and install Couchbase server 2.0 from http://www.couchbase.com/download, and choose that you want to install the beer sample database during configuration.
With that in place you can download the PHP example program I wrote earlier today (I’ll describe that in another blog post) from https://github.com/trondn/beersample-php/archive/master.zip. Unzip that program in C:\inetpub\wwwroot. Before we can try the sample application you have to create two new views in Couchbase. The first one should be named by_name, live in a design document named brewery, and look like:



    function (doc, meta) {
        if (doc.type == "brewery") {
            emit(meta.id, doc.name);
        }
    }


The next one is located in the beer design document and named by_name and looks like:


    function (doc, meta) {
        if (doc.type == "beer") {
            emit(doc.name, doc.brewery_id);
        }
    }


Make sure you remember to press the publish button after you create them, or things won't work.


You should now be able to direct your browser to http://localhost/beersample-php-master and start browsing breweries :)

Happy hacking!

Thursday, November 1, 2012

Building the PHP extension for Couchbase on Microsoft Windows!


I've got a pretty good background in cross-platform work. My first job as a software developer was to port a system to Trusted Solaris (and add privilege bracketing), and since then I've been porting stuff to Windows, HP-UX and Linux, to name a few. It is important to me that software is portable, and that was one of the fixed requirements I had when I started libcouchbase back in the day.

Two days ago I sat down with Matt to plan what I should focus on for the SDK team, and we both felt that we had to give the Windows support for some of our projects a makeover. Given PHP's popularity I figured that I should start with that.

I was kind of surprised to see how easy it was to build PHP with support for Couchbase on Windows. Relax, though: we are going to ship binary versions so that you won't have to do this yourself. Being a geek, I still thought you might find it interesting to read how it's done.

The first thing you have to do is to set up a development environment. I initially used the description from the PHP wiki, with some minor modifications. Feel free to use the steps outlined there, but this is how I did it. First I installed the following software:


  • Windows server 2008 r2
  • Visual studio 2008 professional
  • Windows SDK 6.1
  • git source control system
  • winrar


With that installed, you can open up a "windows sdk" shell and execute the following commands:

setenv /x86 /xp /release
mkdir c:\php-sdk

By using winrar I then extracted the php-sdk-binary-tools-20110915.zip into that directory. I could then create the directories and get ready to build the source.

cd c:\php-sdk
bin\phpsdk_setvars.bat
bin\phpsdk_buildtree.bat php-src

Start up a "git-bash shell", navigate to c:\php-sdk\php-src\vc9\x86, and check out the code. Unfortunately this won't work for you as of today, because not all of the patches for the source code have been pushed through yet:

git clone git://github.com/php/php-src
git clone git://github.com/couchbase/libcouchbase
cd php-src/ext
git clone git://github.com/couchbase/php-ext-couchbase couchbase

We're now ready to build the source! The first thing we need to do is to build libcouchbase (the C library that provides the functionality to talk to the Couchbase cluster):

cd libcouchbase
nmake -f NMakefile all install

With that installed we're ready to build the PHP module that exposes the functionality to PHP:

cd ..\php-src
buildconf
configure --disable-all --enable-cli --with-couchbase=shared
nmake all install
copy ..\deps\bin\libcouchbase.dll c:\php

As you see I'm disabling all modules other than the Couchbase plugin (which I'm building as a shared object). You would have to populate the "deps" directory with other libraries if you wanted to add support for them (but that's beyond the intention of this blog post).

So let's go ahead and try our new php extension. If you don't have a Couchbase cluster running already, now is the time to install one :) With the Couchbase cluster running, you can create a simple php.ini file that tells php to load our module:

extension=c:\php\php_couchbase.dll

I'm pretty sure you all know PHP way better than me, so please forgive me for the stupid code we'll use to show that it's working. I created the file test.php with the following content:

<?php
   $cb = new Couchbase("","","","default");
   $cb->set("hello", "world");
   var_dump($cb->get("hello"));
?>

Running the program with:

cd c:\php
php -f test.php

Should print out:

string(5) "world"

That's all for this time! Happy hacking!

Shift of focus!


I really like going to conferences, not necessarily because I want to attend the different sessions, but because it's a great arena to meet up with users and people interested in the same stuff as I am. Yesterday I attended CouchConf in Berlin, which was a lot of fun. I got to meet old friends, people I've only talked to on IRC (so fun to put a face to the nick) and new people!

As part of attending CouchConf I spent some time talking to Matt Ingenthron about our SDKs, so I'm happy to announce that as of today I'm going to spend a significant amount of time working on our clients. Up until now I've done most of my contributions to the clients in my spare time, so I'm really looking forward to being able to spend my entire day trying to make your life easier.

Friday, September 28, 2012

YANFS - Yet Another Network File System

In the last few blog posts we've been exploring different ways you might want to utilize libcouchbase from your app. To me it has always been really hard to come up with good example apps, so I always end up creating pretty boring applications.

Those of you who attended CouchConf in San Francisco really missed out if you didn't watch the Couchbase Labs demo sessions, where they showed off CBFS. I'm not going to go into any details about what CBFS is (hopefully Dustin Sallings will create a blog post about that), but you can think of CBFS as a large object store analogous to S3 (but meant to run in your own environment).

CBFS is implemented in the Go language (if you haven't looked at that yet, put it on your todo list!), and it's shipped with its own client to perform operations (upload files, list files etc). This gave me a great idea for today's blog post! FUSE is a project that allows you to create a filesystem implementation in user space, so the topic for today is an example where we utilize libcouchbase in the implementation of our FUSE driver.

As usual I won't explain the entire code, but I'll try to comment on the important bits. You'll find the source code for the entire project at http://github.com/trondn/mount_cbfs

The readme in the project should tell you how to install fuse and build the stuff, so in this blog post I'm only going to focus on how it works.

Creating a full featured filesystem driver is a lot of work, so I figured it would be better to start with a minimal implementation with a lot of limitations we could fix up later on (or I would never get around to building it ;))


  • It is fully single threaded. This should be pretty easy to fix up later on by using a pool of libcouchbase instances, no shared data etc.
  • Read only. This limits the number of entry points we need to implement.

So let's start looking at the code and how it all fits together. In main we register the struct containing all of the function pointers to the functions our filesystem supports:

static struct fuse_operations cbfs_oper = {
    .getattr = cbfs_getattr,
    .open = cbfs_open,
    .read = cbfs_read,
    .readdir = cbfs_readdir,
};

This means that whenever someone tries to open a file in our filesystem, cbfs_open is called etc. So let's walk through an example here: what is the flow when you type "ls -l" in your shell?

The first thing that happens is that cbfs_getattr is called. Its responsibility is to populate a stat struct with information about the file. Let's skip the details for now, and just assume that we return that this is a directory. Now cbfs_readdir is called with the directory to retrieve all of the entries in the directory, before cbfs_getattr is called for every file to get the details of the files.
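To make that call order concrete, here is a toy dispatcher that mimics the "ls -l" flow. It is a sketch and not FUSE: toy_fs_ops, toy_getattr, toy_readdir and list_long are names made up for this illustration, and the real fuse_operations callbacks take more arguments than these do.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_ENTRIES 8

/* Hypothetical, simplified stand-in for struct fuse_operations. */
struct toy_fs_ops {
    int (*getattr)(const char *path, int *is_dir);
    int (*readdir)(const char *path, const char *names[], int max);
};

/* Counters so we can see how often each entry point was hit. */
static int getattr_calls = 0;
static int readdir_calls = 0;

/* Pretend "/" is a directory and everything else is a regular file. */
static int toy_getattr(const char *path, int *is_dir)
{
    ++getattr_calls;
    *is_dir = (strcmp(path, "/") == 0);
    return 0;
}

/* Pretend every directory holds the same two files. */
static int toy_readdir(const char *path, const char *names[], int max)
{
    static const char *entries[] = { "file1", "file2" };
    int n = 0;

    (void)path;
    ++readdir_calls;
    while (n < 2 && n < max) {
        names[n] = entries[n];
        ++n;
    }
    return n;
}

static const struct toy_fs_ops toy_ops = {
    .getattr = toy_getattr,
    .readdir = toy_readdir,
};

/* Mimic "ls -l <dir>": stat the directory, list it, then stat each entry.
 * Returns the number of entries, or -1 if dir is not a directory. */
static int list_long(const struct toy_fs_ops *ops, const char *dir)
{
    const char *names[MAX_ENTRIES];
    char path[256];
    int is_dir = 0;
    int n, i;

    assert(ops != NULL && dir != NULL);
    if (ops->getattr(dir, &is_dir) != 0 || !is_dir) {
        return -1;
    }
    n = ops->readdir(dir, names, MAX_ENTRIES);
    for (i = 0; i < n; ++i) {
        snprintf(path, sizeof(path), "%s%s", dir, names[i]);
        if (ops->getattr(path, &is_dir) != 0) {
            return -1;
        }
        printf("%c %s\n", is_dir ? 'd' : '-', names[i]);
    }
    return n;
}
```

Calling list_long(&toy_ops, "/") hits getattr once for the directory, readdir once, and getattr once more per entry, which is the same shape as the flow described above.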

So how does our cbfs_readdir work (after all this blog is about libcouchbase, not FUSE ;))? CBFS uses an HTTP interface, so we can request the list of files in a directory through the following URL:

http://cbfsserver:8484/.cbfs/list/path

It returns a JSON document looking something like:

{"files":{"file1":{}},"dirs":{},"path":"/foo"}

All we need to do is to decode the JSON and populate the information to FUSE. So how do we do this through libcouchbase? Through the http interface:

static lcb_error_t uri_execute_get(const char *uri, struct SizedBuffer *sb) {
    lcb_http_cmd_t cmd = {
        .version = 1,
        .v.v1 = {
            .path = uri,
            .npath = strlen(uri),
            .body = NULL,
            .nbody = 0,
            .method = LCB_HTTP_METHOD_GET,
            .chunked = 0,
            .content_type = "application/x-www-form-urlencoded",
            .host = cfg->cbfs_host,
            .username = cfg->cbfs_username,
            .password = cfg->cbfs_password
        }
    };

    return lcb_make_http_request(instance, sb, LCB_HTTP_TYPE_RAW, &cmd, NULL);
}

Since I'm using the synchronous interface to libcouchbase that call will block until we've received the response from the server. In my response handler I just copy whatever data the server returned to me:

static void complete_http_callback(lcb_http_request_t req, lcb_t instance,
                                   const void *cookie, lcb_error_t error,
                                   const lcb_http_resp_t *resp)
{
    struct SizedBuffer *sb = (void*)cookie;

    if (error == LCB_SUCCESS) {
        /* Allocate one byte extra for a zero term */
        sb->data = malloc(resp->v.v0.nbytes + 1);
        sb->size = resp->v.v0.nbytes;
        memcpy(sb->data, resp->v.v0.bytes, resp->v.v0.nbytes);
        sb->data[resp->v.v0.nbytes] = '\0';
    }
}
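The post never shows the definition of SizedBuffer, so here is a guess at a minimal version together with a copy helper that also checks the result of malloc. The struct layout and the helper names sized_buffer_assign and sized_buffer_release are assumptions made for this sketch; the project source may look different.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed layout: a heap-allocated copy of the body plus its length. */
struct SizedBuffer {
    char *data;
    size_t size;
};

/* Copy nbytes from bytes into sb and zero-terminate the copy.
 * Returns 0 on success and -1 if the allocation fails (sb is then empty). */
static int sized_buffer_assign(struct SizedBuffer *sb,
                               const void *bytes, size_t nbytes)
{
    assert(sb != NULL);
    sb->data = malloc(nbytes + 1);   /* one extra byte for the zero term */
    if (sb->data == NULL) {
        sb->size = 0;
        return -1;
    }
    if (nbytes > 0) {
        memcpy(sb->data, bytes, nbytes);
    }
    sb->data[nbytes] = '\0';
    sb->size = nbytes;
    return 0;
}

/* Release the copy so the buffer can be reused for the next request. */
static void sized_buffer_release(struct SizedBuffer *sb)
{
    free(sb->data);
    sb->data = NULL;
    sb->size = 0;
}
```

The zero terminator is what lets the caller hand sb->data straight to a JSON parser that expects a C string, while sb->size still carries the real length for binary bodies.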

Both cbfs_read and cbfs_getattr utilize the same function from libcouchbase; they just hit different URLs to get their data.

Adding support for write operations should be no worse than finding the correct URL and doing a PUT etc.

Happy Hacking :)



Saturday, September 22, 2012

Extend a Couchbase Server with your own commands!

I was having a beer with Sergey after yesterday's CouchConf in San Francisco and we started discussing command extensions for memcached. I guess I haven't been very good at advertising what I've been doing, because I added this to memcached a couple of years ago as part of the plugin extension framework.

I thought that it might be interesting to write up a small blog post that shows you how you may write your own commands to access a Couchbase cluster. Please note that what I'm going to show you in this blog post is abusing Couchbase Server, and nothing you should be doing in production. If you have special needs you should connect with us to let us work with you to make it happen (and ensure that you don't break stuff).

So let's imagine that you're storing relatively big objects in a cluster, and you only need fragments of each object at a time. The current SDK only allows you to read and write a complete item at a time, so we're going to extend our Couchbase cluster to be able to give us a fragment of the data at a time.

Luckily, I wrote this extension for memcached a couple of years ago, and it is actually bundled with the Couchbase Server. This does however not imply that it is a supported extension to be using from your Couchbase Server. So let's go ahead and enable the extension on the server:

# cd /opt/couchbase/bin
# mv memcached memcached.bin
# cat > memcached << 'EOF'
#! /bin/sh
exec `dirname $0`/memcached.bin \
            -X /opt/couchbase/lib/memcached/fragment_rw.so "$@"
EOF
# chmod +x memcached

If you are running a cluster you need to do this on all of your nodes (and you need to restart all of your nodes for them to load the module).

You don't learn anything unless we walk through how the extension actually works. You'll find the module at https://github.com/trondn/memcached/tree/engine-pu/extensions/protocol in the files fragment_rw.c and fragment_rw.h.

So let's look into the details there. During startup memcached will load all shared objects specified with the -X parameter, and try to look up the function named memcached_extensions_initialize. This is where you should initialize your extension, and add it to the system. If you look at the signature for the method, you might wonder where you may specify the configuration. The "per extension" configuration is specified as part of the -X parameter. The fragment module allows you to specify the command id for read and write through "r" and "w", so we could write -X fragment_rw.so,r=5,w=6 to tell it to use the command codes 5 and 6 (please note that those would conflict with existing commands ;))
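For illustration, parsing a configuration string such as "r=5,w=6" could look roughly like the sketch below. This is not the code from fragment_rw.c: parse_fragment_config is a hypothetical helper, and the 64 byte limit and the choice to reject unknown keys are assumptions of this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Parse a "r=<opcode>,w=<opcode>" string into two command codes.
 * Keys that are not present keep their current value. Returns 0 on
 * success and -1 on an unknown key or a value outside 0..255. */
static int parse_fragment_config(const char *cfg, uint8_t *r, uint8_t *w)
{
    char copy[64];
    char *tok;

    assert(r != NULL && w != NULL);
    if (cfg == NULL || strlen(cfg) >= sizeof(copy)) {
        return -1;
    }
    strcpy(copy, cfg);   /* strtok modifies its input, so work on a copy */

    for (tok = strtok(copy, ","); tok != NULL; tok = strtok(NULL, ",")) {
        char *end;
        long val;

        if ((tok[0] != 'r' && tok[0] != 'w') || tok[1] != '=') {
            return -1;
        }
        val = strtol(tok + 2, &end, 0);
        if (end == tok + 2 || *end != '\0' || val < 0 || val > 255) {
            return -1;
        }
        if (tok[0] == 'r') {
            *r = (uint8_t)val;
        } else {
            *w = (uint8_t)val;
        }
    }
    return 0;
}
```

The range check matters because the opcode field in the binary protocol is a single byte, so anything above 255 could never be sent by a client anyway.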

Setting up a command extension happens in two different phases. The first thing you need to do is to register your command descriptor:


server->extension->register_extension(EXTENSION_BINARY_PROTOCOL, &descriptor);

The descriptor for protocol extensions is pretty simple:

static EXTENSION_BINARY_PROTOCOL_DESCRIPTOR descriptor = {
    .get_name = get_name,
    .setup = setup
};

get_name is a function that just returns the name of the extension, whereas setup is the interesting function:


static void setup(void (*add)(EXTENSION_BINARY_PROTOCOL_DESCRIPTOR *descriptor,
                              uint8_t cmd,
                              BINARY_COMMAND_CALLBACK new_handler))
{
    add(&descriptor, read_command, handle_fragment_rw);
    add(&descriptor, write_command, handle_fragment_rw);
}

So what is happening here? The setup function is called with a single parameter: a function you may call to register handlers for various command codes. The first parameter you should pass to this callback is the descriptor you registered. The second parameter is the command opcode you want to install a handler for, and the third parameter is the function that implements the logic.
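One way to picture what such an add function does on the server side is a table with one slot per opcode. The sketch below is only that picture: toy_handler, toy_add and toy_dispatch are invented names, the handler signature is heavily simplified, and refusing to overwrite an opcode is a choice made for this example rather than a statement about memcached's behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical handler type; the real BINARY_COMMAND_CALLBACK also gets
 * the descriptor, engine handle, cookie, request and response callback. */
typedef int (*toy_handler)(uint8_t opcode);

/* One slot per possible value of the one-byte opcode field. */
static toy_handler handlers[256];

/* Register a handler; refuse to overwrite an opcode already in use. */
static int toy_add(uint8_t opcode, toy_handler handler)
{
    assert(handler != NULL);
    if (handlers[opcode] != NULL) {
        return -1;
    }
    handlers[opcode] = handler;
    return 0;
}

/* Route an incoming packet to its handler; -1 means unknown command. */
static int toy_dispatch(uint8_t opcode)
{
    if (handlers[opcode] == NULL) {
        return -1;
    }
    return handlers[opcode](opcode);
}

/* Example handler that just reports which opcode it was called for. */
static int toy_fragment_rw(uint8_t opcode)
{
    return opcode;
}
```

Registering the same function for two opcodes, as the setup function above does for read and write, simply fills two slots with the same pointer; the handler then looks at the opcode to decide what to do.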

So let's go ahead and look at how the function looks:

static ENGINE_ERROR_CODE handle_fragment_rw(EXTENSION_BINARY_PROTOCOL_DESCRIPTOR *descriptor,
                                            ENGINE_HANDLE* handle,
                                            const void* cookie,
                                            protocol_binary_request_header *request,
                                            ADD_RESPONSE response)

The descriptor is the descriptor you registered, the handle is the handle to the engine keeping all of the data (so that you may interact with the backing store). Cookie is a reference to the client who connected to you. Request is the complete package the client sent for the request, and response is a callback function you may call to send data back to the client. If you look at the implementation for handle_fragment_rw you'll see that it's not that hard to interact with the engine.


The only thing left for us now is to create client functionality to connect to a server and send such a packet :) I leave that up to you folks for the moment (or if people would like me to post it here I'd be happy to do so, just let me know).

Happy hacking!


Monday, September 17, 2012

libcouchbase is async, but my app isn't...

In the previous example I showed you how to hook libcouchbase into your own event loop, but not everyone wants to use it asynchronously. The best part of an asynchronous library is that it's pretty easy to make a synchronous library on top of it. All you need to do is to call the function and wait for it to complete, but in libcouchbase I went the extra mile. You may use it fully asynchronously, fully synchronously, or somewhere in the middle.

The easiest way to use libcouchbase in a synchronous way is to toggle its internals to the syncmode setting by calling:

lcb_behavior_set_syncmode(instance, LCB_SYNCHRONOUS);

One "drawback" with the syncmode setting is that you can't execute a chunk of commands concurrently. Let's look at the following example:

lcb_store(instance, NULL, 15, store_commands);
lcb_get(instance, NULL, 2, get_commands);

With syncmode enabled all of the store commands must have been executed and the response received before any of the get commands is sent (even if they don't hit the same servers). It would improve the latency of your application if you could shuffle as much data as possible over the network before blocking to wait for the result. That's when you want to set the syncmode to LCB_ASYNCHRONOUS (the default) and use lcb_wait() instead:

lcb_store(instance, NULL, 15, store_commands);
lcb_get(instance, NULL, 2, get_commands);
lcb_wait(instance);

When lcb_wait() returns we know that all of the above commands have been executed (and the appropriate callbacks called).

So let's whip up a small program that shows you how to use the syncmode setting. To keep the example as small as possible I'm going to skip error recovery etc. Let's jump straight to the main() method:

int main(void)
{
    lcb_error_t error;
    lcb_t instance = create_instance();

    if ((error = lcb_connect(instance)) != LCB_SUCCESS) {
        fprintf(stderr, "Failed to connect to cluster: %s\n",
                lcb_strerror(instance, error));
        exit(EXIT_FAILURE);
    }

    set_key(instance);
    get_key(instance);

    lcb_destroy(instance);
    exit(EXIT_SUCCESS);
}

As you see, the program is pretty simple. First we create a libcouchbase instance and connect to the cluster, then we set and get a key before we clean up and exit.

Let's look inside create_instance:

static lcb_t create_instance(void)
{
    lcb_t instance;
    struct lcb_create_st copt;
    lcb_error_t error;

    memset(&copt, 0, sizeof(copt));
    copt.v.v0.host = "myserver:8091";

    if ((error = lcb_create(&instance, &copt)) != LCB_SUCCESS) {
        fprintf(stderr, "Failed to create libcouchbase instance: %s\n",
                lcb_strerror(NULL, error));
        exit(EXIT_FAILURE);
    }

    lcb_behavior_set_syncmode(instance, LCB_SYNCHRONOUS);
    lcb_set_error_callback(instance, error_handler);
    lcb_set_store_callback(instance, store_handler);
    lcb_set_get_callback(instance, get_handler);

    return instance;
}

This looks pretty much like how we normally do stuff, except for the call to lcb_behavior_set_syncmode. From that line on, all of the lcb_ functions will block until we've received a reply from the server for the requested operation. You might be curious what my handler functions look like, and they are pretty boring:

static void error_handler(lcb_t instance, lcb_error_t err, const char *info)
{
    fprintf(stderr, "FATAL! an error occurred: %s (%s)\n",
            lcb_strerror(instance, err), info ? info : "none");
    exit(EXIT_FAILURE);
}

static void store_handler(lcb_t instance, const void *cookie,
                          lcb_storage_t operation, lcb_error_t error,
                          const lcb_store_resp_t *resp)
{
    (void)cookie; (void)operation;

    if (error != LCB_SUCCESS) {
        fprintf(stderr, "Failed to store the key on the server: %s\n",
                lcb_strerror(instance, error));
        exit(EXIT_FAILURE);
    }

    if (resp->version == 0) {
        fprintf(stdout, "Successfully stored \"");
        fwrite(resp->v.v0.key, 1, resp->v.v0.nkey, stdout);
        fprintf(stdout, "\"\n");
    }
}

static void get_handler(lcb_t instance, const void *cookie,
                        lcb_error_t error, const lcb_get_resp_t *resp)
{
    (void)cookie;

    if (error != LCB_SUCCESS) {
        fprintf(stderr, "Failed to read the key from the server: %s\n",
                lcb_strerror(instance, error));
        exit(EXIT_FAILURE);
    }

    /* Validate that I read the correct key and value back */
    if (resp->version != 0) {
        fprintf(stderr,
                "WARNING: I don't support this version of libcouchbase\n");
        exit(EXIT_FAILURE);
    }

    fprintf(stdout, "I received \"");
    fwrite(resp->v.v0.key, 1, resp->v.v0.nkey, stdout);
    fprintf(stdout, "\" with the value: [");
    fwrite(resp->v.v0.bytes, 1, resp->v.v0.nbytes, stdout);
    fprintf(stdout, "]\n");
}

If we jump back to our main function the next instructions to execute would be:

    if ((error = lcb_connect(instance)) != LCB_SUCCESS) {
        fprintf(stderr, "Failed to connect to cluster: %s\n",
                lcb_strerror(instance, error));
        exit(EXIT_FAILURE);
    }

Now that we've enabled the LCB_SYNCHRONOUS mode in libcouchbase this is a blocking call (I fixed that bug today, Sept 17th), so when it returns we know the topology of the Couchbase cluster. Next we store our object there in the set_key method, which looks like:

static const char * const key = "mykey";
static const char * const value = "myvalue";

static void set_key(lcb_t instance)
{
    lcb_store_cmd_t cmd;
    lcb_error_t error;
    const lcb_store_cmd_t * const commands[] = { &cmd };

    memset(&cmd, 0, sizeof(cmd));
    cmd.v.v0.key = key;
    cmd.v.v0.nkey = strlen(key);
    cmd.v.v0.bytes = value;
    cmd.v.v0.nbytes = strlen(value);
    cmd.v.v0.operation = LCB_SET;

    if ((error = lcb_store(instance, NULL, 1, commands)) != LCB_SUCCESS) {
        fprintf(stderr, "Failed to store key: %s\n",
                lcb_strerror(instance, error));
        exit(EXIT_FAILURE);
    }
}


This looks exactly like how we would have done it in our asynchronous application, and here is a very important detail: the return code of the function is not the return code of the actual operation! It means the exact same thing as it used to, namely whether we encountered any problems initiating the operation. It is the callback function that tells you the result of the operation from the server!
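Here is a small simulation of that two-level reporting, with no libcouchbase involved. Everything in it is hypothetical (toy_get, on_get, struct op_result and the toy_status codes): the fake "server" has no keys at all, so every lookup is initiated successfully and then reported as a miss through the callback.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes standing in for lcb_error_t values. */
enum toy_status { TOY_SUCCESS = 0, TOY_EINVAL = 1, TOY_KEY_ENOENT = 2 };

/* Where the callback leaves the server's verdict; passed as the cookie. */
struct op_result {
    int called;               /* non-zero once the callback has run */
    enum toy_status status;   /* result reported by the "server" */
};

typedef void (*toy_get_callback)(const void *cookie, enum toy_status status);

/* The callback is the only place that learns what the server answered. */
static void on_get(const void *cookie, enum toy_status status)
{
    struct op_result *res = (struct op_result *)cookie;

    assert(res != NULL);
    res->called = 1;
    res->status = status;
}

/* Toy "get": the return value only says whether the request could be
 * set up. The fake server is empty, so the callback always sees a miss. */
static enum toy_status toy_get(const char *key, const void *cookie,
                               toy_get_callback cb)
{
    if (key == NULL || cb == NULL) {
        return TOY_EINVAL;   /* could not even initiate the operation */
    }
    cb(cookie, TOY_KEY_ENOENT);
    return TOY_SUCCESS;      /* initiated fine, even though the key is missing */
}
```

A caller that only checks the return value of toy_get would conclude that the lookup worked. The cookie is the usual way to get the real verdict back out of the callback and into the code that started the operation.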

To make the example complete let's look at the get_key function as well:

static void get_key(lcb_t instance)
{
    lcb_get_cmd_t cmd;
    lcb_error_t error;
    const lcb_get_cmd_t * const commands[] = { &cmd };

    memset(&cmd, 0, sizeof(cmd));
    cmd.v.v0.key = key;
    cmd.v.v0.nkey = strlen(key);

    if ((error = lcb_get(instance, NULL, 1, commands)) != LCB_SUCCESS) {
        fprintf(stderr, "Failed to get key: %s\n",
                lcb_strerror(instance, error));
        exit(EXIT_FAILURE);
    }
}

Pretty simple and straightforward.

Happy hacking!

Saturday, September 15, 2012

How do I use libcouchbase with my own event loop?

Writing asynchronous programs seems to be popular these days, so I thought I should whip up an example that shows you how you may utilize libcouchbase with your own event loop.

libcouchbase performs its IO through an "IO handle". This is a plugin system, so you should be able to use whatever mechanism you want. Adding support for a new system is nothing more than implementing a handful of functions and placing them in a shared object. When I wrote libcouchbase I created a version for libevent and a small version that uses select for Windows, just to ensure that it should be possible to write such plugins. Later Mark Nunberg wrote the plugin we're using for node.js, which is based on libuv. I'm not going to cover how to write your own plugin in this blog post (perhaps I'll get around to doing that at a later time), but I'll show you how to let libcouchbase utilize the same event base from libevent that you're using for something else.

To keep the example as short as possible I'm only going to do the very basic error handling: print out an error message and terminate the program. This means that we can start writing our first function; our error handler:

static void error_callback(lcb_t instance, lcb_error_t error, const char *errinfo) {
    fprintf(stderr, "ERROR: %s %s\n", lcb_strerror(instance, error),
            errinfo ? errinfo : "");
    exit(EXIT_FAILURE);
}

I'm not very good at coming up with interesting examples, so today we'll just create a small program that connects to a Couchbase cluster, stores a key and then reads the key back out again.

So let's take a look at our main function:

int main(int argc, char** argv) {
    struct event_base *evbase = event_base_new();

    if (create_libcouchbase_handle(evbase) == -1) {
        exit(EXIT_FAILURE);
    }

    event_base_loop(evbase, 0);
    event_base_free(evbase);
    exit(EXIT_SUCCESS);
}


As you see there is nothing fancy there. We're creating the event base that the rest of our application should use (I just don't use it for anything else in this example, to keep the example simple), before we create our libcouchbase handle and associate it with the newly created event base (we'll look at that shortly). We then start the event loop that will run the rest of the example through the callbacks.

So what does the create_libcouchbase_handle function look like?

static int create_libcouchbase_handle(struct event_base *evbase) {
    struct lcb_create_io_ops_st ciops;

    memset(&ciops, 0, sizeof (ciops));
    ciops.v.v0.type = LCB_IO_OPS_LIBEVENT;
    ciops.v.v0.cookie = evbase;

    lcb_io_opt_t ioops;
    lcb_error_t error = lcb_create_io_ops(&ioops, &ciops);
    if (error != LCB_SUCCESS) {
        fprintf(stderr, "Failed to create an IOOPS structure for libevent: %s\n",
                lcb_strerror(NULL, error));
        return -1;
    }

Let's stop there for a moment and look at what we're doing. The first thing we do is create an instance of the lcb_create_io_ops_st structure. As everywhere else in libcouchbase, we're using a versioned struct for creating objects. The "constructor" for the IO handle on top of libevent allows us to pass the event base to use as the cookie.
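If the "versioned struct" idea is new to you, the following sketch shows the general pattern: a version field followed by a union with one member per version. The type toy_options, its fields and the default timeout are all invented for this illustration and do not describe the fields of any real libcouchbase structure.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical versioned options struct in the "version + union v" style. */
struct toy_options {
    int version;
    union {
        struct {
            const char *host;
        } v0;
        struct {
            const char *host;
            unsigned int timeout_ms;   /* only exists from version 1 on */
        } v1;
    } v;
};

#define TOY_DEFAULT_TIMEOUT_MS 2500u

/* Zero everything first (as the examples do with memset), then pick
 * the version the caller is going to fill in. */
static void toy_options_init(struct toy_options *opts, int version)
{
    assert(opts != NULL);
    memset(opts, 0, sizeof(*opts));
    opts->version = version;
}

/* Read the timeout in a version-aware way. Version 0 callers could not
 * set one, so they get the default. Unknown versions yield 0. */
static unsigned int toy_timeout_ms(const struct toy_options *opts)
{
    switch (opts->version) {
    case 0:
        return TOY_DEFAULT_TIMEOUT_MS;
    case 1:
        return opts->v.v1.timeout_ms ? opts->v.v1.timeout_ms
                                     : TOY_DEFAULT_TIMEOUT_MS;
    default:
        return 0;
    }
}
```

The point of the pattern is that a library can add fields in a new version without breaking programs compiled against the old layout. It is also why zeroing the whole struct before use is a good habit: version 0 and all unset fields then get sensible defaults.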

We can now go ahead and create the libcouchbase instance "the normal way", with the only difference that we specify the io member in the create structure:

    struct lcb_create_st copts;
    memset(&copts, 0, sizeof (copts));
    copts.v.v0.host = "localhost:8091";
    copts.v.v0.user = "Administrator";
    copts.v.v0.passwd = "secret";
    copts.v.v0.bucket = "default";
    copts.v.v0.io = ioops;

    lcb_t instance;
    if ((error = lcb_create(&instance, &copts)) != LCB_SUCCESS) {
        fprintf(stderr, "Failed to create a libcouchbase instance: %s\n",
                lcb_strerror(NULL, error));
        return -1;
    }

The next thing we'll do is to set up the different callbacks we're going to use in our program and call lcb_connect to initiate the connect sequence:

    lcb_set_error_callback(instance, error_callback);
    lcb_set_configuration_callback(instance, configuration_callback);
    lcb_set_get_callback(instance, get_callback);
    lcb_set_store_callback(instance, store_callback);

    if ((error = lcb_connect(instance)) != LCB_SUCCESS) {
        fprintf(stderr, "Failed to connect libcouchbase instance: %s\n",
                lcb_strerror(NULL, error));
        lcb_destroy(instance);
        return -1;
    }

    return 0;
}


You might be curious why I'm adding a "configuration callback"? This is actually a small trick :-) You might have tried to use libcouchbase yourself and had problems that your operations failed because you forgot to do a lcb_wait() after calling connect. The thing is that libcouchbase needs to know the topology of your cluster before it may perform any operations, and it cannot do that until it receives the first configuration from the server.

So what does this configuration callback look like in our example?

static void configuration_callback(lcb_t instance, lcb_configuration_t config) {
    if (config == LCB_CONFIGURATION_NEW) {
        // Since we've got our configuration, let's go ahead and store a value
        lcb_store_cmd_t cmd;
        const lcb_store_cmd_t * cmds[] = {&cmd};
        memset(&cmd, 0, sizeof (cmd));
        cmd.v.v0.key = "foo";
        cmd.v.v0.nkey = 3;
        cmd.v.v0.bytes = "bar";
        cmd.v.v0.nbytes = 3;
        cmd.v.v0.operation = LCB_SET;
        lcb_error_t err = lcb_store(instance, NULL, 1, cmds);
        if (err != LCB_SUCCESS) {
            fprintf(stderr, "Failed to set up store request: %s\n",
                    lcb_strerror(instance, err));
            exit(EXIT_FAILURE);
        }
    }
}


As the comment tells you, we've received the configuration from the server. This means that it's safe to start using the library. In a real world application you would probably have more advanced logic here (please note that this callback will also be called when you add or remove nodes, so it's not safe to assume that it will be called only once!).
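Since the callback can fire more than once, one way to keep the bootstrap logic from running twice is to guard it with a flag. The following is a minimal, self-contained sketch in plain C. The names config_event_t, app_state_t and on_configuration are made up for illustration and only stand in for lcb_configuration_t and the real callback, so nothing here is libcouchbase API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the event passed to the configuration callback. */
typedef enum { CONFIG_NEW, CONFIG_CHANGED, CONFIG_UNCHANGED } config_event_t;

/* Hypothetical per-application state; a real program could reach it
 * through a cookie or a global. */
typedef struct {
    bool bootstrapped;     /* true once the first configuration was handled */
    int stores_scheduled;  /* counts how often the initial store was queued */
} app_state_t;

/* Returns true only for the call that kicks off the initial work. */
bool on_configuration(app_state_t *state, config_event_t event)
{
    assert(state != NULL);
    if (event != CONFIG_NEW || state->bootstrapped) {
        return false; /* later topology updates: nothing to bootstrap */
    }
    state->bootstrapped = true;
    state->stores_scheduled++; /* this is where the store would be issued */
    return true;
}
```

With a guard like this, a second notification (for instance after a node is added) falls through without scheduling the initial store again.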

When we receive the result for the store command from the server, our store_callback is called:

static void store_callback(lcb_t instance, const void *cookie, lcb_storage_t operation, lcb_error_t error, const lcb_store_resp_t *resp) {
    if (error != LCB_SUCCESS) {
        fprintf(stderr, "Failed to store key: %s\n",
                lcb_strerror(instance, error));
        exit(EXIT_FAILURE);
    }

    /* Time to read it back */
    lcb_get_cmd_t cmd;
    const lcb_get_cmd_t * cmds[] = {&cmd};
    memset(&cmd, 0, sizeof (cmd));
    cmd.v.v0.key = "foo";
    cmd.v.v0.nkey = 3;
    if ((error = lcb_get(instance, NULL, 1, cmds)) != LCB_SUCCESS) {
        fprintf(stderr, "Failed to setup get request: %s\n",
                lcb_strerror(instance, error));
        exit(EXIT_FAILURE);
    }
}

If the value was successfully stored on the server, we're issuing a single get command to the server to verify that it's there. When we receive the get result from the server, our get_callback is called and we're terminating the program:

static void get_callback(lcb_t instance, const void *cookie, lcb_error_t error, const lcb_get_resp_t *resp) {
    if (error != LCB_SUCCESS) {
        fprintf(stderr, "Failed to get key: %s\n",
                lcb_strerror(instance, error));
        exit(EXIT_FAILURE);
    }

    fprintf(stdout, "I stored and retrieved the key \"foo\". Terminate program\n");
    exit(EXIT_SUCCESS);
}

This wasn't the most exciting example, but it shows you the basics of how you may utilize libcouchbase with your own event loop. None of the above functions will cause your application to block waiting for socket IO. If that happens I'd be more than happy to fix it if you send me a bug report (unless you wrote your own IO plugin and forgot to enable nonblocking IO ;))

Happy hacking!

Friday, September 14, 2012

How do I utilize Couchbase from my node.js app?

In yesterday's post I told you that I had created an npm package for our Couchbase driver, but I didn't give any examples of how you could utilize it, so I figured I should create a short blog post about that. I'm the kind of guy who likes real world examples instead of a lot of text, so I went ahead and created a small application for this purpose.

I didn't have any good ideas about what I could create, so I decided I should create a version of the vacuum example I did for libcouchbase a while back. The application is a small daemon that will move all JSON documents you store in a given directory into a Couchbase server by using the _id field in the JSON document as the key, and the entire file as the value. It's a pretty simple example, but it should show us some of the concepts. You'll find the entire program (including installation notes) at https://github.com/trondn/vacuum.js.

I won't be going through the entire program, but only mention the specifics needed to talk to Couchbase.

The first thing we need to do is to create an instance of the Couchbase driver:

var couchnode = require('couchbase');
var cb = new couchnode.Couchbase(config.hostname,
    config.username,
    config.password,
    config.bucket);

Here we pass the address of the couchbase cluster (normally: hostname:8091), the username to log in with (normally the name of the bucket) with the corresponding password and the bucket we want to connect to. Unlike the C version we may start to use the instance immediately from JavaScript, and it will buffer all of our operations until it's connected.

If you look in the function process_file you'll find the first usage of the Couchbase instance:

cb.set(obj._id, String(data), 0, undefined, set_handler, fullname);

So what happens here? We try to store an object in Couchbase with the key being the content of the "_id" field in the JSON document. The current version of the driver only allows you to store Strings, so we need to create a String of the data. The next field is the expiration time of the object (0 means that the object will never expire). The undefined parameter is for the CAS field, which would allow us to say that we only want to store the value if the object already exists in Couchbase with that CAS identifier. The set_handler is the callback function called when the operation is finished, and fullname is the absolute name of the file we just stored. But wait, why would the Couchbase driver need the absolute name of the file we just stored? The answer to that is that it doesn't. It is just a data field we get back in the callback.

So let's look at the callback:

function set_handler(data, error, key, cas) {
    if (error) {
        console.log('Failed to store object: %s', key);
    } else {
        fs.unlink(data);
        process_next();
    }
}


It looks pretty simple, right? The first parameter is the "fullname" field we passed to the set function. It contains the absolute name of the file we wanted to store. The error parameter tells you whether the operation succeeded or not. The key is the key we wanted to store, and the last field is the CAS identifier the server assigned to the object (which you may use in successive calls to set if you want to replace this exact version of the object).
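If the CAS idea is new to you, a toy model may help. The sketch below is plain C and has nothing to do with the couchnode API: toy_item_t and toy_set are hypothetical names, and the counter-style CAS is an assumption made for the example (real CAS values should be treated as opaque). It only shows the rule "replace this exact version, or fail":

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory item; cas == 0 means nothing is stored yet. */
typedef struct {
    char value[64];
    uint64_t cas;
} toy_item_t;

/* Stores value and returns the new CAS. If expected_cas is non-zero it
 * must match the current CAS, otherwise nothing is stored and 0 is
 * returned. expected_cas == 0 means "store unconditionally". */
uint64_t toy_set(toy_item_t *item, const char *value, uint64_t expected_cas)
{
    assert(item != NULL && value != NULL);
    if (expected_cas != 0 && expected_cas != item->cas) {
        return 0; /* someone else changed the item in the meantime */
    }
    strncpy(item->value, value, sizeof(item->value) - 1);
    item->value[sizeof(item->value) - 1] = '\0';
    item->cas++; /* every successful store yields a new version */
    return item->cas;
}
```

A writer that holds a stale CAS gets a failure back instead of silently overwriting a newer version of the object.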


Happy hacking!!!!

Thursday, September 13, 2012

couchnode meets npm

Earlier today I played around with npm to see if I could make your life easier when you're going to use couchnode to access your Couchbase cluster from your node app.

It turns out that it was pretty easy to create the package and upload it, but getting it to work the way I wanted wasn't quite as easy. Couchnode is built on top of libcouchbase, which means that you must have libcouchbase installed before you can try to install couchnode. I tried to figure out the recommended way to check for such dependencies in an npm module, but unfortunately I don't know yet (please drop me a note if you know how I should fix it!!!).

With that in mind, let's go ahead and install libcouchbase. The easiest would of course be to install this globally in /usr, but for the sake of the example I'm going to install it in my home directory under opt:

wget -O libcouchbase.tar.gz http://packages.couchbase.com/clients/c/libcouchbase-2.0.0beta.tar.gz
tar xfz libcouchbase.tar.gz
cd libcouchbase-2.0.0beta
./configure --prefix=$HOME/opt
make install

Since I didn't install libcouchbase into the "default location" for libraries on the system, I need to export some variables so that npm finds libcouchbase during npm install:

CPPFLAGS="-I$HOME/opt/include"
LDFLAGS="-L$HOME/opt/lib -Wl,-rpath,$HOME/opt/lib"
export CPPFLAGS LDFLAGS

You should now be able to install the couchnode driver with:

npm install couchbase

And you're good to go to use the couchnode driver from your web application :)




Thursday, August 30, 2012

libcouchbase overhauling..

When I laid out the requirements for libcouchbase one of my goals was that it should be binary compatible, and we've managed to keep that promise pretty well. We've extended the API in a number of ways, but always in a compatible way. We've been able to build client libraries for other languages such as Ruby, PHP, Perl and node.js (possibly others I forgot as well) on top of libcouchbase and life has been good.

We've been talking about how great it would be if we could store a "datatype" field on each item you're storing, so that different clients could "do smart things"™ with the objects. Unfortunately I didn't create the API in an extensible way that would make this easy. I would either have to break the API, or I could extend the API with more specialized methods. Ideally I wouldn't like to break the API, but extending it with all sorts of specialized functions would just make the API a nightmare to use.

Given that Couchbase server 2.0 is going to give you a lot of new cool stuff, it would be a good idea to change the API before people start to use all of this goodness, which would make the transition to a new API harder than it has to be.

The first major change you'll notice is that we've shortened all of the names from libcouchbase to lcb. The header file is still libcouchbase/couchbase.h and you should link with -lcouchbase. This change isn't very hard to work around, simply do a search and replace libcouchbase_ with lcb_.

We've also removed a lot of the entry points (all of the _by_key-methods), and renamed some of the methods to make the name clearer (get_locked instead of getl etc). Adapting to those changes shouldn't be that hard either. The change in the API that'll require some effort to fix is however that we've changed the signature of almost all of the functions and the callbacks. I've tried really hard to make the API consistent so that once you know how it works you can "guess" the signature for the function you're going to use. The common form is:

lcb_error_t lcb_operationname(lcb_t instance, 
                              const void *cookie, 
                              int number, 
                              const lcb_operation_cmd_t * const * commands);

What?? I have to send in an array of pointers to commands?? You might think that sucks and I'm only trying to make your life as a user miserable, but there is actually a good reason for not using an array of objects. To explain that, let's start looking at the internals of these lcb_operation_cmd_t structs. In order to allow us to extend the functionality in the future without breaking binary compatibility we're using versioned structs. Let's take a look at the operation struct you would use to retrieve an object:


    typedef struct {
        int version;
        union {
            struct {
                const void *key;
                lcb_size_t nkey;
                lcb_time_t exptime;
            } v0;
        } v;
     } lcb_get_cmd_t;


The version member in the structure tells us how we should interpret the rest of the structure. Given that the size of this struct may vary over time, we can't pass in an array of such objects (well, we could, but the internals in the function would have a harder time figuring the offset for the next object since it would need to know the alignment and struct packing you used).
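To make the "array of pointers plus version field" idea a bit more concrete, here is a small self-contained sketch. It does not use the real lcb_get_cmd_t: demo_get_cmd_t, its imagined v1 layout with a datatype field, and demo_total_key_bytes are all assumptions invented for the example. The point is only that the "library" side follows each pointer, looks at the version first and then touches nothing but the fields of that version:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical command struct modelled on the versioned layout above. */
typedef struct {
    int version;
    union {
        struct {
            const void *key;
            size_t nkey;
        } v0;
        struct { /* imagined future layout with one extra field */
            const void *key;
            size_t nkey;
            int datatype;
        } v1;
    } v;
} demo_get_cmd_t;

/* Sums the key lengths of all commands. Returns 0 on success, or -1 if
 * a command carries a version this "library" does not understand. */
int demo_total_key_bytes(size_t num, const demo_get_cmd_t *const *cmds,
                         size_t *total)
{
    size_t sum = 0;
    assert(cmds != NULL && total != NULL);
    for (size_t i = 0; i < num; ++i) {
        const demo_get_cmd_t *cmd = cmds[i]; /* no element size needed */
        switch (cmd->version) {
        case 0:
            sum += cmd->v.v0.nkey;
            break;
        case 1:
            sum += cmd->v.v1.nkey;
            break;
        default:
            return -1; /* unknown layout: refuse instead of guessing */
        }
    }
    *total = sum;
    return 0;
}
```

Because the function only ever dereferences the pointers it is handed, it never has to step from one element to the next using a struct size it might have gotten wrong.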

So let's show you a working example on how you would use the new API to retrieve a value from your cache.

In C99 I would typically write something like this:



    const lcb_get_cmd_t c = { .version = 0,  
                              .v.v0 = { 
                                .key = "mykey", 
                                .nkey = 5 
                              }
                            };
    const lcb_get_cmd_t* cmds[] = { [0] = &c };
    lcb_get(instance, NULL, 1, cmds);



We've added constructors for the objects in C++, so all you need there is:


    const lcb_get_cmd_t c("mykey", 5);
    const lcb_get_cmd_t* cmds[] = { &c };
    lcb_get(instance, NULL, 1, cmds);


We're also using the same method in the callbacks (I've tried to make them as consistent as possible just as the commands), so the get callback now looks like:


    void get_callback(lcb_t instance,
                      const void *cookie,
                      lcb_error_t error,
                      const lcb_get_resp_t *resp)

A typical implementation could look like:

static void get_callback(lcb_t instance,
                         const void *cookie,
                         lcb_error_t error,
                         const lcb_get_resp_t *resp)
{
    if (error == LCB_SUCCESS) {
        if (resp->version != 0) {
            // I don't know what this object looks like
            // Do some proper error handling
            return ;
        }

        // Use the information in resp->v.v0 

    } else {
        // Do error handling for miss/errors
    }
}


We've tried to update the documentation in the header file with more information and examples for each of the entry points. Please don't hesitate to drop me an email (or ask questions on #libcouchbase on IRC).

We're in the middle of updating clients to use the new API and writing more and more automated tests!

Happy hacking!


Monday, July 16, 2012

libcouchbase meets node.js


Matt asked me a while back if I could look at what it would take to utilize libcouchbase from node.js. When I initially designed libcouchbase one of the requirements was that it should be fully asynchronous, so from a design perspective it should fit node.js very well. Another design goal of libcouchbase was that it should be modular so that I could swap out the underlying methods that do all of the network communication. This API isn't something the average developer would see or use, but I added it to (hopefully) make it easier to port to new systems. I've been using libevent for a couple of years now, and I have to admit that I had that mindset when I designed that API. Unfortunately it doesn't map that well to, let's say, IOCP or libuv, so I'd like to refactor the API. In order to get full integration with node.js I have to do this refactoring (the current version uses its own event loop and blocks the global event loop). Mordy implemented a version that allows you to use libuv, but it's not merged into libcouchbase yet. I think I'd prefer a refactor of the current IO model before merging the patch.

Anyway. When I started to look into the details I quickly realized that this was completely new territory for me. I’ve never done anything with Javascript before, so I had no idea what kind of API the hard-core Javascript folks would like. Instead of waiting for someone to come up with an API specification for me, I started playing around trying to figure out how to create a Javascript binding to libcouchbase. I’m a strong believer that an API should feel natural in the language it’s used from (instead of being a “port” of the underlying API), but no matter what I would need to know how to integrate it. If I had a “working skeleton” I could always refactor the API into something the user would like at a later time.

It turns out that building a basic extension for node.js in C++ isn’t hard at all. The one I’ve built doesn’t qualify as a real extension for node.js (given that it “blocks” the global event notification loop by using its own), but it works as a good “proof of concept”. I don’t think looking at the API is that interesting, but given that it was so easy to build the extension I figured I could walk you through it to give you a head start if you want to build your own.

You can find the entire source code I’m talking about at https://github.com/trondn/couchnode. The code examples you'll find here will probably not match entirely to the code you'll find there because I may have removed stuff here to make the example smaller and easier to read (so the stuff you'll find here might not compile ;)). 

The first thing we need to do is to set up the boilerplate code:

#define BUILDING_NODE_EXTENSION
#include <node.h>
class Couchbase: public node::ObjectWrap {
public:
    static void Init(v8::Handle<v8::Object> target);
};
static void init(v8::Handle<v8::Object> target) {
    Couchbase::Init(target);
}
NODE_MODULE(couchbase, init)

The above fragment defines the class I'm going to use for the API (Couchbase), and the NODE_MODULE() macro registers the module name “couchbase” and tells node.js that it should call the function named init() to initialize the module. As you can see in init(), I’m calling the Init method in my Couchbase class to let the class initialize itself: 

void Couchbase::Init(v8::Handle<v8::Object> target)
{
    v8::HandleScope scope;
    v8::Local<v8::FunctionTemplate> t = v8::FunctionTemplate::New(New);
    v8::Persistent<v8::FunctionTemplate> s_ct;
    s_ct = v8::Persistent<v8::FunctionTemplate>::New(t);
    s_ct->InstanceTemplate()->SetInternalFieldCount(1);
    s_ct->SetClassName(v8::String::NewSymbol("Couchbase"));
    NODE_SET_PROTOTYPE_METHOD(s_ct, "get", Get);

    target->Set(v8::String::NewSymbol("Couchbase"), s_ct->GetFunction());
}

So what does the above code do? It defines the JavaScript API to my class and maps the JavaScript functions to a C++ function, so that when someone writes couchbase.get("foo"); in JavaScript we're calling Couchbase::Get() in C++.

Now let’s look at the Get() method. When I “defined” the API I allowed for get to be called in multiple ways. In the C version of libcouchbase you typically set up a global callback that is called whenever you receive a response for a get call. Originally I made it the same way for JavaScript as well, but that clearly isn't the "node.js-way" of doing stuff. Instead you would write the following JavaScript:

cb.get(function onGet(state, key, value, flags, cas) {
  if (state) {
     console.log("found \"" + key + "\" - [" + value + "]");
  } else {
     console.log("failed for \"" + key + "\"");
  }
}, "foo", "bar");

Given that I was doing all of this as part of my learning curve (I wouldn't be defining the real API now anyway), I decided to support both ways of calling the command. So let's take a look at the method:

v8::Handle<v8::Value> Couchbase::Get(const v8::Arguments& args)
{
    if (args.Length() == 0) {
        const char *msg = "Illegal arguments";
        return v8::ThrowException(v8::Exception::Error(v8::String::New(msg)));
    }
    v8::HandleScope scope;
    Couchbase* me = ObjectWrap::Unwrap<Couchbase>(args.This());
    void* commandCookie = NULL;
    int offset = 0;
    if (args[0]->IsFunction()) {
        if (args.Length() == 1) {
            const char *msg = "Illegal arguments";
            return v8::ThrowException(v8::Exception::Error(v8::String::New(msg)));
        }
        commandCookie = static_cast<void*>(new CommandCallbackCookie(args[0], args.Length() - 1));
        offset = 1;
    }
    int tot = args.Length() - offset;
    char* *keys = new char*[tot];
    libcouchbase_size_t *lengths = new libcouchbase_size_t[tot];
    // @todo handle allocation failures
    for (int ii = offset; ii < args.Length(); ++ii) {
        if (args[ii]->IsString()) {
            v8::Local<v8::String> s = args[ii]->ToString();
            keys[ii - offset] = new char[s->Length() + 1];
            lengths[ii - offset] = s->WriteAscii(keys[ii - offset]);
        } else {
            // @todo handle NULL
            // Clean up allocated memory!
            const char *msg = "Illegal argument";
            return v8::ThrowException(
                    v8::Exception::Error(v8::String::New(msg)));
        }
    }
    me->lastError = libcouchbase_mget(me->instance, commandCookie, tot,
            reinterpret_cast<const void * const *>(keys), lengths, NULL);
    if (me->lastError == LIBCOUCHBASE_SUCCESS) {
        return v8::True();
    } else {
        return v8::False();
    }
}

As you see in the code I iterate over all of the parameters the user passed to build up the arguments to pass to libcouchbase_mget. I handle the special case with the function pointer as the first argument by storing it in the "command specific cookie" libcouchbase provides.
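The @todo comments above point at a real gap: when one of the arguments turns out to be invalid, the keys allocated so far are never released. As a rough illustration of one way to handle that, here is a self-contained sketch in plain C (not the C++/v8 code from the extension). key_batch_t, key_batch_build and key_batch_free are hypothetical helpers invented for this example, and a NULL input plays the role of a non-string argument:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the parallel key/length arrays. */
typedef struct {
    char **keys;
    size_t *lengths;
    size_t count; /* number of keys that were fully copied */
} key_batch_t;

/* Releases everything the batch owns and resets it. */
void key_batch_free(key_batch_t *b)
{
    assert(b != NULL);
    if (b->keys != NULL) {
        for (size_t i = 0; i < b->count; ++i) {
            free(b->keys[i]);
        }
    }
    free(b->keys);
    free(b->lengths);
    b->keys = NULL;
    b->lengths = NULL;
    b->count = 0;
}

/* Copies n strings into the batch. Returns 0 on success, or -1 on a
 * NULL input or an allocation failure, in which case nothing leaks. */
int key_batch_build(key_batch_t *b, size_t n, const char *const *inputs)
{
    assert(b != NULL && inputs != NULL);
    b->count = 0;
    b->keys = calloc(n ? n : 1, sizeof(*b->keys));
    b->lengths = calloc(n ? n : 1, sizeof(*b->lengths));
    if (b->keys == NULL || b->lengths == NULL) {
        key_batch_free(b);
        return -1;
    }
    for (size_t i = 0; i < n; ++i) {
        size_t len;
        if (inputs[i] == NULL) {
            key_batch_free(b); /* frees the keys copied so far */
            return -1;
        }
        len = strlen(inputs[i]);
        b->keys[i] = malloc(len + 1);
        if (b->keys[i] == NULL) {
            key_batch_free(b);
            return -1;
        }
        memcpy(b->keys[i], inputs[i], len + 1);
        b->lengths[i] = len;
        b->count = i + 1;
    }
    return 0;
}
```

Tracking count as the loop progresses is what lets the error paths free exactly the keys that were allocated, no more and no less.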


In order to build the plugin you need to create a file named wscript with the following content:



def set_options(opt):
  opt.tool_options("compiler_cxx")


def configure(conf):
  conf.check_tool("compiler_cxx")
  conf.check_tool("node_addon")


def build(bld):
  obj = bld.new_task_gen("cxx", "shlib", "node_addon")
  obj.cxxflags = ["-g", "-Wall"]
  obj.ldflags = ["-lcouchbase"]
  obj.target = "couchbase"
  obj.source = "src/couchbase.cc"

And set up the build environment by running:

$ node-waf configure

You can now build and "install" your plugin by running:

$ node-waf build install

You should now be able to test the plugin by creating a JavaScript file and running it:

$ node myscript.js

Happy hacking


Friday, January 27, 2012

So what's the story about libcouchbase and Windows?

A couple of days ago I showed you an example program using libcouchbase to create a small application to put data into a Couchbase cluster, but the code wouldn't compile on Windows. That by no means implies that libcouchbase doesn't work on Windows; it's more that I was in a hurry writing the blog post, so I didn't have time to fix everything up before publishing it.

In this blog post I'll show you how easy it is to get everything up'n'running using Windows 7 and Microsoft Visual Studio 2010. In addition to that you need to download and install git to be able to check out the source code (select the option that you want to put git in the path (not the full msys suite, but just git)).

I have to admit that I am far from a "hardcore Windows developer", so there are a lot of things I don't know about the platform. For instance I don't know where I should install third party header files and libraries, so I just decided that I'm going to install all of them into C:\local (with an include, lib and bin directory). I'd be happy if someone could tell me how I'm supposed to do this ;-)

So let's open up the Visual Studio Command Prompt and start building everything:

Setting environment for using Microsoft Visual Studio 2010 x86 tools.
C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC> cd %HOMEPATH%
C:\Users\Trond> mkdir build
C:\Users\Trond> cd build

Since we're going to build dll's you need to set C:\local\bin into your path so that the runtime linker finds the dll's:

C:\Users\Trond\build> set PATH=c:\local\bin;%PATH%

We need to install two dependencies before we can compile libcouchbase itself. Let's check out all of the source code we're going to use:

C:\Users\Trond\build> git clone git://github.com/membase/libisasl.git
C:\Users\Trond\build> git clone git://github.com/membase/libvbucket.git
C:\Users\Trond\build> git clone git://github.com/couchbase/libcouchbase.git
C:\Users\Trond\build> git clone git://github.com/membase/memcached.git
C:\Users\Trond\build> git clone git://github.com/trondn/vacuum.git

The first dependency we're going to build is the SASL library. This is the library libcouchbase uses for authenticating to the Couchbase servers. To build and install the library, simply execute:

C:\Users\Trond\build> cd libisasl
C:\Users\Trond\build\libisasl> nmake -f NMakefile install

That will install libisasl with its header files and libraries into c:\local.

The next library we need to build is libvbucket, the library libcouchbase uses to figure out where a vbucket is located (if you don't know what a vbucket is, you don't really need to know). It is just as easy to build as libisasl:

C:\Users\Trond\build\libisasl> cd ..\libvbucket
C:\Users\Trond\build\libvbucket> nmake -f NMakefile install

The next thing we need to do is to install some header files libcouchbase needs at build time. These header files contain the protocol definitions libcouchbase needs (but they are not needed by the application). So let's go ahead and install them (to make it easier for us to build libcouchbase):

C:\Users\Trond\build\libvbucket> cd ..\memcached
C:\Users\Trond\build\memcached> git checkout -b branch-20 origin/branch-20
C:\Users\Trond\build\memcached> mkdir c:\local\include\memcached
C:\Users\Trond\build\memcached> copy include\memcached c:\local\include\memcached

So let's go ahead and build libcouchbase! 

C:\Users\Trond\build\memcached> cd ..\libcouchbase
C:\Users\Trond\build\libcouchbase> nmake -f NMakefile install

I guess that most Windows developers don't use nmake during their development, but use the full IDE instead. That's why I've created a project you may open in the vacuum project. So feel free to open that project now, and it should build without any problems. 

Now we're going to need a Couchbase server we can connect to. If you don't have any running, you should download and install one now. 

Let's go ahead and create the spool directory and start the vacuum server...

C:\Users\Trond\build\vacuum\Debug> mkdir c:\vacuum
C:\Users\Trond\build\vacuum\Debug> vacuum -h 127.0.0.1:8091

And you can start copying JSON files into C:\vacuum and see them being added to the Couchbase cluster!

Tuesday, January 24, 2012

So how do I use this "libcouchbase"?


Some of you may have noticed that we released Couchbase 1.8 earlier today, and a new set of smart clients for various languages. For me personally this is a milestone, because libcouchbase is now a supported client for the C language.

So why do I care about that? Well, libcouchbase started out of my needs to easily test various components of the server. Since I did most of my development on the components on the server implemented in C, it made sense for me to use C for my testing.

I've received some questions on how libcouchbase works in a multithreaded context, so I should probably start off by clarifying that: libcouchbase doesn't use any form of locking to protect its internal data structures, but that doesn't mean you can't use libcouchbase in a multithreaded program. All it means is that you as a client user must either use locking to protect yourself from accessing the libcouchbase instance from multiple threads at the same time, or just let each thread operate on its own instance of libcouchbase. One easy way to solve this is to have a "pool" of libcouchbase instances that each thread pops an instance from and pushes it back to whenever it needs to access a Couchbase server. Access to this pool should be protected with a lock (but I guess you figured that out ;-))
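A pool like that doesn't need much code. Below is a minimal sketch, assuming POSIX threads and a small fixed capacity. instance_pool_t, pool_init, pool_push and pool_pop are hypothetical names made up for this example, and a plain void * stands in for the libcouchbase instance handle so that the sketch compiles without the library:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define POOL_CAPACITY 8

/* Hypothetical pool of instance handles, guarded by a single mutex. */
typedef struct {
    pthread_mutex_t lock;
    void *items[POOL_CAPACITY];
    size_t count;
} instance_pool_t;

/* Returns 0 on success, -1 if the mutex could not be initialized. */
int pool_init(instance_pool_t *pool)
{
    assert(pool != NULL);
    pool->count = 0;
    return pthread_mutex_init(&pool->lock, NULL) == 0 ? 0 : -1;
}

/* Hands an instance back to the pool. Returns -1 if the pool is full. */
int pool_push(instance_pool_t *pool, void *instance)
{
    int ret = -1;
    pthread_mutex_lock(&pool->lock);
    if (pool->count < POOL_CAPACITY) {
        pool->items[pool->count++] = instance;
        ret = 0;
    }
    pthread_mutex_unlock(&pool->lock);
    return ret;
}

/* Takes an instance out of the pool, or returns NULL if it is empty. */
void *pool_pop(instance_pool_t *pool)
{
    void *instance = NULL;
    pthread_mutex_lock(&pool->lock);
    if (pool->count > 0) {
        instance = pool->items[--pool->count];
    }
    pthread_mutex_unlock(&pool->lock);
    return instance;
}
```

A thread that gets NULL back from pool_pop can either create a fresh instance or wait. The important part is that the popped instance is owned by exactly one thread until it is pushed back.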

In this blog post I'll create a demo program you may use to upload JSON documents into a Couchbase server. You'll find the complete source available at https://github.com/trondn/vacuum if you would like to try the example.

The idea of this program is that it will "monitor" a directory and upload all files appearing there into a Couchbase cluster. I'm pretty sure most of you start thinking: "how do we do that in a portable way?". That's not an easy task to do, so I'm not even going to try to do that. I'll try to write it in a semi-portable way so that it shouldn't be that hard to implement on other platforms. That means that I'm using the following limitations:

  • I'm using opendir and readdir to traverse the directory. This can easily be reimplemented with FindFirst and FindNext on Microsoft Windows.
  • Monitoring the directory means that I'm going to scan the directory, then sleep a given number of seconds before running another scan. I know some platforms support subscribing to changes in the filesystem, but I'm not going to spend time on that (at least not right now ;-)).
  • To avoid file locking or accessing the file while others are writing the file, the clients should write the file into the directory with a leading "dot" in the filename, and then rename the file when they are done. The program ignores all files starting with a dot.
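The dot-file convention from the list above can be sketched from both sides: the writer that drops a document into the spool directory, and the scanner that skips anything still being written. This is not code from the vacuum program. is_ready_name, publish_file and count_ready_files are hypothetical helpers written for this example, and the fixed 1024-byte path buffers are an arbitrary choice:

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>

/* Scanner side: entries starting with a dot are not ready (or are "."/".."). */
int is_ready_name(const char *name)
{
    assert(name != NULL);
    return name[0] != '\0' && name[0] != '.';
}

/* Writer side: write to "<dir>/.<name>" first, then rename to "<dir>/<name>"
 * so that a scanner never sees a half-written file. Returns 0 or -1. */
int publish_file(const char *dir, const char *name, const char *data)
{
    char tmp[1024];
    char final[1024];
    FILE *fp;
    int ok;
    int n1 = snprintf(tmp, sizeof(tmp), "%s/.%s", dir, name);
    int n2 = snprintf(final, sizeof(final), "%s/%s", dir, name);
    if (n1 < 0 || n2 < 0 ||
        (size_t)n1 >= sizeof(tmp) || (size_t)n2 >= sizeof(final)) {
        return -1; /* path does not fit in the buffers */
    }
    fp = fopen(tmp, "w");
    if (fp == NULL) {
        return -1;
    }
    ok = (fputs(data, fp) != EOF);
    if (fclose(fp) != 0) {
        ok = 0;
    }
    if (!ok || rename(tmp, final) != 0) {
        remove(tmp);
        return -1;
    }
    return 0;
}

/* Counts the entries a scan would process, or returns -1 on error. */
int count_ready_files(const char *dir)
{
    struct dirent *de;
    int n = 0;
    DIR *dp = opendir(dir);
    if (dp == NULL) {
        return -1;
    }
    while ((de = readdir(dp)) != NULL) {
        if (is_ready_name(de->d_name)) {
            ++n;
        }
    }
    closedir(dp);
    return n;
}
```

The rename is what makes the hand-over safe: as long as both names live on the same filesystem, the scanner either sees the complete file under its final name or doesn't see it at all.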

So let's jump to the code. The first piece of code that might be interesting to look at would be where we create the libcouchbase instance in main():

    instance = libcouchbase_create(host, user, passwd, bucket, NULL);
    if (instance == NULL) {
        fprintf(stderr, "Failed to create couchbase instance\n");
        exit(EXIT_FAILURE);
    }

The above code snippet creates the libcouchbase instance. There is no way you can use a static structure for this, because doing so would make it incredibly hard to maintain binary compatibility. I like to be able to fix bugs within the library and release new versions you may use without having to recompile your program, and hiding the internal data structures from the clients makes it easier to ensure that the clients don't depend on their size. The first parameter to libcouchbase_create is the name (and port) of the REST port for the Couchbase server (default: localhost:8091). The second and third parameters are the credentials you'd like to use to connect to the REST port to get the pool information (default is to not authenticate). The fourth parameter is the bucket you'd like to connect to, and if you don't specify a bucket you'll end up in the "default bucket". The fifth argument is a special object you may want to use if you are going to use "advanced" features in libcouchbase. Most users will probably just use the defaults and pass NULL here.

The next thing we need to do is to set up some callback handlers to be able to figure out what happens. In the example we're only going to use one operation (to load data into the cache), so we'll need to set up a handler to catch the result of storage operations. Unfortunately we may also encounter problems, so we need to set up an error handler (we'll get back to that in a bit).

    libcouchbase_set_storage_callback(instance, storage_callback);
    libcouchbase_set_error_callback(instance, error_callback);

Now that we've created and initialized the instance, we need to try to connect to the Couchbase cluster:

    libcouchbase_error_t ret = libcouchbase_connect(instance);
    if (ret != LIBCOUCHBASE_SUCCESS) {
        fprintf(stderr, "Failed to connect: %s\n",
                libcouchbase_strerror(instance, ret));
        exit(EXIT_FAILURE);
    }

Due to the fact that libcouchbase is fully asynchronous, all that happened above was that we initiated the connect. That means that we need to wait for the instance to connect to the Couchbase cluster and to the correct bucket. If our program had other stuff to do, now would be the time to do so, but since we don't have any other initialization to do we can just wait for the connect to complete:

    libcouchbase_wait(instance);

One of the "cool" features we've got in libcouchbase is that it provides an internal statistics interface, so we may tell it to collect timing information of the operations with the following snippet:

   if ((ret = libcouchbase_enable_timings(instance)) != LIBCOUCHBASE_SUCCESS) {
      fprintf(stderr, "Failed to enable timings: %s\n",
              libcouchbase_strerror(instance, ret));
   }

Our program is now fully initialized, and we can enter the main loop that looks pretty much like:

   while (forever)
   {
      process_files();
      sleep(nsec);
   }

So what does our process_files() look like? I'm not going to make the example too big by pasting all of it, but the first piece in there looks like:

   if (de->d_name[0] == '.') {
       if (strcmp(de->d_name, ".dump_stats") == 0) {
           fprintf(stdout, "Dumping stats:\n");
           libcouchbase_get_timings(instance, stdout, timings_callback);
           fprintf(stdout, "----\n");
           remove(de->d_name);
       }
       continue;
   }

As you see from the above code snippet we'll ignore all files that start with a '.' except for the file named ".dump_stats". Whenever we see that file we dump the internal timing stats by using the timings_callback (I'll get back to that later).

The next thing we do is to try to read the file into memory and decode its JSON before we try to get the "_id" field to use as a key. If all of that succeeds, we try to store the data in Couchbase with:

      int error = 0;
      ret = libcouchbase_store(instance, &error, LIBCOUCHBASE_SET,
                               id->valuestring, strlen(id->valuestring),
                               ptr, size, 0, 0, 0);
      if (ret == LIBCOUCHBASE_SUCCESS) {
         libcouchbase_wait(instance);
      } else {
         error = 1;
      }

The &error piece here is quite interesting. It is a "cookie" passed to the callback, so that I may know if I encountered a problem or not. You'll see how I'm using it when I discuss the storage_callback below.

This is basically all of the important logic in the example. I promised that I would get back to the different callbacks, so let's start by looking at the error callback:

   static void error_callback(libcouchbase_t instance,
                              libcouchbase_error_t error,
                              const char *errinfo)
   {
       /* Ignore timeouts... */
       if (error != LIBCOUCHBASE_ETIMEDOUT) {
           fprintf(stderr, "\rFATAL ERROR: %s\n",
                   libcouchbase_strerror(instance, error));
           if (errinfo && strlen(errinfo) != 0) {
               fprintf(stderr, "\t\"%s\"\n", errinfo);
           }
           exit(EXIT_FAILURE);
       }
   }

As you can see from the above snippet, libcouchbase also calls the error_callback whenever a timeout occurs, but in that case we just want to retry the operation, so we ignore it. If we encounter a real error we print an error message and terminate the program.

The next callback we use is the storage_callback. It is called when the store operation has completed, so it is the right place for us to figure out if an error occurred while storing the data. Our callback looks like:

   static void storage_callback(libcouchbase_t instance,
                                const void *cookie,
                                libcouchbase_storage_t operation,
                                libcouchbase_error_t err,
                                const void *key, size_t nkey,
                                uint64_t cas)
   {
       int *error = (int *)cookie;
       if (err == LIBCOUCHBASE_SUCCESS) {
           *error = 0;
       } else {
           *error = 1;
           fprintf(stderr, "Failed to store \"");
           fwrite(key, 1, nkey, stderr);
           fprintf(stderr, "\": %s\n",
                   libcouchbase_strerror(instance, err));
           fflush(stderr);
       }
   }

As you can see, we're storing the result of the operation in the integer passed as the cookie. The observant reader may notice that we could just as well unlink the file and release the memory from within the callback (if we provided that information as the cookie instead ;))
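A rough sketch of what such a richer cookie could look like is shown here. The struct upload_ctx and the finish_upload function are hypothetical names of my own, and the function takes a simplified "ok" flag instead of the real storage_callback arguments; it assumes the library no longer needs the buffer once the completion callback has been called.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical cookie carrying everything the callback needs to clean up. */
struct upload_ctx {
    const char *path; /* file the document was read from */
    char *data;       /* heap buffer holding the file contents */
    int error;        /* set to 1 if the store failed */
};

/* Simplified completion handler: "ok" stands in for a check like
 * err == LIBCOUCHBASE_SUCCESS in the real storage_callback. */
static void finish_upload(const void *cookie, int ok)
{
    struct upload_ctx *ctx = (struct upload_ctx *)cookie;

    ctx->error = !ok;
    if (ok) {
        /* The document is stored, so the spool file can go away. */
        remove(ctx->path);
    }
    /* The buffer is released either way; on failure the file is kept
     * so that it can be picked up again on the next pass. */
    free(ctx->data);
    ctx->data = NULL;
}
```

The caller would fill in one upload_ctx per file and pass its address as the cookie, in the same way &error is passed in the store call above.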

The last callback to cover is the timings callback we're using to dump out the timing statistics.

   static void timings_callback(libcouchbase_t instance, const void *cookie,
                                libcouchbase_timeunit_t timeunit,
                                uint32_t min, uint32_t max,
                                uint32_t total, uint32_t maxtotal)
   {
      char buffer[1024];
      int offset = sprintf(buffer, "[%3u - %3u]", min, max);
      switch (timeunit) {
      case LIBCOUCHBASE_TIMEUNIT_NSEC:
         offset += sprintf(buffer + offset, "ns");
         break;
      case LIBCOUCHBASE_TIMEUNIT_USEC:
         offset += sprintf(buffer + offset, "us");
         break;
      case LIBCOUCHBASE_TIMEUNIT_MSEC:
         offset += sprintf(buffer + offset, "ms");
         break;
      case LIBCOUCHBASE_TIMEUNIT_SEC:
         offset += sprintf(buffer + offset, "s");
         break;
      default:
         ;
      }

      int num = (float)40.0 * (float)total / (float)maxtotal;
      offset += sprintf(buffer + offset, " |");
      for (int ii = 0; ii < num; ++ii) {
         offset += sprintf(buffer + offset, "#");
      }

      offset += sprintf(buffer + offset, " - %u\n", total);
      fputs(buffer, (FILE*)cookie);
   }

When you request the timings from libcouchbase it reports all of the timing metrics collected by calling the timings callback. As you can see from the API you'll get the minimum and maximum value for the range, and the number of operations performed within that range. These metrics should not be considered exact numbers, because they depend on what you do in your client code from the time you call the operation until you call libcouchbase_wait for the operation to complete.
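The bar drawing in the callback is just a scaling of each range's count against maxtotal, which (judging by how the callback uses it) appears to be the count of the busiest range. Here is a minimal standalone sketch of that arithmetic. The function format_bucket and the BAR_WIDTH constant are hypothetical names, and unlike the callback above it uses snprintf and guards against a maxtotal of zero.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define BAR_WIDTH 40 /* widest bar, same as the 40.0 in the callback */

/* Hypothetical helper: format one histogram line into buffer.
 * Returns what snprintf returns (the length of the full line). */
static int format_bucket(char *buffer, size_t size,
                         uint32_t min, uint32_t max, const char *unit,
                         uint32_t total, uint32_t maxtotal)
{
    char bar[BAR_WIDTH + 1];
    int num = 0;

    if (maxtotal != 0) {
        /* Scale the count to 0..BAR_WIDTH, truncating the fraction. */
        num = (int)((double)BAR_WIDTH * total / maxtotal);
    }
    if (num > BAR_WIDTH) {
        num = BAR_WIDTH;
    }
    memset(bar, '#', (size_t)num);
    bar[num] = '\0';

    return snprintf(buffer, size, "[%3u - %3u]%s |%s - %u\n",
                    (unsigned)min, (unsigned)max, unit, bar,
                    (unsigned)total);
}
```

With a maxtotal of 18, a range holding 18 operations gets the full 40 characters, while a range holding a single operation gets 40 * 1 / 18 = 2.2, truncated to 2.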

So let's go ahead and run the program. I've prepopulated /var/spool/vacuum with a number of JSON files, to give the program something to do.

trond@illumos ~> ./vacuum
sleeping 3 secs before retry..

From another window I execute the command:

trond@illumos ~> touch /var/spool/vacuum/.dump_stats

And when the timer expires in the first window, it prints out:

Dumping stats:
[ 60 -  69]us |######################################## - 18
[ 70 -  79]us |## - 1
[240 - 249]us |## - 1
----
sleeping 3 secs before retry..

Hopefully this blog post showed how easy it is to use libcouchbase to communicate with a Couchbase cluster. We've got various clients for other programming languages like PHP and Ruby built on top of libcouchbase, so I can promise you that you'll see more functionality added!