Wednesday, September 18, 2013

What is the configuration cache

You might have seen the term configuration cache if you've played around with libcouchbase or the Couchbase PHP extension, but it's not very well documented anywhere. So what is the configuration cache? To answer that question I should probably start by describing how the client works.

libcouchbase is what we call a smart client, which means it reacts to changes in the topology of the cluster. So how does the client do this? When it is instantiated, the first thing it does is connect to the cluster (a streaming REST call) to receive notifications about changes in the topology. This doesn't sound like a problem, but it might become one. Unfortunately, these streaming REST connections are both time consuming to set up (compared to the data connections) and relatively resource hungry on the server side, so they don't fit very well in a scenario where you have a large number of clients or short-lived connections.

It is not uncommon for people deploying PHP applications to run thousands of PHP processes, which would mean thousands of connections to the REST interface. The data connections are really cheap and fast to set up, so they're not causing any problems with this kind of deployment. In older versions of Couchbase I have unfortunately seen the cluster becoming unstable with such a high number of clients connecting to it.

When you think about it, most clusters are running in a steady state most of the time. You don't add or remove nodes very often, so I would guess that 99% of the time clients will NEVER receive an update from the cluster saying that its topology is changing. This makes the information extremely well suited for caching, and that's exactly what the configuration cache is. It is really simple, yet so effective:
  • When the client is instantiated it looks for a named file in the filesystem.
    • If it is there it assumes that to be the current cluster configuration and starts using it.
    • If it isn't there it starts the normal bootstrap logic to get the configuration from the cluster, and writes it to the file.
  • Whenever we try to access an item on a node and the node tells us that we tried to communicate with the wrong node, we invalidate the cache and request a new copy of the configuration.
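
To make the steps above a bit more concrete, here is a minimal sketch of the same logic in plain C. This is not the libcouchbase implementation. All the names in it (load_config, invalidate_and_reload, demo_bootstrap and so on) are made up for illustration, and demo_bootstrap just stands in for the expensive streaming REST bootstrap:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical callback standing in for the expensive REST bootstrap.
 * It must fill buf with a NUL-terminated configuration, return 0 on success. */
typedef int (*bootstrap_fn)(char *buf, size_t buflen);

/* Tells the caller where the configuration came from. */
typedef enum { CONFIG_FROM_CACHE, CONFIG_FROM_CLUSTER, CONFIG_ERROR } config_source;

/* Read the cache file into buf. Returns -1 if it is missing or empty.
 * A file larger than the buffer is simply truncated in this sketch. */
static int read_cache(const char *path, char *buf, size_t buflen)
{
    FILE *fp = fopen(path, "rb");
    size_t n;
    if (fp == NULL) {
        return -1;
    }
    n = fread(buf, 1, buflen - 1, fp);
    fclose(fp);
    if (n == 0) {
        return -1;
    }
    buf[n] = '\0';
    return 0;
}

/* Write the configuration to the cache file. Returns 0 on success. */
static int write_cache(const char *path, const char *config)
{
    size_t len = strlen(config);
    int ok;
    FILE *fp = fopen(path, "wb");
    if (fp == NULL) {
        return -1;
    }
    ok = (fwrite(config, 1, len, fp) == len);
    if (fclose(fp) != 0) {
        ok = 0;
    }
    return ok ? 0 : -1;
}

/* Use the cached file if it is there, otherwise bootstrap and cache the result. */
config_source load_config(const char *path, bootstrap_fn bootstrap,
                          char *buf, size_t buflen)
{
    assert(path != NULL && bootstrap != NULL && buf != NULL && buflen > 0);
    if (read_cache(path, buf, buflen) == 0) {
        return CONFIG_FROM_CACHE;
    }
    if (bootstrap(buf, buflen) != 0) {
        return CONFIG_ERROR;
    }
    (void)write_cache(path, buf); /* failing to write the cache is not fatal */
    return CONFIG_FROM_CLUSTER;
}

/* Call this when a node tells us we talked to the wrong node:
 * throw away the cached file and fetch a fresh configuration. */
config_source invalidate_and_reload(const char *path, bootstrap_fn bootstrap,
                                    char *buf, size_t buflen)
{
    remove(path);
    return load_config(path, bootstrap, buf, buflen);
}

/* Fake bootstrap used only for demonstration; it returns a made-up topology. */
int demo_bootstrap(char *buf, size_t buflen)
{
    const char *config = "{\"demo\":\"topology\"}";
    if (strlen(config) + 1 > buflen) {
        return -1;
    }
    strcpy(buf, config);
    return 0;
}
```

The first call to load_config with a fresh path goes through the bootstrap and writes the file, and every later call is served straight from the file until invalidate_and_reload is called. That is why thousands of short-lived processes can share one cached copy without each of them opening its own REST connection.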

So how do you go ahead and use the configuration cache? From PHP it is extremely easy: all you need to do is add the following to php.ini:

couchbase.config_cache = /tmp

And the Couchbase PHP driver will start storing configuration cache files in the /tmp directory. From C you would use the "compat mode" when you create your instance:

lcb_t instance;
lcb_error_t err;
struct lcb_cached_config_st config;

/* Zero the struct so unused fields get their defaults */
memset(&config, 0, sizeof(config));
config.createopt.version = 1;
config.createopt.v.v1.host = "host1";
config.createopt.v.v1.user = "mybucket";
config.createopt.v.v1.passwd = "secret";
config.createopt.v.v1.bucket = "mybucket";
/* File used to cache the cluster configuration */
config.cachefile = "/var/tmp/couchbase-config-cache";

err = lcb_create_compat(LCB_CACHED_CONFIG, &config,
                        &instance, NULL);
if (err != LCB_SUCCESS) {
     /* ... error ... */
}

Happy hacking!


  1. Say Trond, my gmail is sknuijver, could you let me know about the following? When one of our servers, which is a Java app, uploads images to Couchbase, they are compressed (gzip). Is it possible to unzip them with libcouchbase when taking them out? When putting them in, I think this is done with a transcoder:
    Does libcouchbase have that too?

  2. There are no transcoders etc. in libcouchbase. It stores and retrieves binary blobs.