Sunday, December 20, 2009

Persistent storage engine

A lot of people keep asking about persistent storage engines for memcached, so I thought I should create an example to show how easy it is to write a storage engine that stores items on disk for persistence. Please note that this is an example of how to do it, not a version highly tuned for performance (that would be up to you to implement ;-)).

You wouldn't want to hit the filesystem every time you access an item, so I'm going to beef the example up a bit by creating a two-level cache. All items will eventually be written to disk, but I will serve all items from memory. You might think that this sounds like a lot of work to implement, but you couldn't be more wrong. The way the default storage engine in memcached is implemented makes it a perfect starting point. I've pushed the source code I'm going to discuss to git://github.com/trondn/memcached-engines.git; look at the code in src/persistent.

So how does this thing work? The short answer is that whenever you store an item in the cache, I also store it in my persistent layer. We don't want a dead slow cache, so the actual write to the persistent media happens asynchronously, and the application does not wait for it to complete. The drawback is that the application will never know if something failed while I tried to write the object to the persistent media. Whenever the user tries to get an object from the cache, I search the memory table first; if the object isn't there, I read it from the persistent media and return it to the caller.

I've decided to use SQLite for my persistence layer, so I created two extra threads in my engine:
  • SQLiteReader to read items from the database and store them in memory
  • SQLiteWriter to write items from memory to the database
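To make the writer side concrete before digging into memcached itself, here is a minimal standalone sketch of a background writer thread fed by a work queue. It is not the code from src/persistent: AsyncWriter, Item, and the std::map standing in for the SQLite table are all names I made up for illustration, and the shared_ptr simply stands in for whatever keeps an item alive while it waits in the queue.

```cpp
#include <cassert>
#include <condition_variable>
#include <map>
#include <memory>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>

// Hypothetical item type; the real engine has its own item structure.
struct Item {
    std::string key;
    std::string value;
};

// Background writer: store() only enqueues, a worker thread persists.
class AsyncWriter {
public:
    AsyncWriter() : worker_([this] { run(); }) {}
    ~AsyncWriter() { shutdown(); }

    // Called on the store path; returns without waiting for the "disk".
    void store(std::shared_ptr<const Item> item) {
        assert(item != nullptr);
        {
            std::lock_guard<std::mutex> guard(mutex_);
            queue_.push(std::move(item));
        }
        cond_.notify_one();
    }

    // Let the worker drain the queue, then stop it. Safe to call twice.
    void shutdown() {
        {
            std::lock_guard<std::mutex> guard(mutex_);
            if (done_) return;
            done_ = true;
        }
        cond_.notify_one();
        worker_.join();
    }

    // Stand-in for the database table; only read it after shutdown().
    const std::map<std::string, std::string>& persisted() const { return disk_; }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            cond_.wait(lock, [this] { return done_ || !queue_.empty(); });
            if (queue_.empty()) return;  // shutting down and nothing left
            std::shared_ptr<const Item> item = std::move(queue_.front());
            queue_.pop();
            lock.unlock();
            // A real engine would run its INSERT statement here.
            disk_[item->key] = item->value;
            lock.lock();
        }
    }

    std::mutex mutex_;
    std::condition_variable cond_;
    std::queue<std::shared_ptr<const Item>> queue_;
    std::map<std::string, std::string> disk_;
    bool done_ = false;
    std::thread worker_;  // declared last so the other members exist first
};
```

Note that store() has no way to report a failed write back to its caller, which is exactly the drawback mentioned above.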
Now let's look at the details. When you try to store (add, set, replace etc.) an item in memcached, you eventually end up in do_store_item in items.c. This is the first place we are going to make modifications. To keep the example simple I am not going to implement proper add, append, prepend and replace functions. To be specific, when one of those commands is called for an item that isn't in memory, I am not going to check whether the item exists in the persistent media. It should be pretty obvious how to implement that if you want it, so let's keep the example easy to understand. At the end of do_store_item we know whether the item was successfully inserted into memcached, and this is where I ship the item to my persistent layer:
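if (stored == ENGINE_SUCCESS) {
    /* Report the new CAS value back to the caller. */
    *cas = item_get_cas(&it->item);
    if (notify) {
        /* Queue the item for the SQLiteWriter thread. */
        sqlite_io_store_item(engine, it);
    }
}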
You will find the implementation of sqlite_io_store_item in sqlite.cc. If you look at the code there you will see that all it does is bump the reference counter for the item (to ensure that the object isn't evicted from the cache) and put the item into the work queue for the writer thread. With this simple modification to the default engine, every item I stored in the cache was also stored in the SQLite database. To verify that this worked, I created the SQLiteCacheWarmup class, which runs a simple "select * from kv;" and inserts the content into the cache during startup.

The above code works perfectly if your entire data set fits in memory, but once items start getting evicted you probably want to be able to asynchronously bring them back into the cache. To do that, I added the following code snippet to my implementation of the get function in the engine API:
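hash_item *it = item_get(engine, key, nkey);
if (it != NULL) {
    /* Found in memory: return it right away. */
    *item = &it->item;
    return ENGINE_SUCCESS;
} else {
    /* Not in memory: ask the SQLiteReader thread to load it. */
    sqlite_io_get_item(engine, cookie, key, nkey);
    return ENGINE_EWOULDBLOCK;
}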
The real magic here is the sqlite_io_get_item() function. It puts a request in the SQLiteReader thread's queue to load the item identified by the key. The SQLiteReader thread tries to read the item from the database and insert it into memory, and then calls notify_io_complete() from the engine interface when it is done (with either ENGINE_SUCCESS, or ENGINE_KEY_ENOENT if the key isn't in the database either).

Please note that the asynchronous interface in the core memcached server isn't fully implemented yet, so you need to pull my engine branch of the memcached server in order to try it out. Happy hacking :-)
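P.S. If you want to see the shape of that miss path without pulling the branch, here is a minimal standalone sketch. It is an illustration under my own assumptions, not the engine code: TwoLevelCache, Status, and the Notify callback (standing in for notify_io_complete(), whose real signature is not shown here) are hypothetical, and a std::map plays the role of the database.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <map>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>

// Hypothetical status codes mirroring the ones named in the post.
enum class Status { Success, KeyNotFound, WouldBlock };

class TwoLevelCache {
public:
    // Stand-in for notify_io_complete(); this signature is an assumption.
    using Notify = std::function<void(const std::string&, Status)>;

    TwoLevelCache(std::map<std::string, std::string> disk, Notify notify)
        : disk_(std::move(disk)), notify_(std::move(notify)),
          reader_([this] { run(); }) {}
    ~TwoLevelCache() { shutdown(); }

    // Memory hit: fill *value and return Success.
    // Miss: queue a background load and tell the caller to come back later.
    Status get(const std::string& key, std::string* value) {
        assert(value != nullptr);
        std::lock_guard<std::mutex> guard(mutex_);
        auto it = memory_.find(key);
        if (it != memory_.end()) {
            *value = it->second;
            return Status::Success;
        }
        pending_.push(key);
        cond_.notify_one();
        return Status::WouldBlock;
    }

    // Let the reader finish the queued loads, then stop it.
    void shutdown() {
        {
            std::lock_guard<std::mutex> guard(mutex_);
            if (done_) return;
            done_ = true;
        }
        cond_.notify_one();
        reader_.join();
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            cond_.wait(lock, [this] { return done_ || !pending_.empty(); });
            if (pending_.empty()) return;  // shutting down and nothing left
            std::string key = std::move(pending_.front());
            pending_.pop();
            Status status = Status::KeyNotFound;
            auto hit = disk_.find(key);
            if (hit != disk_.end()) {
                memory_[key] = hit->second;  // promote into the memory level
                status = Status::Success;
            }
            lock.unlock();
            notify_(key, status);  // called unlocked so it may call get()
            lock.lock();
        }
    }

    std::mutex mutex_;
    std::condition_variable cond_;
    std::queue<std::string> pending_;
    std::map<std::string, std::string> memory_;
    std::map<std::string, std::string> disk_;
    Notify notify_;
    bool done_ = false;
    std::thread reader_;  // declared last so the other members exist first
};
```

The caller that got WouldBlock simply retries get() once the callback reports Success; by then the item is in the memory level.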

2 comments:

  1. Trond,

    This and your previous post on an STL plugin engine are great, as was the associated Flash engine presentation. Exactly what I was looking for.

    However, I am missing something very basic here: I am not sure what to use as the base to plug the STL (or my own) engine into. The 1.4.4 memcached release branch does not seem to have engine.h and the associated support for this. I can find them in a few development repos, but I am hoping to build something production ready here.

    Is it currently possible to use the API for plugging in another storage engine for a production ready system, or do I need to wait for a future release?

    Thanks!

  2. Sorry, I didn't see this comment earlier.

    The 1.4.x series of memcached will not get the engine API. We are releasing that as a new branch (I would _guess_ 1.6). It is coming along well, so we should hopefully be able to push a new and updated branch relatively soon!
