Thursday, December 24, 2009

2009

2009 has been a really interesting year for me. We didn't have much snow during the holiday season, but during the first weeks of January we got a lot, causing complete chaos in the traffic. One day I spent two hours driving to work and two hours driving home (it normally takes me only 20-30 minutes), so I decided to work from home until things got back to normal.
Eric Lambert visited from California in February to start planning our next sprint on our project. "Unfortunately" he didn't get to see any extreme winter weather, but it was nice to be able to work face to face with my team-mate instead of only discussing on IRC.
Unfortunately for me I can't spend as much time as I want with my brother and his family, because they live in California. I have however been so lucky the last few years that I have been able to stay at his place for a month and work from there. This year was no exception, but this time I was also giving a presentation at the MySQL Users Conference. The users conference is a highlight of the year for me, because I meet so many good friends from all over the world there.
I've seen a lot of movies and heard a lot about Alcatraz, but I had never found the time to go there. It was therefore a really pleasant surprise when Matt Ingenthron asked me if we should go there one weekend. Everything there was much smaller than I had imagined (the cells are really tiny). The weather was really nice that day, but it was kind of cold there anyway. I can really imagine how it must have been to stay there on a winter's night...
During the Users Conference I also left Sun's Database Group and joined a fantastic team led by Lee Bieber (aka the Drizzle team).
We spent the summer vacation in Oslo this year, going to amusement parks and museums. One of the highlights this year was my cousin's wedding in Geiranger. We went on a three-hour boat trip in the fjords before we arrived at the hotel, so I got to see "syv søstre" and "friaren" up close. That's a memory for life. The view from my hotel window was really awesome!
In September I went to Seattle for a team meeting, and it was really great to see the rest of the team again (You can see the team at http://www.flickr.com/photos/brianaker/3834956962/ ). Working from Norway I spend a lot of time talking with them on telephone / IRC, so it is really great to meet them face to face once in a while. Luckily for me I got to meet other friends during my stay there. Dustin Sallings, Steve Yen and Patrick Galbraith from NorthScale joined in on the open events to discuss the community work on Drizzle / Gearman and Memcached.
Back in Trondheim I continued the renovation of my house. I am going to build two rooms in the garage attached to my house, so I started by tearing down the wall and constructing a new one. I'm 99% done with everything on the outside now, but I need to find time to finish up inside.
This was a quick summary of 2009, and I am pretty sure that 2010 will become even more exciting!!

Sunday, December 20, 2009

Persistent storage engine

A lot of people keep asking about persistent storage engines for memcached, so I thought I should create an example to show how easy it is to write a storage engine that stores the items on disk for persistence. Please note that this is an example of how to do it, not a version highly tuned for performance (that would be up to you to implement ;-))
You wouldn't want to access the filesystem every time you access an item, so I'm going to beef the example up a bit by creating a two-level cache. All items will eventually be written to disk, but I will serve all of my items from memory. You might think that this sounds like a lot of work to implement, but you couldn't be more wrong about that. The way the default storage engine in memcached is implemented makes it a perfect starting point for us. I've pushed the source code I'm going to discuss to git://github.com/trondn/memcached-engines.git, and you should look at the source code in src/persistent.
So how does this thing work? The short answer is that whenever you store an item in the cache, I will also store the item in my persistent layer. We don't want to create a dead slow cache, so I do the actual write to the persistent media asynchronously, and the application does not have to wait for the data to reach the persistent media. The drawback of this is that the application will never know if anything failed while I tried to write the object to the persistent media.
Whenever the user tries to get an object from the cache, I'll search the memory table first, and if it isn't there, I'll read it from the persistent media and return it to the caller. I've decided to use SQLite for my persistence layer, so I created two extra threads in my engine:
  • SQLiteReader to read items from the database and store them in memory
  • SQLiteWriter to write items from memory to the database
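The writer side of this design is a classic write-behind queue. The following is a minimal sketch of that pattern in C++, not the code from src/persistent: the class name WriteBehindWriter is made up for illustration, and a std::map stands in for the SQLite database so the example stays self-contained.

```cpp
#include <cassert>
#include <condition_variable>
#include <map>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>

// Assumption: a plain in-memory map stands in for the SQLite database.
using BackingStore = std::map<std::string, std::string>;

// Hypothetical write-behind worker: callers enqueue and return at once,
// a background thread drains the queue into the backing store.
class WriteBehindWriter {
public:
    explicit WriteBehindWriter(BackingStore &store)
        : store_(store), worker_(&WriteBehindWriter::run, this) {}

    // Flushes whatever is still queued, then stops the worker thread.
    ~WriteBehindWriter() { shutdown(); }

    WriteBehindWriter(const WriteBehindWriter &) = delete;
    WriteBehindWriter &operator=(const WriteBehindWriter &) = delete;

    // Request path: only touches the queue, never the backing store.
    void enqueue(std::string key, std::string value) {
        {
            std::lock_guard<std::mutex> guard(mutex_);
            assert(!stopping_);  // enqueueing after shutdown is a caller bug
            queue_.emplace(std::move(key), std::move(value));
        }
        cond_.notify_one();
    }

    // Safe to call more than once; only the first call joins the worker.
    void shutdown() {
        {
            std::lock_guard<std::mutex> guard(mutex_);
            if (stopping_) return;
            stopping_ = true;
        }
        cond_.notify_one();
        worker_.join();
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            cond_.wait(lock, [this] { return stopping_ || !queue_.empty(); });
            if (queue_.empty()) return;  // stopping and nothing left to write
            auto job = std::move(queue_.front());
            queue_.pop();
            lock.unlock();               // do the slow write outside the lock
            store_[job.first] = job.second;
            lock.lock();
        }
    }

    BackingStore &store_;
    std::mutex mutex_;
    std::condition_variable cond_;
    std::queue<std::pair<std::string, std::string>> queue_;
    bool stopping_ = false;
    std::thread worker_;  // declared last so it starts after the other members
};
```

Note that enqueue() has no way to report a failed write back to the caller, which is exactly the drawback described above. The sketch only reads the backing store after shutdown() has returned, so the store itself needs no extra locking here.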
Now let's look at the details. When you try to store (add, set, replace etc.) an item in memcached, you will eventually end up in do_store_item in items.c. This is the first place we are going to make some modifications. To keep the example simple I am not going to implement proper add, append, prepend and replace functions (to be specific, I am not going to check whether the item exists in the persistent media when it isn't located in memory at the time you call the specific command). It should be pretty obvious how to implement that if you want it, so let's rather keep the example easy to understand. At the end of do_store_item we know if the item was successfully inserted into memcached, and this is where I ship the item to my persistent layer:
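if (stored == ENGINE_SUCCESS) {
    /* Report the CAS value of the stored item back to the caller */
    *cas = item_get_cas(&it->item);
    if (notify) {
        /* Queue the item for the writer thread; the database write
         * itself happens asynchronously */
        sqlite_io_store_item(engine, it);
    }
}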
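Handing an item to a background writer only works if the item stays alive until the writer has dealt with it. One common way to guarantee that is to pin the item with a reference count while it sits in the queue. Here is a minimal single-threaded sketch of that idea. It is not the code from the repository: Item, PinnedCache and flush_one are hypothetical names, and flush_one simply plays the part of one iteration of a writer thread.

```cpp
#include <cassert>
#include <deque>
#include <map>
#include <string>

// Hypothetical item type; the real hash_item in memcached looks different.
struct Item {
    std::string key;
    std::string value;
    int refcount = 0;  // > 0 means "in use, must not be evicted"
};

class PinnedCache {
public:
    // Store in memory and queue the item for the (simulated) writer.
    void store(const std::string &key, const std::string &value) {
        Item &it = items_[key];
        it.key = key;
        it.value = value;
        ++it.refcount;            // pin until the writer has flushed it
        pending_.push_back(key);
    }

    // Evict only items nobody holds a reference to.
    // Returns true if the item was removed.
    bool evict(const std::string &key) {
        auto found = items_.find(key);
        if (found == items_.end() || found->second.refcount > 0) {
            return false;
        }
        items_.erase(found);
        return true;
    }

    // One step of what a writer thread would do: write the oldest queued
    // item to `disk` and release the pin. Returns false if nothing is queued.
    bool flush_one(std::map<std::string, std::string> &disk) {
        if (pending_.empty()) return false;
        // at() cannot fail: queued items are pinned and cannot be evicted.
        Item &it = items_.at(pending_.front());
        pending_.pop_front();
        disk[it.key] = it.value;
        assert(it.refcount > 0);
        --it.refcount;
        return true;
    }

private:
    std::map<std::string, Item> items_;
    std::deque<std::string> pending_;
};
```

In this sketch an item that is stored twice before being flushed is pinned twice and flushed twice, which is wasteful but keeps the bookkeeping trivial. A real engine would also need a lock around the reference count, since the writer runs on its own thread.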
You will find the implementation of sqlite_io_store_item in sqlite.cc, and if you look at the code there you will see that all it does is bump the reference counter for the item (to ensure that the object isn't evicted from the cache) and put the item into the work queue for the writer thread. With this simple modification to the default engine, every item I stored in the cache was also stored in the SQLite database. To verify that this worked, I created the SQLiteCacheWarmup class, which does a simple "select * from kv;" and inserts the content into the cache during startup.
The above code works perfectly if your entire cache set fits in memory, but if you start to evict items you would probably want to be able to asynchronously get items back into the cache again. What I did here was to add the following code snippet to my implementation of the get function in the engine API:
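hash_item *it = item_get(engine, key, nkey);
if (it != NULL) {
    /* Found in memory: return it right away */
    *item = &it->item;
    return ENGINE_SUCCESS;
} else {
    /* Not in memory: ask the reader thread to load it, and tell the
     * core that the result will arrive later */
    sqlite_io_get_item(engine, cookie, key, nkey);
    return ENGINE_EWOULDBLOCK;
}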
The real magic here is the sqlite_io_get_item() function. It puts a request in the SQLiteReader thread's queue to load the item identified by the key. The SQLiteReader thread tries to read the item from the database and insert it into memory, and then calls notify_io_complete() from the engine interface when it is done (with either ENGINE_SUCCESS, or ENGINE_KEY_ENOENT if the key isn't in the database either).
Please note that the asynchronous interface in the core memcached server isn't fully implemented yet, so you need to pull my engine branch of the memcached server in order to try it out. Happy hacking :-)
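For readers who want to see the "would block, then notify" flow in isolation, here is a minimal sketch under stated assumptions. It does not use the memcached engine API: Status, TwoLevelCache and reader_step are hypothetical names, the completion callback stands in for notify_io_complete(), a std::map stands in for SQLite, and the reader's work is exposed as a plain function so the sequence can be followed deterministically (in the engine it runs on its own thread).

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Local stand-in for the engine status codes; not the real enum.
enum class Status { Success, KeyNotFound, WouldBlock };

class TwoLevelCache {
public:
    using Completion = std::function<void(Status)>;
    using Table = std::map<std::string, std::string>;

    explicit TwoLevelCache(Table disk) : disk_(std::move(disk)) {}

    // Fast path: serve from memory. Slow path: queue a load request and
    // tell the caller to retry once `done` has been invoked with Success.
    Status get(const std::string &key, std::string &value, Completion done) {
        assert(static_cast<bool>(done));  // a callback is required
        auto hit = memory_.find(key);
        if (hit != memory_.end()) {
            value = hit->second;
            return Status::Success;
        }
        requests_.emplace_back(key, std::move(done));
        return Status::WouldBlock;
    }

    // One iteration of what a reader thread would do.
    // Returns false if there was no request to process.
    bool reader_step() {
        if (requests_.empty()) return false;
        auto request = std::move(requests_.front());
        requests_.pop_front();
        auto row = disk_.find(request.first);
        if (row == disk_.end()) {
            request.second(Status::KeyNotFound);
        } else {
            memory_[row->first] = row->second;  // insert before notifying
            request.second(Status::Success);
        }
        return true;
    }

private:
    Table memory_;  // first level: served directly
    Table disk_;    // second level: stand-in for the database
    std::deque<std::pair<std::string, Completion>> requests_;
};
```

The caller is expected to retry get() only after being notified with Success. On KeyNotFound it should report a miss instead, because a retry would just queue the same lookup again.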