Archive for February, 2007

Rails expire_fragment(regex) Considered Harmful

Sunday, February 4th, 2007

Recently I discovered that one of our Urbanspoon actions was taking nearly two seconds to complete. This particular page stuck out like a sore thumb once I start crunching the numbers contained in our production log file. The slowness was especially puzzling because the action in question seemed to be one of the simplest actions in the entire application. I quickly determined that:

  • The slowdown wasn’t in page rendering.
  • The slowdown wasn’t coming from the db.
  • I couldn’t make it happen on my dev box.
  • It occurred even if the production machine wasn’t busy.

So, where was the problem?

In my previous Ruby at 60 post, I mentioned that Ruby’s dynamic nature can make it hard to figure out what’s happening beneath the covers. This debugging session was a perfect example. Over the course of an hour or two, I laboriously inserted benchmarking code into various bits of Rails. First validation. Then ActiveRecord callbacks. Maybe the problem was inside the logger? Unfortunately, it’s not easy to find all the major pieces of Rails that affect each request.

Eventually I was able to trace the slowdown to the following line of code in one of our cache sweepers:

expire_fragment(/base\/xyz.*/)

At the moment we use File-based Fragment Caching to speed up Urbanspoon. Some of our pages have a cached section at the top (base/xyz_top), a dynamic section in the middle, and another cached section at the bottom (xyz_bottom). When the underlying data changes, a cache sweeper would jump in and expire the two cached sections using a regex, base/xyz.*.

Silly me, for some reason I thought that FileStore would expire the regex as follows:

  1. Look in the base directory.
  2. Find all files that match xyz.*
  3. Delete them.

I couldn’t be more wrong. Instead, the code in UnthreadedFileStore works more like this:

  1. Iterate every single file in the fragment cache.
  2. Delete files which match base/xyz.*

Our production server’s fragment cache usually contains in excess of 5,000 cached fragment files. Every time this action was invoked we were iterating all of them. Ouch!

The bug was easy to fix – simply replace the regex with two separate calls to expire_fragment, one for the top fragment and one for the bottom. Somewhere in the back of my mind I knew I’d have to make this change eventually, since we’ll be switching to memcached in the not-so-distant future. I just didn’t anticipate the fire drill.

Anyway, take my advice. Avoid expire_fragment(regex). It’s seductive if you have multiple fragments to expire, but it’ll cost you in the long run.