Rails expire_fragment(regex) Considered Harmful
Recently I discovered that one of our Urbanspoon actions was taking nearly two seconds to complete. This particular page stuck out like a sore thumb once I start crunching the numbers contained in our production log file. The slowness was especially puzzling because the action in question seemed to be one of the simplest actions in the entire application. I quickly determined that:
- The slowdown wasn’t in page rendering.
- The slowdown wasn’t coming from the db.
- I couldn’t make it happen on my dev box.
- It occurred even if the production machine wasn’t busy.
So, where was the problem?
In my previous Ruby at 60 post, I mentioned that Ruby’s dynamic nature can make it hard to figure out what’s happening beneath the covers. This debugging session was a perfect example. Over the course of an hour or two, I laboriously inserted benchmarking code into various bits of Rails. First validation. Then ActiveRecord callbacks. Maybe the problem was inside the logger? Unfortunately, it’s not easy to find all the major pieces of Rails that affect each request.
Eventually I was able to trace the slowdown to the following line of code in one of our cache sweepers:
expire_fragment(/base\/xyz.*/)
At the moment we use File-based Fragment Caching to speed up Urbanspoon. Some of our pages have a cached section at the top (base/xyz_top), a dynamic section in the middle, and another cached section at the bottom (xyz_bottom). When the underlying data changes, a cache sweeper would jump in and expire the two cached sections using a regex, base/xyz.*.
Silly me, for some reason I thought that FileStore would expire the regex as follows:
- Look in the base directory.
- Find all files that match xyz.*
- Delete them.
I couldn’t be more wrong. Instead, the code in UnthreadedFileStore works more like this:
- Iterate every single file in the fragment cache.
- Delete files which match base/xyz.*
Our production server’s fragment cache usually contains in excess of 5,000 cached fragment files. Every time this action was invoked we were iterating all of them. Ouch!
The bug was easy to fix - simply replace the regex with two separate calls to expire_fragment, one for the top fragment and one for the bottom. Somewhere in the back of my mind I knew I’d have to make this change eventually, since we’ll be switching to memcached in the not-so-distant future. I just didn’t anticipate the fire drill.
Anyway, take my advice. Avoid expire_fragment(regex). It’s seductive if you have multiple fragments to expire, but it’ll cost you in the long run.
Friday, March 2nd 2007 at 1:39 pm
Any way to force it to behave like glob instead of a stupid regex? Unix solved making globbing fast about 30 years ago.
Monday, March 5th 2007 at 12:34 pm
Ruby does have a glob method in the Dir class:
http://www.ruby-doc.org/core/classes/Dir.html#M002374
Couldn’t you just use that to get the list of files (hopefully quickly) and then call expire_fragment on the results?
Friday, November 16th 2007 at 7:07 pm
glob() is better than that but still not fast - it just gets all the directory entries and does an fnmatch() on each one.
On my ~1-year-old CentOS 5 Linux machine with warm cache, I just timed a non-matching glob() at 2 us per directory entry. Each unlink() also takes about 2 us. So that’s an unnecessary 10 ms if you have 5,000 cache entries, or an unnecessary 200 ms if you have 100,000. Much better to just unlink() the entries that might be there than look at every entry first.