Complex SQL Sorts with Rails/ActiveRecord

On Urbanspoon, we often need to efficiently sort a subset of records that were retrieved using a different sort order. For example, our Most Popular Restaurants in Seattle page first selects the top 100 restaurants, then allows the user to sort by name or price.

We need to perform a sort, then perform a secondary sort on a subset of the results. Here is an example of a two step sort in action:


unsorted

sort by popularity

sort by price, but
only the first 100

I tried various implementations:

  • Perform the second sort in Ruby. This is inelegant, inefficient, and impractical for expensive sorts like distance in miles.
  • Use a subselect to get the ids in your :conditions. Unfortunately, Mysql doesn’t support LIMIT in subselects. This also breaks sql_calc_found_rows, which we use with some of our complicated sorts.
  • First select the ids, then manually construct the :conditions from the ids.

After several fumbling attempts, I eventually settled on the last approach. This technique requires an additional query, but it let’s you use things like ActiveRecord’s :include, which is essential for some of our pages.

1. Perform the first sort with :limit, and grab the IDs

First, perform the sort with :limit and grab the IDs. For example:

r = Restaurant.find(:all,
                     :order => :popularity,
                     :limit => 100)
ids = r.map { |i| i.id }

Better yet, let’s just select the ids instead of populating the entire restaurant object:

r = Restaurant.find(:all,
                     :select => ‘id’,
                     :order => :popularity,
                     :limit => 100)
ids = r.map { |i| i.id }

If you’re a freak like me, you might want to get rid of some of that ActiveRecord overhead. Why bother creating those Restaurant objects at all? We can avoid creating Restaurant objects if we write a bit of SQL:

r = ActiveRecord::Base.connection.select_all("select id from restaurants order by popularity limit 100")
ids = r.map { |i| i[‘id’] }

Hm. There must be a better way. Can we use ActiveRecord to write the SQL, but avoid creating the restaurant objects? You bet, if we call send on ActiveRecord::Base’s construct_finder_sql method. This is perfect for my purposes, because my SQL skills are pretty weak. I can use ActiveRecord to write the SQL, but avoid the unnecessary overhead of creating all those objects.

options = {
  :select => ‘id’,
  :order => :popularity,
  :limit => 100
}
sql = Restaurant.send(:construct_finder_sql, options)
r = ActiveRecord::Base.connection.select_all(sql)
ids = r.map { |i| i[‘id’] }

2. Sort the subset

Now that we have the IDs of our subset, we can sort it using ActiveRecord:

Restaurant.find(:all,
                 :conditions => "id in (#{ids.join(’,')})",
                 :order => ‘price’)

We can also take advantage of :include to populate our objects with everything we need.

You’ll Like It

The example above is a bit contrived, but I needed this technique to efficiently render many of the pages on Urbanspoon. Enjoy!

3 comments to “Complex SQL Sorts with Rails/ActiveRecord”

  1. Comment by Arrix:

    You may want to use select_values() so that you don’t need to map.

  2. Comment by Sudden:

    Try

    r = Restaurant.find(:all,
    :select => ‘group_concat(id) as ids’,
    :group => “‘a’”,
    :order => :popularity,
    :limit => 100)

    to get the already comma-separated ID-list returned by the mysql-server.
    r[:ids] then contains that string.

    You need to make sure mysqls group_concat_max_len is large enough.

  3. Comment by Sudden:

    :group => “‘a’” (String ‘a’) makes sure, that all found rows are grouped together