Archive for November, 2006

Gupta-Sproull Antialiased Lines

Wednesday, November 29th, 2006

Six months ago, I managed to generate the following table. It’s not remotely relevant to Urbanspoon.

{ 0xc7, 0xc6, 0xc2, 0xbc,
  0xb3, 0xa9, 0x9c, 0x8e,
  0x7f, 0x70, 0x62, 0x54,
  0x46, 0x3a, 0x2f, 0x25,
  0x1c, 0x15, 0x0e, 0x09,
  0x05, 0x03, 0x01, 0x00 }

This is the coverage table for rendering antialiased lines using the Gupta-Sproull algorithm. Apparently, this little table is included in their original paper, first published in 1981. I couldn’t find the coverage table online. I eventually bought a copy of the paper from the ACM for $10, but it didn’t include the damned table!

Finally, I bit the bullet, sat down and read the paper. Then I hacked together a Java class to generate the table. I wasted two full days of my short life on those 24 numbers.

You’ll thank me later.

I spent two weeks trying to efficiently render antialiased lines as part of a consulting project for a consumer electronics company. During those two weeks I was completely obsessed with line rendering. I raved about it to my friends, long after they’d stopped listening. I dreamed of lines. I learned a lot about line rendering, most of which I’ve forgotten.

I eventually implemented Wu lines with a Gupta-Sproull style coverage table. In addition to the body pixel coverage table above, I also calculated an endpoint coverage table. Divide a pixel into a 16×16 grid. Move the fractional line endpoint to the nearest grid location, round the slope to the nearest 1/16, and then perform a lookup in the table. The lookup returns the coverage for the 6 pixels surrounding the endpoint. For those of you following along at home, the table is 16*16*16*6 bytes, or 24k.

The original paper dealt with integer (not fractional) endpoints. The authors suggested that it might be possible to support fractional endpoints by generating a table similar to the one I created, but that it would be too big for practical use. That was in 1981. Here in 2006, even portable consumer electronics devices can spare 24k. The Gupta-Sproull lines were much faster than agg’s lines, and looked just as good.

I came up with a great hack to generate the table. I drew some really fat lines in Java and then manually measured the coverage! A picture is worth a thousand words:

The moral of the story? A mediocre pixel tweaker like me needs to piggyback on 25 years of Moore’s Law in order to implement a basic line rendering algorithm.

Complex SQL Sorts with Rails/ActiveRecord

Saturday, November 18th, 2006

On Urbanspoon, we often need to efficiently sort a subset of records that were retrieved using a different sort order. For example, our Most Popular Restaurants in Seattle page first selects the top 100 restaurants, then allows the user to sort by name or price.

We need to perform a sort, then perform a secondary sort on a subset of the results. Here is an example of a two step sort in action:


unsorted

sort by popularity

sort by price, but
only the first 100

I tried various implementations:

  • Perform the second sort in Ruby. This is inelegant, inefficient, and impractical for expensive sorts like distance in miles.
  • Use a subselect to get the ids in your :conditions. Unfortunately, Mysql doesn’t support LIMIT in subselects. This also breaks sql_calc_found_rows, which we use with some of our complicated sorts.
  • First select the ids, then manually construct the :conditions from the ids.

After several fumbling attempts, I eventually settled on the last approach. This technique requires an additional query, but it let’s you use things like ActiveRecord’s :include, which is essential for some of our pages.

1. Perform the first sort with :limit, and grab the IDs

First, perform the sort with :limit and grab the IDs. For example:

r = Restaurant.find(:all,
                     :order => :popularity,
                     :limit => 100)
ids = r.map { |i| i.id }

Better yet, let’s just select the ids instead of populating the entire restaurant object:

r = Restaurant.find(:all,
                     :select => 'id',
                     :order => :popularity,
                     :limit => 100)
ids = r.map { |i| i.id }

If you’re a freak like me, you might want to get rid of some of that ActiveRecord overhead. Why bother creating those Restaurant objects at all? We can avoid creating Restaurant objects if we write a bit of SQL:

r = ActiveRecord::Base.connection.select_all("select id from restaurants order by popularity limit 100")
ids = r.map { |i| i['id'] }

Hm. There must be a better way. Can we use ActiveRecord to write the SQL, but avoid creating the restaurant objects? You bet, if we call send on ActiveRecord::Base’s construct_finder_sql method. This is perfect for my purposes, because my SQL skills are pretty weak. I can use ActiveRecord to write the SQL, but avoid the unnecessary overhead of creating all those objects.

options = {
  :select => 'id',
  :order => :popularity,
  :limit => 100
}
sql = Restaurant.send(:construct_finder_sql, options)
r = ActiveRecord::Base.connection.select_all(sql)
ids = r.map { |i| i['id'] }

2. Sort the subset

Now that we have the IDs of our subset, we can sort it using ActiveRecord:

Restaurant.find(:all,
                 :conditions => "id in (#{ids.join(',')})",
                 :order => 'price')

We can also take advantage of :include to populate our objects with everything we need.

You’ll Like It

The example above is a bit contrived, but I needed this technique to efficiently render many of the pages on Urbanspoon. Enjoy!

Rails SQL Logging Improvements

Thursday, November 9th, 2006

The logging system built into rails works pretty well out of the box. The development.log file contains timing information, SQL statements, and error traces. I especially like the ANSI color coding, which makes the file much easier to eyeball. Still, there is room for improvement. Here are a few changes I’ve made to my Rails logging setup while working on Urbanspoon.

In a future post maybe I’ll discuss how these stats exposed problematic pages. Several pages were transferring 20x more SQL data than needed. :include may be considered harmful.

Logging SQL bytes transferred per page

ActiveRecord is an interesting beast. It’s very easy to use, but doesn’t provide much in the way of caching. Associations can be eagerly loaded using :include, but how does this affect performance? Timing benchmarks are the ultimate arbiter, but I often want to know other statistics. How many queries (SQL roundtrips) did it take to render this page? How many SQL bytes were transferred for this page?

If you plug this snippet in verbatim, you will start seeing a few changes in your log files. First, each SQL select will be followed by a line that reads 21 rows, 8.3k to indicate how many rows/bytes were transferred for the select. Second, the familiar ActionController timing statements will include the number of SQL bytes transferred:

Completed in … | Rendering: … | DB: 0.01312 (3%) 23.6k | …

A few things to note while reading this code:

  • SQL statistics are only turned on for the development environment.
  • It only works for mysql, though I’m sure the same technique can be used for the other adapters.
  • Minor fudging… I’m reporting “string bytes returned by mysql select”, not actual bytes transferred on the wire. If anyone has a suggestion for a simple way to get at the latter I’m all ears.
  • As always with mixins, this isn’t guaranteed to work with all versions of Rails. I’m using 1.1.6.

Add this code to environment.rb:

# only run this code in development
if ENV['RAILS_ENV'] == 'development'

  # modify MysqlAdapter to track transfer stats
  class ActiveRecord::ConnectionAdapters::MysqlAdapter
    @@stats_queries = @@stats_bytes = @@stats_rows = 0

    def self.get_stats
      { :queries => @@stats_queries,
        :rows => @@stats_rows,
        :bytes => @@stats_bytes }
    end

    def self.reset_stats
      @@stats_queries = @@stats_bytes = @@stats_rows = 0
    end

    def select_with_stats(sql, name)
      bytes = 0
      rows = select_old(sql, name)
      rows.each do |row|
        row.each do |key, value|
          bytes += key.length
          bytes += value.length if value
        end
      end
      @@stats_queries += 1
      @@stats_rows += rows.length
      @@stats_bytes += bytes
      @logger.info sprintf("%d rows, %.1fk", rows.length, bytes.to_f / 1024)
      rows
    end

    alias :select_old :select
    alias :select :select_with_stats
  end

  # modify ActionController to reset/print stats for each request
  class ActionController::Base
    def perform_action_reset
      ActiveRecord::ConnectionAdapters::MysqlAdapter::reset_stats
      perform_action_old
    end

    alias :perform_action_old :perform_action
    alias :perform_action :perform_action_reset

    def active_record_runtime(runtime)
      stats = ActiveRecord::ConnectionAdapters::MysqlAdapter::get_stats
      "#{super} #{sprintf("%.1fk", stats[:bytes].to_f / 1024)}"
    end
  end
end

Adding SQL bytes transferred to your layout

For added fun, add this to the bottom of your application.rhtml layout file. This is a technique we used at Jobster to provide immediate stats to developers.

<% if ENV['RAILS_ENV'] == 'development'
   stats = ActiveRecord::ConnectionAdapters::MysqlAdapter::get_stats %>
  <%= sprintf("  (%.1fk, %d queries)", stats[:bytes].to_f / 1024, stats[:queries]) %>
<% end %>

Suppress blob logging

You may have noticed that Rails likes to dump your SQL blobs to the log file. This will quickly cause your log file to balloon to gargantuan proportions. If you’re especially unlucky, you’ll run out of disk space and you might be forced out of business entirely. I recommend you add this to your environment.rb file immediately:

# trim blob logging
class ActiveRecord::ConnectionAdapters::MysqlAdapter
  def format_log_entry(message, dump = nil)
    if dump
      dump = dump.gsub(/x'([^']+)'/) do |blob|
        (blob.length > 32) ? "x'#{$1[0,32]}... (#{blob.length} bytes)'" : $0
      end
    end
    super
  end
end

Again, I’ve only tested this with mysql and rails 1.1.6.