Google in Action and Other Graphs
Tuesday, January 23rd, 2007
In my endless quest to learn more about Urbanspoon’s explosive growth, I put together a script to generate graphs illustrating various aspects of our traffic. There are many interesting questions that we can now answer:
- Which cities are getting the most traffic from search engines?
- How long does it take Google to index a new Urbanspoon city?
- etc.
The script periodically crunches our logs offline and creates graphs using the excellent (but cryptic) rrdtool. The graphs are regenerated on the hour. AWStats is nice, but sometimes you have to dig in and get your hands dirty. We also use munin to keep an eye on our hardware.
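For the curious, here is a minimal sketch of the kind of log crunching involved. It is not the actual script. It assumes the web server writes the common "combined" log format and that each city lives under its own URL prefix; the prefixes in `CITIES` and the function name are hypothetical, chosen only for illustration.

```python
import re
from collections import Counter
from datetime import datetime

# Assumption: logs are in the Apache/nginx "combined" format.
LOG_RE = re.compile(
    r'\S+ \S+ \S+ \[(?P<ts>[^\]]+)\] "(?:GET|POST|HEAD) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Hypothetical URL prefixes, one per city.
CITIES = ("seattle", "chicago", "new-york")


def googlebot_hits_per_minute(lines):
    """Count Googlebot requests per (minute, city) bucket."""
    counts = Counter()
    for line in lines:
        m = LOG_RE.match(line)
        if not m or "Googlebot" not in m.group("agent"):
            continue
        # First path segment is treated as the city (an assumption).
        city = m.group("path").lstrip("/").split("/", 1)[0]
        if city not in CITIES:
            continue
        ts = datetime.strptime(m.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
        counts[(ts.replace(second=0), city)] += 1
    return counts
```

Each per-minute count could then be handed to `rrdtool update` to feed the graphs.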
GoogleBot
Below is a recent snapshot of GoogleBot crawling Urbanspoon. Green is Seattle, blue is Chicago, and red is New York. X axis is time, Y axis is pages per minute. I’ve removed the Y labels to obscure our actual numbers.

Notice the flat tops on each bulge of GoogleBot traffic: GoogleBot caps its crawl rate at a certain number of pages per minute. Over the past few weeks they’ve been gradually ramping up the rate at which they crawl Urbanspoon. Perhaps they looked at our response times and concluded that our site can handle it. Also note that they’re running out of Seattle pages to crawl. Strangely, GoogleBot tends to go to sleep around midnight PST.
Not everyone at Google is so polite. For a brief period last week Google’s mobile crawler was hitting our site with over 7000 requests per hour.
Other Robots
GoogleBot hits us far more than any other robot. To put this in perspective, here is a graph of noticeable robots hitting Urbanspoon recently. GoogleBot dominates. Maybe Yahoo should just throw in the towel and start using Nutch for their crawls.
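A comparison like this can be approximated by tallying requests per crawler from the User-Agent field. The sketch below assumes the combined log format, where the user agent is the last quoted field on each line. The token table is illustrative and far from a complete list of robots, and `robot_hits` is a hypothetical name.

```python
from collections import Counter

# Illustrative substrings seen in crawler User-Agent strings (not exhaustive).
ROBOT_TOKENS = {"googlebot": "Google", "slurp": "Yahoo", "msnbot": "MSN"}


def robot_hits(lines):
    """Count requests per known robot, keyed by a friendly name."""
    counts = Counter()
    for line in lines:
        # Assumption: the user agent is the last double-quoted field.
        parts = line.rsplit('"', 2)
        if len(parts) < 3:
            continue
        agent = parts[1].lower()
        for token, name in ROBOT_TOKENS.items():
            if token in agent:
                counts[name] += 1
                break
    return counts
```

Note that this lumps variants such as Google's mobile crawler in with the main GoogleBot, since both carry the same substring.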

Traffic
Our referral rate from Google is increasing rapidly but not uniformly. For example, the graph below indicates that we have more work to do in Chicago:
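Per-city referral counts of this kind can be pulled from the Referer field of the logs. This is a rough sketch under the same assumptions as before: combined log format, and a hypothetical URL layout in which the first path segment names the city. The hostname check is deliberately loose.

```python
from collections import Counter
from urllib.parse import urlparse


def google_referrals_by_city(lines):
    """Count page views referred by a Google host, keyed by city prefix."""
    counts = Counter()
    for line in lines:
        # Splitting on quotes yields:
        # [prefix, request, status/size, referrer, ' ', agent, tail]
        fields = line.split('"')
        if len(fields) < 7:
            continue
        request, referrer = fields[1], fields[3]
        host = urlparse(referrer).hostname or ""
        # Loose check; it would also match unrelated hosts containing "google.".
        if "google." not in host:
            continue
        parts = request.split()
        if len(parts) < 2:
            continue
        # Assumption: the first path segment identifies the city.
        city = parts[1].lstrip("/").split("/", 1)[0]
        counts[city] += 1
    return counts
```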
