Page:Aaron Swartz s A Programmable Web An Unfinished Work.pdf/31


CHAPTER 3

Building for Search Engines: Following REST

Let’s talk about vacuum cleaners. It’s an all-too-common story. You’ve got a nice shiny new apartment, but it doesn’t stay that way for long. Dust falls on the floor, crumbs roll off your plate, flotsam, jetsam, and the little pieces from Jetsons’ toys begin to clutter your path. It’s time to clean.

Sweeping is fun at first—it gives you a little time to get lost in thought about your web application while you’re doing an ostensibly-useful repetitive-motion activity—but soon you grow tired of it. But liberal guilt and those Barbara Ehrenreich articles you read make you resistant to hiring a maid. So instead of importing a hard-up girl from a foreign country to do your housework, you hire a robot.

Now here’s the thing about robots (and some maids, for that matter): it’s not at all clear to them what is trash and what is valuable. They (the robots) wander around your house trying to suck things up, but on their way they might leave tire-treads on your manuscript, knock over your priceless vase, or slurp up your collection of antique coins. And sometimes one gets caught on the pull-cord for the blinds, causing it to go in circles while pulling the shutters open.

So you take precautions—before you run the robot, you pick the cords off the floor and move your manuscript to your desk and take care not to leave your pile of rare coins in the corner. You make sure the place is set up so that the robot can do its job without doing any real damage.

It’s exactly the same on the Web. (Except without the dust, crumbs, Jetsons, maids, tire treads, vases, coins, or blinds.) Robots (largely from search engines, but others come from spammers, offline readers, and who knows what else) are always crawling your site, leaving no nook or cranny unexplored, vacuuming up anything they can find. And unlike the household variety, you cannot simply unplug them—you really have to be sure to keep things clean.[1]

  1. Although see http://ftrain.com/robot_exclusion_protocol.html.