25 March 2007


Indexers indexing infinitely ... like monkeys

Three ideas have merged.

First, there's the idea I published last December as "A needle in a haystack with 100,000,000 blades," where I argued that the Web, or an approximation thereof, could be indexed by humans for a reasonable amount of money.

Second, there's The New York Times article "Artificial Intelligence, With Help From the Humans," in which we learn that the Amazon Mechanical Turk service subcontracts human workers to perform tasks that are especially challenging for computers to accomplish, such as matching images to textual descriptions. For some jobs, Turkworkers might make one penny per transaction.

And finally, there's the infinite monkey theorem, which states that a monkey hitting keys at a typewriter for an infinite amount of time will "almost surely" type the complete works of Shakespeare, or something similar. I first heard this idea as "a million monkeys and a million years," but I bet the math's a bit different. After all, "infinite" is much bigger than a million million.
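For the curious, here is a rough sketch of how different the math is for the finite version. It rests on assumptions that are mine, not the theorem's: a 27-key typewriter (26 letters plus a space bar), one keystroke per monkey per second, and output chopped into independent tries the length of the target phrase. The function name chance_of_typing is made up for illustration.

```python
import math

def chance_of_typing(target_len, alphabet_size, attempts):
    """Chance that at least one of `attempts` independent random tries
    reproduces a specific target of `target_len` keystrokes."""
    # Chance that one random try gets every keystroke right.
    p = math.exp(-target_len * math.log(alphabet_size))
    if p >= 1.0:
        return 1.0
    # 1 - (1 - p)**attempts, via log1p/expm1 so a tiny p is not rounded away.
    return -math.expm1(attempts * math.log1p(-p))

# Assumed setup: a million monkeys, a million years, one keystroke per second.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60
phrase = "to be or not to be"
keystrokes = 1_000_000 * 1_000_000 * SECONDS_PER_YEAR
attempts = keystrokes // len(phrase)

print(chance_of_typing(len(phrase), 27, attempts))
```

Try longer phrases and the result collapses toward zero very quickly, which is why the theorem needs "infinite" rather than "a million million."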

Putting these ideas together seems to provide a rather obvious solution: third-world indexers. After all, if it costs only a nickel to get someone to write a few keywords for something, we can get a lot of indexing done very cheaply; I say "third world" because no indexer I've ever known is willing to work for a penny per word.

The indexing industry is facing the very real possibility that our workload will be taken from us and delivered to those in economies that allow lower prices. But what if we went a step further and, instead of looking for less expensive indexers with good qualifications, we decided to look for dirt-cheap indexers with no qualification other than time to waste? What if, I ask, we had monkeys pound away at their keyboards?

I find the idea amusing but too close to the truth. After all, the intelligence behind Google is the social intelligence, the uneven and culturally biased workings of millions of Internet users plugging away at their disparate tasks. What Mechanical Turk has going for it, then, is the human decision making at the back end. Whereas most search engines look for better and greater stores of metadata with which to judge content, one man in a back room can make smarter decisions upon command. No, the real problem is that today's human intelligence is worth only pennies per word. Computers do their best, and humans sweep up afterwards. Our natural intelligence isn't worth a whole lot, I guess.

That's how we know computers are smart. Computers own us monkeys.

Hi Seth,

Even though you have not met me personally, we kind of know each other 'online'. However, despite being Jamaican and therefore labelled 'third world', I surely would never index for a penny per word.

Great blog by the way, it's added to my favourites' list.
