27 August 2006
Information is owned by the few
Then came the middlemen, resellers like Sears, who discovered that if you brought a number of competing products into the same showroom, customers came to that showroom to make an educated decision. No longer convinced to buy from one manufacturer, you could shop among several models. This was how resellers made their money: providing you a service you'd pay extra money for. Products and manufacturers that failed to compete well in side-by-side arrangements were eliminated in the face of consumer choice.
And finally came the Internet. The World Wide Web provided you with not only all the same information the resellers had, but much more: professional and amateur reviews, community-level and industry-specific emails filled with recommendations and warnings, and manufacturers' contact information in case you had questions. Now you could shop intelligently around the world. Much of the resale industry was demolished, now that its services paled in comparison to what consumers could do themselves. Look at the fate of independent bookstores, which all but vanished in a wired world where consumers read reviews and compare prices among Amazon.com, BN.com, and Borders.com, only to buy the book from an Internet-based reseller with massively discounted prices. Travel agents, too, disappeared in the face of Expedia.com and Travelocity.com.
It is thus believed that the Internet has empowered the individual.
Not true. I'm sorry to say that it's all an illusion.
First of all, the online resellers are no better than the brick-and-mortar resellers. After browsing the options available at an online travel agency, it's often cheaper to then go to the airline site itself to buy your tickets. For example, if I want to fly from Boston to San Francisco, I'll plug my dates into a search engine at Expedia (and later, Travelocity), find the cheapest option at the best times, and then buy my ticket at Delta.com, Southwest.com, or another airline, or perhaps call a human travel agent after all at AAA and start again. As long as I have access to the source of that service or product -- the manufacturer, the service provider, etc. -- the reseller is a source of information without sale.
Second, the online resellers are limited in scope. Thanks to partnerships and other marketing choices, not all of my options are provided. For example, both Expedia and Travelocity tend to overlook small, unaffiliated airlines. Additionally, at one time (and perhaps still today) Expedia charged extra money if I wanted to buy a ticket for USAir flights, without telling me. The bottom line is that going through an online reseller is not necessarily more comprehensive or cheaper than my other options.
Biggest of all, however, is that to perform ANY search these days, I'm going to have to use a search engine, like Google.
Without even getting into problems with spam, search engines are responsible for providing me with the information I'll need to do anything on the Web, if I don't already know precisely how and with whom to do it myself. Google is the next Sears. If I want to find some good choices for a boy's name, Google will provide me with so many choices that I'll inevitably stop after the first twenty (and more likely, stop after three). Google is filtering my search, valuing some choices above others just as my supermarket creates end-of-aisle displays to sell me things. The only difference is that I know the supermarket makes money from the sale. With search engines, you have no way of guaranteeing you're not clicking on a link the search engine company prefers.
Consider the unscrupulous used car salesman. Let's step through the process.
- I approach the salesman with a simple request: "I want a reliable automobile for a good price."
- The salesman immediately points out a few models. The first one he shows me is way too expensive. The second one is terrible. In comparison, the third one he shows me seems wonderful at first glance, but then I ask more questions.
- The salesman doesn't give me precisely the information I want. Some of his answers sound ridiculous. He's reluctant to show me any more cars. But when I keep pushing, he finally gives in and shows me a fourth car, without much enthusiasm.
- Finally, I ask for specific kinds of cars, things I've heard rumors about. "What about a Toyota Sienna? Is there a good Ford minivan?" The salesman is completely unhelpful. Clearly this was a terrible place to come shopping. Maybe I'll visit some dealers, or talk to my neighbor.
Let's compare this to a Google search for boys' names. I choose Google here because it's currently a very popular search engine that, people seem to believe, does an honest job in helping people search both online and offline content.
- I start with a simple request: "I want to find a good boy's name." My query is "boys names."
- Google gives me some immediate results. Some of them are immediately terrible and can be skipped over, but it doesn't take long to find something promising. I visit the website and, although it looks like what I want might be there, I have a hard time using it. I decide to give up and return to Google and its search result list.
- I try a second website, but I've lost confidence. Maybe it's not Google's fault in any obvious way, but none of these websites is helping me in the way I want to be helped.
- I decide to try some new queries. Maybe "boy names"? Do I need an apostrophe? Or perhaps, because I'm interested in a boy's name that isn't too ethnically different from the names I know in the United States, I should try a search like "American boy names." Unfortunately, my search choices are even worse. I give up. The Web is a terrible place to search for boy names. I'll try the bookstore.
You see? No practical difference.
You might think this exercise was a bit silly, but I'm not wrong. The people, companies, or machines that control what you want are the same entities that control the process. The car salesman controls which cars you buy; even if you trust him, the process is his, not yours. He's just nice about it. The same is true with Google. Sure, we all tend to trust Google -- and what's not to trust or like -- but we do not own the information-seeking process. Google owns it. Here's why:
- Not only does Google not find everything, it doesn't tell you there are things missing. The Google database isn't as up-to-date as the Web. Your search words don't match every relevant result in every relevant language. Sure, it looks as if there are 2,600,000 hits for your search, but that doesn't mean it found everything. What's more, you couldn't see all 2,600,000 hits even if you wanted to! Google shuts you out after only a few hundred.
- Google doesn't explain what it's doing, or why. The search algorithm is never explained; it's a closely guarded secret. We know what kinds of ingredients go into the mix, but we don't know the precise details. And although sponsored links appear separate from search results -- something not all search engines do -- we have no certainty that there aren't other sponsorships happening in there.
- If Google is biased, we have no way of knowing. I guarantee Google is biased, because its algorithm is based on how people use the Web. Google News collects stories more often from the AP Wire than the Boston Globe, and more often from the Globe than the Arlington Tab. That's well-intentioned bias. There are less favorable biases, too, like social biases. Because there are fewer computer users who are poor or homeless, the websites of interest to these people never show up at the tops of lists. Because the Google default language in the United States is English, U.S.-based news articles are far favored over newspapers in other countries, even when the news takes place in those countries. And because most people have heard of large companies like Amazon.com, smaller companies like independent booksellers are pushed into obscurity. There are also language-based biases. It's easier to find websites related to money because this word is both singular and plural, whereas finance has a plural form. It's easier to search for words like mistress and misogyny, which exist, than for the nonexistent gender-opposite versions. And it's nearly impossible to find a company that sells windows because your search results will be overwhelmed by companies that sell [Microsoft] Windows.
But we don't have a choice. There is too much information in the world. We must go through an information repackager if we're not going to do the work ourselves. (Librarians do the work themselves; the results are of excellent quality, of limited quantity, and of almost negligible relevance for our day-to-day needs of airline tickets and boys' names. Libraries have some excellent information with which we can arm ourselves -- like using Consumer Reports to choose a quality used car -- but in general we still have to take the final steps on our own.)
Regardless of their motives, search engines OWN the information access. Maybe that's good enough. Maybe you're comfortable performing your searches in ignorance of the engine's inner workings, generally satisfied with the results most of the time. But please, that doesn't make it a good thing. What if Google started charging you for some of your searches? What if Google integrated its sponsored links into the search engine (as other engines did or do)?
Here's a real-life, immediate example. Search for Pluto. There has been a ton of recent press regarding Pluto's demotion as a planet in our solar system. Where is all that news in the search results page? There's just a tiny news area that most people won't see because it looks different, and then there's a bunch of sponsored links. This is a branding decision; Google thinks "news" and "sites" are very different things and doesn't even combine their results.
Don't kid yourself. The power of the Internet has moved, but not to you.
02 August 2006
We're lost without an information education
"Say index," I said.
"Icks," she replied.
Given how my colleague Rachel taught her three-year-old to recite how bad a book is if it doesn't have an index, it seems I have some work to do. My daughter shouldn't respond to indexing with icks.
In the United States, children learn about indexes when they are old enough to visit the school library and get instruction on how to use its resources. And while many of the printed card catalogs of my youth have been replaced with computer systems, students are still taught how to use the indexes in the backs of some books. After that, their indexing education is complete. They probably never talk about indexing with the librarian again.
Though brief, even this index education is extremely important. Instinctively, children unfamiliar with indexes will look up information just as adults use a dictionary to look up spellings. For example, if you think deceive is spelled decieve, you'll go to the dictionary to look up decieve. Not finding it, you'll look for a neighboring word that looks somewhat similar, and discover the correct "deceive." In other words, you'll enter the dictionary looking for one word, but be satisfied with another. This is how children use indexes, too. They'll look up "Civil War," not find it, and be satisfied with "civil engineering." Then, of course, they'll fail.
(It is worth noting that adults demonstrate this behavior with indexes, too. I might attempt to look up "potatoes" in a cookbook, yet be satisfied with a result of "potatoes and yams.")
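For the programmers reading, here is a rough sketch of that "settle for the neighbor" habit. It is only an illustration, not a claim about how anyone's brain works: the function name look_up and the sample index entries are made up for this example, and "looks similar" is approximated crudely by the longest shared prefix.

```python
from bisect import bisect_left
from os.path import commonprefix


def look_up(index_entries, term):
    """Mimic an untrained reader of an index.

    Take an exact hit if there is one; otherwise settle for the
    alphabetical neighbor that looks most like the term.
    Returns (entry, is_exact). index_entries must be sorted and lowercase.
    """
    term = term.lower()
    # Find the spot where the term would sit in alphabetical order.
    i = bisect_left(index_entries, term)
    if i < len(index_entries) and index_entries[i] == term:
        return index_entries[i], True
    # No exact match: look at the entries just before and just after that spot.
    neighbors = index_entries[max(i - 1, 0):i + 1]
    if not neighbors:
        return None, False
    # Assumption: "looks similar" means "shares the longest prefix".
    best = max(neighbors, key=lambda entry: len(commonprefix([entry, term])))
    return best, False


# Hypothetical index entries, chosen to echo the examples above.
history_index = ["cinnamon", "civil engineering", "climate", "cotton"]
print(look_up(history_index, "Civil War"))
print(look_up(["pasta", "potatoes and yams", "rice"], "potatoes"))
```

The sketch never says "not found." It always hands back something plausible-looking, which is exactly the trap: the reader walks away with "civil engineering" and no signal that the index had nothing on the Civil War.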
Meanwhile, adults don't instinctively understand the metaphor of things inside things inside things. The well-known matryoshka dolls, in which a large bowling-pin-shaped doll holds a smaller doll that holds another doll, and so on, are endlessly fascinating for children. As adults, we're fascinated by the plots of suspense novels. Each step along our way -- an uncovered doll, a turned page -- is built upon the past in a linear way. We follow events, from first to last, in linear sequence, and we succeed.
Hierarchical organization, in contrast, has no obvious place in human existence. To survive, it's enough to separate things into only two groups at a time: dangerous vs. safe, edible vs. inedible, alive vs. dead, something we like vs. something we don't like, family vs. nonfamily. As intelligent creatures we might create a few more categories at a time -- family, co-workers, non-work friends, acquaintances, strangers -- but rarely do we construct them into layers like "people I know > people I like > people I like to work with." Layering is completely unnecessary in our daily lives. Perhaps it is for this reason that human beings cannot instinctively organize things in a hierarchical way -- in the same way we can't tell the (very big) spatial difference between one million miles and one billion miles. To do these things, we need training.
You know, we don't do math naturally, either. Our instincts tell us the difference between one item, two items, a few items, many items, and very many items, but that's it. We also understand more and fewer. But we don't have an instinct that tells us how to add or multiply, let alone solve calculus problems. (If you don't believe me, then I dare you to cut a pizza or a cake into five equal slices without making a mistake.)
Today, we have math classes. Before math was taught as its own course, certain elements of math were taught within specific subjects. Shipbuilders and shoemakers learned enough math to do their jobs, and that was it. The idea of teaching math independent of application must have seemed very strange. What good is shipbuilders' math to shoemakers? But eventually, the math-proficient individuals in each field spoke to one another and discovered exactly what they had in common: a need to add numbers together, a need to calculate weight, and a need for geometry. Now math is an integral part of standardized testing, which means students aren't allowed to graduate from school without proving themselves in basic math skills, separate from their application.
So why aren't we teaching information the way we teach math? Information classification exists in every field of human exploration, from literature (divisions of author style or message) to sales (styles of negotiation), and from biology (life classifications) to auto mechanics (systems of function). If a student is going to learn anything about anything, he should learn a little something about how information itself fits together.
The impact a basic, application-independent information education can have is astounding. As an example, consider driving directions. In general, we give directions to people in a linear order, something that makes sense given how we travel. Here is how you can get to the post office near my home: "(1) Take route 95 until exit 26. (2) Take route 2 East until exit 59. (3) Take route 60 into Arlington Centre. (4) Turn left onto Massachusetts Avenue. (5) After three blocks, turn right onto Court Street. (6) The post office is on your left at the end of the street." As I said before, you don't need information hierarchy to survive; following these linear directions is quite easy. But suppose you make a wrong turn, or miss your exit? To find your way back to the path I provided, you need to know something about the geographic layers that make up these regions: "greater Boston > north Boston suburbs > town of Arlington > Arlington Centre area > Court Street." You need a hierarchical knowledge of the area! Put another way, what many of us refer to as "a great sense of direction" is actually "a deep understanding of relevant geographical hierarchies." That's why someone who knows their way around New York City will get lost in the woods: they learned how NYC streets fit together (NYC > Manhattan > Upper East Side > etc.) but learned nothing about forests. Get my point? Sense of direction is taught and learned.
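The difference between the two kinds of knowledge can be sketched in a few lines of code. This is a toy model under loud assumptions: the function name layers_to_recover is invented, each place is reduced to a list of nested regions, and the "lost" location (a neighboring town) is hypothetical. The post office layers are the ones listed above.

```python
def layers_to_recover(current, destination):
    """Compare two places described as nested regions, outermost first.

    Returns (shared, remaining): the layers both places have in common,
    and the layers you still have to work down through to arrive.
    """
    shared = []
    for here, there in zip(current, destination):
        if here != there:
            break
        shared.append(here)
    return shared, list(destination[len(shared):])


post_office = [
    "greater Boston",
    "north Boston suburbs",
    "town of Arlington",
    "Arlington Centre area",
    "Court Street",
]

# Hypothetical wrong turn: right suburbs, wrong town.
lost = ["greater Boston", "north Boston suburbs", "a neighboring town"]

shared, remaining = layers_to_recover(lost, post_office)
print("You are still in:", " > ".join(shared))
print("You need to find:", " > ".join(remaining))
```

The numbered directions carry none of this. Once you leave step (3), the list has nothing more to tell you; the layered description at least tells you how far off you are and which level to fix first.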
It's time for us to start teaching information construction in schools. We're lost without it.