25 February 2006


Custom organization schemes

In previous entries I've talked about exact and ambiguous schemes. Now I want to talk about "everything else that remains," the custom organization scheme.

The three most common exact schemes -- alphabetical, chronological, and spatial -- aren't the only exact schemes in the world. There are also numeral (1, 12, 166, 2, 22, 266, 3, ...) and numerical/counting (1, 2, 3, 12, 22, 32, 166, 266, ...) schemes, and there are schemes where the sort style is based on some other intrinsic characteristic or standard, such as the periodic table of elements (numerical order by proton count) and the colors of the rainbow (numerical order by wavelength values). Although it's challenging to think up exact schemes that aren't based on numbers, characters, time, or space, it is incredibly easy to think up subjective (ambiguous) ways of organizing information. Below are several suggestions. Some of these possibilities seem quite similar to numerical order, but numerical order won't ever change with time, whereas these can.

by frequency of use
by importance
by logical complexity
by sensory intensity (e.g., brightness)
by mood
by personal interest
by profitability
by likelihood to inspire controversy
by necessity to avoid a lawsuit
by the order in which you thought it up

The subjectivity of each of these goes further still, because you can choose the audience for each sorting scheme:

by importance to the author
by importance to the common user
by importance to the experts

... and so on.
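Returning for a moment to the exact schemes mentioned above: the difference between numeral order and numerical/counting order is easy to see in a few lines of Python. This is only a minimal sketch; the sample values are borrowed from the lists above, and the variable names are my own invention for illustration.

```python
# Sample values borrowed from the lists in this post (illustrative only).
values = [1, 2, 3, 12, 22, 166, 266]

# Numeral order: compare the values as text, digit by digit,
# the same way words are compared letter by letter.
numeral_order = sorted(values, key=str)

# Numerical/counting order: compare the values by magnitude.
numerical_order = sorted(values)

print(numeral_order)    # [1, 12, 166, 2, 22, 266, 3]
print(numerical_order)  # [1, 2, 3, 12, 22, 166, 266]
```

Both results are exact: given the same values, either rule always produces the same sequence, no matter who does the sorting or when.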

It's easy to get confused between complicated exact ordering schemes -- like the periodic table, which is in order of proton count -- and custom ambiguous schemes. In both cases, the sorting scheme may not be obvious, at which point it's easy to assume it's obtusely subjective or completely arbitrary. For example, the following are three ways to organize the English alphabet:

a b c d e f ...
q w e r t y ...
e t a o i n ...

The first is alphabetical order. The second is spatial order (letters on the standard keyboard, affectionately known as the QWERTY keyboard). The third is in order of use in the American English language (the letter E is the most commonly used letter in the lexicon). I would argue that for all practical purposes, these are all exact schemes. There is no subjectivity here. To sort the alphabet in a subjective way, I'd suggest by ease of pronunciation. :-)
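To show that all three are exact, here is a rough Python sketch that sorts the same letters under each scheme. Some assumptions: I read the QWERTY keyboard row by row, left to right, and the frequency string is one commonly cited ordering (sources disagree once you get past the first few letters). The function name sort_by_scheme is made up for this example.

```python
ALPHABETICAL = "abcdefghijklmnopqrstuvwxyz"
# Assumption: keyboard rows read left to right, top to bottom.
QWERTY = "qwertyuiopasdfghjklzxcvbnm"
# Assumption: one commonly cited frequency ordering; others exist.
FREQUENCY = "etaoinshrdlucmfwypvbgkqjxz"


def sort_by_scheme(letters, scheme):
    """Sort lowercase letters by their position in the given scheme."""
    rank = {letter: position for position, letter in enumerate(scheme)}
    return "".join(sorted(letters, key=lambda letter: rank[letter]))


sample = "index"
print(sort_by_scheme(sample, ALPHABETICAL))  # deinx
print(sort_by_scheme(sample, QWERTY))        # eidxn
print(sort_by_scheme(sample, FREQUENCY))     # eindx
```

Each scheme is just a fixed lookup table, so the result never depends on anyone's opinion. That is what makes them exact, even when the rule behind the table isn't obvious to the reader.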

My point is that in the end, most custom schemes are exact and not ambiguous. Instead, they are translations of intrinsic orders into something exact (again, like the periodic table being in numeric order by proton count or atomic weight). Alternatively, they are exact or ambiguous schemes that are time-dependent, such that the scheme itself varies over time or application.

In summary, you should consider using a custom scheme when (a) you have an internal order that you want to obey, or (b) you have an order that changes over time. In both cases, however, it's important that your users understand that your schemes are not necessarily obvious. The periodic table requires training to understand and use; meanwhile, because the search results at Google.com tend to vary over time, some people are disturbed by getting different results for the same search they used yesterday. If you can't use an intuitively obvious scheme (like alphabetizing) or a subconsciously obvious scheme (like task or topic order), it's probably a good idea to find a way to impose a numeric scheme on top of your results. The periodic table has a key, and search engines often have a "relevance percentage."

23 February 2006


Toronto conference

The international indexing conference is scheduled for 15-17 June 2006, in Toronto, Ontario, Canada. It's my job, as president-elect of the American Society of Indexers, to plan this event. Boy, planning a conference can be a challenging thing sometimes.

Finally, though, we have a schedule in place. The ASI website will have that information soon, but in the meantime you can join the asiconference mailing list for current and archived information.

20 February 2006


Even bad entries can be good, in context

In technical documentation, often I've seen main entries like "creating user accounts, 325" and "managing settings," where the first word is a gerund that has questionable value. When I'm asked why this isn't a valuable entry, I explain that the ideas of "creating" and "managing" are rather vague. I also explain that if someone wants to know how to create something, he is more likely to look up the something. For example, if you were interested in how to bake a cake, would you look up "baking" or just "cakes"? I believe you'd look up "cakes" first.

Strictly speaking, however, there is nothing wrong with entries like "creating accounts" and "baking cakes," even though accounts and cakes are the more specific and more likely targets of most users. Entries under these generic or vague terms (creating, managing, etc.) become valuable, however, when these terms have greater meaning within their context. In religious texts, the concept of creation (as in "world creation," often written with a capital C) is worth indexing. In a business textbook, the concept of management is worth indexing. And in an instructional cookbook that explains how ovens are used in the most general sense, an entry for "baking" might be appropriate.

The guideline to avoid these general terms, declaring the resulting entries as "bad," ignores context. The argument that readers are more likely to look up the objects of these actions -- "messages, writing" instead of "writing messages" -- doesn't negate the value of having these gerunds as available access points. Actions are still concepts and deserve to be indexed; in technical documentation, task-oriented language has even greater value than average. Consider these entries:

filtering email messages, 000
spell-checking your document, 000
upgrading applications, 000

The reason these work so well is that the ideas of filtering, spell-checking, and upgrading are distinct ideas of importance to technology users. In fact, the above term document is itself a bit vague (whereas file is not), and beginners know applications better as programs. So you can see just how valuable using those gerunds can be. Dismiss them at your peril.

By the way, if you need a practical clue regarding their use, take a look at the full set of documentation and ask yourself if you can use that gerund much more often than you already are. For example, you might have a section where "creating accounts" is obvious, but does that same book talk about the creation of other things, like passwords, files, security filters, network connections, and so on? Just because the word "creation" isn't used (passwords are invented, filters are applied, networks are initiated, etc.) doesn't mean it's wrong. If after some serious thought you realize that you'd have dozens or more subentries under that gerund, consider getting rid of it as not specific enough. Other candidates for nonspecific gerunds in technical documentation include installing, configuring, customizing, opening/closing, hiding/showing, deleting, starting/stopping/quitting, editing/modifying/altering, and accessing.

15 February 2006


You have sixty days until Tax Day (U.S.)

I have two months to finish preparing my taxes. I know a lot of people out there who much prefer working with tax accountants and advisors, but I much prefer doing the work myself. All year long I throw anything related to my financial condition in a big red crate. Every year, usually in mid-February, I upend that crate and start putting everything into piles. After three days of labor, those piles of papers have been sorted, organized, reviewed, mined for valuable data, calculated, recalculated, tabulated, recorded, and sent in one big envelope each to the Internal Revenue Service and the Commonwealth of Massachusetts.

Three days, yeah. You probably think I'm crazy.

Here's why I go through that, without an accountant. First of all, the sorting and tracking that takes me three days has to be done--by someone at some time--before Tax Day. I could do a little bit of that work every day using software like Microsoft Money or a spreadsheet, or even using a pencil and a ledger. Alternatively I could create folders for myself and sort the paperwork as it arrives, exchanging that one big red crate for two dozen hanging file folders. And if I wanted, I could instead bring that big crate to the nice accountants and say, "Add this for me," and not worry about it all. But because it has to be done, and because I'm much more qualified to know when a certain phone call is a business expense, when a certain receipt is a medical expense, or when a piece of paper fell into the crate by mistake, the person who should be doing most of that adding is, of course, me. And since it really is just addition, why should I pay an accountant when I am fully capable of using a calculator on my own?

Second, the one arena where the ability of accountants completely overshadows my own is in their understanding of tax law, and how those laws apply to my earnings, debts, and expenses. But I don't like not knowing about the laws that affect me. With the same curiosity I feel about who in the world might have my social security number, I want to know how much of my money is going to fund my government. Going through the distress of all that mathematics pays off, because now I understand how my Social Security payments are calculated, where the tax tables were derived, and how much of my money was actually spent on medical expenses, mortgage payments, and investment fees.

And there's a corollary to all this, and it's a consequence of waiting until mid-February to look at this paperwork: Doing my taxes is when I figure out exactly how much I make! Sure I watch my checkbook balance go up and down, but I don't waste my time looking at those numbers on a daily, weekly, or even monthly basis. Once a year is good enough for me. February is therefore a big month of numerical realization. For example, this year I discovered that 40% of my gross indexing income came from a single client. Other discoveries include how much less interest I'm paying on my mortgage from last year; how much money I invested in the house (which is important because I work at home and can deduct some of it as an expense); how much money I spent on postage and shipping; how often I used my car for business; and all the other little things like phone bills, photocopying, client and colleague lunches, medical expenses, bank interest, retirement investment, and so on. If ever I needed a reality check, this is it, and it's a lesson on a global scale. It's one thing to see how much money I spent on business travel, but quite another to see that number next to how much money I spent on business advertising.

Not only do I learn about the tax-related information, however, but I really learn about everything related to money. Once a year I thoroughly read my credit card statements, line by line. Under the pretense of looking for those $4.95/mo payments for website hosting at tripod.com, I have the delicious opportunity to reminisce about each year's events, like the birth of my daughter, that vacation with my wife, the day I bought TiVo, the huge party catered by Blue Ribbon Barbecue, etc.

And finally, if ever there's a reason to say "Yes, I do my own taxes," it's to impress everyone! It takes them a moment to cough and gasp with incredulity, and to tell me that I'm insane not to have an accountant because they and everyone they know has an accountant, but then the light dawns. "Wow," they think, "this guy must really ____."

a) be smart?
b) have a lot of patience?
c) enjoy building character?

Fill in your own blanks. That's what I do.

11 February 2006


Organizing ambiguously

In a continuation of my post from 8 Feb, I want to say a few words about ambiguous or subjective sorting. First of all, ambiguous sorting is cool. Unlike the exact methods I wrote about, ambiguous sorting schemes actually pay attention to the meaning of what's being sorted.

Topical schemes sort categories based on what they're about, like the organization of a textbook (simple to complicated). Task-oriented schemes are organized in "doing" order, such as the steps one takes to make a sandwich. Audience-oriented schemes separate items according to who wants them, like the split between members and nonmembers, or the MPAA categories for movie age appropriateness (e.g., PG-13).

Consider how you might organize the steps involved in visiting a website. In task order they would appear as (1) turn on computer, (2) open the browser, (3) type in the website address, (4) read the page, (5) close the browser, and (6) turn off the computer. In topic order, the "turn on" and "turn off" items would be combined, since people pair these together; same with "open the browser" and "close the browser." In alphabetical order, on the other hand, the browser is closed before it's opened, and the computer is turned off before it's turned on. Using alphabetical order here would be pretty stupid, eh? (Did you ever notice that the options under the File menu aren't alphabetized?)
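If you want to see the alphabetical version for yourself, the following Python sketch sorts those six steps. The step wording is copied from the list above; the variable names are just for illustration.

```python
# The six steps, stored in task ("doing") order.
task_order = [
    "turn on computer",
    "open the browser",
    "type in the website address",
    "read the page",
    "close the browser",
    "turn off the computer",
]

# An exact scheme: plain alphabetical order, which knows nothing
# about what the steps mean.
alphabetical_order = sorted(task_order)

for step in alphabetical_order:
    print(step)
```

The sorted list starts with "close the browser" and puts "turn off the computer" ahead of "turn on computer." The computer can produce that order instantly, but only a person (or a list that was stored in task order to begin with) knows which step actually comes first.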

The problem with ambiguous systems, of course, is that people don't necessarily categorize things in the same way. In fact, this is why people can't find things in other people's kitchens! (See my post from 7 Feb.) We don't put measuring spoons, soup spoons, and serving ladles in the same drawer, even though they're all spoons. Instead, we interpret categories according to how we perceive these connections, subjectively.

Think about how restaurant menus are organized. First they are organized in task order: appetizers in the front, entrees in the middle, and desserts near the back. Then they might be organized by ingredient (e.g., all the pasta dishes appear together), although it's unclear in what order these ingredients are listed. And within these categories, what's the order? Maybe it's profitability; maybe it's to show off the chef's skills or the breadth of available selections; maybe it's to put the most popular or intriguing items at the top.

By the way, in my opinion audience-oriented categorization is the most powerful and useful of all sorting techniques, and yet it's woefully underutilized. Organizing items by popularity is simply a variation on this idea.

As soon as you start really looking at how people use information, you will run away from exact schemes almost immediately, as much as possible. (Long flat lists, like the entries in an index, still demand some sort of umbrella sorting. Short lists, like the options in a computer menu, are fair game. Have you ever noticed that the choices under the File menu are generally in task order?)

Nothing in life is exact, so why do we force it?

09 February 2006


International indexing conference in Toronto, Ontario (15-17 June 2006)

If you're just tuning in, you may not realize that the international indexing conference, co-sponsored by the American Society of Indexers and the Indexing and Abstracting Society of Canada, is coming up fast! Conference information is available at both websites.

I'm the American in charge of the conference (and president-elect of ASI), and my Canadian counterpart is Ruth Pincoe. If you have any questions about the conference, please write conference@asindexing.org and I'll be happy to provide an answer. A preliminary schedule of events is about to be released on the asiconference mailing list. (To subscribe, send a blank mail to asiconference-subscribe@yahoogroups.com.)

08 February 2006


Organizing as exactly as possible

It should come as no surprise that information can be organized in many, many ways. According to Rosenfeld & Morville, there are two major types of sorting schemes: exact (objective) and ambiguous (subjective). Personally, I like to imagine a third category, custom, that overlaps both and therefore deserves special treatment.

Exact sorting schemes are things like alphabetical, numerical, chronological, and spatial order. These schemes require that you follow an accepted and hopefully well-known sequence, such as from A to Z, from 1 to 9, or from top to bottom. Exact schemes do allow you to reverse order -- my blog, for example, sorts each entry in reverse chronological order, with the newest entries at the top -- but they don't allow you to start mixing things up. In an alphabetically ordered list of words, words starting with C will always appear between the B-words and the D-words.
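As a small illustration of reversing an exact scheme, here is a Python sketch that sorts a few of this blog's entries in reverse chronological order. The dates and titles are taken from this page; the list itself and the variable names are made up for the example.

```python
from datetime import date

# A few entries from this blog, deliberately out of order.
entries = [
    (date(2006, 2, 11), "Organizing ambiguously"),
    (date(2006, 2, 7), "Organizing the kitchen"),
    (date(2006, 2, 25), "Custom organization schemes"),
    (date(2006, 2, 8), "Organizing as exactly as possible"),
]

# Reverse chronological order: the same exact scheme, run backward.
newest_first = sorted(entries, key=lambda entry: entry[0], reverse=True)

for posted, title in newest_first:
    print(posted.isoformat(), title)
```

Reversing changes the direction but not the rule: the 8 February entry still always lands between 7 February and 11 February, just as the C-words always land between the B-words and the D-words.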

There are three major problems with exact schemes:

Of course, these three reasons aren't enough to toss exact schemes into the rubbish bin completely. Exact schemes are easy. You can get a computer to sort things almost instantly, and most audiences have no trouble using them despite their shortcomings. However, my last point -- that they're meaningless -- is why there are so many better options.

I'll talk about ambiguous and custom schemes in my next posting.

07 February 2006


Organizing the kitchen

I've always been fascinated by how challenging it can be to make analogies between indexing and the "real world," when in fact we organize and retrieve things all the time. So I'm always looking at the kitchen as my model of information organization.

First of all, why is it so hard to find things in other people's kitchens? Doesn't everybody keep the trash can under the sink? Isn't cutlery always in a waist-high drawer near the sink? Don't people keep their drinking glasses and coffee mugs in the same cupboard? Apparently not.

We organize our kitchens for ourselves. If we are living alone, we only need to put things where we want them to be. If we are living with others, we do our best to compromise with our home-mates and protect our children. The things we rarely use go way up high; the things we don't want our kids to get are up high or behind a lockable door. Everything else goes where it fits, where we can reach, and next to the areas where we're most likely to use them. So for some people, coffee mugs and water glasses are stored together because they fit neatly beside each other (unlike glasses and bowls). For other people, the coffee mugs are stored closer to the coffee maker, in the same cabinet as the sugar bowl and the coffee filters. Both of these choices involve organizing by function -- the function of drinking, the function of enjoying coffee -- but the results are personal. The kitchen is ours.

Indexing involves turning your kitchen into a place that other people can use just as easily. This means you have to organize your kitchen in such a way that people don't have to ask you where the spoons are, but instead could just walk in and find exactly the spoon or other object they need. Your personal guidance should become unnecessary, because the kitchen is intuitively and universally organized. No one will ever open the wrong drawer or door or canister again.

Yeah, right.

Basically, you have four choices. The first choice is to label everything. Every drawer, every cabinet, every appliance, and every countertop object should have a little piece of paper attached to it. The cutlery drawer might be labeled CUTLERY. The refrigerator might be labeled COLD FOOD. But this is not as easy as it sounds. What, other than cutlery, is in your cutlery drawer? A can opener? Twist ties? Napkin rings? Meanwhile, your refrigerator may contain cold food, but what kinds of food are kept cold? Are your apples in there, or are they in a bowl? Do you use fresh milk, or do you buy your milk in those boxes? You see, labeling is only as good as your labels. Don't you dare create a label for SPOONS, because you have teaspoons, dessert spoons, wooden spoons, slotted spoons, sugar spoons, serving spoons, antique decorative spoons, plastic spoons, and sporks in your kitchen.

Clearly the problem is that your kitchen isn't perfectly organized. Why aren't all your spoons in one place? So pull everything out and lay it down on a freshly washed floor, and reorganize it. Put all of your spoons in one place. Everything you might call a plate or a platter goes together. Everything you eat goes against one wall, and everything you don't eat goes against the other walls. And finally, your labels make sense. Of course, you've sacrificed your kitchen for the sake of everyone else, but wasn't that the point? No! This is the problem with the Dewey Decimal System in some public libraries: nobody knows how to find anything except the librarians! But I'll tell you, if you want to learn about a topic, you might just discover that everything on that topic is in the same exact place.

Almost guaranteed. Go to the library with an interest in World War II, and you'll find yourself in the history section to read about history, the romance section to read historical fiction, the fiction area to find some spy thrillers, the newspaper archive to read old news articles, the magazine section to read current articles, the science area to read about radar, the aeronautics section to read about the airplanes, the humor section to read those funny WWII joke books, and so on. The same is true with our kitchen, where the same knife can be used to cut food, spread jam, open envelopes, and even unclog the drain. Your kitchen objects, like words in the English language, are used in many different ways; categorizing them becomes rather subjective. So when that guest comes in looking for fruit, will he find it in the refrigerator, in a bowl, in a box or can, or in the compost bin? Yes, yes, yes, and yes. Wow, I guess we need a FRUIT category.

The third approach, then, is to put everything everywhere! Put a teaspoon in every drawer, on every horizontal surface, in and next to every appliance, in each cabinet, and on every shelf. Now, when someone goes looking for a spoon, it doesn't matter where he thinks the spoon is, because he's right! There's a spoon on top of the microwave, in the Crisper drawer of the refrigerator, and in the sink. Of course, not only are spoons everywhere, but so is everything else: can openers, slices of bread, blenders! One of everything, everywhere! (Of course, to be truly practical, you'd need more than one at every location, since sometimes people need more than one spoon at a time. :-) By the way, this is how people use search engines, like Google. We create a web page, and then we attach as many keywords as possible. We want to make sure that everyone will find our stuff, no matter where they're looking. In fact, some people want their content discovered even when people aren't looking -- stumbling over spoons everywhere.

The final approach is some combination of all of these things: decent labels, better organization, and as much redundancy as the cabinets can stand. It won't be perfect for everyone all the time, but very few people are going to have to open more than one or two drawers until they find what they want, even if what they want is a tiny whisk or an egg timer. Everything is categorized, labeled, and multiply placed.

That's indexing.



I like to think the world needs yet another forum to talk about indexing, but perhaps it's just me. These days, my absolute favorite part of being an indexer, an information architect, a trainer and educator, and an "information guru" (not my words!) is simply talking about the possibilities that come with indexing.

Let's explore together.
