12 December 2006
Indexing moving content
Fact is, the world has a way of throwing curve balls on a regular basis. For me, those curve balls include a family-wide influenza epidemic, teething babies, travel plans, and the like. Trying to keep a grip on life is like trying to catch fish with your hands.
Tonight I give a presentation about trying to index moving targets. I was surprised to discover that of all the presentations I've ever given, this was absolutely the hardest to write. In fact, I just finished a few minutes ago. I've taught three-day classes, with eight hours of material on each day, but this 45-minute presentation really stymied me. There are two reasons for this.
First, trying to index moving content is, no matter what, a mess. The simplest example of a problem is creating an index entry like "software development, 111-121," and then finding out that pages 111 and 121 have moved to pages 113 and 123, respectively. With standalone indexing (where you type in the page numbers), the only real way to fix this is manually: go back and rewrite all your page numbers. It's a MESS. So here I am, hoping to provide some tips to indexers and technical writers, something to help them avoid these kinds of corrections -- only to realize that there's no good answer. (A bad answer is to not index at all. :-)
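To make the page-shift problem concrete, here is a rough Python sketch of what that manual correction amounts to. It is only an illustration: the function names, the data layout, and the assumption that everything from a given page onward moves by the same amount are all mine, and real reflow is rarely that tidy.

```python
# A sketch of the "go back and rewrite all your page numbers" chore.
# Assumption (hypothetical): every page at or after first_moved_page
# shifts by the same offset. Real layout changes are seldom this uniform.

def shift_locators(locators, first_moved_page, offset):
    """Return page ranges with each affected page number shifted by offset."""
    def bump(page):
        return page + offset if page >= first_moved_page else page
    return [(bump(start), bump(end)) for start, end in locators]


def format_entry(heading, locators):
    """Render a heading and its page ranges as a printed index entry."""
    parts = [str(s) if s == e else f"{s}-{e}" for s, e in locators]
    return f"{heading}, {', '.join(parts)}"


# Hypothetical index: each heading maps to a list of (start, end) page ranges.
index = {
    "software development": [(111, 121)],
    "testing": [(42, 42)],
}

# Suppose two pages were inserted somewhere before page 100.
updated = {h: shift_locators(locs, 100, 2) for h, locs in index.items()}

for heading, locs in updated.items():
    print(format_entry(heading, locs))
```

Here the entry for "software development" comes out as 113-123 while "testing" stays on page 42. The catch is that the sketch only works if you already know where the content moved and by how much, which is exactly the information a standalone index doesn't have.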
The second problem is that even if I did have a list of useful tools, they wouldn't make for interesting presentation material. The first draft of my presentation would have resembled a public reading of the weather report for every American city, in alphabetical order: if you're lucky, you're interested in Albuquerque and Atlanta and can walk out early.
The fact is, our growing reliance on live and custom information is wreaking havoc on the indexing world. It's becoming harder and harder to collate information in relevant chunks. Search will never do it; even if there were human beings out there developing controlled vocabularies, full-text search still retrieves a tremendous amount of flotsam. But creating keywords for something that won't live an hour seems kind of pointless, too. We're all just pounding sand.
I'm looking forward to what the participants have to say. Must we accept the false imprisonment of uncatalogued real-time information flow, or will writers finally catch on that indexers have an important role on the creation side as well?