05 March 2007
Interpretation, not computation
After explaining the limitations of Microsoft Word's auto-indexing feature to one of the many people who write me asking for indexing advice, I got an interesting response. Clearly frustrated by the nonexistence of computer tools to do something as simple as generate a name index, he wrote:
> I'm amazed at the poor development of the science of indexing for printed matter such as books.
I wrote back, "You misunderstand!"
The science of indexing is quite broad, given that it has a history in long-ago library science. What seems undeveloped in this case are the tools, but that's a misunderstanding of what indexing is. Indexing is an editorial field, not an automatic one. You might say it's a lot like writing, in that the writer must decide what their readers want to read, and then the writer must communicate those ideas in an organized and approachable way. Indexing is the same: analysis of text to discover what readers might find interesting, and then labeling those ideas in multiple ways and organizing them so people can find them.
Computers will never be able to write indexes because they can't (a) interpret the importance of a concept, (b) understand concepts rather than simple words, and (c) connect ideas in contextually relevant ways. As much as I admire the Google.com search engine for what it can do, once again I will demonstrate what it can't do. Google finds 10,000,000 things when we really only want 3 (or 10 or 20). It finds what we type, but it doesn't find synonyms. And there's no guarantee that Google is searching everything that's out there, though it appears to come close; in book indexing, however, there's a human to make sure every page was considered.
How often has Microsoft Word attempted to auto-correct you in a completely inaccurate way? Spell-check? Auto-format? Auto-complete? Half-intelligent humans don't make the kinds of mistakes that these tools do.
Here's what I wish he had written:
> I'm amazed that people who know full well that computers could never write newspaper articles still believe computers can write indexes.
Another problem, of course, is that indexes aren't respected in the industry. The reason Microsoft Word even has an automatic indexing feature is that the people who wrote that software have no idea of the damage such a tool does. That Word's {XE} functionality is so miserable is even further proof. There's a nasty cycle: people use inferior tools, quality indexing grows less likely, and inferior tools become the standard.
Indexing is an editorial process, just like writing and editing. Indexing requires interpretation, not computation.
Computers will not and should not be used as indexers. If my job ever dies because computer programmers have found a way to make me obsolete, at least I know I'll be in the enlightening company of human writers and artists.
Labels: Google, human factors, indexing process, Microsoft Word indexing