Thingy words

If you use the word “content” to talk about stuff on the Web, my friend Doc Searls is likely to give you a stiff talking-to. People don’t write content. They write articles, poems, songs, etc. Worse, content implies that the Web’s a one-way medium in which some people express their ideas and the rest of us “consume” them.
I agree with Doc (and pardon my quick paraphrase): “Content” is a misleading term. But you can see why we came up with it. Prior media tended to be tuned for a particular sense organ: Books are for your eyes. Music is for your ears. But because the Internet is a digital medium, its digits can be translated into any sense, even though smell is lagging. So, if you want to talk about all the stuff that’s on the Net, you need a word more abstract than the usual ones we use for media. You might use the word “stuff” as I just did, but “stuff” is too broad. “Content” at least refers to human-to-human communications. “Content” has the problems Doc points to, but we need some word.
The same sort of need drove the transformation of the word “document” in the 1980s. Until then, a document was a privileged piece of printed matter. Your passport was a document, your signed mortgage was a document, a map of France with Napoleon’s handwritten annotations was a document, but a photocopy of any of those was not. A document was a piece of paper whose physical instantiation brought it special value.
“Document” takeover

Then word processing was invented. The word processing companies needed a generic term for all the different sorts of multi-page thingies you could create with their new software. So, they took over the word “document.” For many years, the word sounded odd in its new context because word-processed documents are the opposite of the original documents. Originally documents were papers that could not be replaced by a copy of them. In their new meaning, documents were electronic thingies that could be printed out as many times as you wanted. Documents went from unique to intrinsically reproducible. In fact, word processors don’t even have the concept of an “original” copy.
Unlike with “content,” the kidnapping of the word “document” did not bring strong negative connotations with it, and the original sense of the term “document” still works in the right situations. “Don’t forget to bring the documents with you,” a lawyer might tell you, and you’ll know to bring the signed originals.
Paul Duguid, a co-author (with John Seely Brown) of the seminal The Social Life of Information and a professor at the University of California at Berkeley’s School of Information, gave a talk a couple of months ago in which he traced the use of the word “information” in the 18th century, even finding a 1778 reference by Vicesimus Knox to “this age of information.”
Needless to say, Knox wasn’t thinking of information in the mathematical sense that Claude Shannon gave the word in 1948, but then we hardly ever mean it in that computer science sense either. I won’t try to summarize Duguid’s findings, both because they are too nuanced for a column and because he has not yet published his research. I’ll say only that he finds that, even back in the 18th century, the term “information” was used as a thingy word, useful when you want to talk at a level so general that more precise terms get in the way. For example, information is what people need in order to make good decisions, or a book has twice the information that its competitors have. (Both of these are real examples cited by Duguid.) As Duguid said in response to a question, information “was the unanalyzed term.” And that is how we so often use it these days as well.
We need thingy terms because whatever clear divisions we make, we still sometimes need to talk about the undivided whole. The most general terms necessarily lack precise definition. That is their point. In fact, the real danger comes when we assume they have more definition than they do, as when we impute to Internet “content” meanings drawn from its prior usage. We’d be better off calling it “stuff.”
(Note: This column was stimulated by a discussion with Liam Andrew, an MIT grad student.)