Search: sophisticated yet simple
Video Monitoring Services (VMS) provides competitive intelligence to its customers by analyzing content from television, radio, print and Internet sources. Its monitoring services cover all 210 designated market areas in the United States. Content is captured, stored and then analyzed with a combination of software tools and human experts. VMS then provides a variety of services, including reports, alerts and customized views of relevant information.
VMS uses the Autonomy IDOL platform to perform analyses of video content based on closed-caption text that accompanies the broadcast video. Autonomy also searches across other file types, including text documents and e-mail messages, as well as blogs and Web pages.
“We are using Autonomy in a very proactive way,” says Gerry Louw, CIO for VMS. “As each piece of news comes in, it is applied to about 28,000 profiles that relate to our clients. This real-time processing enables VMS to deliver results immediately to clients. The queries also differ from typical search queries in that they are very sophisticated.”
Most clients receive a daily update, but VMS also prepares reports in response to particular issues. In the world of competitive intelligence, time is a critical factor; clients do not want to wait days to know their standing. Since VMS began using Autonomy IDOL, the company has reduced its media analysis time by 66 percent, which gives VMS its own competitive edge.
A typical client situation might center on the need to find out if an expensive advertising campaign is having an effect. Major media would be monitored to determine the reach of the campaign, and then content from news broadcasts, blogs and other indicators would be analyzed to determine “share of voice.”
“Volume of coverage is important,” notes Louw, “to determine the reach of the advertising initiative. But in addition, companies want to know the nature of the commentary. Is it positive, negative or neutral? And how does it compare to what’s being said about competitors?”
Significant human input is required to come up with the final result. For example, reviewers of the broadcast content capture tonality based on client-defined criteria to indicate whether a statement is positive, neutral or negative. That value-added component builds on the metrics that have been provided by Autonomy’s analysis.
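The "share of voice" and tonality measures described above come down to simple arithmetic once reviewers have tagged each mention. The following is a minimal sketch of that arithmetic, not VMS's actual method: the brand names ("Acme", "Rival"), the sample mentions and the function names are all hypothetical, and a real analysis would likely weight mentions by audience reach.

```python
from collections import Counter

def share_of_voice(mentions):
    """Return each brand's fraction of all tagged mentions.

    `mentions` is a list of (brand, tone) pairs, where tone is
    'positive', 'neutral' or 'negative' as assigned by a reviewer.
    """
    totals = Counter(brand for brand, _ in mentions)
    n = sum(totals.values())
    return {brand: count / n for brand, count in totals.items()}

def tone_breakdown(mentions, brand):
    """Count how many mentions of `brand` carry each tone."""
    return dict(Counter(tone for b, tone in mentions if b == brand))

# Hypothetical reviewer-tagged mentions for a client and a competitor.
mentions = [
    ("Acme", "positive"),
    ("Acme", "negative"),
    ("Acme", "positive"),
    ("Rival", "neutral"),
]

shares = share_of_voice(mentions)
acme_tones = tone_breakdown(mentions, "Acme")
```

With this sample data, the client holds three of the four mentions, and the tone breakdown shows how much of that coverage was favorable.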
“The objective of our search is not to find a particular article,” emphasizes Louw, “but to find out what’s going on. We are looking for a comprehensive understanding to produce actionable results for our clients.”
In addition to searching via the text that can be extracted from video, Autonomy also offers indexing and encoding for video through its Virage product. Virage’s technology generates metadata from television, video and audio. It is used on blinkx, for example, which contains millions of hours of video. In the future, Virage will provide the ability to extract location data, events and themes. Virage is now integrated with IDOL 7, the latest release of Autonomy’s search platform.
The finer points of Internet search
A variety of search techniques are used in enterprise and Internet search alike, each with its own strengths and limitations. Autonomy’s IDOL solution can handle full keyword functionality and metadata searches, which are standard approaches in search technology. But it is best known for combining two ideas: Bayesian inference, which can be used to estimate how likely a document is to relate to a user’s query, and Claude Shannon’s information theory, which holds that rare or unexpected data is more informative than commonly encountered data.
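Shannon's idea can be illustrated with a few lines of code. The sketch below is not Autonomy's algorithm; it assumes a tiny made-up corpus and treats the share of documents containing a term as that term's probability. The self-information of a term is then log2(1/p) bits, so a word that appears everywhere carries no information, while a rare word carries more.

```python
import math
from collections import Counter

def term_information(documents):
    """Return the self-information, in bits, of each term in a corpus.

    Assumption for illustration: p(term) is the fraction of documents
    that contain the term at least once.
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for text in documents:
        # Count each term once per document.
        doc_freq.update(set(text.lower().split()))
    return {term: math.log2(n_docs / count) for term, count in doc_freq.items()}

# Hypothetical four-document corpus.
corpus = [
    "the campaign launched on monday",
    "the merger was announced on friday",
    "the campaign reached the west coast",
    "the weather was mild",
]

info = term_information(corpus)
```

Here "the" appears in every document and scores 0 bits, "campaign" appears in half of them and scores 1 bit, and "merger" appears in only one and scores 2 bits.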
This mathematical approach overcomes the limitations of other approaches to Internet search. Metadata is good when you can get it, according to Autonomy CEO Michael Lynch, but has some drawbacks. “You can’t always persuade people to agree on tags,” he points out, “and they need to be revised as situations change. One person’s terrorist is another person’s freedom fighter.”
In addition, the optimal degree of specificity for tags cannot always be determined. “If there are not enough levels in the tags, the target content will not be specific enough, but if the categories get too small, the chances of someone finding the information get vanishingly small.”
Keyword searches can bring up hits in abundance, but the results do not reflect any understanding of the content. Although they can work very well when the content is limited, on the Internet, the ambiguity of language and the lack of a taxonomy can derail a search.
In contrast, the detection of patterns in a document allows it to be grouped with similar documents and linked to related information. “When a large amount of information is analyzed and clustered, the meaning emerges because the context is available,” says Lynch. “You can also perform operations, such as taking one day’s cluster away from that of the previous day, to see what’s new.”
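The "take one day away from another" operation Lynch describes can be sketched in a simplified form. The example below is a toy stand-in, not IDOL's clustering: it assumes each day's cluster can be represented as a bag of word counts, and the function name and sample headlines are hypothetical. Subtracting yesterday's counts from today's leaves only the terms that are new or more frequent.

```python
from collections import Counter

def whats_new(today_docs, yesterday_docs):
    """Return terms that occur more often today than yesterday.

    Counter subtraction keeps only positive differences, so anything
    already present at the same level yesterday drops out.
    """
    today = Counter(w for d in today_docs for w in d.lower().split())
    yesterday = Counter(w for d in yesterday_docs for w in d.lower().split())
    return today - yesterday

# Hypothetical headlines from two consecutive days.
yesterday_docs = ["merger talks continue", "campaign launch planned"]
today_docs = ["merger talks continue", "recall announced for campaign product"]

new_terms = whats_new(today_docs, yesterday_docs)
```

In this sample, the ongoing merger story cancels out and the terms tied to the recall remain, which is the sense in which the difference shows what is new.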