Business Intelligence: The text analysis strategy
reports and other medical documents and to provide quantitative analyses of text is valuable in the research at Louisville. In addition, the close integration of Text Miner with Enterprise Miner provides an easy way to combine analyses of structured and unstructured data.
Mining structured data and unstructured text is something SAS customers have applied to many different industry needs, says Mary Crissey, SAS product marketing manager for data mining and text mining.
For example, American Honda now uses SAS Text Miner to monitor warranty claims and detect early warning signs of engineering problems. Honda analyzes text from call centers, technician feedback and other areas across its dealer network to find patterns in the records that may be early indications of potential problems. Honda engineers can then investigate further to pinpoint the root cause of the issue.
Attention to text-based feedback as part of early-warning analysis is becoming essential for manufacturing companies that strive to identify potential issues and resolve them quickly, before they snowball into larger, more expensive problems.
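The early-warning idea can be illustrated with a minimal sketch. It is not SAS Text Miner's or Honda's actual method; it assumes a simple approach in which the share of recent claims mentioning a term is compared with its share in an earlier baseline period. The function names, thresholds and sample claims are all hypothetical.

```python
import re
from collections import Counter


def term_rates(claims):
    """Return the fraction of claims that mention each term at least once."""
    counts = Counter()
    for text in claims:
        # Use a set so a term is counted once per claim (document frequency).
        counts.update(set(re.findall(r"[a-z]+", text.lower())))
    total = len(claims)
    return {term: count / total for term, count in counts.items()}


def emerging_terms(baseline, recent, min_lift=2.0, min_recent_rate=0.3):
    """Flag terms whose recent rate is at least min_lift times the baseline rate.

    The thresholds are illustrative assumptions, not tuned values.
    """
    base = term_rates(baseline)
    new = term_rates(recent)
    # Floor for terms never seen in the baseline, to avoid division by zero.
    floor = 1.0 / (len(baseline) + 1)
    flagged = {}
    for term, rate in new.items():
        if rate < min_recent_rate:
            continue  # too rare in recent claims to be a meaningful signal
        lift = rate / max(base.get(term, 0.0), floor)
        if lift >= min_lift:
            flagged[term] = round(lift, 2)
    return flagged


# Hypothetical claim texts, invented for illustration only.
baseline_claims = [
    "customer reports rattle in dashboard",
    "replaced wiper blade",
    "brake pads worn replaced",
    "dashboard light flickers",
]
recent_claims = [
    "transmission slips when cold",
    "transmission slips on highway",
    "replaced wiper blade",
    "transmission noise reported",
]

print(emerging_terms(baseline_claims, recent_claims))
```

In this toy data, the terms that rise sharply in the recent claims are the ones an engineer would be prompted to investigate. A production system would add stemming, stop-word handling and statistical significance testing.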
SAS has traditionally been strong in analytics; its Enterprise Miner product is used to mine structured data. In order to apply some of the same skills to text, the company turned to Inxight, which specializes in discovery and visualization of text information.
"Using Inxight's technology in our Text Miner product allowed us to analyze text for concepts using some of the same algorithms we had developed for Enterprise Miner," says Crissey. "We added some graphical interfaces that allow visualization of patterns found in the text, ranging from basics like word counts to a more sophisticated understanding of word usage that might indicate a specific predictive trend."
The issue of combining analyses of structured and unstructured data to provide more meaningful pictures of business and technical information is receiving increasing attention.
"This problem has been around for years," says Michael Corcoran, VP of strategy at Information Builders (IBI). "Portal technology tried to bring the various data repositories together, but what was delivered was an application in which the user still had to know where the data was located."
With more non-technical people (whether internal to the company, customers or supply chain partners) now seeking information, the ability to find information without knowing its location has become critical.
One interface that users know and are comfortable with is Google's, so IBI opted to combine that search technology with its BI solution, webFOCUS, to create its Intelligent search tool. That top-down strategy does require the user to know the search target, as opposed to a text mining situation in which the software discovers patterns. However, for tasks such as finding all the information on a particular customer, the combination of structured BI and a search engine offers advantages. Besides the familiar interface, the data can be found no matter which repository it resides in, in contrast to situations where the data must be in a dedicated warehouse. IBI's iWay Software integration tool provides adaptors to many structured databases and document management repositories, making everything accessible.
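The idea of one search box over several repositories can be sketched as follows. This is not the webFOCUS or iWay API; it is a minimal illustration that assumes each repository is wrapped in a small adapter function accepting a query string. Every name and sample record is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    source: str   # which repository the result came from
    kind: str     # "structured" or "document"
    summary: str  # short text shown to the user


def search_orders(rows, query):
    """Adapter for a structured source: match on the customer column."""
    q = query.lower()
    return [
        Hit("orders_db", "structured",
            f"order {r['order_id']}: {r['customer']}, ${r['total']:.2f}")
        for r in rows
        if q in r["customer"].lower()
    ]


def search_documents(docs, query):
    """Adapter for an unstructured source: match on the document body."""
    q = query.lower()
    return [
        Hit("contracts_repo", "document", title)
        for title, body in docs.items()
        if q in body.lower()
    ]


def federated_search(query, adapters):
    """Send one query to every adapter and merge the results in adapter order."""
    hits = []
    for adapter in adapters:
        hits.extend(adapter(query))
    return hits


# Hypothetical sample data standing in for two separate repositories.
ORDERS = [
    {"order_id": 101, "customer": "Acme Corp", "total": 1250.0},
    {"order_id": 102, "customer": "Globex", "total": 300.0},
]
DOCS = {
    "Acme service contract": "Service agreement between the vendor and Acme Corp.",
    "Globex call log": "Globex called about a late shipment.",
}

hits = federated_search(
    "acme",
    [lambda q: search_orders(ORDERS, q), lambda q: search_documents(DOCS, q)],
)
for hit in hits:
    print(hit.source, "|", hit.kind, "|", hit.summary)
```

The user supplies only the search target. The adapters hide where each record lives, which is the point the passage makes about not needing a dedicated warehouse.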
By the end of this year, IBI expects to have a template that presents BI reports and relevant unstructured content within one interface.
"This information will be available in a dashboard," says Corcoran, "with metrics, call center activity, contract information and whatever else the user defines as beneficial to getting the big picture." IBI has also partnered with SPSS to add predictive analytics to its own reporting capability. Although combining quantitative analytics with text analytics has not yet been widely adopted, advances in technology are making it increasingly feasible, and it is likely to see greater use in the near term. [SIDEBAR follows]
Competitive intelligence from text mining
The Internet offers a wealth of information on competitors' behavior and their customers' comments, ranging from published articles to discussion boards and blogs. Most of the time, too much information is available, making quick conclusions hard to come by.
One option is to focus on the most relevant, high-value information sources. Factiva provides business news and information from selected content sources. Its Search 2.0 product uses text mining, sophisticated visualization and a specialized taxonomy to zero in on relevant competitive or other business information.
But some of the most valuable competitive information is not in formal collections. The Factiva Insight media measurement and reputation intelligence services access not only print and broadcast information, but also blogs and message boards that contain information generated by consumers.
"Factiva Insight can mine through millions of documents in a very short time," says Alan Scott, chief marketing officer at Factiva. "It does the analysis for you, identifying subjects being discussed with regard to competitive products."
A complete competitive strategy should include both targeted searching that retrieves information on known issues and monitoring of the information environment for previously unknown trends.
"Companies should keep track of what's being said about competitors, and contrast that information to the content about themselves," adds Chris Porter, manager of competitive strategy at Factiva. "The amount of coverage is important, but so is the tone and content." The more quickly information is acquired, Porter adds, the more time can be spent considering it and making information-driven decisions. (See the competitive intelligence article, also in this issue.)