Text analytics: greater usability, less time to insight
Looking for the unknown unknowns
One of the more challenging issues in analytics is the discovery of “unknown unknowns,” meaning the questions that users did not know to ask. Luminoso, a spinoff from the MIT Media Lab, offers a text analytics product that models the relationships between different concepts by combining statistical information gleaned from a collection of text with ConceptNet, a repository of basic facts about the world. “Luminoso analyzes large collections of text in combination with a body of general knowledge developed over many years to produce insights into the drivers of customer sentiment,” says Dennis Clark, co-founder of Luminoso, “even when users do not know what factors might be potentially relevant.”
Designed to run without human intervention, Luminoso Analytics generates categories dynamically rather than using taxonomies for classification. Luminoso’s Compass software, introduced in 2014, analyzes streaming data in real time and is well suited to monitoring customer feedback. “Our products are faster by a factor of four or five in terms of time to value,” says Clark, “because the models can be built in a matter of minutes and do not require changes in rules for every analysis like other systems generally do.”
As an example of how Luminoso Analytics can be used, Luminoso analyzed more than 7,000 Amazon (amazon.com) reviews of the Kindle Fire HD. The analysis found positive sentiment associated with the screen and speakers, the Android OS and the apps. Among the complaints, users disliked that the device does not come with a charger.
Clark says, “The model Luminoso built from the dataset, coupled with Luminoso’s concept visualizer, permits a user to quickly separate positive from negative concepts, validate discoveries against verbatims [excerpts from the text], quantify the relative importance of different ideas within the dataset, and generate and test new hypotheses. Key topics identified by the analyst within the UI can then be exported as document scores and consumed by traditional BI tools.”
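For readers who want a concrete feel for what "separating positive from negative concepts" and "quantifying relative importance" can mean, the following is a deliberately simple sketch in Python. It is not Luminoso's method and does not use ConceptNet. It assumes each review comes with a star rating, counts how many reviews mention each term, and averages the ratings of those reviews. The sample reviews, the stopword list, the function names and the thresholds are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical toy data: (review text, star rating). Not taken from the Luminoso study.
REVIEWS = [
    ("great screen and loud speakers", 5),
    ("screen is sharp, apps work well", 4),
    ("no charger in the box", 2),
    ("charger sold separately, disappointing", 1),
    ("speakers sound great", 5),
]

# Hypothetical stopword list, kept tiny for the example.
STOPWORDS = {"and", "is", "in", "the", "no", "a", "well"}


def concept_scores(reviews, stopwords=STOPWORDS):
    """Return {term: (review_count, mean_rating)} for each non-stopword term."""
    ratings = defaultdict(list)
    for text, stars in reviews:
        # Count each term once per review so long reviews do not dominate.
        terms = {word.strip(".,!?").lower() for word in text.split()}
        for term in terms - stopwords:
            if term:
                ratings[term].append(stars)
    return {term: (len(r), sum(r) / len(r)) for term, r in ratings.items()}


def split_by_sentiment(scores, midpoint=3.0, min_count=2):
    """Split terms mentioned in at least min_count reviews into two sorted lists.

    Terms whose mean rating is above midpoint are treated as positive;
    all others (including a mean exactly at midpoint) as negative.
    """
    positive, negative = [], []
    for term, (count, mean) in scores.items():
        if count < min_count:
            continue  # too rare to say anything about
        (positive if mean > midpoint else negative).append(term)
    return sorted(positive), sorted(negative)


if __name__ == "__main__":
    scores = concept_scores(REVIEWS)
    positive, negative = split_by_sentiment(scores)
    print("positive:", positive)
    print("negative:", negative)
```

The review count stands in for "relative importance" and the mean rating for sentiment. A production system would work with multi-word concepts and background knowledge rather than single words, which is the gap products like the one described above aim to fill.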
Into the future
“Text analytics is effective at certain things right now,” says Tom Reamy, who heads the KAPS Group, which specializes in text analytics consulting and developing semantic infrastructures. “Things like pattern recognition and concept identification can be done very readily. What is lacking is a deeper analysis and understanding. The meaning inside the text and the context around words are critical to text analytics.”
Automated systems can provide quick insights and the development costs are relatively low, but the systems are not transparent and going to deeper levels usually requires human involvement. “We are in a period now where there is still a need for refinement and adding more intelligence to text analysis,” Reamy continues. “Except for some special cases, a hybrid solution of human and machine is usually best—whether the human input is at development time and/or at application time where machine and human are used to augment not replace each other.”
Opportunities also exist in improving the methods of getting feedback into the system, which would make text analytics smarter. Companies are also building in more semantic abilities so that the starting point is not a blank slate. “Analytics is starting from smarter platforms now,” Reamy says. “This will help expedite the speed and improve the depth of analyses in the future.”
With a global market predicted to reach $6.5 billion by 2020 and a growth rate of 25 percent per year, text analytics has graduated from an exotic niche market to a legitimately mainstream technology. Although text analytics software cannot emulate the functioning of the human brain, when its strengths (the things it does better than humans) are combined with human skills, the results are impressive.