Text Analytics: on the trail of business intelligence
Business intelligence (BI) solutions have typically focused on analysis of quantitative data to measure and predict organizational performance. The analyses help drive decisions about staffing, R&D, marketing and other business activities. However, quantitative analyses do not always indicate causality. Clues to the causes are more often found in unstructured data, which constitutes the great majority of information but is more difficult to analyze. Organizations ranging from intelligence and law enforcement to consumer goods face a major challenge in making effective use of that data, but text analytics is providing some of the answers.
The Chesterfield County Police Department in Virginia is one of the largest in the country, with 500 officers and 100 civilians. Among the staff are about a dozen analysts who review data about ongoing cases to help resolve them. Each case report consists of a set of structured data with fields such as the date of the incident, names of individuals arrested and other related information that is stored in a records management system. The report also includes a narrative that describes the incident. During a typical year, the department generates about a million lines of narrative text.
Because of the volume of the narratives, the department began considering the use of an automated text analysis product from Attensity called the Law Enforcement Analyst Desktop Solution (LEADS). Attensity’s technology uses computational linguistics to extract information about entities (people, places and events) along with information about the relationships among the entities. The resulting data is then converted to XML format and saved in a relational database, where it can be combined with other structured data and analyzed using BI tools.
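The flow described above (extracted entities and relationships, serialized to XML, then stored in relational tables next to structured case data) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not describe Attensity's XML schema or database layout, so the element names, table names, case number and the hand-written `extracted` record below are all hypothetical, and the extraction step itself is not shown.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical output of an extraction engine for one narrative.
# All names and values here are made up for illustration.
extracted = {
    "case_id": "2007-00123",
    "entities": [
        {"id": "e1", "type": "person", "text": "John Doe"},
        {"id": "e2", "type": "place", "text": "Main Street"},
        {"id": "e3", "type": "event", "text": "break-in"},
    ],
    "relations": [
        {"source": "e1", "target": "e3", "type": "suspect_in"},
        {"source": "e3", "target": "e2", "type": "occurred_at"},
    ],
}


def to_xml(record):
    """Serialize entities and relations to an (assumed) XML layout."""
    root = ET.Element("case", id=record["case_id"])
    for ent in record["entities"]:
        node = ET.SubElement(root, "entity", id=ent["id"], type=ent["type"])
        node.text = ent["text"]
    for rel in record["relations"]:
        ET.SubElement(root, "relation", source=rel["source"],
                      target=rel["target"], type=rel["type"])
    return ET.tostring(root, encoding="unicode")


def load_xml(conn, xml_text):
    """Parse the XML and insert its contents into relational tables."""
    root = ET.fromstring(xml_text)
    case_id = root.get("id")
    conn.executemany(
        "INSERT INTO entity VALUES (?, ?, ?, ?)",
        [(case_id, e.get("id"), e.get("type"), e.text)
         for e in root.findall("entity")],
    )
    conn.executemany(
        "INSERT INTO relation VALUES (?, ?, ?, ?)",
        [(case_id, r.get("source"), r.get("target"), r.get("type"))
         for r in root.findall("relation")],
    )


def build_database():
    """Create an in-memory database holding structured and extracted data."""
    conn = sqlite3.connect(":memory:")
    # Stand-in for the structured fields of a records management system.
    conn.execute("CREATE TABLE incident (case_id TEXT PRIMARY KEY, incident_date TEXT)")
    conn.execute("CREATE TABLE entity (case_id TEXT, entity_id TEXT, "
                 "entity_type TEXT, entity_text TEXT)")
    conn.execute("CREATE TABLE relation (case_id TEXT, source_id TEXT, "
                 "target_id TEXT, relation_type TEXT)")
    conn.execute("INSERT INTO incident VALUES (?, ?)", ("2007-00123", "2007-03-14"))
    load_xml(conn, to_xml(extracted))
    return conn


def people_by_incident(conn):
    """Join structured incident data with people extracted from narratives."""
    return conn.execute(
        "SELECT i.case_id, i.incident_date, e.entity_text "
        "FROM incident i JOIN entity e ON e.case_id = i.case_id "
        "WHERE e.entity_type = 'person' ORDER BY e.entity_text"
    ).fetchall()
```

Once the extracted data sits in ordinary tables, a query such as `people_by_incident(build_database())` can combine it with the structured fields, which is the kind of join a BI tool would perform.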
After evaluating the potential benefits of LEADS, the Chesterfield County Police Department decided to proceed with its implementation and is now moving data from its records management system (RMS) into an Attensity database. Data extracted from the narratives will also be put into the relational database, and the combination will form the basis for in-depth analyses. Of particular interest are the links or relationships between people, locations and events. The visual presentation of links and charts of the results provides a different perspective on the data and offers new insights that are not available from the text alone. The department analysts currently conduct extensive link analyses as an important part of their investigations, but LEADS will help speed up the process substantially.
"Once LEADS is operational, we will be able to detect patterns in criminal activity much more quickly," says Jack Ritchie, commander of the Administrative Support Bureau at the Chesterfield County Police Department. "For example, the narrative description for several cases may mention a particular type of vehicle at the crime scene or a method of break-in. Or, several victims of crimes may report the same words being used by an attacker, indicating that the same individual may be responsible for each crime."
LEADS will help ensure that no references are missed, because it will find a name, location or any other term every time it appears, and can quickly locate information that might otherwise remain buried in reports.
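The idea of finding every occurrence of a term and spotting cases that share it can be illustrated with a small Python sketch. It uses plain case-insensitive phrase matching, which is a much simpler technique than the linguistic extraction the article attributes to LEADS; the narratives, case identifiers and function names are invented for the example.

```python
import re
from collections import defaultdict

# Made-up narrative snippets keyed by hypothetical case identifiers.
narratives = {
    "case-001": "Witness saw a white van parked behind the store. Rear window was forced.",
    "case-002": "Victim reported a White Van leaving the lot shortly after the alarm.",
    "case-003": "Front door lock was picked. No vehicle was seen.",
}


def find_mentions(narratives, terms):
    """Return every (case_id, character offset) where each term appears."""
    hits = defaultdict(list)
    for term in terms:
        # Whole-phrase, case-insensitive match; re.escape guards special characters.
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        for case_id, text in narratives.items():
            for match in pattern.finditer(text):
                hits[term].append((case_id, match.start()))
    return dict(hits)


def shared_terms(hits):
    """Keep only terms mentioned in more than one case, with the case list."""
    result = {}
    for term, mentions in hits.items():
        cases = sorted({case_id for case_id, _ in mentions})
        if len(cases) > 1:
            result[term] = cases
    return result
```

Here `shared_terms(find_mentions(narratives, ["white van", "forced"]))` reports that "white van" links case-001 and case-002, while "forced" appears in only one case and is dropped.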
In addition to providing a way for experienced staff to analyze and visualize the crime report data, LEADS will be helpful to new workers. "New analysts are not in a position to be aware of the history of crimes committed before they joined the department," Ritchie says, "but by using LEADS, they can gain access to relevant historical information very quickly."
"The system is easy to use," Ritchie adds, "and we anticipate that it will have a significant impact on our ability to close our cases and support officers in the field."
The LEADS solution is one of a number of specialized modules that focus on particular issues, including fraud detection, claims analytics and "voice of the customer," which analyzes customer feedback.
"Quantitative data reveals what is happening, but not why," says Michelle DeHaaff, VP of marketing. "The explanations are often contained in comments in surveys, discussion groups or notes taken by call center representatives." Attensity’s use of statistical processes integrated with linguistic extraction allows a more precise interpretation of the content.
"Looking for the word ‘happy’ in the text will not provide an accurate analysis if the word ‘not’ appears nearby," adds DeHaaff. "By integrating statistical and linguistic analyses, we are able to get a more accurate—and actionable—result."
Tuning in to customers
The evolution of text analysis is customer-driven, according to Sid Banerjee, CEO of Claraview. Claraview is a consulting firm that specializes exclusively in business intelligence solutions, primarily in the area of transactional analysis. As a result of customer demand for a solution that integrated transactional data with communications data, such as call center messages, the company spun off a separate business, Clarabridge, to answer that need. Clarabridge developed and markets a text mining/text analytics platform that transforms customer comments, notes, reviews and other unstructured content into quantifiable, statistically meaningful reports and analyses.