Dr. Strangelove, or, 
How I Learned to Stop Worrying and Love BI
                
I've always had a nagging underappreciation for the whole business intelligence thing. I'm just not that into BI. In the first place, BI is always trailing the trend, not spotting it. BI reports are really, really good at telling you what already happened. They're kinda crappy at telling you what will happen next. On top of that, they're undemocratic. Your run-of-the-mill BI tool is like rocket science to most users, so most users don't get to play with it. As a result, well-meaning business managers have to go to the cubicle of the one guy who knows how to work the analytical tool and ask, "May I have a report, please?" It's like that scene in Oliver. It's sort of pathetic.
Worse, BI reports are b-o-r-i-n-g ... all squiggly-line graphs that look like yesterday's Dilbert cartoon. On top of that, BI is bloody expensive. The costs of the software, the hardware, and the salaries of the database analysts and the IT support staff are enough to ensure that "business intelligence" (at least in the way we've always thought of it) is a luxury for only the largest and most elite businesses.
So, in the interest of full disclosure, I came into this with a boatload of bias against BI.
But I had a chat recently with Davor Sutija, vice president, strategic market development for Fast Search & Transfer, and I'm beginning to wonder whether there are changes afoot in the BI marketspace that may lead to a people's revolution in BI. The emergence of enterprise search as a key enabler for business intelligence may just be the ticket out for those who, like me, have always held a certain disdain for old-school BI. Here's what I'm thinking...
                
	
    
Where the Twain Meets
As everyone knows by now, 80% of information in a company is unstructured, in the form of Word documents, emails, etc. There's also a great deal of legacy data that USED to be structured: data from ERP systems, for example. This information needs to be extracted using techniques that give you character strings. These character strings have to be parsed and restructured.
As we're just beginning to learn, without tying together the information residing in various databases with the text residing in various content repositories, we're only getting a partial view of our organizations. Call it a 360-degree view if you must, but without the unstructured portion, it's more like 38 degrees and dropping. And that's cold.
Actually, to be totally accurate, it's not precisely "un"structured content. It's just not very structured. And that's where enterprise search comes in. "An enterprise search system can combine unstructured sources with well-understood data models and also access to legacy systems in one repository that gives you access to the information in a semantically uniform manner," is how Davor puts it.
Basically, enterprise search is emerging as an easy way to do massive data integration. Here's why: People need to have access to multiple repositories of data that are stored in different locations and arranged into different data models. The old way of doing that is to create yet another new, formal data repository on top of all the other ones. Someone has to find data from the various target sources (which change all the time, by the way), extract the data and recreate it in the new format required by the BI software. This is not trivial, and accounts for a great deal of that budget you're spending for "IT."
Next, because the new data repository is uniquely formatted around the data model demanded by the BI tool and driven by the new information that has been added to it, a query that would result in anything useful has to be correctly fashioned, in the syntax required AND along the lines of the data model. In order to find a customer transaction, you need to know (a.) how to ask the question (syntax), and (b.) where among the rows and columns to direct your question (data model).
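To make the (a.) and (b.) problem concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name, column names and sample rows are invented for illustration; they stand in for whatever data model a real BI repository would impose. The point is only that the same question succeeds or fails depending on whether you guess the right column.

```python
import sqlite3

# Hypothetical stand-in for a BI repository: one table, one fixed data model.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (id INTEGER PRIMARY KEY, last_name TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO transactions (last_name, amount) VALUES (?, ?)",
    [("Anderson", 120.0), ("Baker", 75.5)],
)


def query_by_column(conn, column, value):
    """Look up transactions the way a report writer must: by naming the column."""
    # Column names cannot be bound as parameters, so the name is interpolated.
    # That is acceptable in a toy example but unsafe with untrusted input.
    sql = f"SELECT last_name, amount FROM transactions WHERE {column} = ?"
    return conn.execute(sql, (value,)).fetchall()


# Right syntax AND right data model: the query works.
right = query_by_column(conn, "last_name", "Anderson")

# Right syntax, wrong guess about the data model: the query fails outright.
try:
    query_by_column(conn, "customer", "Anderson")
    wrong = "no error"
except sqlite3.OperationalError as err:
    wrong = str(err)

print(right)
print(wrong)
```

The second call is the manager who knows what he wants but not where it lives among the rows and columns.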
All this is expensive and cumbersome, as you can imagine, but it also perpetuates the same problem you're trying to fix in the first place! By continuously adding new layers of database atop all the preceding databases, you are only layering up more complication and cost. Now, imagine if there were an inexpensive tool that could look at all data repositories in your organization and, because it understands both database models AND language semantics, could re-index everything in real time and provide a way to ask, in natural language, for the information you want. That's what enterprise search can do.
"The old way would have been to create a data warehouse, aggregate all your various data types into a single repository under one data model that your BI tool has access to," says Davor. "That is an antiquarian way of doing things," he insists. People want to ask different questions, and there are different types of users with different skill sets.
"You don't want to make people go to the IT department's SQL expert and ask for a fancy report based on his idea of what the data model should be," Davor says. Using natural language queries, you can ask for "Anderson." You don't have to know whether that is in the column called "customer" or the column called "last name." Search doesn't care.
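Here is a rough sketch of what "search doesn't care" might look like in code. It is a toy inverted index in plain Python, not how Fast Search & Transfer or any other vendor actually does it. The source names ("crm", "billing", "email"), field names and records are all made up for illustration. Every value in every record is indexed by the words it contains, regardless of which field it came from.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase a value and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", str(text).lower())


def build_index(sources):
    """Map each word to the set of (source, position) records containing it.

    Field names are ignored on purpose: only the values are indexed.
    """
    index = defaultdict(set)
    for source, records in sources.items():
        for pos, record in enumerate(records):
            for value in record.values():
                for token in tokenize(value):
                    index[token].add((source, pos))
    return index


def search(index, sources, query):
    """Return (source, record) pairs containing every word in the query."""
    tokens = tokenize(query)
    if not tokens:
        return []
    hits = set.intersection(*(index.get(t, set()) for t in tokens))
    return [(source, sources[source][pos]) for source, pos in sorted(hits)]


# Hypothetical repositories, each with its own idea of a data model.
sources = {
    "crm": [
        {"customer": "Anderson", "city": "Oslo"},
        {"customer": "Baker", "city": "Leeds"},
    ],
    "billing": [{"last name": "Anderson", "amount": 120.0}],
    "email": [{"body": "Call Ms. Anderson about the renewal."}],
}

index = build_index(sources)
results = search(index, sources, "Anderson")
for source, record in results:
    print(source, record)
```

One question, three repositories, three different places where the name is stored, and nobody had to know the difference between "customer" and "last name." A real product would add ranking, linguistics and incremental re-indexing on top of this idea.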
At the Center of the Enterprise
If your enterprise search tool can create a virtual repository, constantly re-indexing and establishing relationships among all the many character and data strings it sees, then your search tool can replace all that data integration for ridiculously less money and far less complexity in both the technology and in your organizational make-up.
Davor told me about a police force in Germany that has every criminal file from 1954 onward indexed into a search repository, and that is the only data repository they have. There is no other data warehouse or mart at all. This is a great example of this new trend; the old idea that a search engine is a mirror of some other repository (or repositories), used merely as a front-end for retrieval, is being transformed. The search repository IS the repository.