
  • May 1, 2008
  • By Sue Feldman, President, Synthexis, and co-founder of the Cognitive Computing Consortium

What Makes Search Great?

A great search solution requires more than a search engine. Search engines are useful technologies. But to transform a technology into something that is useful and approachable, search applications must extend the basic search engine to include additional features and technologies. These might include:

  • Tunable relevance to make results more contextually accurate and relevant;
  • Categorization and clustering for browsing results, exploring collections or doing queryless searches;
  • Text analytics to extract concepts; names of important people, places and things; and their relationships to each other;
  • Specialized vocabularies, ontologies and taxonomies to remove ambiguity or to adapt an application to a particular industry or task; and
  • Additional rich media technologies like speech or image recognition to understand what speech recordings, images or videos are about, thereby incorporating these formats into the information platform.

But this is still not enough to create a great search solution. The tools and interfaces that turn the bundle of technologies into a trusted, valued information access platform make the difference between users flocking to or fleeing from a knowledge center, an intranet or a customer service portal. They are the face of the underlying technologies: they make it possible to interact with the application, configure it, find information and use it within the daily flow of one’s work. Therefore, while a great search solution includes multiple features and technologies, it also incorporates:

  • Interfaces that make information finding easier and more intuitive;
  • Flexible configuration and administrative tools that provide control and visibility into the system while not overwhelming the non-search expert;
  • Security—at the document and the subdocument levels, applied at either indexing or search time;
  • Ability to search across legacy, current and even future formats, including audio and video content;
  • Connectors to a wide variety of repositories and business applications;
  • Quick response time;
  • Quick—almost real-time—index updating;
  • Index backup;
  • Scalability;
  • Affordable total cost of ownership;
  • Black belt-class customer support that doesn’t have the meter running;
  • Tools for adding metadata automatically and for normalizing relevance across different repositories; and
  • A footprint, or an index size, that is appropriate for the use.
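
To make the security item above more concrete, here is a minimal sketch of document-level security applied at search time, where each hit is checked against the searcher's group memberships before it is returned. The `Document` class, the `allowed_groups` field and the `search` function are hypothetical names invented for this illustration; they do not come from any particular product.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical indexed document carrying its own access list."""
    doc_id: str
    text: str
    allowed_groups: set = field(default_factory=set)


def search(index, query, user_groups):
    """Return ids of documents that match every query term AND that the
    user is allowed to see (security trimming at search time)."""
    terms = query.lower().split()
    hits = []
    for doc in index:
        words = doc.text.lower().split()
        matches = all(term in words for term in terms)
        permitted = bool(doc.allowed_groups & user_groups)
        if matches and permitted:
            hits.append(doc.doc_id)
    return hits


# Assumed sample data: two knowledge base entries with different audiences.
index = [
    Document("kb-1", "reset password for mainframe console", {"support", "partners"}),
    Document("kb-2", "internal escalation policy for password reset", {"support"}),
]

partner_hits = search(index, "password reset", {"partners"})
support_hits = search(index, "password reset", {"support"})
```

Applying the same check at indexing time would instead mean building separate indexes (or index partitions) per audience, which trades flexibility for faster queries.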

The fact is, enterprise search is often quite complex. The technologies are poorly understood and poorly explained, and the implications of choosing one combination of tools and technologies over another are not clear to most buyers. Enterprise search also makes different demands on an information access platform than Web search does: better accuracy, security, more formats, more reporting tools, more language understanding and better interaction design. The design and requirements differ with each specific use.

Add to this the quandary that a simple, confined use of search can grow quickly if it is successful, with other departments wanting to emulate the original success. We hear, over and over, that good search wins converts and spreads virally within an organization. Buyers therefore must consider not only their immediate purchase but also whether it has the agility and flexibility to be rolled out for additional uses. It must be scalable enough to accommodate unknown and unexpected caches of information in almost any format.

Case Study: Supporting CA’s Tech Support
Good technical support is crucial for companies like CA. To many of CA’s customers, technical support is the face of the company. Yet, technical support is also expensive. And, if responses are not standardized and predictable, customers may get different answers to the same question—some better than others.

The problem: CA (Computer Associates) knew that improving and standardizing technical support was a complex problem. Its services organization, its distribution organization, its partners and its support teams all had developed different information silos, and each group viewed a problem from a different perspective. CA needed to provide consistent access across all silos, legacy and Web-based applications, to all stakeholders. Sam Detweiler, vice president for CA’s technical support systems, found that even though the accepted wisdom was to centralize all of the knowledgebases, the barriers to locating and moving millions of documents, combined with the organizational politics of the process, were insurmountable. He decided to add search as an umbrella layer that would mask the underlying complexity while making it appear that all of the documents were in a single collection.
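
The idea of an umbrella layer can be sketched roughly as follows: one query is sent to every silo, each silo's scores are rescaled to a common range, and the results are merged into a single list. This is only an illustration under stated assumptions, not a description of CA's actual system. The silo names, the `(doc_id, score)` result shape and the min-max scaling are all choices made for the example.

```python
def normalize(hits):
    """Min-max scale a list of (doc_id, score) pairs to the 0..1 range so
    that scores from different repositories become comparable."""
    if not hits:
        return []
    scores = [score for _, score in hits]
    low, high = min(scores), max(scores)
    span = high - low
    return [(doc_id, 1.0 if span == 0 else (score - low) / span)
            for doc_id, score in hits]


def federated_search(silos, query, limit=10):
    """Query every silo, normalize its scores, and merge into one ranking."""
    merged = []
    for silo_name, search_fn in silos.items():
        for doc_id, score in normalize(search_fn(query)):
            merged.append((score, silo_name, doc_id))
    # Highest score first; ties broken by silo name, then document id.
    merged.sort(key=lambda row: (-row[0], row[1], row[2]))
    return merged[:limit]


# Hypothetical silos that score on very different scales.
silos = {
    "services": lambda q: [("svc-7", 1200.0), ("svc-2", 400.0)],
    "support": lambda q: [("sup-1", 8.0), ("sup-5", 2.0), ("sup-9", 5.0)],
}

results = federated_search(silos, "install error")
ranked_ids = [doc_id for _, _, doc_id in results]
```

To the person searching, `ranked_ids` looks like the output of a single collection, even though the documents never left their original repositories.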

The goal was to integrate search with a cross-CA initiative to establish a single technology framework for the future that comprises search, content management, and collaborative technologies. The trick would be to integrate legacy applications and formats into this future framework. CA’s support database was 30 years old. It preceded ODBC.

The solution: Detweiler inventoried his data first to determine the formats and collections he would have to include. Then he got buy-in from other stakeholders by distributing his list of criteria and asking for comments. Before interest in the project waned, he staged a bakeoff among three contending products to find a search engine that:

  • Connected to all legacy applications;
  • Indexed legacy and current formats without reformatting (IBM Bookframe .bu files, wikis, PDF, HTML, ODBC, as well as standard Office and database formats);
  • Required little end-user training;
  • Provided tools for easy UI design; and
  • Could be installed easily by CA staff.

Detweiler installed three search engines and measured them for:

  • Accurate relevance ranking;
  • Ease of configuration;
  • Ability to connect to CA’s legacy databases;
  • Time to load; and
  • Time to search.
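
The two timing measurements in this list lend themselves to a simple harness. The sketch below assumes each candidate engine exposes `load` and `search` methods; that interface, the `ToyEngine` class and the sample data are hypothetical. Relevance ranking and ease of configuration need human judgment and are not covered here.

```python
import time


class ToyEngine:
    """Hypothetical stand-in for a search engine under evaluation."""

    def __init__(self):
        self.docs = []

    def load(self, documents):
        self.docs = [doc.lower() for doc in documents]

    def search(self, query):
        needle = query.lower()
        return [i for i, doc in enumerate(self.docs) if needle in doc]


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def bakeoff(engines, documents, queries):
    """Measure time to load and average time to search for each engine,
    using the same documents and queries for every contender."""
    report = {}
    for name, engine in engines.items():
        _, load_seconds = timed(engine.load, documents)
        search_seconds = [timed(engine.search, q)[1] for q in queries]
        report[name] = {
            "load_seconds": load_seconds,
            "avg_search_seconds": sum(search_seconds) / len(search_seconds),
        }
    return report


documents = ["Backup job fails after upgrade", "Agent cannot connect to server"]
queries = ["backup", "connect"]
report = bakeoff({"engine_a": ToyEngine(), "engine_b": ToyEngine()},
                 documents, queries)
```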

The search engines were all tested on the same set of 25 million documents indexed from a variety of sources and document formats. All three retrieved documents in less than a third of a second, although the answers appeared in different orders in the top two pages of results. Where they differed was in their ability to load old ODBC database data. One search engine informed him that he would have to turn the data into an XML flat file feed to make it indexable.
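
For readers unfamiliar with the term, an XML flat file feed is an export of database rows into a plain XML document that an indexer can read. The sketch below shows one way such an export might look. It uses an in-memory SQLite table as a stand-in for a legacy database, and the element names (`feed`, `document`, `field`) and the `tickets` table are assumptions made for the example, not any vendor's required schema.

```python
import sqlite3
import xml.etree.ElementTree as ET


def rows_to_xml_feed(conn, table, id_column):
    """Export every row of a table as one <document> element.
    The table name is interpolated directly, so it must be trusted."""
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [col[0] for col in cursor.description]
    feed = ET.Element("feed")
    for row in cursor:
        record = dict(zip(columns, row))
        doc = ET.SubElement(feed, "document", id=str(record[id_column]))
        for name, value in record.items():
            if name == id_column:
                continue
            field = ET.SubElement(doc, "field", name=name)
            field.text = "" if value is None else str(value)
    return ET.tostring(feed, encoding="unicode")


# Stand-in for a legacy support database (hypothetical schema and data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (ticket_id INTEGER, summary TEXT)")
conn.execute("INSERT INTO tickets VALUES (1, 'Job fails after upgrade')")

xml_feed = rows_to_xml_feed(conn, "tickets", "ticket_id")
```

Even a small export like this has to be rerun whenever the source data changes, which helps explain why a conversion requirement counted against a contender when the data set ran to millions of documents.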
