Assessing the Right Search Platform for Your Enterprise
Advances in technology have made a depth and breadth of information available that challenges the capacity of humans to fully comprehend, easily process and quickly find what they are looking for, when they need it. Unfortunately, software applications read algorithms, not minds. And the "information" intention of an untrained searcher can be as cryptic to a search engine as its own code appears to all but the most skillful programmers. When it comes to deciphering what our search expressions really mean, a search tool is only as good as its appreciation for connotations and overtones. Like the humans who utter them, words often take on different identities at work than they do in more traditional settings. Anyone who has ever received a literal response to a figurative question knows there's a world of difference between formal and casual dialects when we're "talking shop." The ultimate test for any enterprise search tool is how well it "gets" what you're looking for, whether explicitly stated, broadly defined or even loosely implied. Are you shopping for nuts and bolts? Or are you more of a "bells and whistles" type of buyer?
This article gives prospective buyers of enterprise search guidance on where to focus their attention, helping to isolate the must-haves from the "cool"—but sometimes superfluous—features associated with the latest advancements in search technology.
The Search for Search Tools
Enterprise search is no longer the domain of the inquisitive and meandering. It is a cost of doing business that has become more visible as the process itself becomes a regular pathway in the everyday routines of organizations.
CHECKLIST: The Business Case.
According to Gartner's Whit Andrews, enterprises overwhelmingly identify relevant results as their key criterion for selecting a vendor to the exclusion of additional features, technology or even cost factors. The power of a relevancy system is that it brokers one person's questions with a community of compiled responses. Just how relevant is a question of continual and spirited debate. Whether your search goals are based on fancy analytics or simple answers, it is important to remember that it's not about the latest search engineering wizardry, it's about finding the information you need fast. Let's take a look at several indicators that typically fuel search engine selections:
Speed of Results—A search platform enables enterprises to index external data sources, such as network drives and enterprise e-mail servers, including public folders, ahead of time. So when a customer decides to search for a file, most of the work has already been done at indexing time, trimming search speeds from minutes to moments.
TIP: Keywords are no match for semantics: Contextually based search engines are able to address the prevailing themes that emerge across document collections. This method goes beyond the exact matching of keywords to approximate the more discriminating business of user intention.
At the law firm of Fulbright & Jaworski, SERglobalBrain (an intelligent search and retrieval solution from SER Solutions, Inc.) is able to search for an entire paragraph of text in over 25 million documents and yield relevant results in a few seconds. The full-text search engine that SERglobalBrain replaced at Fulbright typically timed out after several minutes of attempting to search the same number of documents.
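The difference between exact keyword matching and similarity-based ranking can be illustrated with a minimal sketch. It assumes a plain term-frequency cosine similarity as a deliberately simple stand-in for contextual matching; it is not how SERglobalBrain or any other product works. The function names, sample documents and query are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def exact_match(query, documents):
    """Return indexes of documents that contain the query as a literal phrase."""
    return [i for i, doc in enumerate(documents) if query.lower() in doc.lower()]


def cosine_similarity(a, b):
    """Cosine of the angle between two term-frequency vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, documents):
    """Return (index, score) pairs sorted from most to least similar."""
    query_vec = Counter(tokenize(query))
    scores = [
        (i, cosine_similarity(query_vec, Counter(tokenize(doc))))
        for i, doc in enumerate(documents)
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical sample data: two documents on the same theme, one unrelated.
documents = [
    "The tenant must pay rent on the first day of each month.",
    "Rent is due from the tenant on the first of every month.",
    "The quarterly sales report covers revenue in Europe.",
]
query = "tenant shall pay rent on the first of the month"

literal_hits = exact_match(query, documents)   # no document contains the exact phrase
ranked = rank_by_similarity(query, documents)  # the two rent documents rank above the sales report
```

In this sketch the literal phrase search finds nothing, while the similarity ranking still places both rent-related documents ahead of the unrelated one. A production engine would use far richer signals than raw term counts.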
Index Facility—Indexing performance or facility is the ability to collect data from all relevant sources and formats, including file servers, internal repositories and relational databases, in a dependable manner. Unlike the reassuring product demo in the first sales call, your corporate records are not static repositories and need to be updated in near real time.
TIP: Try this at home: Some solutions have a hard time responding to unexpected demands and out-of-sequence requests. Re-indexing an entire corpus may be required when a feature that should have been configured up front is instead added later as an add-on.
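One way to picture the near real-time updating described above is an index that re-processes only the documents that are new or have changed. The following is a minimal sketch under stated assumptions: a toy in-memory inverted index, content hashes used to detect change, and a hypothetical `IncrementalIndex` class that does not correspond to any vendor's API.

```python
import hashlib
import re
from collections import defaultdict


class IncrementalIndex:
    """Toy inverted index that re-indexes only new or changed documents."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> set of document ids
        self.fingerprints = {}            # document id -> content hash
        self.doc_terms = {}               # document id -> terms currently indexed

    @staticmethod
    def _terms(text):
        return set(re.findall(r"[a-z0-9]+", text.lower()))

    def _remove(self, doc_id):
        """Drop every posting that belongs to doc_id."""
        for term in self.doc_terms.pop(doc_id, set()):
            self.postings[term].discard(doc_id)
            if not self.postings[term]:
                del self.postings[term]
        self.fingerprints.pop(doc_id, None)

    def sync(self, source):
        """Bring the index in line with `source`, a dict of id -> text.

        Returns the sorted ids that were indexed, re-indexed or removed.
        """
        touched = []
        # Remove documents that no longer exist in the source.
        for doc_id in list(self.fingerprints):
            if doc_id not in source:
                self._remove(doc_id)
                touched.append(doc_id)
        # Index documents that are new or whose content hash has changed.
        for doc_id, text in source.items():
            digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
            if self.fingerprints.get(doc_id) == digest:
                continue  # unchanged, so skip the work
            self._remove(doc_id)
            terms = self._terms(text)
            for term in terms:
                self.postings[term].add(doc_id)
            self.doc_terms[doc_id] = terms
            self.fingerprints[doc_id] = digest
            touched.append(doc_id)
        return sorted(touched)

    def search(self, term):
        """Return the sorted ids of documents containing the term."""
        return sorted(self.postings.get(term.lower(), set()))


# Hypothetical usage: the second pass touches only what changed.
index = IncrementalIndex()
source = {"memo-1": "Budget review for Q3", "memo-2": "Travel policy update"}
first_pass = index.sync(source)
source["memo-2"] = "Travel and expense policy update"
source["memo-3"] = "Budget forecast for Q4"
second_pass = index.sync(source)
```

When evaluating a product, the equivalent question is whether adding a source or changing a document forces a full rebuild or only an update of this kind.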
Categorization—The categorization process spans widely divergent levels of engagement, from the ambitious mapping of internal classifications to the hands-off approach of pointing the search device at a collection of documents and grouping them according to similarity. Such approaches improve the speed of searches and reduce the time it takes to classify the large repositories they pull from.
TIP: Think billing codes for clients: Such divisions are a green light for consulting groups and service organizations like legal firms whose client base can span broad sets of industries. For a consulting group or business services provider, training the search solution means not having to configure radically different frameworks for each type of clientele served.
SERglobalBrain "learns by example," much like a human, by reading and understanding sample documents in order to categorize like content without the need to develop and maintain complex rules.
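The general idea of categorizing by example, as opposed to writing rules, can be sketched with a small naive Bayes classifier. This is an illustrative assumption only: the source does not describe SERglobalBrain's actual method, and the `ExampleBasedCategorizer` class, the category names and the sample sentences are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class ExampleBasedCategorizer:
    """Multinomial naive Bayes trained from labeled sample documents."""

    def __init__(self):
        self.term_counts = defaultdict(Counter)  # category -> term frequencies
        self.doc_counts = Counter()              # category -> number of samples
        self.vocabulary = set()                  # every term seen in training

    def learn(self, category, sample_text):
        """Add one sample document to the given category."""
        tokens = tokenize(sample_text)
        self.term_counts[category].update(tokens)
        self.doc_counts[category] += 1
        self.vocabulary.update(tokens)

    def categorize(self, text):
        """Return the category whose samples best explain the text."""
        if not self.doc_counts:
            raise ValueError("no sample documents have been learned yet")
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        best_category, best_score = None, -math.inf
        for category, counts in self.term_counts.items():
            total_terms = sum(counts.values())
            # Start from the log prior, then add smoothed log likelihoods.
            score = math.log(self.doc_counts[category] / total_docs)
            for token in tokens:
                # Laplace (add-one) smoothing keeps unseen terms from zeroing the score.
                score += math.log((counts[token] + 1) / (total_terms + vocab_size))
            if score > best_score:
                best_category, best_score = category, score
        return best_category


# Hypothetical training samples for a law firm with two practice areas.
categorizer = ExampleBasedCategorizer()
categorizer.learn("litigation", "The plaintiff filed a motion to dismiss the complaint in district court.")
categorizer.learn("litigation", "The court granted the motion and the defendant filed an appeal.")
categorizer.learn("real estate", "The lease requires the tenant to pay rent to the landlord each month.")
categorizer.learn("real estate", "The landlord may terminate the lease if the tenant fails to pay rent.")

first_label = categorizer.categorize("The defendant asked the court to dismiss the appeal.")
second_label = categorizer.categorize("The tenant signed a new lease with the landlord.")
```

No rule such as "if the document mentions a landlord, file it under real estate" is written anywhere in this sketch; adding a new category means supplying new sample documents rather than new rules.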
Taxonomy Support—The keyword that many enterprise search vendors like to sound is taxonomy: a system for organizing information according to a group classification scheme. Taxonomies may need to be built and maintained to leverage the power of your search platform.
TIP: Handmade taxonomies may not be built to last: A sound search platform should not require predefined word relationships and should not be language-dependent. It should search the emergent patterns in the content, alleviating the need to build and maintain thesauri and taxonomies. As new data is added, relationships defined in formal taxonomies may not carry over and may require additional development. Additional administration is incurred when individual languages must be hosted as separate repositories.
Presentation—Look and feel, mapping workflow and drill-down to content are important "ease-of-use" factors in deploying search technologies. Models for displaying results range from in-box metaphors to austere Google-inspired interfaces to faceted search and browsing schemes.
TIP: Insist on presentational flexibility: Examinations of data may represent a dead-end for a master searcher or information overload for a novice user. Paring back such interfaces or selecting situation-appropriate interfaces (such as PDAs for remote sales worker support) can make the difference between a useless additional application and a well-received work enhancement tool.
Pricing—Is enterprise search a shopping-cart item that comes "plug-and-play" ready right out-of-the-box? Does any truly robust and best-fit solution require a full production team to fit and perform up to your exact requirements? Prices are not just across the board. They are all over the lot.
TIP: Hold the line on price: With such an inconsistent range of prices and services, make sure the one outcome you can count on is a price within your budget, with no risk that the changing nature of the project will rewrite the terms of the deal.
SERglobalBrain, for example, plugs into your existing IT infrastructure without requiring ongoing professional services.
CHECKLIST: The Technology Case.
From an IT perspective, time-to-deploy shows the largest variance of any single decision factor. Working from sample test data, field trials can range anywhere from a set-up time of three to four days to a span of five to six weeks or longer. Such pre-sales auditions are instructive both in terms of a solution's core offering and in terms of the additional tweaking needed to customize it to your own requirements. Other IT evaluation factors include:
- Database Support—Oracle (8i or above) and MS SQL Server (2000 or above) are the standard-bearers. Other solutions run on their own proprietary databases and may expose IT organizations to additional fees and maintenance cycles.
- Scalability—Larger-than-expected data volumes should be an expectation, not a red flag. Vendors typically provide a maximum index size. The nature of the data being indexed, however, can widely impact the index volume, rendering vendor comparisons difficult. For example, documents within an IRS data warehouse can contain 800 pages, whereas legal briefs typically run five to eight pages in length.
According to Paul Revilla, director of development for the law firm Fulbright & Jaworski L.L.P., SERglobalBrain is a solution that can scale to larger-than-expected data sets simply by adding hardware: "If you want to search-enable more databases, you can throw more servers at it." Competing solutions, Revilla remarks, required an additional layer of middleware when new source data was added to the mix.
- Integration—A healthy range of APIs exists to connect search technologies to business applications and portals, and to search-enable non-Web documents with XML.
SERglobalBrain, for instance, supports COM, Java and SOAP.
- Deployment—The implementation cycle includes installation, configuration, initialization and rollout. Each of these phases, along with some nominal customization, should be covered under a 3-6 week trial period.
- Security—Search platforms should conform to existing security models, such as single sign-on, without creating new points of maintenance.
Offerings like those from SER pre-select the content that the user has the right to access. This increases search speeds and avoids breaches of corporate security policy in which an innocent user stumbles onto greater access than his or her granted privileges would otherwise allow.
- File formats supported—Beyond Adobe, MS Office and e-mail formats, buyers must be aware that document types and content feeds supported are not consistent across all vendors. Transcription of video and audio files into searchable text is another flashy new feature which may be "cool," but is it an anticipated requirement?
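The Security item above describes pre-selecting the content a user is entitled to see before any matching takes place. A minimal sketch of that idea follows, assuming group-based permissions stored alongside each document. The `Document` class, the group names and the sample corpus are hypothetical, and a real deployment would read permissions from the existing security model rather than maintain its own.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str
    allowed_groups: frozenset = field(default_factory=frozenset)


def readable_documents(documents, user_groups):
    """Keep only documents that share at least one group with the user."""
    groups = set(user_groups)
    return [doc for doc in documents if doc.allowed_groups & groups]


def secure_search(documents, user_groups, term):
    """Filter by permission first, then match the term in what remains."""
    candidates = readable_documents(documents, user_groups)
    needle = term.lower()
    return [doc.doc_id for doc in candidates if needle in doc.text.lower()]


# Hypothetical corpus with per-document access groups.
corpus = [
    Document("hr-001", "Salary bands by job grade", frozenset({"hr"})),
    Document("eng-014", "Salary survey summary for engineering managers",
             frozenset({"engineering", "hr"})),
    Document("pub-100", "Holiday schedule", frozenset({"everyone"})),
]

engineer_hits = secure_search(corpus, {"engineering", "everyone"}, "salary")
hr_hits = secure_search(corpus, {"hr"}, "salary")
anonymous_hits = secure_search(corpus, set(), "holiday")
```

Because the permission filter runs first, restricted documents never enter the candidate set, which both shrinks the search space and keeps a restricted document from surfacing in results by accident.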
Due Diligence in the Selection Process
There is no substitute for experience and, even at their most exacting specifics, the preceding decision factors are mere placeholders when it comes to finalizing the best fit solution for your enterprise search requirements.
STEP 1—The Test Phase: Going to Field Trial
Paul Revilla comments that the "shoot-out" between vendors in a trial setting is a great way to test not only the solutions but also the mettle of the teams who build and support them. Revilla advises prospective buyers to test against data sets that exceed normal expectations. Pushing the limits of search platforms increases knowledge about the test products, the speed of the conclusions and ultimately the confidence in the vendor who earns the nod. While organizations and even the workgroups within them may differ about what constitutes the best search, few would argue that they don't want support issues to arise further down the road.
STEP 2—Subject Matter Experts: Driving School for Search Engines
In large organizations, it is often a dedicated staff of information architects, corporate record managers and librarians who consult with IT on the internal requirements for enterprise search. What if your organization does not count on a dedicated staff of knowledge engineers and taxonomists? Who do you count on to tune the engine and tweak the search results? "Count on your domain experts to lead the training session when your search solution comes to class," advises Christy Confetti-Higgins, Information Architect with Sun Microsystems' corporate library.
More than likely, not every file in your record archive will ever be accessed by human eyes, and not every server will be retrofitted with a common metadata standard. Still, your organization will be in a stronger position tomorrow when it starts closing knowledge deficits with the same zest that it adds to ever-growing content surpluses. Addressing enterprise search is the first step towards cashing in on your information assets, not just storing them.
SERglobalBrain can scale from personal laptop configurations to global enterprises. Its use of content-based search techniques provides fast, efficient access to relevant, up-to-date information and dramatically improves the way information is leveraged and managed. To learn more visit SER Solutions.
For the past 15 years, Marc Solomon has spread knowledge organizing search skills and practices to a broad group of corporate and nonprofit professionals. As president of Cambridge-based Attentionspin Advisory Group, he has conducted thousands of reference interviews and designed research training programs with Fortune 500 brand managers, management consultants, social workers and fundraising professionals. He writes on enterprise search and content issues for Searcher, KMWorld and Baseline magazines and consults with search engine vendors on how to create meaningful and action-based search results.