The Difference Between Legal Search and Web Search
What You Should Know About Search Tools for E-Discovery
In some cases, a Web search appliance keeps only the 20,000 most relevant files in its index for a given term. Such a search engine is useless in a legal context, where relevant documents cannot be silently left out. Furthermore, many Web search technologies cannot index compound documents (e.g., .zip and .pst), bitmap data, multimedia documents, older electronic file formats and encrypted files. When a legal search program encounters these types of documents, it should either set them aside through a culling process or automatically apply additional processing to make them fully searchable. When a document is full-text indexed, its document and file properties should be automatically extracted as well ("forensic indexing") and made searchable.
Remember, too, that a crawler needs to automatically detect and set aside corrupt, encrypted or unexpected files, which could otherwise crash it.
In sum, full-text indexing is a detailed process, which underscores that you need to know exactly how your search engine works and how to explain it in court or to opposing counsel. If existing case law refers to the engine you use, that helps build your credibility as well.
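The triage step described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular product works: the function names `triage` and `build_exception_report` and the two extension lists are hypothetical, and a real pipeline would inspect file signatures rather than trust extensions. The point it shows is that no file is dropped silently. Each one is either indexed, expanded into its parts, or recorded on an exception list that can be explained later.

```python
import io
import os
import zipfile

# Hypothetical extension lists, chosen only for illustration.
COMPOUND_EXTENSIONS = {".zip", ".pst"}
TEXT_EXTENSIONS = {".txt", ".htm", ".html", ".csv"}


def triage(name: str, data: bytes) -> str:
    """Classify one collected file as 'index', 'expand' or 'exception'."""
    suffix = os.path.splitext(name)[1].lower()
    if suffix == ".zip":
        try:
            with zipfile.ZipFile(io.BytesIO(data)) as archive:
                # Bit 0 of the general-purpose flags marks an encrypted member.
                if any(info.flag_bits & 0x1 for info in archive.infolist()):
                    return "exception"
        except zipfile.BadZipFile:
            # A corrupt container is reported instead of crashing the crawler.
            return "exception"
        return "expand"
    if suffix in COMPOUND_EXTENSIONS:
        return "expand"
    if suffix in TEXT_EXTENSIONS:
        return "index"
    # Unknown or unsupported types are set aside, not ignored.
    return "exception"


def build_exception_report(files: dict[str, bytes]) -> list[str]:
    """Return the sorted names of all files that could not be processed."""
    return sorted(
        name for name, data in files.items() if triage(name, data) == "exception"
    )
```

An exception list like the one returned by `build_exception_report` is what lets you account for every collected file when asked how the index was built.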
Understanding Disclosure
Regulators, courts and opposing counsel often have very specific document format requirements for disclosed data. You should be able to support common legal file formats such as DII, EDRM XML, iPro or Concordance load files. You should also be able to redact documents, in which case you need to TIFF-print native electronic files so that you can verify that all non-relevant information has been removed from the disclosed document. In addition, retrieved data needs to be collected and copied to a legal hold server, which is nearly impossible with Web search engines.
Failing to address these points leads to expensive and inefficient discovery work. Every irregularity, missed deadline or missing piece of data means a potential fine and more reliance on expensive outside vendors. You reduce risk, exposure and cost by understanding the required processes, matching procedures to those processes, using the right tools, and working with the right partners.
For more information about standards and best practices, consult:
- EDRM.net: provides the recognized standard for e-discovery;
- The Sedona Conference: offers reports on search, discovery, legal hold, records management and document production;
- TREC Legal project: evaluates high-precision and high-recall search technologies; and
- www.zylab.com: showcases a highly rated developer of award-winning e-discovery solutions used in high-profile cases.
Records Management, E-Discovery and Knowledge Management
ZyLAB’s Universal Approach
Since 1983, ZyLAB has worked alongside professionals in the auditing, legal and intelligence communities to develop tools for investigating and managing large sets of archived data. These award-winning technologies have been bundled into the ZyIMAGE Information Access Platform, an integrated document, content and records management solution that enables businesses, auditors and legal professionals to capture, investigate, structure and disclose information in an efficient and secure manner.
ZyIMAGE helps you find more, giving you the proven technology required for comprehensive legal search:
- Support for large and nested complex Booleans, proximity and quorum search;
- Fast fuzzy search (supporting first-character changes) and advanced wildcard search (a*, *a, a*a and *a*);
- Hit-highlighting and hit-navigation;
- Reproducible and reliable relevance ranking;
- Forensic indexing of file and document properties;
- Automatic language recognition;
- Indexing capabilities for compound objects such as nested emails, compressed files, email collections, databases and more;
- Extended index and search process auditing and reporting;
- Advanced visualization tools;
- Incremental indexing of live network data;
- Integration with records management, legal hold, identification, collection, legal review, (TIFF) productions and redaction processes;
- Advanced text analytics and machine translation; and
- A search engine mentioned in existing case law.
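To make three of the query types in the list above concrete, here is a minimal Python sketch of wildcard, quorum and proximity matching over a single document. It is an illustration under simple assumptions, not ZyIMAGE's implementation: the function names are hypothetical, the text is scanned directly instead of using an index, and a quorum query is taken to mean "at least a given number of the listed terms must occur".

```python
import fnmatch
import re


def tokenize(text: str) -> list[str]:
    """Split text into lowercase word tokens."""
    return re.findall(r"\w+", text.lower())


def term_matches(tokens, pattern: str) -> bool:
    """True if any token matches the pattern; '*' stands for any characters."""
    pattern = pattern.lower()
    return any(fnmatch.fnmatchcase(token, pattern) for token in tokens)


def quorum_match(text: str, patterns: list[str], minimum: int) -> bool:
    """True if at least `minimum` of the patterns occur in the text."""
    tokens = set(tokenize(text))
    hits = sum(1 for pattern in patterns if term_matches(tokens, pattern))
    return hits >= minimum


def proximity_match(text: str, first: str, second: str, max_distance: int) -> bool:
    """True if the two words occur within `max_distance` tokens of each other."""
    tokens = tokenize(text)
    first_positions = [i for i, t in enumerate(tokens) if t == first.lower()]
    second_positions = [i for i, t in enumerate(tokens) if t == second.lower()]
    return any(
        abs(i - j) <= max_distance
        for i in first_positions
        for j in second_positions
    )
```

For example, `quorum_match(text, ["contract*", "merger", "fraud"], 2)` accepts a document that contains any two of the three terms, which is useful when no single keyword reliably identifies the relevant material. Because each function is deterministic, the same query over the same data always returns the same result, which is the kind of reproducibility you need to be able to explain in court.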