Common E-Discovery Terms & Definitions Part 1
Learning how to navigate the waters of eDiscovery can be difficult, especially when so many common industry terms can sound like a completely different language. Some terms hold a consistent meaning, while others change when they’re applied to the fields of electronic discovery and digital forensics. Here are a few common terms that come up in conversations around litigation, defined to help lift your eDiscovery literacy.
Admissible | Admissible evidence is evidence that a court allows to be presented during active litigation. In eDiscovery, this typically means that the data produced during discovery is defensible and was collected properly under the Federal Rules of Civil Procedure (FRCP). |
Big Data | Big Data is the term for very large volumes of data considered as a whole. It is made up of both structured and unstructured data. |
Continuous Active Learning (CAL) | CAL stands for continuous active learning. It is a form of machine learning in which the system uses past review decisions to shape future predictions. Through pattern recognition and prediction, every review decision automatically trains the system, and the system continually updates its predictions as new human classifications are made, helping legal reviewers see the bigger picture more quickly. |
Container File | A container file is a single file, such as a ZIP archive or an email PST file, that houses multiple documents. |
Custodian | In eDiscovery, a custodian is the person having administrative control of a document or electronic file. This is commonly the owner and/or creator of the electronically stored information. |
Deduplication | The process of removing duplicate files and data. Deduplication is important because it eliminates hosting and review costs that would otherwise be wasted on redundant copies. |
eDiscovery Processing | eDiscovery Processing is the process of preparing, converting, and optimizing ESI from the collection phase for document review and production. It’s at this stage where filters can be applied to reduce and refine the data. |
Electronically Stored Information (ESI) | ESI stands for Electronically Stored Information. This is the umbrella term that encompasses any and all data that is stored electronically. When referencing data as a connected whole, see: Big Data. When referencing the creation of ESI, see: IoT. |
Filtering | Using search criteria to eliminate electronically stored information in a particular data set that is irrelevant to the current litigation. |
FRCP | The Federal Rules of Civil Procedure. These rules govern civil litigation in U.S. federal courts and define the proper protocol for eDiscovery as well as other aspects of litigation. |
Hash | Hashes are alphanumeric values computed from a file's contents by a hashing algorithm (such as MD5 or SHA-1) that serve as an identifier for specific documents. A hash is not random: the same content always produces the same value, and even a small change to the file produces a different one. This makes hashes crucial for validating the authenticity of a document and for spotting exact duplicates. You can think of a hash as the file's very own fingerprint. |
Hosting | Hosting is the act of storing the entire dataset relating to a case, both during the document review phases and, in archived form, after the lawsuit ends. Hosting volume is typically discussed in gigabytes, or in terabytes for larger matters. |
Hosting Environment | A standalone environment for hosted data that is kept separate from other databases. These are typically used on large cases where data is meant to be isolated and not kept in a shared cloud or on a shared physical server. |
Information Governance | The process of managing digital information at an enterprise level while remaining cognizant of regulatory guidelines and e-discovery applications. |
Internet of Things (IoT) | IoT stands for the Internet of Things, with “things” referring to physical devices that are connected to the internet for the purpose of creating and exchanging data over a network. This includes everything from smartphones, laptops and computers to wearable technology like smart watches, and health devices like pacemakers. |
Keywords | Specific words or phrases used in search efforts to produce relevant results within a data set. |
Legacy Data | Legacy data is data pre-dating the current matter or case that is in question. This can be when data from a past case has overlap with a current matter. Counsel can obtain and import the legacy data to their hosting environment and reduce costs of the re-collection process. |
Load File | Load files are files prepared specifically for import into an eDiscovery technology platform. Each review platform (Relativity, Summation, and others) may have its own structure for load files, but it is also common for load files to first be delivered as CSV files. Load files are necessary to properly import data into eDiscovery software. |
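For readers who like to see the ideas in action, here is a minimal Python sketch of how hashing, deduplication, and keyword filtering fit together. It is an illustration only, not how any particular eDiscovery platform works: the function names (`file_hash`, `deduplicate`, `keyword_filter`), the sample documents, and the choice of SHA-256 as the hashing algorithm are all assumptions made for this example.

```python
import hashlib


def file_hash(content: bytes) -> str:
    """Return the SHA-256 digest of a document's bytes as a hex string.

    SHA-256 is an assumption for this sketch; real tools may use other algorithms.
    """
    return hashlib.sha256(content).hexdigest()


def deduplicate(documents: dict[str, bytes]) -> dict[str, bytes]:
    """Keep only the first document seen for each distinct hash value."""
    seen = set()
    unique = {}
    for name, content in documents.items():
        digest = file_hash(content)
        if digest not in seen:
            seen.add(digest)
            unique[name] = content
    return unique


def keyword_filter(documents: dict[str, bytes], keywords: list[str]) -> list[str]:
    """Return names of documents whose text contains any keyword (case-insensitive)."""
    lowered = [k.lower() for k in keywords]
    return [
        name
        for name, content in documents.items()
        if any(k in content.decode("utf-8", errors="ignore").lower() for k in lowered)
    ]


# Hypothetical sample data: two identical memos and one unrelated file.
documents = {
    "memo_v1.txt": b"Quarterly merger update for the board.",
    "memo_copy.txt": b"Quarterly merger update for the board.",
    "lunch.txt": b"Lunch menu for Friday.",
}

unique_docs = deduplicate(documents)                  # drops the exact copy
responsive = keyword_filter(unique_docs, ["merger"])  # keeps documents mentioning "merger"
```

Because identical content always produces the same hash, the copied memo is dropped before review, and the keyword search then narrows the remaining set to the documents that mention the search term.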
Looking for more terms? Keep an eye out for our next blog – Common eDiscovery Terms & Definitions Part 2. If you would like more information about eDiscovery, forensics, hosting or how TERIS can help you, please contact us!