Case studies have shown that computer-assisted review may save clients millions of dollars and thousands of review hours without compromising quality. In an e-discovery world increasingly fraught with big data and unforgiving deadlines, computer-assisted review may be the difference between a successful review and a regrettable one.
kCura’s Relativity Assisted Review leverages human experts, text analytics categorization technology and a statistically sound quality-control system.
Creating and Training on the Seed Set
In the first phase of the Assisted Review workflow, human reviewers code a sample set of documents, which is referred to as the “seed set.” The seed set may be sourced from previously coded documents, a sample based on keyword searches, a random sample or a combination thereof. Because the seed set is the foundation on which the computer will code the remaining documents, it is imperative that it be coded consistently. To that end, one may consider having a small group of attorneys, or a specialized group of reviewers who are experts on the relevant issues, review and code this initial sample set of documents. These experts are referred to as “domain experts.” One may also consider seating the domain experts close to one another so that they can discuss coding decisions and keep their coding calibrated.
It is important to instruct the domain experts on how the computer “trains” on the seed set. They should know that an initial incorrect or ambiguous call may skew the computer’s training. They also should be instructed on the characteristics of good training documents. For example, because the computer utilizes a text analytics engine, it is best to use documents containing clear, relevant text. A document that consists primarily of symbols, numbers or pictures, or that contains poor-quality OCR text or heavy formatting, is not ideal.
After the domain experts have coded the seed set, the computer trains on it. Using a text analytics engine, the computer will categorize the remaining documents as responsive or not responsive based on the seed set’s coding and the concepts or context of the documents. Multiple training rounds may be required.
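The idea of categorizing documents by their similarity to coded seed documents can be illustrated with a small sketch. This is not Relativity's text analytics engine: it assumes a simple bag-of-words cosine similarity as a stand-in, and the function names, the 0.3 threshold and the sample seed set are all hypothetical.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Turn text into a bag-of-words term-count vector (letters only)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def categorize(document, seed_set, threshold=0.3):
    """Return (label, seed_id, score) taken from the most similar seed document.

    A document whose best score falls below the threshold (an assumed
    cutoff) is left uncategorized rather than forced into a category.
    """
    vec = vectorize(document)
    best_id, best_label, best_score = None, None, 0.0
    for seed_id, (text, label) in seed_set.items():
        score = cosine(vec, vectorize(text))
        if score > best_score:
            best_id, best_label, best_score = seed_id, label, score
    if best_score < threshold:
        return ("uncategorized", None, best_score)
    return (best_label, best_id, best_score)


# Hypothetical seed set: seed id -> (text, expert coding decision)
seed_set = {
    "SEED-1": ("pricing agreement with the distributor for widget contracts", "responsive"),
    "SEED-2": ("office holiday party lunch menu and parking reminder", "not responsive"),
}

result = categorize("draft pricing agreement for the new distributor", seed_set)
numbers_only = categorize("1234 5678 9012", seed_set)
```

In this sketch the document made up only of numbers has no usable text and is left uncategorized, which mirrors the earlier advice about choosing training documents with clear text.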
Validation of the Results
After the training rounds, the computer will provide a statistical sample of computer-coded documents for the domain experts to review, indicating whether they agree or disagree with the coding. An overturn report is then produced that shows which computer calls the domain experts overturned and which seed set documents led the computer to make those wrong calls. The computer also will show the portion of the document population that it coded properly.
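A rough sketch of what an overturn report tallies is shown here. The record layout (the `doc_id`, `computer`, `expert` and `seed_id` keys) and the sample data are assumptions made for illustration and do not reflect the product's actual report format.

```python
from collections import defaultdict


def overturn_report(sample):
    """Summarize expert disagreements with the computer's coding.

    Each record is a dict with hypothetical keys: doc_id, computer (the
    computer's call), expert (the domain expert's call) and seed_id (the
    seed document that drove the computer's call).
    """
    overturned = [r for r in sample if r["computer"] != r["expert"]]
    by_seed = defaultdict(list)
    for record in overturned:
        by_seed[record["seed_id"]].append(record["doc_id"])
    rate = len(overturned) / len(sample) if sample else 0.0
    return {"overturn_rate": rate, "overturns_by_seed": dict(by_seed)}


# Hypothetical validation sample of four computer-coded documents
validation_sample = [
    {"doc_id": "DOC-101", "computer": "responsive", "expert": "responsive", "seed_id": "SEED-1"},
    {"doc_id": "DOC-102", "computer": "responsive", "expert": "not responsive", "seed_id": "SEED-1"},
    {"doc_id": "DOC-103", "computer": "not responsive", "expert": "not responsive", "seed_id": "SEED-2"},
    {"doc_id": "DOC-104", "computer": "not responsive", "expert": "not responsive", "seed_id": "SEED-2"},
]

report = overturn_report(validation_sample)
```

Grouping the overturned calls by seed document points the reviewers at the seed coding decisions most worth revisiting.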
Using the overturn report, one can refine the initial coding, e.g., re-code a document in the seed set that led the computer to make an incorrect coding decision. Once the appropriate refinements are made, the computer will suggest another sample set on which it will train. The process will continue until the computer meets the prescribed confidence level.
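The article does not say how the validation sample is sized. One common textbook approach, assumed here purely for illustration, derives the sample size from a chosen confidence level and margin of error, with a finite population correction. The function name and parameter defaults are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size(confidence, margin_of_error, population=None, proportion=0.5):
    """Estimate how many documents to sample for a given confidence level.

    Uses the standard formula n0 = z^2 * p * (1 - p) / e^2 and, when a
    population size is given, the finite population correction
    n = n0 / (1 + (n0 - 1) / N). proportion=0.5 is the most conservative
    assumption about the underlying rate.
    """
    # Two-sided z-score for the requested confidence level
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = (z ** 2) * proportion * (1 - proportion) / (margin_of_error ** 2)
    if population is not None:
        n0 = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n0)


# Hypothetical review: 100,000 documents, 95% confidence, 2.5% margin of error
needed = sample_size(0.95, 0.025, population=100_000)
unbounded = sample_size(0.95, 0.025)
```

A tighter margin of error or a higher confidence level increases the number of documents the domain experts must review in each validation round.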
The Benefits
The benefits of Assisted Review are hard to ignore. It can:
- help eliminate non-responsive documents from the review population,
- prioritize the most relevant documents for review,
- improve coding consistency,
- and ultimately accelerate the review process and reduce costs.
It is no wonder that corporations, counsel and courts alike are turning to computer-assisted review to come to grips with e-discovery demands that once may have seemed insurmountable.