Machine Learning & E-Discovery

Machine learning is a type of artificial intelligence that evolved from the study of pattern recognition. By constructing algorithms that learn from data, machines can make predictions rather than simply following preprogrammed commands.

What’s more, these algorithms can make independent decisions when applied to new data. As computer processing power improves, this technology can be applied to much larger data sets for reliable, repeatable results. This is the technology behind self-driving cars, fraud detection, and data mining for online advertising.

The legal field has also found an application for machine learning algorithms: reducing the cost of document review. Specifically, these algorithms have been applied to document review solutions such as Computer Assisted Review, Technology Assisted Review (TAR), and predictive coding. The growth of email and electronic document storage gave rise to the e-discovery industry in the early 2000s. Exponential data growth is now outpacing the traditional e-discovery methodology of filtering and searching to reduce data sets before they are hosted for review. Pressure to reduce these costs, including pricey attorney review hours, had solution providers looking for an answer. The potential application of TAR to legal document discovery quickly made it the industry’s “holy grail.” Initially, however, TAR solutions for e-discovery were mostly limited to industry conference circuits and talking heads. By 2012, the technology had matured enough to begin delivering on its promise to revolutionize document review with a reliable alternative to traditional attorney review.

As technology advances, so do the expectations of courts when handling electronic documents. Unlike in the early 2000s, judges now have little leniency for incomplete preservation or collection efforts. Assuming a proper legal hold was in place, attorneys and third-party providers are expected to provide all relevant email, user-created electronic documents, and associated metadata. If they do not, they face possible sanctions. Similarly, judges are just starting to push firms to use TAR to conserve legal spending, rather than limiting discovery because of data volume.

The future of machine learning is not only about reducing cost; it can also push technology-driven review past the limits of keyword filters. This technology is not perfect or 100% accurate, but neither is attorney review. The biggest benefit on the horizon for law firms and their clients will be the application of TAR to an initial “first pass” review of documents. This removes the unnecessary hourly cost of young associates or document review centers reviewing data for relevance to the case or for privileged communications. Removing this low-hanging fruit conserves attorney billable hours for the more complex review issues of e-discovery. Attorneys also benefit by spending less time on document-intensive cases, leaving bandwidth for other matters.
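
To make the “first pass” idea concrete, here is a minimal sketch of how predictive coding can work in principle. It is not how any particular TAR product is implemented. It assumes a small, hypothetical seed set of documents that an attorney has already coded as relevant or not, and it swaps in a simple naive Bayes word-count model as a stand-in for whatever algorithm a real tool uses. All names here (`seed_set`, `train`, `predict`, the labels, and the sample text) are illustrative assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a document and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(labeled_docs):
    """Count words per label from attorney-coded example documents."""
    word_counts = {}        # label -> Counter of word frequencies
    doc_counts = Counter()  # label -> number of example documents
    for text, label in labeled_docs:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, doc_counts, vocab


def log_score(text, label, model):
    """Naive Bayes log-score for one label, using add-one smoothing."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    total_words = sum(word_counts[label].values())
    score = math.log(doc_counts[label] / total_docs)
    for word in tokenize(text):
        if word in vocab:  # words never seen in training are ignored
            count = word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocab)))
    return score


def predict(text, model):
    """Return the label with the highest score for an unreviewed document."""
    labels = model[1].keys()
    return max(labels, key=lambda label: log_score(text, label, model))


# Hypothetical seed set: a handful of documents coded by an attorney.
seed_set = [
    ("Please review the merger agreement and pricing terms before the board vote", "relevant"),
    ("Attached is the revised merger contract with updated pricing", "relevant"),
    ("Lunch on Friday? The new taco place looks good", "not_relevant"),
    ("Reminder: office holiday party is on Friday", "not_relevant"),
]
model = train(seed_set)

# Unreviewed documents get a predicted label instead of a keyword match.
for doc in [
    "Can you send the final pricing for the merger contract",
    "Are we still on for lunch Friday",
]:
    print(predict(doc, model), "|", doc)
```

Unlike a keyword filter, nothing here is a preprogrammed rule: the model learns which words matter from the attorney’s own coding decisions. In practice, a seed set would be far larger, and predictions would be sampled and checked by attorneys rather than trusted outright.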


Josh Markarian:
