Customer stories

Major bank

Major bank minimizes review time. Financial corporation gathers close to 98 percent of responsive documents, reduces review costs by 94 percent with OpenText Insight Predict


  • Faced alleged loss of millions due to borrower accounting fraud
  • Initiated effort that required review of 2.1 million documents
  • Lacked time, money to review all files


  • Employed innovative technology to find up to 98% of relevant documents

  • Reduced review costs by 94%

  • Decreased total review effort to only 6.4% of total reviewable population of 2.1 million documents


A large banking institution became embroiled in nasty litigation with a now-defunct borrower. Responding to a production request, the bank conducted an extensive investigation to find responsive documents.



Even after using a variety of techniques to cull the documents it had collected, the bank was still left with more than 2.1 million that needed consideration. Further keyword searching might have reduced the set again, but the team wasn’t comfortable with what that process might miss.

Realizing they had neither the time nor money to review all 2.1 million documents, the client and counsel turned to OpenText™ Insight Predict, a technology-assisted review (TAR) engine based on continuous active learning (CAL).

Because Insight Predict has no limit on the number of training seeds it can handle, review managers used 50,000 documents tagged for an earlier matter to jump-start the review. Almost immediately, relevant documents from the larger collection were pushed to the front of the line.

Insight Predict’s algorithm provided batches made up primarily of the documents it ranked most highly. This ensured the review team was productive immediately, as it was focused on the documents that were most likely relevant. In turn, the trial team quickly got its hands on the most important documents needed to sharpen its analysis of the case.

The review batches also included a mix of documents chosen based on their “contextual diversity.” This unique feature is designed to solve the problem of “you don’t know what you don’t know.” Specifically, the contextual diversity algorithm chooses documents for review that are different from those already reviewed. The algorithm clusters unseen documents by their common themes. It then pulls the most relevant examples from each cluster and presents them to reviewers as part of the batch. If the reviewer tags an example as relevant, the ranking engine is cued to promote similar documents. If the example is not relevant, the ranking engine learns that this cluster is of lesser interest.
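The batch-building idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Insight Predict’s actual algorithm: the `Doc` fields, the `build_batch` function, and the premise that every unseen document already carries a relevance score and a theme cluster label are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    doc_id: str
    score: float   # hypothetical relevance score from a ranking model
    cluster: int   # hypothetical theme cluster the document belongs to


def build_batch(unseen, reviewed_clusters, batch_size, diversity_slots):
    """Fill most of the batch with top-ranked documents, then add the
    best-scoring example from clusters that reviewers have not yet seen."""
    ranked = sorted(unseen, key=lambda d: d.score, reverse=True)
    top = ranked[: batch_size - diversity_slots]
    chosen_ids = {d.doc_id for d in top}
    covered = set(reviewed_clusters) | {d.cluster for d in top}

    diverse = []
    # Walking in ranked order means the first hit per cluster is its best example.
    for doc in ranked:
        if len(diverse) == diversity_slots:
            break
        if doc.doc_id not in chosen_ids and doc.cluster not in covered:
            diverse.append(doc)
            covered.add(doc.cluster)
    return top + diverse


# Toy data, invented for illustration only.
docs = [
    Doc("a", 0.9, 0), Doc("b", 0.8, 0), Doc("c", 0.4, 1),
    Doc("d", 0.3, 2), Doc("e", 0.2, 1),
]
batch = build_batch(docs, reviewed_clusters={0}, batch_size=4, diversity_slots=2)
print([d.doc_id for d in batch])
```

In this toy run the two highest-ranked documents fill the first slots, and the remaining slots go to one example each from the two themes no reviewer has looked at. If fewer unexplored clusters exist than diversity slots, the batch simply comes back smaller.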

The continuous active learning protocol within Insight Predict feeds reviewer judgments back to the system to improve the training and thereby the responsiveness rate for subsequent review assignments. As the reviewers release their batches, Insight Predict adds their judgments to further its training. The net result is that the algorithm gets smarter and smarter about finding and promoting relevant documents for review.
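A toy version of this feedback loop may make the protocol easier to picture. The sketch below is an assumption-laden stand-in, not the product’s ranking engine: it scores documents with simple per-term weights, and the names `cal_review`, `score`, and `update`, along with the sample data, are hypothetical.

```python
from collections import defaultdict


def score(doc_terms, weights):
    """Sum the learned weight of every term in the document."""
    return sum(weights[t] for t in doc_terms)


def update(weights, doc_terms, relevant):
    """Nudge term weights up for a relevant judgment, down otherwise."""
    delta = 1.0 if relevant else -1.0
    for t in doc_terms:
        weights[t] += delta


def cal_review(collection, seeds, reviewer, batch_size, max_batches):
    """collection: {doc_id: set of terms}; seeds: list of (terms, relevant)
    pairs judged earlier; reviewer: callable(doc_id) -> bool."""
    weights = defaultdict(float)
    for terms, relevant in seeds:          # jump-start training with prior judgments
        update(weights, terms, relevant)

    judged = {}
    for _ in range(max_batches):
        unseen = [d for d in collection if d not in judged]
        if not unseen:
            break
        # Re-rank everything unseen with the current model, best first.
        unseen.sort(key=lambda d: score(collection[d], weights), reverse=True)
        for doc_id in unseen[:batch_size]:
            relevant = reviewer(doc_id)
            judged[doc_id] = relevant
            update(weights, collection[doc_id], relevant)  # feed judgment back
    return judged


# Toy data, invented for illustration only.
collection = {
    "d1": {"loan", "fraud"},
    "d2": {"lunch", "menu"},
    "d3": {"fraud", "ledger"},
    "d4": {"picnic", "menu"},
}
seeds = [({"fraud", "invoice"}, True), ({"menu"}, False)]
truth = {"d1": True, "d2": False, "d3": True, "d4": False}
judged = cal_review(collection, seeds, truth.get, batch_size=2, max_batches=1)
print(judged)
```

Even with this crude model, the seed judgments are enough to put both relevant documents into the first batch, which mirrors how previously tagged documents helped the team start with a productive review.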

Furthermore, Insight Predict’s ability to use flexible inputs allowed the review team to take a multimodal approach to finding responsive documents. As the review progressed, Insight Predict identified promising search terms as well as custodians who held the most likely relevant documents. This enabled the trial team to independently run searches with these key terms and then tag the relevant documents found through searches.

The continuous active learning (CAL) protocol culled the 2.1 million document population in a defensible manner.

As with the regular review, these tagged documents could then be fed into Insight Predict’s ranking engine to further improve the training. This ensured that Insight Predict used every attorney judgment on a document, no matter where that judgment was made.

As the review progressed, the team tracked the share of reviewed documents it tagged as relevant. Toward the beginning, that share was 10 percent. Over time, the figure rose to 25 percent and sometimes as high as 35 percent.

Upon completion, managers ran a systematic sample against the entire document population. Senior attorneys manually reviewed these documents for added credibility. The sample of close to 6,000 documents suggested the team had found and reviewed 98 percent of the documents relevant to the production. This conclusion was based on a sample confidence level of 95 percent and a two percent margin of error. Even taking the lower end of the margin-of-error range, the estimate showed the team had found at least 92 percent of the relevant documents, still well beyond levels previously approved by the courts.
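For readers unfamiliar with this kind of validation, the sketch below shows one common way to draw a systematic sample and estimate recall from it. The story does not describe how the bank’s figures were calculated, so the normal-approximation interval, the function names, and the toy counts are all assumptions made for illustration; none of the numbers are the bank’s data.

```python
import math


def systematic_sample(doc_ids, sample_size, start=0):
    """Take every k-th document so the sample spans the whole population."""
    step = max(1, len(doc_ids) // sample_size)
    return doc_ids[start::step][:sample_size]


def estimate_recall(sample, z=1.96):
    """sample: list of (is_relevant, already_found) pairs from manual review.
    Returns (recall, margin) using a normal approximation; z=1.96 is the
    usual multiplier for 95 percent confidence."""
    found_flags = [found for is_relevant, found in sample if is_relevant]
    if not found_flags:
        raise ValueError("no relevant documents in sample")
    recall = sum(found_flags) / len(found_flags)
    margin = z * math.sqrt(recall * (1 - recall) / len(found_flags))
    return recall, margin


# Toy numbers, invented for illustration only.
ids = list(range(1000))
picked = systematic_sample(ids, 100)

sample = [(True, True)] * 45 + [(True, False)] * 5 + [(False, False)] * 50
recall, margin = estimate_recall(sample)
print(f"recall {recall:.0%}, margin of error +/- {margin:.1%}")
```

The margin depends on how many relevant documents land in the sample, which is one reason validation samples for low-prevalence collections tend to be large.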

All of this was accomplished through a CAL workflow that put attorney reviewers’ eyes on every document produced, yet still required a total review effort of only 6.4 percent of the total reviewable population of 2.1 million documents. Attorneys were saved from having to review 1.97 million documents.
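The savings figure follows directly from the numbers above, as this short calculation shows. The variable names are illustrative, and the result is approximate because the published inputs are rounded.

```python
total_docs = 2_100_000   # total reviewable population
review_share = 0.064     # share of the population attorneys reviewed

reviewed = round(total_docs * review_share)
not_reviewed = total_docs - reviewed

print(f"{reviewed:,} reviewed, {not_reviewed:,} not reviewed")
print(f"about {not_reviewed / 1_000_000:.2f} million documents avoided")
```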