Customer stories

Pharmaceutical manufacturer

Pharmaceutical manufacturer remedies laborious review

Corporation cuts remaining review by 70 percent with OpenText Insight Predict’s continuous active learning


  • Faced claim of patent infringement
  • Started review due to looming deadlines
  • Progressed half-way through process before TAR approval


  • Leveraged completed work as seed set to train the TAR algorithm
  • Reduced remaining review time by 70% with continuous active learning
  • Saved nearly $70,000 after accounting for cost of TAR


“It’s never too late,” people often say. But is that true for technology-assisted review (TAR)? If a legal team has already put substantial time and effort into manual review, can TAR still be worthwhile? That was the issue presented in a patent infringement case where the client’s approval to use TAR came only after the law firm had manually reviewed nearly half the collection. Even that late in the game, OpenText™ Insight Predict produced substantial savings in time and cost.



The law firm represented a generic pharmaceutical manufacturer that had been sued for patent infringement by a major brand name pharmaceutical company. The plaintiff claimed the company’s generic products infringed its patents.

The total collection to be reviewed after applying search terms and culling numbered approximately 40,800 documents. While not a huge collection, it was nevertheless a lot of documents to get through and would be a significant expense for the client.

Believing TAR would enable them to get through the review more quickly and at less cost, the lawyers recommended it to the client. But looming deadlines demanded they get started on the review even as the client considered the recommendation. It was only after the firm had manually reviewed nearly half the collection that it received the client’s approval to proceed with TAR.

By the time the approval came in, the firm had already reviewed some 18,200 of the 40,800 total documents. Had they used TAR from the outset, they likely would not have needed to review even that many. Even so, those documents gave them a ready-made set of seeds for training the TAR algorithm.

OpenText used the coding determinations from those 18,200 documents to train Insight Predict, its next-generation TAR engine. Insight Predict uses continuous active learning (CAL), a machine learning protocol that enables it to use any and all previously coded documents as judgmental seeds to start the process. This means there are not separate workflows for training the TAR system and for review, as was the case with first-generation TAR systems. All documents that already have attorney decisions on them can be fed into the system at the start, and the entire population is analyzed and ranked.

Even though it was introduced midway through the review, OpenText Insight Predict spared the team from manually reviewing 70 percent of the remaining documents and roughly 40 percent of the entire set.

Once the first ranking was complete, review managers set Insight Predict to automatically create batches of 50 records each. Each batch contained the next-best unreviewed documents, those most likely to be responsive to the opposing party’s production request. Each batch also included a few “contextually diverse” documents to make sure that no topics or concepts in the collection went unexplored by reviewers. As the reviewers completed their batches, the system continuously re-ranked the entire population in the background, incorporating their new coding calls to “get smarter” and improve its predictions. Each time a reviewer clicked a button for more documents, the system created a new batch based on the most recently completed re-ranking.
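The batching loop described above can be sketched in a few lines of Python. This is only an illustration of the general continuous active learning pattern, not Insight Predict's actual implementation: the function names (`next_batch`, `run_cal`), the `train_and_score` and `get_label` callbacks, the stopping rule of two consecutive batches with no relevant documents, and the use of random picks as a stand-in for “contextually diverse” documents are all assumptions made for the example.

```python
import random


def next_batch(scores, reviewed, batch_size=50, n_diverse=3, rng=None):
    """Pick the top-ranked unreviewed documents plus a few exploratory ones.

    scores:   dict of doc id -> predicted relevance (higher = more likely relevant)
    reviewed: set of doc ids that already carry an attorney coding decision
    Random picks stand in for "contextually diverse" documents (an assumption).
    """
    rng = rng or random.Random(0)
    unreviewed = [d for d in scores if d not in reviewed]
    ranked = sorted(unreviewed, key=lambda d: scores[d], reverse=True)
    top = ranked[: max(batch_size - n_diverse, 0)]
    rest = ranked[len(top):]
    diverse = rng.sample(rest, min(n_diverse, len(rest)))
    return top + diverse


def run_cal(doc_ids, seed_labels, train_and_score, get_label,
            batch_size=50, n_diverse=3, stop_after_empty=2, seed=0):
    """Hypothetical CAL loop: re-rank, batch, review, repeat.

    seed_labels:     dict of doc id -> 1 (relevant) or 0 (not), from prior manual review
    train_and_score: callback(doc_ids, labels) -> dict of doc id -> score
    get_label:       callback(doc id) -> 1 or 0, standing in for the reviewer
    Stops after `stop_after_empty` consecutive batches with no relevant documents.
    """
    rng = random.Random(seed)
    labels = dict(seed_labels)
    empty_batches = 0
    while len(labels) < len(doc_ids) and empty_batches < stop_after_empty:
        # Re-rank the whole population using every coding decision made so far.
        scores = train_and_score(doc_ids, labels)
        batch = next_batch(scores, set(labels), batch_size, n_diverse, rng)
        if not batch:
            break
        hits = 0
        for doc in batch:
            labels[doc] = get_label(doc)
            hits += labels[doc]
        empty_batches = empty_batches + 1 if hits == 0 else 0
    return labels
```

In this sketch the prior manual coding is simply passed in as `seed_labels`, which mirrors the point that training and review share one workflow. The stopping rule is deliberately simple, and whatever rule is used would still be checked by sampling the unreviewed documents.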

The review proceeded along this track until the reviewers started seeing batches with few, if any, relevant documents. This was an indication that few relevant documents remained. The results were tested by sampling the unreviewed documents. Statistical analysis showed that the review had achieved a very high “recall”—meaning that the team had found the vast majority of the relevant documents.
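As a rough illustration of how sampling the unreviewed documents can yield a recall figure, the sketch below computes a simple point estimate. The story does not say which statistical method was used, so the function name `estimate_recall`, its parameters, and the approach itself (projecting the sample's relevance rate onto the whole unreviewed set) are assumptions, and the example leaves out the confidence interval that a full analysis would normally report.

```python
def estimate_recall(relevant_found, unreviewed_total, sample_size, relevant_in_sample):
    """Point estimate of recall from a random sample of the unreviewed documents.

    relevant_found:     relevant documents identified during review
    unreviewed_total:   number of documents never reviewed
    sample_size:        number of unreviewed documents sampled and coded
    relevant_in_sample: how many of the sampled documents turned out relevant
    """
    if not 0 < sample_size <= unreviewed_total:
        raise ValueError("sample_size must be between 1 and unreviewed_total")
    if not 0 <= relevant_in_sample <= sample_size:
        raise ValueError("relevant_in_sample must be between 0 and sample_size")

    # Share of the sample that was relevant, projected onto the unreviewed set.
    missed_rate = relevant_in_sample / sample_size
    estimated_missed = missed_rate * unreviewed_total

    total_relevant = relevant_found + estimated_missed
    if total_relevant == 0:
        raise ValueError("no relevant documents found or estimated")
    return relevant_found / total_relevant
```

With purely illustrative inputs that are not figures from this matter, `estimate_recall(900, 1000, 100, 1)` projects 10 missed documents and returns 900/910, or about 0.99.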

By the end of the TAR process, the team had reviewed another 6,800 documents beyond the initial 18,200. That left 15,800 documents that they never had to review: once they started using TAR, they had to review only 30 percent of the remaining documents. TAR saved 70 percent of the remaining expense and time the review would have otherwise required. The law firm calculated that using TAR saved the client more than $70,000, even after accounting for the cost of TAR.
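The percentages follow directly from the document counts reported in the story. The short calculation below makes the arithmetic explicit; the function and variable names are illustrative only.

```python
# Document counts reported in the story.
TOTAL_DOCS = 40_800
REVIEWED_BEFORE_TAR = 18_200
REVIEWED_WITH_TAR = 6_800


def review_savings(total, before_tar, with_tar):
    """Return how many documents TAR spared the team from reviewing."""
    remaining = total - before_tar   # documents left when TAR began
    if remaining <= 0:
        raise ValueError("no documents remained when TAR began")
    avoided = remaining - with_tar   # documents never reviewed
    return {
        "remaining_at_tar_start": remaining,
        "never_reviewed": avoided,
        "share_of_remaining_avoided": avoided / remaining,
        "share_of_total_avoided": avoided / total,
    }


summary = review_savings(TOTAL_DOCS, REVIEWED_BEFORE_TAR, REVIEWED_WITH_TAR)
print(f"Remaining when TAR began: {summary['remaining_at_tar_start']:,}")
print(f"Never reviewed:           {summary['never_reviewed']:,}")
print(f"Share of remaining saved: {summary['share_of_remaining_avoided']:.1%}")
print(f"Share of total saved:     {summary['share_of_total_avoided']:.1%}")
```

The 22,600 documents remaining when TAR began split into 6,800 reviewed and 15,800 never reviewed, which is about 70 percent of the remainder and about 39 percent of the full 40,800-document collection.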