The law firm represented a generic pharmaceutical manufacturer that had been sued for patent infringement by a major brand-name pharmaceutical company, which claimed the manufacturer’s generic products infringed its patents.
The total collection to be reviewed after applying search terms and culling numbered approximately 40,800 documents. While not a huge collection, it was still a substantial volume to get through, and reviewing it manually would be a significant expense for the client.
Believing TAR would enable them to get through the review more quickly and at less cost, the lawyers recommended it to the client. But looming deadlines demanded they get started on the review even as the client considered the recommendation. It was only after the firm had manually reviewed nearly half the collection that it received the client’s approval to proceed with TAR.
By the time the approval came in, the firm had already reviewed some 18,200 of the 40,800 total documents. Had they used TAR from the outset, they likely would have avoided reviewing many of those documents. Even so, the reviewed documents offered one advantage: they provided a ready-made set of seeds with which to train the TAR algorithm.
OpenText used the coding determinations from those 18,200 documents to train Insight Predict, its next-generation TAR engine. Insight Predict uses continuous active learning (CAL), a machine learning protocol that enables it to use any and all previously coded documents as judgmental seeds to start the process. This means there are not separate workflows for training the TAR system and for review, as was the case with first-generation TAR systems. All documents that already have attorney decisions on them can be fed into the system at the start, and the entire population is analyzed and ranked.
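The CAL workflow described above can be illustrated with a toy sketch. This is not Insight Predict’s proprietary model; it is a minimal simulation of the protocol itself, assuming a crude word-overlap scorer in place of a real classifier, hypothetical document IDs, and an `oracle` function standing in for attorney review. The key CAL features are visible: all previously coded documents serve as judgmental seeds, the entire unreviewed population is re-ranked after each batch, and every new coding decision feeds the next training round.

```python
from collections import Counter

def score(doc_words, rel_counts, nonrel_counts):
    # Crude relevance score: hits on relevant-seed terms minus
    # hits on non-relevant-seed terms (stand-in for a real classifier).
    return sum(rel_counts[w] - nonrel_counts[w] for w in doc_words)

def cal_review(collection, seed_labels, oracle, batch_size=2):
    """Continuous active learning loop (toy sketch).

    collection: {doc_id: text}
    seed_labels: coding decisions already made (the judgmental seeds)
    oracle: callable simulating an attorney's review of one doc_id
    """
    coded = dict(seed_labels)                # all prior decisions enter at the start
    while len(coded) < len(collection):
        rel, nonrel = Counter(), Counter()
        for d, is_relevant in coded.items():  # "train" on every coded document
            (rel if is_relevant else nonrel).update(collection[d].split())
        unreviewed = [d for d in collection if d not in coded]
        # Rank the entire remaining population by predicted relevance.
        ranked = sorted(unreviewed,
                        key=lambda d: score(collection[d].split(), rel, nonrel),
                        reverse=True)
        for d in ranked[:batch_size]:         # attorneys review the top-ranked batch
            coded[d] = oracle(d)              # new decisions feed the next round
    return coded

# Hypothetical six-document collection and ground-truth coding.
docs = {1: "patent claim infringement", 2: "lunch menu salad",
        3: "generic drug patent",       4: "office party invite",
        5: "patent litigation filing",  6: "weather report sunny"}
truth = {1: True, 2: False, 3: True, 4: False, 5: True, 6: False}

# Documents 1 and 2 play the role of the 18,200 pre-coded seeds.
result = cal_review(docs, {1: True, 2: False}, oracle=lambda d: truth[d])
```

In this run, the two pre-coded documents prime the first ranking, the relevant documents (3 and 5) surface at the top of the first batch, and review continues until the whole collection is coded. Because training and review are one loop rather than two workflows, there is no separate "training phase" as in first-generation TAR.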