Evaluation of Machine-Learning Protocols for Technology-Assisted Review in Electronic Discovery

By Gordon V. Cormack & Maura R. Grossman

To appear in the proceedings of SIGIR 2014: The 37th Annual ACM SIGIR Conference on Research and Development in Information Retrieval

Abstract: Using a novel evaluation toolkit that simulates a human reviewer in the loop, we compare the effectiveness of three machine-learning protocols for technology-assisted review as used in document review for discovery in legal proceedings. Our comparison addresses a central question in the deployment of technology-assisted review: Should training documents be selected at random, or should they be selected using one or more non-random methods, such as keyword search or active learning? On eight review tasks — four derived from the TREC […]