Extract from article by Herbert L. Roitblat, Ph.D.
The current version of the Federal Rules of Civil Procedure highlights the importance of reasonableness and proportionality. As is widely understood, the cost of dealing with the volume of documents that could potentially play a role in a legal dispute can easily overwhelm the value of the case. Some use of technology is essential if we are to maintain a justice system that depends on evidence.
The problem is generally not the number of documents that will ultimately be introduced as evidence; rather it is the winnowing process that goes from the domain of potentially relevant documents down to the ones that must be produced. Ultimately, only a handful of those may end up being critical to a case. If we knew without effort which those documents were, we would not have to go through the complex discovery process.
Discovery involves more than winnowing, of course. The legal team not only has to decide which documents are pertinent to a case, but also has to understand the content of those documents and how they fit into and guide the theory of the case. Data analysis and understanding have not, historically, had the benefit of a well-structured process, but the winnowing task has. In this context, I am focusing on the problem of identifying the documents to be produced from large collections.
Assessing the reasonableness of any process can be facilitated by measurement. There is a saying that you cannot improve what you do not measure. Although one can use intuition or other forms of judgment to assess reasonableness, intuitive feelings of reasonableness alone may not be sufficient. In these cases, we would like to know how reasonable a process was. For this, we need measurement.
Overwhelmingly, the primary measurement of the efficacy of the winnowing process in eDiscovery is Recall. Of the documents that are relevant in a collection, how many (what proportion) of them have been identified? The idea is that the more complete the identification process, the better it has been. All other things being equal, a better process is a more reasonable process.
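As a minimal sketch of the definition above, Recall can be computed as the number of relevant documents that were identified divided by the total number of relevant documents in the collection. The function name `recall`, the document identifiers, and the counts below are hypothetical and chosen only for illustration; in practice the full set of relevant documents is not known and has to be estimated, for example from a sample.

```python
def recall(relevant_ids, identified_ids):
    """Return the proportion of relevant documents that were identified."""
    relevant = set(relevant_ids)
    if not relevant:
        # With no relevant documents, the proportion has no meaning.
        raise ValueError("Recall is undefined when no documents are relevant")
    # Only documents that are both relevant and identified count as found.
    found = relevant & set(identified_ids)
    return len(found) / len(relevant)


# Hypothetical collection: 8 documents are relevant.
relevant_docs = {"d01", "d02", "d03", "d04", "d05", "d06", "d07", "d08"}

# The winnowing process identified 6 of them, plus 2 non-relevant documents.
identified_docs = {"d01", "d02", "d03", "d04", "d05", "d06", "x01", "x02"}

score = recall(relevant_docs, identified_docs)
print(f"Recall: {score:.2f}")
```

Note that the two non-relevant documents in the identified set do not affect the result: Recall measures completeness only, not how many irrelevant documents were swept in along the way.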
Still, from time to time, questions arise about whether Recall is a good measure for assessing the winnowing process.
As I read it, there are four related arguments about why Recall might be inappropriate as a measure of the eDiscovery winnowing process:
- Recall measures completeness, but completeness is not enough
- Recall is overly sensitive to the easy-to-find documents
- Recall is insufficiently sensitive to rare but critical sources of information (smoking guns)
- Recall measures the number of documents that are identified, but not their importance
Read the complete article at Recall, Magical Thinking, and the Assessment of eDiscovery