ARCHIVED CONTENT
You are viewing ARCHIVED CONTENT released online between 1 April 2010 and 24 August 2018 or content that has been selectively archived and is no longer active. Content in this archive is NOT UPDATED, and links may not function.

By Bill Dimm
This article looks at a few common misconceptions and mistakes related to predictive coding and confidence intervals.

Confidence intervals vs. training set size: You can estimate the percentage of documents in a population having some property (e.g., is the document responsive, or does it contain the word “pizza”) by taking a random sample of the documents and measuring the percentage having that property. The confidence interval tells you how much uncertainty there is due to your measurement being made on a sample instead of the full population. If you sample 400 documents, the 95% confidence interval is +/- […]
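The sampling idea described above can be sketched in a few lines of Python. This is a minimal illustration, not the author's method: it assumes the normal (Wald) approximation for a sample proportion, a z-value of 1.96 for 95% confidence, and a population much larger than the sample. The function names and the example counts are hypothetical, and other interval methods will give somewhat different numbers.

```python
import math


def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Half-width of a normal-approximation confidence interval.

    Assumptions (not from the article): Wald approximation, z = 1.96
    for 95% confidence, and proportion = 0.5 as the widest case when
    the true proportion is unknown.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must be between 0 and 1")
    return z * math.sqrt(proportion * (1.0 - proportion) / sample_size)


def confidence_interval(hits, sample_size, z=1.96):
    """Estimate a population proportion from a random sample.

    hits is the number of sampled documents that have the property
    (for example, documents judged responsive).
    """
    observed = hits / sample_size
    half_width = margin_of_error(sample_size, observed, z)
    # Clamp to [0, 1] because a proportion cannot leave that range.
    return max(0.0, observed - half_width), min(1.0, observed + half_width)


if __name__ == "__main__":
    # Hypothetical review: 80 of 400 randomly sampled documents have the property.
    low, high = confidence_interval(hits=80, sample_size=400)
    print(f"Estimated proportion: between {low:.1%} and {high:.1%}")
```

Under these assumptions the half-width depends on the sample size and the observed proportion, not on the size of the full population, so the same sample tells you equally much about a small collection or a very large one.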
Read the original article at: Predictive Coding Confusion