So why is the fact that machine learning (a/k/a “predictive coding”) is a black box such a problem? Is it because human review of documents (i.e., an eyes-on-all-docs full review) is somehow more transparent? Of course not. We have study after study of the greater accuracy and effectiveness of review assisted by machine learning (when used properly).
eDiscovery and legal document management should be platform independent. Litigators should not need to inconvenience themselves with inefficient workarounds and incompatible software.
Daily we read, see and hear more and more about the challenges associated with organizational and individual productivity in the world of social media. This week’s cartoon and clip highlight one proven technique for increasing social media productivity (cartoon) and several cool social media tools that may be useful for increasing social media productivity (clip).
Big data is now big business. In recent years, due to the exponential growth of databases (spurred at least in part by social media and cloud storage) and of the capability of technology to undertake data analytics on a massive scale, organisations have started to appreciate the potential hidden value that could be derived from their data.
Kroll Ontrack surveyed over 550 law firm and corporate ediscovery professionals to gauge the biggest trends and impacts in ediscovery in 2014. This was a great year for the world of ediscovery, and now is the perfect time to share some of the interesting 2014 trends with all of you.
There’s never been a phenomenon like Docker. Eighteen months ago, the company took its core technology, which enables IT people to move software easily between different machines by enclosing it in “containers”, and made it open source.
Rolling intelligence is the enterprise-level equivalent of pay it forward. Effort invested in one business unit, functional area, or type of document storage to visually classify documents benefits the other units, areas, or types of storage that are processed later on. It is the gift that keeps on giving. The reason is that there is a heavy overlap in the types of documents that are used or stored in different areas of an enterprise.
As the size of document collections continues to explode, finding the evidence needle in the ESI haystack is more challenging than ever. Understanding the latest tools, indexing techniques, and features associated with advanced eDiscovery keyword search will allow you to conduct a highly effective review while still delivering your production on time.
As most of you already know, the ranking of all documents according to their probable relevance, or other criteria, is the purpose of predictive coding. The ranking allows accurate predictions to be made as to how the documents should be coded. In part one I shared the idea by providing a series of images of a typical document ranking process. I only included a few brief verbal descriptions. This week I will spell it out and further develop the idea. Next week I hope to end on a high note with random sampling and math.
The one overarching lesson of the TAR decisions to date is that each case stands on its own merits. Courts look not only to the efficiency and effectiveness of TAR, but also to issues of proportionality and cooperation.
What follows is a summary of the cases to date involving TAR. Each includes a link to the full-text decision, so that you can read for yourself what the court said.
Daily we read, see and hear more and more about the challenges and concerns associated with predictive coding. This week’s cartoon and clip highlight a visual depiction of two knowledge workers taking a random sampling approach to predictive coding (cartoon) and some considerations for thinking about the challenges associated with textual analytics-based technology-assisted review platforms (clip).
While Americans’ associations with the topic of privacy are varied, the majority of adults in a new survey by the Pew Research Center feel that their privacy is being challenged along such core dimensions as the security of their personal information and their ability to retain confidentiality.
Organizations may build their businesses on data, but they don’t necessarily manage it well. That’s why Chief Data Officers (CDOs) can play a valuable role in helping the organization value its data across the enterprise. CDOs are particularly on the rise in regulated industries, and Gartner predicts that 50% of all companies in regulated industries will have a CDO by 2017, according to Debra Logan, vice president and Gartner Fellow, in her session at Gartner Symposium ITxpo.
Published on November 10, 2014, the new Gartner Magic Quadrant for Enterprise Information Archiving (G00262936) provides information technology and business professionals with information and insight into solutions available to meet compliance and eDiscovery challenges while reducing primary storage costs.
By John Martin. Google Maps teaches several lessons about information governance that are worth considering for any major IG initiative that will take in information from many sources, normalize it, keep it current, and present it in a useful fashion. 1. Start with a bold vision. Stephen Covey would call this begin with the end […]
It is time to take another look inside the IGI’s Annual Report 2014. Today, we examine predictions for the information governance (IG) market in 2015. The full report and related infographics are available for download now at: www.iginitiative.com/community (registration required).
“Free eDiscovery Processing” sounds too good to be true. Until now, you may have spent many hundreds of dollars per GB to process native documents like Outlook Email and Microsoft Office files into paginated reviewable formats like TIFF or PDF. So how can those charges simply disappear? The revolutionary answer lies with Lexbe’s secure, scalable cloud technology.
Soon all good predictive coding software will include visualizations like this to help searchers to understand the data. The images can be automatically created by computer to accurately visualize exactly how the data is being analyzed and ranked. Experienced searchers can use this kind of visual information to better understand what they should do next to efficiently meet their search and review goals.
Until a few years ago, there was basically no effort expended to measure the efficacy of eDiscovery. As computer-assisted review and other technologies became more widespread, an interest in measurement grew, in large part to convince a skeptical audience that these technologies actually worked. Now, I fear, the pendulum has swung too far in the other direction and it seems that measurement has taken over the agenda.
Daily we read, see and hear more and more about the organizational risks associated with social media use. This week’s cartoon and clip highlight a unique approach to dealing with social media risk (cartoon) and some considerations for thinking about and evaluating organizational risk and cost related to social media (clip).
Not being able to use names or labels for common objects should also scare content managers. Without labels for the types of documents used every day to conduct business, ECM users have to conduct the corporate equivalent of searching for “tall” and “branches” and “birds land” and “backyard” instead of just “trees.”
In a sea of 600+ e-discovery providers in the US alone, finding the right vendor to meet your requirements is difficult. As with purchasing a car, you have a choice of vendors ranging from local to regional and national providers. Some use their own technologies, others use off-the-shelf products, and a few provide traditional processing and hosting services spawned from the paper world.
Perhaps the most important conclusion of the study was that an advanced TAR 2.0 protocol, continuous active learning (CAL), proved to be far more effective than the two standard TAR 1.0 protocols used by most of the early products on the market today—simple passive learning (SPL) and simple active learning (SAL).
In a recent blog post on cross-border e-discovery, Sasha L. Hefler and Chris Dale discuss the differences between discovery in the United States and abroad, and the resulting challenges. As they point out, discovery in the United States is much broader than in other common law countries, which limit discovery in terms of scope (requiring proportionality) and in the use of personally identifiable information and other private data. While such differences in approach pose challenges for cross-border discovery, they may also hold lessons for those looking to achieve more reasonable and proportionate discovery here in the United States.