Cluster Text

Cluster Text Blogs

Blog Authors

Latest from Cluster Text

If one algorithm achieved 98.2% accuracy while another had 98.6% for the same task, would you be surprised to find that the first algorithm required ten times as much document review to reach 75% recall compared to the second algorithm?  This article explains why some performance metrics don’t give an accurate view of performance for ediscovery purposes, and why that makes a lot of research utilizing such metrics irrelevant for ediscovery. The key performance metrics…
The audience was challenged to construct a keyword search query that is more effective than TAR at IG3 West 2018.  The procedure was the same as round 4, so I won’t repeat the details here.  The audience was small this time and we only got one query submission for each topic.  The submission for the law topic used AND to join the keywords together and matched no articles, so I changed the ANDs…
The IG3 West conference was held by Ing3nious at the Paséa Hotel & Spa in Huntington Beach, California. This conference differed from other recent Ing3nious events in several ways.  It was two days of presentations instead of one.  There were three simultaneous panels instead of two.  Between panels there were sometimes three simultaneous vendor technology demos.  There was an exhibit hall with over forty vendor tables.  Due to the different format, I was only able…
Text Analytics Forum is part of KMWorld.  It was held on November 7-8 at the JW Marriott in D.C..  Attendees went to the large KMWorld keynotes in the morning and had two parallel text analytics tracks for the remainder of the day.  There was a technical track and an applications track.  Most of the slides are available here.  My photos, including photos of some slides that caught my attention or were not available on…
Text Analytics Forum is part of KMWorld.  It was held on November 7-8 at the JW Marriott in D.C..  Attendees went to the large KMWorld keynotes in the morning and had two parallel text analytics tracks for the remainder of the day.  There was a technical track and an applications track.  Most of the slides are available here.  My photos, including photos of some slides that caught my attention or were not available on…
From a field of hundreds of potential nominees, the Clustify Blog received enough nominations to be selected to compete in The Expert Institute’s Best Legal Blog Contest in the Legal Tech category. Now that the blogs have been nominated and placed into categories, it is up to readers to select the very best.  Each blog will compete for rank within its category, with the three blogs receiving the most votes in each category being crowned…
This iteration of the challenge was performed during the Digging into TAR session at the 2018 Northeast eDiscovery & IG Retreat.  The structure was similar to round 3, but the audience was bigger.  As before, the goal was to see whether the audience could construct a keyword search query that performed better than technology-assisted review. There are two sensible ways to compare performance.  Either see which approach reaches a fixed level of recall…
The 2018 Northeast eDiscovery and Information Governance Retreat was held at the Salamander Resort & Spa in Middleburg, Virginia.  It was a full day of talks with a parallel set of talks on Cybersecurity, Privacy, and Data Protection in the adjacent room. Attendees could attend talks from either track. Below are my notes (certainly not exhaustive) from the eDiscovery and IG sessions. My full set of photos is available here. Stratagies For Data Minimization…
This iteration of the challenge, held at the Education Hub at ILTACON 2018, was structured somewhat differently from round 1 and round 2 to give the audience a better chance of beating TAR.  Instead of submitting search queries on paper, participants submitted them through a web form using their phones, which allowed them to repeatedly tweak their queries and resubmit them.  I executed the queries in front of the participants, so they could see the exact…