Cluster Text

Cluster Text Blogs

Blog Authors

Latest from Cluster Text

The first Mid-Atlantic IG3 was held at the Watergate Hotel in Washington, D.C.. It was a day and a half long with a keynote followed by two concurrent sets of sessions.  I’ve provided some notes below from the sessions I was able to attend.  You can find my full set of photos here. Big Foot, Aliens, or a Culture of Governance: Are Any of Them Real?In 2012 12% of companies had a chief data…
The first Mid-Atlantic IG3 was held at the Watergate Hotel in Washington, D.C.. It was a day and a half long with a keynote followed by two concurrent sets of sessions.  I’ve provided some notes below from the sessions I was able to attend.  You can find my full set of photos here. Big Foot, Aliens, or a Culture of Governance: Are Any of Them Real?In 2012 12% of companies had a chief data…
The annual EDRM Workshop was held at Duke Law School starting on the evening of May 15th and ending at lunch time on the 17th.  It consisted of a mixture of panels, presentations, working group reports, and working sessions focused on various aspects of e-discovery.  I’ve provided some highlights below.  You can find my full set of photos here. Herb Roitblat presented a paper on fear of missing out (FOMO).  If 80% recall is…
This was by far the most significant iteration of the ongoing exercise where I challenge an audience to produce a keyword search that works better than technology-assisted review (also known as predictive coding or supervised machine learning).  There were far more participants than previous rounds, and a structural change in the challenge allowed participants to get immediate feedback on the performance of their queries so they could iteratively improve them.  A total of 1,924 queries…
Ipro renamed their conference from Ipro Innovations to the Ipro Tech Show this year.  As always, it was held at the Talking Stick Resort in Arizona and it was very well organized.  It started with a reception on April 29th that was followed by two days of talks.  There were also training days bookending the conference on April 29th and May 2nd.  After the keynote on Tuesday morning, there were five simultaneous tracks for the…
If one algorithm achieved 98.2% accuracy while another had 98.6% for the same task, would you be surprised to find that the first algorithm required ten times as much document review to reach 75% recall compared to the second algorithm?  This article explains why some performance metrics don’t give an accurate view of performance for ediscovery purposes, and why that makes a lot of research utilizing such metrics irrelevant for ediscovery. The key performance metrics…
The audience was challenged to construct a keyword search query that is more effective than TAR at IG3 West 2018.  The procedure was the same as round 4, so I won’t repeat the details here.  The audience was small this time and we only got one query submission for each topic.  The submission for the law topic used AND to join the keywords together and matched no articles, so I changed the ANDs…
The IG3 West conference was held by Ing3nious at the Paséa Hotel & Spa in Huntington Beach, California. This conference differed from other recent Ing3nious events in several ways.  It was two days of presentations instead of one.  There were three simultaneous panels instead of two.  Between panels there were sometimes three simultaneous vendor technology demos.  There was an exhibit hall with over forty vendor tables.  Due to the different format, I was only able…
Text Analytics Forum is part of KMWorld.  It was held on November 7-8 at the JW Marriott in D.C..  Attendees went to the large KMWorld keynotes in the morning and had two parallel text analytics tracks for the remainder of the day.  There was a technical track and an applications track.  Most of the slides are available here.  My photos, including photos of some slides that caught my attention or were not available on…
Text Analytics Forum is part of KMWorld.  It was held on November 7-8 at the JW Marriott in D.C..  Attendees went to the large KMWorld keynotes in the morning and had two parallel text analytics tracks for the remainder of the day.  There was a technical track and an applications track.  Most of the slides are available here.  My photos, including photos of some slides that caught my attention or were not available on…