We’re kicking off our first episode of 2023 by analyzing the Court’s decision on when search terms are overbroad and not proportional to the needs of the case, as well as why understanding the language of the FRCP is key in ediscovery.
The matter we will be discussing is Jim Hawk Truck-Trailers of Sioux Falls v. Crossroads Trailer Sales & Service and the decision is from July 29, 2022 by United States District Judge Karen E. Schreier.
Continue reading or watch the video to learn more about the ediscovery issues in this matter.
Happy New Year and welcome to Episode 93 of our Case of the Week series published in partnership with ACEDS. Our first episode of 2023 comes to us from a decision from last year, and we’ll get into that. My name is Kelly Twigger; I am the CEO and founder of eDiscovery Assistant, as well as the Principal at ESI Attorneys. Thanks so much for joining me this morning.
A couple of events for you to know about. First, registration for the University of Florida Levin College of Law (UF Law) eDiscovery Conference is open. That conference takes place on February 8th and 9th and is streamed live and for free. So grab some folks from your office, have a party in the conference room, and learn some great practical tips about eDiscovery.
We’re working feverishly on the case law report for 2022 (view 2021 Report), and that will be available later this month or in early February. Also, Legalweek is coming up in March, so I’m sure you’ve started to see emails about that as well. We will be at Legalweek, and we look forward to seeing you there.
All right, let’s get into this week’s decision. This week’s decision comes to us from a case titled Jim Hawk Truck-Trailers of Sioux Falls v. Crossroads Trailer Sales & Service. This is a decision from the district court for the District of South Dakota from United States District Judge Karen Schreier.
This is a decision from July of 2022, but it brings up some important considerations about search terms. We’re before Judge Schreier here on a motion to compel in which the plaintiffs are seeking to have the defendants run an additional seven search terms against data that has already been collected.
The defendants’ objection to the motion to compel is that the additional terms will cost roughly $200,000 to review and produce, and that the request is not proportional to the needs of the case. In reviewing the information, the judge undertakes an analysis, not really of proportionality, but of accessibility under Rule 26, and denies the motion to compel.
As always, we tag the issues associated with each decision in our eDiscovery Assistant database. The issues associated with this week’s decision include search terms, cost recovery, failure to produce, and proportionality.
Judge Schreier has 17 decisions in our database, so she is familiar with the issues related to eDiscovery. We’re going to see that there’s an interesting analysis that happens here, and so we’ll talk about that a little bit.
The facts of this case are pretty simple. The underlying action is for misappropriation of trade secrets. The plaintiff, Jim Hawk, alleges that Crossroads hired away individual employees from Jim Hawk and, as a result, obtained customer contacts and then used those customer contacts to undercut Jim Hawk’s business. The two businesses are competitors within roughly the same market.
As I mentioned, we’re currently before the Court on a motion to compel responses to requests for production as well as interrogatories. For our purposes, we’re going to focus just on the responses to the request for production.
The request by Jim Hawk to Crossroads was to provide a full and complete production of documents responsive to seven search terms. Crossroads objected to those seven terms as being overbroad, unduly burdensome, not proportional to the needs of the case, and unreasonably cumulative and duplicative. In all, the plaintiff sought discovery from 13 custodians and asked for 99 search terms to be applied to that data; the seven at issue are the last of those 99.
The issue here is really the difference in the number of documents to be reviewed for the final seven search terms that are requested. There’s not a lot of information about the overall corpus of documents that were collected and reviewed using CAL here, but defendants did provide some specific facts related to those additional seven search terms on the proportionality issue.
According to the defendant, Crossroads, the seven additional proposed search terms would result in the identification of an estimated 42,216 additional documents for review. The review of those documents would take an estimated 600 hours of attorney time at a cost of roughly $115,000, so less than I mentioned earlier.
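As a rough back-of-the-envelope check on those figures, here is a minimal sketch of the review pace and unit costs they imply. The inputs are the estimates Crossroads put in the record as described above; the variable names are mine, and the $115,000 figure is approximate, so treat the outputs as ballpark numbers only.

```python
# Estimates reported in the decision for the seven additional search terms.
additional_docs = 42_216   # documents hitting on the seven terms
review_hours = 600         # estimated attorney review time
review_cost = 115_000      # estimated review cost in dollars (approximate)

# Derived figures: none of these appear in the decision; they are simple division.
docs_per_hour = additional_docs / review_hours
cost_per_hour = review_cost / review_hours
cost_per_doc = review_cost / additional_docs

print(f"Review pace: {docs_per_hour:.1f} documents per hour")
print(f"Implied hourly rate: ${cost_per_hour:.2f}")
print(f"Cost per document: ${cost_per_doc:.2f}")
```

That works out to roughly 70 documents per hour at a little under $200 per hour, or a bit under $3 per document reviewed.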
The plaintiff didn’t dispute at all the numbers that Crossroads put forth, but argued that those costs are proportional to the needs of the case because the terms are being applied to data that was only pulled for a six-month period. So plaintiff’s sole basis on the motion to compel is, “Hey, we’ve got a fixed time period and that’s all we asked you to pull.”
Now, 99 search terms is a lot in most cases. In very significant litigation, we might see search terms numbering 99, maybe even more than that, but those are search term strings, and that’s what we’re talking about here.
What is the Court’s analysis? Well, the Court starts as always with the language of Rule 26(b)(1) on proportionality and the six factors to be considered and whether or not the discovery sought here is proportional to the needs of the case.
In applying those factors to the additional documents that are identified by the seven search terms at issue, the Court rejected Jim Hawk’s argument that the time period really has any bearing on the proportionality issue. Now, interestingly, that’s where the proportionality analysis by the Court really stops. Instead, the Court turns to Rule 26(b)(2)(B) which discusses whether data is reasonably accessible and acts as a limitation on the discovery of data under Rule 26.
The Court notes that under Rule 26(b)(2)(B), if data is considered inaccessible, the requesting party can overcome inaccessibility by showing good cause. Instead of a proportionality analysis (how many documents have been produced, and what is the value of the additional 42,000 documents being sought), we’ve moved to an accessibility analysis.
It took me a while, sifting through this case and looking back at the language of the rule, to understand it, but it feels to me like the Court got a little confused about the proper analysis for this motion to compel additional information.
We’re not talking about data that is stored on a legacy system or is super costly to go and retrieve. We’re talking about data that has already been retrieved and has had search terms run across it, and we’re looking at a volume of documents on top of what the other search terms returned, with a very low relevancy rate.
In my view, this is more of a proportionality analysis than an accessibility analysis, but the accessibility analysis is what the Court undertakes.
Now, the Court notes that the reasonably accessible standard refers to the degree of effort in accessing the information, not simply the accessibility of the material format. That’s really where the analysis takes root here in this decision.
Now, the Court looks at the cost of the collection and review of the data based on the figures provided by the defendants and finds that the data is inaccessible. The defendants provided information that it was going to cost an additional $4,000 to collect the additional information and then approximately $115,000 to review it. That’s not an obscene amount of money, but there’s no information in the decision about other costs in the case for us to judge those costs against.
Once the Court decides that the data is inaccessible due to cost under Rule 26(b)(2)(B), the next question becomes whether Jim Hawk has shown good cause to overcome the inaccessibility. The good cause test breaks down into seven factors that are set out in the advisory committee notes to that rule.
The first factor, the specificity of the discovery request, is not really an issue here. We’re talking about fixed search terms and they’re very specific. That factor really weighs in favor of the plaintiff.
The second factor, the quantity of information that is available from other and more easily accessed sources, the Court finds weighs for the defendant. Now, this one is a bit more interesting because the Court says that there’s been extensive discovery in the case, but does not really articulate whether the data that’s requested here is duplicative at all of other data. It merely says that there has been information that is more accessible that has been already produced in this case. Maybe other decisions in this matter articulate these issues, but we haven’t seen any written decisions from the Court on discovery in this case other than what we’re looking at today.
The third factor for the good cause test is the failure to produce relevant information that seems likely to have existed but is no longer available from more easily accessed sources. This factor is not an issue here. There’s no allegation of spoliation or that the information is not available. It’s just a matter of cost as to getting to it.
The Court considers factors 4 and 5 together: factor 4, the likelihood of finding relevant responsive information that cannot be obtained from another more easily accessed source, and factor 5, predictions as to the importance and usefulness of the further information. The Court finds that both of those factors favor the defendant, and this is where some additional facts that the defendant put into the record come into play. The defendant submitted a declaration from its ESI vendor stating that, based on the review already done using CAL (continuous active learning), the relevancy rate for the first 92 search terms was 7%, and that it dropped even further, to 5%, over the last 2,000 documents reviewed.
Now that’s a very low relevancy rate. We would expect to see a much higher relevancy rate if you’re going to actually review content. It suggests to me that there wasn’t any analysis done of the search terms before they were agreed to. Both of those factors, four and five, weigh in favor of the defendant because this relevancy rate is so low that it’s difficult for the Court to see why additional information is needed.
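To put that relevancy rate in cost terms, here is a hedged sketch of what it would mean for the 42,216 additional documents. The big assumption, and it is mine rather than the Court’s, is that the 5% to 7% rate observed on the first 92 terms would carry over to the seven disputed terms; the decision gives no separate rate for them.

```python
# Figures from the record as described in this episode.
additional_docs = 42_216       # documents hitting on the seven disputed terms
review_cost = 115_000          # estimated cost in dollars to review them

# ASSUMPTION: the relevancy rates seen on the first 92 terms also apply here.
assumed_rates = {"overall rate (7%)": 0.07, "recent rate (5%)": 0.05}

results = {}
for label, rate in assumed_rates.items():
    expected_relevant = additional_docs * rate
    cost_per_relevant = review_cost / expected_relevant
    results[label] = (expected_relevant, cost_per_relevant)
    print(f"{label}: about {expected_relevant:,.0f} relevant documents, "
          f"about ${cost_per_relevant:.2f} per relevant document")
```

Under that assumption, the review would surface somewhere between roughly 2,100 and 3,000 relevant documents, at a cost in the range of about $39 to $54 for each one.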
The sixth factor is the importance of the issues at stake in the litigation. That factor is not an issue here because there are no public policy considerations involved in this case. This is simply a competitive theft of trade secrets case between two competitors in the same space.
The seventh factor talks about the party’s resources, and the Court identifies that it really has no information as to what the individual party’s resources are here. This is likely because no one argued this, because they didn’t believe they’d be looking at an accessibility/good cause argument, because this case is really more about proportionality.
Now, of those seven factors, the Court finds that three are either not relevant to this issue or are ones on which the Court has no information. Of the remaining four, only one favors the plaintiff. The Court then notes that the seven factors for good cause are not a checklist; they must be weighed by importance.
With that in mind, the Court finds that the low level of relevancy of the responsive information tips the scales in favor of the defendants here. The plaintiffs failed to show any heightened likelihood that new and relevant information might be discovered using the search terms in dispute. That was really what tipped it over the edge for the defendants.
It seems that the plaintiffs didn’t provide any real facts as to why they needed this information, and we’ll talk about that in the takeaways. In the end, the plaintiffs failed to show good cause to overcome the inaccessibility limitation of Rule 26(b)(2)(B), and the motion to compel the production of data related to the additional seven search terms was denied. The Court also awarded costs and fees to Crossroads in connection with the motion under Rule 37.
Okay, so what are our takeaways here? Well, I wanted to talk about this case first, because there are some key issues as to facts that are presented. But second, because we really haven’t seen an analysis of good cause under Rule 26(b)(2)(B) for quite some time. I think that’s because most of the time this analysis happens under the proportionality standard.
The Court’s analysis of inaccessibility as a basis for denying this motion, as I mentioned, isn’t one that we’ve seen in some time. Typically, inaccessibility is about the cost associated with retrieving data from legacy systems and not about the standard costs associated with collection and review. It feels here like the data had already been collected, even though there was some discussion of collection costs.
It was the same custodians’ data that had already been collected and searched with the previous 92 terms. It feels like that data was already sitting in a database somewhere and it was just the expense of that extra review that really was the issue on the motion to compel.
This was probably not an accessibility issue, but that’s how the Court did the analysis. It probably came to the same determination. Without a lot of facts from the plaintiff here, there really wasn’t much to tip the scales in terms of proportionality either.
As I mentioned earlier, the biggest thing that sticks out to me about this case is that the relevancy rate for review following CAL was 7%. That’s very low. It means that the search terms that were applied were extremely broad. That takes us back to the issue of how to choose search terms. Choosing search terms should be an iterative process. If the plaintiffs are going to propose search terms (which in my view is a disaster to start with, because they have no idea what’s in the data and they’re simply drawing on research to identify terms), then testing those terms is going to be key to determining whether or not they’re valid. At a minimum, testing lets the defendants evaluate the proposal and respond with different terms that still hit on the documents the plaintiffs are looking for.
It doesn’t appear that there was much of an iterative process here based on this relevancy rate. Obviously, I’m reading between the lines and Monday-morning quarterbacking, and I could be completely wrong. But based on what we see in this decision, it seems like that iterative search term process could have been better dialed in, and we may have avoided this motion practice altogether.
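One way to picture that kind of testing is a simple per-term check: review a sample of each proposed term’s hits and flag the terms whose sampled relevancy rate is too low to justify full review. The sketch below is purely illustrative. The `TermSample` and `flag_overbroad` names, the 20% threshold, and the sample counts are all hypothetical and do not come from the decision or from any particular review platform.

```python
from dataclasses import dataclass

@dataclass
class TermSample:
    term: str       # proposed search term or string
    sampled: int    # documents sampled from the term's hits and reviewed
    relevant: int   # of those, documents coded relevant

def flag_overbroad(samples, min_rate=0.20):
    """Return (term, rate) pairs whose sampled relevancy rate is below min_rate.

    min_rate is an arbitrary illustrative threshold, not a legal standard.
    """
    flagged = []
    for s in samples:
        rate = s.relevant / s.sampled if s.sampled else 0.0
        if rate < min_rate:
            flagged.append((s.term, rate))
    return flagged

# Invented numbers for illustration only; nothing here comes from the decision.
samples = [
    TermSample("term A", sampled=200, relevant=90),
    TermSample("term B", sampled=200, relevant=14),
    TermSample("term C", sampled=200, relevant=6),
]
for term, rate in flag_overbroad(samples):
    print(f"{term}: {rate:.0%} relevant in sample, consider narrowing")
```

Terms that get flagged are the ones to renegotiate or narrow before anyone commits to full review, which is the kind of back-and-forth that might have headed off this motion.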
The plaintiff may also have had more success here by providing some specific facts about the existing costs and the number of documents that had already been produced in this case. For example, if the 42,000 documents to be reviewed for those seven terms represented only, let’s say, a 5% increase in volume, that might have weighed in the plaintiff’s favor.
If the plaintiff had also shown why it wanted the additional search terms, for example by pointing to language from documents already produced that led it to choose those terms, that may have been persuasive for the Court as well. There was no information like that, at least none noted in this decision.
Now, we talk about this every week here and I will say it again, on a motion to compel, you have to have the facts to support your need for data. In the world of volume and costs of ESI, blanket assertions of relevance or proportionality will not get you anywhere. That’s really what we see from the plaintiffs here today based on what’s included in this decision.
Okay, that’s our Case of the Week for this week. Thanks so much for joining me. We’ll be back again next week with another decision from our eDiscovery Assistant database.
The post Episode 93: Analyzing the Court’s Decision on Overbroad Search Terms in eDiscovery first appeared on eDiscovery Assistant.