Dealing with clinical studies can be one of the more challenging aspects of being an advertising/marketing lawyer, particularly if you are one of the many lawyers who took the political science/econ route to law school. However, there’s one question that we all know to ask: Are the results statistically significant? If the answer is no, the conversation shuts down and the study is set aside. If the answer is yes, things are definitely looking up.

So why then are we writing a blog on statistical significance? Are we testing a cure for chronic insomnia? Bear with us if you can. We promise to try to make this interesting. And more importantly, the one thing you always thought was easy about dealing with clinical studies may be about to get hard. 

The pushback against the uninformed use of statistical significance has been gathering momentum. In 2016, the American Statistical Association released a statement cautioning against the misuse of statistical significance and P values. Since then, the idea has gained further traction. A March 2019 comment in Nature arguing against the use of statistical significance garnered more than 800 signatories, all of them professionals who work in fields that rely upon statistical modeling. So what is going on? [For more background on what statistical significance is and is not, read this article by economist Andrew Abere.]

The authors argue against a statistical world that is black and white – either an observed result (such as a difference in outcomes between a control group and a treatment group) is statistically significant at some appropriate prespecified level (such as 5%), or it is not. They also rail against a paradigm in which a speaker can say there was no difference between two groups (because the result was not statistically significant) even though everyone in the room can plainly look at the data and see that there actually was a measured difference.
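To make that dichotomy concrete, here is a minimal sketch in Python (the group data are invented for illustration, and nothing below comes from the Nature comment itself). The treatment group plainly measures higher than the control group, yet because the p-value lands above the prespecified 5% threshold, the binary convention reports the result as "no difference."

```python
# A minimal sketch of the significant/not-significant dichotomy.
# The data are invented for illustration; requires scipy.
from statistics import mean
from scipy import stats

control = [12.1, 11.8, 13.0, 12.4]
treatment = [12.9, 13.4, 12.6, 11.9]

measured_diff = mean(treatment) - mean(control)  # plainly nonzero

t_stat, p_value = stats.ttest_ind(treatment, control)

ALPHA = 0.05  # the prespecified significance level
if p_value < ALPHA:
    verdict = "statistically significant"
else:
    # Under the convention the authors criticize, this verdict is
    # routinely summarized as "there was no difference," even though
    # measured_diff sits right there in the data.
    verdict = "not statistically significant"

print(f"measured difference: {measured_diff:.3f}")
print(f"p-value: {p_value:.3f} -> {verdict}")
```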

For this reason, they argue that it is too simplistic to assert that studies contradict each other when one finds statistical significance and the other does not. They cite a real-world example involving anti-inflammatory drugs and whether they lead to an increased risk of atrial fibrillation. Two studies were conducted, both of which found an estimated average 20% greater risk in patients taking the drugs. In one study, however, the estimate of the effect was less precise, with a confidence interval ranging from a 3% decreased risk to a 48% greater risk. Because that interval includes the possibility of no effect at all, the outcome in this study, unlike in the other, was not statistically significant, even though both studies produced the exact same estimated average 20% greater risk.
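For readers who want to see the mechanics, here is a short Python sketch of that two-study example, assuming standard normal-theory 95% confidence intervals on the log risk ratio. The first standard error is back-solved from the 0.97–1.48 interval described above; the second is a hypothetical, smaller value chosen only to show how a more precise study with the identical 20% estimate clears the significance bar.

```python
# Two studies with the same point estimate (RR = 1.20) but different
# precision; significance turns entirely on the width of the interval.
import math

def rr_confidence_interval(rr, se_log_rr, z=1.96):
    """95% CI for a risk ratio, computed on the log scale."""
    log_rr = math.log(rr)
    return math.exp(log_rr - z * se_log_rr), math.exp(log_rr + z * se_log_rr)

RR = 1.20  # both studies estimated a 20% greater risk

studies = [
    ("less precise study", 0.108),                 # SE implied by the 0.97-1.48 interval
    ("more precise study (hypothetical SE)", 0.05),
]

for label, se in studies:
    lo, hi = rr_confidence_interval(RR, se)
    significant = lo > 1.0 or hi < 1.0  # interval excludes "no effect" (RR = 1)
    print(f"{label}: RR = {RR}, 95% CI ({lo:.2f}, {hi:.2f}) -> "
          f"{'statistically significant' if significant else 'not statistically significant'}")
```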

An analogy to antitrust law might help explain what is going on. Under antitrust law, certain types of conduct are “per se” illegal because a “per se” rule provides certainty and there is little if any countervailing benefit to the conduct in question. Other types of conduct are subject to the “rule of reason,” which calls for a closer look at the benefits and harms associated with that conduct in any given situation. And sometimes conduct moves from one category to the other as we learn more about its possible impacts. So think of the rule that a study’s results must be statistically significant in order to substantiate a claim as a “per se” rule. Some are now arguing that we should move to a “rule of reason”: in some circumstances there may be legitimate reasons to accept studies with results that do not satisfy the generally accepted levels of statistical significance, and the benefits of doing so may outweigh the loss of certainty that a “per se” rule provides.

At bottom, the authors call for more scientific inferences and fewer statistical ones. While they acknowledge that the idea of statistical significance can be a factor in yes or no decisions (such as the aforementioned example of whether a study substantiates a claim), they argue that other factors, such as the costs, benefits and likelihood of potential consequences, are more important than whether the difference is statistically significant.

Given the emerging scientific consensus on this point, perhaps it is time for the FTC to consider adopting a more nuanced approach to statistical significance when evaluating studies and claim substantiation.