Great moments in bad statistics history
Per Ronald Coase, if you torture the data long enough, they will confess to anything.
Rob James
May 17, 2026
As a counterpart to my post on Great Moments in Bad Graphics History, here are some notes from a great book I read. Much of this is taken directly from the author’s helpful “Cliffs Notes” summary, which I will add to with my own examples over time.
“If you torture the data long enough, it will confess.”--Ronald Coase
1. Patterns, Patterns, Patterns
By evolution and survival in the gene pool, we are predisposed to spot patterns, and to think the patterns we see have deep meaning.
“If a baseball player plays a good game after wearing new socks, he shouldn’t change socks.
If the stock market does well after NFC teams win the Super Bowl, watch the game before investing.
If a basketball player makes four shots in a row, he is hot and is very likely to make the next shot.
If a heart attack victim recovers after being sent healing thoughts from a thousand miles away, distant healing works.
If a customer satisfaction survey finds that people living in homes with three bathrooms are more enthusiastic than are people living in homes with two bathrooms, that is the target market.
If a country has a recession when federal debt was high, then government debt causes recessions.”
Don’t be fooled into thinking that a pattern is proof. The overwhelming theme of Smith’s book is Think Both Theory and Data. Get yourself a logical, persuasive explanation, and test that explanation with unbiased and accurate data.
2. Garbage In, Gospel Out
We watch the world and we draw conclusions based on the evidence of our eyes. But consider that the folks doing some activity chose to be there. “The traits we observe may not be due to the activity, but may instead reflect the people who choose to do the activity.”
If we see that kids who play competitive sports are confident, don’t conclude that playing competitive sports builds confidence. It could just as well be that confident kids choose to play competitive sports.
If we think men and women who work on Wall Street are aggressive, don’t conclude that Wall Street is an incubating chamber that induces aggression. It could just as well be that aggressive folks are drawn to Wall Street.
Does college X produce smart ambitious graduates, or do smart ambitious high-school students go to college X?
We all make inferences from what we see – the snapshot current wages of current workers, the bullet-holes to parts of the aircraft that made it back from a bombing run, the companies that are successful today. We should also think about what we do not see – the employees who left, how long the current employees have been in their jobs, the planes that never made it back because they were shot in other parts, the companies similar to the successful ones that went belly-up anyway. “The unseen data may be just as important, or even more important, than the seen data.”
Fight survivor bias by going back to square one in the past. Look at people who were hired over the last ten years, look at how long people stay in the job, look at all of the planes that were sent on bombing missions, tote up the characteristics of all the companies that have been formed over the last several decades, and then draw your conclusions.
3. Apples and Prunes
We need to compare one thing with another. How can you decide if a drug, therapy or policy works unless you place it next to an alternative? But it has to be a plausible comparison. Beware of percentage changes and absolute changes: as I said in my article on energy numeracy, “if they give you a percentage, ask for the absolute; if they give you an absolute, ask for the percentage.” Sometimes the only thing in common between two variables is that they both increase over time (population increases correlate with increases of lots of things).
The infamous autism vaccine study was reported by Andrew Wakefield in the distinguished British publication The Lancet in 1998. He reported that twelve previously normally developing children were administered the MMR vaccine and proceeded to be diagnosed as autistic. It was quickly challenged and flagged as questionable: first because of the small sample size, second because it had been funded by lawyers, and third because the investigator was proposing his own new vaccine. More seriously, it was found that five of the twelve had preexisting developmental difficulties and only one of them was fully diagnosed with autism. It was a reporter, not another investigator, who uncovered the additional data, and in 2010 the study was finally retracted in full.
A 1989 Paul Brodeur New Yorker article contended that cancer was caused by the electromagnetic fields of nearby power lines. This makes no sense because those fields are much, much weaker than background radiation or the Earth’s magnetic field. By 1999, the New Yorker recanted the claim. But the autism study and the power line (now data-center!) EMF-illness claim persist in social media and among charlatans to this day.
On the other hand, sometimes a correlation does point to a cause. The Pennsylvania American Legion annual convention in 1976 was associated with a cluster of instances of a severe illness, rare by any background illness measure. It turned out “Legionnaires’ disease” was indeed caused by newly detected airborne bacteria. Likewise, John Snow’s 1854 map of London cholera cases tagged the Broad Street water pump as the culprit.
We sometimes miss the reasons for patterns. The stock market dropped 4%, 1% and 5% on the last three days of a week in October 1987. By Monday morning, the Wall Street Journal quoted an “expert” saying the selloff had been exhausted, and that “we are close to the bottom.” Instead, on the very day that newspaper came out, the market collapsed 23%. (It was later determined that a “portfolio insurance strategy” was a trigger: it called for buying a stock after its price rises, and selling stock after its price falls!)
A South Sea Bubble banker defended his insane buying of shares that had no intrinsic value, thus worth anything only if he could find a more insane repurchaser, by saying, “When the rest of the world goes mad, we must imitate them in some manner.” A similarly obtuse Member of Parliament said, “I knew that the ruin must come sooner or later, but it came two months sooner than I expected.”
4. Oops!
Assertions make headlines when and because they are counterintuitive. But in general, trust your intuitions. Don’t quickly conclude that your intuition is off. Consider whether the data make sense or reflect some bias like self-selection bias. After all, as between two data sets, the causation might run in the opposite direction, or it might run in some third direction common to both sets, or there may be a mere correlation without causation.
And oh yes, consider the possibility that there was a mistake – that the computer was told to calculate the square root of 196 instead of the square root of 169. Even the best, and most honest, researchers are human – and humans make mistakes. The ground software for the $300 million Mars Climate Orbiter, launched in 1998, reported thruster impulse in pound-force seconds, while the Orbiter’s own software expected newton-seconds. What was supposed to be a 100-mile orbit became a 37-mile orbit, resulting in the craft plunging into the planet. Oops.
The ironic lesson to be learned from the worldwide influence of the Reinhart-Rogoff study (claiming a certain level of debt-to-GDP ratio was critical, thus inspiring costly IMF austerity policies) and the runaway success of Freakonomics (correlation of legalized abortion in 1973 with lower murder rates after 18 years) is that “it is not true that data are more important than ideas. We are often duped by data.”
Graphs can be useful. They can show how income and spending are related to each other. But they can also be deceptive, intentionally or not. Be very skeptical of graphs whose axes omit the zero point. Watch for graphs that zoom in on one segment, which can show apparent patterns (like a one-day dip in prices during a months-long price rise). The booby prize goes to a graph with no axis numbers at all. Data like nationwide aggregates, wages and prices need to be adjusted for population growth and inflation. “Don’t be deceived by graphs that put time on the vertical axis instead of the horizontal axis where we are used to seeing it, or by graphs that use inconsistent spacing” – where a half-inch is sometimes five years and sometimes ten years.
Graphs are serious business. “A useful graph displays data accurately and coherently, and helps us understand a plausible relationship of two or more dimensions of accurate data. Chartjunk, in contrast, distracts, confuses, and annoys.” A chart may be misguided or malevolent.
5. Sense and Nonsense
The Monty Hall Problem (after choosing A out of ABC and being shown C is not the prize, should you switch from A to B?) is an apparent paradox because our intuition is wrong and we have to work out the answer with “slow thinking.” On the other hand, the Two-Child Problem (knowing the first child has sex M, what are the odds of the second child’s sex?) has the opposite problem—our intuition is right and trying to work out the answer superficially can lead to error.
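One way to do the “slow thinking” on the Monty Hall Problem is to list every equally likely combination of prize door and first pick, then count wins for each strategy. The Python sketch below is a minimal illustration, not anything from the book: the function name is my own, and it assumes the standard rules (the host always opens a door that is neither your pick nor the prize).

```python
from fractions import Fraction
from itertools import product


def monty_hall_win_probability(switch: bool) -> Fraction:
    """Exact chance of winning, by enumerating all (prize, first pick) cases."""
    doors = ("A", "B", "C")
    wins = 0
    cases = 0
    for prize, first_pick in product(doors, doors):
        # The host opens a door that is neither the pick nor the prize.
        # If the pick IS the prize he has two choices, but either one gives
        # the same result for each strategy, so we just take the first.
        opened = next(d for d in doors if d != first_pick and d != prize)
        if switch:
            final_pick = next(d for d in doors if d not in (first_pick, opened))
        else:
            final_pick = first_pick
        wins += final_pick == prize
        cases += 1
    return Fraction(wins, cases)


print("stay:  ", monty_hall_win_probability(switch=False))  # 1/3
print("switch:", monty_hall_win_probability(switch=True))   # 2/3
```

Staying wins only in the three cases where the first pick was right; switching wins in the other six.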
“Don’t just do the calculations. Use common sense to see whether you are answering the correct question, the assumptions are reasonable, and the results are plausible. If a statistical argument doesn’t make sense, think about it carefully – you may discover that the argument is nonsense.”
Even professionals succumb to the false positive fallacy (a 99% accurate test will still have many false positives). “A test may be very likely to show a positive result in certain situations (for example, if a disease is present), yet a positive test result does not ensure that the condition is present; it may be a false positive.” False positives are more common when something is super rare (like a malignant tumor) or when there are a large number of readings (MRIs taken from lots of tissue, alive or dead).
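The arithmetic behind the false positive fallacy is Bayes’ rule. Here is a small Python sketch with made-up inputs: a condition that affects 1 person in 1,000, and a test that is “99% accurate” in both directions. The function name and all of the numbers are assumptions chosen for illustration.

```python
def prob_condition_given_positive(prevalence, sensitivity, specificity):
    """Bayes' rule: P(condition is present | test came back positive)."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * (1 - specificity)
    return true_positives / (true_positives + false_positives)


# Hypothetical inputs: 1 in 1,000 people has the condition; the test catches
# 99% of real cases and correctly clears 99% of healthy people.
rare = prob_condition_given_positive(prevalence=0.001, sensitivity=0.99, specificity=0.99)
print(f"P(condition | positive), rare condition: {rare:.1%}")  # about 9.0%

# Same test, but now half the people tested really have the condition.
common = prob_condition_given_positive(prevalence=0.5, sensitivity=0.99, specificity=0.99)
print(f"P(condition | positive), common condition: {common:.1%}")  # 99.0%
```

With the rare condition, roughly ten positives in eleven are false, even though the test itself has not changed.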
6. Confound It!
“If a study supports your belief, there is a natural inclination to nod knowingly and conclude that your beliefs are confirmed. It would be smarter to look closely and think about confounding factors.” The raw data showed higher male than female acceptance rates overall for UC Berkeley’s graduate schools. Discrimination, right? The investigators thought so and zeroed in on the departments with the biggest discrepancies, only to find that those departments tended to favor female applicants of equal qualifications. Women had a lower overall acceptance rate because they were more likely to apply to the most restrictive programs, the ones with the lowest acceptance rates. “Always be wary of studies that use the data to confirm the theory.”
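The Berkeley pattern is easy to reproduce with toy numbers. The Python sketch below uses two invented departments (these are NOT the real Berkeley figures) in which women are admitted at a higher rate in each department yet at a lower rate overall, simply because most women apply to the more selective one.

```python
# Hypothetical (applicants, admitted) counts, invented for illustration only.
applications = {
    "less selective dept": {"men": (800, 480), "women": (200, 130)},
    "more selective dept": {"men": (200, 40), "women": (800, 200)},
}


def rate(applied, admitted):
    return admitted / applied


def overall_rate(group):
    """Acceptance rate for a group after pooling both departments."""
    applied = sum(dept[group][0] for dept in applications.values())
    admitted = sum(dept[group][1] for dept in applications.values())
    return rate(applied, admitted)


for name, dept in applications.items():
    print(f"{name}: men {rate(*dept['men']):.0%}, women {rate(*dept['women']):.0%}")
print(f"overall: men {overall_rate('men'):.0%}, women {overall_rate('women'):.0%}")
# less selective dept: men 60%, women 65%
# more selective dept: men 20%, women 25%
# overall: men 52%, women 33%
```

The confounding factor here is which department each group applies to; the pooled rates hide it.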
7. When You’re Hot, You’re Hot
Our love of patterns extends to the sports field. We are convinced of large percentage swings due to “hot hands” and “cold hands” in basketball (probability of making the next shot dependent on having made or missed the prior shot). Even extended hot and cold streaks can happen by chance. Heads can come up 47 times in a row, even with a fair coin. (By the way, hot and cold hands probably do exist in basketball, but the effect is smaller and shorter-lived than our gut tells us.)
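To get a feel for how often streaks show up by pure chance, here is a Python sketch that computes the exact probability that a series of fair coin flips contains at least one run of heads of a given length. The function name and the 100-flip, 5-in-a-row example are my own illustrative choices, and a fair coin is of course only a rough stand-in for a shooter.

```python
def prob_run_of_heads(n_flips, run_length):
    """Exact probability that n_flips fair flips contain a run of
    at least run_length consecutive heads."""
    # state[j] = probability that the current trailing streak is j heads
    # and no full run has been completed so far.
    state = [0.0] * run_length
    state[0] = 1.0
    hit = 0.0  # probability that a full run has already occurred
    for _ in range(n_flips):
        new_state = [0.0] * run_length
        for streak, p in enumerate(state):
            new_state[0] += p * 0.5              # tails resets the streak
            if streak + 1 == run_length:
                hit += p * 0.5                   # heads completes the run
            else:
                new_state[streak + 1] += p * 0.5  # heads extends the streak
        state = new_state
    return hit


print(prob_run_of_heads(n_flips=100, run_length=5))
print(0.5 ** 47)  # chance that 47 specific flips all come up heads
```

A run that feels like a “hot hand” turns out to be the expected result of a long enough sequence, not an exception to it.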
8. Regression to the Mean
A nebulous attribute like academic smarts or athletic fitness can be measured at one point in time at an extremely good or poor level of achievement. You are not as good or as bad as your extreme mark. Those who perform the best are probably not as far above average as they seem. Nor are those who perform the worst as far below average as they seem. The subsequent performances of both of them will consequently tend to regress to the mean. This does not mean that the best performers are jinxed, only that their exceptional performances were assisted by good luck.
This does not lead to what I have referred to elsewhere as “a beige rainbow.” Regression doesn’t mean that we are all converging on a middle point so that everyone will be meh. “It only means that extreme performances tend to ‘rotate’ among people who experience good luck and bad luck. It doesn’t mean that successful and unsuccessful companies are all converging to a depressing mediocrity.”
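Regression to the mean falls out of a very simple model: observed score = stable skill + luck. The Python sketch below simulates two tests for the same people and follows the top scorers from the first test. Everything in it (the function name, the normal distributions, the equal spread of skill and luck, the seed) is an assumption made for illustration.

```python
import random
from statistics import mean


def regression_demo(n_people=10_000, top_fraction=0.10, seed=0):
    """Return (top group's mean on test 1, same group's mean on test 2,
    everyone's mean on test 2)."""
    rng = random.Random(seed)
    skill = [rng.gauss(100, 10) for _ in range(n_people)]  # stable ability
    test1 = [s + rng.gauss(0, 10) for s in skill]          # ability + luck
    test2 = [s + rng.gauss(0, 10) for s in skill]          # same ability, fresh luck
    # Score needed to land in the top fraction on the first test.
    cutoff = sorted(test1, reverse=True)[int(n_people * top_fraction) - 1]
    top = [i for i in range(n_people) if test1[i] >= cutoff]
    top_first = mean([test1[i] for i in top])
    top_second = mean([test2[i] for i in top])
    return top_first, top_second, mean(test2)


first, second, overall = regression_demo()
print(f"top group, test 1: {first:.1f}")
print(f"top group, test 2: {second:.1f}")
print(f"everyone,  test 2: {overall:.1f}")
```

The top group stays above average the second time (they really are skilled) but lands closer to the mean, because the good luck that helped put them on top does not repeat. Nobody’s skill changed between the two tests.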
9. Even Steven
After a bad luck streak, we are sure our luck will change for the better. “But the bad things that happened to us do not automatically make good things more likely.” Consider whether you need to modify your behavior, or find a different environment, to have better likelihood of outcomes. If you’re getting a lot of rejection letters, it doesn’t mean an acceptance is more likely to come in. Maybe you need to handle applications or interviews better or at least differently, or frankly think about another type of job or school.
On August 18, 1913, the Monte Carlo casino roulette wheel produced 26 Black numbers in a row. Contrarians were wiped out before a Red number came up and the overall pattern of parity re-emerged.
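The contrarians’ mistake can be checked numerically. The Python sketch below assumes a single-zero wheel (18 black, 18 red, 1 green pocket) and independent spins; the function names, streak length, and seed are illustrative. It computes how unlikely a streak of 26 blacks is in advance, and then simulates how often black comes up right after a streak has already happened.

```python
import random

P_BLACK = 18 / 37  # single-zero wheel: 18 black, 18 red, 1 green


def prob_black_streak(length):
    """Chance that the next `length` spins are all black."""
    return P_BLACK ** length


def black_rate_after_streak(streak=3, n_spins=200_000, seed=0):
    """Simulated share of black results on spins that immediately
    follow at least `streak` blacks in a row."""
    rng = random.Random(seed)
    run = 0
    followups = blacks = 0
    for _ in range(n_spins):
        is_black = rng.random() < P_BLACK
        if run >= streak:
            followups += 1
            blacks += is_black
        run = run + 1 if is_black else 0
    return blacks / followups


print(f"26 blacks in a row, in advance: {prob_black_streak(26):.2e}")
print(f"P(black) on any spin:           {P_BLACK:.3f}")
print(f"share black right after streak: {black_rate_after_streak():.3f}")
```

The streak is extraordinarily unlikely beforehand, but once it has happened the wheel has no memory: the share of blacks after a streak stays near 18/37, so Red is never “due.”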
A due date is simply an estimate. There is only a 5% chance that the baby will arrive on the due date predicted at the outset. 20% of all pregnancies are more than two weeks away from that due date, even before taking into account induced births and Caesarian deliveries.
10. The Texas Sharpshooter
There is an old joke about the Texas sharpshooter who shoots first and then draws the target. Even random samples produce outcomes that look like clusters to our eyes. “Someone who looks for an explanation will inevitably find one, but [by itself,] a theory that fits a data cluster is not persuasive evidence.” A good explanation must be sensible and fit unadulterated data.
If you shoot arrows at enough targets, sooner or later you will hit at least one of them. If you set out with hundreds of possible theories, random data will likely wind up producing patterns consistent with one of them. Define “old age” as 55+, 60+, or 65+, or define “young old” as 55-70 (hey, that hurts!), and hey howdy eventually you’ll get a titillating data set. “When you hear that the data support a theory, don’t be persuaded until you’ve answered two questions. First, does the theory make sense? If it doesn’t, don’t be easily persuaded that nonsense is sensible. Second, is there a Texas sharpshooter in the house? Did the person promoting the theory look at the data before coming up with the theory? Or did the person conjure up hundreds of possible theories, tested before settling on the theory being promoted?”
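The “hundreds of possible theories” problem can be simulated directly. In the Python sketch below, an outcome and 200 candidate “explanations” are all pure random noise, yet the best of the 200 still correlates impressively with the outcome. The function names, sample sizes, seed, and the 5% threshold are illustrative assumptions, and the closing formula additionally assumes the tests are independent.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def best_spurious_correlation(n_theories=200, n_points=20, seed=0):
    """Largest |r| between a random outcome and many unrelated random predictors."""
    rng = random.Random(seed)
    outcome = [rng.gauss(0, 1) for _ in range(n_points)]
    best = 0.0
    for _ in range(n_theories):
        predictor = [rng.gauss(0, 1) for _ in range(n_points)]
        best = max(best, abs(pearson_r(predictor, outcome)))
    return best


def chance_of_false_hit(n_theories, alpha=0.05):
    """P(at least one 'significant' result) when every theory is false,
    assuming independent tests at significance level alpha."""
    return 1 - (1 - alpha) ** n_theories


print(f"best |r| among 200 noise predictors: {best_spurious_correlation():.2f}")
print(f"chance of a false hit in 100 tries:  {chance_of_false_hit(100):.3f}")  # 0.994
```

Drawing the target around the best of those 200 correlations is exactly what the Texas sharpshooter does.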
11. Serious Omissions
Beware of weird theories! And beware of researchers who say they threw out some of their data. Deciding to omit outliers can be critical if the outliers are meaningful. Omissions should be justified by genuine problems in collecting the data accurately, not by whether the omission strengthens the desired correlation. The Dow keeps going up because they drop Sears Roebuck. A tourist bureau asks only people who’ve gone to France twice whether they think the French are friendly, not those who went one-and-done. “The best rule for researchers is when in doubt, don’t leave it out. The best rule for readers is to be wary of studies that discard data. Ask yourself if anything is really clearly wrong with the omitted data. If not, be suspicious. Data may have been discarded simply because they contradicted the desired findings.”
13. Flimsy Theories and Rotten Data
Watch for an investigator culling data to fit his theory. “Extraordinary claims require extraordinary evidence. True believers settle for less.”
If Arthur Conan Doyle believed in spiritualism so much that “he could refuse to believe a woman’s confession that she pretended to communicate with the deceased, so can anyone. If J. B. Rhine could conclude that a volunteer’s failure to guess ESP cards correctly was evidence that the volunteer was guessing cards correctly (!), so can anyone. If Elisabeth Targ could believe that there was no harm in ransacking data for evidence that people can be healed by distant prayer, so can anyone.”
15. Data Without Theory
Extrapolations can be dangerous. All kinds of terrible outcomes can be predicted by applying linear or even exponential growth to a few data points. As one of my personal heroes Edward O. Wilson wrote, “Population growth can obey the exponential equation only under special circumstances and for short periods of time. Any population miraculously permitted to grow at its full exponential rate for just a few years will come to weigh as much as the visible universe and to expand outward at close to the speed of light.” More succinctly, Herb Stein said, “If something cannot go on forever, then it will stop.” Even Moore’s Law is hitting particle-physical limits.
Why do you think the extrapolation makes sense? Even coin flips can look weird (remember the 47 heads and 26 black numbers in a row). “When someone shows you a pattern, no matter how impressive the person’s credentials, consider the possibility that the pattern is just a coincidence. Ask why, not what. No matter what the pattern, the question is: why should we expect to find this pattern? A statistical comparison of two things is similarly unpersuasive unless there is a logical reason why they should be related.”
Why would stock prices correlate with debt, or with consumer prices? As I stressed in my article addressing The Limits to Growth and The Population Bomb, if commodity prices were endlessly to rise, won’t people find alternatives or think of smart ways to produce more? Did the investigator have a good theory in place before starting to look at the data, all the data?
Stigler’s Law: no scientific discovery is named after its original discoverer. (Even this law was conceived by someone else, namely Robert K. Merton.)
16. Betting the Bank
The gold/silver ratio may have been 34 to 38 over decades, but that doesn’t mean the ratio will be in that range tomorrow. Two markets may have been correlated for years, but macroeconomic or geopolitical events could produce wildly different effects on both markets. “Don’t just look at numbers. Think about reasons.”
17. Theory Without Data
Like the farmer and the cowman in the musical Oklahoma!, we need the theory and the data to be friends. “No matter who has done the study, (1) the theory needs to pass the common-sense test and (2) the theory needs to be tested with data that are unbiased and have not been corrupted by data-grubbing.”