Flaws and Fallacies in Statistical Thinking 4/5
Stephen K. Campbell
9/ Improper Comparisons
The easiest comparisons to dispose of are the ones where things are compared against … nothing: “This widget X is 75% faster”. Faster than what? “Avis: we try harder”, anyone?
Comparing unlike things (apples and oranges) is another pet peeve: a comparison is only meaningful between things that share the same characteristics and other relevant factors (like the context). Some broad terms like “conviction rate” must be split into more granular rates per article, or at least per group of similar articles. If the categories compared are too broad and have different elements in them, the comparison should be ignored.
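A minimal sketch of why a broad rate can mislead. All counts below are invented for illustration, and the region and category names are hypothetical. Region B has the higher conviction rate in every category, yet the lower overall rate, simply because its case mix is dominated by the harder category.

```python
# Hypothetical (convicted, tried) counts per offence category; not real data.
cases = {
    "Region A": {"minor": (450, 500), "serious": (40, 100)},
    "Region B": {"minor": (95, 100), "serious": (250, 500)},
}


def rate(convicted: int, tried: int) -> float:
    """Share of tried cases that ended in a conviction."""
    return convicted / tried


def overall_rate(region: str) -> float:
    """Conviction rate with all categories lumped together."""
    convicted = sum(c for c, _ in cases[region].values())
    tried = sum(t for _, t in cases[region].values())
    return rate(convicted, tried)


for region, categories in cases.items():
    per_category = {name: round(rate(*counts), 2) for name, counts in categories.items()}
    print(region, "overall:", round(overall_rate(region), 3), "by category:", per_category)
```

Comparing the two overall figures says more about the mix of cases than about how either region handles any particular kind of case.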
It was said before but is worth mentioning again: there must be the same Friendly Definition for both items to be compared. Different sources classify certain items differently and sometimes proper comparison requires digging deeper into the definition specifics (“EBITDA margin” is the ubiquitous and painful example).
A special case of definition is classification, i.e., arranging items into different classes. Comparing class sizes, and making decisions based on that comparison, without first agreeing on common classification principles is dangerous. Data collection methods must also be identical, and the data collected must be of the same quality.
Many changes in statistics (e.g., crime rates) are also driven by changes in reporting methods on top of the changes in the underlying data.
MK: if you’re getting the feeling that comparing two numbers requires doing a large homework and digging deep into the subject matter – this is the right feeling.
If the bases for comparison are too small, absolute numbers should be used instead of percentages.
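A small sketch of that rule. The helper name `describe_change` and the cut-off of 30 are my own assumptions, not something the book prescribes: going from 2 to 3 is technically "+50%", but reporting it that way hides how tiny the base is.

```python
def describe_change(before: int, after: int, min_base: int = 30) -> str:
    """Report a change as a percentage only when the base is large enough.

    min_base is an arbitrary, hypothetical threshold chosen for illustration.
    """
    diff = after - before
    if before < min_base:
        # Small base: a percentage would look far more dramatic than it is.
        return f"{before} -> {after} ({diff:+d} in absolute terms)"
    pct = 100 * diff / before
    return f"{before} -> {after} ({pct:+.1f}%)"


small = describe_change(2, 3)
large = describe_change(2000, 3000)
print(small)
print(large)
```

Both changes are a 50% increase, but only the second one deserves to be quoted as such.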
The white vs black sheep comparison: it’s useful to know that white sheep eat more than black sheep [MK: I’m deliberately omitting the word “collectively”], but it’s even more useful to know that there are more white sheep in general. So, comparing characteristics of samples of substantially different sizes is also useless.
If 66% of all rape and murder victims are friends or former friends or relatives of the assailants, it’s wrong to think that one will be safer in a public park among strangers than at home, because the number of such life events (“in a park with only strangers” vs “at home with an acquaintance”) is incomparably skewed towards the latter. One can also think about the duration of such events to reach a similar conclusion.
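The arithmetic behind this can be sketched with made-up numbers. The incident counts echo the 66% split, while the exposure hours are purely hypothetical and exist only to show how a share of incidents differs from a rate per unit of exposure.

```python
# Hypothetical figures for illustration only; not real crime statistics.
incidents = {"with acquaintances": 66, "among strangers": 34}
exposure_hours = {"with acquaintances": 9_000, "among strangers": 1_000}

total = sum(incidents.values())

# Share of all incidents: this is what the headline figure reports.
share = {k: v / total for k, v in incidents.items()}

# Incidents per hour of exposure: this is what matters for "where am I safer?".
rate = {k: incidents[k] / exposure_hours[k] for k in incidents}

for k in incidents:
    print(f"{k}: share {share[k]:.0%}, rate per hour {rate[k]:.4f}")
```

With these invented exposures, the setting that accounts for the larger share of incidents has the lower rate per hour, which is exactly why the share alone cannot support the "safer among strangers" conclusion.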
The use of a standard for comparison in the form of a control group is essential (think of an A/B test). Any statement saying that the members of group X develop certain properties if they do Y is useless without a fair comparison: either against the outcomes for people outside group X, or against those who don’t do Y, who may well develop the same properties anyway.
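A minimal sketch of the comparison a control group makes possible. The counts are hypothetical, and a real A/B test would also need a significance check, which is omitted here.

```python
def outcome_rate(with_outcome: int, group_size: int) -> float:
    """Fraction of a group that developed the property of interest."""
    return with_outcome / group_size


# Hypothetical counts: 200 people who did Y, 200 comparable people who did not.
did_y = outcome_rate(40, 200)
did_not_do_y = outcome_rate(38, 200)

# The claim "20% of those who do Y develop the property" sounds meaningful
# until it is set against the control group's rate.
lift = did_y - did_not_do_y
print(f"did Y: {did_y:.0%}, control: {did_not_do_y:.0%}, difference: {lift:.1%}")
```

Without the second number, the first one tells us nothing about whether doing Y makes any difference.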
10/ Jumping to Conclusions
Let’s get the obvious out of the way: it’s possible to supplement a single fact with enough context to support opposite points of view. This may blind the speaker, who then ignores any other possible interpretations of the same evidence.
When equally plausible alternative conclusions can be reached from exactly the same statistical evidence, the logical link between evidence and conclusion offered is probably rather weak.
Whenever facts are presented about one group of items, the conclusion reached must pertain to the same group. Example: in 95% of couples seeking divorce, either one or both partners do not attend church regularly. Incorrect conclusion: churchgoers stay wed. Debunking: the original statement says nothing about the opposite group of people (couples who are not seeking divorce), so there’s not enough data to imply anything about it. [MK: Of course, the context in which the original statement is made determines what the listener will infer from this information.]
11/ Faulty Thinking About Probability
[MK: Let’s get the obvious out of the way.] The difference between statistics and probability is that there’s a degree of uncertainty in probability, while statistics describes a fact.
A probability is always stated “under a constant cause system”: if the game has a skill component in it, this skill may improve over time and affect the odds in later trials. [MK: you can’t get more skilful in tossing a coin.]
Subjective probability is the one where the decision maker must decide what probability to assign to an event based on the evidence available at the time.
A lovely fallacy is about the bomb not hitting the same hole twice: the underlying faulty assumption is that if the chance of this happening the first time is 1/10, then the chance of it hitting the same hole again is 1/10 * 1/10 = 1/100, while in fact these events are independent, so the real chance is still 1/10. Some people paid with their lives for this statistical lesson. A lesson to all of us: many events we think are dependent on each other are in fact not.
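The two numbers in this fallacy answer different questions, which a brute-force enumeration makes visible. This is a sketch under a simplifying assumption of my own: 10 equally likely spots and two independent strikes, with `TARGET` standing for the hole one is hiding in.

```python
from fractions import Fraction
from itertools import product

HOLES = range(10)  # assumption: 10 equally likely spots, so each has a 1/10 chance
TARGET = 0         # the hole made by the first bomb

# Every equally likely (first strike, second strike) pair.
outcomes = list(product(HOLES, repeat=2))

both_hit = [o for o in outcomes if o[0] == TARGET and o[1] == TARGET]
first_hit = [o for o in outcomes if o[0] == TARGET]

# Chance, judged before anything has happened, that both strikes land on TARGET.
p_both = Fraction(len(both_hit), len(outcomes))

# Chance that the second strike lands on TARGET once the first already has.
p_second_given_first = Fraction(len(both_hit), len(first_hit))

print("P(both hit the target):", p_both)
print("P(second hits | first hit):", p_second_given_first)
```

The 1/100 figure is the chance of two hits judged in advance. Once the first bomb has fallen, the only relevant figure for someone sitting in the crater is the conditional one, and it is still 1/10.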
And the opposite is also true: for instance, if one hopes for a salary increase or a promotion, they shouldn’t treat these two events as independent of each other.