How to Lie with Statistics
While I respect qualitative indications, I’m someone who points to the data and takes the numbers into account when forming thoughts or making decisions. That said, I know that flaws in our thinking often come from how data is presented to us. Data and statistical material are typically used to get an argument across, and at times they are used to misinform us through statistical manipulations called “statisticulations.” This short book highlights the importance of understanding what we’re looking at, asking where the information came from, and questioning its accuracy. It is full of fantastic examples and good illustrations, and even though it was published in 1954, its points are still valid today. One should ask five questions of any data or statistics: Who says so? How do they know? What’s missing? Did somebody change the subject? Does it make sense?
Early on, you realize that data and statistics can be flawed from the very start, because the responses collected from a sample of people are shaped by those individuals’ motives, emotions, and biases. A sample should include all types of people in a population, that is, it should be a representative sample. Otherwise, we end up with tunnel vision built on a few specific cases. This made me think about how the GDP figures of nations can be tremendously skewed. Qatar's GDP (PPP) per capita was around $124k in 2017, which labeled it the richest country in the world. But with a very small total population, made up mostly of low-wage expatriate workers, and with Qataris accounting for less than 15% of it, that figure gives a misleading picture of how well off the typical resident is. Furthermore, how we measure specific traits can also lead to false conclusions. IQ tests, for example, are used as a measure of intelligence, yet they don’t incorporate social judgment or other aptitudes, and they neglect traits like leadership and creative imagination.
What stood out to me is how we use the word “average,” which could mean the mean, the median, or the mode. Whoever presents the number can pick whichever kind of “average” best serves their purpose and so influence opinions in a certain way. What I also found really interesting is that when we compare data collected in different time periods, we often don’t realize that definitions change over time. One example was measuring the income of an average family: the term “family” implied a very different household size in 1949 than it does now, so the numbers from the two periods are not directly comparable even though they are presented as if they were.
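To make the point about “average” concrete, here is a small sketch in Python. The salary figures are invented purely for illustration and do not come from the book; the only point is that the same list yields three very different “averages” depending on which one you choose to report.

```python
import statistics

# Hypothetical yearly salaries at a small company (invented numbers).
# Six ordinary employees and one very well-paid owner.
salaries = [25_000, 25_000, 30_000, 35_000, 40_000, 45_000, 500_000]

averages = {
    "mean": statistics.mean(salaries),      # pulled upward by the one large salary
    "median": statistics.median(salaries),  # the middle value when sorted
    "mode": statistics.mode(salaries),      # the most common value
}

for name, value in averages.items():
    print(f"{name}: {value:,}")
```

With these made-up numbers, the mean is 100,000, the median is 35,000, and the mode is 25,000. A recruiter could truthfully advertise an “average salary” of 100,000, while a union could just as truthfully say the “average” worker earns 25,000.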
The book makes a terrific case that people usually depend on visualizations once reading through a page of words and numbers becomes tedious. It was very interesting to see how line charts and bar charts can easily be manipulated, or drawn in a specific manner, to inspire different understandings and reactions while presenting the same information. Opinion polls can be manipulated too, by asking the people whose backgrounds make them more likely to give the desired answer. Percentages can also be distorted or confused by how things are labeled or by disregarding the base amount. The example I found most interesting was the rabbit burger: the restaurant says the burger is 50% rabbit and 50% horse, but it is using one rabbit and one horse in the mix. We assume it’s half-and-half, but clearly it’s more horse than rabbit, since the horse is much bigger. Another case was before-and-after presentations, where the “before” is made to look much worse than it actually was and the “after” much better than it is, through photography, editing, and changes in setup. Finally, the other percentage example I found very interesting was promoting a company based on share ownership: the presentation counts all the employees and all the shares together, when in reality the majority of the shares go to a few executives.
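The rabbit burger arithmetic can be sketched in a few lines of Python. The animal weights below are assumptions chosen only for illustration (they are not from the book); the point is that “50%” by headcount and “50%” by weight are very different claims.

```python
# Hypothetical weights, chosen only to illustrate the point.
rabbit_kg = 2.0
horse_kg = 498.0

# Share of rabbit when counting animals: one rabbit out of two animals.
rabbit_share_by_count = 1 / 2

# Share of rabbit when weighing the meat that actually goes into the mix.
rabbit_share_by_weight = rabbit_kg / (rabbit_kg + horse_kg)

print(f"Rabbit by count:  {rabbit_share_by_count:.1%}")   # 50.0%
print(f"Rabbit by weight: {rabbit_share_by_weight:.1%}")  # 0.4%
```

Under these assumed weights, the “fifty-fifty” burger is less than one percent rabbit. The percentage is technically true, but only for a base (number of animals) that nobody eating the burger cares about.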