How to Make Sense of Health Statistics

Lies, damn lies, and statistics. In his autobiography, Mark Twain attributed the phrase to the 19th-century British Prime Minister Benjamin Disraeli, yet the phrase never appeared in any of Disraeli’s writings or private remarks. The right statistics can reveal powerful truths about the world, but sometimes that truth is difficult to see through all the numbers. Twain himself wrote, “Figures often beguile me,” and figures beguile many other people, too. Statistics are often described as boring and complicated. Statistics, especially the misuse of statistics, can easily confuse the public.

For instance, news stories about health are full of statistics. Eating processed meat makes you 18% more likely to develop colon cancer. Drinking extremely hot beverages makes you 90% more likely to develop esophageal cancer. But what do those numbers really mean? Ideally, you should be able to read a story about a food or behavior and learn how that thing will affect your health. But it rarely works out that way.

If you want to learn as much as you can from health news, you have to know what these numbers mean. So having a general sense of basic statistics helps to make sense of the numbers that get thrown around. Sometimes, though, it’s not your fault that what the numbers mean isn’t exactly clear. Reporters and scientists can make mistakes, too. However, if you have a grasp on how the statistics are supposed to work, you can also learn when the numbers in a story don’t add up.

Tip 1: Know the Sample Size

When it comes to a health study, the sample is the group of people being observed or tested. And the size of that group is a major factor in how confident researchers, and by extension readers, are in a study’s results. The larger the group, the more evidence is gathered, and the more confident you can be in a study’s results.

Think about a hitter in baseball. Every time a hitter steps to the plate, he might hit a home run or he might strike out. But if you watch a single at-bat, it doesn’t really tell you much about how good the hitter is. Even the worst hitters sometimes hit a home run and even the best hitters sometimes strike out. And any hitter can have a hot streak or a cold streak. If you want to know how good a hitter is, you have to watch him over a whole season - hundreds of at-bats.

A lot of health studies are the same way. If you want to know how common a disease is in a group of people, it’s better if you observe lots of those people than just a few. However, it’s not just the size of the sample that is important.
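To put rough numbers on why larger samples inspire more confidence, here is a minimal Python sketch. It is an illustration, not something taken from any study discussed here: the 5% disease rate is hypothetical, the function name is made up for this example, and the calculation uses the standard normal-approximation formula for a 95% margin of error on a proportion.

```python
import math

def margin_of_error(rate, sample_size, z=1.96):
    """Approximate 95% margin of error for an observed rate.

    Uses the normal approximation: z * sqrt(p * (1 - p) / n).
    The rate is a fraction (0.05 means 5%).
    """
    return z * math.sqrt(rate * (1 - rate) / sample_size)

# Hypothetical example: a disease observed in 5% of the sample.
for n in (100, 1_000, 10_000):
    moe = margin_of_error(0.05, n)
    print(f"sample of {n}: 5% plus or minus {moe * 100:.1f} percentage points")
```

Under these assumptions, the uncertainty shrinks from roughly 4.3 percentage points with 100 people to roughly 0.4 with 10,000. Note that the formula assumes a random, representative sample; a large sample drawn from the wrong group is precise about the wrong thing.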

Lisa Sullivan, the Associate Dean of Education for Boston University’s School of Public Health and faculty member in the Department of Biostatistics, told me in an interview that while a larger sample is generally better, “What’s really important is that your sample is representative of the bigger group.” For instance, if a study wants to measure the rate of breast cancer in American women, the sample has to look like all American women. According to the American Cancer Society, white women are more likely to get breast cancer than Hispanic women. So if a study’s sample contained only white women or only Hispanic women, the results would be different than if the sample had a representative mix of races in America.

This also goes the other way. Sometimes a study wants to establish the rate of a disease in a very specific population. Sebastien Haneuse, an Associate Professor of Biostatistics at Harvard, told me in an interview, “If you want to say something about Alzheimer’s in the elderly, and you’ve got a twenty-year-old in front of you, that twenty-year-old is not going to be particularly representative of the elderly.”

Tip 2: Understand Relative and Absolute Risk

Recently, CNN’s website ran a story with the following headline: “Drinking very hot tea almost doubles risk of cancer, new study says.” According to the article by Nina Avramova, drinking tea hotter than 140°F raised the risk of esophageal cancer by 90%. That sounds scary, but what does it really mean?

When a story or study tells you that a food or behavior raises your risk of getting something like cancer by a certain percentage, that is known as the relative risk. Sullivan told me that’s only half the story. She said that in order to understand relative risk, “You really want to know what is the baseline.” That baseline is known as absolute risk, or the chance that you will get a particular medical condition over a specified period of time, such as one year or your lifetime. The CNN article never said how often people get esophageal cancer in general. Without that information, it’s impossible to know how large a risk something really is.

Let’s compare the two cancer risks mentioned earlier. According to the International Agency for Research on Cancer of the World Health Organization, eating processed meat increases your risk of getting colon cancer by 18% (the relative risk). It doesn’t sound that bad on its own. But colon cancer is one of the most common forms of cancer. According to the American Cancer Society, about 5% of people develop colon cancer during their lifetimes (the absolute risk). Eighteen percent of 5% is 0.9%. Therefore, eating processed meat raises your absolute risk of getting colon cancer by about 1 percentage point, from about 5% to about 6%.

According to the study in the International Journal of Cancer cited by the CNN article, drinking very hot tea increases your risk of getting esophageal cancer by 90% (the relative risk). Your lifetime risk of getting esophageal cancer varies by sex (one in 132 for men, one in 455 for women according to the American Cancer Society), but for the overall population it is about 0.5% (the absolute risk). So if drinking very hot tea nearly doubles your chances of getting esophageal cancer, you have increased your absolute risk by a little under half a percentage point, from about 0.5% to about 1%. That is about half as much as eating processed meat increases your chances of getting colon cancer.
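The arithmetic behind this comparison can be sketched in a few lines of Python. The baseline and relative-risk figures are the ones quoted above; the function name is made up for this illustration, and the calculation assumes the simplest possible model, in which the relative increase applies directly to the population-wide lifetime risk.

```python
def absolute_increase(baseline_risk, relative_increase):
    """Return the rise in absolute risk as a fraction.

    baseline_risk: lifetime risk without the behavior (0.05 means 5%).
    relative_increase: reported relative increase (0.18 means "18% more likely").
    A result of 0.01 means one percentage point.
    """
    return baseline_risk * relative_increase

# Figures quoted in the article.
meat = absolute_increase(0.05, 0.18)    # processed meat and colon cancer
tea = absolute_increase(0.005, 0.90)    # very hot tea and esophageal cancer

print(f"Processed meat: {0.05:.2%} -> {0.05 + meat:.2%}")
print(f"Very hot tea:   {0.005:.2%} -> {0.005 + tea:.2%}")
print(f"Tea adds {tea / meat:.0%} as much absolute risk as processed meat")
```

Run as written, this shows the 18% relative increase adding 0.9 percentage points and the 90% relative increase adding 0.45, which is why the scarier-sounding headline number turns out to be the smaller absolute change.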

The point is not that either of these particular behaviors is that dangerous, but that seeing only a relative risk can be very misleading - 90% seems much higher than 18%, yet it adds only about half as much absolute risk. And if you like eating processed meat or drinking very hot tea, now you have a better idea of how likely that behavior is to affect you. Also, though 1 percentage point doesn’t sound like a big number, if you have more than one bad eating habit, the risks can stack up quickly. Scientists working on the Global Burden of Disease Study found that in 2017 about 11 million people died from conditions related to poor diet. That’s about 1 in 5 deaths worldwide for the year.

So when you see a news story reporting only the relative risk of some behavior for developing some condition, it’s not telling you much. Without the absolute risk to compare it to, you have no idea how likely that behavior is to result in that condition.

Also these risks are not distributed equally over the population. “Relative risk does not necessarily speak to what will happen to you, or to me, or to any given individual,” Haneuse said. “That number summarizes what is happening across everybody.” That’s why someone can engage in risky behavior like smoking and live into their 90s, or someone can exercise every day, eat right, and still have a heart attack in their 50s.

Tip 3: Understand That Contradiction Is Part of the Scientific Process

One of the greatest sources of confusion when it comes to health stories is how often they seem to contradict each other. One week a food is great for you, and the next week it’s not. Part of the problem is that science and the news don’t always mix. 

Scientists expect to be wrong from time to time. Sullivan says, “No one study is definitive.” So a contradictory study might not even be a sign of bad science. It may simply reflect the statistical variation you should expect when multiple studies are carried out on the same topic. But news organizations have a different way of doing things. Journalists report new and relevant information to their readers. A study that contradicts a previous study seems much more interesting than a study that just confirms something we already know. “Statistics can be very powerful,” Sullivan said, “but you have to take that into context. It also means that sometimes we repeat studies because we want to build a body of evidence.”

And even if two experiments are very similar, they are studying different samples. Haneuse said, “If you did a study and recruited a thousand people, and then I did the same study and I also recruited a thousand people from the same population that you did, we would get different answers because they are different people.”

Think about that baseball hitter again. While there is a lot of variation in the outcome of a single at-bat, that variation is not all due to chance. That hitter is going to face a lot of different pitchers over the course of a season, and those pitchers are going to have different ability levels. So if you cut the hitter’s season up into chunks, by chance you might get a chunk where he faces a lot of good pitchers in a row and hits poorly, or you might get a chunk where he faces a lot of bad pitchers in a row and hits well.

The same is true for study samples. Since risk isn’t distributed equally among individuals, a researcher might get a sample of people who are more prone or less prone to a condition just by chance. The chance goes down if you have a larger sample, but it’s still possible.
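A small simulation makes this concrete. The Python sketch below is a simplified illustration, not a model of any real study: it assumes every person has the same hypothetical 5% chance of developing a condition, runs the "same" study 20 times at each sample size, and reports the lowest and highest rates observed. The function names and the random seed are arbitrary choices for this example.

```python
import random

def simulate_study(true_risk, sample_size, rng):
    """Recruit sample_size people; return the share who develop the condition."""
    cases = sum(1 for _ in range(sample_size) if rng.random() < true_risk)
    return cases / sample_size

def spread_of_studies(true_risk, sample_size, n_studies, seed):
    """Repeat the same study n_studies times; return (lowest, highest) observed rate."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    rates = [simulate_study(true_risk, sample_size, rng) for _ in range(n_studies)]
    return min(rates), max(rates)

# Hypothetical true risk of 5%, studied 20 times at each sample size.
for n in (100, 1_000, 10_000):
    low, high = spread_of_studies(0.05, n, n_studies=20, seed=1)
    print(f"sample of {n}: observed rates ranged from {low:.1%} to {high:.1%}")
```

Even though every simulated study draws from the same population with the same underlying risk, the results disagree with one another. The disagreement narrows as the sample grows, but it does not disappear.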

*

Statistics are difficult. It’s easy to feel confused and frustrated when numbers get thrown at you. That’s okay. In his novel Contarini Fleming, one thing Benjamin Disraeli did write was, “Never apologize for showing feeling, my friend. Remember that when you do so, you apologize for truth.”

Source List

Avramova, Nina. “Drinking very hot tea almost doubles risk of cancer, new study says.” CNN, March 20, 2019. https://www.cnn.com/2019/03/20/health/hot-tea-linked-to-higher-cancer-risk-study-intl/index.html

“Breast Cancer Risk Factors You Cannot Change.” American Cancer Society. Accessed April 27, 2019. https://www.cancer.org/cancer/breast-cancer/risk-and-prevention/breast-cancer-risk-factors-you-cannot-change.html

Twain, Mark. Autobiography, Volume I. Berkeley, Los Angeles and London: University of California Press, 2010, p. 228.

Disraeli, Benjamin. Contarini Fleming; a Psychological Autobiography. Harper, 1832.

“Global, Regional, and National Incidence, Prevalence, and Years Lived with Disability for 354 Diseases and Injuries for 195 Countries and Territories, 1990–2017: a Systematic Analysis for the Global Burden of Disease Study 2017.” The Lancet, vol. 392, no. 10159, 2018, pp. 1789–1858.

Haneuse, Sebastien. Phone interview. February 27, 2019.

“IARC Monographs Evaluate Consumption of Red and Processed Meat.” World Food Regulation Review, vol. 25, no. 6, 2015, p. 30. https://www.iarc.fr/wp-content/uploads/2018/07/pr240_E.pdf

“Key Statistics for Esophageal Cancer.” American Cancer Society. Accessed April 27, 2019. https://www.cancer.org/cancer/esophagus-cancer/about/key-statistics.html

“Lies, Damned Lies and Statistics.” The University of York, Department of Mathematics, July 19, 2012. https://www.york.ac.uk/depts/maths/histstat/lies.htm

Simon, Stacy. “World Health Organization Says Processed Meat Causes Cancer.” American Cancer Society, Oct. 26, 2015. https://www.cancer.org/latest-news/world-health-organization-says-processed-meat-causes-cancer.html

Sullivan, Lisa. Phone interview. February 14, 2019.