Bayes' Theorem Visualization in R

An article for those who are done with formulas and long texts

Rafaela Pinter
5 min read · May 19, 2021

Well, Statistical Inference is nothing but applied logic, but it can be tricky sometimes. Once things are not linear anymore, our brain struggles to make logical associations. For a long time I felt overwhelmed with all those formulas and failed to picture all the scenarios in my head. However, once I started to visualize the problems through graphs, I started to understand what I was doing.


And that's why I'm here today. I'm going to show you examples of how we can solve questions using (or not using) the Bayes' Theorem. You will see how easy it is to get to the answer once you take a glimpse at the graph. Thus, my goal here is to solve the problems without any rigid formula, so you get the point of why it works.

Introduction

For these exercises, let's suppose that we are studying the weight distribution of 10,000 kids, and that it follows a normal distribution with a mean (m) of 24.5 kg and a standard deviation (sd) of 1.5 kg.


We can now begin to set our variables:

Setting up our variables
The weights of 10,000 kids
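The original setup code appears as a screenshot, so here is a minimal R sketch of what it might look like (the variable names m, s, and n are my own choices):

```r
# Population parameters for the kids' weights (in kg)
m <- 24.5   # mean
s <- 1.5    # standard deviation
n <- 10000  # population size

# Density curve of the normal distribution N(m, s)
curve(dnorm(x, mean = m, sd = s), from = m - 4 * s, to = m + 4 * s,
      xlab = "Weight (kg)", ylab = "Density",
      main = "The weights of 10,000 kids")
```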

This is our main graph. Looks nice, right? As a reminder, this graph shows the probability density for each weight, and the area below the curve across the whole x-axis is equal to 1.

Question #1

If we define as "normal range" a range of symmetrical weight values around the mean that corresponds to the probability of 0.95, what will be the upper limit of the range?

Easy. We need to find two weights equally distant from the mean such that the area between them contains 95% of the chart's area. In other words, if we keep randomly choosing kids' weights, the values will fall within our range 95% of the time.

As the distribution is symmetrical around the mean, the other 5% is split between both sides: 2.5% on the extreme right and 2.5% on the extreme left. We can calculate the limits with the quantiles q=0.975 and q=0.025, respectively. Let's code.

Code for question #1
We are calculating the red-striped area, which corresponds to 95% of the total area.
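Since the code screenshot doesn't survive here, a minimal R sketch of this calculation could be:

```r
m <- 24.5  # mean weight (kg)
s <- 1.5   # standard deviation (kg)

# Symmetrical 95% range: 2.5% in each tail
lower <- qnorm(0.025, mean = m, sd = s)
upper <- qnorm(0.975, mean = m, sd = s)
round(c(lower, upper), 1)  # 21.6 and 27.4
```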

So, for our calculations, the symmetrical upper limit is 27.4 kg. If you are curious, the lower limit is 21.6 kg. That means that if we integrate between 21.6 and 27.4, we have an area of 0.95.

Question #2

If a child has a weight that deviates more than 2.2 standard deviations from the mean, their weight value is considered atypical, otherwise it is considered normal. Thus, a child's weight of 26.8 kg is considered: 0-normal or 1-atypical?

We actually don't need that many calculations here. To answer this question, we only need to check whether 26.8 kg lies between our mean minus 2.2 times our standard deviation and our mean plus 2.2 times our standard deviation. If the range contains 26.8 kg, the weight is normal; otherwise, it is atypical. Let's code.

Code for question #2
We are checking if 26.8 is in the red-striped area that corresponds to a deviation of 2.2 standard deviations from the mean.
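The original code is a screenshot, so here is a hedged R sketch of the check (variable names are my own):

```r
m <- 24.5  # mean weight (kg)
s <- 1.5   # standard deviation (kg)
w <- 26.8  # the child's weight (kg)

# Boundaries: mean plus/minus 2.2 standard deviations
lower <- m - 2.2 * s  # 21.2
upper <- m + 2.2 * s  # 27.8

# 0 = normal, 1 = atypical
as.integer(!(w >= lower & w <= upper))  # 0, i.e. normal
</imports>
```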

As the arrow indicates, 26.8 kg is right in between our boundaries. That means the kid's weight is normal!

Question #3

What is the probability of randomly selecting a child over 21.5 kg knowing that the value is below 23.8 kg?

Now we're talking about the Bayes' Theorem!

How do we know it? We have a clue that updates our universe of possibilities: we know that the value is below 23.8 kg. Thus, it is impossible for the weight to be 24 kg, for example. We must now update our logic, but look at the graph first.

Code for question #3
We are calculating the proportion of the red-striped to the blue area.
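As the code screenshot is missing, a minimal R sketch of the conditional probability might be:

```r
m <- 24.5  # mean weight (kg)
s <- 1.5   # standard deviation (kg)

# P(X > 21.5 | X < 23.8) = P(21.5 < X < 23.8) / P(X < 23.8)
blue <- pnorm(23.8, mean = m, sd = s)  # new universe: P(X < 23.8)
red  <- pnorm(23.8, mean = m, sd = s) - pnorm(21.5, mean = m, sd = s)
red / blue  # about 0.93
```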

As it is impossible for the weight to be more than 23.8 kg, it does not make sense to keep those possibilities in our calculations. We are 100% sure that the value will be 23.8 kg or less, and 100% sure that the blue area will contain our answer. That is why our whole universe of possibilities gets updated in the code.

In the code, we calculate the proportion of the red-striped area to the area in blue (which is our new universe of possibilities). This fraction gives us the probability of randomly choosing a kid that weighs 21.5 kg or more, given that the weight is below 23.8 kg.

Question #4

What is the weight which 80% of children are above?

Watch out: it's not the 0.8 quantile. It's the "1 minus 0.8" quantile, because 80% are above that weight, so 20% are below. See? Easy but tricky.

For this question, we only need to calculate the 0.2 quantile and we're done. Let's end this.

Code for question #4
We are calculating the red-striped area. It corresponds to 80% of the total area.
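Again, the code is a screenshot in the original, so here is a minimal R sketch:

```r
m <- 24.5  # mean weight (kg)
s <- 1.5   # standard deviation (kg)

# 80% of kids are above this weight, so 20% are below: the 0.2 quantile
qnorm(0.2, mean = m, sd = s)  # about 23.2 kg
```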

Question #5

What is the probability of randomly selecting a child over 25.5 kg knowing that the value is over 22.0 kg?

Another Bayes'! Now our clue is that all weights are above 22.0 kg, so we select values above 22.0 kg as our new universe of possibilities. Go ahead and check the blue area on the graph.

Code for question #5
We are calculating the proportion of the red-striped area to the blue area.
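With the code screenshot missing, this conditional probability could be sketched in R as:

```r
m <- 24.5  # mean weight (kg)
s <- 1.5   # standard deviation (kg)

# P(X > 25.5 | X > 22.0) = P(X > 25.5) / P(X > 22.0)
blue <- 1 - pnorm(22.0, mean = m, sd = s)  # new universe: P(X > 22.0)
red  <- 1 - pnorm(25.5, mean = m, sd = s)
red / blue  # about 0.265
```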

The calculation shows us that our probability is 0.265. In summary, if we randomly choose a kid, there is a 26.5% chance that their weight is above 25.5 kg, knowing that they are heavier than 22.0 kg.

Question #6

What is the probability of randomly selecting a child weighing between 21.5 and 25.2 kg?

Note that we don't have any prior knowledge this time. We don't have any clue. All we have is a range of weights that we need to study. Let's go.

Code for question #6
We are calculating the red-striped area.
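Since the code is a screenshot in the original, a minimal R sketch of the unconditional probability is:

```r
m <- 24.5  # mean weight (kg)
s <- 1.5   # standard deviation (kg)

# P(21.5 < X < 25.2): no conditioning, just the area between the two weights
pnorm(25.2, mean = m, sd = s) - pnorm(21.5, mean = m, sd = s)  # about 0.657
```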

Thus, if we choose one kid from our population of 10,000, there is a 65.7% chance their weight is between 21.5 and 25.2 kg.

Acknowledgments

I would like to personally thank Prof. Wagner Bonat, PhD, for the excellent course taught at the Data Science & Big Data Specialization and for agreeing to share this content with people around the world.


Rafaela Pinter

Bioprocess Engineer | Data Science enthusiast | Bassist