Confidence intervals: why do we even have them? The reason is that we usually don’t have all the data from a population. I downloaded one month of data (December 2024) about electric vehicle (EV) public charge points (CPs) in the United Kingdom (UK). The dataset had 2,848 rows, and for this post I treat those rows as the entire population. That means I can calculate measures directly on the entire population. It also makes this a good way to demonstrate what happens if you take just a sample from that population and construct confidence intervals to estimate the true measure, which is what I will do in this blog post.
What are Confidence Intervals?
Consider the term “confidence interval”, especially the word “interval”. Interval literally means a range between one number and another. So confidence intervals represent a range, but what is it a range of? It’s basically a range you construct when you only have a sample of data from a population, built so that you can be very confident the range contains the true population measure.
In our CP data, one of the variables is Number_of_sessions. This indicates the number of charging sessions each CP had that month. I calculated the true population mean, and it was 39.21. So this means in December 2024, each CP had an average of about 39 charging sessions.
Now, let’s see what happens when we take just a sample of data and try to estimate the population level mean Number_of_sessions by using confidence intervals. I literally did that on a spreadsheet and I’ll show you the result.
Confidence Intervals at Different Sample Sizes
I chose to make 95% confidence intervals, so I can say I’m 95% confident the true population measure falls in the range I will calculate. Then I chose random samples of different sizes.
- When I chose a sample of 50 CPs, I got a confidence interval of 25.33 to 52.87, which is a range of 27.54.
- When I chose a sample of 100 CPs, I got a confidence interval of 35.67 to 59.13, which is a range of 23.46.
- When I chose a sample of 150 CPs, I got a confidence interval of 33.68 to 50.79, which is a range of 17.12.
- When I chose a sample of 200 CPs, I got a confidence interval of 35.02 to 49.87, which is a range of 14.85.
- When I chose a sample of 250 CPs, I got a confidence interval of 34.32 to 46.98, which is a range of 12.66.
- When I chose a sample of 300 CPs, I got a confidence interval of 34.04 to 45.15, which is a range of 11.11.
It’s important to see the pattern here. Each time I choose a larger sample, the range gets smaller, so you can be more precise about where the true population measure lies. And luckily, all of my confidence intervals contained the true population mean of 39.21. Because I am constructing 95% confidence intervals, in theory about 5% of intervals built this way will not contain the true population mean.
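The pattern above can be checked with a short simulation. This is only a minimal sketch in Python, not the spreadsheet from this post: the population is made-up, exponentially distributed session counts standing in for the 2,848 real rows, and the names `confidence_interval`, `population`, `widths`, and `coverage` are illustrative. It assumes the sample SD (n - 1 denominator) and the Z score of 1.96.

```python
import math
import random
import statistics

Z_95 = 1.96  # Z score for a 95% confidence interval


def confidence_interval(sample, z=Z_95):
    """Return (lower, upper) limits of a Z-based interval for the mean."""
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)            # sample SD, n - 1 denominator
    me = z * sd / math.sqrt(len(sample))     # margin of error
    return mean - me, mean + me


# Hypothetical stand-in population: 2,848 made-up session counts.
# These are NOT the real charge point data.
rng = random.Random(42)
population = [int(rng.expovariate(1 / 39)) for _ in range(2848)]
true_mean = statistics.mean(population)

# Part 1: how wide is the interval at each sample size?
widths = {}
for n in (50, 100, 150, 200, 250, 300):
    ll, ul = confidence_interval(rng.sample(population, n))
    widths[n] = ul - ll
    print(f"n={n}: {ll:.2f} to {ul:.2f}, range {ul - ll:.2f}")

# Part 2: how often does a 95% interval contain the true mean?
trials = 2000
hits = 0
for _ in range(trials):
    ll, ul = confidence_interval(rng.sample(population, 100))
    if ll <= true_mean <= ul:
        hits += 1
coverage = hits / trials
print(f"{coverage:.1%} of intervals contained the true mean {true_mean:.2f}")
```

Because each run draws one random sample per size, the range will not necessarily shrink at every single step, but the overall trend should be clear. The coverage figure should land near 95%, and it can sit a little below that when the data are skewed.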
How to Construct Confidence Intervals
Here I explain how to construct a confidence interval, and at the end I show a graphical depiction using our example spreadsheet.

First, you take a sample. In the example spreadsheet, the data are located on the “Main Data” tab, so the sample is specified by a series of cells on that tab.

Next, you calculate the average of your sample from those cells. This is your point estimate, and it sits at the center of the confidence interval.

Next, you calculate the standard deviation (SD) of your sample. You will need this to calculate the margin of error (ME).

After you have the sample average point estimate and the sample SD, you select the Z score associated with how confident you want to be that the true measure is in the range of your confidence interval. We will select the most popular Z score, 1.96, which corresponds to the 95% confidence interval. Also, make note of the sample size you selected.
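If you are curious where 1.96 comes from, here is a small sketch using Python’s standard library. The helper name `z_score` is my own, and the sketch assumes a two-sided interval based on the standard normal distribution.

```python
from statistics import NormalDist


def z_score(confidence):
    """Two-sided Z score for a confidence level such as 0.95."""
    # Split the leftover probability evenly between the two tails.
    tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - tail)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence -> Z = {z_score(level):.3f}")
```

A higher confidence level gives a larger Z score, which in turn gives a wider interval.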

Now that you have the SD, sample size, and Z score selected, you can calculate the ME. The equation is the Z score multiplied by the result of the SD divided by the square root of the sample size. This ME represents the “margin” you will subtract from the point estimate to get the lower level (LL) of the confidence interval, and you will add to the point estimate to get the upper level (UL) of the confidence interval.
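Written as equations, the calculation described above looks like this. The symbols are my own shorthand rather than labels from the spreadsheet: n is the sample size and x-bar is the sample average (the point estimate).

```latex
\[
\mathrm{ME} = Z \times \frac{\mathrm{SD}}{\sqrt{n}}, \qquad
\mathrm{LL} = \bar{x} - \mathrm{ME}, \qquad
\mathrm{UL} = \bar{x} + \mathrm{ME}
\]
```

Because the sample size sits under a square root, you need about four times as many CPs to cut the ME in half. That matches the samples above reasonably well: going from 50 to 200 CPs took the range from 27.54 to 14.85, which is roughly half.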

Finally, we have our ME. We subtract it from the point estimate to get the LL, and add it to the point estimate to get the UL. Now, you consider the range between the LL and the UL – that is where you are 95% confident the true population measure lies.
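The same steps can be sketched in Python. The ten values below are invented for illustration and are not taken from the “Main Data” tab. The sketch assumes the sample SD (n - 1 denominator), which may or may not match the SD function used in the spreadsheet. A sample this small is only here to make the arithmetic easy to follow.

```python
import math
import statistics

# Step 1: take a sample (hypothetical Number_of_sessions values).
sample = [12, 45, 3, 88, 27, 51, 9, 60, 34, 71]
n = len(sample)

# Step 2: the point estimate is the sample average.
point_estimate = statistics.mean(sample)

# Step 3: the sample standard deviation (n - 1 denominator).
sd = statistics.stdev(sample)

# Step 4: the Z score for a 95% confidence interval.
z = 1.96

# Step 5: margin of error = Z * SD / sqrt(n).
me = z * sd / math.sqrt(n)

# Step 6: subtract and add the ME to get the lower and upper levels.
ll = point_estimate - me
ul = point_estimate + me

print(f"Point estimate: {point_estimate:.2f}")
print(f"SD: {sd:.2f}, ME: {me:.2f}")
print(f"95% confidence interval: {ll:.2f} to {ul:.2f}")
```

To use your own data, swap your values into `sample` and the rest follows unchanged.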
