Pie Chart ggplot Style is Surprisingly Hard! Here’s How I Did it

How do you make a pie chart in ggplot2 package in R? It's not that obvious

Pie chart ggplot style is a bad memory for me. I had made pie charts using base R, and it had been no big deal. I also had made time series and other plots with ggplot2, and it was no problem. But when I went to make a pie chart ggplot style for the first time, I found it very confusing!

The situation was my colleague and I had done some research into different types of deeper learning strategies in higher education, and one of them is to have students create types of media. We wanted to have a pie chart displaying the distribution of the types of media that were created in the studies of deeper learning we reviewed. Here are our data:

Type

Percentage

Blog

22%

Digital storytelling

11%

ePortfolio

11%

Podcast

6%

Video

50%

Total

100%

Pie Chart ggplot Style was Much Harder than Base R

As you can imagine, this was supposed to be a very simple project! But it was not. I posted the code on Github for you here. I will show you how I went through it.

First I read in the dataframe I had made from the estimates in the table, and loaded the ggplot2 library.

Circuit board image is a symbol of technology, statistics, data science, and innovation.
piechart_df <- readRDS("piechart_df.rds")
library(ggplot2)

If you run piechart_df to look at it, you will see this:

Circuit board image is a symbol of technology, statistics, data science, and innovation.
  Proportion                 Type
1 0.22222222                 Blog
2 0.11111111 Digital Storytelling
3 0.11111111           ePortfolio
4 0.05555556              Podcast
5 0.50000000                Video

As you can see, the data are represented in terms of proportions. I also wanted to have customized colors, so I made this color vector called pie_colors to refer to later in my ggplot2 coding.

Circuit board image is a symbol of technology, statistics, data science, and innovation.
pie_colors <- c("orangered4","orchid4",
  "palegreen4","paleturquoise4", "palevioletred4")

Pie Chart ggplot Style: There is No “geom_pie” or “geom_circle”

Now, before we look at the code, let me show you the output. This is the final pie chart below.

Of course, I expected there to be a “geom” shape for a circle. There isn’t! It’s geom_bar! Look at the code!

Circuit board image is a symbol of technology, statistics, data science, and innovation.
ggplot(piechart_df, aes("", Proportion, fill = Type)) +
    geom_bar(width = 1, size = 1, color = "white", 
          stat = "identity") +
    coord_polar("y") +
    geom_text(aes(label = paste0(round(Proportion*100,0), "%")), 
              position = position_stack(vjust = 0.5)) +
    labs(x = NULL, y = NULL, fill = NULL, 
         title = "Distribution of Media Conditions Tested") +
    guides(fill = guide_legend(reverse = TRUE)) +
    scale_fill_manual(values = pie_colors) +
    theme_classic() +
    theme(axis.line = element_blank(),
          axis.text = element_blank(),
          axis.ticks = element_blank(),
          plot.title = element_text(hjust = 0.5, 
            color = "saddlebrown"))

So, let’s talk about this code! How do we get geom_bar to be a pie chart? This is how:

  • On the aes line to launch the chart, notice how the x value is set at “”. That’s because it doesn’t matter. It’s a pie chart, so it doesn’t have an x – only a y.
  • The circular style is achieved with geom_bar + coord_polar. The geom_bar line describes a white, unshapen bar, and coord_polar(“y”) is what turns it into a circle.
  • The geom_text line turns the proportions into percent, and adds them as data labels. Notice the position option to move the label a little so it’s not so crowded.
  • labs fills in the labels I want to fill in.
  • guides format the legend. For whatever reason, I wanted the legend listed in reverse of the default, which explains the reverse = TRUE option.
  • Scale_fill_manual tells ggplot2 to use the color vector for the colors we specified and overwrite the default colors.
  • Theme commands after that include theme_classic() (to give it a clean, uncluttered theme), and the theme command to format the plot title with a particular color, and to suppress a lot of elements that look ugly in default on a pie chart.

Updated April 20, 2022. Photograph of pies by Haem85, available here. Added banners March 6, 2023.

Read all of our data science blog posts!

Chi-square Test: Insight from Using Microsoft Excel

Chi-square test is hard to grasp – but doing it in Microsoft Excel can give [...]

Identify Elements of Research in Scientific Literature

Identify elements in research reports, and you’ll be able to understand them much more easily. [...]

Design the Most Useful Time Periods for Your Conversions

Time periods are important when creating a time series visualization that actually speaks to you! [...]

Apply Weights? It’s Easy in R with the Survey Package!

Apply weights to get weighted proportions and counts! Read my blog post to learn how [...]

Make Categorical Variable Out of Continuous Variable

Make categorical variables by cutting up continuous ones. But where to put the boundaries? Get [...]

Remove Rows in R with the Subset Command

Remove rows by criteria is a common ETL operation – and my blog post shows [...]

CDC Wonder for Studying Vaccine Adverse Events: The Shameful State of US Open Government Data

CDC Wonder is an online query portal that serves as a gateway to many government [...]

AI Careers: Riding the Bubble

AI careers are not easy to navigate. Read my blog post for foolproof advice for [...]

Descriptive Analysis of Black Friday Death Count Database: Creative Classification

Descriptive analysis of Black Friday Death Count Database provides an example of how creative classification [...]

Classification Crosswalks: Strategies in Data Transformation

Classification crosswalks are easy to make, and can help you reduce cardinality in categorical variables, [...]

FAERS Data: Getting Creative with an Adverse Event Surveillance Dashboard

FAERS data are like any post-market surveillance pharmacy data – notoriously messy. But if you [...]

Dataset Source Documentation: Necessary for Data Science Projects with Multiple Data Sources

Dataset source documentation is good to keep when you are doing an analysis with data [...]

Joins in Base R: Alternative to SQL-like dplyr

Joins in base R must be executed properly or you will lose data. Read my [...]

NHANES Data: Pitfalls, Pranks, Possibilities, and Practical Advice

NHANES data piqued your interest? It’s not all sunshine and roses. Read my blog post [...]

Color in Visualizations: Using it to its Full Communicative Advantage

Color in visualizations of data curation and other data science documentation can be used to [...]

Defaults in PowerPoint: Setting Them Up for Data Visualizations

Defaults in PowerPoint are set up for slides – not data visualizations. Read my blog [...]

Text and Arrows in Dataviz Can Greatly Improve Understanding

Text and arrows in dataviz, if used wisely, can help your audience understand something very [...]

Shapes and Images in Dataviz: Making Choices for Optimal Communication

Shapes and images in dataviz, if chosen wisely, can greatly enhance the communicative value of [...]

Table Editing in R is Easy! Here Are a Few Tricks…

Table editing in R is easier than in SAS, because you can refer to columns, [...]

R for Logistic Regression: Example from Epidemiology and Biostatistics

R for logistic regression in health data analytics is a reasonable choice, if you know [...]

1 Comments

Connecting SAS to Other Applications: Different Strategies

Connecting SAS to other applications is often necessary, and there are many ways to do [...]

Portfolio Project Examples for Independent Data Science Projects

Portfolio project examples are sometimes needed for newbies in data science who are looking to [...]

Project Management Terminology for Public Health Data Scientists

Project management terminology is often used around epidemiologists, biostatisticians, and health data scientists, and it’s [...]

Rapid Application Development Public Health Style

“Rapid application development” (RAD) refers to an approach to designing and developing computer applications. In [...]

Understanding Legacy Data in a Relational World

Understanding legacy data is necessary if you want to analyze datasets that are extracted from [...]

Front-end Decisions Impact Back-end Data (and Your Data Science Experience!)

Front-end decisions are made when applications are designed. They are even made when you design [...]

Reducing Query Cost (and Making Better Use of Your Time)

Reducing query cost is especially important in SAS – but do you know how to [...]

Curated Datasets: Great for Data Science Portfolio Projects!

Curated datasets are useful to know about if you want to do a data science [...]

Statistics Trivia for Data Scientists

Statistics trivia for data scientists will refresh your memory from the courses you’ve taken – [...]

Management Tips for Data Scientists

Management tips for data scientists can be used by anyone – at work and in [...]

REDCap Mess: How it Got There, and How to Clean it Up

REDCap mess happens often in research shops, and it’s an analysis showstopper! Read my blog [...]

GitHub Beginners in Data Science: Here’s an Easy Way to Start!

GitHub beginners – even in data science – often feel intimidated when starting their GitHub [...]

ETL Pipeline Documentation: Here are my Tips and Tricks!

ETL pipeline documentation is great for team communication as well as data stewardship! Read my [...]

Benchmarking Runtime is Different in SAS Compared to Other Programs

Benchmarking runtime is different in SAS compared to other programs, where you have to request [...]

End-to-End AI Pipelines: Can Academics Be Taught How to Do Them?

End-to-end AI pipelines are being created routinely in industry, and one complaint is that academics [...]

Referring to Columns in R by Name Rather than Number has Pros and Cons

Referring to columns in R can be done using both number and field name syntax. [...]

The Paste Command in R is Great for Labels on Plots and Reports

The paste command in R is used to concatenate strings. You can leverage the paste [...]

Coloring Plots in R using Hexadecimal Codes Makes Them Fabulous!

Recoloring plots in R? Want to learn how to use an image to inspire R [...]

Adding Error Bars to ggplot2 Plots Can be Made Easy Through Dataframe Structure

Adding error bars to ggplot2 in R plots is easiest if you include the width [...]

AI on the Edge: What it is, and Data Storage Challenges it Poses

“AI on the edge” was a new term for me that I learned from Marc [...]

Pie Chart ggplot Style is Surprisingly Hard! Here’s How I Did it

Pie chart ggplot style is surprisingly hard to make, mainly because ggplot2 did not give [...]

Time Series Plots in R Using ggplot2 Are Ultimately Customizable

Time series plots in R are totally customizable using the ggplot2 package, and can come [...]

Data Curation Solution to Confusing Options in R Package UpSetR

Data curation solution that I posted recently with my blog post showing how to do [...]

Making Upset Plots with R Package UpSetR Helps Visualize Patterns of Attributes

Making upset plots with R package UpSetR is an easy way to visualize patterns of [...]

6 Comments

Making Box Plots Different Ways is Easy in R!

Making box plots in R affords you many different approaches and features. My blog post [...]

Convert CSV to RDS When Using R for Easier Data Handling

Convert CSV to RDS is what you want to do if you are working with [...]

GPower Case Example Shows How to Calculate and Document Sample Size

GPower case example shows a use-case where we needed to select an outcome measure for [...]

Querying the GHDx Database: Demonstration and Review of Application

Querying the GHDx database is challenging because of its difficult user interface, but mastering it [...]

Variable Names in SAS and R Have Different Restrictions and Rules

Variable names in SAS and R are subject to different “rules and regulations”, and these [...]

Referring to Variables in Processing Data is Different in SAS Compared to R

Referring to variables in processing is different conceptually when thinking about SAS compared to R. [...]

Counting Rows in SAS and R Use Totally Different Strategies

Counting rows in SAS and R is approached differently, because the two programs process data [...]

Native Formats in SAS and R for Data Are Different: Here’s How!

Native formats in SAS and R of data objects have different qualities – and there [...]

SAS-R Integration Example: Transform in R, Analyze in SAS!

Looking for a SAS-R integration example that uses the best of both worlds? I show [...]

Dumbbell Plot for Comparison of Rated Items: Which is Rated More Highly – Harvard or the U of MN?

Want to compare multiple rankings on two competing items – like hotels, restaurants, or colleges? [...]

2 Comments

Data for Meta-analysis Need to be Prepared a Certain Way – Here’s How

Getting data for meta-analysis together can be challenging, so I walk you through the simple [...]

Sort Order, Formats, and Operators: A Tour of The SAS Documentation Page

Get to know three of my favorite SAS documentation pages: the one with sort order, [...]

Confused when Downloading BRFSS Data? Here is a Guide

I use the datasets from the Behavioral Risk Factor Surveillance Survey (BRFSS) to demonstrate in [...]

2 Comments

Doing Surveys? Try my R Likert Plot Data Hack!

I love the Likert package in R, and use it often to visualize data. The [...]

3 Comments

I Used the R Package EpiCurve to Make an Epidemiologic Curve. Here’s How It Turned Out.

With all this talk about “flattening the curve” of the coronavirus, I thought I would [...]

Which Independent Variables Belong in a Regression Equation? We Don’t All Agree, But Here’s What I Do.

During my failed attempt to get a PhD from the University of South Florida, my [...]

Pie chart ggplot style is surprisingly hard to make, mainly because ggplot2 did not give us a circle shape to deal with. But I explain how to get around it in my blog pot.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Verified by MonsterInsights