Category Archives: Data Science

Posts about data science topics.

Dumbbell Plot for Comparison of Rated Items: Which is Rated More Highly – Harvard or the U of MN?

This is an example of a dumbbell plot from the ggalt package in R that you can also use in RStudio

Want to compare multiple rankings on two competing items – like hotels, restaurants, or colleges? I show you an example of using a dumbbell plot for comparison in R with the ggalt package for this exact use case!

Data for Meta-analysis Need to be Prepared a Certain Way – Here’s How

This is the forest plot resulting from analysis with the open source statistical software R using the rmeta package.

Getting data for meta-analysis together can be challenging, so I walk you through the simple steps I take, starting with the scientific literature, and ending with a gorgeous and evidence-based forest plot!

Sort Order, Formats, and Operators: A Tour of The SAS Documentation Page

SAS software sorting a to z or using arithmetic operators

Get to know three of my favorite SAS documentation pages: the one with sort order, the one that lists all the SAS formats, and the one that explains all the SAS operators and expressions!

Confused when Downloading BRFSS Data? Here is a Guide

You can download public data from health surveillance surveys. However, you have to know how to locate it on the web site.

I use datasets from the Behavioral Risk Factor Surveillance System (BRFSS) for demonstrations in many of my data science tutorials. The BRFSS datasets are free and available to the public – but they are kind of buried on the web site. This blog post serves as a “map” to help you find them!

Doing Surveys? Try my R Likert Plot Data Hack!

The Likert package in R can visualize categorical data.

I love the Likert package in R, and use it often to visualize data. The problem is that sometimes, I have sparse data, and this can cause problems with the package. This blog post shows you a workaround, and also, a way to format the final plot that I think looks really great!

I Used the R Package EpiCurve to Make an Epidemiologic Curve. Here’s How It Turned Out.

Epidemiologic Curve of 2015 Middle East Respiratory Syndrome (MERS) Outbreak Using R EpiCurve Package

With all this talk about “flattening the curve” of the coronavirus, I thought I would get into the weeds about what curve we are talking about when we say that. We are talking about what’s called an epidemiologic curve, or epicurve for short. And to demonstrate what an epicurve is and what it means, I […]

Which Independent Variables Belong in a Regression Equation? We Don’t All Agree, But Here’s What I Do.

Little table of X Y Pairs with a Regression Diagram with Least Squares Line

During my failed attempt to get a PhD from the University of South Florida, my doctor friend asked me one day to build a linear regression model using a small dataset he had collected from a lab. He had measurements of these “new” chemical messengers called cytokines – so that definitely dates this story! I […]
