Removing rows by criteria is a common ETL operation, and my blog post shows you how to do it using the subset command.
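As a quick taste of the idea, here is a minimal sketch of dropping rows with base R's `subset()`. The `patients` data frame and its columns are made up for illustration and are not taken from the post itself.

```r
# Hypothetical example data (names and values are assumptions for illustration)
patients <- data.frame(
  id    = 1:5,
  age   = c(34, 17, 52, 15, 41),
  state = c("MA", "NY", "MA", "CA", "NY"),
  stringsAsFactors = FALSE
)

# Keep only the rows that meet the criterion; everything else is removed
adults <- subset(patients, age >= 18)

# Criteria can be combined with & (and) or | (or)
adults_not_ny <- subset(patients, age >= 18 & state != "NY")

# Compare row counts before and after to confirm what was removed
nrow(patients)
nrow(adults)
nrow(adults_not_ny)
```

Checking the row count before and after each step is a cheap way to confirm that the criteria removed what you expected.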
Tag Archives: peer-reviewed literature
Joins in base R must be executed properly or you will lose data. Read my tutorial on how to correctly execute left joins in base R.
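For a sense of what can go wrong, here is a small sketch using base R's `merge()`. The two data frames are hypothetical, and this is only one way to set up a left join, not necessarily the approach in the tutorial.

```r
# Hypothetical example tables (assumptions for illustration)
patients <- data.frame(
  id   = c(1, 2, 3, 4),
  name = c("Ann", "Bob", "Cal", "Dee"),
  stringsAsFactors = FALSE
)
visits <- data.frame(
  id         = c(1, 2, 2, 5),
  visit_cost = c(100, 50, 75, 20)
)

# Default merge() is an inner join: patients without visits are dropped
inner <- merge(patients, visits, by = "id")

# all.x = TRUE makes it a left join: every row of patients is kept,
# and unmatched rows get NA in the columns that come from visits
joined <- merge(patients, visits, by = "id", all.x = TRUE)

# Sanity checks: no patient lost, and count the rows with no match
all(patients$id %in% joined$id)
unmatched <- sum(is.na(joined$visit_cost))
```

Note that the left join can return more rows than the left table when a key repeats in the right table, so it is worth checking for both lost and duplicated keys.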
Text and arrows in dataviz, if used wisely, can help your audience understand something very abstract, like a data pipeline. Read my blog post for tips on choosing images for your data visualizations!
Ask me anything about data science or public health every month! Subscribe to my “Ask Me Anything” membership, and get all your questions answered in real time!
Wiley’s recent predatory behavior toward a colleague’s scientific manuscript makes me want to avoid publishing in their journals. Read about our experience.
The paste command in R is used to concatenate strings. You can leverage the paste command to make refreshable label objects for reports and plots, as I describe in my blog post.
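Here is a rough sketch of what a refreshable label could look like. The variable names and values are invented for illustration; the point is that the labels are rebuilt whenever the inputs at the top change.

```r
# Hypothetical reporting parameters; update these and rerun to refresh labels
report_month <- "March"
report_year  <- 2024
n_patients   <- 128

# paste() joins its arguments with a space by default
plot_title <- paste("Clinic visits,", report_month, report_year)

# paste0() joins with no separator, which gives full control over spacing
plot_subtitle <- paste0("n = ", n_patients, " patients")

print(plot_title)
print(plot_subtitle)
```

The resulting strings can then be passed anywhere a label is expected, such as the `main` argument of `plot()`.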
“What is the ONC?” is what I used to ask before I realized it involves health technology. Although ONC just means “Office of the National Coordinator”, this agency is now known as HealthIT.gov, as I explain in my blog post.
Time series plots in R are totally customizable using the ggplot2 package, and can come out looking clean and sharp. However, you usually end up fighting with x-axis formatting and other options, as I explain in my blog post.
Native formats in SAS and R of data objects have different qualities – and there are reasons behind these differences. Learn about them in this blog post!