Tag Archives: college student

Classification Crosswalks: Strategies in Data Transformation

What if you have too many categories in a categorical variable? Your cardinality is too high for a chi-square analysis.

Classification crosswalks are easy to make, and can help you reduce cardinality in categorical variables, making for insightful data science portfolio projects with only descriptive statistics. Read my blog post for guidance!

NHANES Data: Pitfalls, Pranks, Possibilities, and Practical Advice

If you are interested in population-level surveillance data, you might have thought about using NHANES data in portfolio projects.

NHANES data piqued your interest? It’s not all sunshine and roses. Read my blog post to see the pitfalls of NHANES data, and get practical advice about using them in a project.

Shapes and Images in Dataviz: Making Choices for Optimal Communication

If you use good judgment in choosing shapes and images to add to your data visualizations, your audience will be enlightened.

Shapes and images in dataviz, if chosen wisely, can greatly enhance the communicative value of the visualization. Read my blog post for tips in selecting shapes for data visualizations!

Portfolio Project Examples for Independent Data Science Projects

Are you a data scientist who is interested in doing independent portfolio projects to sharpen your skills? Then I strongly suggest you get a coach or a mentor.

Portfolio project examples are sometimes needed for newbies in data science who are looking to complete independent projects. This blog post provides some great examples of independent projects you can do with datasets available online!

Internship Strategy for Data Science: Download our Guide!

In data science, you can learn applied skills by being part of an internship at a noted organization.

Internship strategy for data science is not obvious, and even if you are in a college program, the program often expects you to find your own internship. Download our internship strategy guide and get the experience you want!

Understanding Legacy Data in a Relational World

Data systems came into use in the 1960s and 1970s, but these were flat systems, usually running on IBM mainframes.

Understanding legacy data is necessary if you want to analyze datasets that are extracted from old systems. This knowledge is still relevant, as we still use these old systems today, as I discuss in my blog post.

REDCap Mess: How it Got There, and How to Clean it Up

REDCap mess on your hands? The REDCap designers made the application so loosey-goosey that you can really program yourself into a messy corner if you don't plan well.

REDCap mess happens often in research shops, and it’s an analysis showstopper! Read my blog post to learn my secret tricks for breaking through the barriers and getting on with data analytics!

US Public Health Alphabet Soup Explained: What is the NACCHO?

You may have wondered if public health workers who are employed by local public health departments have a professional society devoted just to them. That's NACCHO.

You may already know that NACCHO is NOT cheese – but what is it? It’s a professional society for local public health officials. Read my blog post to learn what NACCHO does, and who it serves.

Data Curation Solution to Confusing Options in R Package UpSetR

It is possible to use data curation to solve the problem of a confusing vector containing options.

The data curation solution I posted recently, in my blog post showing how to make UpSet plots in R using the UpSetR package, was itself kind of a masterpiece. Therefore, I thought I’d dedicate this blog post to explaining how and why I did it.

Native Formats in SAS and R for Data Are Different: Here’s How!

Why use particular data formats for different programming languages in statistics? Because the programs can then process the data faster and with more accuracy.

Native formats in SAS and R of data objects have different qualities – and there are reasons behind these differences. Learn about them in this blog post!
