Tag Archives: data compliance

Dataset Source Documentation: Necessary for Data Science Projects with Multiple Data Sources

If you work on a big data project with multiple source datasets, you run the risk of forgetting exactly how you blended them together.

Dataset source documentation is worth keeping whenever your analysis draws on more than one dataset. Read my blog post to learn how easy it is to throw together some quick dataset source documentation in PowerPoint so that you don’t forget what you did.

Joins in Base R: Alternative to SQL-like dplyr

In base R, you can execute SQL-like joins, as long as you use the correct code syntax.

Joins in base R must be executed properly or you will lose data. Read my tutorial on how to correctly execute left joins in base R.
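
As a minimal sketch of the kind of data loss at stake (this is not necessarily the approach the tutorial takes), base R's `merge()` performs an inner join by default, so rows without a match are silently dropped unless you pass `all.x = TRUE`. The data frames `patients` and `labs` and their columns are made up for illustration.

```r
# Hypothetical example data: three patients, but lab results for only two
patients <- data.frame(id = c(1, 2, 3), age = c(34, 51, 47))
labs     <- data.frame(id = c(1, 3), glucose = c(92, 110))

# Default merge() is an inner join: patient 2 disappears
inner <- merge(patients, labs, by = "id")

# all.x = TRUE keeps every row of the left table (a left join);
# patient 2 is retained with NA for glucose
left <- merge(patients, labs, by = "id", all.x = TRUE)

nrow(inner)  # 2
nrow(left)   # 3
```

Comparing `nrow()` before and after a join is a quick way to check that no rows were lost.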

Front-end Decisions Impact Back-end Data (and Your Data Science Experience!)

How the front-end and back-end are connected can impact how data are stored in the application. As a result, when you extract the data, you can run into data quality problems that were caused by the front-end.

Front-end decisions are made when applications are designed. They are even made when you design a survey in SurveyMonkey. What health data analysts often don’t realize is that these decisions have a profound impact on the quality and accuracy of the data that are collected through these front-ends, which is the focus of this blog […]
