Category Archives: Data Science

Posts about data science topics.

Rapid Application Development Public Health Style

If you work on front-ends or back-ends of health applications, you are probably already familiar with the concepts of Agile and rapid application development.

“Rapid application development” (RAD) refers to an approach to designing and developing computer applications. In public health and healthcare, we are not taught about application development – but it’s good for us to learn about it, since we have to deal with data from health applications. My blog post talks about the RAD approach I […]

Understanding Legacy Data in a Relational World

Data systems came into use in the 1960s and 1970s, but these were flat systems, usually running on IBM mainframes.

Understanding legacy data is necessary if you want to analyze datasets extracted from old systems. This knowledge remains relevant because we still use these old systems today, as I discuss in my blog post.

Front-end Decisions Impact Back-end Data (and Your Data Science Experience!)

How the front-end and back-end are connected affects how data are stored in the application. As a result, the data you extract can carry quality problems that originated in the front-end.

Front-end decisions are made when applications are designed. They are even made when you design a survey in SurveyMonkey. What health data analysts often don’t realize is that these decisions have a profound impact on the quality and accuracy of the data that are collected through these front-ends, which is the focus of this blog […]

Reducing Query Cost (and Making Better Use of Your Time)

Slow queries can happen in SAS, R, Python, SQL, or any other language you use to query data. These slow queries have a cost.

Reducing query cost is especially important in SAS – but do you know how to do it, or what it even means? Read my blog post to learn why this is important in health data analytics.

Curated Datasets: Great for Data Science Portfolio Projects!

If you need data for a project, read this blog post to learn about curated datasets.

Curated datasets are useful to know about if you want to do a data science portfolio project on your own. I made this blog post for our group mentoring program. Check out the ones I am promoting on my blog!

Statistics Trivia for Data Scientists

Public health, artificial intelligence, and data science trivia! Fun! Educational! Test your knowledge!

Statistics trivia for data scientists will refresh your memory from the courses you’ve taken – or maybe teach you something new! Visit my blog to find out!

Management Tips for Data Scientists

When working in data science, there are some tips and tricks for managing your communication and relationships with superiors that can help you advance in your career.

Management tips for data scientists can be used by anyone – at work and in your personal life! Get the details in my blog post.

REDCap Mess: How it Got There, and How to Clean it Up

REDCap mess on your hands? The REDCap designers made the application so loosey goosey, you can really program yourself into a messy corner if you don't plan well.

REDCap mess happens often in research shops, and it’s an analysis showstopper! Read my blog post to learn my secret tricks for breaking through the barriers and getting on with data analytics!

GitHub Beginners in Data Science: Here’s an Easy Way to Start!

If you are an aspiring data scientist, you will need to know how GitHub works. You will probably want to use it for your projects.

GitHub beginners – even in data science – often feel intimidated when starting their GitHub accounts and trying to interact with the web page. Don’t be shy! Catch the highlights from a recent GitHub beginners workshop I held!

ETL Pipeline Documentation: Here are my Tips and Tricks!

This blog post shows you how to properly document your extract, transform, and load code.

ETL pipeline documentation is great for team communication as well as data stewardship! Read my blog post to learn my tips and tricks.
