Front-end Decisions Impact Back-end Data (and Your Data Science Experience!)

How an application's front-end and back-end are connected affects how data are stored. So when you extract the data, you may find data quality problems that were caused by the front-end.

Front-end decisions made at the time of application design often impact the structure and quality of our back-end data. Let’s say we extract data from a medical record application like Epic and try to analyze them, and we see that the data are screwed up. Those of us in health data analytics are often at a loss to troubleshoot this problem, because many of us don’t even know what a “front-end” or a “back-end” is! So we receive some screwed up data, and we can’t figure out why they are so screwed up.

We don’t realize how much the front-end – which is the “window” the data pass through to enter the back-end when someone does data entry – may have impacted what data show up in the back-end, in terms of why certain columns are there, and why they might have the values they have. If you are unfamiliar with the terms “front-end” and “back-end”, watch my video below for an example.

All Application Designers Must Make Front-end Decisions

Here’s something popular we use in health data analytics that we might not think of as an application with a front-end and a back-end: SurveyMonkey! In fact, all survey applications, including REDCap, are technically applications with a front-end and a back-end. Since a survey is mostly front-end, applications like SurveyMonkey are optimized so that you can make excellent front-end decisions for survey administration.

SurveyMonkey and other survey software applications allow you to make important front-end decisions that make your survey data look awesome. First, they allow you to build in validation rules for free text variables. Next, they allow you to create low cardinality picklists. Picklists can restrict answers to one answer only (e.g., “What is your favorite color? Select one.”) or allow multiple answers (e.g., “What colors are you willing to wear? Check all that apply.”).
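To make the picklist point concrete, here is a minimal Python sketch of how a "select one" item and a "check all that apply" item might land in the back-end. Everything in it is an assumption for illustration: the column names, the color choices, and the functions `store_single_select` and `store_multi_select` are hypothetical and are not taken from SurveyMonkey, REDCap, or any real export format.

```python
# Hypothetical illustration only: these names are not from any survey product.
COLOR_CHOICES = ["red", "green", "blue"]


def store_single_select(answer, choices=COLOR_CHOICES):
    """A 'select one' picklist answer becomes a single back-end column."""
    if answer not in choices:
        raise ValueError(f"{answer!r} is not in the picklist")
    return {"favorite_color": answer}


def store_multi_select(answers, choices=COLOR_CHOICES):
    """A 'check all that apply' item becomes one 0/1 column per choice."""
    unknown = set(answers) - set(choices)
    if unknown:
        raise ValueError(f"not in the picklist: {sorted(unknown)}")
    return {f"wear_{choice}": int(choice in answers) for choice in choices}


# One respondent's answers, combined into one back-end row.
row = {**store_single_select("blue"), **store_multi_select(["red", "blue"])}
print(row)
```

In this sketch, one front-end choice (single answer versus multiple answers) decides whether the analyst receives one column or three, which is the kind of thing that looks puzzling in an extract if you never saw the front-end.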

Third, survey software applications allow you to build in skip patterns, so you can have subsets of respondents opt out of certain items. In addition, survey software applications allow you to make different decisions about how you present Likert scale or other ordinal items – in a matrix, or using fancy controls like dials. Watch my livestream where I show you how to make a survey specification before programming a SurveyMonkey survey so you can see how I document the decisions I make when designing my SurveyMonkey front-end.
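Skip patterns also shape what shows up in the back-end. The sketch below is a hypothetical example, not how any particular survey application stores data: the item names `q1_smoker` and `q2_cigs_per_day`, the `SKIPPED` sentinel, and the function `apply_skip_pattern` are all made up for illustration. It shows one way to keep "skipped by design" distinguishable from "shown but left blank".

```python
# Hypothetical skip pattern: item q2 is only shown when q1 is "yes".
SKIPPED = "SKIPPED_BY_DESIGN"  # assumed sentinel, not a real export code


def apply_skip_pattern(response):
    """Return the back-end row for one respondent's front-end answers."""
    row = {"q1_smoker": response.get("q1_smoker")}
    if row["q1_smoker"] == "yes":
        # The item was shown; None here means the respondent left it blank.
        row["q2_cigs_per_day"] = response.get("q2_cigs_per_day")
    else:
        # The item was never shown, so record that explicitly.
        row["q2_cigs_per_day"] = SKIPPED
    return row


print(apply_skip_pattern({"q1_smoker": "no"}))
print(apply_skip_pattern({"q1_smoker": "yes", "q2_cigs_per_day": 10}))
```

If the back-end stored both cases as the same blank value, an analyst looking only at the extract could not tell a skipped item from a missing answer without the survey specification.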

Even though epidemiologists and biostatisticians regularly use survey software applications, they often have trouble with them, because they don’t really understand how these applications are built and how they work. They might observe that their survey data are being recorded in some screwed up way, but they lack the background in application design and development to troubleshoot easily.

Document Your Front-end Decisions with Data Curation

As you can see if you watch my video above or read my blog post about REDCap, I take my front-ends (including all my surveys!) very seriously. I do a lot of data curation in the design of my surveys. I’m very careful when I implement my surveys in the software, because it’s very easy to create a hot mess in the back-end if you do something wrong in the front-end!

Learn to make data dictionaries, flow charts, and diagrams to understand your data

If you want to learn how to do what I demonstrate in my video and talk about in my REDCap blog post, take my LinkedIn Learning course on data curation. You can get an introduction to it in this video.

Data curation is the key to managing data science teams, which is why it’s a central topic in my online boot camp course for research data managers, “How to do Data Close-out”. I feel that if you can master data curation, you can communicate clearly with others about data, and that is the key to being a successful manager in the data science field.


Updated June 10, 2023.


Front-end decisions are made when applications are designed. They are even made when you design a survey in SurveyMonkey. What health data analysts often don’t realize is that these decisions have a profound impact on the quality and accuracy of the data that are collected through these front-ends, which is the focus of this blog post.
