Online educators like me saw a boom in enrollment during COVID-19, so I decided to belatedly announce my Data Curation Course. I’m seeing a lot of people taking data science courses, and in my opinion, you will drive yourself crazy if you try to work with big data without knowing how to keep track of it. Keeping track of it is what I teach in the Data Curation Course.
The Data Curation Course gives you the tools to set up relatively simple files (Word documents, tables, diagrams, etc.) that guide you and your team through data science projects. So it’s the management side of data science, not the programming side. You need both to get through a project with your team without killing each other, but the courses I’ve seen tend to focus on the programming. This course complements that knowledge, and helps you put your programming prowess to work in a more organized way.
Here’s what you’ll learn in the course:
Chapter 1: Introduction
A lot of people have never even heard the term “data curation” – I myself first heard it around 2010. So I start the chapter by explaining what it is, and giving you context as to why this topic is important to know in data science, and how it can help you.
In the introduction, I also explain each type of curation file I’m going to teach you how to make and use in the course. They fall in the following categories:
- Back-end curation for data tables.
- Front-end curation for interfaces.
- Survey curation for when you do online or in-person surveys.
- Flow diagrams for working out workflow and data flow problems.
- Text-based curation files – which are usually reports, cheat sheets, or other items that help you understand your data system.
Chapter 2: Back-end Curation
Topics covered: Study vs. production data, EAV structure, ERDs, data dictionaries, crosswalks
What you learn how to do: Document back-end data with data dictionaries and ERDs.
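To give a rough feel for what a data dictionary holds, here is a minimal sketch in Python. It is not taken from the course: the table, the column names, and the descriptions are all made up for illustration, and a real data dictionary usually carries more detail (data types, allowed values, source systems). The point is that each column gets one row pairing a human-written description with a few facts pulled from the data.

```python
import csv
import io

# Hypothetical back-end table; names and values are invented for illustration.
rows = [
    {"patient_id": "1001", "visit_date": "2020-03-14", "sbp": "128"},
    {"patient_id": "1002", "visit_date": "2020-03-15", "sbp": ""},
]

# Human-written descriptions: the part of a data dictionary code cannot infer.
descriptions = {
    "patient_id": "Unique identifier assigned at enrollment",
    "visit_date": "Date of clinic visit (YYYY-MM-DD)",
    "sbp": "Systolic blood pressure in mmHg",
}


def build_data_dictionary(rows, descriptions):
    """Return one data dictionary entry per column, in column order."""
    entries = []
    for column in rows[0].keys():
        values = [row[column] for row in rows]
        entries.append({
            "column": column,
            # Flag undocumented columns so someone follows up on them.
            "description": descriptions.get(column, "NEEDS DESCRIPTION"),
            "n_missing": sum(1 for v in values if v == ""),
            "example": next((v for v in values if v != ""), ""),
        })
    return entries


def to_csv(entries):
    """Render the entries as CSV text that can be shared with the team."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["column", "description", "n_missing", "example"]
    )
    writer.writeheader()
    writer.writerows(entries)
    return buffer.getvalue()


data_dictionary = build_data_dictionary(rows, descriptions)
print(to_csv(data_dictionary))
```

The CSV output opens in a spreadsheet, which keeps the file in the "relatively simple" category that non-programmers on the team can read and edit.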
Chapter 3: Front-end Curation
Topics covered: How to curate a front-end, using front-end curation in change management, steps for designing a dashboard, and mocking up a dashboard.
What you learn how to do: Use PowerPoint to mock up front-ends and dashboards for initial design and also change management.
Chapter 4: Special Curation for Surveys
Topics covered: Choosing domains, making design documents for online and paper surveys, and documenting survey data structure and collection.
What you learn how to do: Mock up and annotate online and paper surveys, and develop documentation for questions, answers, and instrument scoring.
Chapter 5: Flow Diagrams
Topics covered: The following types of flow diagrams: warehouse data, analytic data, application, workflow, study flow, and data reduction.
What you learn how to do: Choose the type of flow diagram that fits the issue at hand, and develop it using PowerPoint or another flowcharting tool.
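As one illustration, a data reduction diagram typically shows how many records remain after each exclusion step. Below is a minimal Python sketch, not taken from the course, that computes those counts so they can be typed into the diagram boxes. The records, the exclusion rules, and the function name `reduction_counts` are hypothetical.

```python
# Hypothetical records, invented for illustration.
records = [
    {"id": 1, "age": 34, "consented": True, "sbp": 120},
    {"id": 2, "age": 17, "consented": True, "sbp": 110},
    {"id": 3, "age": 52, "consented": False, "sbp": 135},
    {"id": 4, "age": 45, "consented": True, "sbp": None},
    {"id": 5, "age": 61, "consented": True, "sbp": 142},
]

# Each step: (label for the diagram box, rule a record must pass to stay in).
steps = [
    ("Age 18 or older", lambda r: r["age"] >= 18),
    ("Gave consent", lambda r: r["consented"]),
    ("Has blood pressure reading", lambda r: r["sbp"] is not None),
]


def reduction_counts(records, steps):
    """Apply the steps in order; return (label, remaining, excluded) per box."""
    counts = [("Starting records", len(records), 0)]
    remaining = records
    for label, keep in steps:
        kept = [r for r in remaining if keep(r)]
        counts.append((label, len(kept), len(remaining) - len(kept)))
        remaining = kept
    return counts


counts = reduction_counts(records, steps)
for label, n_remaining, n_excluded in counts:
    print(f"{label}: {n_remaining} remaining ({n_excluded} excluded)")
```

Because the steps run in order, changing the order changes the per-step exclusion counts, which is one reason to write the sequence down in a diagram.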
Chapter 6: Text-based Curation Files
Topics covered: Text-based curation files are extremely diverse. I cover a few in the course that I personally have found particularly useful in my career: standardized code, data reports, and founding documents (from a project or effort) along with project-related agreements. I also tend to collect manuals (especially data entry manuals), cheat sheets, and diagrams in my forensic data curation arsenal in hopes I will get a deeper understanding about the story behind the data!
What you learn how to do: Identify what types of text-based curation files would be helpful to the project, and include them with the others for the project team to use.
Chapter 7: Conclusion
For my LinkedIn Learning courses, I’ve been told that I make bigger conclusion chapters than other authors. This is because I really want to explain to you how to apply what you just learned, since I like to teach very practical skills. I think it’s worth watching the videos in this section, because if you are in a data-related job, you can start making these curation files at work right away, and maybe clear up some confusion or miscommunication on your team. In fact, you might want to get your team members to take the Data Curation Course also, so you all can start using the same terminology.
FTC disclaimer added December 5, 2020. Cosmetic revisions on July 30, 2021. Added video January 12, 2022. Revised and added banners June 17, 2023. Added courses slider September 29, 2023.
Curation files are especially helpful for communicating about data on teams. Find out what you’ll learn when you take my online LinkedIn Learning data curation course!