According to John Carreyrou’s book, “Bad Blood”, the startup Theranos had no administrative barrier between the research data produced in their lab to validate their innovative device and the clinical data produced later from patient samples. This is the last in a series of blog posts I am writing about the data-related misconduct at Theranos outlined in Carreyrou’s book, in which I explain how data science leaders can take steps to prevent it.
Theranos essentially had no administrative barrier between the research data they produced and any other data flowing through their business. In my earlier posts, I described how Theranos apparently had no data stewardship policies and lacked a governance structure. These can be problems at any organization. But in this blog post, I focus on a problem that only health-related organizations experience: having no administrative barrier between research and clinical data.
What it Meant to Theranos to Have No Administrative Barrier Between Research and Other Data
In a well-run organization, data produced by the business are categorized by their class. For example, data about employees are classified as human resources data, data about purchases are classified as procurement data, and so on. All of these classifications relate to other business processes, such as payroll and paying sales tax on purchases. Observe that terms like “payroll” and “sales tax” relate to regulatory functions, and now you’ve got it! One main reason we have to classify our data in business is to make sure we comply with laws and regulations.
As Carreyrou’s book described and I mentioned in earlier posts, there was no shortage of leaders, but there was a shortage of leadership at Theranos. For most of the time, there was not really a human resources department, so there were a lot of practices that most human resources professionals would prevent, including conflicts of interest (the CEO and COO were in a romantic relationship the board didn’t know about).
With this going on, there was no hope of Theranos inventorying and classifying their data. This turned into a big headache for them when they had to deal with regulators, such as when they tried to get approval for their lab test from the United States Food and Drug Administration. Spoiler alert: It didn’t happen.
How to Put an Administrative Barrier Between Research and Clinical Data
Administrative barriers are not built in a day. They require some foundation of administration first. In Theranos’ case, they started as a research lab, so their first documentation should have been about their research data. This means writing research protocols, and keeping the documentation you normally keep with research data. At that point, it wasn’t a problem that there was no administrative barrier between research and other data, because Theranos was small and didn’t have much data. But later, after they grew and started wanting to run patient samples, they should have classified those data as patient data, and come up with separate protocols for handling them.
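To make the idea concrete, here is a minimal sketch in Python of what a classified data inventory with a simple barrier check might look like. It is only an illustration under my own assumptions: the dataset names, the protocol IDs, and the check_barrier function are all hypothetical, and nothing here comes from Theranos or from the book. A real barrier is built from policies and procedures, and code like this can only support them.

```python
from dataclasses import dataclass
from enum import Enum


class DataClass(Enum):
    RESEARCH = "research"  # e.g. lab runs done to validate a device
    CLINICAL = "clinical"  # e.g. results produced from patient samples


@dataclass(frozen=True)
class Dataset:
    name: str              # hypothetical dataset name
    data_class: DataClass  # the class this dataset was assigned in the inventory
    protocol: str          # hypothetical ID of the document governing its handling


def check_barrier(datasets):
    """Raise ValueError if the datasets span more than one data class."""
    classes = {d.data_class for d in datasets}
    if len(classes) > 1:
        names = ", ".join(sorted(c.value for c in classes))
        raise ValueError(f"Cannot combine data classes without review: {names}")
    return True


# Hypothetical inventory: one research dataset and one clinical dataset,
# each governed by its own protocol.
inventory = [
    Dataset("device_validation_runs", DataClass.RESEARCH, "research-protocol-001"),
    Dataset("patient_sample_results", DataClass.CLINICAL, "clinical-sop-001"),
]
```

Calling check_barrier on a single class of data passes, while calling it on the whole inventory raises an error, which forces someone to stop and ask whether mixing research and clinical data is allowed under the governing protocols.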
I have consulted with patient care clinics that want to start doing research. Often, they are pretty well-organized, but before they start with research, we usually have to do a little bit more to get their clinical data classifications in order. This means updating some clinical policies and procedures, which is normal, because if a clinic is expanding and adding research studies to patient care, its operations are going to have to change.
Then, we can write some policies about doing research at their clinic. Who gets to be Principal Investigator of studies at their clinic? Which IRB do they use? How are the studies funded? How do researchers get permission to help with research at the clinic? Are there any rules about studies the clinic will and will not do as part of their business strategy? Don’t forget that clinics are businesses, too!
In the end, Theranos was an exceptional case, because they were lying on purpose, which most businesses don’t do. CEOs of grandiose businesses like Tesla may bloviate, but they eventually deliver something. It was the fact that Theranos was pretending their product worked, and patients and scientists weren’t buying it, that eventually took them down. That is why the moral of the story is not to trust our regulatory systems to catch fraudulent organizations like Theranos. We data scientists need to know how to keep our noses clean, and avoid working in chaotic workplaces like Theranos in the first place.
Updated June 13, 2021. Added menu list July 19, 2021.