“Bad Blood” Demonstrates how a Lack of Product Description Leads to Data Science Misconduct: Part 2 of 5

In order to operationalize your data variables, you need to have clear product descriptions

Although “lack of product description” may sound like it does not have much to do with scientific misconduct, it actually has a lot to do with it. This blog post is part of a series where I reflect on various kinds of data science and research misconduct outlined in John Carreyrou’s book, “Bad Blood”, about the rise and fall of laboratory startup Theranos.

The first time I read about Theranos in the news, I could tell it was a hoax, and I told everyone who would listen, which was basically no one. It was obvious because, if you know anything about how labs that serve patients actually run, Holmes was promising something fanciful: as fanciful as the science fiction trope of “food pills” that would replace eating.

Those of you who take my online courses know that I often set my scenarios in a fictitious research team made up of people working in a central patient lab, so I know what I am talking about. There are just too many issues to solve in reducing three meals a day to a pill, and there are just too many issues to solve in running zillions of totally different blood tests on a tiny little drop of blood, which is hardly any sample.

Lack of Product Description is Misconduct if you are Saying you are Making a Product

Normally, startups actually have a realistic product in mind they are trying to make, so Theranos is a pretty egregious case. But I often observe that scientists who have a legitimate innovation they are working on also do not document the product correctly. Even on their web sites, you’ll see a profound lack of product description (especially with data science products).

What are we studying, and later, what are we trying to sell? A “food”? An “investigational drug”? A “medical device”? Attorneys and scientists can help with those classifications if product descriptions are written down. I once worked with a scientist who had tested a substance in a lab for a startup, and the startup wanted him to publish his work. But we could not publish it because the startup could not agree on how to describe their product, and what their product actually was. What a waste!

In Theranos’ case, they were deliberately hoodwinking people. Holmes had made promises, and scientists were trying to cobble together items in the lab that seemed to fulfill those promises. This is why she was not writing descriptions of her product: what it was intended to do, what it was not intended to do, what problems it was intended to solve, how it was intended to be used, what type of customers it was intended for, and why it was superior to the competition. This may seem like bad business, but it is also bad science.
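
As a minimal sketch, a written product description could be kept as a structured record so that gaps are easy to spot. The class name, field names, helper function, and example values below are all hypothetical, chosen only to mirror the items a product description should cover; this is not a standard template.

```python
from dataclasses import dataclass, fields


@dataclass
class ProductDescription:
    """Hypothetical record of what a product description should state."""
    intended_use: str = ""
    not_intended_use: str = ""
    problems_solved: str = ""
    how_used: str = ""
    target_customers: str = ""
    advantage_over_competition: str = ""


def missing_fields(desc: ProductDescription) -> list[str]:
    """Return the names of fields that are still blank."""
    return [f.name for f in fields(desc) if not getattr(desc, f.name).strip()]


# Hypothetical draft with only two of the six items filled in.
draft = ProductDescription(
    intended_use="Help trainees learn a documentation task",
    how_used="Used on a tablet during training sessions",
)

print(missing_fields(draft))
```

Running a check like this before a study starts would show which parts of the description nobody has agreed on yet.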

Why Lack of Product Description and Documentation is Bad Data Science

If I am the scientist on your team trying to help you see if your innovation works or how it could be improved, I need to know what “works” means. In other words, I need to “operationalize” what you are trying to do into data definitions. My job is to build a study design around those definitions.

A long time ago, a pathologist friend of mine and I tested an app he made for helping pathology trainees learn how to complete pathology reports. He did a good job of designing the study, because he actually operationalized what would constitute the app “helping” into numbers we could measure. This was only possible because he had a super clear vision of what he wanted the students to get out of the app. His IRB application included a great product description, which explained why he was doing the study, why he designed the app the way he did, and what it was intended to do.
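
To make “operationalizing” concrete, here is a minimal sketch of a data dictionary entry that ties a product claim to a measurable variable. The variable name, measurement, and valid range are hypothetical illustrations, not the measures actually used in the study described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariableDefinition:
    """Hypothetical data dictionary entry linking a claim to a measurement."""
    name: str
    product_claim: str   # the statement from the product description
    measurement: str     # how the claim is turned into a number
    unit: str
    valid_min: float
    valid_max: float

    def accepts(self, value: float) -> bool:
        """Check whether a recorded value falls inside the defined range."""
        return self.valid_min <= value <= self.valid_max


# Hypothetical operationalization of "the app helps trainees".
completeness = VariableDefinition(
    name="report_completeness_pct",
    product_claim="The app helps trainees complete pathology reports",
    measurement="Share of required report fields a trainee fills in correctly",
    unit="percent",
    valid_min=0.0,
    valid_max=100.0,
)

for raw in (85.0, 120.0):
    status = "ok" if completeness.accepts(raw) else "out of range"
    print(completeness.name, raw, status)
```

The point of the `product_claim` field is that every variable traces back to a sentence in the product description; if that sentence does not exist, the variable cannot be defined.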

In the case of Theranos, it is obvious why Holmes did not nail any of this down. She actually wasn’t envisioning any sort of realistic final product, so the lack of a product description worked in her favor. But in the case of my colleague who was working with a startup and tested their substance in his lab, in retrospect, he should have gotten more information from the startup before getting into the situation he found himself in.

He asked me to help later, with the statistics, and it was only then that I realized that I wasn’t sure what we were trying to do with the study. This meant that I wasn’t sure how to handle the data my colleague had collected from mice injected with the substance.

When I asked what the substance was supposed to do, the startup founders could not decide. The idea was that they had made some sort of substance that can hold a protein and dissolve after being injected, thus slowly releasing the protein into the blood. But they hadn’t decided if they were trying to use it for drug delivery, or for something that didn’t have to do with medicine.

In fact, they couldn’t decide on a lot of things, and we had to walk away from the project – even though the data from the mice looked pretty good! That’s why it is very important to maintain up-to-date product descriptions in data science, as well as research and business.

Updated June 17, 2021. Blog post menu added July 19, 2021.

This blog post talks about how lack of product description led to data-related misconduct at Theranos, because they could never nail down exactly what they were trying to do.
