VP at DataRobot Tells Cautionary Tale of Data Science, AI, and Healthcare

Ice cream cone fallen upside down on a sidewalk.

I encourage those of you into data science and AI to sign up to receive digests from Data Science Central. I happened upon this website when looking for references for a book I am writing, and found some amazing articles. Since I’m into data science and AI in healthcare, I was intrigued when I received an invitation to listen to the podcast, “Data Science Fails: Ignoring Business Rules & Expertise”, a 15-minute presentation that examined a use case of AI in healthcare. The podcast, which was sponsored by DataRobot, featured speaker Colin Priest, Vice President of AI Strategy and Data Scientist with DataRobot.

Mr. Priest pointed out how dazzled we are by AI these days. He gave the example of AlphaGo Zero, an AI system that learned to play Go on its own in just 36 hours. It got so good that it beat the human champions – mainly because it thought of moves that humans had not considered.

He pointed out that lately, some shops have been trying to use AI to replace human expertise, and failing miserably. Instead, he suggested that we data scientists should use our human expertise to better construct, direct, manage, and leverage AI.

And Mr. Priest had a particularly insightful way of making his point, by telling…

…A Cautionary Tale about Data Science, AI and Healthcare

In 2011, IBM Watson, a supercomputer running AI algorithms, beat its human opponents – who were champions – on the trivia game show Jeopardy. Mr. Priest reflected on how challenging it was to program the AI to achieve such a success. IBM Watson used databanks of question-and-answer pairs, with answers human-labeled as “correct” and “incorrect” so that IBM Watson could learn the patterns and make a model that produced the correct response. Although Mr. Priest noted that IBM Watson was not 100% correct on all its Jeopardy answers, it was correct enough to beat the human game show champions.
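To make the training setup concrete, here is a toy sketch of learning from human-labeled answer pairs. This is not IBM Watson’s actual method – the clues, the word-overlap feature, and the threshold-learning step are all invented for illustration – but it shows the general supervised pattern the podcast describes: labeled correct/incorrect examples teach the model how to score candidate answers.

```python
# Toy sketch of supervised learning from human-labeled answer pairs.
# All data and the scoring feature below are hypothetical.

def overlap_score(clue: str, answer_context: str) -> float:
    """Fraction of the clue's words that appear in the candidate's context text."""
    clue_words = set(clue.lower().split())
    context_words = set(answer_context.lower().split())
    return len(clue_words & context_words) / len(clue_words)

def learn_threshold(examples):
    """Pick the overlap-score threshold that best separates the labeled examples."""
    candidates = sorted({overlap_score(c, ctx) for c, ctx, _ in examples})
    best_t, best_acc = 0.0, -1
    for t in candidates:
        acc = sum((overlap_score(c, ctx) >= t) == label
                  for c, ctx, label in examples)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

# Hypothetical labeled pairs: (clue, candidate answer context, is_correct)
training = [
    ("capital of france", "paris is the capital and largest city of france", True),
    ("capital of france", "berlin is the capital of germany", False),
    ("largest planet", "jupiter is the largest planet in the solar system", True),
    ("largest planet", "mercury is the smallest planet", False),
]

threshold = learn_threshold(training)

def is_correct(clue: str, context: str) -> bool:
    """Classify a candidate answer using the learned threshold."""
    return overlap_score(clue, context) >= threshold
```

The point of the sketch is that the human labels do the heavy lifting: the model only discovers a decision boundary that reproduces them, which is why the quality of the labeled data matters so much later in this story.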

Of course, there was nothing wrong with IBM Watson playing game shows on television. The tale turned cautionary when IBM Watson was repurposed from answering trivia questions to diagnosing cancer.

Empty stage set for game show Jeopardy with IBM Watson

IBM Watson’s New Tall Order: Using Data Science and AI to Diagnose Cancer

As described by Mr. Priest, the somewhat fantastic idea was that IBM Watson could take in all kinds of textual information from different data sources using its natural language processing (NLP) capabilities: peer-reviewed journal articles on cancer, patient medical records, textbooks, and drug information. Once those data were absorbed into IBM Watson, it could then explore the data for patterns that no human could ever discern.

I just want to point out that some peer-reviewed articles are so poorly written that no one could understand them – not even IBM Watson.

Nevertheless, in October 2013, IBM announced a partnership with the University of Texas MD Anderson Cancer Center to use IBM Watson as part of its mission to eradicate cancer.

Issues with IBM Watson Implementation for AI in Healthcare

I will tick off here the laundry list of issues Mr. Priest observed with this particular implementation:

  1. They did not train IBM Watson’s NLP engine, even though this is what had been perfected in the Jeopardy exercise. Instead, they only trained for pattern recognition.
  2. They only used a small number of cancer cases to train. For example, they used 635 cases for lung cancer, and only 106 cases for ovarian cancer.
  3. The cancer cases they used were hypothetical – not real.
  4. The algorithm wasn’t trained against anything evidence-based or human-labeled.

Unsurprisingly, patients and clinicians identified multiple cases where IBM Watson made incorrect and unsafe treatment recommendations.

How AI in Healthcare is Different than in Games

Of course, it seems a little obvious, but someone had to say it. Here are some reasons why AI works well for games but not as well for diagnosing cancer:

  1. If you lose a game, you can just play another, but the stakes are far higher in healthcare. In games, you can give the AI a chance to learn by making mistakes; in healthcare, you can’t.
  2. If the goal is pattern-recognition, the AI could work in healthcare. Priest gave the example of imaging, citing a recent meta-analysis that showed AI could be comparable to human pattern recognition.
  3. In healthcare, AI is useful for predictions. Priest mentioned that AI has been used to successfully classify patients at high risk for sepsis or lack of adherence to medical appointments.
  4. AI in healthcare needs to be developed and deployed under rigorous policy. The results need to be trustworthy, so development and deployment must follow best practices.

Recommendations for Data Science and AI in Healthcare

In the end, Mr. Priest made five reasonable recommendations:

  1. Code all the business rules, ethical behavior, and anything you already know into the algorithm at the beginning. Don’t make the algorithm “learn” things you already know – tell it directly by programming these parameters into it.
  2. Make sure you use enough training data that is real (not fabricated). Having big datasets will allow you to explore correlations and try to determine if they are spurious or not.
  3. Do healthcare AI projects in bite-sized pieces. Priest noted that AI can work well in very small decision spaces, so if you break a healthcare task into subtasks, AI can be used on each subtask. For example, AI might be used on imaging in a certain type of cancer diagnosis. Putting these “bite-sized” modular AI pieces together is how Mr. Priest suggested we build evidence-based AI for healthcare to ultimately achieve complex tasks – like diagnosing cancer.
  4. Use subject matter experts to review and sign off on an AI algorithm in healthcare before it goes live. It’s important to have actual clinicians or other “humans” who would normally do the work the AI is doing ensure that the AI is producing reasonable results.
  5. Train your AI like you would train your child. First, you don’t want your child to just run around learning unsupervised. Like with your child, you need to set up boundaries to guide the AI to be trained the way you want. Next, if you don’t want the AI (or your child) to be biased, you have to train it (them) not to be biased. That means carefully curating your training data so it gives unbiased information to the AI algorithm.
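Recommendation 1 – coding known business rules directly rather than making the algorithm “learn” them – can be sketched as a hard-coded rule layer around a model’s output. The rules, the model stub, and the patient fields below are hypothetical examples for illustration only, not real clinical logic:

```python
# Minimal sketch: wrap a model's raw recommendation in hard-coded
# safety rules, so known constraints are enforced by code, not learned.
# The model stub, rules, and patient fields are all hypothetical.

def model_recommend(patient: dict) -> str:
    """Stand-in for a trained model's raw treatment suggestion."""
    return "penicillin"

KNOWN_RULES = [
    # (predicate over patient and recommendation, reason for rejecting it)
    (lambda p, rec: rec in p.get("allergies", ()), "patient is allergic"),
    (lambda p, rec: p.get("age", 0) < 0, "invalid patient record"),
]

def safe_recommend(patient: dict) -> tuple:
    """Accept the model's output only if no hard-coded rule rejects it."""
    rec = model_recommend(patient)
    for rule, reason in KNOWN_RULES:
        if rule(patient, rec):
            return ("REFER TO CLINICIAN", reason)
    return (rec, "passed rule checks")

print(safe_recommend({"age": 40, "allergies": ["penicillin"]}))
# → ('REFER TO CLINICIAN', 'patient is allergic')
```

The design point is the one Mr. Priest makes: things you already know for certain belong in explicit rules, where they are auditable and cannot be “unlearned,” while the model handles only the genuinely uncertain part of the decision.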

Want to watch IBM Watson win at Jeopardy? Watch the YouTube video here.

Updated March 2, 2020

Fallen ice cream cone photograph by Tamorlan. Photograph of Jeopardy set by Atomic Taco.
