Statistics Trivia for Data Scientists

Public health, artificial intelligence, and data science trivia! Fun! Educational! Test your knowledge!

Statistics trivia questions for you to challenge your students, colleagues, friends and foes! Here is the answer key – scroll down to the bottom to post the quiz on your social media! Today’s topic is “artificial intelligence in public health”.

Q. Why is there “no free lunch”?
A. Because no single machine learning algorithm is the optimal one for every problem.

Read more statistics trivia and facts about why there is “no free lunch” in Amol Mavuduru’s article on Towards Data Science.

Q. What’s the problem with a “bag of words”?
A. It ignores the order of the words in each document of the corpus.

Read statistics trivia about the “bag of words” phenomenon in Federico Albanese’s blog post.
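To make the word-order problem concrete, here is a minimal sketch in plain Python. The helper name bag_of_words and the two example sentences are made up for illustration, and the tokenizer is a bare whitespace split rather than anything a real NLP pipeline would use.

```python
from collections import Counter

def bag_of_words(text):
    """Count lowercase, whitespace-separated tokens; their order is discarded."""
    return Counter(text.lower().split())

# Two hypothetical sentences with opposite meanings but the same words
a = "the patient infected the nurse"
b = "the nurse infected the patient"

print(bag_of_words(a))
print(bag_of_words(a) == bag_of_words(b))  # the two bags are identical
```

Because both sentences reduce to the same word counts, any model that sees only the bag cannot tell who infected whom.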

Q. What’s the problem with using machine learning to model risk factors for epidemiologic outcomes instead of logistic regression?
A. It’s a lot of extra work for no gain.

Most studies that compare logistic regression to various machine learning approaches for epidemiologic studies (like this one) find that logistic regression produces similar results and is easier to use.

Q. Why does everyone like Cox proportional hazard regression so much?
A. Because the non-parametric treatment of the baseline hazard function gives you so much flexibility!

The fact that Cox is semi-parametric means you don’t have to worry about fitting your survival times to a parametric distribution (like exponential or Weibull).
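One way to see why no distribution is needed: the Cox partial likelihood depends only on the order of the event times, not on their actual values. The sketch below is a toy illustration under stated assumptions (a single covariate, no tied event times, made-up data, and a hypothetical function name cox_partial_loglik), not a replacement for a survival analysis package.

```python
import math

def cox_partial_loglik(beta, times, events, x):
    """Cox partial log-likelihood for one covariate, assuming no tied event times."""
    total = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            # Censored subjects contribute only through other subjects' risk sets
            continue
        # Risk set: everyone still under observation at time t_i
        risk = [math.exp(beta * x[j]) for j, t_j in enumerate(times) if t_j >= t_i]
        total += beta * x[i] - math.log(sum(risk))
    return total

# Made-up data: follow-up times, event indicator (1 = event, 0 = censored), exposure
times = [2.0, 3.0, 5.0, 7.0, 11.0]
events = [1, 0, 1, 1, 0]
x = [1, 0, 1, 0, 0]

ll_original = cox_partial_loglik(0.5, times, events, x)
# Squaring the times changes their spacing but keeps their order
ll_stretched = cox_partial_loglik(0.5, [t ** 2 for t in times], events, x)
print(math.isclose(ll_original, ll_stretched))
```

Stretching the time axis leaves the partial likelihood unchanged, which is why the shape of the baseline hazard never has to be specified to estimate the coefficient.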

Q. When fitting a regression model, should you go backwards, forwards, or ambi-directional?
A. Doesn’t matter, so long as you follow a priori rules for sloughing off covariates.

Many analysts prefer “backward elimination”. But if your dataset is not that big, or you have small cells, a first regression model with 20 candidate covariates in it usually bombs. That’s why I teach an ambi-directional approach to modeling, which has been criticized for whatever reason. Evidence suggests that it doesn’t matter which direction you choose: so long as you stick to the a priori rules you set, you should derive roughly the same final model.
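As a rough illustration of what an a priori rule can look like in code, here is a sketch of backward elimination with one hypothetical rule: drop the covariate with the largest p-value until every remaining p-value is at or below 0.05. The rule, the threshold, the function names, and the stub p-values are all assumptions made for this example; they are not the specific rules taught in the approach described above, and in a real analysis the p-values would come from refitting the regression at each step.

```python
def backward_eliminate(candidates, get_p_values, threshold=0.05):
    """Drop the least significant covariate until all p-values are <= threshold."""
    kept = list(candidates)
    while kept:
        p_values = get_p_values(kept)  # refit the model on the current covariates
        worst = max(kept, key=lambda name: p_values[name])
        if p_values[worst] <= threshold:
            break  # the a priori rule says everything left stays in
        kept.remove(worst)
    return kept

# Stub standing in for a real model fit: fixed, made-up p-values.
# A real fit would return p-values that change as covariates are removed.
FAKE_P = {"age": 0.001, "sex": 0.20, "smoking": 0.03, "income": 0.64}

def fake_fit(covariates):
    return {name: FAKE_P[name] for name in covariates}

final_model = backward_eliminate(FAKE_P, fake_fit)
print(final_model)
```

The point of writing the rule down as a function is that the threshold and the drop criterion are fixed before any model is fit, so the analyst cannot adjust them after seeing the results.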

Q. What should you do if you have years of data-related experience in the health domain, and want to undergo intensive retraining to transition to more of a “data scientist” role? What is the most effective way forward?
A. Join an exclusive mentoring program aimed at professionals with health data experience.

I did some research and found there are many public health and healthcare analysts who want to move in a more data science direction. There was a gap in what was available, so I developed a mentoring program exclusively for this audience. Read about it here, and if you are interested in joining, you can sign up for a 30-minute Zoom market research call about the program.

Post the Statistics Trivia Questions on Social Media

Copy the HTML or Word versions below to post the trivia questions on your social media account!

Read all of our data science blog posts!

Statistics trivia for data scientists will refresh your memory from the courses you’ve taken – or maybe teach you something new! Visit my blog to find out!
