Researching Data Science Companies: How to Evaluate Your Future Employer

You should research companies offering data science job positions before scheduling an interview, because you do not want to be surprised during the hiring process.

Researching data science companies is an important topic Wyatt Sharber covers in his article on building a career in data science. Organizations tend to fall into categories that operate in certain ways, and it is important to remember that not all companies with data science jobs available are “data science companies”, per se. They may be companies focused on doing something other than data science – such as selling retail goods online – but they need a data scientist to help them build up their analytics department, and do analytics on their core business.

I thought it might provide useful guidance to explain how I help my customers who are researching data science companies (or other companies in hopes of landing a data scientist job). This blog post explains my approach to helping my customers evaluate companies before applying for jobs so they can streamline their application process, and only focus on businesses that have values that align with their career goals.

Researching Data Science Companies to Classify Them

When researching data science companies that advertise on jobseeker platforms like indeed.com, you’ll realize that the companies seem to fall into categories. I will point out a few important categories here.

Recruitment/Talent Companies

These are companies that large firms (that have a lot of money and a lot of open positions) hire on contract to do their job recruitment for them. They often have company names that suggest recruiting for jobs or finding talent. If you apply for a job through these companies, you may not be told about the actual company and actual job you are interviewing for until well into the process.

While this may sound like a bad deal to the jobseeker, these companies may be the only gatekeepers to certain jobs, so you may have to deal with them in order to get a job at the company you want – especially large companies that hire these organizations to help them. So I encourage you to go along with the process, and as soon as you can find out who the actual company is that is interviewing you, do the research on them that I talk about in this blog post.

Famous, Longstanding Companies Outside of Data Science

I recently went on Indeed.com and looked at the data science job listings. I found listings for many famous companies, such as Pfizer – a pharmaceutical company – and Liberty Mutual – an insurance company. Neither of these companies is a data science company per se, but it is easy to imagine what a data science job would be like at either of these companies. At Pfizer, the data science would likely center around health and pharmaceuticals, while at Liberty Mutual, the data science would likely center around insurance and risk.

But this isn’t always the case, actually. The only thing you really know about these jobs is the companies. For all you know, you could be hired into a department that does some sort of billing, and your job is to build an artificial intelligence algorithm for fraud detection. From the jobseeker’s point-of-view, at this point, all you really know about is the company, not the job. So you have to start by evaluating the company, which isn’t hard, since these are famous, longstanding companies.

Both Pfizer and Liberty Mutual have long histories, and have been responsible for a lot of different actions. It is important for the jobseeker to decide – do I ever want this company on my resume? How will I explain to people in the future why I chose to work for this company? Some people might cite that Pfizer was responsible for one of the COVID-19 vaccines, and that Liberty Mutual has a strong social justice perspective. In any case, if you have a bias against either company, it’s best to find out early on, before you get too far into the application process.

Government, Non-profits, and Academia

This sector is unusual for data science, in that it doesn’t use the recruitment/talent companies as often, yet is filled with organizations that are both longstanding and famous – like the United States Department of Defense, the Brookings Institution, and Harvard University.

Generally, researching data science companies in this sector is more about figuring out if the whole sector is the right fit for you as a data scientist or not – not about evaluating each individual organization. This is likely because they are all very similar in terms of their reputations (which tend to be good), types of work (which tend to be academic), and types of salaries (which tend to be bad). If you think you are interested in a career as a data scientist in academia, I strongly encourage you to read this blog post and watch the linked videos on it, where I describe what these jobs are like.

Private Companies You Don’t Know Much About

Whether you are looking for jobs from a job seeker site or you end up working with a “headhunter” or talent coordinator at one of the recruitment/talent companies, you may realize you are interested in a job at a firm you have never heard of, or otherwise know very little about. It is important that you do a significant amount of research about these types of companies online before you interview for a job with them. That’s because you want to make sure that you know what you are getting into if they start acting like they want to hire you during the process.
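If you are sorting through a long list of job listings, the four categories above can be sketched as a rough first-pass triage in Python. This is only an illustration under stated assumptions: the keyword hints, the `classify_employer` function, the list of companies you already know, and the employer name "Acme Talent Partners" are all hypothetical, and a name-based guess is no substitute for the research described in this post.

```python
from enum import Enum


class EmployerCategory(Enum):
    RECRUITER = "recruitment/talent company"
    FAMOUS_PRIVATE = "famous, longstanding company"
    PUBLIC_SECTOR = "government, non-profit, or academia"
    UNKNOWN_PRIVATE = "private company you don't know much about"


# Hypothetical keyword hints (assumptions); tune them to the listings you see.
RECRUITER_HINTS = ("recruit", "talent", "staffing")
PUBLIC_SECTOR_HINTS = ("university", "college", "department of", "institution")


def classify_employer(name, known_companies):
    """Guess a category from the employer name alone.

    known_companies is a set of lowercase names you already recognize.
    """
    lowered = name.lower()
    if any(hint in lowered for hint in RECRUITER_HINTS):
        return EmployerCategory.RECRUITER
    if any(hint in lowered for hint in PUBLIC_SECTOR_HINTS):
        return EmployerCategory.PUBLIC_SECTOR
    if lowered in known_companies:
        return EmployerCategory.FAMOUS_PRIVATE
    # Anything left over needs the most research before an interview.
    return EmployerCategory.UNKNOWN_PRIVATE


known = {"pfizer", "liberty mutual"}
listings = ["Pfizer", "Harvard University", "Acme Talent Partners", "GSSA, LLC"]
for employer in listings:
    print(employer, "->", classify_employer(employer, known).value)
```

Anything that lands in the last category is a candidate for the deeper research walked through in the next section.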

Researching a Company New to You

I thought I’d teach you how to do this by giving you an example. As I said, I went on Indeed.com and I searched for data science jobs. I saw jobs come up from the places I listed above: talent/recruitment companies, famous private sector companies, famous public sector companies, and companies with names I didn’t know. I decided to pick one from the last category, and show you how I’d research that job position.

This is an example of a job from Indeed.com for a data scientist

This job looked like an excellent first position for a new data scientist.

However, I had never heard of the firm GSSA, LLC.

The name also didn’t give me any hints as to what this firm does.

I put together some of the information from the job announcement into a graphic below.

Here are more details about the data science job from GSSA, LLC. It explains what the requirements are of the role.

I also took note of some of the wording about GSSA, LLC from the job announcement materials:

GSSA, LLC, a progressive political consulting firm, is actively seeking an experienced data professional to fill the position of Data Scientist. The Data Scientist reports to the Director and is responsible for gathering, assessing, and presenting data from voter databases, public opinion polls, the U.S. Census Bureau and other sources; identifying and interpreting meaningful patterns; conducting randomized control trials (experiments); and compiling visualizations and reports.

GSSA, LLC, is a progressive political research firm located in the LoDo area of Downtown Denver. We follow a hybrid work model, designed to give a high degree of flexibility while maintaining a base in-person business model. For this position, remote candidates will be considered.

When I searched for GSSA, LLC, on Google, I found a LinkedIn profile for the founder, and a web page with public information about the corporation. Also, when I looked for news about the company, I found a news article.

Founder’s Profile

I was very impressed with the LinkedIn profile of Noel Smith, the founder of GSSA, LLC. I am familiar with his alma mater, Liberty University, because I have tutored learners studying in their programs.

Smith has a degree in business, and a background in managing and owning businesses.

Public Records about the Company

Next, I looked at public records I could find about the company. I found one from OpenCorporates that provided the following insight:

  • The company was originally established in 2011
  • There was not much activity until 2016, when there were some changes of address and of registered agent
  • The company is currently registered in the state of Colorado and is in good standing

I know that Liberty University, where the owner went to school for his business degree, is a Christian college in the South of the United States (US). It appears to me that GSSA was started in North Carolina (which is in the South), and only later moved to Colorado. It is obviously a small company that is lean, and can move around easily when necessary.

News about the Company

I admit, I did not look very hard, but I could not find a web page for the company GSSA, LLC. Maybe there is one, but I couldn’t find it. When I can find a company’s web page, I like to look at their press releases, because that way, I can figure out what their reputation might be like.

I also look for occurrences of the company name in the news. I was pleased to find this article in Blavity about how GSSA, LLC was involved in a civic engagement campaign to boost long-term Black voter turnout.

Why Researching Data Science Companies is Important

Why was I pleased to see that article about Black voters? Because this activity is consistent with my values, so if I were the job seeker, I’d be positively influenced by this news. As it is, after my research, I am positively inclined to promote this job opportunity to my customers learning data science and seeking a new job.

And that is the main reason to do research into prospective companies – because you want to see if they are consistent with your values. You may end up having this company’s name on your resume for the rest of your life. If you worked for a company that is against what you believe in, or has been disgraced in a scandal, how will you explain that to a future employer?

Hopefully, this blog post will help you sort out how you feel about these companies before you interview. Please make sure to visit my blog and YouTube channel often to see updates on job seeking in data science, and get my latest career development advice!

Updated January 16, 2022. Added livestream video January 19, 2022. Added video January 22, 2022. Added banners and slider July 3, 2024.
