SAS macros for beginners are always challenging. The problem is that macros are for automating code you are sick of running manually over and over, and that really just doesn’t happen to students learning data science. But it happens all the time to developers running data warehouses!
SAS Macros for Beginners: Starting with Just the Basics
The code I demonstrate in the videos are greatly simplified versions of the kind of code I use when I run a data warehouse. SAS macros for beginners need to be simplified, so I can illustrate each step for you independently.
Here, I give examples of taking baby steps to turn base code from SAS into reporting and extract-transform-load (ETL) macros.
SAS Macros for Beginners: Building a Reporting Macro
Many times, you want a report to come out differently depending up on the input data. In the examples I show here, we have a subset of data from one of the annual Behavioral Risk Factor Surveillance System (BRFSS) survey datasets. I took just data from three states: Florida (FIPS code = 12), Massachusetts (FIPS code = 25), and Minnesota (FIPS code = 27).
Actually, here is a video of me running a one-way frequency using PROC FREQ pointed at the state variable, _STATE. You can see the distribution of states in the example dataset.
Then, I start by setting up our base code.
Step 1. Create Base Reporting Code that “Wants” to be Automated
So, I started by creating simple report that just runs a one-way frequency using PROC FREQ on one of the variables. Since BRFSS is a health survey, they asked the respondents if they had various health conditions. One of them is diabetes (variable = DIABETE3), so I ran a PROC FREQ on DIABETE3.
But there is nothing interesting about running PROC FREQ on a variable against the whole dataset. I had to make it so that the code “wanted” to be automated. It’s hard to describe what I mean, but I’ll try by saying how I modified the code.
What I did was add WHERE criteria on the state variable (_STATE), because then you had to choose between Florida, Massachusetts, and Minnesota each time you ran the report. And the report would say on the title in the output, “PROC FREQ with _STATE=” and then the value, so that way, we could see that the report was evaluating for the right state. You will see that to write the base code, I just hard coded in Florida (code 12).
Step 2. Practice Adding Macro Variables
So basically, by adding WHERE criteria to the base PROC FREQ, I made the report “want” to take in parameterized inputs that could change each time you run it. I start by running it hard-coded for Florida (so I know what correct Florida output looks like), and then I can start building automation from there.
In this next video, I demonstrate turning that WHERE parameter I hard-coded for Florida into a macro variable. In this iteration, you can change the value of the macro variable manually – by hard-coding it – which is not ultimately what we want. However, it takes us one baby step toward automating the macro, and also, demonstrates how macro variables can be used in SAS.
Step 3. Turning the Code into a Macro
Now we have some code that runs a frequency with a hard-coded macro variable, but we don’t actually have a macro. This next video shows how I turn the code into a macro, and then I can launch the macro.
Step 4. Parameterizing the Macro
Finally, in this step, we parameterize the macro – meaning that we make it so that when you launch it, you need to declare a value for the macro variable. That way, we can decide when we run it whether we are running it for Florida, Massachusetts, or Minnesota.
SAS Macros for Beginners: Building an ETL Macro
Above, I showed you how to build a reporting macro that could be parameterized. Now, we are going to do the same thing, but incorporating a data step. That’s to simulate the kind of ETL we do in data warehousing, and just to provide another of many examples of how SAS code is automated with macros in a large data-intensive organization or project.
Step 1. Create Base ETL Code that Runs Well to Put in the Macro
One of the things I emphasize when I teach macro programming is that it is for automating code you already wrote and are sure runs. Most of the time when you do research projects in academia, there really isn’t any reason to set up macros. Why would you automate a one-time analysis?
But when running a data warehouse, you create ETL code for processing raw data into your warehouse that has to be run each time you receive a new dataset from the data provider. This ETL code often adds variables. So in this video, I made some base SAS data step code that adds a flag variable to a dataset. This is the base code that I later automate with macro language.
Step 2. Make the Code into a Macro and Call it
The next step is to convert the code into a macro, and to call it to make sure it runs, as I show in this video. In this step, you have to choose what parts of the macro you are going to turn into macro variables so they can be parameterized when you call the macro.
If you run into trouble, you might hang back and make the extra half-step of turning them all into macro variables and declaring the same parameters as you used in the base code with a %LET statement (as we did with the reporting macro).
Step 3. Add Conditions at the Beginning of the Macro if Needed
You’ll see I am demonstrating doing an ETL maneuver where we take a bunch of similarly-coded variables and create a new flag variable based upon their native coding. The problem is that not all of the native variables are coded exactly the same way.
- The asthma variable (ASTHMA3) is coded like the other health condition variables, where 1 = Yes, 2 = No, and the other values indicate various types of “unknown”.
- But the diabetes variable (DIABETE3) is coded where 1 = Yes, and 2 = Yes (but only during pregnancy), and the other values mean other things.
The issue is that the flag variable we are transforming is supposed to indicate “yes” when it equals 1, so we have to process DIABETE3 differently. A way to handle that in the macro is to set up a condition, which is what we do in this video.
I always try to add the conditions at the beginning of the macro, and I add them in order of when they show up. I usually add comments as to when and why they were added.
Step 4. Store the Macros in Separate Files
In the previous videos, I demonstrated us making a macro and calling it in the same code file. However, this is a huge no-no in data warehousing. In data warehousing, you need to store you macros in separate files from the code that calls them.
- Macro files: These are SAS code files with only macros in them. I try to keep very few of them – like put all the ETL macros in one file, and report macros in another file, for example. These need to be stored in a known place that other code can access, and you want to make sure only a few people have editing privileges.
- Code that calls macro files: This is usually ETL code or some other production code. Whoever is making this code needs to be able to run code in the macro files – but not necessarily edit it.
This video shows how to separate macro code into a separate file, and then call it from another file.
Building SAS Macro ETL and Reporting Code in a Data Warehouse
These examples are just SAS macros for beginners. Over time, your ETL macro code will grow, because you will keep adding new conditions at the beginning of each macro. As a data warehouse leader, my job was always to figure out:
- Should we just add another condition to a macro?
- Or should we just make a new macro?
Sometimes, if you have a whole bunch of variables that you are conditioning out, you can just make a new macro to call for those. You have to be flexible as time goes on, because your source data keep changing in structure. If you face challenges editing macros, read Chapter 9 of my book, “Mastering SAS Programming for Data Warehousing” for tips on troubleshooting macros.
Published November 29, 2021.
Want to get started learning about SAS macros? This blog post provides SAS macros for beginners with video tutorials to walk beginners and code newbies through the basic steps!