The format procedure, PROC FORMAT, allows us to create user-defined formats for our variables in SAS. PROC FORMAT lets us define a mapping that controls how a variable is displayed based on its value.
proc format;
    /* Character formats need a $ prefix because the ranges are quoted strings */
    value $gender
        'M' = 'Male'
        'F' = 'Female'
        other = 'N/A'
    ;
run;
PROC FORMAT is one of the most powerful procedures in the SAS language, but it is also underused. It gives us an easy way to label our data based on a mapping we provide.
With PROC FORMAT, we can create user-defined formats that map values to character strings, and then apply them to any variable of the matching type, numeric or character.
For example, let’s say I have the following SAS dataset, which contains some dates.
data data_with_dates;
    input d date9.;
    format d date9.;
    datalines;
31DEC2021
24OCT2020
12DEC2019
07JUN2019
17FEB2021
12JAN2021
03MAR2020
;
run;
For each of the dates, I want to find the day of the week. The problem, though, is that the SAS weekday() function returns the number of the day of the week (1 for Sunday through 7 for Saturday), not its name.
Below is SAS code with a data step getting the day of the week from the variable “d”.
data data_with_weekday;
    set data_with_dates;
    wd = weekday(d);
run;
The resulting dataset with the week day of the date variable is as follows.
   d           wd
1  31DEC2021   6
2  24OCT2020   7
3  12DEC2019   5
4  07JUN2019   6
5  17FEB2021   4
6  12JAN2021   3
7  03MAR2020   3
This isn’t ideal, but we can easily get the name of the day of the week with PROC format.
proc format;
    value dayname
        1 = 'Sunday'
        2 = 'Monday'
        3 = 'Tuesday'
        4 = 'Wednesday'
        5 = 'Thursday'
        6 = 'Friday'
        7 = 'Saturday'
        other = 'N/A'
    ;
run;
By applying the format ‘dayname’, we get the desired result with the name of the day of the week in our variable.
data data_with_weekday;
    format dayname dayname.;
    set data_with_dates;
    wd = weekday(d);
    dayname = wd;
run;

/* Output */
   dayname     d           wd
1  Friday      31DEC2021   6
2  Saturday    24OCT2020   7
3  Thursday    12DEC2019   5
4  Friday      07JUN2019   6
5  Wednesday   17FEB2021   4
6  Tuesday     12JAN2021   3
7  Tuesday     03MAR2020   3
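In the step above, dayname is still a numeric variable that is only displayed as text. If a real character variable is needed, the PUT function can apply the same format inside a data step. The following is a minimal sketch, assuming the dayname format and the data_with_dates dataset from above already exist in the session; the names data_with_dayname_char and dayname_char are hypothetical and chosen only for illustration.

```sas
/* Assumes the dayname format and data_with_dates defined earlier have been run. */
data data_with_dayname_char;
    set data_with_dates;
    length dayname_char $ 9;                    /* 'Wednesday' is the longest label */
    dayname_char = put(weekday(d), dayname.);   /* numeric 1-7 -> label text */
run;
```

Because dayname_char stores the label itself, it can be concatenated or compared like any other string, at the cost of losing the underlying number.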
Using PROC Format for Character Variables
With PROC FORMAT, we have the potential to save a lot of code. Let’s say we have a dataset with a coded variable and we want to display its values as something more descriptive.
data data;
    input gender $;
    datalines;
F
F
M
F
M
M
M
F
G
F
;
run;
We can easily create a format for this character variable using PROC Format. To create a character variable format, add $ to the front of the name of the format.
proc format;
    value $gender
        'M' = 'Male'
        'F' = 'Female'
        other = 'N/A'
    ;
run;
Below is the result after we apply the format for character variables from PROC format to our SAS dataset.
data data_new;
    format gender $gender.;
    set data;
run;

/* Output */
    gender
1   Female
2   Female
3   Male
4   Female
5   Male
6   Male
7   Male
8   Female
9   N/A
10  Female
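A format does not have to be stored on the dataset; it can also be attached for a single procedure with a FORMAT statement. The sketch below assumes the $gender format and the dataset named data from above are available in the session, and uses PROC FREQ because it groups by the formatted values.

```sas
/* Assumes the $gender format and the dataset named data defined above. */
proc freq data=data;
    tables gender;
    /* Applied for this step only; the stored values remain 'M', 'F' and 'G' */
    format gender $gender.;
run;
```

This keeps the raw codes in the data and leaves the labelling to the reporting step, which is often easier to maintain than recoding the variable.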
Using PROC Format for Numeric Variables
We can also use PROC format to define formats for numeric variables.
Let’s say we have a dataset with credit scores and want to label each score with the level it falls into.
data data;
    input credit_score;
    datalines;
624
653
719
754
734
701
687
643
651
740
;
run;
Let’s say we want to label the credit scores in 50-point increments. We can create a format for this variable easily with PROC FORMAT. Below, we define ranges of numbers in the format procedure and label each one.
proc format;
    value credit_score_buckets
        low - 599 = '<600'
        600 - 649 = '600-649'
        650 - 699 = '650-699'
        700 - 749 = '700-749'
        750 - high = '750+'
    ;
run;
Below is the result from when we apply this format to our SAS dataset.
data data_new;
    format credit_score credit_score_buckets.;
    set data;
run;

/* Output */
    credit_score
1   600-649
2   650-699
3   700-749
4   750+
5   700-749
6   700-749
7   650-699
8   600-649
9   650-699
10  700-749
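The ranges above work for whole-number scores, but a value such as 649.5 falls between two ranges and would be printed unformatted, and for numeric formats the keyword low does not cover missing values. The following is a sketch of a variant that closes both gaps, assuming that behavior is wanted; the format name credit_bucket_strict and the label 'Missing' are hypothetical.

```sas
proc format;
    value credit_bucket_strict        /* hypothetical name */
        .          = 'Missing'        /* numeric LOW does not include missing */
        low -< 600 = '<600'           /* -< excludes the upper endpoint */
        600 -< 650 = '600-649'
        650 -< 700 = '650-699'
        700 -< 750 = '700-749'
        750 - high = '750+'
    ;
run;
```

The -< notation makes each range run up to, but not including, the start of the next one, so every non-missing number belongs to exactly one bucket.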
Creating a User-Defined SAS Dollar Format with PROC Format
With PROC FORMAT, we are able to define number formats so our data is displayed more cleanly. One such format is a currency format, or in particular, a dollar format.
We could use the standard SAS dollar format, but the built-in formats always print the full number, and sometimes you want a little more control.
With PROC format, we can build our own dollar format.
When viewing financial data, personally, I like to convert what I’m looking at into thousands, millions or billions, depending on the value of the number.
We can create a format for this easily with PROC format in the following way. We will create a picture format for displaying dollars with the following SAS code.
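proc format;
    /* mult= scales the value before its digits fill the picture. */
    /* The short pictures carry extra leading 0s to leave room for the prefix. */
    picture our_dollar_format
        low - -1000000000 = '00,000,000,009.9 B' (prefix='-$' mult=.00000001)
        -1000000000 <- -1000000 = '00,000,009.9 M' (prefix='-$' mult=.00001)
        -1000000 <- -1000 = '00009.9 K' (prefix='-$' mult=.01)
        -1000 <-< 0 = '00009.99' (prefix='-$')
        0 -< 1000 = '0009.99' (prefix='$')
        1000 -< 1000000 = '0009.9 K' (prefix='$' mult=.01)
        1000000 -< 1000000000 = '00,000,009.9 M' (prefix='$' mult=.00001)
        1000000000 - high = '00,000,000,009.9 B' (prefix='$' mult=.00000001)
    ;
run;
The “prefix” option adds the “$” sign to the number (“-$” for negative values, because a picture format works with the absolute value), and the “mult” option scales the number before its digits are placed into the picture. The decimal point in a picture is only a literal character, so the multiplier has to leave the digit that follows it in the integer part: to show millions with one decimal place, “mult” is .00001, which turns 10382741.23 into 103 and displays it as 10.3 M. Without the “round” option, the remaining decimals are truncated rather than rounded.
Let’s see how “our_dollar_format” is applied in practice.
data data;
    format num our_dollar_format.;
    input num;
    datalines;
10382741.23
817293.12
754.89
340.40
78701.28
23074813.74
6431782.00
4832.93
;
run;

/* Output */
   num
1  $10.3 M
2  $817.2 K
3  $754.89
4  $340.40
5  $78.7 K
6  $23.0 M
7  $6.4 M
8  $4.8 K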
The dollar values above are now much easier to read.
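With this many ranges, it is worth checking values at and around each boundary, including negative ones. The following is a small sketch that writes a few hand-picked test values and their formatted text to the SAS log, assuming our_dollar_format has already been created in the session; the test values and the variable name formatted are hypothetical.

```sas
/* Assumes our_dollar_format has been created by the PROC FORMAT step above. */
data _null_;
    /* Hypothetical test values around the range boundaries, including negatives */
    do num = -2500000, -42.5, 0, 999.99, 1000, 999999, 1000000, 1500000000;
        formatted = put(num, our_dollar_format.);   /* apply the picture format */
        put num= best16. formatted=;                /* write both to the log */
    end;
run;
```

Reading the log line by line makes it easy to spot a range that uses the wrong multiplier or a picture that is too narrow for its prefix.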
Creating a User-Defined SAS Percent Format with PROC Format
As in the last example, with PROC FORMAT we can easily define number formats so our data is displayed more cleanly. Another useful format is the percent format.
We could use the standard SAS percent format, but sometimes you want the ability to customize formats.
With PROC format, we can build our own percent format.
We can create a format for this easily with PROC format in the following way. We will create a picture format and pass the “round” option so that the format rounds the data first, and then displays the percents.
proc format;
    picture our_percent_format (round)
        low -< 0 = '000009.9%' (prefix='-' mult=1000)
        0 - high = '000009.9%' (mult=1000)
    ;
run;
Depending on the scale of your data, you may want to change the multiplier.
Let’s see how this looks when it is applied to some data.
data data;
    format num our_percent_format.;
    input num;
    datalines;
0.123
1.789
0.504
0.981
0.48
;
run;

/* Output */
   num
1  12.3%
2  178.9%
3  50.4%
4  98.1%
5  48.0%
As you can see above, this looks much better. You can adjust the multiplier and the picture to get exactly what you want for your user-defined percent format.
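As an illustration of changing the multiplier, suppose the data is already stored in percentage points, so that 12.3 is meant to display as 12.3%. Since the picture has one digit after the decimal point, the multiplier only needs to move that digit into the integer part. The sketch below assumes that scale; the format name pct_points_format is hypothetical.

```sas
proc format;
    /* Hypothetical variant for values already expressed in percentage points */
    picture pct_points_format (round)
        low -< 0 = '000009.9%' (prefix='-' mult=10)   /* -12.3 -> 123 -> -12.3% */
        0 - high = '000009.9%' (mult=10)              /* 12.3 -> 123 -> 12.3% */
    ;
run;
```

The general rule is that the multiplier equals the scale factor times 10 raised to the number of decimal places in the picture: 100 x 10 = 1000 for proportions, and 1 x 10 = 10 for percentage points.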
Hopefully this article has been helpful for you to understand how to use PROC format to define user formats and use them in your SAS code.