Learn SAS for FREE Day15

ARRAYS in SAS

We move on to the last day of our learning for SAS Base Certification. There will be more tutorials after this but they will be for Advanced SAS and SAS Analytics only.

Like other programming languages, SAS has arrays. They are mainly used to simplify code and reduce the number of lines.

Let's understand this with an example. Consider the employee data below (dataset emp), which holds a customer rating for each quarter:

name   qtr1  qtr2  qtr3  qtr4
sumit  7     8     6     8
jack   10    7     8     9
mac    8     4     6     3

Now suppose we need to pay a quarterly commission to each employee, calculated as $100 multiplied by the rating. Rather than writing a separate assignment for each quarter, we can use an array.

DATA commision;
SET emp;
ARRAY comm{4} qtr1 qtr2 qtr3 qtr4;
 DO i = 1 to 4;
   comm(i) = 100*comm(i);
 END;
RUN;

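Here we create an array with 4 elements; each element refers to one of the qtr1-qtr4 variables of the emp data. After the step runs, qtr1-qtr4 hold the commission amounts (the loop counter i is also written to the output dataset, with value 5, unless it is dropped):

name   qtr1  qtr2  qtr3  qtr4
sumit  700   800   600   800
jack   1000  700   800   900
mac    800   400   600   300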
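To try the commission example end to end, the following is a minimal sketch that first builds the emp dataset from the table above and then applies the array. The DATALINES input step and the DROP statement are additions for illustration and are not part of the original example.

```sas
/* Build the sample emp dataset from the table above */
DATA emp;
  INPUT name $ qtr1 qtr2 qtr3 qtr4;
  DATALINES;
sumit 7 8 6 8
jack 10 7 8 9
mac 8 4 6 3
;
RUN;

/* Convert each quarterly rating into a commission amount */
DATA commision;
  SET emp;
  ARRAY comm{4} qtr1 qtr2 qtr3 qtr4;
  DO i = 1 TO 4;
    comm{i} = 100 * comm{i};
  END;
  DROP i;   /* keep the loop counter out of the output dataset */
RUN;

PROC PRINT DATA=commision NOOBS;
RUN;
```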

If the variable names in emp did not follow a common numbered pattern, say qtr1, quarter2, qtrthree and qtr4, a range such as qtr1-qtr4 could not be used. In that case we need to list explicitly each variable the array refers to:

ARRAY   myarray(4)   qtr1  quarter2  qtrthree  qtr4;

 

Referencing all numeric variables in the array:

ARRAY myarray{*}  _NUMERIC_;

For all character variables:

ARRAY myarray{*}  _CHARACTER_;

If the variables in our dataset are all numeric or all character, we can refer to all of them using _ALL_:

ARRAY myarray{*}  _ALL_;

Please note that _ALL_ gives an error if the dataset contains a mix of numeric and character variables:

ERROR: All variables in array list must be the same type, i.e., all numeric or character.
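Since _ALL_ fails on mixed data such as emp, one possible approach is to declare one array per type in the same DATA step. The sketch below assumes the emp dataset from earlier exists; the output dataset name emp_clean and the clean-up rules (zero for missing ratings, upper case for text) are illustrative only.

```sas
DATA emp_clean;
  SET emp;
  /* _NUMERIC_ and _CHARACTER_ pick up the variables defined so far,
     so the loop counter i (created below) is not part of nums */
  ARRAY nums{*}  _NUMERIC_;
  ARRAY chars{*} _CHARACTER_;

  /* Replace missing numeric values with 0 */
  DO i = 1 TO DIM(nums);
    IF MISSING(nums{i}) THEN nums{i} = 0;
  END;

  /* Convert every character value to upper case */
  DO i = 1 TO DIM(chars);
    chars{i} = UPCASE(chars{i});
  END;

  DROP i;
RUN;
```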

Arrays also have the following restrictions:

1. The array name cannot be the same as the name of a variable in the dataset. Example:

DATA commision;
SET emp;
 ARRAY name{*} qtr1-qtr4 ;
  DO ;
  END;
RUN;

Because a variable called name already exists in the emp dataset, we get the following error:

ERROR 124-185: The variable name has already been defined.

 

2. Avoid using a SAS function name as the array name, for example CATX. The array definition itself works (SAS only writes a warning to the log), but for the rest of that DATA step CATX(...) is treated as an array reference, so the CATX function can no longer be called there.

 

3. You cannot use array names in LABEL, FORMAT, DROP, KEEP, or LENGTH statements.

Any of the following bracket styles can be used to define or reference an array:
( ) = parentheses
{ } = braces
[ ] = square brackets
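As a quick illustration that the three bracket styles are interchangeable, here is a small standalone sketch. The array and variable names (a, b, c, total) are made up for the example.

```sas
DATA _NULL_;
  /* The same kind of array declared with each bracket style */
  ARRAY a(3) a1-a3 (10 20 30);
  ARRAY b{3} b1-b3 (10 20 30);
  ARRAY c[3] c1-c3 (10 20 30);

  /* References can use any style too */
  total = a(1) + b{2} + c[3];
  PUT total=;   /* writes total=60 to the log */
RUN;
```

Braces or square brackets are often preferred for array references because parentheses can be mistaken for function calls.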


DIMENSION of ARRAY

So far we have declared the array as “myarray{4}”, that is, with a fixed dimension. We can also write “myarray{*}” and let SAS count the elements, but then we must list the variables the array refers to. The statement below gives an error:

ARRAY myarray[*];

ERROR: The array myarray has been defined with zero elements.

To loop over all elements of an array without hard-coding its size, we can use the DIM function:

DO i = 1 TO DIM(myarray);
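A short sketch of DIM in use, assuming the emp dataset from the beginning of this tutorial: it averages the quarterly ratings without hard-coding the number of quarters. The names avg_rating, total and avg are illustrative.

```sas
DATA avg_rating;
  SET emp;
  ARRAY q{*} qtr1-qtr4;   /* SAS counts the elements for us */

  total = 0;
  DO i = 1 TO DIM(q);     /* DIM(q) is 4 here */
    total = total + q{i};
  END;
  avg = total / DIM(q);   /* for sumit: (7+8+6+8)/4 = 7.25 */

  DROP i total;
RUN;
```

If more quarter variables were added to the array later, the loop and the average would adapt without further changes.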

The declared array size cannot be smaller than the number of variables listed. The statement below is wrong:

ARRAY  myarray(2)  age salary weight;

ERROR: Too many variables defined for the dimension(s) specified for the array.

 

Similarly, the declared size cannot be larger than the number of variables listed:

ARRAY  myarray(100)  age salary weight;

ERROR: Too few variables defined for the dimension(s) specified for the array

 

As noted earlier, if a dataset like emp has a mix of character and numeric variables and we try to create an array over _ALL_ variables, it fails:

ERROR: All variables in array list must be the same type, i.e., all numeric or character.



NEW ARRAY WITH NEW VARIABLES

An array can also create new variables in the dataset. This is mostly used for computations. For example, suppose we want to compare how each employee's rating changed between qtr1 and qtr2, qtr2 and qtr3, and finally qtr3 and qtr4.

DATA test;
SET emp;
ARRAY myarray{*} _NUMERIC_;

ARRAY diff(3);

DO i = 1 to DIM(diff);
 diff[i] = myarray(i+1) - myarray(i);
END;
RUN;

The new variables diff1-diff3 appear in the output dataset. For sumit, for example, they are 1, -2 and 2.
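One detail worth keeping in mind: _NUMERIC_ covers the numeric variables that exist at the point where the ARRAY statement appears, so the order of the two ARRAY statements above matters. The sketch below is a variant, not the original author's code, that lists qtr1-qtr4 explicitly so the result does not depend on statement order. It assumes the emp dataset from earlier.

```sas
DATA test;
  SET emp;

  /* Declared first: diff1-diff3 now exist as numeric variables */
  ARRAY diff{3};

  /* An explicit list keeps diff1-diff3 out of this array;
     _NUMERIC_ at this point would have included them as well */
  ARRAY q{*} qtr1-qtr4;

  DO i = 1 TO DIM(diff);
    diff{i} = q{i+1} - q{i};
  END;
  DROP i;
RUN;
```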

To create an array of character variables, add a dollar sign ($) after the array dimension.

array names{5} $;

Optionally we can specify a fixed length for character type arrays:

array names{5} $ 20;
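As a sketch of a character array that creates new variables, the step below assigns a text grade for each quarter. It assumes the emp dataset from earlier; the array name grade, the output dataset name graded and the cut-off of 8 are invented for the example.

```sas
DATA graded;
  SET emp;
  ARRAY q{4} qtr1-qtr4;
  ARRAY grade{4} $ 4;      /* creates grade1-grade4, each of length 4 */

  DO i = 1 TO 4;
    IF q{i} >= 8 THEN grade{i} = 'HIGH';
    ELSE grade{i} = 'LOW';
  END;
  DROP i;
RUN;
```

Setting the length up front avoids truncation: without it, each element would get the default character length of 8.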

 

If we do not want the new array's elements to appear in the dataset, we can create the array with the _TEMPORARY_ keyword. This is mostly used for holding constants needed in calculations.

For example, suppose we want to decrease each employee's rating for qtr1 by 1, for qtr2 by 2, and so on.

DATA test;
SET emp;
ARRAY myarray{*} _NUMERIC_;

ARRAY reducer(4) _TEMPORARY_  (1 2 3 4);

DO i = 1 to 4;
 myarray[i] = myarray(i) - reducer(i);
END;
RUN;

Here we have created a temporary array with 4 elements, reducer(1) to reducer(4), initialized to 1, 2, 3 and 4. Temporary array elements have no variable names and are not written to the output dataset.

Similarly, we can assign initial values to a temporary array of character type:

ARRAY  codes{3}  $  _TEMPORARY_  ('ABC', 'DEF', 'XYZ');
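A temporary character array works well as a small lookup table. The following sketch, assuming the emp dataset from earlier, finds each employee's best quarter and stores its label. The names qname, best_idx, best_qtr and the output dataset best are hypothetical.

```sas
DATA best;
  SET emp;
  LENGTH best_qtr $ 2;
  ARRAY q{4} qtr1-qtr4;
  /* Lookup table of quarter labels; not written to the output */
  ARRAY qname{4} $ 2 _TEMPORARY_ ('Q1', 'Q2', 'Q3', 'Q4');

  /* Find the position of the highest rating (first one wins on ties) */
  best_idx = 1;
  DO i = 2 TO 4;
    IF q{i} > q{best_idx} THEN best_idx = i;
  END;

  best_qtr = qname{best_idx};   /* sumit: Q2, jack: Q1, mac: Q1 */
  DROP i best_idx;
RUN;
```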



 
