Category: Analytics


Basics of a SAS Program


 The following is a basic example of a SAS program.   The program goes through the following steps.

  1. Creates a Dataset
  2. Defines the columns (commonly referred to as variables in SAS)
  3. Reads inline values to populate the SAS dataset with
  4. Prints a basic report displaying all values in the dataset
  5. Prints output given the mean of 2 numeric variables specified

If you are new to SAS but familiar with SQL, then as you learn SAS it will be easier if you think of SAS Datasets as Tables and Variables as Columns.  

Below is the sample program.  Underneath that is the program with comments.  Either of these programs can be copied and pasted into SAS and run with no other modification.

Simple SAS program example

 

DATA EMP_DATA;

INPUT EID : $5. F_NAME $ L_NAME $ AGE YRS_EMP;

CARDS;

15486 Mike Peters 41 8

52527 Louis Henna 52 14

54946 Sam Bass 31 4

17854 Cesar Chavez 25 7

77747 Fyodor Roosevelt 30 5

;

PROC PRINT DATA=EMP_DATA;

PROC MEANS MEAN; VAR AGE YRS_EMP;

 

Simple SAS program example with comments

 

 

/* This statement will tells SAS to create a dataset named EMP_DATA */

DATA EMP_DATA;

 

/* This statement describes how the data will be organized in the dataset */

INPUT EID : $5. F_NAME $ L_NAME $ AGE YRS_EMP;

 

/* CARDS as opposed to INFILE gives SAS the values to populate the dataset with */

CARDS;

15486 Mike Peters 41 8

52527 Louis Henna 52 14

54946 Sam Bass 31 4

17854 Cesar Chavez 25 7

77747 Fyodor Roosevelt 30 5

;

 

/* PROC PRINT gives tell SAS to print a report showing all values in the dataset */

PROC PRINT DATA=EMP_DATA;

 

/* PROC MEANS, in this example gives us the mean AGE and YRS_EMP (Years Employed)  */

PROC MEANS MEAN; VAR AGE YRS_EMP;

 

 


If you get the following error in SAS:

ERROR: Expression using greater than (>) has components that are of different data types

you may be trying to evaluate dates.  The following SAS PROC SQL code would generate those errors.

PROC SQL;

CREATE TABLE LIBRARY_NAME.EXAMPLE_TABLE AS

(

SELECT * FROM LIB_NAME.SRC_TABLE

WHERE PRODUCT_DATE GT ’12DEC2008′

);

QUIT;

 

The problem lies in the way I formatted the date.  If you are calling from a database and the date formats that the database holds do not work (e.g. 12/12/2008 or 2008-12-12, etc . . .) then you can use the SAS date format.  The SAS date format is DDMMMYYYY where MMM are the first three letters of the month.  You enclose this in single quotes and follow it by the letter d.  See the example below.




PROC SQL;

CREATE TABLE LIBRARY_NAME.EXAMPLE_TABLE AS

(

SELECT * FROM LIB_NAME.SRC_TABLE

WHERE PRODUCT_DATE GT ’12DEC2008′d

);

QUIT;

 



This will allow you to execute your statement. 

Using a SAS dataset and SQL table in PROC SQL


If you are reading this it is because you have written a PROC SQL statement where you reference both a dataset that you created in an earlier step and a table that resides on an RDBMS (such as SQL Server, MySQL, Teradata, etc . .  .).

Well before you go any further I have to tell you that so far it is not possible to combine the results of a SAS data set and an DBMS table within one PROC-SQL-CREATE-TABLE-AS statement.

For example with the following code I would get an ODBC error saying that "object LIBRARY.SASDATASET1 is not an object in RDBMS …"

PROC SQL;

 

CONNECT TO ODBC AS DSNNAME(DATASRC=DSN USER=USERNAME PASSWORD=PASSWORD);

 

CREATE TABLE LIBRARY.SASDATASET1 AS

SELECT * FROM CONNECTION TO DSNNAME

(SELECT ID, NAME, SSN FROM SQLDB.TABLE_EMP_ID);

 

 

CREATE TABLE LIBRARY.SASDATASET2 AS

SELECT * FROM CONNECTION TO DSNNAME

(SELECT ID, JOB_TITLE, SALARY

 FROM SQLDB.TABLE_EMP_SALARY SAL,

      LIBRARY.SASDATASET1 SDS

 WHERE SAL.ID = SDS.ID)

 

 DISCONNECT FROM DSNNAME;

 

 QUIT;

 

What you will have to do instead is make a copy of the SASDATASET1 to your SQL database (Teradata, MySQL, SQLServer, etc…) and then rewrite the second create table statement with your join pointing not to the SAS dataset but rather the table that was created from your first dataset.

 

 

PROC SQL;

 

CONNECT TO ODBC AS DSNNAME(DATASRC=DSN USER=USERNAME PASSWORD=PASSWORD);

 

CREATE TABLE LIBRARY.SASDATASET1 AS

SELECT * FROM CONNECTION TO DSNNAME

(SELECT ID, NAME, SSN FROM SQLDB.TABLE_EMP_ID);

 

 

CREATE TABLE LIBRARY.SASDATASET2 AS

SELECT * FROM CONNECTION TO DSNNAME

(SELECT ID, JOB_TITLE, SALARY

 FROM SQLDB.TABLE_EMP_SALARY SAL,

      LIBRARY.SASDATASET1 SDS

 WHERE SAL.ID = SDS.ID)

 

 DISCONNECT FROM DSNNAME;

 

 QUIT;

 

 

 

 

 

 PROC SQL;

 

CONNECT TO ODBC AS DSNNAME(DATASRC=DSN USER=USERNAME PASSWORD=PASSWORD);

 

CREATE TABLE LIBRARY.SASDATASET1 AS

SELECT * FROM CONNECTION TO DSNNAME

(SELECT ID, NAME, SSN FROM SQLDB.TABLE_EMP_ID);

 

QUIT;

 

PROC COPY;

IN=LIBRARY OUT=ODBCLIB MTYPE=(DATA);

SELECT SASDATASET1;

QUIT;

 

PROC SQL;

 

CREATE TABLE LIBRARY.SASDATASET2 AS

SELECT * FROM CONNECTION TO DSNNAME

(SELECT ID, JOB_TITLE, SALARY

 FROM SQLDB.TABLE_EMP_SALARY SAL,

      SQLDB.SASDATASET1 SDS

 WHERE SAL.ID = SDS.ID)

 

 DISCONNECT FROM DSNNAME;

 

 QUIT;

 

Questions?  Please post a comment.