«

»

Working with Special Missing Null Values for SAS numeric variables

Introduction

SAS not only provides standard functionality for setting character or numeric fields to null, it also provides the ability to create specialized types of null values for numeric fields. For instance, you can create one type of null value to represent data that doesn’t exist vs. data that exists but is unknown vs. data that hasn’t been captured yet, etc. These specialized categories of null values are called special missing values.

First this blog shows how SAS handles standard numeric nulls values. Then it shows how you can use special missing values to create specialized categories of null values.

This blog focuses on nulls for numeric fields. At some point I may create a separate blog discussing null character fields in SAS.

 

I. A quick primer on null numeric values in SAS

The following code sets the nbr_sales  (a numeric field) for the second observation to null.

/* create data sets with one null sales value */

data samp_ds1;

format fname lname $10. nbr_sales 8. dob date9.;

input fname $ lname $ nbr_sales dob;

datalines;

John Smith 20 0

Mary Washington . 0

Meriwether Clark 30 1240

;

SAS displays these values as a . in the numeric field.

You can also assign a numeric variable to null in a sas data step by setting the numeric value to . (i.e, a period).

/* sets dob to null for second row */

data samp_ds1;

set samp_ds1;

if fname=”Mary” then dob=.;

run;

You can query for numeric nulls by referencing numeric fields with a . value

/* just filter the rows with a null dob */

data temp;

set samp_ds1 (where=(dob=.));

run;

 

Warning
SAS treats numeric nulls as equal to “the lowest possible number” (essentially negative infinity) when sorting a numeric field. This means, for instance, that if you want to subset your data to only those values less than a certain number, you must explicitly filter out nulls if you don’t want them. For example, if you want to to subset your data to only people with a birth date before a certain date, and you don’t filter out nulls, you will include those people with null birth dates.

/* this operation does NOT filter out null dobs */

data born_before_1961_wrong;

set samp_ds1 (where=(dob lt “01JAN1961″d));

run;

To avoid this, remember to explicitly filter out null values.

/* this operation explicitly filters out null dobs */

data born_before_1961_right;

set samp_ds1 (where=(dob lt “01JAN1961″d and dob ne .));

run;

II. Working with SAS special missing values

In addition to providing standard null values for numeric fields, SAS also provides the ability to set special null values to represent specific types of null values. These values are represented by a dot followed by a single character (.a, .b, .c, etc.) or a hyphen (.-).

In a “datalines” data step, you can set these specialized null values using a missing statement. The example that follows creates two specia missing values (.A and .I):

/* create specialized null values for absent [A} and incomplete [I] */

data samp_ds2;

format fname lname $10. nbr_sales 8. dob date9.;

missing A I;

input fname $ lname $ nbr_sales dob;

datalines;

John Smith 20 0

Mary Washington A .

Meriwether Clark 30 .

;

Now, the specialized null value for nbr_sales for the second row is set to the special missing value of .A, and is represented by A in the numeric field.

You can also assign variables to special missing numeric values in a data statement.

/* update to specialized missing value using a hyphen */

data samp_ds2;

set samp_ds2;

if fname=”Mary” then dob=._;

run;

 

Warning
A search for regular SAS numeric null values will miss special numeric null values. Keep this in mind when creating and working with special missing values.

 

/* a regular search for nulls will MISS specialized null values */

data temp;

set samp_ds2 (where=(dob=.));

run;

However, the following condition will catch all numeric null values, including specialized null values.

/* this regular search for nulls will also catch special nulls */

data temp;

set samp_ds2 (where=(dob<=.z));

run;

 

Another way you can catch all types of nulls is using the SAS missing() function, which will catch all null values for both character and numeric data types.

/* this search for nulls will also catch special nulls */

data temp;

set samp_ds2 (where=(missing(dob)));

run;

Yet another way you can catch all numeric nulls is to use the missing reserved word in a proc sql statement.

/* proc sql will catch them using reserved word “missing” */

proc sql;

create table temp as

select * from samp_ds2

where dob is missing;

quit;

Finally, you can filter for just special missing values using the following approach.

/* this search for nulls will only catch the special type you request */

data temp;

set samp_ds2 (where=(nbr_sales =.a));

run;

 

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>