SAS Format

Description

SAS .sas7bdat files can be imported into DSS using a Pandas-based reader.

This capability is provided by the “SAS Format” plugin, which you need to install. Please see Installing plugins.

This implementation works with almost any SAS file but may be slower than the default SAS reader provided with Dataiku. It relies on the Pandas library shipped with Dataiku, so its behavior may change when future Pandas versions are released.

How To Use

Import your .sas7bdat file into Dataiku, then choose the “SAS using Pandas” format, as shown in the picture above.

The chunksize parameter sets how many rows are read during each iteration; a higher value means faster reads but higher memory usage.
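The trade-off can be illustrated with a minimal sketch of chunked reading using the Pandas function pandas.read_sas. This is not the plugin's actual code; the helper names iter_sas_chunks and count_rows and the default chunk size are assumptions made for illustration.

```python
from typing import Any, Iterable, Iterator, Sized


def iter_sas_chunks(path: str, chunksize: int = 10000) -> Iterator[Any]:
    """Yield DataFrames of at most `chunksize` rows from a .sas7bdat file.

    Hypothetical helper, not part of the plugin.
    """
    import pandas as pd  # imported lazily so count_rows works without pandas

    # With chunksize set, read_sas returns an iterable reader, not a DataFrame.
    reader = pd.read_sas(path, format="sas7bdat", chunksize=chunksize)
    try:
        for chunk in reader:
            yield chunk
    finally:
        reader.close()


def count_rows(chunks: Iterable[Sized]) -> int:
    """Count rows across chunks; only one chunk is held in memory at a time."""
    return sum(len(chunk) for chunk in chunks)
```

For example, count_rows(iter_sas_chunks("data.sas7bdat", chunksize=50000)) would count the rows of a file at the placeholder path data.sas7bdat while keeping at most 50000 rows in memory at once.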

Implementation choices:

  • Date values may be kept in their original SAS numeric form (number of days since 1960-01-01 for dates, number of seconds since 1960-01-01 for datetimes) depending on the Pandas version.

  • Integer values may be transformed into double values.
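If a column comes through as raw SAS numbers, it can be converted after import. The sketch below uses only the Python standard library and assumes the usual SAS convention that the epoch is 1960-01-01, with dates counted in days and datetimes in seconds. All function names are hypothetical.

```python
from datetime import date, datetime, timedelta
from typing import Union

# SAS counts dates and datetimes from 1960-01-01.
SAS_EPOCH = datetime(1960, 1, 1)


def sas_date_to_date(days: float) -> date:
    """Convert a SAS date (days since 1960-01-01) to a Python date."""
    return (SAS_EPOCH + timedelta(days=days)).date()


def sas_datetime_to_datetime(seconds: float) -> datetime:
    """Convert a SAS datetime (seconds since 1960-01-01) to a Python datetime."""
    return SAS_EPOCH + timedelta(seconds=seconds)


def as_int_if_whole(value: float) -> Union[int, float]:
    """Return an int when a double holds a whole number, else the value unchanged."""
    return int(value) if float(value).is_integer() else value
```

On a whole Pandas column, pd.to_datetime(column, unit="s", origin="1960-01-01") should give the same result for datetimes, with unit="D" for dates. Check a few values first to confirm which unit a given column uses.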