Delta Lake

Delta Lake is a storage format built on top of Parquet. It adds the ability to update and delete records, along with other database-oriented features.

Warning

Experimental: Support for Delta Lake is experimental and not fully supported

Applicability and limitations

  • DSS provides read-only support for Delta Lake. Writing is not supported

  • DSS does not support “time travel” on Delta Lake. Only the latest version of a table is read

  • Delta Lake datasets can only be stored on S3, Azure Blob Storage, Google Cloud Storage or HDFS

  • While Delta Lake datasets can be processed with any recipe, we strongly recommend processing them with Spark recipes

  • Delta Lake datasets that have underlying partitioning are read as unpartitioned. Partitioning a Delta Lake dataset in DSS is not supported
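
To illustrate what reading "only the latest version" means, the sketch below replays a Delta Lake transaction log to find the Parquet files that make up the current state of a table. It is a simplified illustration and not how DSS reads Delta Lake. It assumes the log consists only of JSON commit files under `_delta_log` and ignores checkpoint files. The helper names `active_files` and `build_demo_table`, and the file names in the demo table, are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def active_files(table_dir):
    """Replay the JSON commits under _delta_log and return the data files
    that belong to the latest version of the table.

    Simplification (assumption): checkpoint files are ignored, so every
    commit must still be present as a JSON file.
    """
    log_dir = Path(table_dir) / "_delta_log"
    files = set()
    # Commit files are named with a zero-padded version number, so sorting
    # by file name replays them in commit order.
    for commit in sorted(log_dir.glob("*.json")):
        for line in commit.read_text().splitlines():
            if not line.strip():
                continue
            action = json.loads(line)
            if "add" in action:
                files.add(action["add"]["path"])
            elif "remove" in action:
                files.discard(action["remove"]["path"])
    return sorted(files)


def build_demo_table(root):
    """Write a tiny, hypothetical two-commit log for demonstration."""
    log_dir = Path(root) / "_delta_log"
    log_dir.mkdir(parents=True)
    commits = [
        # Version 0: two data files are added.
        [{"add": {"path": "part-0.parquet"}},
         {"add": {"path": "part-1.parquet"}}],
        # Version 1: an update rewrites part-0 into part-2.
        [{"remove": {"path": "part-0.parquet"}},
         {"add": {"path": "part-2.parquet"}}],
    ]
    for version, actions in enumerate(commits):
        name = f"{version:020d}.json"
        body = "\n".join(json.dumps(a) for a in actions)
        (log_dir / name).write_text(body + "\n")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        build_demo_table(tmp)
        print(active_files(tmp))
```

In this demo, the file removed in the second commit is no longer part of the table. A reader that only sees the latest version, as DSS does, would never return the rows it contained, whereas time travel would stop the replay at an earlier commit.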