Dataiku DSS
Welcome to the reference documentation for Dataiku Data Science Studio (DSS).
Is This the Help You’re Looking For?
- The reference documentation covers installing and configuring Dataiku DSS in your environment, using the tool through the browser interface, and driving it through the API.
- Dataiku Academy contains self-learning tutorials and use cases.
- Dataiku Answers is a place where you can ask questions and receive answers from other members of the community.
Reference Doc Contents
- Installing DSS
- Requirements
- Installing a new DSS instance
- Upgrading a DSS instance
- Updating a DSS license
- Other installation options
- Setting up Hadoop and Spark integration
- Setting up Dashboards and Flow export to PDF or images
- R integration
- SageMaker Integration
- Customizing DSS installation
- Installing database drivers
- Java runtime environment
- Python integration
- Installing a DSS plugin
- Configuring LDAP authentication
- Working with proxies
- Migration operations
- DSS concepts
- Homepage
- Projects
- Connecting to data
- Supported connections
- Upload your files
- Server filesystem
- HDFS
- Amazon S3
- Google Cloud Storage
- Azure Blob Storage
- FTP
- SCP / SFTP (aka SSH)
- HTTP
- SQL databases
- Cassandra
- MongoDB
- Elasticsearch
- Managed folders
- “Files in folder” dataset
- Metrics dataset
- Internal stats dataset
- HTTP (with cache)
- Dataset plugins
- Data connectivity macros
- Making relocatable managed datasets
- Data ordering
- Exploring your data
- Schemas, storage types and meanings
- Data preparation
- Charts
- Machine learning
- Prediction (Supervised ML)
- Clustering (Unsupervised ML)
- Automated machine learning
- Model Settings Reusability
- Features handling
- Algorithms reference
- Advanced models optimization
- Models ensembling
- Deep Learning
- Models lifecycle
- Scoring engines
- Writing custom models
- Exporting models
- Partitioned Models
- The Flow
- Visual recipes
- Recipes based on code
- Code notebooks
- Webapps
- Code reports
- Dashboards
- DSS in the cloud
- Working with partitions
- DSS and Hadoop
- Setting up Hadoop integration
- Connecting to secure clusters
- Hadoop filesystems connections (HDFS, S3, EMRFS, WASB, ADLS, GS)
- DSS and Hive
- DSS and Impala
- Hive datasets
- Multiple Hadoop clusters
- Dynamic AWS EMR clusters
- Hadoop user isolation
- Distribution-specific notes
- Teradata Connector For Hadoop
- Dynamic Google Dataproc clusters
- DSS and Spark
- DSS and SQL
- DSS and Python
- DSS and R
- Metastore catalog
- Code environments
- Running in containers
- Concepts
- Using code envs with containerized execution
- Setting up (Kubernetes)
- Unmanaged Kubernetes clusters
- Managed Kubernetes clusters
- Using Amazon Elastic Kubernetes Service (EKS)
- Using Microsoft Azure Kubernetes Service (AKS)
- Using Google Kubernetes Engine (GKE)
- Setting up (Docker)
- Remote Docker daemons
- Customization of base images
- Troubleshooting
- Collaboration
- Automation scenarios, metrics, and checks
- Automation node and bundles
- API Node & API Deployer: Real-time APIs
- Time Series
- Unstructured data
- Plugins
- Python APIs
- Using the APIs inside of DSS
- Using the APIs outside of DSS
- API for interacting with datasets
- API for interacting with Pyspark
- API for managed folders
- API for interacting with saved models
- API for scenarios
- API for performing SQL, Hive and Impala queries
- API for performing SQL, Hive and Impala queries like the recipes
- API for metrics and checks
- API for creating static insights
- Reference API documentation of dataiku
- API for plugin components
- dataikuapi: The REST API client
- R API
- Public REST API
- Additional APIs
- File formats
- Security
- User Isolation
- Operating DSS
- Advanced topics
- Accessibility
- Troubleshooting
- Release notes
- DSS 6.0 Release notes
- DSS 5.1 Release notes
- DSS 5.0 Release notes
- DSS 4.3 Release notes
- DSS 4.2 Release notes
- DSS 4.1 Release notes
- DSS 4.0 Release notes
- DSS 3.1 Release notes
- DSS 3.0 Release notes
- DSS 2.3 Release notes
- DSS 2.2 Release notes
- DSS 2.1 Release notes
- DSS 2.0 Release notes
- DSS 1.4 Release notes
- DSS 1.3 Release notes
- DSS 1.2 Release notes
- DSS 1.1 Release notes
- DSS 1.0 Release notes
- Pre versions
- Other Documentation
- Third-party acknowledgements