Logging and auditing

The API node outputs three kinds of logs:

  • Regular runtime logs (in the run/apimain.log file)

  • Audit logs for the administration API

  • Logs of queries

Query logging is especially important if you plan to implement a feedback loop: you need to know what was predicted for each record. You will also need a way to retrieve “what finally happened” for each record that the API node predicted (did this customer convert? churn? was it fraud? did the sensor fail? …)

By default:

  • Administration API audit logs are written to the same run/apimain.log file

  • Queries are logged in the run/audit folder

How to configure audit and query logging

Audit and query logging is done through the standard Java Log4J logging mechanism.


Dataiku DSS has been confirmed not to be vulnerable to the following families of Log4J vulnerabilities:

  • “log4shell” vulnerabilities (CVE-2021-44228, CVE-2021-45046, CVE-2021-45105)

  • JMSAppender and JMSSink vulnerabilities (CVE-2021-4104, CVE-2021-44832, CVE-2022-23302)

  • JDBCAppender vulnerabilities (CVE-2022-23305)

  • SocketAppender vulnerability (CVE-2019-17571)

  • SMTPAppender vulnerability (CVE-2020-9488)

  • Chainsaw vulnerability (CVE-2022-23307)

No mitigation action or upgrade is required. Dataiku closely monitors the security situation around Log4J, as it does for all of its third-party dependencies, and will take action if a vulnerability turns out to be exploitable.

You can set the destination of these loggers by modifying the Log4J appenders in the bin/log4j.properties file.

The loggers used for audit logging are:

  • dku.apinode.audit.queries

    • Logs all queries to prediction endpoints, in JSON format. The log message includes the input features, the prediction results, and timing information

  • dku.apinode.audit.auth

    • Logs authentication failures, both on Admin and User APIs

  • dku.apinode.audit.admin

    • Logs all modifications done through the admin API. The log message includes details about the API key used to perform the call

  • dku.apinode.audit.allcalls

    • Logs basic information for all API calls, both Admin and User APIs. It is generally not recommended to enable this logger
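As an illustration of redirecting one of these loggers, the sketch below sends authentication failures to a dedicated file. It assumes Log4J 1.x properties syntax. The appender name AUTH_AUDIT_FILE, the file path, and the conversion pattern are hypothetical choices, not DSS defaults.

```properties
# Hypothetical appender: a plain file for authentication audit events
log4j.appender.AUTH_AUDIT_FILE=org.apache.log4j.FileAppender
# Assumed path; adjust to a location writable by the API node
log4j.appender.AUTH_AUDIT_FILE.File=run/audit/auth.log
log4j.appender.AUTH_AUDIT_FILE.layout=org.apache.log4j.PatternLayout
log4j.appender.AUTH_AUDIT_FILE.layout.ConversionPattern=%d{ISO8601} %p %c - %m%n

# Bind the logger to the appender
log4j.logger.dku.apinode.audit.auth=INFO, AUTH_AUDIT_FILE
# Do not also pass these events to the appenders of parent loggers
log4j.additivity.dku.apinode.audit.auth=false
```

Setting additivity to false keeps these events out of the regular runtime log. Leave it out if you want them in both places.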

How to turn on query logging

In your API_DATA_DIRECTORY, create a resources directory containing a logging subdirectory (resources/logging).

In the logging directory, add a file called dku-log4j.properties.

Copy the following content into dku-log4j.properties:

# By default, send audit logging to a specific file in run
# For an inalterable audit log, this should be sent to an external system,
# not controlled by the DSS user

# Queries logging: use rolling files.
log4j.appender.QUERIES_AUDIT_FILE.layout.ConversionPattern={"timestamp" : "%d{yyyy/MM/dd-HH:mm:ss.SSS}Z", "logger": "%c", "severity" : "%p", "message" : %m}%n

# Remove audit logs from regular CONSOLE logger
log4j.logger.dku.audit= INFO, AUDITFILE

# And enable it
log4j.logger.dku.audit.generic= INFO, QUERIES_AUDIT_FILE
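The snippet above sets only the layout pattern of the QUERIES_AUDIT_FILE appender. Log4J 1.x also expects each appender to declare a class and, for file appenders, a target file. If your configuration does not already define these elsewhere, a minimal sketch could look like the following. The file name, size limit, and backup count are assumptions chosen for illustration, not documented DSS defaults.

```properties
# Hypothetical completion of the QUERIES_AUDIT_FILE appender (illustrative values)
log4j.appender.QUERIES_AUDIT_FILE=org.apache.log4j.RollingFileAppender
# Assumed path, relative to the working directory of the API node process
log4j.appender.QUERIES_AUDIT_FILE.File=run/audit/queries.log
# Roll over once the file reaches this size, keeping at most 20 older files
log4j.appender.QUERIES_AUDIT_FILE.MaxFileSize=100MB
log4j.appender.QUERIES_AUDIT_FILE.MaxBackupIndex=20
# PatternLayout is the layout class that interprets ConversionPattern
log4j.appender.QUERIES_AUDIT_FILE.layout=org.apache.log4j.PatternLayout
```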

Then, restart your API node (./bin/dss restart).

You should now see queries logged in the run/audit folder.

Logging queries to Kafka

Apache Kafka is a distributed message queue, which can be used to get query logs out of the API node.

To enable logging queries to Kafka:

  • Add all jars from the Kafka distribution to the lib/java folder

  • Replace the “Queries logging” part of bin/log4j.properties with the following snippet:


log4j.logger.dku.apinode.audit.queries= INFO, QUERIES_KAFKA


You can also send administration and authentication audit logs to Kafka by setting appropriate configuration for the other audit loggers.
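The logger lines in this section refer to Kafka appenders without showing their definitions. The sketch below shows what such definitions might look like, assuming the KafkaLog4jAppender class shipped in the kafka-log4j-appender jar. The class name can differ between Kafka versions. The broker address, the topic names, and the ADMIN_KAFKA appender name are hypothetical.

```properties
# Assumed appender class from the kafka-log4j-appender jar; check it against your Kafka version
log4j.appender.QUERIES_KAFKA=org.apache.kafka.log4jappender.KafkaLog4jAppender
# Hypothetical broker address and topic name
log4j.appender.QUERIES_KAFKA.brokerList=localhost:9092
log4j.appender.QUERIES_KAFKA.topic=apinode-queries
log4j.appender.QUERIES_KAFKA.layout=org.apache.log4j.PatternLayout
# Emit only the log message itself as the Kafka record value
log4j.appender.QUERIES_KAFKA.layout.ConversionPattern=%m%n

# Same approach for the admin audit logger, using a second hypothetical appender
log4j.appender.ADMIN_KAFKA=org.apache.kafka.log4jappender.KafkaLog4jAppender
log4j.appender.ADMIN_KAFKA.brokerList=localhost:9092
log4j.appender.ADMIN_KAFKA.topic=apinode-admin-audit
log4j.appender.ADMIN_KAFKA.layout=org.apache.log4j.PatternLayout
log4j.appender.ADMIN_KAFKA.layout.ConversionPattern=%m%n
log4j.logger.dku.apinode.audit.admin=INFO, ADMIN_KAFKA
```

Using one topic per logger lets downstream consumers subscribe to query logs without also receiving administration events.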