API node user API¶
Predictions are obtained on the API node by using the User REST API.
The REST API¶
Request and response formats¶
For POST and PUT requests, the request body must be JSON, with the Content-Type header set to application/json.
For almost all requests, the response will be JSON.
Whether a request succeeded is indicated by the HTTP status code. A 2xx status code indicates success, whereas a 4xx or 5xx status code indicates failure. When a request fails, the response body is still JSON and contains additional information about the error.
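As a sketch of the request format, the following builds a JSON POST request with the Python standard library. The URL (including the `/public/api/v1/...` path), service id and endpoint id are hypothetical placeholders; the actual network call, which needs a running API node, is shown in comments:

```python
import json
import urllib.request

# Hypothetical URL: replace with your API node host, service id and endpoint.
url = "https://apinode.example.com:12000/public/api/v1/my_service/my_endpoint/predict"

# The request body is JSON, with Content-Type set to application/json.
body = json.dumps({"features": {"age": 42}}).encode("utf-8")
req = urllib.request.Request(url, data=body, method="POST")
req.add_header("Content-Type", "application/json")

# Sending the request (requires a running API node):
# try:
#     with urllib.request.urlopen(req) as resp:   # 2xx status -> success
#         answer = json.load(resp)
# except urllib.error.HTTPError as err:           # 4xx/5xx status -> failure
#     error_info = json.load(err)                 # the error body is still JSON
```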
Authentication¶
Each service declares whether or not it requires authentication. If it does, the valid API keys are defined in DSS.
The API key must be sent using HTTP Basic Authentication:
Use the API key as the username
Leave the password blank
The valid API keys are defined on the DSS side, not on the API node side. This ensures that all instances of an API node accept the same set of client keys.
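If you are not using the provided client, the Basic Authentication header can be built by hand, for example as follows (the key value is a hypothetical placeholder). The API key goes in the username position and the password position is left empty:

```python
import base64

api_key = "my_api_key"  # hypothetical; the real keys are defined in DSS

# HTTP Basic Authentication encodes "username:password" in base64.
# Here the username is the API key and the password is blank.
token = base64.b64encode(f"{api_key}:".encode("utf-8")).decode("ascii")
auth_header = {"Authorization": "Basic " + token}
```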
Methods reference¶
The reference documentation of the API is available at https://doc.dataiku.com/dss/api/13/apinode-user
API Python client¶
Dataiku provides a Python client for the API Node user API. The client makes it easy to write client programs for the API in Python.
Installing¶
The API client comes preinstalled in the DSS virtualenv.
From outside of DSS, you can install the Python client by running
pip install dataiku-api-client
Reference API doc¶
- class dataikuapi.APINodeClient(uri, service_id, api_key=None, bearer_token=None, insecure_tls=False)¶
Entry point for the DSS API node client. This client targets the user-facing API of the DSS API node server.
- predict_record(endpoint_id, features, forced_generation=None, dispatch_key=None, context=None, with_explanations=None, explanation_method=None, n_explanations=None, n_explanations_mc_steps=None)¶
Predicts a single record on a DSS API node endpoint (standard or custom prediction)
- Parameters:
endpoint_id (str) – Identifier of the endpoint to query
features – Python dictionary of features of the record
forced_generation – See documentation about multi-version prediction
dispatch_key – See documentation about multi-version prediction
context – Optional, Python dictionary of additional context information. The context information is logged, but not directly used.
with_explanations – Optional, whether individual explanations should be computed for each record. The prediction endpoint must be compatible. If None, will use the value configured in the endpoint.
explanation_method – Optional, method to compute explanations. Valid values are ‘SHAPLEY’ or ‘ICE’. If None, will use the value configured in the endpoint.
n_explanations – Optional, number of explanations to output per prediction. If None, will use the value configured in the endpoint.
n_explanations_mc_steps – Optional, precision parameter for SHAPLEY method, higher means more precise but slower (between 25 and 1000). If None, will use the value configured in the endpoint.
- Returns:
a Python dict of the API answer. The answer contains a “result” key (itself a dict)
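A minimal usage sketch: the URL, service id, endpoint id, API key and feature names below are all hypothetical placeholders, and the call itself is kept inside a function because it requires a running API node:

```python
# Hypothetical feature names; use the ones your model was trained on.
features = {"age": 42, "country": "FR"}

def predict_one():
    # Requires a running API node and: pip install dataiku-api-client
    from dataikuapi import APINodeClient
    client = APINodeClient("https://apinode.example.com:12000", "my_service",
                           api_key="my_api_key")
    answer = client.predict_record("my_endpoint", features)
    return answer["result"]  # the "result" dict of the API answer
```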
- predict_records(endpoint_id, records, forced_generation=None, dispatch_key=None, with_explanations=None, explanation_method=None, n_explanations=None, n_explanations_mc_steps=None)¶
Predicts a batch of records on a DSS API node endpoint (standard or custom prediction)
- Parameters:
endpoint_id (str) – Identifier of the endpoint to query
records – Python list of records. Each record must be a Python dict. Each record must contain a “features” dict (see predict_record) and optionally a “context” dict.
forced_generation – See documentation about multi-version prediction
dispatch_key – See documentation about multi-version prediction
with_explanations – Optional, whether individual explanations should be computed for each record. The prediction endpoint must be compatible. If None, will use the value configured in the endpoint.
explanation_method – Optional, method to compute explanations. Valid values are ‘SHAPLEY’ or ‘ICE’. If None, will use the value configured in the endpoint.
n_explanations – Optional, number of explanations to output per prediction. If None, will use the value configured in the endpoint.
n_explanations_mc_steps – Optional, precision parameter for SHAPLEY method, higher means more precise but slower (between 25 and 1000). If None, will use the value configured in the endpoint.
- Returns:
a Python dict of the API answer. The answer contains a “results” key (which is an array of result objects)
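For batch prediction, each record wraps its features in a `"features"` dict and may carry a `"context"` dict. A sketch with hypothetical service, endpoint, key and field names, the call deferred into a function since it needs a running API node:

```python
# Each record carries a "features" dict and, optionally, a "context" dict.
records = [
    {"features": {"age": 42, "country": "FR"}},
    {"features": {"age": 37, "country": "DE"}, "context": {"request_id": "abc-123"}},
]

def predict_batch():
    # Requires a running API node and: pip install dataiku-api-client
    from dataikuapi import APINodeClient
    client = APINodeClient("https://apinode.example.com:12000", "my_service",
                           api_key="my_api_key")
    answer = client.predict_records("my_endpoint", records)
    return answer["results"]  # one result object per input record
```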
- forecast(endpoint_id, records, forced_generation=None, dispatch_key=None)¶
Forecast using a time series forecasting model on a DSS API node endpoint
- Parameters:
endpoint_id (str) – Identifier of the endpoint to query
records (array) –
List of time series data records to be used as an input for the time series forecasting model. Each record should be a dict where keys are feature names and values are feature values.
Example:
records = [
    {'date': '2015-01-04T00:00:00.000Z', 'timeseries_id': 'A', 'target': 10.0},
    {'date': '2015-01-04T00:00:00.000Z', 'timeseries_id': 'B', 'target': 4.5},
    {'date': '2015-01-05T00:00:00.000Z', 'timeseries_id': 'A', 'target': 2.0},
    ...
    {'date': '2015-03-20T00:00:00.000Z', 'timeseries_id': 'B', 'target': 1.3}
]
forced_generation – See documentation about multi-version prediction
dispatch_key – See documentation about multi-version prediction
- Returns:
a Python dict of the API answer. The answer contains a “results” key (which is an array of result objects, corresponding to the forecast records). Example:
{'results': [
    {'forecast': 12.57, 'ignored': False,
     'quantiles': [0.0001, 0.5, 0.9999], 'quantilesValues': [3.0, 16.0, 16.0],
     'time': '2015-03-21T00:00:00.000000Z',
     'timeseriesIdentifier': {'timeseries_id': 'A'}},
    {'forecast': 15.57, 'ignored': False,
     'quantiles': [0.0001, 0.5, 0.9999], 'quantilesValues': [3.0, 18.0, 19.0],
     'time': '2015-03-21T00:00:00.000000Z',
     'timeseriesIdentifier': {'timeseries_id': 'B'}},
    ...],
 ...}
- predict_effect(endpoint_id, features, forced_generation=None, dispatch_key=None)¶
Predicts the treatment effect of a single record on a DSS API node endpoint (standard causal prediction)
- Parameters:
endpoint_id (str) – Identifier of the endpoint to query
features – Python dictionary of features of the record
forced_generation – See documentation about multi-version prediction
dispatch_key – See documentation about multi-version prediction
- Returns:
a Python dict of the API answer. The answer contains a “result” key (itself a dict)
- predict_effects(endpoint_id, records, forced_generation=None, dispatch_key=None)¶
Predicts the treatment effects on a batch of records on a DSS API node endpoint (standard causal prediction)
- Parameters:
endpoint_id (str) – Identifier of the endpoint to query
records – Python list of records. Each record must be a Python dict. Each record must contain a “features” dict (see predict_record) and optionally a “context” dict.
forced_generation – See documentation about multi-version prediction
dispatch_key – See documentation about multi-version prediction
- Returns:
a Python dict of the API answer. The answer contains a “results” key (which is an array of result objects)
- sql_query(endpoint_id, parameters)¶
Queries a “SQL query” endpoint on a DSS API node
- Parameters:
endpoint_id (str) – Identifier of the endpoint to query
parameters – Python dictionary of the named parameters for the SQL query endpoint
- Returns:
a Python dict of the API answer. The answer is a dict with a “columns” field and a “rows” field (a list of rows, each a list of strings)
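A usage sketch for a SQL query endpoint: the service id, endpoint id, key and parameter names are hypothetical, and the call is kept inside a function because it needs a running API node:

```python
# Named parameters for the SQL query endpoint (names are hypothetical).
parameters = {"customer_id": "1234"}

def run_query():
    # Requires a running API node and: pip install dataiku-api-client
    from dataikuapi import APINodeClient
    client = APINodeClient("https://apinode.example.com:12000", "my_service",
                           api_key="my_api_key")
    answer = client.sql_query("my_sql_endpoint", parameters)
    # answer["columns"] describes the columns; answer["rows"] is a list of
    # rows, each a list of strings.
    return answer["rows"]
```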
- lookup_record(endpoint_id, record, context=None)¶
Looks up a single record on a DSS API node endpoint of “dataset lookup” type
- Parameters:
endpoint_id (str) – Identifier of the endpoint to query
record – Python dictionary of features of the record
context – Optional, Python dictionary of additional context information. The context information is logged, but not directly used.
- Returns:
a Python dict of the API answer. The answer contains a “data” key (itself a dict)
- lookup_records(endpoint_id, records)¶
Looks up a batch of records on a DSS API node endpoint of “dataset lookup” type
- Parameters:
endpoint_id (str) – Identifier of the endpoint to query
records – Python list of records. Each record must be a Python dict, containing at least one entry called “data”: a dict containing the input columns
- Returns:
a Python dict of the API answer. The answer contains a “results” key, which is an array of result objects. Each result contains a “data” dict which is the output
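A sketch of a batch lookup: each input record wraps its lookup-key columns in a `"data"` dict. The service id, endpoint id, key and column names are hypothetical, and the call is deferred into a function since it needs a running API node:

```python
# Each record has a "data" dict containing the input (lookup-key) columns.
records = [
    {"data": {"customer_id": "1234"}},
    {"data": {"customer_id": "5678"}},
]

def lookup_batch():
    # Requires a running API node and: pip install dataiku-api-client
    from dataikuapi import APINodeClient
    client = APINodeClient("https://apinode.example.com:12000", "my_service",
                           api_key="my_api_key")
    answer = client.lookup_records("my_lookup_endpoint", records)
    # Each result carries a "data" dict with the output columns.
    return [r["data"] for r in answer["results"]]
```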
- run_function(endpoint_id, **kwargs)¶
Calls a “Run function” endpoint on a DSS API node
- Parameters:
endpoint_id (str) – Identifier of the endpoint to query
kwargs – Arguments of the function
- Returns:
The function result
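A sketch of calling a “Run function” endpoint: keyword arguments are passed through to the function. The service id, endpoint id, key and argument names are hypothetical, and the call is kept inside a function because it needs a running API node:

```python
def call_function():
    # Requires a running API node and: pip install dataiku-api-client
    from dataikuapi import APINodeClient
    client = APINodeClient("https://apinode.example.com:12000", "my_service",
                           api_key="my_api_key")
    # Keyword arguments become the arguments of the endpoint's function.
    return client.run_function("my_function_endpoint", x=1, y=2)
```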