Without API Deployer¶
Note
This method is not available on Dataiku Cloud.
You can configure the parallelism parameters for the endpoint by creating a JSON file in the
config/services folder in the API node’s data directory.
mkdir -p config/services/<SERVICE_ID>
Then create or edit the config/services/<SERVICE_ID>/<ENDPOINT_ID>.json file
This file must have the following structure and be valid JSON:
{
"pool" : {
"floor" : 1,
"ceil" : 16,
"cruise": 8,
"queue" : 16,
"timeout" : 10000
}
}
Those parameters are all positive integers:
floor(default: 1): Minimum number of pipelines. Those are allocated as soon as the endpoint is loaded.ceil(default: 16): Maximum number of allocated pipelines at any given time. Additional requests will be queued.ceil ≥ floorcruise(default: 8): The “nominal” number of allocated pipelines. When more requests come in, more pipelines may be allocated up toceil. But when all pending requests have been completed, the number of pipeline may go down tocruise.floor ≤ cruise ≤ ceilqueue(default: 16): The number of requests that will be queued whenceilpipelines are already allocated and busy. The queue is fair: first received request will be handled first.timeout(default: 10000): Time, in milliseconds, that a request may spend in queue wating for a free pipeline before being rejected.
Creating a new pipeline is an expensive operation, so you should aim cruise around the expected maximal nominal query load.
With API Deployer¶
You can configure the parallelism parameters for the endpoint in the Deployment settings, in the “Endpoints tuning” setting.
Go to the Deployment Settings > Endpoints tuning
Add a tuning block for your endpoint by entering your endpoint id and click Add
Configure the parameters
Those parameters are all positive integers:
Pooling min pipelines(default: 1): Minimum number of pipelines. Those are allocated as soon as the endpoint is loaded.Pooling max pipelines(default: 8): Maximum number of allocated pipelines at any given time. Additional requests will be queued.max pipelines ≥ min pipelinesPooling cruise pipelines(default: 2): The “nominal” number of allocated pipelines. When more requests come in, more pipelines may be allocated up tomax pipelines. But when all pending requests have been completed, the number of pipeline may go down tocruise pipelines.min pipelines ≤ cruise pipelines ≤ ceil pipelinesPooling queue length(default: 16): The number of requests that will be queued whenmax pipelinespipelines are already allocated and busy. The queue is fair: first received request will be handled first.Queue timeout(default: 10000): Time, in milliseconds, that a request may spend in queue waiting for a free pipeline before being rejected.
Creating a new pipeline is an expensive operation, so you should aim cruise pipelines around the expected maximal nominal query load.