Exposing a prediction model

The primary function of the DSS API node is to expose as a service a prediction model trained using the DSS machine learning component.

The steps to expose a prediction model are:

  • Train the model in Analysis
  • Deploy the model to Flow
  • Create a new API service
  • Create a prediction endpoint using the saved model
  • Create a package of your service
  • Deploy and activate the package on the API node

This section assumes that you have already installed and started a DSS API node instance. Please see Installing the API node if that’s not yet the case.

Creating the model

The first step is to create a model and deploy it to the Flow. This is done using the regular Machine Learning component of DSS. Please refer to the Tutorial 103 of DSS and to Machine learning for more information.

Create the service and endpoint in DSS

An API service is created within a project in DSS. From the project Homepage, click on “API SERVICES” in the navigation bar.

  • Click on New API Service
  • You need to give an identifier to your service. This “service id” will be referenced in all interaction with the API node, and will be part of the URL to which your clients connect.
  • Click on your newly-created service

Our service is created but does not yet have any endpoint, i.e. it does not yet expose any model. See Concepts for what endpoints are.

  • Click on the “Create new endpoint” button

We are going to create a “Prediction model” endpoint. You need to select the model that this endpoint will use. This is a saved model (i.e. a model which has been deployed to the Flow). You also need to give an identifier to your endpoint. The endpoint id will be part of the URL to which your clients connect.

If your model is Java-compatible (see The machine learning engines), you may select “Java scoring”. This makes the deployed model use Java to score new records, greatly improving the performance and throughput of your endpoint.

For a simple service, that’s it! You don’t need any further configuration.

Create and transfer the package

Now that your service is properly configured in DSS, the next step is to create a package (See Concepts).

  • Click on the “Prepare package” button
  • DSS asks you for a package version number. This version number will be the identifier of this generation in all interaction with the API node. It is recommended that you use a meaningful short name like v4-new-customer-features. You want to be able to remember what was new in that generation (think of it as a Git tag).
  • Go to the packages tab.
  • Click on the Download button

The package file (a .zip file) is downloaded to your computer. Upload the zip file to all hosts running the API node software.
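
How you transfer the package is up to you. For example, assuming an API node host named apinode1.example.com and a package file named myservice-v4.zip (both hypothetical), you could copy it with scp:

# Copy the downloaded package to each API node host (hostname and paths are examples)
scp myservice-v4.zip dataiku@apinode1.example.com:/home/dataiku/packages/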

Create the service in API node

We are now going to activate the package in the API node.

  • Go to the API node directory
  • Create the service: run the following command
./bin/apinode-admin service-create <SERVICE_ID>
  • Then, we need to import the package zip file:
./bin/apinode-admin service-import-generation <SERVICE_ID> <PATH TO ZIP FILE>

Now, the API node has unzipped the package into its own folders and is ready to start using it. At this point, however, the new generation is only available, not yet active. In other words, if we were to perform an API query, it would fail because no generation is currently active. Switch the service to its newest generation to activate it:

./bin/apinode-admin service-switch-to-newest <SERVICE_ID>

When this command returns, the API node service is now active, running on the latest (currently the only) generation of the package.
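
For example, assuming a service id of churn and a package uploaded to /home/dataiku/packages/myservice-v4.zip (both hypothetical), the whole sequence on an API node host would look like:

# Run from the API node directory (paths and identifiers are examples)
cd /home/dataiku/apinode
./bin/apinode-admin service-create churn
./bin/apinode-admin service-import-generation churn /home/dataiku/packages/myservice-v4.zip
./bin/apinode-admin service-switch-to-newest churn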

Perform a test query

We can now actually perform a prediction. Query the following URL (using your browser for example):

http://APINODE_SERVER:APINODE_PORT/public/api/v1/SERVICE_ID/ENDPOINT_ID/predict-simple?feat1=val1&feat2=val2

where feat1 and feat2 are the names of features (a.k.a. columns) in your training set.

You should receive a JSON reply with a result section containing your prediction (and probabilities in the case of a classification model).
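
For example, with curl, assuming two features named age and nb_purchases (hypothetical names; use the columns of your own training set):

# Simple GET prediction; replace identifiers and feature names with your own
curl "http://APINODE_SERVER:APINODE_PORT/public/api/v1/SERVICE_ID/ENDPOINT_ID/predict-simple?age=42&nb_purchases=7"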

Perform real queries

Once you have confirmed that your service endpoint works, you can use the API to integrate predictions into your application.

See API node user API
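
As a rough sketch of what an application-side call could look like with curl (the exact request and response formats, including whether features are posted as a JSON object to a predict URL as shown here, are specified in API node user API and should be checked there):

# POST prediction request; the URL and body shape below are assumptions, see API node user API
curl -X POST "http://APINODE_SERVER:APINODE_PORT/public/api/v1/SERVICE_ID/ENDPOINT_ID/predict" \
  -H "Content-Type: application/json" \
  --data '{"features": {"age": 42, "nb_purchases": 7}}'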

Configure endpoint parallelism

You may want to configure how many concurrent requests your API node can handle. This depends mainly on your model (its speed and in-memory size) and the available resources on the server running the API node. You can configure the parallelism parameters for each endpoint by creating a JSON file in the config/services folder in the API node’s data directory.

mkdir -p config/services/<SERVICE_ID>
echo '{"floor":1, "ceil":2}' >config/services/<SERVICE_ID>/<ENDPOINT_ID>.json

This configuration allows you to control the number of allocated pipelines. One allocated pipeline means one model loaded in memory that can handle a prediction request. If you have 2 allocated pipelines, 2 requests can be handled simultaneously; other requests will be queued until one of the pipelines is freed (or the request times out). When the queue is full, additional requests are rejected.

Those parameters are all positive integers:

  • floor (default: 1): Minimum number of pipelines. Those are allocated as soon as the endpoint is loaded.
  • ceil (default: 8): Maximum number of allocated pipelines at any given time. Additional requests will be queued (ceil >= floor).
  • cruise (default: 2): The “nominal” number of allocated pipelines. When more requests come in, more pipelines may be allocated, up to ceil. But when all pending requests have been completed, the number of pipelines may go down to cruise (floor <= cruise <= ceil).
  • queue (default: 16): The number of requests that will be queued when ceil pipelines are already allocated and busy. The queue is fair: first received request will be handled first.
  • timeout (default: 10000): Time, in milliseconds, that a request may spend in the queue waiting for a free pipeline before being rejected.
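
For instance, to set all of these parameters at once for an endpoint (using the same placeholders as above), you could write the JSON file with a heredoc instead of echo:

# Example parallelism configuration; values are illustrative and must satisfy floor <= cruise <= ceil
cat > config/services/<SERVICE_ID>/<ENDPOINT_ID>.json <<'EOF'
{
  "floor": 2,
  "cruise": 4,
  "ceil": 8,
  "queue": 32,
  "timeout": 5000
}
EOF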

You can also deploy your service on multiple servers, see High availability and scalability.