Exposing a SQL query

You can expose a parametrized SQL query as a DSS API node endpoint. Calling the endpoint with a set of parameters will execute the SQL query with these parameters.

The DSS API node automatically handles pooling connections to the database, high availability and scalability for execution of your query.

Note

Only “regular” SQL connections are possible. Hive and Impala are not supported.

Note

You’ll need to install the JDBC driver on the API node servers. The installation procedure is the same as for regular DSS design and automation nodes.

Creating the SQL query endpoint

To create a custom prediction endpoint, start by creating a service. (See Your first API service for more information). Then, create an endpoint of type “SQL query”.

In “Settings”, select the connection you want to target

In “Query”:

  • Write your query, using ‘?’ as the placeholder for parameters
  • Add one parameter name for each ? that you inserted in the query.

For example:

select * from mytable where email = ?;

Parameter names:
        * target_email

When you submit an API query with the “target_email” parameter set to “test@test.com”, it will run the query select * from mytable where email = 'test@test.com' and return the results.

You must not surround the ? markers by quotes, the database engine will handle that itself.

Non-SELECT statements

It is possible to use non-SELECT statements (for example INSERT or DELETE). In that case, the result will not include columns and rows, but instead a updatedRows indicating how many rows were impacted by the query.

Transaction commit

By default, DSS does not COMMIT the connection in a SQL query endpoint. You can activate the “post-commit” option to have DSS commit the connection after the execution of the query. This is required when using INSERT or DELETE statements.

Using multiple queries

The code that you enter in the query field can include multiple SQL statements. For example, you could start by creating a temporary table, selecting from it and removing it. DSS automatically splits the query into statements.

However, the following rules apply:

  • Only the last SELECT statement will receive the parameters. It is not possible to “spread” the parameters over multiple statements.
  • If there is no SELECT statement, the last statement will receive the parameters.

Some complex use cases cannot fit these requirements, for example:

  • Creating a temporary table using the parameters
  • Selecting from it, possibly using more parameters

For this, you can use multiple queries. Click on the “Add a query” button, and enter the second code and parameter names. Each query can have different parameters, but they can also use the same parameter: if you use the same parameter name in two different queries, both will use the same parameter from the REST API query.

All queries are executed within the same connection so temporary tables will persist. Commit happens after each query if you enable the “post-commit” option.

Testing your queries

To ease the process of testing your enpdoints, a “Development server” is integrated in the DSS UI.

To test your code, click on the “Deploy to Dev Server” button. The dev server starts and load your model. You are redirected to the Test tab where you can see whether your model loads.

You can then define Test queries, i.e. JSON objects akin to the ones that you would pass to the API node user API. When you click on the “Play test queries” button, the test queries are sent to the dev server, and the result is printed.

Defining the connection on the API node

Before you can activate your service on a API node, the API node must have the definition of the connection (with the same name as it was on the design node)

Add the connection in a top-level remappedConnections in the API node’s DATA_DIR/config/server.json:

{
        "remappedConnections": {
                "MY-CONNECTION": {
                        "type": "PostgreSQL",
                        "params": {
                                "host": "my-db-host",
                                "db": "my-db",
                                "user": "my-user",
                                "password": "my-password"
                        }
                },
                "MY-OTHER-CONNECTION": { "..." }
        },
        "..."
}

The connection parameters to use can be found in the config/connections.json file in the DSS design or automation node. The password is encrypted on the design/automation node, you’ll need to retype it in the API node DATA_DIR/config/server.json file.

Server-side tuning

It is possible to tune the behavior of SQL query endpoints on the API node side. This is used to tune parameters of the connection pool to the database.

You can configure this by creating a JSON file in the config/services folder in the API node’s data directory.

mkdir -p config/services/<SERVICE_ID>

Then create or edit the config/services/<SERVICE_ID>/<ENDPOINT_ID>.json file

This file must have the following structure and be valid JSON (the values shown here are the defaults):

{
    "sql" : {
        "connectionsEvictionTimeMS" : 60000,
        "evictionIntervalMS" : 30000,
        "maxPooledConnections": 10
    }
}

Those parameters are all positive integers:

  • connectionsEvictionTimeMS (default: 60000): Connections that have not been used for that amount of time are closed
  • evictionIntervalMS (default: 30000): How often to check for connections that can be closed
  • maxPooledConnections (default: 10): Maximum number of connections that can be opened to the database

You can also deploy your service on multiple servers, see High availability and scalability.