Geographic processing

DSS provides a variety of processors to work with geographic information.

DSS also provides a set of formula functions for geographic operations (see Formula language).

Geopoint converters

Most geographic processors in DSS are based on the Geopoint meaning.

DSS provides two processors to convert between a Geopoint column and latitude/longitude columns.
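As a rough illustration of what such a conversion involves, the sketch below converts between a latitude/longitude pair and a point string. It assumes the Geopoint is written as a WKT point, `POINT(longitude latitude)`; the function names are hypothetical and are not part of DSS.

```python
import re
from typing import Tuple

# Assumption: a Geopoint is a WKT point, "POINT(longitude latitude)".
_NUMBER = r"[-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?"
_POINT_RE = re.compile(
    r"^\s*POINT\s*\(\s*(" + _NUMBER + r")\s+(" + _NUMBER + r")\s*\)\s*$",
    re.IGNORECASE,
)


def latlon_to_geopoint(lat: float, lon: float) -> str:
    """Build a WKT point. WKT puts longitude (x) before latitude (y)."""
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError(f"coordinates out of range: lat={lat}, lon={lon}")
    return f"POINT({lon} {lat})"


def geopoint_to_latlon(geopoint: str) -> Tuple[float, float]:
    """Parse a WKT point and return (latitude, longitude)."""
    match = _POINT_RE.match(geopoint)
    if match is None:
        raise ValueError(f"not a WKT point: {geopoint!r}")
    lon, lat = float(match.group(1)), float(match.group(2))
    return lat, lon
```

The argument order is the usual source of mistakes here: people tend to say "latitude, longitude", while WKT stores longitude first.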

Resolve GeoIP

The Resolve GeoIP processor uses the GeoIP database (https://www.maxmind.com) to resolve an IP address to the associated geographic coordinates.

../_images/geoip-processor.png

It produces two kinds of information:

  • Administrative data (country, region, city, …)

  • Geographic data (latitude, longitude)

The output GeoPoint can be used for Map Charts.

Reverse geocoding

Note

To use this processor, you must first install the DSS plugin called “Reverse Geocoding”. Please see Installing plugins.

The reverse geocoding processor takes geographic coordinates as input and extracts the administrative boundaries, at each level (country, region, city, …), that contain them. The administrative boundaries are those defined in OpenStreetMap. The type of administrative boundary at each level depends on the country. For more information, please refer to the OpenStreetMap documentation.

  • You need a column containing a Geo Point or a Geometry as input.

  • The processor outputs two columns per administrative level for which you provide a column name: one with the local name of the administrative entity and one with the English name.

  • Selecting “Output the smallest selected administrative area” will output the shape of the administrative entity. This polygon is encoded in WKT format and appears as a third column for the administrative level for which you provided a column name. If several levels are selected, only the smallest one is output (for instance, if both city and country are selected, the shape of the city is returned).

Zipcode geocoding

Note

To use this processor, you must first install the DSS plugin called “Zipcode geocoding”. Please see Installing plugins.

This processor resolves “Country + zipcode” to geographic coordinates, at city-level resolution.

  • You need a column containing the country (name or ISO code) and a column containing the zipcode.

  • The processor outputs a Geo Point column.

Geo-join

The Geo-join processor performs a geographic nearest-neighbor join between two datasets with geo coordinates.
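To make the idea of a nearest-neighbor join concrete, here is a minimal brute-force sketch. It is not the DSS implementation: the haversine distance on a spherical Earth, the row layout (dictionaries with `lat`, `lon` and `name` keys) and the function names are all assumptions chosen for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def nearest_neighbor_join(left, right):
    """Attach to each left row its closest right row (brute force, O(n*m)).

    Assumes `right` is not empty.
    """
    joined = []
    for row in left:
        def distance_to(candidate):
            return haversine_km(row["lat"], row["lon"], candidate["lat"], candidate["lon"])

        best = min(right, key=distance_to)
        joined.append({**row, "nearest": best["name"], "distance_km": distance_to(best)})
    return joined


# Hypothetical sample data
stores = [
    {"name": "Paris", "lat": 48.8566, "lon": 2.3522},
    {"name": "Lyon", "lat": 45.7640, "lon": 4.8357},
]
customers = [
    {"id": 1, "lat": 48.80, "lon": 2.30},
    {"id": 2, "lat": 45.90, "lon": 4.70},
]
result = nearest_neighbor_join(customers, stores)
```

A brute-force scan is fine for small datasets; for large ones a spatial index would normally replace the inner `min` loop.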

Geocoding

This processor performs the geocoding of an address using either the MapQuest or Bing API.

For details on which kinds of addresses are handled, please see the documentation of the associated API.

Warning

This processor requires an API key for the external services. You must comply with the terms of use of these services.

API calls can be expensive, depending on your API key usage terms.

Warning

This processor (with both APIs) is extremely slow. Once you are happy with the result, you should export to another dataset so that the processor does not make API calls every time you restart the studio.

Change coordinates system

This processor changes the Coordinate Reference System (CRS) of a geometry or geopoint column.

Source and target CRS can be given either as an EPSG code (e.g., “EPSG:4326”) or as a projected coordinate system WKT (e.g., “PROJCS[…]”).

Warning

Dataiku uses the WGS84 (EPSG:4326) coordinate system when processing geometries. Before manipulating any geospatial data in Dataiku, make sure it is expressed in the WGS84 (EPSG:4326) coordinate system.

Use this processor to convert data projected in a different CRS to the WGS84 (EPSG:4326) coordinate system.

../_images/change-crs-processor.png
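As an illustration of what a CRS change does to coordinates, the sketch below converts between Web Mercator (EPSG:3857) and WGS84 (EPSG:4326) using the standard spherical Mercator formulas. The choice of EPSG:3857 as the source CRS is only an example, and the function names are hypothetical; the processor itself handles arbitrary CRS definitions, which this sketch does not.

```python
import math

# Sphere radius used by Web Mercator (equal to the WGS84 semi-major axis), in meters.
SPHERE_RADIUS_M = 6378137.0


def web_mercator_to_wgs84(x, y):
    """Convert EPSG:3857 meters to EPSG:4326 (longitude, latitude) in degrees."""
    lon = math.degrees(x / SPHERE_RADIUS_M)
    lat = math.degrees(2.0 * math.atan(math.exp(y / SPHERE_RADIUS_M)) - math.pi / 2.0)
    return lon, lat


def wgs84_to_web_mercator(lon, lat):
    """Convert EPSG:4326 degrees to EPSG:3857 (x, y) in meters.

    Valid for latitudes strictly between -90 and 90 degrees.
    """
    x = SPHERE_RADIUS_M * math.radians(lon)
    y = SPHERE_RADIUS_M * math.log(math.tan(math.pi / 4.0 + math.radians(lat) / 2.0))
    return x, y
```

For example, `wgs84_to_web_mercator(2.3522, 48.8566)` yields coordinates in meters, and passing them back to `web_mercator_to_wgs84` recovers the original longitude and latitude up to floating-point error.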

Create area around a geopoint

The Create area around a geopoint processor creates polygons centered on input geopoints. For each input point, it creates a polygon that delimits the point's area of influence (all the locations that fall within a given distance of the geopoint). The polygon can be either rectangular or circular (using an approximation), and its size depends on the selected parameters.
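The circular case can be sketched as follows: a circle is approximated by a regular polygon whose vertices lie at a fixed great-circle distance from the center. This is an assumption about how such an approximation can be built, not a description of the processor's internals; the spherical Earth model, the `segments` parameter and the function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius; spherical approximation


def circle_around(lat, lon, radius_m, segments=32):
    """Return a closed ring of (lon, lat) vertices approximating a circle."""
    if segments < 3:
        raise ValueError("a polygon needs at least 3 segments")
    phi1, lam1 = math.radians(lat), math.radians(lon)
    delta = radius_m / EARTH_RADIUS_M  # angular distance in radians
    ring = []
    for i in range(segments):
        theta = 2.0 * math.pi * i / segments  # bearing, clockwise from north
        phi2 = math.asin(
            math.sin(phi1) * math.cos(delta)
            + math.cos(phi1) * math.sin(delta) * math.cos(theta)
        )
        lam2 = lam1 + math.atan2(
            math.sin(theta) * math.sin(delta) * math.cos(phi1),
            math.cos(delta) - math.sin(phi1) * math.sin(phi2),
        )
        ring.append((math.degrees(lam2), math.degrees(phi2)))
    ring.append(ring[0])  # close the ring, as WKT polygons require
    return ring


def ring_to_wkt(ring):
    """Encode a closed ring as a WKT polygon (longitude first)."""
    coords = ", ".join(f"{lon:.6f} {lat:.6f}" for lon, lat in ring)
    return f"POLYGON(({coords}))"
```

More segments give a closer approximation of the circle at the cost of a larger geometry. The sketch does not wrap longitudes that cross the antimeridian.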

Extract from geo column

The Extract from geo column processor extracts data from a geometry column:

  • centroid point,

  • length (if input is not a point),

  • area (if input is a polygon).
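The three quantities listed above can be illustrated for a simple polygon with the shoelace formula. This is a minimal planar sketch: it assumes coordinates in a projected, Cartesian system with consistent units, whereas the processor works on geographic geometries, so its results will not match these formulas directly. The function names are hypothetical.

```python
import math


def _cross_terms(ring):
    """Yield (x1, y1, x2, y2, cross product) for each edge of a closed ring."""
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        yield x1, y1, x2, y2, x1 * y2 - x2 * y1


def ring_area(ring):
    """Unsigned area of a simple polygon given as a closed ring."""
    return abs(0.5 * sum(cross for *_, cross in _cross_terms(ring)))


def ring_length(ring):
    """Perimeter: sum of the edge lengths of the closed ring."""
    return sum(math.hypot(x2 - x1, y2 - y1) for x1, y1, x2, y2, _ in _cross_terms(ring))


def ring_centroid(ring):
    """Area-weighted centroid of a simple, non-degenerate polygon."""
    signed_area = 0.5 * sum(cross for *_, cross in _cross_terms(ring))
    if signed_area == 0:
        raise ValueError("degenerate polygon has no area-weighted centroid")
    cx = sum((x1 + x2) * cross for x1, _, x2, _, cross in _cross_terms(ring))
    cy = sum((y1 + y2) * cross for _, y1, _, y2, cross in _cross_terms(ring))
    return cx / (6.0 * signed_area), cy / (6.0 * signed_area)


# A 4 x 3 rectangle; the first vertex is repeated to close the ring.
rectangle = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (0.0, 0.0)]
```

For the sample rectangle, the area is 12, the length is 14 and the centroid is (2, 1.5).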