Geographic data preparation

The Prepare recipe provides a variety of processors for working with geographic information. DSS also provides a set of formulas to compute geographic operations (see Formula language).

Geopoint converters

DSS provides two processors to convert between a Geopoint column and latitude/longitude columns, one for each direction.
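As a rough illustration of what such a conversion involves, here is a minimal Python sketch. It assumes the geopoint is stored as a WKT string of the form "POINT(longitude latitude)"; the function names make_geopoint and extract_lat_lon are hypothetical and are not part of DSS.

```python
import re
from typing import Tuple

# Assumption: a geopoint is a WKT string, "POINT(longitude latitude)".
_NUM = r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?"
_POINT_RE = re.compile(
    r"^\s*POINT\s*\(\s*(" + _NUM + r")\s+(" + _NUM + r")\s*\)\s*$",
    re.IGNORECASE,
)


def make_geopoint(lat: float, lon: float) -> str:
    """Build a WKT point. WKT puts longitude (x) before latitude (y)."""
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError(f"coordinates out of range: lat={lat}, lon={lon}")
    return f"POINT({lon} {lat})"


def extract_lat_lon(geopoint: str) -> Tuple[float, float]:
    """Parse a WKT point and return (latitude, longitude)."""
    match = _POINT_RE.match(geopoint)
    if match is None:
        raise ValueError(f"not a WKT point: {geopoint!r}")
    lon, lat = float(match.group(1)), float(match.group(2))
    return lat, lon
```

The detail most likely to cause errors is the axis order: latitude/longitude columns are usually read as (lat, lon), while WKT writes the point as (lon lat).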

Resolve GeoIP

The Resolve GeoIP processor uses the GeoIP database (https://www.maxmind.com) to resolve an IP address to the associated geographic coordinates.


It produces two kinds of information:

  • Administrative data (country, region, city, …)

  • Geographic data (latitude, longitude)

The output GeoPoint can be used for Map Charts.

Change coordinates system

This processor changes the Coordinate Reference System (CRS) of a geometry or geopoint column.

The source and target CRS can each be given either as an EPSG code (e.g., “EPSG:4326”) or as a projected coordinate system WKT (e.g., “PROJCS[…]”).

Warning

Dataiku uses the WGS84 (EPSG:4326) coordinate system when processing geometries. Before manipulating any geospatial data in Dataiku, make sure the data is expressed in the WGS84 (EPSG:4326) coordinate system.

Use this processor to convert data projected in a different CRS to the WGS84 (EPSG:4326) coordinate system.
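To give a sense of what a CRS change does numerically, the sketch below converts between Web Mercator (EPSG:3857) and WGS84 (EPSG:4326) using the standard spherical Mercator formulas. EPSG:3857 is chosen only as a common example of a source CRS; this is not how the DSS processor is implemented, and the function names are hypothetical.

```python
import math

# Sphere radius used by EPSG:3857 (equal to the WGS84 semi-major axis), in meters.
EARTH_RADIUS_M = 6378137.0


def web_mercator_to_wgs84(x: float, y: float):
    """Convert EPSG:3857 meters (x, y) to EPSG:4326 degrees (lat, lon)."""
    lon = math.degrees(x / EARTH_RADIUS_M)
    lat = math.degrees(2.0 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2.0)
    return lat, lon


def wgs84_to_web_mercator(lat: float, lon: float):
    """Convert EPSG:4326 degrees (lat, lon) to EPSG:3857 meters (x, y)."""
    x = EARTH_RADIUS_M * math.radians(lon)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4.0 + math.radians(lat) / 2.0))
    return x, y
```

Coordinates in a projected CRS such as EPSG:3857 are in meters and can reach values in the millions, so values far outside the range of -180 to 180 are a quick hint that a column still needs to be converted.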


Compute distances between points

The geopoint-distance processor (/preparation/processors/geopoint-distance) computes the distance between points.
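For readers who want to check a distance by hand, a common approach is the haversine (great-circle) formula on a spherical Earth. The sketch below is an illustration under that assumption; this page does not state which formula or Earth model the DSS processor uses, and haversine_km is a hypothetical name.

```python
import math

# Assumption: spherical Earth with the mean radius, in kilometers.
MEAN_EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometers between two WGS84 points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (
        math.sin(d_phi / 2.0) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2.0) ** 2
    )
    # Clamp to 1.0 to guard against rounding error for near-antipodal points.
    return 2.0 * MEAN_EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))
```

For example, haversine_km(48.8566, 2.3522, 51.5074, -0.1278) gives the approximate Paris to London distance. A spherical model can differ from an ellipsoidal one by a few tenths of a percent.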

Create area around a geopoint

The Create area around a geopoint processor creates polygons centered on input geopoints. For each input geopoint, it creates a polygon that delimits the point's area of influence, that is, all locations within a given distance of the geopoint. The polygon can be either rectangular or circular (the circle is approximated), and its size depends on the selected parameters.
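The circular case can be pictured as a regular polygon whose vertices all lie at the chosen distance from the center. The sketch below builds such a ring with the spherical destination-point formula and serializes it as WKT. The vertex count, the spherical Earth model, and the names circle_ring and ring_to_wkt are assumptions made for illustration, not the processor's actual behavior.

```python
import math

# Assumption: spherical Earth with the mean radius, in kilometers.
MEAN_EARTH_RADIUS_KM = 6371.0088


def circle_ring(lat: float, lon: float, radius_km: float, n_vertices: int = 32):
    """Return a closed ring of (lon, lat) vertices approximating a circle."""
    if n_vertices < 3:
        raise ValueError("a polygon needs at least 3 vertices")
    phi1, lambda1 = math.radians(lat), math.radians(lon)
    delta = radius_km / MEAN_EARTH_RADIUS_KM  # angular distance in radians
    ring = []
    for i in range(n_vertices):
        theta = 2.0 * math.pi * i / n_vertices  # bearing, clockwise from north
        phi2 = math.asin(
            math.sin(phi1) * math.cos(delta)
            + math.cos(phi1) * math.sin(delta) * math.cos(theta)
        )
        lambda2 = lambda1 + math.atan2(
            math.sin(theta) * math.sin(delta) * math.cos(phi1),
            math.cos(delta) - math.sin(phi1) * math.sin(phi2),
        )
        # Normalize the longitude to the range [-180, 180).
        lon2 = (math.degrees(lambda2) + 540.0) % 360.0 - 180.0
        ring.append((lon2, math.degrees(phi2)))
    ring.append(ring[0])  # WKT rings must repeat the first vertex at the end
    return ring


def ring_to_wkt(ring) -> str:
    """Serialize a ring of (lon, lat) vertices as a WKT polygon."""
    coords = ", ".join(f"{lon:.6f} {lat:.6f}" for lon, lat in ring)
    return f"POLYGON(({coords}))"
```

More vertices give a closer approximation of the circle at the cost of larger geometries.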

Extract from geo column

The Extract from geo column processor extracts data from a geometry column:

  • centroid point,

  • length (if input is not a point),

  • area (if input is a polygon).
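The three extracted quantities can be illustrated with planar geometry. The sketch below assumes coordinates in a planar CRS, so that lengths and areas come out in that CRS's units. This page does not describe the units or method used by the processor, and the function names are hypothetical.

```python
import math


def path_length(points) -> float:
    """Sum of segment lengths along a sequence of (x, y) points."""
    return sum(
        math.hypot(x1 - x0, y1 - y0)
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


def polygon_area_and_centroid(ring):
    """Shoelace area and centroid of a closed ring (first vertex repeated last)."""
    twice_area = 0.0
    cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:]):
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if twice_area == 0.0:
        raise ValueError("degenerate polygon with zero area")
    signed_area = twice_area / 2.0
    centroid = (cx / (6.0 * signed_area), cy / (6.0 * signed_area))
    # The sign of the shoelace sum only encodes the ring orientation.
    return abs(signed_area), centroid
```

Applied directly to longitude/latitude degrees, these formulas return values in degrees and squared degrees, which is why a metric projection is assumed here.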