Geocoding and reverse geocoding

  • Geocoding, sometimes called forward geocoding, is the process of transforming an address into geographic coordinates.

  • Reverse geocoding is the process of transforming geographic coordinates into administrative information, such as country, region, or city.

  • Zipcode geocoding is the process of transforming a country and zipcode into geographic coordinates.

Geocoding, reverse geocoding, and zipcode geocoding are best-effort activities. They cannot always be performed, and results may be incomplete depending on the location. Dataiku is not able to provide any guarantee as to the completeness or correctness of any geocoding-related data.

Geocoding

This capability is provided by the Geocoder plugin, which you need to install. Please see Installing plugins.

The Geocoder uses online geocoding service providers. Your DSS instance needs to have outgoing Internet access.

You will need an API key for most of these providers. Some providers offer free plans, which may come with rate limits or specific usage policies. Please make sure to review the usage policy of each provider before using it.

Providers do not all cover the world equally well, so choose a provider whose coverage matches the locations in your data.

Providers marked “yes” under “Batch available” usually perform significantly better.

Provider                   Optimal for    Usage Policy         Batch available
ArcGIS                     World
Baidu                      China          API key
Bing                       World          API key              yes
CanadaPost                 Canada         API key
FreeGeoIP                  World
Gaode                      China          API key
Geocoder.ca (Geolytica)    CA & US        Rate Limit
GeocodeFarm                World          Policy
GeoNames                   World          Username
GeoOttawa                  Ottawa
Gisgraphy                  World          API key
Google                     World          Rate Limit, Policy
HERE                       World          API key
IPInfo                     World          Rate Limit, Plans
Komoot (OSM powered)       World
LocationIQ                 World          API key
Mapbox                     World          API key
MapQuest                   World          API key              yes
MaxMind                    World
OpenCage                   World          API key
OpenStreetMap              World          Policy
Tamu                       US             API key
TGOS                       Taiwan
TomTom                     World          API key
USCensus                   US                                  yes
What3Words                 World          API key
Yahoo                      World
Yandex                     Russia

The plugin provides a recipe. You can chain this recipe several times with different providers, for example when earlier providers failed on some inputs. Each run only attempts to compute rows whose outputs are not already filled.
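The fallback behavior described above can be sketched in plain Python. This illustrates the logic only and is not the plugin's implementation: the provider functions `provider_a` and `provider_b`, the row layout, and the column names `address`, `latitude`, and `longitude` are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical type: a provider takes an address and returns (lat, lon), or None on failure.
Provider = Callable[[str], Optional[Tuple[float, float]]]


def geocode_pass(rows: List[Dict], provider: Provider) -> int:
    """Run one provider over the rows, skipping rows that are already filled.

    Returns the number of rows newly filled by this pass.
    """
    filled = 0
    for row in rows:
        # Only rows with missing outputs are recomputed.
        if row.get("latitude") is not None and row.get("longitude") is not None:
            continue
        result = provider(row["address"])
        if result is not None:
            row["latitude"], row["longitude"] = result
            filled += 1
    return filled


# Stand-in providers with made-up coverage, for illustration only.
def provider_a(address: str) -> Optional[Tuple[float, float]]:
    known = {"1 Example Street": (10.0, 20.0)}
    return known.get(address)


def provider_b(address: str) -> Optional[Tuple[float, float]]:
    known = {"2 Sample Road": (30.0, 40.0), "1 Example Street": (0.0, 0.0)}
    return known.get(address)


rows = [
    {"address": "1 Example Street", "latitude": None, "longitude": None},
    {"address": "2 Sample Road", "latitude": None, "longitude": None},
    {"address": "3 Unknown Lane", "latitude": None, "longitude": None},
]

first_pass = geocode_pass(rows, provider_a)   # fills the first row only
second_pass = geocode_pass(rows, provider_b)  # fills the second row, leaves the first untouched
```

In this sketch the second pass never calls `provider_b` for the first row, so a result obtained from an earlier provider is not overwritten, and the third row stays empty because neither stand-in provider knows it.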

Reverse geocoding

DSS provides two different reverse geocoding capabilities:

  • A bundled-data reverse geocoder, which extracts administrative information from geographic coordinates. This reverse geocoder is available as a preparation processor and is also required for Administrative map charts. It does not use any external provider, does not need any API key or payment, and is fast.

  • The ability to call external providers, which require API keys or payments, require Internet access, and are usually significantly slower, but can provide better resolution, up to the address level.

Bundled-data administrative reverse geocoder

This capability is provided by the Reverse geocoding plugin, which you need to install. Please see Installing plugins.

The reverse geocoding preparation processor takes geographic coordinates as input and extracts administrative information, such as country, region, and city. The administrative boundaries used by the processor are based on OpenStreetMap data. The type of administrative boundary for each level depends on the country. For more information, see the OpenStreetMap administrative boundaries documentation.

Use this capability when you need administrative information from a Geo Point or Geometry column without calling an external geocoding provider. The same bundled administrative data is also used by Administrative map charts.

Configure the processor with:

  • A Geo Point or Geometry column as input.

  • Output column names for each administrative level you want to extract.

  • Optionally select “Output the smallest selected administrative area” to output the shape of the administrative entity.

For each selected administrative level, the processor outputs two columns: one with the local name of the administrative entity and one with the English name. If you output the smallest selected administrative area, the processor also outputs the polygon of that administrative entity, encoded using WKT format. When several levels are selected, only the smallest selected area is output. For example, if both city and country are selected, the processor returns the city shape.
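To make the output layout concrete, here is a minimal sketch of the column logic described above. It is not the processor's implementation: the `AdminArea` type, the `build_output` helper, the level ordering, the output column names, and the sample polygons are all hypothetical, and the matching of a point against boundaries is left out.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical ordering of administrative levels, from largest to smallest.
LEVELS = ["country", "region", "city"]


@dataclass
class AdminArea:
    local_name: str
    english_name: str
    wkt: str  # polygon of the area, encoded as WKT


def build_output(matches: Dict[str, AdminArea], selected: List[str],
                 output_smallest_area: bool) -> Dict[str, str]:
    """Build the output columns for one point, given the areas that contain it."""
    row: Dict[str, str] = {}
    for level in selected:
        area = matches[level]
        # Two columns per selected level: local name and English name.
        row[f"{level}_local"] = area.local_name
        row[f"{level}_en"] = area.english_name
    if output_smallest_area and selected:
        # Among the selected levels, keep only the shape of the smallest one.
        smallest = max(selected, key=LEVELS.index)
        row["area_wkt"] = matches[smallest].wkt
    return row


# Made-up match for a point in Munich; the polygons are simplified placeholders.
matches = {
    "country": AdminArea("Deutschland", "Germany",
                         "POLYGON((5 47, 15 47, 15 55, 5 55, 5 47))"),
    "city": AdminArea("München", "Munich",
                      "POLYGON((11.4 48.0, 11.7 48.0, 11.7 48.3, 11.4 48.3, 11.4 48.0))"),
}

row = build_output(matches, selected=["country", "city"], output_smallest_area=True)
```

With both country and city selected, `row` holds four name columns plus a single `area_wkt` column carrying the city polygon, which mirrors the rule that only the smallest selected area is output.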

Reverse geocoding is best-effort and depends on the coverage and freshness of the bundled administrative boundary data.

Online reverse geocoder

This capability is provided by the Geocoder plugin, which you need to install. Please see Installing plugins.

The Reverse Geocoder uses online reverse geocoding service providers. Your DSS instance needs to have outgoing Internet access.

You will need an API key for most of these providers. Some providers offer free plans, which may come with rate limits or specific usage policies. Please make sure to review the usage policy of each provider before using it.

Providers do not all cover the world equally well, so choose a provider whose coverage matches the locations in your data.

Providers marked “yes” under “Batch available” usually perform significantly better.

Provider                   Optimal for    Usage Policy         Batch available
ArcGIS                     World
Baidu                      China          API key
Bing                       World          API key              yes
Gaode                      China          API key
GeocodeFarm                World          Policy
Gisgraphy                  World          API key
Google                     World          Rate Limit, Policy
HERE                       World          API key
Komoot (OSM powered)       World
LocationIQ                 World          API key
Mapbox                     World          API key
MapQuest                   World          API key
OpenCage                   World          API key
OpenStreetMap              World          Policy
USCensus                   US
What3Words                 World          API key
Yandex                     Russia

The plugin provides a recipe. You can chain this recipe several times with different providers, for example when earlier providers failed on some inputs. Each run only attempts to compute rows whose outputs are not already filled.

Zipcode geocoding

Zipcode geocoding resolves a country and zipcode pair to geographic coordinates, at city-level resolution.

This capability is provided by the Zipcode geocoding plugin, which you need to install. Please see Installing plugins.

The zipcode geocoding preparation processor uses bundled data, so it does not call an external geocoding provider and does not require an API key.

Configure the processor with:

  • A country column, containing either a country name or an ISO country code.

  • A zipcode column.

  • The name of the output column that will contain the Geo Point.

The processor outputs a Geo Point column with the latitude and longitude for the matching city-level zipcode entry. As with other geocoding capabilities, zipcode geocoding is best-effort and depends on the coverage of the bundled zipcode data.
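The lookup described above can be sketched as a simple table lookup. This is an illustration under stated assumptions, not the processor's code: the `ZIPCODES` and `COUNTRY_CODES` tables, the `zipcode_to_geopoint` helper, and the approximate coordinates are hypothetical, and the sketch assumes the Geo Point is written as a WKT point.

```python
from typing import Dict, Optional, Tuple

# Hypothetical bundled data: (ISO country code, zipcode) -> (latitude, longitude).
# Coordinates are approximate and for illustration only.
ZIPCODES: Dict[Tuple[str, str], Tuple[float, float]] = {
    ("FR", "75001"): (48.86, 2.34),
    ("US", "10001"): (40.75, -74.0),
}

# Hypothetical mapping from country names to ISO country codes.
COUNTRY_CODES: Dict[str, str] = {"france": "FR", "united states": "US"}


def zipcode_to_geopoint(country: str, zipcode: str) -> Optional[str]:
    """Return a WKT point for the zipcode, or None when there is no match."""
    key = country.strip()
    # Accept either a country name or an ISO country code.
    code = COUNTRY_CODES.get(key.lower(), key.upper())
    coords = ZIPCODES.get((code, zipcode.strip()))
    if coords is None:
        return None  # best-effort: unmatched rows stay empty
    lat, lon = coords
    # WKT points list longitude first, then latitude.
    return f"POINT({lon} {lat})"


by_name = zipcode_to_geopoint("France", "75001")
by_code = zipcode_to_geopoint("fr", "75001")
missing = zipcode_to_geopoint("US", "99999")
```

Here `by_name` and `by_code` resolve to the same point, while `missing` is `None`, reflecting that rows without a matching entry in the bundled data produce no coordinates.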