GeoKettle: A powerful spatial ETL tool for feeding your Spatial Data Infrastructure (SDI)

Thierry Badard, Spatialytics

A full-fledged Spatial Data Infrastructure (SDI) enables the dissemination of data and processes in an interoperable way, through standardized web services such as WFS, WMS, SOS and WPS. Data and processes are cataloged in a CSW, which serves as the entry point to the infrastructure. Feeding and updating such an infrastructure is a repetitive and very time-consuming task. An open source spatial ETL tool such as GeoKettle can help automate many of the complex and repetitive everyday duties that an SDI administrator has to complete. It also helps avoid delivering poor-quality data, since advanced geoprocessing, data cleansing and error correction can be performed within the tool.

This workshop explores, in a practical manner, the areas where GeoKettle can be used to automatically feed and update an SDI. After a short introduction to the fundamental concepts and features of this ETL tool, attendees will learn and experiment with how GeoKettle makes it possible to:

1) Grab data from various heterogeneous sources, such as GIS files, spatial DBMS, Web services (WFS, SOS, …), social networks, …, and transform them in order to feed their SDI with value-added and error-free data. Exercises will rely on PostGIS, GeoServer and the 52 North SOS service.

2) Automatically retrieve metadata about these different data sources for use and dissemination in a catalog service such as GeoNetwork.

3) Easily expose ETL transformations as true Web Processing Services (WPS) in order to disseminate advanced online geoprocessing capabilities through their SDI. Exercises will use the 52 North WPS service.

By the end of the workshop, attendees should have a working knowledge of GeoKettle and be able to design advanced geospatial data transformations that automate many of the loading and updating tasks in their SDI.
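To make the extract, transform and load pattern behind item 1) more concrete, here is a minimal sketch in plain Python. It is not GeoKettle code: GeoKettle transformations are designed graphically, and this only illustrates the kind of steps such a transformation chains together. The function names, the `stations` table, the `name` attribute and the GeoServer URL are all hypothetical, and the JSON output format is assumed to be supported by the WFS server.

```python
from urllib.parse import urlencode


def build_wfs_getfeature_url(base_url, type_name, max_features=100):
    """Extract step: build a WFS 2.0.0 GetFeature request URL (KVP encoding)."""
    params = {
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeNames": type_name,
        "count": max_features,
        # Assumption: the server can return GeoJSON for this output format.
        "outputFormat": "application/json",
    }
    return f"{base_url}?{urlencode(params)}"


def clean_features(features):
    """Transform step: keep point features with valid lon/lat, trim strings."""
    cleaned = []
    for feature in features:
        geometry = feature.get("geometry") or {}
        if geometry.get("type") != "Point":
            continue  # this sketch handles point geometries only
        coords = geometry.get("coordinates") or []
        if len(coords) < 2:
            continue  # missing coordinates
        lon, lat = coords[0], coords[1]
        if not (-180 <= lon <= 180 and -90 <= lat <= 90):
            continue  # coordinates outside the valid WGS 84 range
        properties = {
            key: value.strip() if isinstance(value, str) else value
            for key, value in (feature.get("properties") or {}).items()
        }
        cleaned.append({
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": properties,
        })
    return cleaned


# Load step: parameterized statement for a hypothetical PostGIS table.
INSERT_SQL = (
    "INSERT INTO stations (name, geom) "
    "VALUES (%s, ST_GeomFromText(%s, 4326))"
)


def to_postgis_rows(features):
    """Turn cleaned features into (name, WKT) tuples for INSERT_SQL."""
    rows = []
    for feature in features:
        lon, lat = feature["geometry"]["coordinates"]
        rows.append((feature["properties"].get("name"), f"POINT({lon} {lat})"))
    return rows
```

In practice the URL would be fetched with an HTTP client, the returned `features` array passed through `clean_features`, and the resulting rows sent to PostGIS with a database driver using `INSERT_SQL`. Those I/O steps are left out so that the sketch stays self-contained.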

