Similarity and affine transformations are useful when integrating spatial data from several sources. It is often the case that vectors from one dataset (let's call it 'A') don't coincide with a base dataset ('B'), which could be raster or vector. In such a scenario one would like to reposition dataset A, taking dataset B as the reference.
There are a number of cases when this situation occurs, including but not limited to:
- The dataset A lacks projection information and there is no hint about what it might be.
- The dataset A was already projected but simply doesn't fit well with B (think of data in local datums migrated to WGS84).
- The dataset A has an arbitrary/false coordinate reference system and it's going to be integrated with national data, e.g. archaeological or engineering projects.
- The dataset A was digitized from a distorted image like a scanned old map.
- The dataset A was digitized from a wrongly georeferenced satellite image (this recently happened in OpenStreetMap).
The package 'vec2dtransf'
The R package 'vec2dtransf' provides classes for defining and applying both affine and similarity transformations on vector data, namely on 'sp' objects (objects created by the R package 'sp'). Transformations can be defined from control points or directly from transformation parameters. If redundant control points are provided, least squares is applied, which makes it possible to obtain residuals and the RMSE (Root Mean Square Error).
Similarity transformations can rotate, shift and scale geometries, whereas affine transformations can rotate, shift, scale (even applying different factors on each axis) and skew geometries. At least two (2) control points are required for similarity and three (3) for affine transformations. See (Knippers, 2009) for an introduction to the formulae of each transformation.
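To make the difference between the two transformations concrete, here is a minimal base R sketch that does not use 'vec2dtransf'. It assumes the common parameterisation x' = ax + by + c, y' = -bx + ay + d for similarity and x' = ax + by + c, y' = dx + ey + f for affine. The function names and the sample coordinates are hypothetical, and the parameter ordering used inside the package may differ.

```r
# Apply a similarity transformation to an n x 2 matrix of (x, y) coordinates.
# Assumed form: x' = a*x + b*y + c ; y' = -b*x + a*y + d
apply_similarity <- function(xy, a, b, c, d) {
  cbind(x = a * xy[, 1] + b * xy[, 2] + c,
        y = -b * xy[, 1] + a * xy[, 2] + d)
}

# Apply an affine transformation to an n x 2 matrix of (x, y) coordinates.
# Assumed form: x' = a*x + b*y + c ; y' = d*x + e*y + f
apply_affine <- function(xy, a, b, c, d, e, f) {
  cbind(x = a * xy[, 1] + b * xy[, 2] + c,
        y = d * xy[, 1] + e * xy[, 2] + f)
}

# Hypothetical unit square (made-up coordinates)
square <- cbind(x = c(0, 1, 1, 0), y = c(0, 0, 1, 1))

# Similarity: scale by 2, rotate by 90 degrees (clockwise under this sign
# convention) and shift by (10, 20)
s <- 2
theta <- pi / 2
sim <- apply_similarity(square, a = s * cos(theta), b = s * sin(theta),
                        c = 10, d = 20)

# Affine: a different scale factor on each axis plus a skew term,
# which a similarity transformation cannot reproduce
aff <- apply_affine(square, a = 2, b = 0.5, c = 10, d = 0, e = 3, f = 20)
```

The similarity case has four free parameters and the affine case six. Each control point contributes two equations, which is why at least two and three control points are needed, respectively.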
As an example, we could take these two datasets: La Guajira department (dataset A, source: SIGOT) and the Colombian boundary (dataset B). The datasets are included in this post just for demonstration purposes. The datasets share the spatial reference system specified by the EPSG code 3116.
Let's import them into R through the rgdal package (first extract the data and adjust their path when required):
This is what they look like (dataset A: red line, dataset B: green polygon with black border):
We can also have a closer look:
Notice that the two datasets simply don't fit, despite sharing the same spatial reference system. In this example, we will make La Guajira department fit with the Colombian border.
To reposition dataset A using dataset B as the basis, we select points that are present in both datasets and arrange their coordinates this way:
X SOURCE Y SOURCE X TARGET Y TARGET
Source coordinates correspond to the dataset A, i.e., the dataset to be transformed, whereas target coordinates correspond to the base dataset ('B').
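As an illustration of that layout, the following base R sketch builds a small control point table by hand. The column names and all coordinates are hypothetical and are not the sample data shipped with the package.

```r
# Hypothetical control points: each row pairs a location in dataset A (source)
# with the same location in dataset B (target). All coordinates are made up.
cp <- data.frame(
  id       = c("P1", "P2", "P3", "P4"),
  x_source = c(1000, 2000, 2000, 1000),
  y_source = c(1000, 1000, 2000, 2000),
  x_target = c(1012, 2010, 2015, 1016),
  y_target = c(995, 1003, 2001, 1994)
)

# Keep only the four coordinate columns, in the order
# X SOURCE, Y SOURCE, X TARGET, Y TARGET
coords <- cp[, c("x_source", "y_source", "x_target", "y_target")]

# Distance between source and target for each control point,
# i.e. the misfit before any transformation is applied
shift <- sqrt((coords$x_target - coords$x_source)^2 +
              (coords$y_target - coords$y_source)^2)
```

Looking at `shift` before fitting anything is a cheap sanity check: a single control point with a much larger misfit than the others is often a mismatched pair.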
According to (Iliffe, 2008, pp.135-137), control points must be well spread over the region to be repositioned. For this example, I have selected 16 control points that are included in the 'vec2dtransf' package as sample data, so we can access them by typing:
These control points are well spread over the region to be transformed:
First, we define the affine transformation object from control points (note that we only pass the columns containing coordinates):
Now, we calculate the transformation parameters:
We can see the calculated parameters in this way:
And since we provided more control points than required for an affine transformation, we can access the residuals and the RMSE (in meters), which result from the least squares adjustment.
In a nutshell, residuals are the differences between the transformed source coordinates and the target coordinates of the control points, whereas the RMSE measures the overall deviation of the transformed source coordinates from the target coordinates of the control points. More information on residuals and RMSE can be found in (Iliffe, 2008).
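The following base R sketch shows where such residuals and an RMSE come from when there are more control points than strictly needed. It is not the package's implementation: the control points are made up, the affine form x' = ax + by + c, y' = dx + ey + f is an assumption, and the RMSE is taken here as the square root of the mean squared point distance, which is one common definition.

```r
# Hypothetical control points: source (A) and target (B) coordinates, made up
xs <- c(0, 100, 100, 0, 50)
ys <- c(0, 0, 100, 100, 50)
xt <- c(10.2, 109.9, 115.1, 14.8, 62.6)
yt <- c(20.1, 17.9, 118.2, 119.8, 69.1)

# Design matrix shared by both equations:
# x' = a*x + b*y + c  and  y' = d*x + e*y + f
A <- cbind(xs, ys, 1)

# Five points give five equations per axis for three unknowns, so each
# system is overdetermined; qr.solve() returns the least squares solution
abc <- qr.solve(A, xt)
def <- qr.solve(A, yt)

# Residuals: transformed source coordinates minus target coordinates
res_x <- as.vector(A %*% abc) - xt
res_y <- as.vector(A %*% def) - yt

# RMSE, assumed here to be sqrt(mean of squared point distances)
rmse <- sqrt(mean(res_x^2 + res_y^2))
```

With exactly three control points the residuals would all be zero, since the affine transformation would pass through every point. That is why redundant control points are needed to get any measure of fit.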
To see the overall effect of the transformation, we could also plot the result of the transformation on a grid, based on (UC Davis Soil Resource Laboratory, 2007):
And perhaps also add the dataset A:
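The idea behind the grid view can be sketched without the package: lay a regular grid over the bounding box, transform every grid point, and look at the displacement vectors. The parameters, the bounding box and the grid density below are all hypothetical, and the affine form x' = ax + by + c, y' = dx + ey + f is an assumption.

```r
# Hypothetical affine parameters (x' = a*x + b*y + c ; y' = d*x + e*y + f)
p <- c(a = 1.01, b = 0.02, c = 15, d = -0.02, e = 0.99, f = -8)

# Hypothetical bounding box of the dataset to be transformed
xmin <- 0; xmax <- 1000
ymin <- 0; ymax <- 500

# Regular grid of points covering the bounding box (11 x 6 = 66 points)
grid <- expand.grid(x = seq(xmin, xmax, length.out = 11),
                    y = seq(ymin, ymax, length.out = 6))

# Transformed position of every grid point
grid$x_new <- p[["a"]] * grid$x + p[["b"]] * grid$y + p[["c"]]
grid$y_new <- p[["d"]] * grid$x + p[["e"]] * grid$y + p[["f"]]

# Length of each displacement vector
grid$shift <- sqrt((grid$x_new - grid$x)^2 + (grid$y_new - grid$y)^2)
```

Drawing the original points with `plot()` and the displacements with `arrows()` then gives a picture of how the transformation moves, stretches or skews different parts of the region.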
Finally, we apply the affine transformation on the dataset A:
This is what the datasets look like now (transformed dataset in blue):
And with the detailed view:
Certainly better, isn't it? Let's compare it with the original (wrong) dataset:
By using the package rgdal we can export the result as, e.g., a Shapefile, in this case to a temporary directory:
- Iliffe, J. and Lott, R. Datums and map projections: For remote sensing, GIS and surveying. Section 4.5. pp.109-117,135-137, 2008.
- UC Davis Soil Resource Laboratory. Case Study: Fixing Bad TIGER Line data with R and PostGIS. 2007. <URL: http://casoilresource.lawr.ucdavis.edu/drupal/node/433>
- Knippers, R. 2D Cartesian coordinate transformations. Section 5.4. 2009. <URL: http://kartoweb.itc.nl/geometrics/Coordinate%20transformations/coordtrans.html>