20. Geometry Constructing Functions¶
All the functions we have seen so far work with geometries «as they are» and returns
analyses of the objects (ST_Length(geometry), ST_Area(geometry)),
serializations of the objects (ST_AsText(geometry), ST_AsGML(geometry)),
parts of the object (ST_RingN(geometry,n)) or
true/false tests (ST_Contains(geometry,geometry), ST_Intersects(geometry,geometry)).
«Geometry constructing functions» take geometries as inputs and output new shapes.
20.1. ST_Centroid / ST_PointOnSurface¶
A common need when composing a spatial query is to replace a polygon feature with a point representation of the feature. This is useful for spatial joins (as discussed in Polygon/Polygon Joins) because using ST_Intersects(geometry,geometry) on two polygon layers often results in double-counting: a polygon on a boundary will intersect an object on both sides; replacing it with a point forces it to be on one side or the other, not both.
ST_Centroid(geometry) returns a point that is approximately on the center of mass of the input argument. This simple calculation is very fast, but sometimes not desirable, because the returned point is not necessarily in the feature itself. If the input feature has a convexity (imagine the letter “C”) the returned centroid might not be in the interior of the feature.
ST_PointOnSurface(geometry) returns a point that is guaranteed to be inside the input argument. This makes it more useful for computing «proxy points» for spatial joins.
-- Compare the location of centroid and point-on-surface for a concave geometry SELECT ST_Intersects(geom, ST_Centroid(geom)) AS centroid_inside, ST_Intersects(geom, ST_PointOnSurface(geom)) AS pos_inside FROM (VALUES ('POLYGON ((30 0, 30 10, 10 10, 10 40, 30 40, 30 50, 0 50, 0 0, 0 0, 30 0))'::geometry) ) AS t(geom);
centroid_inside | pos_inside -----------------+------------ f | t
The buffering operation is common in GIS workflows, and is also available in PostGIS. ST_Buffer(geometry,distance) takes in a buffer distance and geometry type and outputs a polygon with a boundary the buffer distance away from the input geometry.
For example, if the US Park Service wanted to enforce a marine traffic zone around Liberty Island, they might build a 500 meter buffer polygon around the island. Liberty Island is a single census block in our
nyc_census_blocks table, so we can easily extract and buffer it.
-- Make a new table with a Liberty Island 500m buffer zone CREATE TABLE liberty_island_zone AS SELECT ST_Buffer(geom,500)::geometry(Polygon,26918) AS geom FROM nyc_census_blocks WHERE blkid = '360610001001001';
The ST_Buffer function also accepts negative distances and builds inscribed polygons within polygonal inputs. For lines and points you will just get an empty return.
Another classic GIS operation – the «overlay» – creates a new coverage by calculating the intersection of two superimposed polygons. The resultant has the property that any polygon in either of the parents can be built by merging polygons in the resultant.
The ST_Intersection(geometry A, geometry B) function returns the spatial area (or line, or point) that both arguments have in common. If the arguments are disjoint, the function returns an empty geometry.
-- What is the area these two circles have in common? -- Using ST_Buffer to make the circles! SELECT ST_AsText(ST_Intersection( ST_Buffer('POINT(0 0)', 2), ST_Buffer('POINT(3 0)', 2) ));
In the previous example we intersected geometries, creating a new geometry that had lines from both the inputs. The ST_Union function does the reverse; it takes inputs and removes common lines. There are two forms of the ST_Union function:
ST_Union(geometry, geometry): A two-argument version that takes in two geometries and returns the merged union. For example, our two-circle example from the previous section looks like this when you replace the intersection with a union.
-- What is the total area these two circles cover? -- Using ST_Buffer to make the circles! SELECT ST_AsText(ST_Union( ST_Buffer('POINT(0 0)', 2), ST_Buffer('POINT(3 0)', 2) ));
ST_Union([geometry]): An aggregate version that takes in a set of geometries and returns the merged geometry for the entire group. The aggregate ST_Union can be used with the
GROUP BYSQL statement to create carefully merged subsets of basic geometries. It is very powerful.
As an example of ST_Union aggregation, consider our
nyc_census_blocks table. Census geography is carefully constructed so that larger geographies can be built up from smaller ones. So, we can create a census tracts map by merging the blocks that form each tract (as we do later in Creating a Census Tracts Table). Or, we can create a county map by merging blocks that fall within each county.
To carry out the merge, note that the unique key
blkid actually embeds information about the higher level geographies. Here are the parts of the key for Liberty Island we used earlier:
360610001001001 = 36 061 000100 1 001 36 = State of New York 061 = New York County (Manhattan) 000100 = Census Tract 1 = Census Block Group 001 = Census Block
So, we can create a county map by merging all geometries that share the same first 5 digits of their
blkid. Be patient; this is computationally expensive and can take a minute or two.
-- Create a nyc_census_counties table by merging census blocks CREATE TABLE nyc_census_counties AS SELECT ST_Union(geom)::Geometry(MultiPolygon,26918) AS geom, SubStr(blkid,1,5) AS countyid FROM nyc_census_blocks GROUP BY countyid;
An area test can confirm that our union operation did not lose any geometry. First, we calculate the area of each individual census block, and sum those areas grouping by census county id.
SELECT SubStr(blkid,1,5) AS countyid, Sum(ST_Area(geom)) AS area FROM nyc_census_blocks GROUP BY countyid ORDER BY countyid;
countyid | area ----------+------------------ 36005 | 110196022.906506 36047 | 181927497.678368 36061 | 59091860.6261323 36081 | 283194473.613692 36085 | 150758328.111199
Then we calculate the area of each of our new county polygons from the county table:
SELECT countyid, ST_Area(geom) AS area FROM nyc_census_counties ORDER BY countyid;
countyid | area ----------+------------------ 36005 | 110196022.906507 36047 | 181927497.678367 36061 | 59091860.6261324 36081 | 283194473.593646 36085 | 150758328.111199
The same answer! We have successfully built an NYC county table from our census blocks data.
20.5. Function List¶
ST_Centroid(geometry): Returns a point geometry that represents the center of mass of the input geometry.
ST_PointOnSurface(geometry): Returns a point geometry that is guaranteed to be in the interior of the input geometry.
ST_Buffer(geometry, distance): For geometry: Returns a geometry that represents all points whose distance from this Geometry is less than or equal to distance. Calculations are in the Spatial Reference System of this Geometry. For geography: Uses a planar transform wrapper.
ST_Intersection(geometry A, geometry B): Returns a geometry that represents the shared portion of geomA and geomB. The geography implementation does a transform to geometry to do the intersection and then transform back to WGS84.
ST_Union(): Returns a geometry that represents the point set union of the Geometries.
ST_AsText(text): Returns the Well-Known Text (WKT) representation of the geometry/geography without SRID metadata.
substring(string [from int] [for int]): PostgreSQL string function to extract substring matching SQL regular expression.
sum(expression): PostgreSQL aggregate function that returns the sum of records in a set of records.