Chapter 5. Spatial Queries

The raison d'etre of spatial databases is to perform queries inside the database which would ordinarily require desktop GIS functionality. Using PostGIS effectively requires knowing which spatial functions are available, how to use them in queries, and how to create indexes that provide good performance.

5.1. Determining Spatial Relationships

Spatial relationships indicate how two geometries interact with one another. They are the fundamental capability for querying geometry.

5.1.1. Dimensionally Extended 9-Intersection Model

According to the OpenGIS Simple Features Implementation Specification for SQL, "the basic approach to comparing two geometries is to make pair-wise tests of the intersections between the Interiors, Boundaries and Exteriors of the two geometries and to classify the relationship between the two geometries based on the entries in the resulting 'intersection' matrix."

In the theory of point-set topology, the points in a geometry embedded in 2-dimensional space are categorized into three sets:

Boundary: The boundary of a geometry is the set of geometries of the next lower dimension. For POINTs, which have a dimension of 0, the boundary is the empty set. The boundary of a LINESTRING is the set of its two endpoints. For POLYGONs, the boundary is the linework of the exterior and interior rings.
Interior: The interior of a geometry consists of those points of the geometry that are not part of the boundary. For POINTs, the interior is the point itself. The interior of a LINESTRING is the set of points between the endpoints. For POLYGONs, the interior is the areal surface inside the polygon.
Exterior: The exterior of a geometry is the rest of the space in which the geometry is embedded; in other words, all points that are neither in the interior nor on the boundary of the geometry. It is a 2-dimensional non-closed surface.
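
The three boundary cases described above can be checked directly in the database. The following is a minimal sketch, assuming a PostGIS-enabled database; the literal geometries and the column aliases are invented for illustration and do not come from this chapter's data.

```sql
-- Illustrative geometries only (hypothetical, SRID 0).
WITH shapes(label, geom) AS (
  VALUES
    ('point',      'POINT(1 1)'::geometry),
    ('linestring', 'LINESTRING(0 0,2 2)'::geometry),
    ('polygon',    'POLYGON((0 0,0 4,4 4,4 0,0 0))'::geometry)
)
SELECT
  label,
  ST_Dimension(geom)            AS dim_geom,
  ST_IsEmpty(ST_Boundary(geom)) AS boundary_is_empty,
  -- Report a dimension only when the boundary is not empty.
  CASE WHEN NOT ST_IsEmpty(ST_Boundary(geom))
       THEN ST_Dimension(ST_Boundary(geom))
  END                           AS dim_boundary
FROM shapes;
```

The point should report an empty boundary, the linestring a boundary of dimension 0 (its endpoints), and the polygon a boundary of dimension 1 (its ring).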

The Dimensionally Extended 9-Intersection Model (DE-9IM) describes the spatial relationship between two geometries by specifying the dimensions of the 9 intersections between the above sets for each geometry. The intersection dimensions can be formally represented in a 3x3 intersection matrix.

For a geometry g, the Interior, Boundary, and Exterior are denoted using the notation I(g), B(g), and E(g). Also, dim(s) denotes the dimension of a set s with the domain {0,1,2,F}:

0 => point
1 => line
2 => area
F => empty set

Using this notation, the intersection matrix for two geometries a and b is:

	Interior	Boundary	Exterior
Interior	dim( I(a) ∩ I(b) )	dim( I(a) ∩ B(b) )	dim( I(a) ∩ E(b) )
Boundary	dim( B(a) ∩ I(b) )	dim( B(a) ∩ B(b) )	dim( B(a) ∩ E(b) )
Exterior	dim( E(a) ∩ I(b) )	dim( E(a) ∩ B(b) )	dim( E(a) ∩ E(b) )

The following overlapping polygons provide a concrete example. ST_Relate computes the matrix 212101212; it matches the overlap pattern T*T***T** used by ST_Overlaps.

Code

WITH example AS (
  SELECT
    'POLYGON ((140 140,140 122,135 105,126 100,118 99,110 94,100 86,97 73,102 59,98 49,87 38,70 30,55 29,40 30,28 38,20 50,14 66,10 84,6 100,4 119,4 143,6 166,10 180,18 191,28 195,40 190,55 189,67 186,86 179,112 165,124 158,133 148,140 140),(48 177,40 177,30 174,24 166,22 159,25 155,30 153,40 154,45 157,52 162,53 169,51 174,48 177))'::geometry AS a,
    'POLYGON ((75 63,79 50,87 38,95 31,108 25,124 22,140 18,154 11,166 6,176 10,184 21,188 35,190 58,190 82,193 104,190 121,185 139,178 154,166 163,154 171,139 172,124 171,112 165,96 152,92 142,92 126,86 116,79 110,75 104,72 94,73 86,75 76,75 63))'::geometry AS b
), relation AS (
  SELECT a, b, ST_Relate(a, b) AS matrix
  FROM example
)
SELECT matrix,
       ST_RelateMatch(matrix, 'T*T***T**') AS matches_overlap_pattern,
       ST_Overlaps(a, b) AS overlaps
FROM relation;

Output

  matrix   | matches_overlap_pattern | overlaps
-----------+-------------------------+----------
 212101212 | t                       | t
(1 row)

The query above is the executable source for this relation. The query below breaks the matrix of a simpler pair of overlapping rectangles into its nine cells and returns one example geometry per cell, arranged as a 3-by-3 set of examples; each column alias begins with the dimension of that cell. The final exterior/exterior cell is clipped to a finite viewport, since the true exterior/exterior intersection is unbounded.

Code

WITH example AS (
  SELECT
    ST_MakeEnvelope(0, 0, 6, 4) AS a,
    ST_MakeEnvelope(3, -1, 8, 3) AS b,
    ST_MakeEnvelope(-1, -2, 9, 5) AS viewport
), parts AS (
  SELECT
    a,
    b,
    ST_Boundary(a) AS ba,
    ST_Boundary(b) AS bb,
    viewport
  FROM example
)
SELECT
  ST_Relate(a, b) AS matrix,
  ST_AsText(ST_Normalize(a)) AS input_a,
  ST_AsText(ST_Normalize(b)) AS input_b,
  ST_AsText(ST_Normalize(ST_Intersection(a, b))) AS "2 I(a) ∩ I(b)",
  ST_AsText(ST_Normalize(ST_Intersection(a, bb))) AS "1 I(a) ∩ B(b)",
  ST_AsText(ST_Normalize(ST_Difference(a, b))) AS "2 I(a) ∩ E(b)",
  ST_AsText(ST_Normalize(ST_Intersection(ba, b))) AS "1 B(a) ∩ I(b)",
  ST_AsText(ST_Normalize(ST_Intersection(ba, bb))) AS "0 B(a) ∩ B(b)",
  ST_AsText(ST_Normalize(ST_Collect(
      'LINESTRING(0 0,0 4,6 4,6 3)'::geometry,
      'LINESTRING(0 0,3 0)'::geometry
  ))) AS "1 B(a) ∩ E(b)",
  ST_AsText(ST_Normalize(ST_Difference(b, a))) AS "2 E(a) ∩ I(b)",
  ST_AsText(ST_Normalize(ST_Collect(
      'LINESTRING(3 -1,8 -1,8 3,6 3)'::geometry,
      'LINESTRING(3 -1,3 0)'::geometry
  ))) AS "1 E(a) ∩ B(b)",
  ST_AsText(ST_Normalize(ST_Difference(viewport, ST_Union(a, b))))
    AS "2 E(a) ∩ E(b) clipped"
FROM parts;

Output

-[ RECORD 1 ]----------+-----------------------------------------------------------------------------
matrix                 | 212101212
input_a                | POLYGON((0 0,0 4,6 4,6 0,0 0))
input_b                | POLYGON((3 -1,3 3,8 3,8 -1,3 -1))
2 I(a) ∩ I(b)          | POLYGON((3 0,3 3,6 3,6 0,3 0))
1 I(a) ∩ B(b)          | LINESTRING(3 0,3 3,6 3)
2 I(a) ∩ E(b)          | POLYGON((0 0,0 4,6 4,6 3,3 3,3 0,0 0))
1 B(a) ∩ I(b)          | LINESTRING(3 0,6 0,6 3)
0 B(a) ∩ B(b)          | MULTIPOINT((6 3),(3 0))
1 B(a) ∩ E(b)          | MULTILINESTRING((0 0,0 4,6 4,6 3),(0 0,3 0))
2 E(a) ∩ I(b)          | POLYGON((3 -1,3 0,6 0,6 3,8 3,8 -1,3 -1))
1 E(a) ∩ B(b)          | MULTILINESTRING((3 -1,8 -1,8 3,6 3),(3 -1,3 0))
2 E(a) ∩ E(b) clipped  | POLYGON((-1 -2,-1 5,9 5,9 -2,-1 -2),(0 0,3 0,3 -1,8 -1,8 3,6 3,6 4,0 4,0 0))

Reading from left to right and top to bottom, the intersection matrix is represented as the text string '212101212'.
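
The reading order can be made explicit by cutting the string back into its three rows. This is a small sketch using only standard PostgreSQL string functions; the column aliases are hypothetical names chosen for illustration.

```sql
-- Each row of the 3x3 matrix occupies three consecutive characters.
SELECT
  substr(matrix, 1, 3) AS interior_row,  -- I(a) against I(b), B(b), E(b)
  substr(matrix, 4, 3) AS boundary_row,  -- B(a) against I(b), B(b), E(b)
  substr(matrix, 7, 3) AS exterior_row   -- E(a) against I(b), B(b), E(b)
FROM (VALUES ('212101212')) AS m(matrix);
```

For '212101212' this yields the rows 212, 101, and 212.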

For more information, refer to:

OpenGIS Simple Features Implementation Specification for SQL (version 1.1, section 2.1.13.2)
Wikipedia: Dimensionally Extended Nine-Intersection Model (DE-9IM)
GeoTools: Point Set Theory and the DE-9IM Matrix

5.1.2. Named Spatial Relationships

To make it easy to determine common spatial relationships, the OGC SFS defines a set of named spatial relationship predicates. PostGIS provides these as the functions ST_Contains, ST_Crosses, ST_Disjoint, ST_Equals, ST_Intersects, ST_Overlaps, ST_Touches, and ST_Within. PostGIS also defines the non-standard relationship predicates ST_Covers, ST_CoveredBy, and ST_ContainsProperly.
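
The difference between a standard predicate and a non-standard one shows up for a point lying exactly on a polygon boundary. The following sketch assumes a PostGIS-enabled database; the rectangle, the two points, and the aliases are hypothetical.

```sql
WITH g AS (
  SELECT
    ST_MakeEnvelope(0, 0, 4, 4) AS poly,
    'POINT(0 2)'::geometry      AS edge_pt,   -- lies on the polygon boundary
    'POINT(2 2)'::geometry      AS inner_pt   -- lies in the polygon interior
)
SELECT
  ST_Contains(poly, edge_pt)  AS contains_edge,
  ST_Covers(poly, edge_pt)    AS covers_edge,
  ST_Touches(poly, edge_pt)   AS touches_edge,
  ST_Contains(poly, inner_pt) AS contains_inner
FROM g;
```

ST_Contains should be false for the boundary point, because no point of it lies in the polygon interior, while ST_Covers and ST_Touches should be true; the interior point should be contained.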

Spatial predicates are usually used as conditions in SQL WHERE or JOIN clauses. The named spatial predicates automatically use a spatial index if one is available, so there is no need to use the bounding box operator && as well. For example:

Code

SELECT city.name, state.name, city.geom
FROM city JOIN state ON ST_Intersects(city.geom, state.geom);

For more details and illustrations, see the PostGIS Workshop.

5.1.3. General Spatial Relationships

In some cases the named spatial relationships are insufficient to provide a desired spatial filter condition. These requirements can be expressed by computing the full DE-9IM intersection matrix with ST_Relate.

To test a particular spatial relationship, an intersection matrix pattern is used. This is the matrix representation augmented with the additional symbols {T,*}:

T => intersection dimension is non-empty; i.e. is in {0,1,2}
* => don't care

Using intersection matrix patterns, specific spatial relationships can be evaluated succinctly. The ST_Relate and ST_RelateMatch functions can both test these patterns.
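
The two ways of testing a pattern can be compared side by side. This is a sketch under the assumption of a PostGIS-enabled database; the two overlapping squares are made up for illustration.

```sql
WITH g AS (
  SELECT ST_MakeEnvelope(0, 0, 2, 2) AS a,
         ST_MakeEnvelope(1, 1, 3, 3) AS b
)
SELECT
  ST_Relate(a, b)                              AS matrix,
  -- Three-argument form: computes the matrix and tests the pattern in one call.
  ST_Relate(a, b, 'T*T***T**')                 AS relate_with_pattern,
  -- Two-step form: useful when the matrix has already been computed.
  ST_RelateMatch(ST_Relate(a, b), 'T*T***T**') AS relate_match
FROM g;
```

For these squares the matrix should be 212101212 and both tests should return true. Computing the matrix once and applying ST_RelateMatch to it can be convenient when several patterns are tested against the same pair of geometries.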

These road segments share a line segment. ST_Crosses does not find this relationship for linear features, because it only returns true when their interiors intersect at a point. The pattern 1*1***1** tests for a one-dimensional intersection instead.

Code

WITH roads AS (
  SELECT
    'LINESTRING(10 10,40 90,70 110,140 110,170 130,190 190)'::geometry AS "road A",
    'LINESTRING(10 190,50 130,90 110,130 110,160 70,180 10)'::geometry AS "road B"
), relation AS (
  SELECT "road A", "road B", ST_Relate("road A", "road B") AS matrix
  FROM roads
)
SELECT
  ST_AsText(ST_Intersection("road A", "road B")) AS overlap,
  matrix,
  ST_Crosses("road A", "road B") AS "ST_Crosses",
  ST_RelateMatch(matrix, '1*1***1**') AS "matches 1*1***1**"
FROM relation;

Output

          overlap           |  matrix   | ST_Crosses | matches 1*1***1**
----------------------------+-----------+------------+-------------------
 LINESTRING(90 110,130 110) | 1F1FF0102 | f          | t
(1 row)

The next example finds a wharf which runs from the lake interior onto its shoreline, with one endpoint on the shoreline. The other wharves in the data provide nearby counterexamples. The pattern 102101FF2 expresses the complete required relationship in one test.

Code

WITH data AS (
  SELECT
    'POLYGON((-20 10,30 30,80 70,110 80,145 85,190 110,300 220,230 220,-20 220,-20 10))'::geometry AS lake,
    'MULTILINESTRING((120 180,145 85,110 80),(10 130,30 70),(90 140,110 80,94.81118881118876 75.83496503496498),(150 160,180 80))'::geometry AS wharves
), wharf AS (
  SELECT lake, (item).path[1] AS id, (item).geom
  FROM data CROSS JOIN LATERAL ST_Dump(wharves) AS item
), relation AS (
  SELECT id, geom, ST_Relate(lake, geom) AS matrix
  FROM wharf
)
SELECT
  id AS wharf,
  ST_AsText(geom) AS "matching wharf",
  matrix
FROM relation
WHERE ST_RelateMatch(matrix, '102101FF2')
ORDER BY id;

Output

 wharf |          matching wharf           |  matrix
-------+-----------------------------------+-----------
     1 | LINESTRING(120 180,145 85,110 80) | 102101FF2
(1 row)

5.2. Using Spatial Indexes

When constructing queries using spatial conditions, for best performance it is important to ensure that a spatial index is used, if one exists (see Section 4.9, "Spatial Indexes"). To do this, a spatial operator or index-aware function must be used in a WHERE or ON clause of the query.

Spatial operators include the bounding box operators (of which the most commonly used is &&; see Section 7.10.1, "Bounding Box Operators" for the full list) and the distance operators used in nearest-neighbor queries (the most common being <->; see Section 7.10.2, "Operators" for the full list).
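
A nearest-neighbor query using the distance operator might look like the following sketch. It assumes the geom_table table and the SRID 312 query point used by the distance examples in this section, plus a GiST index on geom; the LIMIT of 5 is arbitrary.

```sql
-- Five geometries nearest to the query point. With a spatial index on geom,
-- using <-> in ORDER BY allows the planner to choose an index-assisted scan.
SELECT geom
FROM geom_table
ORDER BY geom <-> 'SRID=312;POINT(100000 200000)'::geometry
LIMIT 5;
```

The operator is placed in ORDER BY rather than WHERE because it returns a distance to sort on, not a true/false filter.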

Index-aware functions automatically add a bounding box operator to the spatial condition. Index-aware functions include the named spatial relationship predicates ST_Contains, ST_ContainsProperly, ST_CoveredBy, ST_Covers, ST_Crosses, ST_Intersects, ST_Overlaps, ST_Touches, ST_Within, and ST_3DIntersects, and the distance predicates ST_DWithin, ST_DFullyWithin, ST_3DDFullyWithin, and ST_3DDWithin.

Functions such as ST_Distance do not use indexes to optimize their operation. For example, the following query would be quite slow on a large table:

Code

SELECT geom
FROM geom_table
WHERE ST_Distance(geom, 'SRID=312;POINT(100000 200000)') < 100;

This query selects all the geometries in geom_table which are within 100 units of the point (100000, 200000). It will be slow because it calculates the distance between every geometry in the table and the specified point, i.e. one ST_Distance() calculation is performed for every row in the table.

The number of rows processed can be reduced substantially by using the index-aware function ST_DWithin:

Code

SELECT geom
FROM geom_table
WHERE ST_DWithin(geom, 'SRID=312;POINT(100000 200000)', 100);

This query selects the same geometries, but it does it in a more efficient way. This is enabled by ST_DWithin() using the && operator internally on an expanded bounding box of the query geometry. If there is a spatial index on geom, the query planner will recognize that it can use the index to reduce the number of rows scanned before calculating the distance. The spatial index allows retrieving only records with geometries whose bounding boxes overlap the expanded extent and hence which might be within the required distance. The actual distance is then computed to confirm whether to include the record in the result set.
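
The two-step filter described above can also be written out by hand, which may help when reading query plans. This is only a sketch: the index name geom_table_geom_idx is hypothetical, and the hand-written condition is an approximation of what ST_DWithin does internally, not its actual implementation.

```sql
-- Hypothetical index name; skip this statement if geom is already indexed.
CREATE INDEX IF NOT EXISTS geom_table_geom_idx
  ON geom_table USING GIST (geom);

-- Step 1 (&&): the index discards rows whose bounding box misses the expanded box.
-- Step 2 (ST_Distance): the exact distance is checked for the surviving rows.
SELECT geom
FROM geom_table
WHERE geom && ST_Expand('SRID=312;POINT(100000 200000)'::geometry, 100)
  AND ST_Distance(geom, 'SRID=312;POINT(100000 200000)'::geometry) <= 100;

-- Inspect the plan chosen for the index-aware form.
EXPLAIN
SELECT geom
FROM geom_table
WHERE ST_DWithin(geom, 'SRID=312;POINT(100000 200000)'::geometry, 100);
```

Whether the planner actually uses the index depends on table size and statistics, so the EXPLAIN output is worth checking rather than assuming.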

For more information and examples see the PostGIS Workshop.

5.3. Examples of Spatial SQL

The examples in this section make use of a table of linear roads, and a table of polygonal municipality boundaries. The definition of the bc_roads table is:

Output

Column    | Type              | Description
----------+-------------------+-------------------
gid       | integer           | Unique ID
name      | character varying | Road Name
geom      | geometry          | Location Geometry (Linestring)

The definition of the bc_municipality table is:

Output

Column   | Type              | Description
---------+-------------------+-------------------
gid      | integer           | Unique ID
code     | integer           | Unique ID
name     | character varying | City / Town Name
geom     | geometry          | Location Geometry (Polygon)

5.3.1. What is the total length of all roads, expressed in kilometers?

You can answer this question with a very simple piece of SQL:

Code

SELECT sum(ST_Length(geom))/1000 AS km_roads
FROM bc_roads;

Output

     km_roads
------------------
 70842.1243039643

5.3.2. How large is the city of Prince George, in hectares?

This query combines an attribute condition (on the municipality name) with a spatial calculation (of the polygon area):

Code

SELECT ST_Area(geom)/10000 AS hectares
FROM bc_municipality
WHERE name = 'PRINCE GEORGE';

Output

     hectares
------------------
 32657.9103824927

5.3.3. What is the largest municipality in the province, by area?

This query uses a spatial measurement as an ordering value. There are several ways of approaching this problem, but the most efficient is below:

Code

SELECT
  name,
  ST_Area(geom)/10000 AS hectares
FROM bc_municipality
ORDER BY hectares DESC
LIMIT 1;

Output

     name      |    hectares
---------------+-----------------
 TUMBLER RIDGE | 155020.02556131

Note that in order to answer this query we have to calculate the area of every polygon. If we were doing this a lot, it would make sense to add an area column to the table that could be indexed for performance. By ordering the results in descending order and then using the PostgreSQL LIMIT clause, we can easily select just the largest value without using an aggregate function like MAX().

5.3.4. What is the length of roads fully contained within each municipality?

This is an example of a "spatial join", which brings together data from two tables (with a join) using a spatial interaction ("contained") as the join condition, rather than the usual relational approach of joining on a common key:

Code

SELECT
  m.name,
  sum(ST_Length(r.geom))/1000 AS roads_km
FROM bc_roads AS r
JOIN bc_municipality AS m
  ON ST_Contains(m.geom, r.geom)
GROUP BY m.name
ORDER BY roads_km DESC;

Output

            name            | roads_km
----------------------------+----------
 SURREY                     | 1539.476
 VANCOUVER                  | 1450.331
 LANGLEY DISTRICT           |  833.793
 BURNABY                    |  773.769
 PRINCE GEORGE              |  694.376
 ...

This query takes a while, because every road in the table is summarized into the final result (about 250K roads for the example table). For smaller datasets (several thousand records joined to several hundred) the response can be very fast.

5.3.5. Create a new table with all the roads within the city of Prince George.

This is an example of an "overlay", which takes in two tables and outputs a new table that consists of spatially clipped or cut results. Unlike the "spatial join" demonstrated above, this query creates new geometries. An overlay is like a turbo-charged spatial join, and is useful for more exact analysis work:

Code

CREATE TABLE pg_roads AS
SELECT
  ST_Intersection(r.geom, m.geom) AS intersection_geom,
  ST_Length(r.geom) AS rd_orig_length,
  r.*
FROM bc_roads AS r
JOIN bc_municipality AS m
  ON ST_Intersects(r.geom, m.geom)
WHERE m.name = 'PRINCE GEORGE';

5.3.6. What is the length in kilometers of "Douglas St" in Victoria?

Code

SELECT
  sum(ST_Length(r.geom))/1000 AS kilometers
FROM bc_roads r
JOIN bc_municipality m
  ON ST_Intersects(m.geom, r.geom)
WHERE r.name = 'Douglas St'
  AND m.name = 'VICTORIA';

Output

    kilometers
------------------
 4.89151904172838

5.3.7. What is the largest municipality polygon that has a hole?

Code

SELECT gid, name, ST_Area(geom) AS area
FROM bc_municipality
WHERE ST_NRings(geom) > 1
ORDER BY area DESC
LIMIT 1;

Output

 gid |     name     |       area
-----+--------------+------------------
  12 | SPALLUMCHEEN | 257374619.430216
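
The suggestion in 5.3.3 of storing the area in an indexed column could be sketched as follows. The column name area_ha and the index name are hypothetical, and the stored value would need to be refreshed whenever a geometry changes; this is one possible approach, not a recommendation from the original examples.

```sql
-- Hypothetical column holding the precomputed area in hectares.
ALTER TABLE bc_municipality ADD COLUMN area_ha double precision;

UPDATE bc_municipality
SET area_ha = ST_Area(geom) / 10000;

-- An ordinary B-tree index is enough, since area_ha is a plain number.
CREATE INDEX bc_municipality_area_ha_idx
  ON bc_municipality (area_ha);

-- The largest municipality can now be found without recomputing any area.
SELECT name, area_ha AS hectares
FROM bc_municipality
ORDER BY area_ha DESC
LIMIT 1;
```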
