ST_ClusterDBSCAN — Windowing function that returns integer id for the cluster each input geometry is in based on 2D implementation of Density-based spatial clustering of applications with noise (DBSCAN) algorithm.
integer ST_ClusterDBSCAN(
geometry winset
geom, float8
eps, integer
minpoints)
;
Returns cluster number for each input geometry, based on a 2D implementation of the
Density-based spatial clustering of applications with noise (DBSCAN)
algorithm. Unlike ST_ClusterKMeans, it does not require the number of clusters to be specified, but instead
uses the desired distance (eps
) and density(minpoints
) parameters to construct each cluster.
An input geometry will be added to a cluster if it is either:
A "core" geometry, that is within eps
distance (Cartesian) of at least minpoints
input geometries (including itself) or
A "border" geometry, that is within eps
distance of a core geometry.
Note that border geometries may be within eps
distance of core geometries in more than one cluster; in this
case, either assignment would be correct, and the border geometry will be arbitrarily asssigned to one of the available clusters.
In these cases, it is possible for a correct cluster to be generated with fewer than minpoints
geometries.
When assignment of a border geometry is ambiguous, repeated calls to ST_ClusterDBSCAN will produce identical results if an ORDER BY
clause is included in the window definition, but cluster assignments may differ from other implementations of the same algorithm.
Input geometries that do not meet the criteria to join any other cluster will be assigned a cluster number of NULL. |
Availability: 2.3.0 - requires GEOS
Assigning a cluster number to each polygon within 50 meters of each other. Require at least 2 polygons per cluster
SELECT name, ST_ClusterDBSCAN(geom, eps := 50, minpoints := 2) over () AS cid FROM boston_polys WHERE name > '' AND building > '' AND ST_DWithin(geom, ST_Transform( ST_GeomFromText('POINT(-71.04054 42.35141)', 4326), 26986), 500);
| name | bucket -------------------------------------+-------- Manulife Tower | 0 Park Lane Seaport I | 0 Park Lane Seaport II | 0 Renaissance Boston Waterfront Hotel | 0 Seaport Boston Hotel | 0 Seaport Hotel & World Trade Center | 0 Waterside Place | 0 World Trade Center East | 0 100 Northern Avenue | 1 100 Pier 4 | 1 The Institute of Contemporary Art | 1 101 Seaport | 2 District Hall | 2 One Marina Park Drive | 2 Twenty Two Liberty | 2 Vertex | 2 Vertex | 2 Watermark Seaport | 2 Blue Hills Bank Pavilion | NULL World Trade Center West | NULL (20 rows) |
Combining parcels with the same cluster number into a single geometry. This uses named argument calling
SELECT cid, ST_Collect(geom) AS cluster_geom, array_agg(parcel_id) AS ids_in_cluster FROM ( SELECT parcel_id, ST_ClusterDBSCAN(geom, eps := 0.5, minpoints := 5) over () AS cid, geom FROM parcels) sq GROUP BY cid;