The selectivity is a float between 0 and 1 that estimates the proportion of the rows in the table that the search box will return.
To get our estimate, we need "only" sum, over every histogram cell, the cell value multiplied by the proportion of that cell that falls within the search box, then divide by the number of features that generated the histogram.
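The calculation described above can be sketched in two dimensions as follows. This is a minimal illustration, not the PostGIS implementation: the `sketch_hist2d` struct, its field names, the row-major cell layout, and the `sketch_` helper functions are all assumptions made for the example. It also assumes the histogram extent has positive width and height.

```c
/* Hypothetical 2-d histogram; NOT the PostGIS ND_STATS layout. */
typedef struct {
    double xmin, ymin, xmax, ymax; /* extent covered by the histogram */
    int nx, ny;                    /* number of cells along each axis */
    const double *value;           /* nx * ny cell values, x varies fastest */
    double features;               /* features that generated the histogram */
} sketch_hist2d;

/* Length of the overlap between intervals [a0,a1] and [b0,b1], or 0. */
static double sketch_overlap(double a0, double a1, double b0, double b1)
{
    double lo = a0 > b0 ? a0 : b0;
    double hi = a1 < b1 ? a1 : b1;
    return hi > lo ? hi - lo : 0.0;
}

/* Sum (cell value * covered fraction of the cell), divide by the
 * feature count, and clamp the result into [0, 1]. */
static double sketch_selectivity(const sketch_hist2d *h,
                                 double qxmin, double qymin,
                                 double qxmax, double qymax)
{
    double cw = (h->xmax - h->xmin) / h->nx; /* cell width  */
    double ch = (h->ymax - h->ymin) / h->ny; /* cell height */
    double total = 0.0;
    double sel;

    if (h->features <= 0.0)
        return 0.0; /* assumption: an empty histogram selects nothing */

    for (int j = 0; j < h->ny; j++) {
        double cy0 = h->ymin + j * ch;
        double fy = sketch_overlap(cy0, cy0 + ch, qymin, qymax) / ch;
        for (int i = 0; i < h->nx; i++) {
            double cx0 = h->xmin + i * cw;
            double fx = sketch_overlap(cx0, cx0 + cw, qxmin, qxmax) / cw;
            total += h->value[j * h->nx + i] * fx * fy;
        }
    }

    sel = total / h->features;
    if (sel > 1.0) sel = 1.0;
    else if (sel < 0.0) sel = 0.0;
    return sel;
}
```

For a 2x2 histogram over (0,0)-(2,2) with cell values 1, 2, 3, 4 and 10 features, a search box covering the left column sums 1 + 3 and yields 0.4.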
double total_count = 0.0;

elog(NOTICE, " estimate_selectivity called with null input");

POSTGIS_DEBUGF(3, " mode: %d", mode);

POSTGIS_DEBUG(3, " in 2d mode, stripping the computation down to 2d");

POSTGIS_DEBUG(3, " search box does not overlap histogram, returning 0");

POSTGIS_DEBUG(3, " search box contains histogram, returning 1");

POSTGIS_DEBUG(3, " search box overlap with stats histogram failed");

for ( d = 0; d < nd_stats->ndims; d++ )

cell_size[d] = (max[d] - min[d]) / nd_stats->size[d];
POSTGIS_DEBUGF(3, " cell_size[%d] : %.9g", d, cell_size[d]);

at[d] = nd_ibox.min[d];

float cell_count, ratio;

for ( d = 0; d < nd_stats->ndims; d++ )

nd_cell.min[d] = min[d] + (at[d]+0) * cell_size[d];
nd_cell.max[d] = min[d] + (at[d]+1) * cell_size[d];

total_count += cell_count * ratio;
POSTGIS_DEBUGF(4, " cell (%d,%d), cell value %.6f, ratio %.6f", at[0], at[1], cell_count, ratio);

POSTGIS_DEBUGF(3, " nd_stats->histogram_features = %f", nd_stats->histogram_features);
POSTGIS_DEBUGF(3, " nd_stats->histogram_cells = %f", nd_stats->histogram_cells);
POSTGIS_DEBUGF(3, " sum(overlapped histogram cells) = %f", total_count);
POSTGIS_DEBUGF(3, " selectivity = %f", selectivity);

if (selectivity > 1.0) selectivity = 1.0;
else if (selectivity < 0.0) selectivity = 0.0;
static int nd_increment(ND_IBOX *ibox, int ndims, int *counter)
Given an n-d index array (counter), and a domain to increment it in (ibox) increment it by one...
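One common way to step an n-d counter through an index domain is odometer-style: bump the first dimension that still has room and reset the ones before it. The sketch below illustrates that idea only. The `sketch_ibox` type, the `SKETCH_DIMS` cap, the inclusive bounds, and the choice of which dimension varies fastest are assumptions, not details taken from the PostGIS source.

```c
#include <stdbool.h>

#define SKETCH_DIMS 4 /* hypothetical cap; stands in for ND_DIMS */

/* Hypothetical integer box: inclusive min/max cell index per dimension. */
typedef struct {
    int min[SKETCH_DIMS];
    int max[SKETCH_DIMS];
} sketch_ibox;

/* Advance counter by one position inside ibox. Returns true if a new
 * position was produced, false once every position has been visited
 * (at which point counter has wrapped back to ibox->min). */
static bool sketch_increment(const sketch_ibox *ibox, int ndims, int *counter)
{
    for (int d = 0; d < ndims; d++) {
        if (counter[d] < ibox->max[d]) {
            counter[d] += 1;           /* room left in this dimension */
            return true;
        }
        counter[d] = ibox->min[d];     /* wrap and carry to the next one */
    }
    return false;
}
```

Starting the counter at `ibox->min` and looping with `do { ... } while (sketch_increment(...))` visits each cell of the domain exactly once.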
static int nd_box_overlap(const ND_STATS *nd_stats, const ND_BOX *nd_box, ND_IBOX *nd_ibox)
What stats cells overlap with this ND_BOX? Put the lowest cell addresses in ND_IBOX->min and the high...
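As a rough illustration of mapping a box onto histogram cell addresses, the sketch below divides each coordinate's offset from the histogram extent by the cell size and clamps the result to the valid index range. Everything here is an assumption made for the example: the `sketch_box` and `sketch_ibox` types, the separate `size` array, the clamping behaviour, and the requirement that the extent has positive width in every dimension.

```c
#define SKETCH_DIMS 4 /* hypothetical cap; stands in for ND_DIMS */

typedef struct { double min[SKETCH_DIMS]; double max[SKETCH_DIMS]; } sketch_box;
typedef struct { int min[SKETCH_DIMS]; int max[SKETCH_DIMS]; } sketch_ibox;

/* Map coordinate v to a cell index along one axis of a histogram that
 * spans [lo, hi] with `size` cells, clamped to [0, size - 1]. */
static int sketch_cell_index(double v, double lo, double hi, int size)
{
    double pos = (v - lo) / ((hi - lo) / size);
    if (pos < 0.0) return 0;
    if (pos >= size) return size - 1;
    return (int)pos; /* pos is non-negative here, so truncation is floor */
}

/* Write the lowest and highest overlapped cell address per dimension. */
static void sketch_box_overlap(const sketch_box *extent, const int *size,
                               int ndims, const sketch_box *box,
                               sketch_ibox *out)
{
    for (int d = 0; d < ndims; d++) {
        out->min[d] = sketch_cell_index(box->min[d], extent->min[d],
                                        extent->max[d], size[d]);
        out->max[d] = sketch_cell_index(box->max[d], extent->min[d],
                                        extent->max[d], size[d]);
    }
}
```

Because the indexes are clamped, a caller would still need a separate intersection test to reject boxes that lie entirely outside the histogram extent.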
#define ND_DIMS
The maximum number of dimensions our code can handle.
static int nd_box_contains(const ND_BOX *a, const ND_BOX *b, int ndims)
Return TRUE if ND_BOX a contains b, false otherwise.
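A containment test of this kind can be written as a per-dimension bounds check. The version below is a sketch under assumptions: `sketch_box` is a hypothetical stand-in for ND_BOX, and treating shared edges as contained (closed boxes) is a choice made here, not something the description above specifies.

```c
#include <stdbool.h>

#define SKETCH_DIMS 4 /* hypothetical cap; stands in for ND_DIMS */

typedef struct { double min[SKETCH_DIMS]; double max[SKETCH_DIMS]; } sketch_box;

/* True if b lies inside a in every one of the first ndims dimensions. */
static bool sketch_box_contains(const sketch_box *a, const sketch_box *b,
                                int ndims)
{
    for (int d = 0; d < ndims; d++) {
        if (b->min[d] < a->min[d] || b->max[d] > a->max[d])
            return false; /* b sticks out of a along dimension d */
    }
    return true;
}
```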
static void nd_box_from_gbox(const GBOX *gbox, ND_BOX *nd_box)
Set the values of an ND_BOX from a GBOX.
#define FALLBACK_ND_SEL
More modest fallback selectivity factor.
static int nd_stats_value_index(const ND_STATS *stats, int *indexes)
Given a position in the n-d histogram (i,j,k) return the position in the 1-d values array...
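Flattening an n-d cell address into a 1-d array offset is usually done with per-dimension strides. The sketch below assumes the first dimension varies fastest and returns -1 for an out-of-range address. Both choices, along with the function name and the plain `size` array argument, are assumptions for illustration rather than the documented behaviour of `nd_stats_value_index`.

```c
/* Flatten indexes[0..ndims-1] into an offset into a 1-d values array,
 * where size[d] is the number of cells along dimension d.
 * Returns -1 if any index falls outside its dimension. */
static int sketch_value_index(const int *size, int ndims, const int *indexes)
{
    int stride = 1; /* cells spanned by one step in the current dimension */
    int vdx = 0;

    for (int d = 0; d < ndims; d++) {
        if (indexes[d] < 0 || indexes[d] >= size[d])
            return -1;
        vdx += indexes[d] * stride;
        stride *= size[d];
    }
    return vdx;
}
```

With sizes (3, 4, 2), the address (2, 1, 1) maps to 2 + 1*3 + 1*12 = 17.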
static int nd_box_intersects(const ND_BOX *a, const ND_BOX *b, int ndims)
Return TRUE if ND_BOX a overlaps b, false otherwise.
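Two axis-aligned boxes overlap exactly when their intervals overlap in every dimension, so the test reduces to looking for one separating axis. In this sketch `sketch_box` is a hypothetical stand-in for ND_BOX, and counting boxes that merely touch as overlapping is an assumption.

```c
#include <stdbool.h>

#define SKETCH_DIMS 4 /* hypothetical cap; stands in for ND_DIMS */

typedef struct { double min[SKETCH_DIMS]; double max[SKETCH_DIMS]; } sketch_box;

/* True if a and b share at least one point in the first ndims dimensions. */
static bool sketch_box_intersects(const sketch_box *a, const sketch_box *b,
                                  int ndims)
{
    for (int d = 0; d < ndims; d++) {
        if (a->min[d] > b->max[d] || a->max[d] < b->min[d])
            return false; /* the boxes are separated along dimension d */
    }
    return true;
}
```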
N-dimensional box index type.
float4 histogram_features
static char * nd_box_to_json(const ND_BOX *nd_box, int ndims)
Convert an ND_BOX to a JSON string for printing.
static double nd_box_ratio(const ND_BOX *b1, const ND_BOX *b2, int ndims)
Returns the proportion of b2 that is covered by b1.
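One way to compute such a proportion is to divide the volume of the intersection of the two boxes by the volume of b2. The sketch below takes that approach. `sketch_box` is a hypothetical stand-in for ND_BOX, and returning 0 when the boxes do not overlap with positive extent in every dimension is an assumption about edge-case handling.

```c
#define SKETCH_DIMS 4 /* hypothetical cap; stands in for ND_DIMS */

typedef struct { double min[SKETCH_DIMS]; double max[SKETCH_DIMS]; } sketch_box;

/* Fraction of b2's volume that lies inside b1, in [0, 1]. */
static double sketch_box_ratio(const sketch_box *b1, const sketch_box *b2,
                               int ndims)
{
    double inter_vol = 1.0; /* volume of the intersection of b1 and b2 */
    double b2_vol = 1.0;    /* volume of b2 */

    for (int d = 0; d < ndims; d++) {
        double lo = b1->min[d] > b2->min[d] ? b1->min[d] : b2->min[d];
        double hi = b1->max[d] < b2->max[d] ? b1->max[d] : b2->max[d];

        if (hi <= lo)
            return 0.0; /* no overlap with positive extent on this axis */

        /* hi > lo implies b2 has positive width here, so b2_vol > 0. */
        inter_vol *= hi - lo;
        b2_vol *= b2->max[d] - b2->min[d];
    }
    return inter_vol / b2_vol;
}
```

For b1 = (0,0)-(1,1) and b2 = (0.5,0)-(1.5,2), the intersection has area 0.5 and b2 has area 2, giving a ratio of 0.25.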
N-dimensional box type for calculations, to avoid doing explicit axis conversions from GBOX in all ca...
static int gbox_ndims(const GBOX *gbox)
Given that geodetic boxes are X/Y/Z regardless of the underlying geometry dimensionality and other bo...