July 15th 2014, R-Tree Example

July 14th 2014 Geo-Spatial Coordinate Conversions | | July 15th 2014 Geo-Spatial Coordinate Systems

In this section we closely follow the R-Tree example as given on sqlite.org.

1) We enter the sqlite CLI.

If no database file is specified as command line argument, the CLI is using a transient in-memory database.

2) We create a table that is suited for searching two-dimensional geo-spatial points of interest:

   id,              -- Integer primary key
   minX, maxX,      -- Minimum and maximum X coordinate
   minY, maxY       -- Minimum and maximum Y coordinate

Note that you get an error message, if you are trying the above with an out-of-the-box sqlite3 CLI:

SQL error: no such module: rtree

The SQLite amalgamation needs to be compiled with “-DSQLITE_ENABLE_RTREE=1” to support the rtree module.

3) And insert some geographic points:

    1,                   -- Primary key -- SQLite.org headquarters
    -80.7749, -80.7747,  -- Longitude range
    35.3776, 35.3778     -- Latitude range
    2,                   -- NC 12th Congressional District in 2010
    -81.0, -79.6,
    35.0, 36.2

4) As with all sqlite tables, searching for the primary key is fast:

SELECT * FROM demo_index WHERE id=1;

Primary keys are indexed automatically by sqlite. The performance of indexed searches is independent of the actual size of the table, so that the time spent by searching for primary keys can be thought to be constant.

5) But how about a geo-spatial search for the objects that fall within a certain search area. How fast is that?

Actually, we did not insert points, but bounding boxes. The R-Tree is not designed to find exact matches, but to speed up a geo-spatial (or multi-dimensional range) search by first finding an approximate result set.

This means, that the R-Tree finds all matches where the bounding box of an object overlaps with the search area. As a result, the search space is narrowed down from millions of objects to a few dozens. This can be done very quickly using a range tree, hence the name of the sqlite module. The exact solution is computed afterwards by applying the exact search criteria to the reduced set. Since the latter may be time-consuming to compute, millions of time-consuming computations are saved.

Therefore, when using an R-Tree, we do not perform geo-spatial search in the sense of overlapping polygons (as the SpatialLite SQLite extension does), but we search for the bounding boxes of objects that overlap with a rectangular search area.

For example:

SELECT id FROM demo_index
 WHERE maxX>=-81.08 AND minX<=-80.58
   AND maxY>=35.00  AND minY<=35.44;

yields the index of both inserted table entries, since both overlap with the search rectangle. If there were zillions of other non-overlapping entries in the table, the search would perform equally fast.

July 14th 2014 Geo-Spatial Coordinate Conversions | | July 15th 2014 Geo-Spatial Coordinate Systems