Boston Geographic Information Systems
Part 2: Introduction to Spatial Queries and SFSQL with PostGIS

# What is SFSQL?

One of the greatest things about Spatial Relational Databases is that they bring GIS to a new level by allowing you to apply the expressive SQL declarative language to the spatial domain. With spatial relational databases, you can easily answer questions such as what is the average household income of a neighborhood block. What political district does X reside in. This new animal that marries SQL with GIS is called Simple Features for SQL (SFSQL). In essence SFSQL introduces a new set of functions and aggregate functions to the SQL Language.

# Some Common Queries

In this next section we'll go over some common queries utilizing the data we downloaded in Part 1. Although we are using PostGIS for this exercise, most Spatial Relational Databases such as Oracle Spatial, ArcSDE, DB Spatial Extender, have similar syntax.

## Transforming from One Coordinate System to Another

As noted in Part 1, the data we downloaded is in NAD 83 Meters MA State Plane. What if we needed the data in NAD 83 feet. To accomplish this, we would need to transform our spatial data to the new coordinate system with the transform function. If you look in the spatial_ref_sys table you'll notice that srid = 2249 is the srid of the Nad 83 ft MA State Plane. Our query would look something like this.

```SELECT town, ST_Transform(geom, 2249) as geom_nad83ft FROM towns ```

### Getting Latitude and Longitude Centroid

In order to get the latitude and longitude of our data, we need our coordinate reference system to be in some for of longlat.  To begin with, we first transform our data from NAD 83 Meters Massachusetts State Plane to some variant of longlat - closest match is NAD 83 North American Datum (srid = 4269).  Then we find the centroid and then the x and y coordinates of that.

``` SELECT town, ST_X(ST_Centroid(ST_Transform(geom, 4269))) as longitude, ``````ST_Y(ST_Centroid(ST_Transform(geom, 4269))) as latitude FROM towns``````  ```

## Aggregate Functions

Spatial aggregate functions are much like regular SQL aggregate functions such as AVG, SUM, COUNT in that they work with GROUP BY and HAVING predicates and collapse multiple records into a single record. If you are unfamiliar with the above terms - take a look at Summarizing data with SQL (Structured Query Language)

### Extent

The extent function is an aggregate function that gives you the bounding box of a set of geometries. It is especially useful for determining the bounding rectangle of the area you are trying to map. For example if we wanted to find the bounding box of the boston area in NAD 83 feet, we would do something like this.

``` SELECT town, ST_Extent(ST_Transform(geom, 2249)) as the_extent FROM towns WHERE town = 'BOSTON' GROUP BY town ```

### ST_Union

The ST_Union function is an aggregate function that takes a set of geometries and unions them together to create a single geometry field. In versions prior to PostGIS 1.2, this was called geomunion and the old name was completely removed in PostGIS 2.0. For our towns data, we may have multiple records per town. To get a single geometry that represents the total region of a town, we could use the geomunion function like the example below.

``````
select town,  ST_Union(geom) as thegeom  from towns group by town;
```
```

It is important to note that while the above query will give you one record per town. Our original plain vanilla of

``` select town, geom as thegeom from towns; ```

will give you multiple records per town if multiple record exist per town.

### Seeing Results Visually

To get a visual sense of what all these different queries look like, you can dump out the above outputs as an ESRI shape file using the pgsql2shp tool and view it using a shape viewer such as QuantumGIS or use the query plugin tools directly to output directly from the db. .

``` pgsql2shp -f myshp -h myserver -u apguser -P apgpassword -g thegeom mygisdb "select town, geomunion(geom) as thegeom from towns group by town" ```

One caveat: the shape dumper utility can only dump out fields of type geometry. So for example to dump out a bbox type such as what is returned by the extent function, you'll need to cast the output as a geometry something like

``````
SELECT town, ST_SetSRID(ST_Extent(ST_Transform(geom, 2249))::geometry,2249) as theextent
FROM towns
WHERE town = 'BOSTON' GROUP BY town
```

```

## Distance Queries

### Getting Minimal Distance using Distance function

One common question asked for of a spatial database is how far one object is from another. PostGIS like similar spatial databases has a function called distance which will tell you the minimum distance between one geometry from another.

An example of such a use is below. The below query will tell you How far each zip in your zipcode table is from another zip 02109?. In this case we are comparing the field geom_meter which we have in NAD 83 State Plane meters. Distance is always measured in the metric of your spatial reference system.

``` SELECT z.zipcode As zip1, z2.zipcode As zip2, ST_Distance(z.geom_meter,z2.geom_meter) As thedistance FROM zipcode z, zipcode z2 WHERE z2.zipcode = '02109' ```

If the above geometries were polygons, then we would be getting the minimal distance from any point on each polygon to the other. For point geometries there is no distinction between a min distance and other distances.

### Getting Average Distance using the Centroid function

If we are not dealing with just points, then there are all kinds of distances we could be asking for. For example If we wanted the average distance between 2 polygons, then we would want the distance between the centroids. An average distance compare would then look like.

``` SELECT z.zipcode As zip1, z2.zipcode As zip2, ST_Distance(ST_Centroid(z.geom_meter),ST_Centroid(z2.geom_meter)) As averagedistance FROM zipcode z, zipcode z2 WHERE z2.zipcode = '02109' ```

### Getting a list of objects that are within X distance from another object

An often asked question is what zipcodes are within x away from my zip?. To do that one would think that simply doing a distance < x would suffice. This would suffice except it would be slow since it is not taking advantage of PostGIS indexes and also because if the geometries you are comparing are fairly complex, the search could be exhaustive. To write the query in the most efficient way, we would include the use of the Expand function. Like so ``` SELECT z.zipcode FROM zipcode z, zipcode z2 WHERE z2.zipcode = '02109' AND ST_DWithin(z2.geom_meter, z.geom,3000) ```

The ST_DWithin takes the bounding box of a geometry and expands it out X in all directions and then does a more exhaustive distance check on the actual geometries. So in this case we would be expanding our 02109 zip geometry out 3000 meters.

It internally uses the && operator is the interacts operator that returns true if two geometries bounding boxes intersect and false if they do not. It is a much faster especially if you have indexes on your geometries because it utilizes the bounding box index.

Post Comments About Part 2: Introduction to Spatial Queries and SFSQL with PostGIS

This Document is available under the GNU Free Documentation License 1.2 http://www.gnu.org/copyleft/fdl.html & for download at the BostonGIS site http://www.bostongis.com

Boston GIS      Copyright 2021      Paragon Corporation