Boston Geographic Information Systems

Get \$15 off on Apress.com with our Halloween Sale and Coupon TREAT - Shop Now
Part 2: Introduction to Spatial Queries and SFSQL with PostGIS

# What is SFSQL?

One of the greatest things about Spatial Relational Databases is that they bring GIS to a new level by allowing you to apply the expressive SQL declarative language to the spatial domain. With spatial relational databases, you can easily answer questions such as what is the average household income of a neighborhood block. What political district does X reside in. This new animal that marries SQL with GIS is called Simple Features for SQL (SFSQL). In essence SFSQL introduces a new set of functions and aggregate functions to the SQL Language.

# Some Common Queries

In this next section we'll go over some common queries utilizing the data we downloaded in Part 1. Although we are using PostGIS for this exercise, most Spatial Relational Databases such as Oracle Spatial, ArcSDE, DB Spatial Extender, have similar syntax.

## Transforming from One Coordinate System to Another

As noted in Part 1, the data we downloaded is in NAD 83 Meters MA State Plane. What if we needed the data in NAD 83 feet. To accomplish this, we would need to transform our spatial data to the new coordinate system with the transform function. If you look in the spatial_ref_sys table you'll notice that srid = 2249 is the srid of the Nad 83 ft MA State Plane. Our query would look something like this.

```SELECT town, ST_Transform(geom, 2249) as geom_nad83ft FROM towns ```

### Getting Latitude and Longitude Centroid

In order to get the latitude and longitude of our data, we need our coordinate reference system to be in some for of longlat.  To begin with, we first transform our data from NAD 83 Meters Massachusetts State Plane to some variant of longlat - closest match is NAD 83 North American Datum (srid = 4269).  Then we find the centroid and then the x and y coordinates of that.

``` SELECT town, ST_X(ST_Centroid(ST_Transform(geom, 4269))) as longitude, ``````ST_Y(ST_Centroid(ST_Transform(geom, 4269))) as latitude FROM towns``````  ```

## Aggregate Functions

Spatial aggregate functions are much like regular SQL aggregate functions such as AVG, SUM, COUNT in that they work with GROUP BY and HAVING predicates and collapse multiple records into a single record. If you are unfamiliar with the above terms - take a look at Summarizing data with SQL (Structured Query Language)

### Extent

The extent function is an aggregate function that gives you the bounding box of a set of geometries. It is especially useful for determining the bounding rectangle of the area you are trying to map. For example if we wanted to find the bounding box of the boston area in NAD 83 feet, we would do something like this.

``` SELECT town, ST_Extent(ST_Transform(geom, 2249)) as the_extent FROM towns WHERE town = 'BOSTON' GROUP BY town ```

### ST_Union

The ST_Union function is an aggregate function that takes a set of geometries and unions them together to create a single geometry field. In versions prior to PostGIS 1.2, this was called geomunion and the old name was completely removed in PostGIS 2.0. For our towns data, we may have multiple records per town. To get a single geometry that represents the total region of a town, we could use the geomunion function like the example below.

``````
select town,  ST_Union(geom) as thegeom  from towns group by town;
```
```

It is important to note that while the above query will give you one record per town. Our original plain vanilla of

``` select town, geom as thegeom from towns; ```

will give you multiple records per town if multiple record exist per town.

### Seeing Results Visually

To get a visual sense of what all these different queries look like, you can dump out the above outputs as an ESRI shape file using the pgsql2shp tool and view it using a shape viewer such as QuantumGIS or use the query plugin tools directly to output directly from the db. .

``` pgsql2shp -f myshp -h myserver -u apguser -P apgpassword -g thegeom mygisdb "select town, geomunion(geom) as thegeom from towns group by town" ```

One caveat: the shape dumper utility can only dump out fields of type geometry. So for example to dump out a bbox type such as what is returned by the extent function, you'll need to cast the output as a geometry something like

``````
SELECT town, ST_SetSRID(ST_Extent(ST_Transform(geom, 2249))::geometry,2249) as theextent
FROM towns
WHERE town = 'BOSTON' GROUP BY town
```

```

## Distance Queries

### Getting Minimal Distance using Distance function

One common question asked for of a spatial database is how far one object is from another. PostGIS like similar spatial databases has a function called distance which will tell you the minimum distance between one geometry from another.

An example of such a use is below. The below query will tell you How far each zip in your zipcode table is from another zip 02109?. In this case we are comparing the field geom_meter which we have in NAD 83 State Plane meters. Distance is always measured in the metric of your spatial reference system.

``` SELECT z.zipcode As zip1, z2.zipcode As zip2, ST_Distance(z.geom_meter,z2.geom_meter) As thedistance FROM zipcode z, zipcode z2 WHERE z2.zipcode = '02109' ```

If the above geometries were polygons, then we would be getting the minimal distance from any point on each polygon to the other. For point geometries there is no distinction between a min distance and other distances.

### Getting Average Distance using the Centroid function

If we are not dealing with just points, then there are all kinds of distances we could be asking for. For example If we wanted the average distance between 2 polygons, then we would want the distance between the centroids. An average distance compare would then look like.

``` SELECT z.zipcode As zip1, z2.zipcode As zip2, ST_Distance(ST_Centroid(z.geom_meter),ST_Centroid(z2.geom_meter)) As averagedistance FROM zipcode z, zipcode z2 WHERE z2.zipcode = '02109' ```

### Getting a list of objects that are within X distance from another object

An often asked question is what zipcodes are within x away from my zip?. To do that one would think that simply doing a distance < x would suffice. This would suffice except it would be slow since it is not taking advantage of PostGIS indexes and also because if the geometries you are comparing are fairly complex, the search could be exhaustive. To write the query in the most efficient way, we would include the use of the Expand function. Like so ``` SELECT z.zipcode FROM zipcode z, zipcode z2 WHERE z2.zipcode = '02109' AND ST_DWithin(z2.geom_meter, z.geom,3000) ```

The ST_DWithin takes the bounding box of a geometry and expands it out X in all directions and then does a more exhaustive distance check on the actual geometries. So in this case we would be expanding our 02109 zip geometry out 3000 meters.

It internally uses the && operator is the interacts operator that returns true if two geometries bounding boxes intersect and false if they do not. It is a much faster especially if you have indexes on your geometries because it utilizes the bounding box index.

Post Comments About Part 2: Introduction to Spatial Queries and SFSQL with PostGIS

CommenterComment
5/13/2011 12:43:42 PMReneRegina,

thanks for the answer, my knowledge so far was that I need the ending semicolon to end a SQL statement. Learned again something :)
Regina, I assume that you're the author of "PostGIS in Action"? We here in Europe are waiting patiently for your book, it's postponed all the time :-/ They can't say when the distributor or the publisher can deliver, but I already have the ebook, so no harm done :)

5/13/2011 7:51:23 AMReginaRene,

Sorry for that inconsistency. The ; is optional if you have only one sql statement. It's used to separate SQL statements, but if you have only one there is nothing that follows so it doesn't matter if you have it or not.
5/12/2011 12:57:35 PMRenéHello Dwayne,

as all the others I would like to thank you for this tutorial :)
I have a question for Page 2, on some SELECT examples you don't have an ending ;
e.g.:
"SELECT town, transform(the_geom, 2249) as the_geom_nad83ft FROM towns"

why is that? Jusr forgotten or do I don't understand something?

With regards,

1/10/2011 1:35:16 PMNeilGreat article! Another great way to see a query is to save it as a view and connect directly to PostgreSQL with QGIS. No shapfile required. If you change your view just click refresh in QGIS.

Probably best to ask these questions on the PostGIS Newsgroup. That gets monitored a bit more. It's good to subscribe to at anyrate since these kind of questions are common are answered often.
http://www.postgis.org/support/

1) 1. Can the table name be edited after it is imported, and if so, how?
Yes -- in PgAdmin III just right click on the table properties and type over the name.

The only tricky thing is you should update the geometry_columns table as well to correct.

Alternatively -- you can just rebuild the geometry_columns table using the populate_geometry_columns() function.
(I you don't pass in any arguments -- it will rebuild the table by inspecting all tables with geometries)
http://www.postgis.org/documentation/manual-1.5SVN/Populate_Geometry_Columns.html

2) In importing my Nevada data, I did not no how to look up an appropriate srid, so it loaded with the default -1. Can that be edited, or do I have to reload the table?
Yes -- use UpdateGeometrySRID

http://www.postgis.org/documentation/manual-1.5SVN/UpdateGeometrySRID.html

Regarding password: hmm was this on Windows. I think that's still on our todo to escape special characters in the installer, so we apologize for that one.

Hope you enjoy the rest of the book :)
http://www.postgis.us/chapters (paperclicks next to each will download the code and data for that chapter

Hopefully the book will be in its final form mid-late January and be out in hard-copy.
11/29/2010 9:30:47 AMDwayneRegina, I am amazed and pleased at your quick response to my question. Case sensitivity was the problem, and I can now query both of the tables I imported. I used pgAdminIII for both of them, after downloading the shapfiles, to bring them into my PostgreSQL database. The Massachusetts shapefile was TOWNS_POLY. In trying to follow the tutorial in Part 1, when I imported them to PostGIS, capitalization was retained for the table name. On looking back at the illustration given in Part 1, I now notice that the destination table was in lower case.

BTW, I am using PostGIS 1.5. Now I have two more questions:
1. Can the table name be edited after it is imported, and if so, how?
2. In importing my Nevada data, I did not no how to look up an appropriate srid, so it loaded with the default -1. Can that be edited, or do I have to reload the table?

So far most of the problems I'm having are from the little hidden issues like case sensitivity. It took me two days to get a successful install of PostGIS because I had set up PostgreSQL with a password containing an ampersand, resulting in a password authentication failure during attempted installation.

I just bought a copy of your book last night after reading the first chapter. It looks like a very helpful resource.

Which version of PostGIS are you using? If you are using PostGIS 1.5, the install file is in contrib/postgis-1.5/postgis.sql

What tool did you use for importing your files?

A common issue people have with querying is that depending on the tool you use, it may bring in the tables with the same case as the files and the fields.

e.g. town is not the same as TOWN.

If your table is called TOWN in the database for example, then you have to query it

SELECT ...
FROM "TOWN"

If you stick with lowercase tables, then you don't need to quote to select from them.

Make sure all your tables are lowercase in your database and all your column names are lower case as well.
11/29/2010 12:02:38 AMDwayneI cannot seem to work with tables I have loaded as shapefiles, attempting to follow the example in Part I. The table I downloaded was towns_poly (I did not find one called towns). I used the standard template. I can view the table, but cannot select. For example:
select town from towns_poly

Produces the error:
ERROR: relation "towns_poly" does not exist
LINE 1: select town from towns_poly
^

********** Error **********

ERROR: relation "towns_poly" does not exist
SQL state: 42P01
Character: 18

Attempting to use one I downloaded for Clark County Nevada produces similar results. The only table in the gisdb I created that allows queries is spacial_ref_sys. Towns_poly and Gillis tables are there, but I cannot query them. I could not find lwpostgis.sql or liblwgeom.dll in the installation folders.
6/15/2010 4:45:23 PMGarretWorked like a charm. Thank you very much for your help.
6/7/2010 10:16:58 AMReginaOops wrong example -- for towns try below --
SELECT t.town, MIN(ST_Distance(t.the_geom, b.the_geom)) As dist
FROM towns As t CROSS JOIN towns as b
WHERE b.town = 'BOSTON' and t.town != 'BOSTON'
GROUP BY t.town;

Note the use of SQL MIN here -- because Boston is broken up into many polygons rather than a single geometry in this set. So are other towns by the way, so I just want minimum distance between any set of same named geometries.

The above will give you the shortest distance between any other town to Boston.

6/7/2010 10:04:10 AMReginaGarret,
Actually we really need to update this to use the new function names and simpler syntax for within distance. The whole expand/distance has been superceded in PostGIS 1.3+ with ST_DWithin. Distance has been replaced with ST_Distance. Check out the first chapter of our book which is a free download, for more examples

http://www.manning.com/obe/PostGIS_MEAPCH01.pdf

If you wanted to list the distance in feet between each neighborhood to 'Hyde Park' for neighborhoods within 10000 ft of Hyde Park, it would be the below. Keep in mind distance works for all kinds of geometries and will give you the minimum distance between two objects. (so distance of the 2 closest points)

SELECT n1.name, ST_Distance(n1.the_geom, hp.the_geom) As dist_to_hp_ft
FROM neighborhoods As n1 INNER JOIN
neighborhoods As hp ON ST_DWithin(n1.the_geom, hp.the_geom, 10000)
WHERE hp.name = 'Hyde Park' AND n1.name != 'Hyde Park';
6/4/2010 5:15:09 PMGarretI was wondering if you could give an example of the distance queries using the towns_poly database. Thank you.

This Document is available under the GNU Free Documentation License 1.2 http://www.gnu.org/copyleft/fdl.html & for download at the BostonGIS site http://www.bostongis.com

Boston GIS      Copyright 2015      Paragon Corporation