
Purpose of BostonGIS

BostonGIS is a testbed for GIS and Web Mapping solutions utilizing open source, freely available and/or open gis technologies. We will be using mostly Boston, Massachusetts data to provide mapping and spatial database examples.

Part 1: Getting Started With PostGIS: An almost Idiot's Guide


What Is PostGIS?

PostGIS is an open source, freely available, and fairly OGC compliant spatial database extender for the PostgreSQL Database Management System. In a nutshell it adds spatial functions such as distance, area, union, and intersection, plus specialty geometry, geography, and raster data types, to the database. PostGIS is very similar in functionality to SQL Server Spatial support, ESRI ArcSDE, Oracle Spatial, and DB2 spatial extender except it has more functionality and generally better performance than all of those (and it won't bankrupt you). The latest release version now comes packaged with the PostgreSQL DBMS installs as an optional add-on.

As of this writing PostGIS 3.3.2 is the latest stable release and PostGIS 3.4.0 will be coming out within the next year. See the release notes for what is new in PostGIS 3.3 and in the upcoming PostGIS 3.4. Notable features of PostGIS:

  • Flat-earth (geometry type) spatial support, including 3D geometry (part of the postgis extension), with advanced 3D in the postgis_sfcgal extension
  • Round-earth (geography type) spatial support (part of the postgis extension)
  • Advanced 3D analytics via the postgis_sfcgal extension: enhanced 3D geometry and other geometry processing functions
  • SQL/MM topology support via the postgis_topology extension
  • Seamless raster/vector analysis support via the postgis_raster extension, including an easy to use command-line raster database loader that supports various types and can load whole folders of raster files with one command-line statement, and really jazzy image export functions to output both rasters and geometries as PNG/TIFF and other raster formats
  • The graphical loader, which is packaged with the Windows Application Stack Builder and some other desktop distros, includes batch file uploading as well as exporting. You will find this installed in your PostgreSQL/bin folder; it is called shp2pgsql-gui.exe.
  • Geocoding for US data utilizing US Census Tiger data, using the postgis_tiger_geocoder extension
  • Standardizing addresses using the address_standardizer extension, which parses addresses into parts and is useful for geocoding addresses

The PostGIS Windows bundle also includes cool PostGIS related extensions that augment your spatial enjoyment. In the bundle you will also find:

  • pgRouting for building routing applications.
  • ogr_fdw for querying remote spatial data sources. This includes curl support so you can query web services such as WFS services as well.
  • pgpointcloud for storing LIDAR data in compressed patches (PcPatch) and also performing various operations on it.
  • h3-pg for using Uber's H3 indexing scheme and converting H3 hexagons back and forth between PostGIS geometry/geography representations and h3index hashes.

We will assume a Windows environment for this tutorial, but most of the tutorial will apply to other supported platforms such as Linux, Unix, BSD, Mac etc. We will also be using Massachusetts/Boston data for these examples. For desktop users, the EnterpriseDB one-click installer exists for Mac OS X and Linux desktops as well, so you should be able to follow along without too much fuss.

Installing PostgreSQL with PostGIS Functionality

We will not go into too much detail here since the install wizard (at least the windows one) is pretty good. Below are the basic steps.

  1. Download the installer for your specific platform from the PostgreSQL Binary Download page ( https://www.postgresql.org/download/ ). As of this writing the latest version is PostgreSQL 15 and we will be assuming PostGIS 3.3+. The minimum supported PostgreSQL version is 11 for PostGIS 3.3, 9.6 for PostGIS 3.2, and 9.5 for PostGIS 3.1 (for Windows we build installers for PostgreSQL 9.6-14 for the 3.2 series).
  2. Launch the exe to install PostgreSQL.
  3. Once PostgreSQL is installed, launch Application Stack Builder (Start->Programs->PostgreSQL ..->Application Stack Builder) and pick the version of PostgreSQL you want to install PostGIS on and the version of PostGIS to install.

    You can upgrade from 2.+ to 3.+ without having to backup and restore your db. Running ALTER EXTENSION postgis UPDATE; will do the trick after you've installed the latest PostGIS.

    For versions of PostGIS from 2.5.3+ you can use SELECT postgis_extensions_upgrade(); to upgrade postgis, postgis_raster, postgis_topology, postgis_sfcgal, and postgis_tiger_geocoder.

    There is a PostGIS sampler option which creates a regular spatial db with all the extensions installed. The dumper and loader command-line and GUI tools in the PostgreSQL bin folder will get overwritten by the last PostGIS you installed, so be careful. Generally speaking PostGIS 3.1 should work just fine everywhere you were using PostGIS before. If you are upgrading from PostGIS < 2.2 you may need to install the legacy_minimal.sql or legacy.sql scripts packaged in PostgreSQL//contrib/postgis-3.1

  4. Navigate to spatial extensions and pick PostGIS 3.3. Download and install. Please note the PostGIS Windows installer no longer creates a template database. Using CREATE EXTENSION postgis; to enable PostGIS in a database is the recommended way.

    The create spatial database checkbox is optional, and we generally uncheck it. It creates a spatial database for you to experiment with and has all the extensions packaged with PostGIS Bundle pre-installed.

  5. "Do you want to register the PROJ dir" prompt: this is needed for geometry and raster ST_Transform.
  6. "Do you want to register GDAL_DATA" prompt: this is needed for the PostGIS raster and ogr_fdw extensions. In order to do operations that require raster transformations or other raster warping / clipping etc., PostGIS uses GDAL epsg files. ogr_fdw uses configuration files for some vector types and these are stored in the GDAL_DATA folder. The Windows build makes a local copy of these in the PostgreSQL install\gdal-data folder, and saying yes will automatically add/change a GDAL_DATA environment variable, putting this path in for you. If you use GDAL already (or you are running both PostGIS 32-bit and 64-bit), chances are you already have this environment variable set and you may not want to overwrite it. PostGIS will work happily with an existing one, but just remember if you uninstall a PostGIS or your GDAL, these functions may stop working and you'll need to repoint the environment variable.
  7. Enable raster drivers prompt: all raster drivers are disabled by default. Saying yes to this prompt enables the most common drivers that are considered safe (e.g. they don't call out to web services). If you are not content with the list shown there, you can explicitly enable additional drivers using the PostGIS GDAL enabled raster drivers GUC (postgis.gdal_enabled_drivers), which can be set either with ALTER SYSTEM (for 9.4+) or with ALTER DATABASE for specific databases.
  8. Enable out-of-database rasters prompt: out-of-database rasters are disabled by default for security reasons. If you need them, say yes to this prompt. Again, if you want each database to have different settings, you can opt for the GUC route with the enable out-of-database rasters GUC (postgis.enable_outdb_rasters).
  9. For those of you who want to try experimental builds: we have experimental Windows builds made as part of our continuous integration process whenever anything changes in the PostGIS code base. These can be downloaded from http://postgis.net/windows_downloads.

Creating a spatial database using SQL

You can use something like psql or the pgAdmin query window to create a database and spatially enable it. This is the way to go if you have only a terminal interface to your db or you have a butt load of extensions you want to enable.

To install a bunch of extensions, open up psql or the pgAdmin SQL Query window (which we'll cover shortly) and run the following, including only the extensions you want. Note that \connect is a psql command; in pgAdmin, create the database first and then open a query window connected to it. Your psql steps would look something like this:


CREATE DATABASE gisdb;
\connect gisdb;
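-- Enable PostGIS (geometry and geography types)
CREATE EXTENSION postgis;
-- Enable raster support (a separate extension as of PostGIS 3.0)
CREATE EXTENSION postgis_raster;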
-- Enable Topology
CREATE EXTENSION postgis_topology;
-- Enable PostGIS Advanced 3D 
-- and other geoprocessing algorithms
CREATE EXTENSION postgis_sfcgal;
-- fuzzy matching needed for Tiger
CREATE EXTENSION fuzzystrmatch;
-- rule based standardizer
CREATE EXTENSION address_standardizer;
-- example rule data set
CREATE EXTENSION address_standardizer_data_us;
-- Enable US Tiger Geocoder
CREATE EXTENSION postgis_tiger_geocoder;
-- routing functionality
CREATE EXTENSION pgrouting;
-- spatial foreign data wrappers
CREATE EXTENSION ogr_fdw;

-- LIDAR support
CREATE EXTENSION pointcloud;
-- LIDAR point cloud patches to geometry type casts
CREATE EXTENSION pointcloud_postgis;

-- Uber h3 hexagon indexing scheme for PostGIS 3.3.2+ bundles
CREATE EXTENSION h3;
-- converts between h3 index representations
-- and postgis geometry/geography
CREATE EXTENSION h3_postgis;

Creating a spatial database using pgAdmin GUI

PostgreSQL comes packaged with a fairly decent admin tool called pgAdmin 4. If you are a newbie or not sure what extensions you want, it's best just to use that tool to create a new database and look at the menu of extensions you have available to you.

  1. On Windows pgAdmin is under Start->Programs->PostgreSQL 15->pgAdmin 4

    You don't have to run pgAdmin on the server. You can also install it separately on another machine by downloading it from https://pgadmin.org.

  2. Log in with the super user, usually postgres, and the password you chose during install. If you forgot it, then go into pg_hba.conf (which is located where you specified the data cluster during the PostgreSQL install) and open it with an editor such as notepad or a programmer's editor. On newer installs the method at the end of the line may read scram-sha-256 rather than md5. Change the line
    host all all 127.0.0.1/32 md5

    to

    host all all 127.0.0.1/32 trust


    If you are on a newer Windows (say 2008 or Windows 7), you may see an additional line for IPv6, which you would set to
    host all all ::1/128 trust

    The ::1/128 line is usually the controlling one, since it is what localhost resolves to in IPv6, so you'll want to set this one.

    This will allow any person logged in locally to the computer that PostgreSQL is installed on to access all databases without a password. (127.0.0.1/32) means localhost only (32 is the bit mask). PostgreSQL reads pg_hba.conf at startup and on reload, so reload the server configuration for the change to take effect, and switch the line back once you have reset the password. Note you can add additional lines to this file or remove lines to allow or block certain ip ranges. Lines are checked from the top of the file down, and the first line that matches the connection is the one that is used.

    So for example if you wanted to allow all users to connect as long as they successfully authenticate with an md5 password, then you can add the line

    host all all 0.0.0.0/0 md5

    If it is below the trust lines, you will still be able to connect locally without a password, but non-local connections will need a valid username and password.



    Note: PgAdmin allows editing postgresql.conf and pg_hba.conf from within the tool. These are accessible from Tools->Server Configuration and provide a fairly nice table editor to work with. This feature is only available if you have installed the Admin Pack (adminpack.sql, located in C:\Program Files\PostgreSQL\9.x\share\contrib) in the postgres database.

    To install it, switch to the postgres database and run this command in the sql window of PgAdmin or using psql: CREATE EXTENSION adminpack;

    (NOTE: you can also use the extensions gui of PgAdmin to install it in the postgres db)

  3. Now for the fun part - Create your database. Call it gisdb or whatever you want.
  4. It's generally a good idea to also create a user that owns the database, so that you don't need to use your superuser account to access it.
  5. Verify you have the newest PostGIS with this query:

    SELECT postgis_full_version();

    Should output something of the form

    POSTGIS="3.3.2 3.3.2" [EXTENSION] PGSQL="150" 
    GEOS="3.11.1-CAPI-1.17.1" 
    SFCGAL="SFCGAL 1.4.1, CGAL 5.3, BOOST 1.78.0" 
    PROJ="7.2.1" GDAL="GDAL 3.4.3, released 2022/04/22" LIBXML="2.9.9" 
    LIBJSON="0.12" LIBPROTOBUF="1.2.1" WAGYU="0.5.0 (Internal)" TOPOLOGY RASTER
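
Going back to the pg_hba.conf step above: the ordering rule is easier to see with a small simulation. The following is only a rough Python sketch, not how PostgreSQL itself is implemented. The RULES list and the auth_method function are hypothetical names made up for this illustration, and real pg_hba.conf files have more record types and options than the plain host lines modeled here.

```python
import ipaddress

# Hypothetical, simplified model of pg_hba.conf "host" lines:
# (database, user, CIDR address, auth method). This only illustrates
# that the first matching line decides the authentication method.
RULES = [
    ("all", "all", "127.0.0.1/32", "trust"),
    ("all", "all", "::1/128", "trust"),
    ("all", "all", "0.0.0.0/0", "md5"),
]


def auth_method(client_ip, rules=RULES):
    """Return the method of the first rule whose address range contains client_ip."""
    ip = ipaddress.ip_address(client_ip)
    for _database, _user, cidr, method in rules:
        network = ipaddress.ip_network(cidr)
        # An IPv4 rule never matches an IPv6 client, and vice versa.
        if ip.version == network.version and ip in network:
            return method
    return None  # no line matched, so the connection would be rejected


print(auth_method("127.0.0.1"))     # local IPv4 hits the trust line first
print(auth_method("::1"))           # local IPv6 hits the ::1/128 trust line
print(auth_method("192.168.1.20"))  # remote IPv4 falls through to md5
print(auth_method("2001:db8::5"))   # remote IPv6: no line covers it
```

If the 0.0.0.0/0 md5 line were moved to the top of the list, a local 127.0.0.1 connection would match it first and would be asked for a password, which is why the position of each line matters.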
    

Loading GIS Data Into the Database

Now we have a nice fully functional GIS database with no spatial data. So to do some neat stuff, we need to get some data to play with.

Get the Data

Download data from the MassGIS site.
For this simple exercise just download Towns.
Extract the files into some folder. We will only be using the _POLY files for this exercise.

NOTE: Someone asked how you extract the file if you are on a linux box.

---FOR LINUX USERS ---

If you are on Linux/Unix, I find the exercise even easier. If you are on Linux or have wget handy, you can do the below to download the file after you have cd'ed into the folder you want to put it in.

wget http://wsgw.mass.gov/data/gispub/shape/state/boundaries.zip

Now to extract it simply do the following from a shell prompt

unzip boundaries.zip

---END FOR LINUX USERS ---

Figure out SRID of the data

You will notice one of the files it extracts is called BOUNDARY_POLY.prj. A .prj is often included with ESRI shape files and tells you the projection of the data. We'll need to match this descriptive projection to an SRID (the id field of a spatial ref record in the spatial_ref_sys table) if we ever want to reproject our data.

  • Open up the .prj file in a text editor. You'll see something like NAD_1983_StatePlane_Massachusetts_Mainland_FIPS_2001 and UNIT["Meter",1.0].
  • Open up your pgAdmin query tool and type in the following statement:
    SELECT srid, srtext, proj4text FROM spatial_ref_sys WHERE srtext ILIKE '%Massachusetts%';
    Then click the green arrow. This will bring up about 10 records.
  • Note the srid of the closest match. In this case it's 26986. NOTE: srid is not just a PostGIS term. It is an OGC standard, so you will see SRID mentioned a lot in other spatial databases, GIS web services and applications. Most of the common spatial reference systems have globally defined numbers. So 26986 always maps to NAD83_StatePlane_Massachusetts_Mainland_FIPS_2001 Meters. Most if not all MassGIS data is in this particular projection.
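
If you would rather not eyeball the .prj file, the two pieces of information used above (the projection name and the unit) can be pulled out with a few lines of Python. This is just a sketch: PRJ_TEXT is an abbreviated stand-in written for this example rather than the real contents of BOUNDARY_POLY.prj, and projection_name, linear_unit and search_pattern are hypothetical helper names.

```python
import re

# Abbreviated, illustrative stand-in for a .prj file. A real file carries more
# parameters, but the quoted names and UNIT entries follow this pattern.
PRJ_TEXT = (
    'PROJCS["NAD_1983_StatePlane_Massachusetts_Mainland_FIPS_2001",'
    'GEOGCS["GCS_North_American_1983",UNIT["Degree",0.0174532925199433]],'
    'PROJECTION["Lambert_Conformal_Conic"],'
    'UNIT["Meter",1.0]]'
)


def projection_name(prj_text):
    """Return the quoted name that follows PROJCS[, or None if absent."""
    match = re.search(r'PROJCS\["([^"]+)"', prj_text)
    return match.group(1) if match else None


def linear_unit(prj_text):
    """Return the name of the last UNIT entry, or None.

    Assumption: the projected system's own unit comes after the nested
    geographic one, so the last UNIT is the one the coordinates are in.
    """
    units = re.findall(r'UNIT\["([^"]+)"', prj_text)
    return units[-1] if units else None


def search_pattern(keyword):
    """Build the ILIKE pattern used to search spatial_ref_sys.srtext."""
    return "%" + keyword + "%"


print(projection_name(PRJ_TEXT))
print(linear_unit(PRJ_TEXT))
print(search_pattern("Massachusetts"))
```

To try it on the real file, read the text with open("BOUNDARY_POLY.prj").read() in place of PRJ_TEXT. Picking the closest match out of the records that come back is still a judgment call you make by reading the srtext column.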

Loading the Data

The easiest data to load into PostGIS is ESRI shape data since PostGIS comes packaged with a nice command line tool called shp2pgsql which converts ESRI shape files into PostGIS specific SQL statements that can then be loaded into a PostGIS database.

This file is located in the PostgreSQL bin folder, whose default location on Windows is Program Files/PostgreSQL/15/bin

Make a PostGIS mini toolkit

Since these files are buried so deep, they are a bit annoying to navigate to. To create yourself a self-contained toolkit you can carry with you anywhere, copy the following files from the bin folder into say c:\pgutils:

 comerr32.dll krb5_32.dll libeay32.dll
libiconv-2.dll libintl-2.dll libpq.dll pgsql2shp.exe psql.exe
pg_dump.exe pg_restore.exe shp2pgsql.exe ssleay32.dll libproj-9.dll geos_c.dll geos

Note: The GUI loader is packaged as a self-contained postgisgui folder in the bin of your PostgreSQL install. If you prefer the GUI interface, you can copy that folder and run the shp2pgsql-gui.exe file from anywhere even an external file network path.

Load Towns data

  1. Open up a command prompt.
  2. cd to the folder where you extracted the towns data
  3. Run the following command:
    c:\pgutils\shp2pgsql -s 26986 BOUNDARY_POLY towns > towns.sql
  4. Load into the database with this command:
    psql -d gisdb -h localhost -U postgres -f towns.sql
    If you are on a machine other than the server, you will need to change localhost to the name of the server. Also you may get prompted for a password. For the above I used the default superuser postgres account, but it's best to use a non-superuser account.
  5. Alternatively you can use the GUI to load the data. It is a little different from the PostGIS 1.5 loader, because it allows uploading multiple files at once. To edit any of the settings for a file, click into the cell and the cell will become editable. In this case we replaced the default table name boundary_poly with towns.
  6. This particular dataset is only polygons. You can override the default behavior of bringing them in as multipolygons by clicking the Options button and checking Generate simple geometries ... .

    One thing you can do with the shp2pgsql command line version packaged with PostGIS 2.0-2.2 that you can't do with the GUI is a spatial transformation from one coordinate system to another. So with the command line, we can transform to 4326 (WGS 84) and load to geography type with a single command. Hopefully we'll see this in the GUI in a future release.
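
To make the single command mentioned above concrete, here is a small Python sketch that assembles the shp2pgsql argument list. It relies on three shp2pgsql switches: -s with a from:to pair of SRIDs to reproject while loading, -G to load into the geography type, and -I to create a spatial index. The shp2pgsql_args helper and the towns_geog table name are made up for this example, and the c:\pgutils path is simply the toolkit folder suggested earlier.

```python
def shp2pgsql_args(shapefile, table, source_srid, target_srid=None,
                   geography=False, spatial_index=True,
                   exe=r"c:\pgutils\shp2pgsql"):
    """Build a shp2pgsql argument list (hypothetical helper for illustration)."""
    args = [exe]
    # "-s from:to" reprojects while loading; a plain "-s srid" only labels the data.
    if target_srid is None:
        srid = str(source_srid)
    else:
        srid = "{}:{}".format(source_srid, target_srid)
    args += ["-s", srid]
    if geography:
        args.append("-G")  # load into the geography type instead of geometry
    if spatial_index:
        args.append("-I")  # create a GiST index on the spatial column
    args += [shapefile, table]
    return args


# Transform from Massachusetts state plane meters to WGS 84 and load as geography.
command = shp2pgsql_args("BOUNDARY_POLY", "towns_geog", 26986,
                         target_srid=4326, geography=True)
print(" ".join(command))
```

The printed line can be pasted at the command prompt with > towns_geog.sql on the end, just like step 3, and the resulting file loaded with psql as in step 4.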

Indexing the data

Table indexes are very important for speeding up the processing of most queries. Indexes also have downsides, namely the following:

  1. Indexes slow down the updating of indexed fields.
  2. Indexes take up space. You can think of an index as another table with bookmarks to the first similar to an index to a book.

Given the above, it is often tricky to strike a good balance. There are a couple of general rules of thumb that will help you a long way.

  1. Never put indexes on fields that you will not use as part of a where condition or join condition.
  2. Be cautious when putting indexes on heavily updated fields. For example if you have a field that is frequently updated and is frequently used for updating, you'll need to do benchmark tests to make sure the index does not cause more damage in update situations than it helps in select query situations. In general if the number of records you are updating at any one time for a particular field is small, it's safe to put in an index.
  3. Corollary to 2. For bulk uploads of a table, e.g. if you are loading a table from a shape file, it's best to put the indexes in place after the data load, because if an index is in place the system will be updating the index as it's loading, which could slow things down considerably.
  4. If you know a certain field is unique in a table, it is best to use a unique index or primary key. The reason for this is that it tells the planner that once it's found a match, there is no need to look for another. It also prevents someone from accidentally inserting a duplicate record, as it will throw an error.
  5. For spatial indexes, use a gist index. A gist index basically stores the bounding box of the geometry as the index. For large complex geometries, unfortunately, this is not too terribly useful.
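
To see why a bounding box is less helpful for large complex geometries, here is a toy Python sketch of the two-step check: a cheap bounding box test first, then an exact test only for the candidates that pass. It is not how PostGIS is implemented internally; the L-shaped polygon and the function names are made up for this illustration.

```python
def bounding_box(ring):
    """Return (xmin, ymin, xmax, ymax) for a list of (x, y) vertices."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return min(xs), min(ys), max(xs), max(ys)


def bbox_contains(bbox, point):
    """Cheap test: is the point inside the rectangle?"""
    xmin, ymin, xmax, ymax = bbox
    x, y = point
    return xmin <= x <= xmax and ymin <= y <= ymax


def point_in_ring(ring, point):
    """Exact test by ray casting: count edge crossings to the right of the point."""
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# L-shaped polygon: its box is the full 10 x 10 square, but the upper right is empty.
L_SHAPE = [(0, 0), (10, 0), (10, 4), (4, 4), (4, 10), (0, 10)]
bbox = bounding_box(L_SHAPE)

for point in [(2, 2), (8, 8), (20, 20)]:
    candidate = bbox_contains(bbox, point)
    exact = candidate and point_in_ring(L_SHAPE, point)
    print(point, "bbox hit:", candidate, "really inside:", exact)
```

The point (8, 8) passes the box test but fails the exact test. The more empty space a geometry's box covers, the more of these false candidates have to go through the expensive exact check, which is the weakness rule 5 is pointing at.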

The most common queries we will be doing on this table are spatial queries and queries by the town field. So we will create 2 indexes on these fields. NOTE: The loader has an option to create the spatial index, which we took advantage of, so the spatial index one is not necessary. We present it here just so that if you ever need to create a spatial index, like for csv loaded data, you know how to.

CREATE INDEX idx_towns_geom
ON towns
USING gist(geom);


CREATE INDEX idx_towns_town
ON towns
USING btree(town);

Querying Data

Go back into pgAdmin and refresh your view. Verify that you have a towns table now.

Test out the following queries from the query tool

SELECT ST_Extent(geom) FROM towns WHERE town = 'BOSTON';

SELECT ST_Area(ST_Union(geom)) FROM towns WHERE town = 'BOSTON';
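
If you are wondering what kind of numbers to expect back, the Python sketch below mimics the two queries on plain coordinate lists. It is only an analogy: the two rectangles are made-up stand-ins for a town stored as more than one polygon, and because they do not overlap, adding their areas gives the same result a union would. Since the data is in SRID 26986, whose unit is the meter, the extent comes back in meters and the area in square meters.

```python
def extent(rings):
    """Overall bounding box (xmin, ymin, xmax, ymax) of several rings."""
    xs = [x for ring in rings for x, _ in ring]
    ys = [y for ring in rings for _, y in ring]
    return min(xs), min(ys), max(xs), max(ys)


def ring_area(ring):
    """Planar area of a simple polygon ring using the shoelace formula."""
    total = 0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


# Made-up, non-overlapping parts with coordinates in meters.
PARTS = [
    [(0, 0), (1000, 0), (1000, 2000), (0, 2000)],
    [(2000, 0), (2500, 0), (2500, 500), (2000, 500)],
]

print(extent(PARTS))
area_sq_m = sum(ring_area(ring) for ring in PARTS)
print(area_sq_m)
print(area_sq_m / 1_000_000)  # square meters to square kilometers
```

Dividing by 1,000,000 as in the last line is a handy habit for the ST_Area result too, since square kilometers are easier to sanity check than square meters.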

Viewing the Data

If you are a GIS newbie, I highly recommend using QGIS. QGIS has the ability to view PostGIS data, both geometry and raster, directly and do simple filters on it. It is free, is cross-platform (Linux, Windows, Mac OS X, Unix) and is the least threatening of all the GIS viewers I have seen out there for people new to GIS.








Boston GIS      Copyright 2024      Paragon Corporation