Paragon Corpoation PostGIS Spatial Database Engine OSGeo.org The Open Source Geospatial Foundation UMN Mapserver Boston Geographic Information Systems       PostGreSQL Object Relational Database Management System
Home   About Boston GIS   Consulting Services  Boston GIS Blog  Postgres OnLine Journal  Planet PostGIS  PostGIS Funding

Purpose of BostonGIS

BostonGIS is a testbed for GIS and Web Mapping solutions utilizing open source, freely available and/or open gis technologies. We will be using mostly Boston, Massachusetts data to provide mapping and spatial database examples.

If you have some thoughts or comments on what you would like to see covered on this site, drop us a line on our Feed Back page.


GIS Tutorials on Opensource and OpenGIS technologies Tutorials
GIS Article comments Article and Tutorial Comments
Boston GIS BLog Rss FeedBoston GIS blog

PDF HTML All BostonGIS tutorials packaged together in an E-Book.


Tutorial and Tip Sites
Desktop GIS
External Data
GIS Events and Groups
GIS SDKs and Frameworks
External Resources
Glossary
GIS Blogs Around Boston
External GIS Blogs
External Papers Articles
GIS Quick Guides and References
OpenStreetMap and OpenLayers Tutorials
PostGIS, pgRouting, and PostgreSQL Tutorials
Part 1: Getting Started With PostGIS: An almost Idiot's Guide more ...
pgRouting: Loading OpenStreetMap with Osm2Po and route querying more ...
Part 1: Getting Started With PostGIS: An almost Idiot's Guide (PostGIS 2.0) more ...
OSCON 2009: Tips and Tricks for Writing PostGIS Spatial Queries more ...
PGCon2009: PostGIS 1.4, PostgreSQL 8.4 Spatial Analysis Queries, Building Geometries, Open Jump more ...
PLR Part 3: PL/R and Geospatial Data Abstraction Library (GDAL) RGDAL more ...
PostGIS Nearest Neighbor: A Generic Solution - Much Faster than Previous Solution more ...
Solving the Nearest Neighbor Problem in PostGIS more ...
PLR Part 2: PL/R and PostGIS more ...
PLR Part 1: Up and Running with PL/R (PLR) in PostgreSQL: An almost Idiot's Guide more ...
Part 3: PostGIS Loading Data from Non-Spatial Sources more ...
Part 2: Introduction to Spatial Queries and SFSQL with PostGIS more ...
Miscellaneous Tutorials/Cheatsheets/Examples
SpatiaLite Tutorials
Part 1: Getting Started with SpatiaLite: An almost Idiot's Guide

Printer Friendly

What is SpatiaLite?

SpatiaLite is an SQLite database engine with Spatial functions added. You can think of it as a spatial extender for SQLite database engine which is similar in concept to what PostGIS does for the PostgreSQL Object-Relational Database.

For those unfamiliar with SQLite -- it is this little adorable single file relational database system that runs equally well on Windows, Linux, Unix, Mac and is easily embeddable in larger apps. Is Free and Open Soure -- Public domain so no restrictions for commercial and embedding in other applications and I think is superb as a mini-replicator. We use it in some of our projects for synching data back and forth between a mainstream server database such as PostgreSQL, SQL Server, MySQL and a client offline database that needs to synch differential data back and forth to the mother ship.

What is special about it is as follows:

  • all the tables fit into a single file so easy for transport
  • It has this cute little dblink like feature that allows you to mount external files as virtual tables and join with your SQLite data.
  • PHP has built-in drivers for SQLite it, .NET has drivers for it you can dump in your bin folder and get going, SharpMap (has a driver for the SpatiaLite version in the works) - thanks to Bill Dollins similar to what it has for PostGIS. (note the drivers unlike other databases contain the engine as well). This is very important to understand since you want your driver to have the SpatiaLite enhancements.
  • The SQLite core engine is a relatively tiny application that provides you more or less standard ANSI-SQL 92 syntax so with a basic abstraction layer, you can treat this as you would any other relational database.
  • My favorite - it even supports SQL views
  • It supports triggers
  • Has basic ACID/Transactions -- BEGIN/COMMIT;
  • Did I mention its cute, but don't let its cuteness deceive you of its potential. Seems to be lacking an animal mascot that embodies its cuteness.

SpatiaLite sweetens this little database by allowing you to store geometries and query them with spatial functions similar to what you would find in PostgreSQL/PostGIS, Microsoft SQL Server 2008, MySQL, IBM DBII, Oracle Locator/Spatial. In terms of the model it uses, it seems closest in syntax to PostGIS and in fact modeled after PostGIS and also piggy-backs on GEOS and Proj.4 like PostGIS. In fact the functionality you will see is pretty much the functions you get in PostGIS including their names minus the aggregates and some other things, except you just need to strip off the ST_ and add in a bbox filter, so PostGIS users should feel very much at home.

Some key features of SpatiaLite

  1. OGC functions similar to what you find implemented in PostGIS/GEOS. This uses GEOS.
  2. R-Tree Spatial Index if you use SQLite 3.6+, and rudimentary MBR index for lower SQLite
  3. Rudimentary support for curves
  4. Lots of MBR functions (Minimum bounding rectangle), similar to what MySQL 5.1 and below has, but also has real functions as described above for more exact calculations
  5. Unfortunately the current release seems to lack spatial aggregate functions that PostGIS has such as Extent, Union, collect.
  6. Spatial Transformation support using Proj.4 that hmm even SQL Server 2008 lacks

As far as licensing goes, SpatiaLite does not have the same License as SQLite. It is under 3 licenses Mozilla Public and GPLv3

In this little exercise, we shall get you up and running quickly with this cute cute database engine.

In terms of tutorials -- here is a recent one by MapFish people -- SpatiaLite in 5 minutes. This article will be similar except we shall go thru similar exercises we did for SQL Server 2008 and PostGIS as a contrast and compare.

Installing SpatiaLite

Installation is simple. -- just download and extract and run the GUI.

  1. Download from the http://www.gaia-gis.it/spatialite/ : Download SpatiaLite GUI statically linked (choose your platform)
  2. under Useful spatial scripts download http://www.gaia-gis.it/spatialite/binaries.html
  3. Extract the spatialite-gui. Note you can use spatialite-gis to view and import tables, but the spatialite-gui gives you ability to do free SQL queries, but not as good viewing features, while the spatialite-gis can create a new database, import and can show you a whole map, but has limited filtering. We'll cover spatialite-gis graphical tool in the next tutorial.

Creating a spatial database

The GUI has imbedded in it the engine that runs the database similar in concept to the way Microsoft Access application has embedded the engine to control an MS Access MDB. So to get started, launch the GUI.

  • Launch the GUI
  • From menu choose --> files -> Create a new SQLiteDb
  • Call the file boston.sqlite
  • Choose the little icon with tooltip "Execute SQL Script" and choose init_spatialite-2.2.sql
  • Choose Latin if prompted for encoding
  • Right mouse-click on the database and hit refresh and you should see three tables. If you right click on the spatial_ref_sys table and click edit rows, Your screen should look like this:

Loading GIS Data Into the Database

Now we have a nice fully functional GIS database with no spatial data. So to do some neat stuff, we need to get some data to play with.

Get the Data

Download data from the MassGIS site.

NOTE: MAssGIS has restructured their site quite a bit. When we wrote this article it was on a file located at ftp://data.massgis.state.ma.us/pub/shape/state/towns.exe. We've updated this article to link to the newer comparable file

For this simple exercise just download Towns


Extract the file into some folder. We will only be using the _POLY files for this exercise.

Figure out SRID of the data

You will notice one of the files it extracts is called TOWNS_POLY.prj. A .prj is often included with ESRI shape files and tells you the projection of the data. We'll need to match this descriptive projection to an SRID (the id field of a spatial ref record in the spatial_ref_sys table) if we ever want to reproject our data.

  • Open up the .prj file in a text editor. You'll see something like NAD_1983_StatePlane_Massachusetts_Mainland_FIPS_2001 and UNIT["Meter",1.0]
  • In your SpatiaLite GUI From the top tool bar -> choose the little icon with tooltip search srid by name that looks like this and type in Massachusetts and generates and executes SQL on the query pad like this
    SELECT *
    				FROM spatial_ref_sys
    					WHERE ref_sys_name LIKE '%Massachusetts%'
    						ORDER BY srid 
    . This will bring up about 10 records.
  • Note the srid of the closest match. In this case its 26986. NOTE: srid is an OpenGIS consortium term so you will hear the term used in context of most spatial databases. It is an OGC standard so you will see SRID mentioned a lot in other spatial databases, gis webservices and applications. Most of the common spatial reference systems have globally defined numbers. So 26986 always maps to NAD83_StatePlane_Massachusetts_Mainland_FIPS_2001 Meters. Most if not all MassGIS data is in this particular projection.

Loading the Data

The easiest data to load into SpatiaLite is to use the ESRI shape data loader packaged with the SpatiaLite gui.

Load Towns data

  • Open up the SpatiaLite gui if you don't already have it open
  • From the SpatiaLite gui --> File -> Load Shape file and browse for TOWNS_POLY.shp and type in 26986 for SRID. Type in the_geom for field name. Windows Latin1 is selected by default and is fine for Massachusetts data and most data sources. Type in towns for table name. Your screen should look like this. and then click okay. It should state 631 records created and you should see a new table called towns. Note that SQLite is not case sensitive (so similar in concept to SQL Server in normal state), so its okay to use mixed case.

There are a couple of things I would like to point out that the shape file loader auto-magically does for you.

  • It adds an entry to geometry_columns table -- if you type in -- SELECT * from geometry_columns; in the Query window and click the Execute SQL Statement icon to the right of the window, you'll see one record listed in the query result and that the type field says its a MULTIPOLYGON.
  • It adds triggers to the TOWNS_POLY table that prevent data that is not of right SRID or geometry from being added. If you expand the TOWNS_POLY table on the left - you'll see these. These serve the same purpose as the geometry contraints in PostGIS. If you show the source of the triggers, one of them looks something like this: So you see it looks back on the geometry_columns table to ensure what is in that table matches what someone tries to insert.
    CREATE TRIGGER gtu_towns_the_geom BEFORE UPDATE ON towns
    FOR EACH ROW BEGIN
    SELECT RAISE(ROLLBACK,
     '''towns.the_geom'' violates Geometry constraint [geom-type not allowed]')
    WHERE (SELECT type FROM geometry_columns
    WHERE f_table_name = 'towns' AND f_geometry_column = 'the_geom'
    AND (type = GeometryType(NEW.the_geom)
     OR type = GeometryAliasType(NEW.the_geom))) IS NULL;
    END

Indexing the data

Table indexes are very important for speeding up the processing of most queries. There is also a downside to indexes and they are the following

  1. Indexes slow down the updating of indexed fields.
  2. Indexes take up space. You can think of an index as another table with bookmarks to the first similar to an index to a book.

Given the above, it is often times tricky to have a good balance.  There are a couple general rules of thumb to go by that will help you a long way.

  1. Never put indexes on fields that you will not use as part of a where condition or join condition.
  2. Be cautious when putting index fields on heavily updated fields.  For example if you have a field that is frequently updated and is frequently used for updating, you'll need to do benchmark tests to make sure the index does not cause more damage in update situations than it does for select query situations.  In general if the number of records you are updating at any one time for a particular field is small, its safe to put in an index.
  3. Corrollary to 2.  For bulk uploads of a table - e.g. if you are loading a table from a shape, its best to put  the indexes in place after the data load because if an index is in place, the system will be creating indexes as its loading which could slow things down considerably.
  4. If you know a certain field is unique in a table, it is best to use a unique or primary index. The reason for this is that it tells the planner that once its found a match, there is no need to look for another.  It also prevents someone from accidentally inserting a duplicate record as it will throw an error.
  5. For spatial indexes -an R-Tree index. An R-Tree basically stores the bounding box of the geometry as the index. For large complex geometries unfortunately, this is not too terribly useful. SpatiaLite 3.6+ as mentioned has R-Tree indexes

The most common queries we will be doing on this query are spatial queries and queries by the town field. So we will create 2 indexes on these fields.

In the tree view that shows the tables, expand towns and right mouse click on the the_geom field and then choose Build Spatial Index. You will see the little icon change on the column once that is done.

The SpatiaLite GUI doesn't seem to have a right-click feature for other indexes besides spatial, so to create an index on the town field do the following:

CREATE INDEX idx_town ON towns(town); vacuum towns;

Querying Data

Go back into SpatiaLite gui

Test out the following queries from the query tool: -- this is similar to the extent function we did in PostGIS but makes up for the fact that there is no extent function in SpatiaLite. The Rtree index is implemented as a virtual table in SQLite and each record has the bbox min,max settings. So to simulate extent, we simply join by the ROWID and pkid of the two tables.

			
SELECT MIN(idx.xmin) As bxmin,MIN(idx.ymin) As bymin, MAX(idx.xmax) As bxmax, MAX(idx.ymax) As bymax FROM towns As t INNER JOIN idx_towns_the_geom As idx ON t.rowid = idx.pkid WHERE t.town = 'BOSTON';
returns -- bxmin: 225473.6094 bymin: 886444.6250 bxmax: 252005.5781 bymax: 905284.0000
SELECT SUM(Area(the_geom)) FROM towns where town = 'BOSTON';
returns 128251224.3644 --reproject to Massachusetts state plane feet SELECT town, astext(centroid(transform(the_geom,2249))) FROM towns WHERE town > '' ORDER BY town LIMIT 2; returns ABINGTON POINT(802961.762366 2868485.109604) ACTON POINT(672950.258529 3001540.967504); --to see what indices its using EXPLAIN QUERY PLAN SELECT SUM(Area(the_geom)) FROM towns where town = 'BOSTON'; --or the painful steps EXPLAIN SELECT SUM(Area(the_geom)) FROM towns where town = 'BOSTON';

Example: --Here is an exercise we did in SQL Server 2008 --

Here we arbitrarily take the first point that defines a polygon in Boston and ask what town POLYGON/MULTIPOLYGON geometries are within 1 mile of this point and we also want to know the exact distances and results ordered by distance. The speed of this was surprisingly good and finished in under a second and returned 3 rows.



--I was hoping this would use a spatial index but it doesn't
SELECT t.town, Distance(ref.point1,t.the_geom)/0.3048 As dist_ft,
Distance(ref.point1, t.the_geom) As dist_m
FROM towns As t
INNER JOIN (
SELECT PointN(Boundary(the_geom),1) As point1
FROM towns WHERE town = 'BOSTON' LIMIT 1) As ref
ON MbrIntersects(BuildCircleMbr(X(ref.point1), Y(ref.point1),1609.344), t.the_geom)
WHERE Distance(ref.point1, t.the_geom) < 1609.344
ORDER BY Distance(ref.point1, t.the_geom);

--this seems to perform just as well  -- would need a larger set to do a real test
SELECT t.town, Distance(ref.point1,t.the_geom)/0.3048 As dist_ft,
Distance(ref.point1, t.the_geom) As dist_m
FROM towns As t
INNER JOIN (
SELECT PointN(Boundary(the_geom),1) As point1
FROM towns WHERE town = 'BOSTON' LIMIT 1) As ref
ON ( Distance(ref.point1, t.the_geom) < 1609.344)
ORDER BY Distance(ref.point1, t.the_geom);


Viewing the Data

If you are a GIS newbie, I highly recommend using Quantum GIS. The latest binary distributions of QuantumGIS (as of this writing QGIS 1.4+) now have the SpatiaLite driver built in.

You can also use the latest Spatialite-gis minimalist, which has viewer and shapefile importer packaged in. Latest version is 1.0 Alpha and binaries available for Windows, Linux, and Mac OSX.





Post Comments About Part 1: Getting Started with SpatiaLite: An almost Idiot's Guide
Boston External Map Examples
SQL Server 2008 Tutorials
UMN Mapserver Tutorials
General Commentary
Boston GIS      Copyright 2024      Paragon Corporation