Saturday, November 20. 2010
First I'd like to say if you haven't tried GIS Stack Exchange, give it a try. It's a good resource to ask GIS questions of all kind and get answers.
We haven't had a chance to try out the new Denali CTP1, and probably won't be able to get to it for another month or so. It does seem to have some new exciting features for
both Spatial as well as the basic database features. We are also looking to do another compare shoot-out between SQL Server, PostGIS, and Oracle.
Since PostGIS and SQL Server are the ones we are most familiar with, we'll do a bit of contrast and compare here
On the spatial features we have summarized below which are gleaned from New Spatial Features in SQL Server Code-Named "Denali" Community Technology Preview 1
More details can be found on SQL Server "Denali" Books Online
Continue reading "SQL Server 2010 Denali is coming"
Saturday, May 29. 2010
Spatial compares of the top 3 commonly used spatial databases
Last time we did a survey such as this Cross Compare SQL Server 2008 Spatial, PostgreSQL/PostGIS 1.3-1.4, MySQL 5-6
we had in our mix MySQL instead of Oracle. Many people wanted to see Oracle in that comparison, so this time, we have replaced MySQL with Oracle.
In our new compare, Compare SQL Server 2008 R2, Oracle 11G R2, PostgreSQL/PostGIS 1.5 Spatial Features, we focus mostly on the feature sets as documented by each offering in their manuals.
Given the breadth of functionality we were comparing, its quite possible we missed something.
Please let us know if we made any erroneous statements or missed a critical function, and we will be happy to correct. Part of the point of this compare is also for migration purposes. If you are using a feature in databasse X and want to know
if you can still do the same or if there is a comparable function in database Y, we hope this will be helpful in ascertaining that.
We really wanted to include SpatiaLite in there, but SpatiaLite fills a slightly different niche from the aforementioned 3 that I think in most cases is complimentary to the 3 rather than direct competition.
The fact that SpatiaLite is built on SQLite and much of the same plumbing that PostGIS is built on (GEOS/Proj), means it lacks some of the enterprise features we have come to expect from an enterprise database -- e.g strong aggregation functions,
stored procedures/functions, but on the plus side is much more portable (suitable for in the field work) and still has a lot of spatial power under its toolbelt.
PostGIS in Action status
We have received our final set of reviews as well as blog reviews of our book PostGIS in Action and are still making revisions to the chapters based on them and prior reviews.
The reviews were very positive. The last set of reviews, we got perfect scores. However there were noted areas of improvement. For one, I guess people are really into real examples rather than made up ones.
As a result, we have rewritten chapter 1 to be more of a 60 minute quick course in loading geometry/geography, viewing, and writing spatial queries against which will be filled in with greater and greater depth in later chapters.
We've also tried to sprinkle along the way in the book, non-US examples.
This is tricky though since US data is the data set we have most access to and can esily package in without licensing issues. Any thoughts on Non-US datasets we can use would be welcome. There are some things we will have a harder
time with such as adding more pictures and more polished pictures. Those unfortunately just add a lot of overhead. We will try our best to improve on that where it doesn't considerably impact cost.
As far as blog reviews, we thank Bill Dollins and Mark Leslie for their great blog reviews.
It is nice to know that we are hitting the audience and level of depth we had hoped to.
Friday, November 21. 2008
I've been whining a lot lately about how SQL Server 2008 (and none of the other SQL Server's)
have a generate_series() function that I have grown to love in PostgreSQL. Admittedly I've just been too lazy to
create one even though its not that difficult of a task.
Simon Greener over at Spatial DBAdvisor heard my whining and I guess got fed up enough
to create a generate_series() function for SQL Server 2008.
He also has a generate_series function for Oracle too by the way.
Now there are a couple of differences between the way you use it in the 3 databases which are caused
by fundamental differences between the architectures of the 3 databases.
Continue reading "Chocolate and Peanut Butter Cross-Breeding with PostgreSQL, SQL Server 2008, and Oracle"
Saturday, July 19. 2008
A few people have been asking us what are the pros and cons of using SQL Server 2008 Spatial and PostGIS and as a Windows user, why would
you still consider using PostGIS. Rather than simply providing some hand-waving saying "well if you just care about displaying data, then use whatever you
feel comfortable with, but if you want to do real intensive sophisticated spatial analysis and geometric processing without having to purchase a bunch of expensive software, then
PostGIS is probably better for you. Hell why must you think in either or propositions - just use both using the strengths of each.", we have tried really hard to quantify the similarities and differences between the 2 and to boot - we have
also added in MySQL.
Our analysis can be found at Cross Compare SQL Server 2008 Spatial, PostgreSQL/PostGIS 1.3-1.4, MySQL 5-6
If you have any comments, suggestions of additions, things you felt we got wrong, then please don't hesitate to comment and we'll try to update our
Friday, October 12. 2007
Map Crushers and Silverlight
My husband, Leo, and I went to the Re-MIX Boston conference recently. They have posted some videos of some of the Boston Re-Mix sessions and more are coming in case anyone wants to see them.
On the GIS Microsoft Virtual Earth Front
I attended the Virtual Earth presentation. I didn't get too much out of it that I didn't know already except for 2 points.
- I spoke to Scott Ellington from MapDotNet who was in the audience and he said MapDotNet supports PostGIS. I knew MapDotNet would be supporting SQL Server 2008 when it comes out and currently supported SQL Server 2005 and Arc SDE, but this tidbit of news was news to me.
- One of the items showcased was a free product from Microsoft Research called MSR Map Cruncher which allows one to pre-create custom map tiles at various VEarth Zoom Levels to overlay on Microsoft Virtual Earth. I'm not sure what commercial restrictions it has if any, but this tool is basically a poor man's tile cache.
On the Silverlight front
- First Silverlight 1.0 just got released in the past few weeks and people have started to make bouncing balls using it. In fact most of the Silverlight presentations had
some variant of media player and bouncing balls. Kind of disappointing.
- The real stuff is in Silverlight 1.1 with DLR (IronPython, IronRuby, etc.) which is currently in Alpha and more useful controls forthcoming. This will be out in an 8-12 month timeframe. The alpha is out but will change a great deal before final release.
- I went to a talk given by Miguel De Icaza of Novell and Mono fame and it is official that Novell and Microsoft have an agreement where Novell will provide Silverlight compatibility on
all Linux distros via Moonlight. It also sounds like Silverlight will deploy similar to Flash where a user is prompted to install when visiting a Silverlight page and the download will be between 1-4 MB.
- Sadly the neat Moonlight desklets are a Mono/Moonlight enhancement and not part of Silverlight spec and rely on something Miguel called "compass" (I'm sure I'm spelling it wrong), which he thinks is only supported on Linux .
Map Dicing in Spatial Databases: PostGIS Example
The Microsoft Virtual Earth presentation did get me thinking about map dicing in spatial databases. Normally when I get maps, I get them on a silver platter - already at the lowest granularity I need.
But on rare occasions I would have liked to dice up the data more.
Why would you want to do this in a spatial database - particularly PostGIS? For one - the more granular your data, the more generally useful your spatial indexes since they more closely mirror your actual data and also it is speedier for statistical aggregation and doing thematic maps when spatial joining with bigger pieces if you know that your smaller pieces can be fully contained in your larger area of interest. To a point though - at a certain point the over-head of the additional records counteracts the benefits of more useful indexing and also your indexes just become less useful for other reasons.
Anyrate, the approach I am about to show may not be the best since its something I thought up in my sleep.
The basic approach I would use to dice up say US state boundaries or Massachusetts towns or say census blocks into smaller rectangular quadrants - would be to first create a grid of the extent of the area broken out in even rectangles of x width and y height and then do an intersection with my map of interest to get a new diced map.
I will not only dice space - but I shall also pseudo-dice attribute data. I say pseudo because I am going to assume that things like population density etc. are even across each town boundary which we know is not true. For data such as census blocks and depending how low you dice, this assumption may not be so inaccurate.
For those not familiar with the wonders of spatial intersection - take a peak at our explanation of ST_Intersection and ST_Intersects complete with sample diagrams.
This technique uses OGC standard functions so should work as well with slight variation in syntax in other spatial databases such as Oracle Spatial, IBM DB Spatial Extender, MSSQL Spatial etc. The only part not OGC compliant is my use of the PostgreSQL specific function generate_series, but this can be easily simulated in other databases by creating a dummy table of numbers say from 0 to 10000.
The following SQL calls demonstrate this approach. I would be interested in learning how real GIS people do this kind of thing.
Step 1 - Get the data of interest and load it - for this I will be using TOWNSSURVEY_POLYM from MassGIS.
Note psql and pgsql2shp are located in the PostgreSQL bin folder.
shp2pgsql -s 26986 TOWNSSURVEY_POLYM public.towns > towns.sql
psql -h myserver -d mydb -U myuser -f towns.sql
psql -h myserver -d mydb -U myuser -c "CREATE INDEX idx_towns_the_geom ON towns USING gist(the_geom);"
- Figure out the extent of your data and the size rectangle you will need to create. For this I want to know what size rectangle
to make a 200x200 grid (~40000 rows of grid data).
SELECT ceiling((ST_xmax(ST_Extent(the_geom)) - ST_xmin(ST_Extent(the_geom)))/200) as width,
ceiling((ST_ymax(ST_Extent(the_geom)) - ST_ymin(ST_Extent(the_geom)))/200) as height,
ST_Extent(the_geom) as thee
- Make our throw away grid - alas we found a good use for a cartesian product
In case it is not quite clear to folks what I am doing here - here it is in simple english
Create a reference box starting at the origin of our extent of massachusetts that is of dimension 1485x911 meters - in quasi OGC notation - BOX(xorigin yorigin, (xorigin + 1485) (yorigin + 911))
Next take this box and use it as a paint brush to paint across and then down by translating it hor.n*1485, ver.n*911
CREATE TABLE throwaway_grid(gid SERIAL PRIMARY KEY);
SELECT AddGeometryColumn('public', 'throwaway_grid', 'the_geom', 26986, 'POLYGON', 2);
INSERT INTO throwaway_grid(the_geom)
SELECT ST_translate(ref.boxrep, hor.n*width, ver.n*height) As slice
(SELECT ST_xmin(ST_Extent(the_geom)) As xstart, ST_xmin(ST_Extent(the_geom)) as ystart, ST_SetSRID(CAST('BOX(33863.73046875 777606.3125,35348.73046875 778517.3125)' as box2d), 26986) as boxrep,
ceiling((ST_xmax(ST_Extent(the_geom)) - ST_xmin(ST_Extent(the_geom)))/200) as width,
ceiling((ST_ymax(ST_Extent(the_geom)) - ST_ymin(ST_Extent(the_geom)))/200) as height
FROM towns) As ref, generate_series(1,200) as hor(n), generate_series(1,200) as ver(n);
CREATE INDEX idx_throwaway_grid_the_geom ON throwaway_grid USING gist(the_geom);
Step 4 - Create our final dataset - the towns nicely diced up into smaller rectangler shapes
CREATE TABLE towns_grid
gid serial NOT NULL PRIMARY KEY,
town character varying(21),
"type" character varying(2),
sum_acres double precision,
sum_square double precision
SELECT AddGeometryColumn('public', 'towns_grid', 'the_geom', 26986, 'MULTIPOLYGON', 2);
--Query returned successfully: 22713 rows affected, 1273117 ms execution time.
INSERT INTO towns_grid(objectid, town, town_id, pop1980, pop1990, pop2000, popch80_90,
popch90_00, "type", fourcolor, fips_stco, sum_acres, sum_square,
SELECT objectid, town, town_id,
popch90_00*ST_Area(ST_Intersection(t.the_geom, tg.the_geom))/ST_Area(t.the_geom), "type", fourcolor, fips_stco, sum_acres*ST_Area(ST_Intersection(t.the_geom, tg.the_geom))/ST_Area(t.the_geom),
FROM towns t INNER JOIN throwaway_grid tg ON ST_Intersects(t.the_geom, tg.the_geom);
CREATE INDEX idx_towns_grid_the_geom ON towns_grid USING gist(the_geom);
Before and After pictures
|Massachusetts Town Boundary (towns) - Before - 351 records|
|SELECT gid, the_geom |
WHERE town = 'BOSTON' -- 1 record
|Massachusetts Extent Diced into Rectangles - 40,000 records|
|Massachusetts Town (towns_grid) - After Dicing - 22,713 records|
|SELECT gid, the_geom |
WHERE town = 'BOSTON' -- 175 records