Friday, November 21. 2008
Chocolate and Peanut Butter ... Posted by Regina Obe
in dbii, generate_series, microsoft, oracle, postgis postgresql, sql server 2008 at
15:35
Comments (0) Trackback (1) Chocolate and Peanut Butter Cross-Breeding with PostgreSQL, SQL Server 2008, and OracleI've been whining a lot lately about how SQL Server 2008 (and none of the other SQL Server's) have a generate_series() function that I have grown to love in PostgreSQL. Admittedly I've just been too lazy to create one even though its not that difficult of a task. Simon Greener over at Spatial DBAdvisor heard my whining and I guess got fed up enough to create a generate_series() function for SQL Server 2008. He also has a generate_series function for Oracle too by the way. Now there are a couple of differences between the way you use it in the 3 databases which are caused by fundamental differences between the architectures of the 3 databases. Continue reading "Chocolate and Peanut Butter Cross-Breeding with PostgreSQL, SQL Server 2008, and Oracle"Friday, October 12. 2007
Map Dicing and other stuff Posted by Regina Obe
in generate_series, microsoft, postgis postgresql at
00:13
Comment (1) Trackbacks (4) Map Dicing and other stuffMap Crushers and SilverlightMy husband, Leo, and I went to the Re-MIX Boston conference recently. They have posted some videos of some of the Boston Re-Mix sessions and more are coming in case anyone wants to see them. On the GIS Microsoft Virtual Earth FrontI attended the Virtual Earth presentation. I didn't get too much out of it that I didn't know already except for 2 points.
On the Silverlight front
Map Dicing in Spatial Databases: PostGIS ExampleThe Microsoft Virtual Earth presentation did get me thinking about map dicing in spatial databases. Normally when I get maps, I get them on a silver platter - already at the lowest granularity I need. But on rare occasions I would have liked to dice up the data more. Why would you want to do this in a spatial database - particularly PostGIS? For one - the more granular your data, the more generally useful your spatial indexes since they more closely mirror your actual data and also it is speedier for statistical aggregation and doing thematic maps when spatial joining with bigger pieces if you know that your smaller pieces can be fully contained in your larger area of interest. To a point though - at a certain point the over-head of the additional records counteracts the benefits of more useful indexing and also your indexes just become less useful for other reasons. Anyrate, the approach I am about to show may not be the best since its something I thought up in my sleep. The basic approach I would use to dice up say US state boundaries or Massachusetts towns or say census blocks into smaller rectangular quadrants - would be to first create a grid of the extent of the area broken out in even rectangles of x width and y height and then do an intersection with my map of interest to get a new diced map. I will not only dice space - but I shall also pseudo-dice attribute data. I say pseudo because I am going to assume that things like population density etc. are even across each town boundary which we know is not true. For data such as census blocks and depending how low you dice, this assumption may not be so inaccurate. For those not familiar with the wonders of spatial intersection - take a peak at our explanation of ST_Intersection and ST_Intersects complete with sample diagrams. This technique uses OGC standard functions so should work as well with slight variation in syntax in other spatial databases such as Oracle Spatial, IBM DB Spatial Extender, MSSQL Spatial etc. The only part not OGC compliant is my use of the PostgreSQL specific function generate_series, but this can be easily simulated in other databases by creating a dummy table of numbers say from 0 to 10000. The following SQL calls demonstrate this approach. I would be interested in learning how real GIS people do this kind of thing. Step 1 - Get the data of interest and load it - for this I will be using TOWNSSURVEY_POLYM from MassGIS.
Step 3 - Make our throw away grid - alas we found a good use for a cartesian product In case it is not quite clear to folks what I am doing here - here it is in simple english Create a reference box starting at the origin of our extent of massachusetts that is of dimension 1485x911 meters - in quasi OGC notation - BOX(xorigin yorigin, (xorigin + 1485) (yorigin + 911)) Next take this box and use it as a paint brush to paint across and then down by translating it hor.n*1485, ver.n*911
Step 4 - Create our final dataset - the towns nicely diced up into smaller rectangler shapes
CREATE TABLE towns_grid
(
gid serial NOT NULL PRIMARY KEY,
objectid integer,
town character varying(21),
town_id smallint,
pop1980 integer,
pop1990 integer,
pop2000 integer,
popch80_90 smallint,
popch90_00 integer,
"type" character varying(2),
fourcolor smallint,
fips_stco integer,
sum_acres double precision,
sum_square double precision
);
SELECT AddGeometryColumn('public', 'towns_grid', 'the_geom', 26986, 'MULTIPOLYGON', 2);
--Query returned successfully: 22713 rows affected, 1273117 ms execution time.
INSERT INTO towns_grid(objectid, town, town_id, pop1980, pop1990, pop2000, popch80_90,
popch90_00, "type", fourcolor, fips_stco, sum_acres, sum_square,
the_geom)
SELECT objectid, town, town_id,
pop1980*ST_Area(ST_Intersection(t.the_geom, tg.the_geom))/ST_Area(t.the_geom),
pop1990*ST_Area(ST_Intersection(t.the_geom, tg.the_geom))/ST_Area(t.the_geom),
pop2000*ST_Area(ST_Intersection(t.the_geom, tg.the_geom))/ST_Area(t.the_geom),
popch80_90*ST_Area(ST_Intersection(t.the_geom, tg.the_geom))/ST_Area(t.the_geom),
popch90_00*ST_Area(ST_Intersection(t.the_geom, tg.the_geom))/ST_Area(t.the_geom), "type", fourcolor, fips_stco, sum_acres*ST_Area(ST_Intersection(t.the_geom, tg.the_geom))/ST_Area(t.the_geom),
sum_square*ST_Area(ST_Intersection(t.the_geom, tg.the_geom))/ST_Area(t.the_geom),
ST_multi(ST_Intersection(t.the_geom, tg.the_geom))
FROM towns t INNER JOIN throwaway_grid tg ON ST_Intersects(t.the_geom, tg.the_geom);
CREATE INDEX idx_towns_grid_the_geom ON towns_grid USING gist(the_geom);
Before and After pictures
Tuesday, October 02. 2007
More generate_series tricks Posted by Regina Obe
in generate_series, postgis postgresql at
18:09
Comments (0) Trackbacks (11) More generate_series tricksPostGIS generate_series tricksgenerate_series comes in particularly handy for manipulating geometries in a database. The following examples are done using Postgis 1.3.1. Below are some common use cases
/**break multipolygon geometries into single polygon geometries ***/
INSERT INTO mypoly(the_geom)
SELECT ST_GeometryN(the_geom, generate_series(1, ST_NumGeometries(the_geom))) AS poly
FROM mymultipoly
/**The ST_ExteriorRing function only takes Polygons. If each of your geometries
is a multipolygon and you wanted use ST_ExteriorRing
to get the exterior line string of
each polygon in each polygon, you would do something like the below. **/
SELECT a.gid, ST_ExteriorRing(ST_GeometryN(a.the_geom,ST_NumGeometries(a.the_geom))) AS a_singlelinenoholes
FROM somegeomtable a
/** If you wanted to do the above but still maintain
the same number of records as before, you would do this - which would give you a
multilinestring geometry where each linestring represents
an exterior of each polygon
**/
SELECT a.gid, ST_Collect(ST_ExteriorRing(ST_GeometryN(a.the_geom,ST_NumGeometries(a.the_geom)))) AS a_multilinesnoholes
FROM somegeomtable a
GROUP BY a.gid
NOTE: Because of the way generate_series works we can't have 2 generate series calls in the SELECT. So if we need an inner and outer loop then we put one in the SELECT and one in the FROM and have a limiting WHERE clause as shown in the below.
/**Get all interior linestring rings (holes) of a multipolygon into a separate table with one record per interior ring. **/ SELECT ST_BuildArea(ST_InteriorRingN(ST_GeometryN(a.the_geom,gn.n),generate_series(1,ST_NumInteriorRings(ST_GeometryN(a.the_geom,gn.n))) )) AS a_hole FROM somegeomtable a, generate_series(1, (SELECT Max(ST_NumGeometries(the_geom)) FROM somegeomtable)) gn(n) WHERE ST_NumGeometries(the_geom) >= gn.n AND ST_NumInteriorRings(ST_GeometryN(the_geom,gn.n)) > 0 To a Database programmer all problems look like database problemsDoing a lot of database programming warps your thinking. Normally I would consider thinking in the languages you program in to be a bad thing because to some extent it limits your thinking to what is supported in said language. In general I think doing a lot of database programming has had positive effects on me. I find myself thinking about problems in parallel, unencumbered by the optical illusions of step dependency, and instead grouping problems in sets. It seems that regardless of what language I program in, I see sets and patterns everywhere. When that happens I can't help but hit that particular nail with a database hammer. Below is a PostgreSQL example that uses generate_series to generate ASP.NET gridview column markup for a month column cross tab. Originally I used AutogenerateColumns=true property of a grid view, but I needed to manipulate the formatting of my columns so that was a bit less than satisfying. The code below generates a markup record for each grid view month column
SELECT '<asp:BoundField DataField="' || trim(to_char(date '2007-01-01' + (n || ' month')::interval, 'Month'))
|| '" HeaderText="' || trim(to_char(date '2007-01-01' + (n || ' month')::interval, 'Mon'))
|| '" ItemStyle-HorizontalAlign="Center" />' as newval
FROM generate_series(0,11) n
If you were to change the code above to add a SUM aggregate function on strings - (definition of a PostgreSQL SUM aggregate for text you can find here) you would get just one row with all your markup. For this particular one we also created a column that returns the corresponding SQL for the cross tab query.
SELECT '<asp:GridView id="grv" runat="server">' || E'\r\n' || SUM('<asp:BoundField DataField="' || mth.long_mname
|| '" HeaderText="' || mth.short_mname
|| '" ItemStyle-HorizontalAlign="Center" />' || E'\r\n') || '</asp:GridView>' as aspxmarkup,
'SELECT ' ||
SUM('SUM(CASE WHEN report_date BETWEEN \'' || mth.start_date
|| '\' AND \''
|| mth.end_date
|| '\' THEN amount ELSE NULL END) As '
|| mth.long_mname
|| ', ' || E'\r\n') || ' SUM(amount) As total ' || E'\r\n'
|| ' FROM sometable WHERE report_date between \'2007-01-01\' AND \'2007-12-31\'' as sqlcrosstab
FROM
(SELECT (n + 1) As mnum,
trim(to_char(date '2007-01-01' + (n || ' month')::interval, 'Mon')) As short_mname,
trim(to_char(date '2007-01-01' + (n || ' month')::interval, 'Month')) As long_mname,
date '2007-01-01' + (n || ' month')::interval As start_date,
date '2007-01-01' + ((n + 1) || ' month')::interval + - '1 day'::interval As end_date
FROM generate_series(0,11) n) As mth
The gridview markup output of the aspxmarkup column looks like this
<asp:GridView id="grv" runat="server" >
<asp:BoundField DataField="January" HeaderText="Jan" ItemStyle-HorizontalAlign="Center" />
<asp:BoundField DataField="February" HeaderText="Feb" ItemStyle-HorizontalAlign="Center" />
<asp:BoundField DataField="March" HeaderText="Mar" ItemStyle-HorizontalAlign="Center" />
<asp:BoundField DataField="April" HeaderText="Apr" ItemStyle-HorizontalAlign="Center" />
<asp:BoundField DataField="May" HeaderText="May" ItemStyle-HorizontalAlign="Center" />
<asp:BoundField DataField="June" HeaderText="Jun" ItemStyle-HorizontalAlign="Center" />
<asp:BoundField DataField="July" HeaderText="Jul" ItemStyle-HorizontalAlign="Center" />
<asp:BoundField DataField="August" HeaderText="Aug" ItemStyle-HorizontalAlign="Center" />
<asp:BoundField DataField="September" HeaderText="Sep" ItemStyle-HorizontalAlign="Center" />
<asp:BoundField DataField="October" HeaderText="Oct" ItemStyle-HorizontalAlign="Center" />
<asp:BoundField DataField="November" HeaderText="Nov" ItemStyle-HorizontalAlign="Center" />
<asp:BoundField DataField="December" HeaderText="Dec" ItemStyle-HorizontalAlign="Center" />
</asp:GridView>
And the Sql output of the sqlcrosstab column looks like this
SELECT SUM(CASE WHEN report_date BETWEEN '2007-01-01 00:00:00' AND '2007-01-31 00:00:00' THEN amount ELSE NULL END) As January,
SUM(CASE WHEN report_date BETWEEN '2007-02-01 00:00:00' AND '2007-02-28 00:00:00' THEN amount ELSE NULL END) As February,
SUM(CASE WHEN report_date BETWEEN '2007-03-01 00:00:00' AND '2007-03-31 00:00:00' THEN amount ELSE NULL END) As March,
SUM(CASE WHEN report_date BETWEEN '2007-04-01 00:00:00' AND '2007-04-30 00:00:00' THEN amount ELSE NULL END) As April,
SUM(CASE WHEN report_date BETWEEN '2007-05-01 00:00:00' AND '2007-05-31 00:00:00' THEN amount ELSE NULL END) As May,
SUM(CASE WHEN report_date BETWEEN '2007-06-01 00:00:00' AND '2007-06-30 00:00:00' THEN amount ELSE NULL END) As June,
SUM(CASE WHEN report_date BETWEEN '2007-07-01 00:00:00' AND '2007-07-31 00:00:00' THEN amount ELSE NULL END) As July,
SUM(CASE WHEN report_date BETWEEN '2007-08-01 00:00:00' AND '2007-08-31 00:00:00' THEN amount ELSE NULL END) As August,
SUM(CASE WHEN report_date BETWEEN '2007-09-01 00:00:00' AND '2007-09-30 00:00:00' THEN amount ELSE NULL END) As September,
SUM(CASE WHEN report_date BETWEEN '2007-10-01 00:00:00' AND '2007-10-31 00:00:00' THEN amount ELSE NULL END) As October,
SUM(CASE WHEN report_date BETWEEN '2007-11-01 00:00:00' AND '2007-11-30 00:00:00' THEN amount ELSE NULL END) As November,
SUM(CASE WHEN report_date BETWEEN '2007-12-01 00:00:00' AND '2007-12-31 00:00:00' THEN amount ELSE NULL END) As December,
SUM(amount) As total
FROM sometable WHERE report_date between '2007-01-01' AND '2007-12-31'
|
Planet PostGISThursday, May 16. 2013 Sunday, May 12. 2013 OSGEOCalendar
Categories |
||||||||||||||||||||||||||||||||||||||||||||||||||||||