Issue with QGIS DB Manager SQL Window

QGIS DB Manager has a SQL Window which is useful for extracting, joining, or transforming data before loading it in the map canvas. Unfortunately I found that the syntax highlighting causes keypresses to buffer while the highlighting module checks the syntax, and slows it down to the point of making it completely unusable. I reported this to the QGIS Developer mailing list, but the developer was not able to reproduce the problem and only a handful of the other listers were able to confirm it. I’m not sure why, as I have now experienced it on Linux and Windows (including an entire GIS lab of Windows machines), and a large group of my Geovisualization students who use Macs are also affected by it.

Until it can be fixed, I’ve found it necessary to disable syntax highlighting in the plugin. The affected file is dlg_sql_window.py. It is found in the following locations:

  • Linux: /usr/share/qgis/python/plugins/db_manager/dlg_sql_window.py
  • Mac: /Applications/QGIS.app/Contents/Resources/python/plugins/db_manager/dlg_sql_window.py
  • Windows OSGeo4W: C:\OSGeo4W\apps\qgis\python\plugins\db_manager\dlg_sql_window.py

The solution is to comment out lines 34 and 56. On Linux and Mac, editing this file requires administrative privileges, but this is not required in OSGeo4W. The beginning of the file is reproduced here so that you can see which lines need to be commented out.

# -*- coding: utf-8 -*-

"""
/***************************************************************************
Name                 : DB Manager
Description          : Database manager plugin for QGIS
Date                 : May 23, 2011
copyright            : (C) 2011 by Giuseppe Sucameli
email                : brush.tyler@gmail.com

The content of this file is based on
- PG_Manager by Martin Dobias (GPLv2 license)
 ***************************************************************************/

/***************************************************************************
 *                                                                         *
 *   This program is free software; you can redistribute it and/or modify  *
 *   it under the terms of the GNU General Public License as published by  *
 *   the Free Software Foundation; either version 2 of the License, or     *
 *   (at your option) any later version.                                   *
 *                                                                         *
 ***************************************************************************/
"""

from PyQt4.QtCore import *
from PyQt4.QtGui import *
from qgis.core import *

from .db_plugins.plugin import BaseError
from .dlg_db_error import DlgDbError

from .ui.ui_DlgSqlWindow import Ui_DbManagerDlgSqlWindow as Ui_Dialog

# COMMENT OUT THIS LINE -> from .highlighter import SqlHighlighter
from .completer import SqlCompleter

import re

class DlgSqlWindow(QDialog, Ui_Dialog):

  def __init__(self, iface, db, parent=None):
    QDialog.__init__(self, parent)
    self.iface = iface
    self.db = db
    self.setupUi(self)
    self.setWindowTitle( u"%s - %s [%s]" % (self.windowTitle(), db.connection().connectionName(), db.connection().typeNameString()) )

    self.defaultLayerName = 'QueryLayer'

    settings = QSettings()
    self.restoreGeometry(settings.value("/DB_Manager/sqlWindow/geometry", QByteArray(), type=QByteArray))

    self.editSql.setAcceptRichText(False)
    self.editSql.setFocus()
    SqlCompleter(self.editSql, self.db)
    # COMMENT OUT THIS LINE -> SqlHighlighter(self.editSql, self.db)

    # allow to copy results
    copyAction = QAction("copy", self)
    self.viewResult.addAction( copyAction )
    copyAction.setShortcuts(QKeySequence.Copy)
    QObject.connect(copyAction, SIGNAL("triggered()"), self.copySelectedResults)

# [subsequent code omitted]

 

Load PostGIS geometries in R without rgdal

As I said in my last post, rgdal lacks some of the features of GDAL, including the ability to subset columns and rows the source layer, and I demonstrated a workaround. The workaround relied upon the RPostgreSQL package, and this raises a question: Is it possible to transfer geographic data from PostGIS to R just using RPostgreSQL?

It is, and in fact, this may be necessary if you are working on Windows. The rgdal package comes with its own statically linked GDAL, which unfortunately has a very limited set of format drivers—the ubiquitous ESRI Shapefile, but not the PostgreSQL/PostGIS driver. If you try to use dbGetQuery to return a recordset with a geometry column, R will choke. But you can use the ST_AsText function in SQL to convert the PostGIS geometry to Well-Known Text, read it into R, and then use the readWKT function in the rgeos package to convert it into a SpatialPolygons, SpatialLines, or SpatialPoints object.

The readWKT function is not vectorized, so I build the set of SpatialPolygons and rbind them inside a for loop. I’m sure there’s a better way to do this using the *apply family of functions, but I haven’t been able to figure that one out.

library(RPostgreSQL)
library(rgeos)
library(sp)
 
# Load data from the PostGIS server
conn = dbConnect(
  dbDriver("PostgreSQL"), dbname=dbname, host=host, port=5432, 
  user=user, password=password
  )
 
strSQL = "
  SELECT gid, ST_AsText(geom) AS wkt_geometry, attr1, attr2[, ...]
  FROM geo_layer"
dfTemp = dbGetQuery(conn, strSQL)
row.names(dfTemp) = dfTemp$gid
 
# Create spatial polygons
# To set the PROJ4 string, enter the EPSG SRID and uncomment the 
# following two lines:
# EPSG = make_EPSG()
# p4s = EPSG[which(EPSG$code == SRID), "prj4"]
for (i in seq(nrow(dfTemp))) {
  if (i == 1) {
    spTemp = readWKT(dfTemp$wkt_geometry[i], dfTemp$gid[i])
    # If the PROJ4 string has been set, use the following instead
    # spTemp = readWKT(dfTemp$wkt_geometry[i], dfTemp$gid[i], p4s)
  }
  else {
    spTemp = rbind(
      spTemp, readWKT(dfTemp$wkt_geometry[i], dfTemp$gid[i])
      # If the PROJ4 string has been set, use the following instead
      # spTemp, readWKT(dfTemp$wkt_geometry[i], dfTemp$gid[i], p4s)
    )
  }
}
 
# Create SpatialPolygonsDataFrame, drop WKT field from attributes
spdfFinal = SpatialPolygonsDataFrame(spTemp, dfTemp[-2])

Created by Pretty R at inside-R.org

Subsetting in readOGR

The function readOGR in the rgdal package is used to bring vector spatial data sources into R. readOGR() relies upon OGR (part of the GDAL/OGR library) for format conversion. Unfortunately, while OGR supports the ability to subset columns (with the -select switch) or rows (with the -where switch), or even to request a layer using a full SQL statement, none of that functionality is available in readOGR. Every so often this gets discussed in the mailing lists, but the functionality has not yet been implemented.

If your data is already in a SQL database though, you’re in luck. You can accomplish the same thing within R by creating a spatial view in-database, loading the spatial view with readOGR, then dropping the view. I’ve created a function that does just that for PostGIS data sources.

readOgrSql = function (dsn, sql, ...) {

  require(rgdal)
  require(RPostgreSQL)
  require(stringr)

  # check dsn starts "PG:" and strip
  if (str_sub(dsn, 1, 3) != "PG:") {
    stop("readOgrSql only works with PostgreSQL DSNs")
  }
  dsnParamList = str_trim(str_split(dsn, ":")[[1]][2])

  # Build dbConnect expression, quote DSN parameter values 
  # if not already quoted
  if (str_count(dsnParamList, "=") 
      == str_count(dsnParamList, "='[[:alnum:]]+'")) {
    strExpression = str_c(
      "dbConnect(dbDriver('PostgreSQL'), ", 
      str_replace_all(dsnParamList, " ", ", "), 
      ")"
      )
  }
  else {
    dsnArgs = word(str_split(dsnParamList, " ")[[1]], 1, sep="=")
    dsnVals = sapply(
      word(str_split(dsnParamList, " ")[[1]], 2, sep="="), 
      function(x) str_c("'", str_replace_all(x, "'", ""), "'")
      )
    strExpression = str_c(
      "dbConnect(dbDriver('PostgreSQL'), ", 
      str_c(dsnArgs, "=", dsnVals, collapse=", "), 
      ")"
      )
  }

  # Connect, create spatial view, read spatial view, drop spatial view
  conn = eval(parse(text=strExpression))
  strCreateView = paste("CREATE VIEW vw_tmp_read_ogr AS", sql)
  dbSendQuery(conn, strCreateView)
  temp = readOGR(dsn = dsn, layer = "vw_tmp_read_ogr", ...)
  dbSendQuery(conn, "DROP VIEW vw_tmp_read_ogr;")
  dbDisconnect(conn)
  return(temp)
}

Since readOGR can’t send arbitrary SQL to the data source, the function also requires the RPostgreSQL package. Because I wanted the readOgrSql function to conform to readOGR as closely as possible, the function takes the DSN as a character string, then chops it up to pass the DSN parameters to dbConnect. readOGR expects the DSN in this format:

"PG:dbname='db' host='host' port='5432' user='user' password='***'"

The user requires privileges to create objects in the public schema.

Then, readOgrSql can be called as follows:

strSQL = "
  SELECT gid, name,
    ST_Transform(geom, 4326)::geometry(MultiPolygon, 4326) AS geom
  FROM geo_states"
spdfStates = readOgrSql(dsn, strSQL, stringsAsFactors=FALSE)
row.names(spdfStates) = spdfStates$name

(The last line is not strictly necessary, but is a little trick I use to make it easier to refer to specific features by name.)

At the moment this only works for PostGIS 2.0+. It relies on the fact that in recent versions of PostGIS, spatial views are automatically registered because the geometry_columns “table” is actually a view which finds all geometry columns in all tables. But for the view to work as a full spatial layer, the geometry column in the SELECT statement must be explicitly cast using a typmodded geometry. In the example, this is accomplished with ::geometry(MultiPolygon, 2263), but in general ::geometry(<geometry type>, <srid>). If there are no transformations on the geometry, the explicit cast can be avoided as long as the underlying table is defined using the PostGIS 2.0+ typmodded geometry columns:

SELECT gid, geom FROM my_spatial_table;

will work, but

SELECT gid, ST_Transform(geom, 4326) AS geom
FROM my_spatial_table;

will only work partially, because the transformed geometry is generic. See Manually Registering Geometry Columns in geometry_columns for details. Note that rgdal is smart enough to load the layer into an appropriate Spatial*DataFrame (e.g., SpatialPolygonsDataFrame for an OGC Polygon or MultiPolygon), but will lose the spatial reference information (i.e., the projection or coordinate system). It will plot OK, but unless you fix it in R, you won’t be able to combine it with layers in different coordinate reference systems.

Extending this to work for PostGIS pre-2.0 or for SpatiaLite shouldn’t be too difficult, but will require manually registering the spatial view in geometry_columns.

Hexbinning

Hexbinning is a method for visualizing point data when many similar values mean there is a lot of overplotting. Although it originated in the data visualization field as a an enhancement to the traditional XY scatterplot, within the last few years hexbinning has been used more and more in cartography. (See this great blog post by Zachary Forest Johnson which traces this history and explains how to create hexbin maps using D3.js.)

I wanted to map police stops under the NYPD Stop and Frisk program (official source, easier to use version from NYCLU), but at city or borough scale, this just looks like a continuous carpet of dots. Visualizing that kind of dense point data is exactly what hexbinning is for.
Continue reading

Welcome to the Open Geospatial Technologies Blog

Welcome to my new blog. I have previously blogged (somewhat sporadically) at Free City, where I have occasionally posted some technical stuff about installing and using open source GIS software. I have decided to break out the GIS and other geospatial technology material into a separate blog. My goal with this new site is to collect information about using various geospatial tools effectively. Often, that means I will be documenting methods for things that I am currently working on. It will probably grow a little haphazardly, but hopefully will be of interest to other geonerds.

Feel free to comment or contact me. Enjoy!