Installing PostgreSQL/PostGIS on Linux Mint 17

Linux Mint 17 (based on Ubuntu 14.04) was released before the Summer, but I finally got around to upgrading from Linux Mint 13. I was previously running PostgreSQL 9.0 / PostGIS 2.0, and it was time to upgrade those as well. If you are upgrading from previous version of PostgreSQL/PostGIS and have data you want to bring over to your new cluster, you have to begin with a dump of your old data:

pg_dump -Fc my_db > my_db.dump

The extension *.dump is arbitrary. Some people prefer *.backup. I should have timed the process for this post, but expect something in the order of hours. For smallish DBs (tens to hundreds of gigabytes), I have seen compression ratios of about 10-20%. Not sure what to expect for much larger DBs.

After that the next step for me was installing Linux Mint 17. If you’re not installing/upgrading an OS, your next step might involve going straight to installing/upgrading PostgreSQL 9.3/PostGIS 2.1. If you are not yet using the ubuntugis-unstable PPA, go ahead and set that up. This will give you access to the latest PostGIS, GDAL/OGR drivers, SpatiaLite, QGIS, etc. In spite of the name, the repository is very stable and AFAICT is what all geolinux geeks are using.

sudo add-apt-repository ppa:ubuntugis/ubuntugis-unstable
sudo add-apt update

Then install PostgreSQL and PostGIS:

sudo apt-get install postgresql-9.3-postgis-2.1 pgadmin3

Since the postgresql-9.3-postgis-2.1 package depends on postgresql-9.3 and recommends postgis and postgresql-9.3-contrib, you should get everything you need. If you use pgAdmin III, that has to be added separately, as I have done here.

Before you can restore your database you have to initialize your cluster. Apt will automatically create a postgres user and initialize the cluster in /var/lib/postgresql/9.3/main.  Perhaps, like me, you keep a separate drive partition for your OS, home, and data. In that case, it won’t do you much good to have your database cluster in your OS partition. Create the directory that will hold the cluster (in the example below, I use /data/pg_data) and set the owner to postgres.

You need to initialize the cluster as the postgres user. It is my longstanding habit, learned from the developer who first introduced me to Postgres, to never give a password to the postgres user. Instead, I create a superuser account who does all the potentially destructive in-database stuff, and when I have to do something as postgres, like initialize a cluster, I get to postgres through root. So instead of:

su - postgres

…become root first, then become postgres. Generally I’m not showing the command prompt, but as we step through various accounts the command prompt will change, so I will show the prompts in this listing:

user@host ~ $ su -
host ~ # su - postgres
postgres@host ~ $ /usr/lib/postgresql/9.3/bin/initdb -A trust -D /data/pg_data
postgres@host ~ $ psql
postgres=# CREATE USER superusername WITH SUPERUSER CREATEDB CREATEROLE PASSWORD 'password';
postgres=# \q
postgres@host ~ $ exit
host ~ # exit

Since I am installing on my laptop and no one else is really using this laptop, I have used -A trust to indicate that all local users are trusted. This may not apply to your situation, so use this option judiciously. The final parameter indicates where the cluster will be initialized. I have used /data/pg_data.

Now you need to edit configuration files. In Ubuntu/LM, these files will be located in /etc/postgresql/9.3/main, and will need to be edited as root. The files you need to edit are environment (no extension, the file is just named “environment”) and postgresql.conf. In environment, add the following two lines:

POSTGIS_ENABLE_OUTDB_RASTERS=1
POSTGIS_GDAL_ENABLED_DRIVERS=ENABLE_ALL

Now edit postgresql.conf to point to the new data cluster. The file will have the line:

data_directory = '/var/lib/postgresql/9.3/main'        # use data in another directory

…which you will need to change to:

data_directory = '/data/pg_data'        # use data in another directory

…or whatever path is appropriate on your system.

Now restart the postgres service, or just reboot, so that postgres restarts using the new cluster.

With a typical PostgreSQL dump/restore, the restore will create the database and load any functions, as well as schemas and data. With PostGIS, the new version will have different spatial functions, so we want to create the database first, create the new spatial functions and necessary structures (such as the spatial_ref_sys table) by loading the PostGIS extension, and only then do we load the data, which we do using a special script rather than the standard pg_restore command.

I do the following steps in pgAdmin. First, create a connection to the server:

pgAdminNewServerRegistration

In pgAdmin it is easy to create a database graphically (which I will not show) by right-clicking on the Databases “can” and choosing New Database, or just open the SQL Editor window to follow the commands below. If you prefer to use psql, now that you have added your system user as a database superuser, you can

psql postgres

Either way, you end up at a SQL editor. You are in the postgres database, which should be used only for system data, so create a new database to store your “real” data.

CREATE DATABASE my_db WITH OWNER superusername;

Now create the necessary PostGIS extensions. The only one that is really necessary is PostGIS, but you might want to install topology support and the TIGER geocoder (which is made more useful by the fuzzystrmatch extension) as well. You need to issue the SQL commands in the correct DB, so in pgAdmin, close the SQL Editor, connect to the new database, and reopen the SQL Editor. In psql just do:

\connect my_db

Now create the extensions:

CREATE EXTENSION postgis;
CREATE EXTENSION postgis_topology;
CREATE EXTENSION fuzzystrmatch;
CREATE EXTENSION postgis_tiger_geocoder;

If you are restoring a previous database, now you are ready to reload your data. PostGIS provides a Perl data loader, postgis_restore.pl, that  will not attempt to recreate the spatial functions, since they are already there (although deprecated functions will be missing and will require special handling if you have database applications which make use of them). Any aspatial data will be loaded in the usual manner, so it is harmless to use this instead of pg_restore. The data loader will not, by default, be in your PATH, so either navigate to the PostGIS contrib directory, as I show below, or be prepared to type in the entire path when you call the command. Then pass in the name of the dump file and pipe the result to psql:

cd /usr/share/postgresql/9.3/contrib/postgis-2.1/
perl postgis_restore.pl ~/my_db.dump | psql -U superusername my_db 2> ~/postgis_restore_errors.txt

Go get a cup of coffee. In fact, go out to dinner. Maybe see a movie.

The PostGIS installer also installs the graphical shapeloader (front end to shp2pgsql-gui). As a final tweak, you might want to configure this as a pgAdmin plugin, so that you have easy access to it when working with pgAdmin (and it will launch with the database connection details taken from the pgAdmin environment). To do so, open (as root) the file /usr/share/pgadmin3/plugins.d/plugins.ini. Then add the following section:

;
; pgShapeLoader (Linux):
; Added manually by lee on 10/21/12
Title=PostGIS Shapefile and DBF loader
Command=$$PGBINDIR/shp2pgsql-gui -U $$USERNAME -d $$DATABASE -p $$PORT -h $$HOSTNAME
Description=Open a PostGIS ESRI Shapefile or Plain dbf loader console to the current database.
KeyFile=$$PGBINDIR/shp2pgsql-gui
Platform=unix
ServerType=postgresql
Database=Yes
SetPassword=No

This will create a new entry in the pgAdmin Plugins menu.

That’s it! You should now have a working PostGIS installation. The first thing I did was fire up QGIS, connect to the database, and load some of my old projects to make sure everything was working. Happy geoprocessing!

References: