Remote Sensing tips and tricks


It’s now possible to get Sentinel-2A orthocorrected imagery (Level 1C) from the Sentinel Data Hub.

Sentinel2a Antarctic Peninsula

Antarctic Peninsula

Sentinel2a Cambridge

Cambridge, UK

Sentinel2a Himalayas


Sentinel2a Himalayas


Sentinel2a Brazil


Sentinel2a Elephant Island

Elephant Island, Antarctic Peninsula

The Sentinel Data Hub is pretty easy to use and is much better than previous ESA data discovery systems! Sign up, and you have access to a web map based search.

To quickly view the products, use the SNAP Sentinel-2 Toolbox.

All these images display data at 10 metre resolution.

Spot an error? Raise a ticket.

Antarctic adventures part 2: geo-blogging from Antarctica

So I have a great blogging platform, now for the fun stuff. Geo-blogging from Antarctica and how I did it.

I will divide the types of geospatial data loaded into Geoarctica maps into three main categories: flight paths, activity tracks, and static point and polygon features.

Flight paths

My job when I was in Antarctica was to conduct an aerial photography survey campaign from a Twin Otter aircraft.

The camera system consists of not only the actual sensor and lens, but also a GPS antenna and receiver, and an IMU (inertial measurement unit). By processing the outputs from the GPS and IMU, the position (or trajectory) of the aircraft can be determined to within centimetre level accuracy.

After each flight it was necessary to quality check the data from the GPS and IMU, and process the trajectory. This gave me the data for the flight paths.

Flight paths

Activity tracks

Whilst I was in Antarctica, I made sure to take advantage of the outdoor activity opportunities available at Rothera Research Station. To provide data for my geo-blog, I tracked many of these using my Garmin Vivoactive.

Garmin Vivoactive

I bought my Vivoactive last summer and I love it. Apart from logging my daily steps and sleep patterns, it also has GPS functionality for tracking activities. Perfect for quickly switching on for spontaneous skiing, boat trips, climbing and walks.


There were a few steps I had to go through to create geojson to upload to Geoarctica.

  1. Garmin FIT files were transferred from the Vivoactive onto my laptop, using the freely available Garmin Basecamp.

  2. These were exported into GPX format using Garmin Basecamp.
  3. I then loaded the GPX tracks layer into QGIS, simplified the geometry if required and exported to a geojson format layer.

Static points and polygon features

These were static points with pop ups enabled, helping to illustrate posts such as this

Boat trip

… and a set of polygon features which I used in a few posts to show the building layout on station.

Building polygons

These features were all digitised using trusty QGIS. The buildings were traced from a high resolution aerial photograph, and I created point features with a ‘popup’ attribute, as required.


Antarctic adventures part 1: creating a geo-blog

I have recently returned from a month long trip to Antarctica and had a lot of fun blogging about my experiences over at Geoarctica.

In the run up to my trip, I was often faced with blank expressions when I tried to explain to people where I was going, and more specifically what I would be doing over there. So I was keen to be able to communicate my journeys by harnessing the power of mapping, loosely inspired by Strava.

This is part 1 of two posts about Geoarctica and blogging from the far South.


A nice bespoke blog

Confession… I didn’t create the blog itself. I have @orangemugdev to thank for that. It would have been nice to learn some new skills and put something together myself, but I ran out of time. After explaining my vision, I ended up with something really excellent that I was very excited to start using.

The maps

The original plan was to use a polar projected map with MODIS imagery acquired on the day of the post as the basemap, similar to another recent project of ours, Leafarctica. But I only gave @orangemugdev one day to get this up and running (terrible client…) and we came across some problems.

  1. The MODIS imagery wouldn’t have worked without a coastline to give it some context, because of the large amount of cloud in each image. We attempted to add a coastline to Leafarctica ourselves without achieving anything satisfactory, but since then someone else has done an excellent job of this! Check out Leafarctica2 by @zeigert.

  2. The available zoom levels of the MODIS dataset did not allow for my smaller scale adventures around Rothera Point to be displayed. The larger scale flying posts would look great, but the detail of skiing and climbing for example, would be lost.

  3. I really like the map tiles from Stamen Design that we settled on. However, they lack any kind of indication of what the ground looks like, which is where a satellite image basemap would have been nice.

Overall though, super chuffed and ready to go!


Great non-code talks from PyCon 2015

PyCon 2015 was a great experience. I went to a wide variety of talks, from those that were super technical to others that were widely applicable to work and life in general.

Here are two of the latter, which I particularly enjoyed.

Workplace cultures and diversity

My highlight talk of the conference was by Kate Heddleston presenting insights into workplace cultures and diversity. Everyone should watch this, or just read her blog posts on Criticism and Feedback, Onboarding, Argument Cultures and The Null Process.

A normal distribution of talent and ability

A great talk for anyone who has felt Imposter Syndrome was the keynote by Jacob Kaplan-Moss. Well worth a watch.


Leafarctica: polar projections and Leaflet

I decided it was about time I made a web map using Leaflet.

To help with this, I teamed up with Orange Mug, and together we created Leafarctica.


We used a nice dataset of daily Antarctic MODIS imagery from GIBS and implemented an idea for a really simple viewer with date selector.

Why polar focussed?

Because when it comes to web mapping, everything seems very simple until you want to use a projection that is not web mercator. So this added a nice, challenging component to the plan.

Here’s a walk through of the features we explored and used.

  1. We used Proj4Leaflet to add support for Antarctic Polar Stereographic (EPSG:3031). We found this to be well documented and pretty straightforward to use.

  2. We were really impressed that Leaflet.Graticule plugin worked beautifully on this projection.

  3. Antarctica is big. So when zooming around the imagery, it’s easy to lose track of where you are. We tried to implement the Leaflet-MiniMap plugin. However, we didn’t have much success in getting this to work. We think it’s an issue with not using straightforward web mercator, but more investigation is required.

  4. Leaflet-hash is used to retain the current map view when switching between imagery dates.

  5. map.fitBounds() wasn’t working with our polar projection, so Orange Mug had to hack a solution. You can find it here. This isn’t perfect however, because it stops any pan animation.
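For a feel of what a projection like EPSG:3031 actually does, here is a toy sketch of the forward south polar stereographic projection on a sphere, with true scale at the pole. The real EPSG:3031 uses the WGS84 ellipsoid and a standard parallel of 71°S, so treat this purely as intuition, not as a substitute for Proj4Leaflet.

```python
# Toy spherical south polar stereographic projection (NOT the exact
# EPSG:3031 maths - no ellipsoid, no 71S standard parallel).
import math

R = 6371000.0  # mean Earth radius in metres (spherical approximation)

def south_polar_stereographic(lat_deg, lon_deg, lon0_deg=0.0):
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg - lon0_deg)
    # distance from the pole grows with tan(pi/4 + lat/2); zero at 90S
    rho = 2 * R * math.tan(math.pi / 4 + lat / 2)
    return rho * math.sin(lon), rho * math.cos(lon)

# The South Pole lands at (or vanishingly close to) the map origin
print(south_polar_stereographic(-90, 0))
```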

If you want to look more closely at this pretty simple implementation, head to the repo on GitHub.

In the meantime, I’m continuing to use this project as a way to learn and experiment with Leaflet.

Orange Mug

Thanks Orange Mug!


My first QGIS plugin

I have created my first complete QGIS plugin. Introducing the Sea Ice Concentration Downloader.

The plugin downloads sea ice concentration data of Antarctica within a time range, does a few data conversions, then displays the results in QGIS.

Downloader interface

The dataset is derived from spaceborne microwave radiometer measurements of brightness temperature from a series of satellites, and is designed to be a consistent time series going back to 1978. A complete picture of the sea ice conditions is produced every day, gridded at 25 km resolution. Refer to the official docs for more info.

Sea ice concentration January 2013

As with all QGIS plugins, installation is easy and is explained here. I’m hoping to enhance it with a few more features in the future and potentially add it to the official QGIS plugin repo.


3D DEMs on the web

I’ve discovered an easy way to render DEMs in 3D online: GitHub can host and render 3D models. You can upload a model and view it via your GitHub account, and you can also embed the viewer into a webpage, like this:

You will need to get your DEMs into STL format. To do this, I exported my DEM into a point cloud x,y,z text file. I then loaded this into the freely available MeshLab, created a surface from the points and exported it to STL format. Detailed instructions on how to do this can be found here.

Here’s another one:

DISCLAIMER: I’ve found sometimes the models don’t load with slow internet connections. If it doesn’t work first time, try reloading.
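As an aside, if your DEM is a simple regular grid, the MeshLab step can in principle be scripted. Here's a rough sketch that triangulates a tiny height grid (two triangles per cell) straight into an ASCII STL file; the grid values and file name are made up for illustration, and normals are left as zero vectors, which most viewers will recompute.

```python
# Rough sketch: regular height grid -> ASCII STL, two triangles per cell.
def grid_to_stl(heights, cellsize, path):
    rows, cols = len(heights), len(heights[0])
    with open(path, "w") as f:
        f.write("solid dem\n")
        for r in range(rows - 1):
            for c in range(cols - 1):
                # corner vertices of this grid cell as (x, y, z)
                v = [(c * cellsize, r * cellsize, heights[r][c]),
                     ((c + 1) * cellsize, r * cellsize, heights[r][c + 1]),
                     (c * cellsize, (r + 1) * cellsize, heights[r + 1][c]),
                     ((c + 1) * cellsize, (r + 1) * cellsize, heights[r + 1][c + 1])]
                for tri in ((v[0], v[1], v[2]), (v[1], v[3], v[2])):
                    f.write("  facet normal 0 0 0\n    outer loop\n")
                    for x, y, z in tri:
                        f.write("      vertex %f %f %f\n" % (x, y, z))
                    f.write("    endloop\n  endfacet\n")
        f.write("endsolid dem\n")

# a made-up 2x2 grid with 10 m cells
grid_to_stl([[0.0, 1.0], [2.0, 3.0]], 10.0, "dem.stl")
```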


Using PyQGIS in Mac OS X

The QGIS built in Python console is great. But the real power comes in being able to use PyQGIS in external scripts.

Without completing the following steps, you will encounter this situation when trying to import the modules:

    >>> import qgis.core
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ImportError: No module named qgis.core

The PyQGIS developer cookbook explains how to fix this problem for Windows and Linux, but not Mac. So Mac users, here’s what you do.

Open your ~/.profile for editing. Add the following lines:

    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/Applications/
    export PYTHONPATH=$PYTHONPATH:/Applications/
    export PATH=$PATH:/Library/Frameworks/GDAL.framework/Programs/

Add this line to load external opengeo libraries, so you can save out your layers in various formats.

    export PYTHONPATH=$PYTHONPATH:~/.qgis2/python/plugins/opengeo/ext-libs/

Save this and type the following into the terminal.

source ~/.profile

Now you will be able to import PyQGIS modules into any of your external scripts.


Tracing raster outlines with GDAL

Back to this set of GDAL tools, that I’ve talked about previously.

gdal_trace_outline is an easy way of producing image footprint polygons, and it works with irregularly shaped images and images which are projected to a non-regular shape. So, to trace an image and produce a shapefile outline polygon (in this case using a Landsat scene):

gdal_trace_outline -ndv 0 -out-cs en -dp-toler 10 -ogr-out outline.shp [in image]

Trace outline example

The green boundary is the output, outline.shp.

With an image such as this Landsat scene, you will probably end up with a polygon with a ridiculous number of vertices, as the image will often have uneven edges. So experiment with the -dp-toler option in order to generalise the outline. For example, the red outline below was created with default settings, and the green with -dp-toler 10:

Trace outline example
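The -dp-toler option sets a Douglas-Peucker style simplification tolerance. To get a feel for what that generalisation does, here's a minimal pure-Python Douglas-Peucker sketch, run on a made-up wiggly line:

```python
# Minimal Douglas-Peucker line simplification, for intuition about -dp-toler.
import math

def point_line_dist(p, a, b):
    # perpendicular distance from point p to the line through a and b
    if a == b:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    (ax, ay), (bx, by), (px, py) = a, b, p
    num = abs((bx - ax) * (ay - py) - (ax - px) * (by - ay))
    return num / math.hypot(bx - ax, by - ay)

def simplify(points, tol):
    if len(points) < 3:
        return points
    # find the point furthest from the chord between the endpoints
    dmax, idx = 0.0, 0
    for i in range(1, len(points) - 1):
        d = point_line_dist(points[i], points[0], points[-1])
        if d > dmax:
            dmax, idx = d, i
    if dmax <= tol:
        # everything within tolerance: keep just the endpoints
        return [points[0], points[-1]]
    # otherwise keep the furthest point and recurse on both halves
    left = simplify(points[:idx + 1], tol)
    right = simplify(points[idx:], tol)
    return left[:-1] + right

wiggly = [(0, 0), (1, 0.1), (2, -0.1), (3, 5), (4, 6), (5, 7), (6, 8.1), (7, 9)]
print(simplify(wiggly, 1.0))
```

A bigger tolerance throws away more vertices, which is exactly the red-versus-green difference shown above.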


Counting atolls of the Maldives


How many atolls are in this Landsat image of the Northern Maldives?

This isn’t really a question that needs answering, but it is quite fun to come up with a fairly accurate count based on simple morphology and thresholding operations.

By eye, I’ve counted a total of 34 atolls, all differing in size and characteristics with varying texture across and between each atoll. Helpfully, they are all set against a relatively homogeneous backdrop (the darker sea).

Python’s scikit-image is a collection of algorithms for image processing and computer vision. It contains a number of packages for segmentation, filtering, morphology and colour manipulation, amongst others. It also has nice documentation.

This post demonstrates using morphological opening and closing and thresholding with the Otsu method to count the atolls.

Firstly, the image needs to be read in as a NumPy array and its projection and geotransform information recorded. Then later on, we can save out any results as image files.

    from osgeo import gdal
    import numpy as np
    from skimage.filter import thresholding
    from skimage import morphology
    from scipy.ndimage import measurements

    data  = gdal.Open('maldives.tif')
    im    = data.ReadAsArray()
    trans = data.GetGeoTransform()
    proj  = data.GetProjection()
    [dims, cols, rows] = im.shape

Morphological opening

This is defined as an erosion followed by a dilation. By passing a window of a certain size over an image, it will emphasize the lighter areas that are larger than the window and suppress the light areas that are smaller. The window can be a rectangular array, or a disk (by using the inbuilt disk constructor in scikit-image). The result of this is that the small sun glints on the sea will be suppressed, while the atolls will be enhanced, making an automated thresholding much more effective.
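To see this on something tiny, here's a toy version of opening on a small binary image using pure NumPy (this is just for intuition; the real code below uses scikit-image). Erosion takes the neighbourhood minimum, dilation the neighbourhood maximum, and opening is erosion followed by dilation.

```python
# Toy 3x3-window morphological opening on a binary image with plain NumPy.
import numpy as np

def erode(im):
    p = np.pad(im, 1, mode="edge")
    stack = [p[i:i + im.shape[0], j:j + im.shape[1]]
             for i in range(3) for j in range(3)]
    return np.min(stack, axis=0)

def dilate(im):
    p = np.pad(im, 1, mode="edge")
    stack = [p[i:i + im.shape[0], j:j + im.shape[1]]
             for i in range(3) for j in range(3)]
    return np.max(stack, axis=0)

im = np.zeros((7, 7), dtype=int)
im[1, 1] = 1          # a lone bright pixel ("sun glint")
im[3:6, 3:6] = 1      # a 3x3 bright blob ("atoll")

opened = dilate(erode(im))
print(opened)
```

The lone pixel vanishes, while the blob survives intact.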

By using all bands and summing the result to create a grayscale image, the atolls have a higher pixel value than the surrounding sea. Making use of all bands in this way ensures all available information which helps define the atolls is retained.

    # after some testing, a disk with a radius of 4 pixels seems to work best
    selem = morphology.disk(4)

    red_open = morphology.opening(im[2,...], selem)
    gre_open = morphology.opening(im[1,...], selem)
    blu_open = morphology.opening(im[0,...], selem)
    all_open = red_open + gre_open + blu_open

Saving all_open to an image file gives you this.



Ultimately we need a binary image, so the objects can be counted computationally. Using the Otsu method, we can automatically select a threshold which will bring out the clusters of brighter objects.

Let’s add to the code.

    thresh = thresholding.threshold_otsu(all_open)
    bin_im = all_open > thresh

Using the selected threshold value generated from the Otsu algorithm, we can create bin_im which gives a value of 0 to all pixels below thresh and 1 to pixels higher.
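If you're curious what threshold_otsu is doing internally, Otsu's method picks the threshold that maximises the between-class variance of the two resulting pixel populations. A minimal brute-force NumPy sketch, on a made-up crudely bimodal "image":

```python
# Brute-force Otsu: maximise between-class variance over candidate splits.
import numpy as np

def otsu_threshold(values):
    best_t, best_sep = None, -1.0
    for t in np.unique(values)[:-1]:
        lower = values[values <= t]
        upper = values[values > t]
        # between-class variance (up to a constant factor): w0 * w1 * (mu0 - mu1)^2
        sep = lower.size * upper.size * (lower.mean() - upper.mean()) ** 2
        if sep > best_sep:
            best_sep, best_t = sep, t
    return best_t

# dark "sea" pixels around 10, bright "atoll" pixels around 200
pixels = np.array([8, 10, 12, 11, 9, 198, 202, 200, 199])
print(otsu_threshold(pixels))  # -> 12, splitting sea from atolls
```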

Saving out bin_im, we get this.

thresholded image

Counting the objects in this image results in a value of 166. Much too high. We need to close the smaller black areas.

Morphological closing

Closing is exactly the opposite to opening, and is defined by a dilation followed by an erosion, effectively closing the darker areas depending on the size of the defined window. Then we can count the white areas as follows:

    selem   = morphology.disk(9)
    bin_im2 = morphology.closing(bin_im, selem)

    labels, nbr_objs = measurements.label(bin_im2)
    print(nbr_objs)

The measurements.label function from SciPy outputs a labels array, a grayscale image showing each object as a different value, along with an integer count of the objects.

In this case nbr_objs has given us a value of 32. Which is pretty close to the true number.
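Under the hood, measurements.label is doing connected-component labelling: every group of touching 1-pixels gets its own integer id. A toy flood-fill version on a small made-up binary grid shows the idea (using 4-connectivity):

```python
# Toy connected-component labelling via iterative flood fill.
def label(grid):
    rows, cols = len(grid), len(grid[0])
    labels = [[0] * cols for _ in range(rows)]
    n = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 1 and labels[r][c] == 0:
                n += 1  # found a new, unlabelled object
                stack = [(r, c)]
                while stack:
                    y, x = stack.pop()
                    if (0 <= y < rows and 0 <= x < cols
                            and grid[y][x] == 1 and labels[y][x] == 0):
                        labels[y][x] = n
                        stack += [(y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)]
    return labels, n

atolls = [[1, 1, 0, 0],
          [1, 0, 0, 1],
          [0, 0, 0, 1],
          [0, 1, 0, 0]]
lab, count = label(atolls)
print(count)  # -> 3 separate objects
```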

We now have the count, so time to save the resultant bin_im2 to a georeferenced image file.

    outdata = gdal.GetDriverByName("GTiff")

    dst_im = outdata.Create('maldives_binary.tif', rows, cols, 1, gdal.GDT_Byte)
    dst_im.SetGeoTransform(trans)
    dst_im.SetProjection(proj)
    dst_im.GetRasterBand(1).WriteArray(bin_im2)
    dst_im = None

    dst_im = outdata.Create('maldives_labels.tif', rows, cols, 1, gdal.GDT_Byte)
    dst_im.SetGeoTransform(trans)
    dst_im.SetProjection(proj)
    dst_im.GetRasterBand(1).WriteArray(labels)
    dst_im = None

morphological closing

There are a few omission and commission errors, but the varying sizes of the atolls make these hard to correct. All the larger atolls have been accounted for, and some of the smaller ones too. This is a simple technique demonstrating the use of scikit-image on satellite imagery.


Scaling Mapped Remote Sensing Datasets with NumPy Masked Arrays

In an earlier post I have talked about the usefulness of NumPy’s masked arrays. Now here’s a simple implementation, which I put together after getting a bit unstuck while using them myself.

Input data

Level 3 Global mapped SMOS Ocean Salinity dataset accessed from the Centre Aval de Traitement des Donnees SMOS (CATDS).

SMOS salinity dataset 10 day composite SMOS ocean surface salinity (11/11/2011 - 20/11/2011)

The task

To convert this data from NetCDF with integer values into a scaled GeoTIFF with real salinity values, for easy use and visualisation in a desktop GIS. Data centres often store grids of integer values to cut down on data storage volume, so a simple linear scaling can be applied to convert them into their true float values.

However, the areas of land and sea ice are classed as no data with a specific value of -32767 which needs to be ignored in the calculation with the use of a masked array. Sounds simple, but this is where I encountered…

… The problem

The masked values turn into zeros after performing the linear scaling. So my resultant GeoTIFF had values of zero rather than -32767 in areas of land and sea ice.

The solution

Convert the input integer array to a float array before creating the masked array. Here’s the function I wrote. Firstly I converted my data into a NumPy array in my usual way, taking account of projection and transform information. The key is then to change the input array to data type float before creating a masked array.

    import numpy as np
    import osgeo.gdal as gdal
    import osgeo.osr as osr

    def smos_os_convert(infile, outfile):
      """Converts SMOS ocean salinity L3 mapped products to scaled tifs.
      Linear scaling equation: salinity = data * 0.001 + 30
      """
      nodata  = -32767
      slope   = 0.001
      inter   = 30
      proj    = "+proj=cea +lon_0=0 +lat_ts=30 +x_0=0 +y_0=0 \
                 +a=6371228 +b=6371228 +units=m +no_defs"
      outproj = osr.SpatialReference()
      outproj.ImportFromProj4(proj)
      outgeo  = [-17334193.9437, 50171.32835, 0, 7338939.4596, 0.0, -50171.32835]

      os      = gdal.Open(infile)
      osarr   = os.ReadAsArray()
      [cols, rows] = osarr.shape

      # Here is where I define the array's data type as float.
      osarr  = np.array(osarr, dtype=np.float32)
      # Then you can create the masked array...
      m =, nodata)

      # ... and very quickly apply the linear scaling, ignoring the no data pixels.
      scaled = m * slope + inter

      # And finally save out the GeoTIFF
      outdata = gdal.GetDriverByName("GTiff")
      dst_ds  = outdata.Create(outfile, rows, cols, 1, gdal.GDT_Float64)
      dst_ds.SetGeoTransform(outgeo)
      dst_ds.SetProjection(outproj.ExportToWkt())
      band    = dst_ds.GetRasterBand(1)
      band.WriteArray(scaled.filled(nodata))
      band.SetNoDataValue(nodata)

To illustrate this further

First we convert the GDAL dataset to a NumPy array. os is the GDAL dataset.

    >>> osarr = os.ReadAsArray()
    >>> osarr
    array([[-32767, -32767, -32767, ..., -32767, -32767, -32767],
           [-32767, -32767, -32767, ..., -32767, -32767, -32767],
           [-32767, -32767, -32767, ..., -32767, -32767, -32767],
           [-32767, -32767, -32767, ..., -32767, -32767, -32767],
           [-32767, -32767, -32767, ..., -32767, -32767, -32767],
           [-32767, -32767, -32767, ..., -32767, -32767, -32767]], dtype=int16)

Then we create the masked array.

    >>> m =, nodata)
    >>> m
    masked_array(data =
     [[-- -- -- ..., -- -- --]
     [-- -- -- ..., -- -- --]
     [-- -- -- ..., -- -- --]
     [-- -- -- ..., -- -- --]
     [-- -- -- ..., -- -- --]
     [-- -- -- ..., -- -- --]],
                 mask =
     [[ True  True  True ...,  True  True  True]
     [ True  True  True ...,  True  True  True]
     [ True  True  True ...,  True  True  True]
     [ True  True  True ...,  True  True  True]
     [ True  True  True ...,  True  True  True]
     [ True  True  True ...,  True  True  True]],
           fill_value = -32767)
    >>>
    array([[-32767, -32767, -32767, ..., -32767, -32767, -32767],
           [-32767, -32767, -32767, ..., -32767, -32767, -32767],
           [-32767, -32767, -32767, ..., -32767, -32767, -32767],
           [-32767, -32767, -32767, ..., -32767, -32767, -32767],
           [-32767, -32767, -32767, ..., -32767, -32767, -32767],
           [-32767, -32767, -32767, ..., -32767, -32767, -32767]], dtype=int16)

And perform the linear scaling.

>>> scaled = m * slope + inter

Now we look at the output data, and see that all the no data values have turned into zeros, even though they should have been ignored in the calculation…

    array([[ 0.,  0.,  0., ...,  0.,  0.,  0.],
           [ 0.,  0.,  0., ...,  0.,  0.,  0.],
           [ 0.,  0.,  0., ...,  0.,  0.,  0.],
           [ 0.,  0.,  0., ...,  0.,  0.,  0.],
           [ 0.,  0.,  0., ...,  0.,  0.,  0.],
           [ 0.,  0.,  0., ...,  0.,  0.,  0.]])

… even though areas of real data do scale (picking a pixel in the sea).


So instead we add this line to define the array as datatype float.

    >>> osarr = np.array(osarr, dtype=np.float32)
    >>> m =, nodata)
    >>>
    array([[-32767., -32767., -32767., ..., -32767., -32767., -32767.],
           [-32767., -32767., -32767., ..., -32767., -32767., -32767.],
           [-32767., -32767., -32767., ..., -32767., -32767., -32767.],
           [-32767., -32767., -32767., ..., -32767., -32767., -32767.],
           [-32767., -32767., -32767., ..., -32767., -32767., -32767.],
           [-32767., -32767., -32767., ..., -32767., -32767., -32767.]], dtype=float32)
    >>> scaled = m * slope + inter

And now the scaled output no data values remain as they should.

    >>>
    array([[-32767., -32767., -32767., ..., -32767., -32767., -32767.],
           [-32767., -32767., -32767., ..., -32767., -32767., -32767.],
           [-32767., -32767., -32767., ..., -32767., -32767., -32767.],
           [-32767., -32767., -32767., ..., -32767., -32767., -32767.],
           [-32767., -32767., -32767., ..., -32767., -32767., -32767.],
           [-32767., -32767., -32767., ..., -32767., -32767., -32767.]], dtype=float32)
    >>> scaled.mask
    array([[ True,  True,  True, ...,  True,  True,  True],
           [ True,  True,  True, ...,  True,  True,  True],
           [ True,  True,  True, ...,  True,  True,  True],
           [ True,  True,  True, ...,  True,  True,  True],
           [ True,  True,  True, ...,  True,  True,  True],
           [ True,  True,  True, ...,  True,  True,  True]], dtype=bool)

So when we write this to file, we get a dataset with all areas of land retaining the no data value of -32767.


The Basic Knowledge for a Geospatial Newbie

What is the basic skill set required for our profession? Of course, the geospatial community is vast, with many specialisms within it, spanning data collection, data management, data analysis and data display. I try to keep abreast of everything, but of course this is not really possible. The industry moves very quickly and technology leaps ahead all the time. However, when we start out early in our careers, there are probably some things we should know. The rest we can learn as we go along.

Knowing where the data has come from

The ways of gathering spatial data are increasing as technology improves. Even if you are not the one out there capturing the data, knowing the method can be very insightful. Take GNSS as an example. At some point we are all likely to make use of data from positioning systems, so being aware of the difference in accuracy between handheld recreational GNSS receivers and data processed using differential methods is important. And what about alternative data gathering techniques, such as crowdsourcing and space-based remote sensing with nanosatellites? Knowing the inherent advantages and problems of these techniques is vital.

Being a GIS monkey: data analysis

Although I am a huge proponent of open source GIS, I still consider it valuable having some working knowledge of ArcGIS. Any of us who have used it will know its quirks, and where it is likely to crash. These things are learnt through putting in the hard yards with what can sometimes be a frustrating piece of software. At the same time as being familiar with this, it is also beneficial to keep in mind where this huge industry leader is headed in the future. Whether we like it or not, ESRI products are still heavily used, and it pays us to be aware of them.

More and more businesses are going open source when it comes to making decisions about budgets. Uptake of QGIS is increasing for this very reason. As a GIS professional, it should be increasingly easy to put together a business case to ditch ESRI and go open source, because the options just keep getting better. Freely available, with a willing and helpful online support community. What more do you need?

And of course, let’s not completely rely on big software packages. Quite often the job is better done using smaller, cheaper (and sometimes free) software and command line tools. So knowing these can be very advantageous. If you can churn out some custom programs and scripts, even better.

The art of cartography

Cartographic teaching is declining, but in my opinion it is still important. Whilst the days of hand drawn maps are well over, the need to present information in map form to create impact is definitely still at the forefront of the profession. Maybe these days it should be called geovisualisation rather than cartography, but ultimately, the underlying drivers are the same. It is a tall order to ask us to be experts in design and visualisation as well as geographical data wizards, but having a good understanding of how a map should be laid out and how colour is perceived is very useful. This can go as far as knowing web accessibility guidelines, as after all, our mapping products should be creating an impact on everyone.

With advances in web mapping, getting our message across with a beautiful map is becoming easier. Web mapping is a huge area of growth, so knowing your OGC standards and the difference between WMS and WCS is another useful piece of knowledge.

A key fundamental

And because the earth is not flat, the most important fundamental concepts of our profession are datums and map projections. It might be true that for many purposes, the difference between WGS84 and ITRF08 is not important. But as geospatial professionals, this should be something we are all paying critical attention to, and we should know when the difference becomes important. I would argue that we should never assume it does not matter. Datums and map projections are complicated concepts. I have spent many hours scratching my head over datum transformations and which projection I should use for certain calculations. But at every stage of our work, this is absolutely critical: from data collection and data management to data analysis and data display. And this doesn’t stop at horizontal reference frames. Know what is meant by the geoid.

The last point here is probably the most important, if I had to pick something. But all the above points are useful knowledge areas for early career geospatial professionals.


A review of a course by O'Reilly School of Technology, Python 1: Beginning Python

I have just completed the O’Reilly School of Technology Python 1 course, part of the Python Certificate qualification (consisting of four such courses). I thought it might be useful to write a little summary of what I thought of it.

Whilst I have had previous Python experience, much of this was self taught. I have had a small amount of formal programming and scripting teaching, using IDL and csh. I quickly realised that Python would be a good language to learn in my chosen specialist area and given that I was having to use it more and more at work, I felt a formal education was required to fill gaps in my very basic knowledge (rather than trying to work through a book).

Generally, I have found this course extremely beneficial. The guideline time for completion is 90 hours. As Python was not completely new to me, I completed it quicker than this. I imagine for someone completely new to programming, the 90 hours is a good guide. When you consider the price of the course, that is pretty good value.

Now that I’ve completed the course, I feel more confident with Python, I have learned some really valuable concepts and picked up some ‘good practice’. It is a good introduction and I think it would suit someone completely new to programming, who wants to learn from scratch.

Positives:

  • The course is well structured. It starts slowly, and gently introduces new concepts. It is broken down into manageable ‘lessons’ consisting of exercises, quizzes and assignments. These lessons are a good length, meaning it’s easy to fit this course into a busy lifestyle.

  • A real person marks assignments, and very quickly too, generally within 2 working days. Feedback is detailed where it needs to be, and quite often you are shown different ways in which you could have written the same assignment. If you don’t pass first time, you are given guidance on how to correct your work.

  • The web learning environment is simple and uncluttered.

Negatives:

  • Sometimes it is hard to read through the lessons on screen, and there does not seem to be an easy way to download course material.

I have now signed up to the next Python course, which should keep me busy for the next 6 months.


Using ArcPy for Raster Analysis

In order to do bulk raster analysis, or anything slightly bespoke, I will generally opt for using GDAL with Python. However, it’s possible to achieve similar things within an ESRI ArcGIS world if you want to, using ArcPy.

If all you need to do is a simple point operation (in this case, multiplying each pixel by 2), then this can very quickly be achieved using ArcPy’s raster calculator. The Spatial Analyst extension is required for this.

    import arcpy
    import numpy as np
    from import *

    infile = "inputfile.tif"

    # do any calculation on the raster object
    x = Raster(infile) * 2

    # save the new raster"outfile.tif")

If this is implemented using the Python prompt in ArcMap, as soon as x has been created it will appear as a layer in the “Table of Contents”, labelled x. At this point it’s only held in memory; if you quit ArcMap, the raster will be lost.

To do anything more complex with rasters such as filtering and neighbourhood functions, this method won’t cut it. In this case it’s necessary to start using NumPy arrays. Just as I have previously written about using NumPy arrays for raster processing with GDAL, below is a similar example using ArcPy.

Just for the purpose of this post, I’m only going to multiply the raster by 2 again. But obviously, once the data is in an array, anything is possible.

    import arcpy
    import numpy as np

    infile  = "inputfile.tif"
    in_arr  = arcpy.RasterToNumPyArray(infile, nodata_to_value=9999)

    # NB: I find it useful to use the optional argument `nodata_to_value`.
    # That way, ArcPy will read the no data value from the dataset, and
    # set it to this defined value in the NumPy array. You then know exactly
    # what number represents no data in the array.

    # do some processing
    out_arr = in_arr * 2

    # gather some information on the original file
    spatialref = arcpy.Describe(infile).spatialReference
    cellsize1  = arcpy.Describe(infile).meanCellHeight
    cellsize2  = arcpy.Describe(infile).meanCellWidth
    extent     = arcpy.Describe(infile).Extent
    pnt        = arcpy.Point(extent.XMin, extent.YMin)

    # save the raster
    out_ras = arcpy.NumPyArrayToRaster(out_arr, pnt, cellsize1, cellsize2)"outfile.tif")
    arcpy.DefineProjection_management("outfile.tif", spatialref)

Dealing with no data using NumPy masked array operations

Remotely sensed data can be riddled with no data (or null) values. Take this grid of sea surface chlorophyll A (CHLa) concentration for example, which represents 8 days worth of repeat observations and still has lots of big data gaps (shown in black). With products such as these that use optical data as their input, cloud cover is always going to be a problem.

CHLa concentration

Aqua MODIS CHLa concentration, 8 day composite, 01/01/2013 - 08/01/2013

If you want to do any raster processing with this data, it’s essential to identify the no data gaps as invalid. And if you’re creating your own processing scripts, handling them can become problematic.

You could simply build in conditions to ignore this number. However, if you’re using Python, I find a much nicer way to deal with no data is NumPy’s masked array operations. They lend themselves really well to remote sensing analysis. Here’s how this can work, with a basic example.

(Please assume all my one dimensional arrays here are actually two dimensional and represent raster values! Of course you first need to convert your raster into a NumPy array.)

>>> import numpy as np
>>> x = np.array([1, 2, 3, 99, 5, 6, 99])
>>> y = np.ma.masked_values(x, 99)
>>> y.mask
array([False, False, False,  True, False, False,  True], dtype=bool)
>>> y.data
array([ 1,  2,  3, 99,  5,  6, 99])
>>> y
masked_array(data = [1 2 3 -- 5 6 --],
       mask = [False False False  True False False  True],
 fill_value = 99)

If you want to visually display where your no data cells are in a GIS, you can easily change the mask into an integer array and then save it out in a visual format.

>>> intarray = y.mask.astype(int)
>>> intarray
array([0, 0, 0, 1, 0, 0, 1])

Equally powerfully, if you already have a mask raster, you can build the data and mask parts of the masked array yourself.

>>> t = np.ma.array([1, 2, 3, 4, 4], mask=[0, 0, 0, 0, 1])
>>> t
masked_array(data = [1 2 3 4 --],
             mask = [False False False False  True],
       fill_value = 999999)

Once you have assigned your masked array, there are a number of other methods which can be applied, for example:

>>> y.mean()
3.4
>>> y.max()
6
>>> y.sum()
17

These are basic operations only. The documentation for masked array operations shows the range of possibilities. I’ve used masked arrays very frequently when developing bespoke raster processing scripts.
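To see why this matters, here’s a small self-contained comparison (the array values are illustrative only): an ordinary mean is badly skewed by the no data cells, while the masked version computes over valid cells only, and `filled()` substitutes a value back into the gaps when you’re ready to write a raster out.

```python
import numpy as np

x = np.array([1, 2, 3, 99, 5, 6, 99])

# The plain mean is skewed by the 99s (~30.7)
print(x.mean())

# Masking 99 first gives the mean of the valid cells only
y = np.ma.masked_values(x, 99)
print(y.mean())      # 3.4
print(int(y.sum()))  # 17

# filled() replaces the masked cells, e.g. with 0
print(y.filled(0))   # [1 2 3 0 5 6 0]
```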


Pansharpening with a handy GDAL tool

I recently came across this set of GDAL tools, which I’ve found to be really useful. I thought it would be good to show some of them in action.

gdal_landsat_pansharp is one of these tools and allows easy and effective pansharpening of Landsat ETM images (I haven’t tried it on images from other sensors). It’s handy as it allows quick pansharpening even if you haven’t composited bands from a freshly downloaded Landsat image, for example:

gdal_landsat_pansharp -rgb band3.tif -rgb band2.tif -rgb band1.tif \
    -lum band4.tif 0.52 -lum band3.tif 0.23 -lum band2.tif 0.25 \
    -pan band8.tif -ndv 0 -o outimage.tif

The -rgb option is used for each input band; the -lum option is used to simulate a low resolution panchromatic band, with each band requiring a weighting (0-1). -pan selects the panchromatic band (band 8 for Landsat ETM), -ndv specifies a no data value, and finally -o gives an output filename.

Alternatively, if you already have a true colour (bands 3, 2, 1) composite:

gdal_landsat_pansharp -rgb rgbimage.tif \
    -lum band2.tif 0.25 -lum band3.tif 0.23 -lum band4.tif 0.52 \
    -pan band8.tif -ndv 0 -o outimage.tif

It also works nicely to quickly pansharpen false colour composites, using the same -lum settings. For example, for a 432 composite:

gdal_landsat_pansharp -rgb band4.tif -rgb band3.tif -rgb band2.tif \
    -lum band4.tif 0.52 -lum band3.tif 0.23 -lum band2.tif 0.25 \
    -pan band8.tif -ndv 0 -o outimage.tif

A quick read of the tool’s wiki page points us to this paper in order to derive the weightings for the low resolution simulated image. I also refer you to the wiki page for further information on the maths behind the technique.
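The core idea is simple enough to sketch in a few lines of NumPy. This is only an illustration of the weighted-luminance (Brovey-style) approach, not the tool’s actual implementation: it assumes the colour bands have already been resampled to the panchromatic resolution, and the function name and shapes are mine.

```python
import numpy as np

def pansharpen(rgb, pan, weights):
    """Sketch of weighted-luminance (Brovey-style) pansharpening.

    rgb     : float array, shape (3, rows, cols), resampled to pan resolution
    pan     : float array, shape (rows, cols), the panchromatic band
    weights : three per-band weights used to simulate a low resolution pan band
    """
    # Simulate a low resolution panchromatic band as a weighted band sum
    lum = np.tensordot(np.asarray(weights, dtype=float), rgb, axes=1)
    # Scale each colour band by the ratio of the real pan band to the
    # simulated one (guarding against division by zero)
    ratio = np.divide(pan, lum, out=np.zeros_like(pan, dtype=float),
                      where=lum > 0)
    return rgb * ratio
```

Each output band keeps the colour ratios of the original image while taking its spatial detail from the higher resolution panchromatic band.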

Here are some results:

A true colour (bands 3, 2, 1) image of Abisko National Park in Sweden, 30m spatial resolution

Landsat RGB image

The panchromatic band, 15m spatial resolution

Landsat panchromatic image

And the pansharpened image

Landsat pansharpened RGB image

Finally, a false colour (bands 4, 3, 2) pansharpened image

Landsat pansharpened FCC image Landsat scene ID: LE71980121999227SGS00 date: 15/08/1999


Using GDAL with Python: a basic introduction

Using GDAL with Python opens up a lot of flexibility in raster processing. By converting your raster data into a NumPy array, you can make use of all Python’s array operations.

So in this very simple example, I will convert a single band georeferenced raster (GeoTiff), to a 2D NumPy array, and back again. Once the raster has been read in as an array, any raster operations are possible.

from osgeo import gdal
import numpy as np

# Read the input raster into a NumPy array
infile = "inputfile.tif"
data   = gdal.Open(infile)
arr    = data.ReadAsArray()

# Do some processing....

# Save out to a GeoTiff

# First of all, gather some information from the original file
[rows, cols] = arr.shape
trans        = data.GetGeoTransform()
proj         = data.GetProjection()
nodatav      = data.GetRasterBand(1).GetNoDataValue()
outfile      = "outputfile.tif"

# Create the output file, using the information from the original file
outdriver = gdal.GetDriverByName("GTiff")
outdata   = outdriver.Create(str(outfile), cols, rows, 1, gdal.GDT_Float32)

# Write the array to the file, which is the original array in this example
outdata.GetRasterBand(1).WriteArray(arr)

# Set a no data value if required
if nodatav is not None:
    outdata.GetRasterBand(1).SetNoDataValue(nodatav)

# Georeference the image
outdata.SetGeoTransform(trans)

# Write projection information
outdata.SetProjection(proj)

# Flush everything to disk
outdata.FlushCache()

Landsat Imagery: search, download and display

With the launch of the new Landsat satellite last month, it’s a great time to look back over some 40 years of continuous multispectral data acquisition achieved by the Landsat program. And with its free data access policy, the Landsat archive represents a fantastic resource for students and researchers interested in remote sensing analysis. From a general interest point of view too, some of the images are stunning.

Although I completed a Masters degree in Remote Sensing, I was never actually taught how to search for and download Landsat data for myself. It’s not particularly complicated, but I hope this post will save interested people a bit of time in figuring out the best way to do it.

Searching for the images

There are two access points to the archive, both run by the USGS: GloVis and EarthExplorer. I recommend the latter. It’s newer and more intuitive, and runs without having to install a Java plugin! It’s also probably one of the better online image searching sites I’ve come across. In order to download data from EarthExplorer, you must first register.

I think searching for images is pretty self explanatory. Under the data tab on the left, start by just selecting L7 ETM+ SLC-on (1999-2003). Data from Landsat 7 collected after 2003 unfortunately suffers from striping. You can still use the small middle section of these images, but I find it unlikely that the feature you want an image of will be conveniently resting in that middle section. The benefit of Landsat 7 is the panchromatic band, with 15m spatial resolution.

Data download

If images have been processed already, you will be able to download the GeoTiff straight away. If not, they will have to be added to the cart and sent off for processing. You will be notified by email when your images are ready for download, which usually takes a maximum of a couple of days.

Visualising your images

Unfortunately, the data comes with each band as a separate image file. So to view a colour composite, you will need to merge these together. One way to do this is with Quantum GIS, or QGIS for short.

The tool you require within QGIS is called Merge, which is hidden in the Raster menu. If you select the Layer Stack option, you can stack any number of single-band images together.

Command line options for layer stacking include gdal_merge.py (using the -separate option), and gdal_merge_simple.
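For example, with GDAL installed, the command line equivalent of the QGIS Layer Stack might look something like this (the band filenames here are placeholders):

```shell
# Stack three single-band GeoTiffs into one 3-band composite
gdal_merge.py -separate -o composite.tif band3.tif band2.tif band1.tif
```

The order of the input files sets the band order in the output, so listing bands 3, 2, 1 gives a true colour composite.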

And here’s the result of a quick search, download and visualisation of Cornwall, UK as a false colour composite. Red areas show vegetation; Cornwall gets a lot of rain.

Landsat false colour composite Landsat scene ID: LE72040251999205AGS01 date: 24/07/1999


Installing GDAL on Ubuntu

GDAL is amazing. I use it every day for various raster handling tasks, such as converting weird file formats to more usable ones, reprojecting and querying datasets.

Below is a quick guide to getting GDAL command line utilities up and running on Ubuntu.

First of all you will need to add the UbuntuGIS repository. Using a text editor, open the file /etc/apt/sources.list and add the following lines, replacing <codename> with the codename of your Ubuntu release:

deb <codename> main
deb-src <codename> main

Next, you will need to update the list of available packages. Open a terminal and type the following:

sudo apt-get update

Once this has been done, it is possible to install any of the packages available through the UbuntuGIS repository. For a list of all available packages, see

Alternatively, as we know we want to install GDAL, simply type the following:

apt-cache search gdal

This will bring up a list of all packages that have ‘gdal’ in the description or title. To install the GDAL command line utilities, type:

sudo apt-get install gdal-bin

You should also install the Python bindings (the python-gdal package), which can be really useful for scripting with GDAL.
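As a quick sanity check that everything installed correctly (the exact version numbers reported will depend on your Ubuntu release):

```shell
# Install the Python bindings alongside the command line tools
sudo apt-get install python-gdal

# Verify the command line utilities...
gdalinfo --version

# ...and the Python bindings
python -c "from osgeo import gdal; print(gdal.VersionInfo())"
```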

You’re ready to start using GDAL.
