Background
Following on from my blog posting from July 2023, ‘Snowpatches and satellites’, I have been involved in some technical development work using Earth Observation (EO) technologies and satellite imagery.
For many years I have monitored and observed long-lasting snowpatches in the Scottish mountains, and as mentioned in the previous blog posting, I have carried this out as part of a community of observers (or ‘citizen scientists’). This community has grown over the years, using social media platforms like Facebook. Records of observations, fieldwork and data collected are published every year in the Weather Journal. The community was led by Adam Watson for many years, and continues to be inspired by him. Iain Cameron now continues Adam’s work.
During this time I often wondered if I could combine this activity with my professional experience of software engineering and data analysis, to build an application that would in some way support the work of the Scottish snowpatch community.
I recently joined a small ad-hoc group of software engineers and developers from this community who had similar aims. Our focus was on retrieving data about snowpatches remotely via EO techniques and data available on the internet, as opposed to on-site observations. An opportunity has arisen in recent years to use satellite imagery data from the Copernicus programme. This imagery is free to use, but getting access to it involves some hurdles. The simplest way is to use the Copernicus Browser, but an API is available for more sophisticated usage, and this is what our group was interested in.
One of the aims of the group was to use Machine Learning techniques (such as CNNs) to overcome one of the main difficulties in using satellite imagery to observe Scottish snowpatches, namely cloud cover. Cloud cover is probably the single most important factor affecting satellite imagery of the Scottish Highlands and it can be difficult to find an image that shows snowpatches that are distinct or free from cloud cover, particularly during the winter months. There can be only one day a month, or even none, when suitable imagery is available.
Satellite imagery in optical ‘true colour’ (i.e. electromagnetic spectrum wavelengths corresponding roughly to those that a human eye can see) is available from the Copernicus programme Sentinel-2 satellite. Other satellites offer imagery using other types of radiation such as Synthetic Aperture Radar (SAR) from the Sentinel-1 satellite. SAR can penetrate cloud cover and this is one area that offers potential scope for future investigation, but the focus in the group of developers was on true colour or RGB imagery.
Development
The Copernicus Browser is a very useful tool but it does have a fairly cluttered interface with a lot of options and features. I thought there was scope for developing a tool which was more streamlined and targeted towards the Scottish snowpatch community, and which could be used as a front-end or delivery interface for the work carried out by the group.
The work to utilise Machine Learning techniques is ongoing in the group of developers, but my initial focus was on building an application that could best serve the requirements of the Scottish snowpatch community using existing data and techniques. With that in mind, I developed with the other software engineers in the group the idea of a web-based, interactive tool whose principal aim is to let a user graphically explore and visualise the appearance and location of well-known long-lasting snowpatches in the Scottish Highlands, using web maps, satellite imagery data and existing support for classifying features in the images.
With Murray Cutforth (https://github.com/murraycutforth), I developed a tool that matched these requirements. The tool can be seen at: https://scottish-snowpatches.streamlit.app, and some help information about using the tool can be seen at: https://scottish-snowpatches.streamlit.app/help.
In my experience, one of the most difficult problems to overcome when developing bespoke web-based applications is finding a suitable platform to host the application. This is not straightforward if the application must have no ongoing financial dependencies (i.e. paying fees for web hosting or cloud computing services), must be as publicly available and easy to access as possible on the web (i.e. with no authentication, registration, subscription or advertising barriers), and must have an independent and long-term ‘lifespan’ (i.e. it is not a temporary implementation that is tied to a particular individual developer and their credit card).
GitHub Pages can help with this, but quickly reaches a limitation, as the service only supports client-side web technologies such as HTML and JavaScript. When an application is built with a server-side technology such as PHP or Python, or interacts with a database, GitHub Pages is no longer appropriate. Jupyter Binder and Google Colab are also possibilities, but both are limited to hosting ‘computational notebooks’; in addition, Binder is limited to Python code, and Colab requires a credit card even for the limited free-tier service, which may be charged if resource usage goes over a certain limit.
A solution to this for the specific case of Python was suggested to me by the group of developers, namely Streamlit. Streamlit requires no payment and allows Python code hosted on GitHub to be served as a ‘clean’ web-based application, hosted in the cloud. This became the platform architecture of the tool that we built.
Description
The tool enables satellite images to be retrieved from a list of recognised long-lasting snowpatch areas in the Scottish Highlands. A satellite image corresponding to the location appears in a square window (a 10km by 10km bounding box), overlaid on a map allowing the wider context of the satellite image within the Scottish Highlands to be visualised. A date period is also selected and the least cloudy image is used from that period. Transparency of the layers is also configurable for visualisation purposes.
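As a rough illustration of the 10km by 10km window, the sketch below derives a square bounding box from a centre point. It is not taken from the tool's code: the function name `square_bbox`, the spherical-Earth approximation and the example coordinates are all assumptions made for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; a spherical approximation


def square_bbox(lat, lon, side_km=10.0):
    """Return (west, south, east, north) in degrees for a square centred on lat/lon."""
    half = side_km / 2.0
    # Degrees of latitude spanned by half the side length
    dlat = math.degrees(half / EARTH_RADIUS_KM)
    # Degrees of longitude grow as the meridians converge towards the pole
    dlon = math.degrees(half / (EARTH_RADIUS_KM * math.cos(math.radians(lat))))
    return (lon - dlon, lat - dlat, lon + dlon, lat + dlat)


# Illustrative coordinates in the Cairngorms (an assumption, not a value from the tool)
west, south, east, north = square_bbox(57.078, -3.728)
```

At Scottish latitudes the box spans roughly twice as many degrees of longitude as of latitude, which is why the two offsets are computed separately.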
The tool uses data from the Sentinel-2 satellite to create a ‘true colour’ image corresponding to what a human eye might see. An important thing to note is that this image is not a ‘snapshot’ taken on one particular day; rather, it is a combination of data from the whole date range selected, using data from the least cloudy days to generate the ‘best’ available image of the area. Images are combined via mosaicking, pixel by pixel. A narrow date range enables a particular single date to be identified, but it increases the chance that no data will be available, because 1) the satellite’s orbit did not pass over the selected area; 2) the satellite passed over the area at night; or 3) there was significant cloud cover during the selected date range.
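The idea of a least-cloudy mosaic can be shown with a deliberately simplified sketch. This is not the CDSE implementation: it assumes each acquisition carries a single scene-level cloud cover percentage, uses `None` for pixels with no data, and the name `mosaic_least_cloudy` and the sample values are hypothetical.

```python
def mosaic_least_cloudy(acquisitions):
    """Build a mosaic from (cloud_cover_percent, image) pairs.

    Each image is a 2D list of pixel values with None marking no-data pixels.
    Acquisitions are visited from least to most cloudy, and each output pixel
    takes the first value that is available.
    """
    if not acquisitions:
        raise ValueError("no acquisitions in the selected date range")
    ordered = sorted(acquisitions, key=lambda item: item[0])
    rows = len(ordered[0][1])
    cols = len(ordered[0][1][0])
    mosaic = [[None] * cols for _ in range(rows)]
    for _, image in ordered:
        for r in range(rows):
            for c in range(cols):
                if mosaic[r][c] is None and image[r][c] is not None:
                    mosaic[r][c] = image[r][c]
    return mosaic


# Three hypothetical 2x2 acquisitions; pixel values just identify the source scene
scenes = [
    (80.0, [[1, 1], [1, 1]]),
    (5.0, [[2, None], [2, 2]]),
    (30.0, [[None, 3], [3, 3]]),
]
result = mosaic_least_cloudy(scenes)  # [[2, 3], [2, 2]]
```

The top-right pixel comes from a different acquisition than its neighbours, which is why a mosaicked image cannot be assigned a single date.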
The tool uses data from the Copernicus Data Space Ecosystem (CDSE). To retrieve this data an API is available, which is used via a Python package called sentinelhub. The tool uses this API and was built on some sample client code.
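For readers unfamiliar with the package, a request for a true colour image might look roughly like the sketch below. It is not the tool's code. The function name `fetch_true_colour`, the evalscript, the collection label and the two CDSE URLs are my assumptions and should be checked against the current CDSE documentation.

```python
# Evalscript (JavaScript run by the service) returning brightened red, green, blue bands
EVALSCRIPT_TRUE_COLOUR = """
//VERSION=3
function setup() {
  return { input: ["B02", "B03", "B04"], output: { bands: 3 } };
}
function evaluatePixel(sample) {
  return [2.5 * sample.B04, 2.5 * sample.B03, 2.5 * sample.B02];
}
"""

# Assumed CDSE endpoints; verify against the CDSE documentation before use
CDSE_BASE_URL = "https://sh.dataspace.copernicus.eu"
CDSE_TOKEN_URL = (
    "https://identity.dataspace.copernicus.eu"
    "/auth/realms/CDSE/protocol/openid-connect/token"
)


def fetch_true_colour(client_id, client_secret, bbox_wgs84, time_interval, size=(512, 512)):
    """Request one least-cloudy true colour mosaic for a (west, south, east, north) box."""
    # Imported here so this sketch can be loaded without the package installed
    from sentinelhub import (
        BBox, CRS, DataCollection, MimeType, SentinelHubRequest, SHConfig,
    )

    config = SHConfig()
    config.sh_client_id = client_id
    config.sh_client_secret = client_secret
    config.sh_base_url = CDSE_BASE_URL
    config.sh_token_url = CDSE_TOKEN_URL

    # Point the Sentinel-2 L2A collection at the CDSE deployment
    collection = DataCollection.SENTINEL2_L2A.define_from(
        "s2l2a_cdse", service_url=CDSE_BASE_URL
    )
    request = SentinelHubRequest(
        evalscript=EVALSCRIPT_TRUE_COLOUR,
        input_data=[
            SentinelHubRequest.input_data(
                data_collection=collection,
                time_interval=time_interval,  # e.g. ("2023-07-01", "2023-07-31")
                other_args={"dataFilter": {"mosaickingOrder": "leastCC"}},
            )
        ],
        responses=[SentinelHubRequest.output_response("default", MimeType.PNG)],
        bbox=BBox(bbox_wgs84, crs=CRS.WGS84),
        size=size,
        config=config,
    )
    return request.get_data()[0]
```

Calling the function needs valid OAuth client credentials and counts against the account's monthly quota.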
The tool displays the satellite imagery on an OpenStreetMap (OSM) base map layer using Folium, a Python package which allows the use of the Leaflet JavaScript library through a Python API.
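A minimal sketch of overlaying an image on an OSM base map with Folium is shown below. The helper names `leaflet_bounds` and `build_map`, the zoom level and the default opacity are assumptions chosen for illustration and do not come from the tool.

```python
def leaflet_bounds(bbox):
    """Convert (west, south, east, north) to Leaflet's [[south, west], [north, east]]."""
    west, south, east, north = bbox
    return [[south, west], [north, east]]


def build_map(image, bbox, opacity=0.7):
    """Return a Folium map with `image` (array or file path) overlaid on OSM tiles."""
    # Imported here so the bounds helper can be used without Folium installed
    import folium

    west, south, east, north = bbox
    centre = [(south + north) / 2, (west + east) / 2]
    fmap = folium.Map(location=centre, zoom_start=11, tiles="OpenStreetMap")
    folium.raster_layers.ImageOverlay(
        image=image,
        bounds=leaflet_bounds(bbox),
        opacity=opacity,  # lower values let more of the base map show through
        name="Sentinel-2 true colour",
    ).add_to(fmap)
    # Layer control lets the user toggle the overlay on and off
    folium.LayerControl().add_to(fmap)
    return fmap
```

Leaflet expects latitude before longitude, the reverse of the usual bounding box order, which is why the conversion is kept in its own helper.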
The tool also allows another data layer to be visualised on the map: a land cover or ‘Scene Classification’ map, generated by an algorithm developed by the CDSE. This shows the results as purple polygons outlining the areas the algorithm has classified as ‘snow/ice’. This algorithm is not perfect: it can miss some snowpatches around 100m² (10m by 10m) in size and can sometimes classify areas of cloud cover as ‘snow/ice’. An important consideration is that this algorithm does not use Machine Learning (ML) techniques, so there is potential for improvement using an ML approach.
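To make the classification layer more concrete, the sketch below picks out snow/ice pixels from a classification grid. It assumes the layer follows the Sentinel-2 Level-2A Scene Classification (SCL) convention, in which class 11 means snow or ice and pixels are 20m across. The grid values and function names are hypothetical.

```python
SCL_SNOW_ICE = 11      # snow or ice class in the Sentinel-2 L2A SCL band (assumed to apply here)
SCL_PIXEL_SIZE_M = 20  # SCL pixels are 20m by 20m


def snow_mask(scl):
    """Return a grid of booleans, True where the pixel is classified as snow/ice."""
    return [[value == SCL_SNOW_ICE for value in row] for row in scl]


def snow_area_m2(scl):
    """Return the total area, in square metres, of pixels classified as snow/ice."""
    count = sum(value == SCL_SNOW_ICE for row in scl for value in row)
    return count * SCL_PIXEL_SIZE_M ** 2


# Hypothetical 2x3 grid: 4 = vegetation, 5 = not vegetated, 9 = cloud (high probability)
grid = [
    [4, 11, 9],
    [11, 11, 5],
]
mask = snow_mask(grid)
area = snow_area_m2(grid)  # 3 pixels * 400 m² = 1200 m²
```

Under that assumption a single pixel covers 400m², so a 100m² snowpatch fills only a quarter of one pixel, which may partly explain why patches of that size are missed.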
The tool also requires a user to have a CDSE account and generate special OAuth client authentication details. This is required as the tool uses a quota for accessing the CDSE data, which resets every month.
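For context, OAuth client details of this kind are normally exchanged for a short-lived access token using the ‘client credentials’ grant. The sketch below only builds such a request and does not send it. The token URL is my assumption about the CDSE endpoint and the credential values are placeholders.

```python
import urllib.parse
import urllib.request

# Assumed CDSE token endpoint; check the CDSE documentation before relying on it
CDSE_TOKEN_URL = (
    "https://identity.dataspace.copernicus.eu"
    "/auth/realms/CDSE/protocol/openid-connect/token"
)


def build_token_request(client_id, client_secret):
    """Build (but do not send) an OAuth2 client-credentials token request."""
    body = urllib.parse.urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
        }
    ).encode("ascii")
    return urllib.request.Request(
        CDSE_TOKEN_URL,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Placeholder credentials; real ones are generated in the user's CDSE account
req = build_token_request("my-client-id", "my-client-secret")
```

Sending the request with `urllib.request.urlopen(req)` would return a JSON document containing the access token. In practice the sentinelhub package performs this exchange itself once the client ID and secret are configured.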
The GitHub repository for the code is at: https://github.com/EddieBoyle2019/GeoML.
Conclusions
Developing this tool showed what is possible with existing platforms, services, technologies and data. The tool has limitations (which are listed below) but I think it will still be useful for the Scottish snowpatch community, albeit perhaps to a small group of technically-minded observers within the community. I will monitor any feedback I receive from the community and incorporate this into ongoing development discussions with the rest of the group of developers. The tool is very much an initial version, and there is much scope for future development.
Limitations/future development
- The approach of ‘mosaicking’ data with the least cloud cover, mentioned above, means that the resulting images used in the tool do not have a specific single date but rather a date range, which can be a bit confusing and possibly of limited use for serious snowpatch observations (i.e. recording melt dates). The best way to deal with this issue while still using the sentinelhub API is to allow a user to choose a percentage cloud cover value, and then show graphically on a calendar interface those dates with images available for the area that have cloud cover less than the chosen value. This is exactly what the Copernicus Browser does, and this is the resource to use if this more detailed approach is required.
- Development of support in the web interface for smartphones and smaller screens like tablets.
- The requirement for a personal SentinelHub OAuth client account is a bit of a hurdle, and could perhaps be overcome by creating a group CDSE account with a suitable data quota.
- Integration of Machine Learning functionality that may be developed by the group of developers in the future.
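The cloud cover threshold idea in the first limitation above comes down to a simple filter over per-date acquisition records. The sketch below uses hard-coded, hypothetical records. In a real implementation they would come from a catalogue query for the selected area, and the function name is my own.

```python
from datetime import date


def dates_below_cloud_threshold(acquisitions, max_cloud_percent):
    """Return the sorted dates whose cloud cover is less than the chosen value.

    acquisitions is an iterable of (date, cloud_cover_percent) pairs.
    """
    return sorted(d for d, cloud in acquisitions if cloud < max_cloud_percent)


# Hypothetical records for one area over a few satellite passes
records = [
    (date(2023, 7, 3), 85.0),
    (date(2023, 7, 8), 12.5),
    (date(2023, 7, 13), 5.0),
    (date(2023, 7, 18), 40.0),
]
usable = dates_below_cloud_threshold(records, 20.0)
```

The resulting dates are the ones a calendar interface would highlight, and each corresponds to a single acquisition rather than a mosaic.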