Machine learning tools for pattern recognition in polar climate science

Abstract

This thesis explores the application of two novel machine learning approaches to the study of polar climate, with particular focus on Arctic sea ice. The first technique, complex networks, is an unsupervised learning approach that exploits spatio-temporal patterns of variability within geospatial time series data sets. The second, Gaussian Process Regression (GPR), is a supervised Bayesian inference approach which establishes a principled framework for learning functional relationships between pairs of observation points, by updating prior uncertainty in the presence of new information. These methods are applied to a variety of problems currently facing the polar climate community, each of which can be considered an individual component of the wider problem of Arctic sea ice predictability. In the first instance, the complex networks methodology is combined with GPR in order to produce skilful seasonal forecasts of pan-Arctic and regional September sea ice extents, with up to 3 months lead time. De-trended forecast skills of 0.53, 0.62, and 0.81 are achieved at 3-, 2- and 1-month lead times, respectively, and regional predictive skill is generally highest (> 0.30) in the Pacific sectors of the Arctic, although the ability to skilfully predict many of these regions may be changing over time. Subsequently, the GPR approach is used to combine observations from the CryoSat-2, Sentinel-3A and Sentinel-3B satellite radar altimeters, in order to produce daily pan-Arctic estimates of radar freeboard, together with their uncertainty, across the 2018--2019 winter season. The empirical Bayes numerical optimisation technique is also used to derive auxiliary properties relating to the radar freeboard, including its spatial and temporal (de-)correlation length scales, allowing daily pan-Arctic maps of these fields to be generated as well.
The estimated daily freeboards agree with CryoSat-2 and Sentinel-3 to within < 1 mm (standard deviations < 6 cm) across the 2018--2019 season, and furthermore, cross-validation experiments show that prediction errors are generally ≤ 4 mm across the same period. Finally, the complex networks approach is used to evaluate the presence of the winter Arctic Oscillation (AO) to summer sea ice teleconnection within 31 coupled climate models participating in phase 6 of the World Climate Research Programme Coupled Model Intercomparison Project (CMIP6). Two global metrics are used to compare patterns of variability between observations and models: the Adjusted Rand Index and a network distance metric. CMIP6 models generally over-estimate the magnitude of sea-level pressure variability over the north-western Pacific Ocean, and under-estimate the variability over north Africa and southern Europe, while they also under-estimate the importance of regions such as the Beaufort, East Siberian and Laptev seas in explaining pan-Arctic summer sea ice area variability. They also under-estimate the degree of covariance between the winter AO and summer sea ice in key regions such as the East Siberian Sea and Canada Basin, which may hinder their ability to make skilful seasonal to inter-annual predictions of summer sea ice.
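
As an illustration of how GPR updates prior uncertainty in the presence of new information, the following is a minimal sketch and not the implementation used in the thesis. It assumes a zero-mean prior, a squared-exponential kernel, fixed hyperparameters (`length_scale`, `variance`, `noise_var`) and a one-dimensional toy data set, all of which are hypothetical choices made for the example. In the thesis the hyperparameters are instead estimated by empirical Bayes optimisation, which is omitted here.

```python
import numpy as np


def rbf_kernel(xa, xb, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance between two sets of 1-D inputs."""
    sq_dist = (xa[:, None] - xb[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / length_scale**2)


def gp_posterior(x_train, y_train, x_test,
                 noise_var=1e-2, length_scale=1.0, variance=1.0):
    """Posterior mean and variance of a zero-mean GP at the test inputs."""
    # Covariance of the noisy training observations.
    k_tt = rbf_kernel(x_train, x_train, length_scale, variance)
    k_tt = k_tt + noise_var * np.eye(len(x_train))
    # Cross-covariance (train vs. test) and prior covariance at test inputs.
    k_ts = rbf_kernel(x_train, x_test, length_scale, variance)
    k_ss = rbf_kernel(x_test, x_test, length_scale, variance)

    # Cholesky factorisation gives a stable solve against k_tt.
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_ts.T @ alpha

    # Posterior variance: prior variance minus the part explained by the data.
    v = np.linalg.solve(chol, k_ts)
    var = np.diag(k_ss) - np.sum(v**2, axis=0)
    return mean, var


# Toy data (hypothetical): four observations of a smooth function.
x_train = np.array([0.0, 1.0, 2.0, 3.0])
y_train = np.sin(x_train)

# One test point inside the observed range, one far outside it.
x_test = np.array([1.5, 10.0])
mean, var = gp_posterior(x_train, y_train, x_test)
print("posterior mean:", mean)
print("posterior variance:", var)
```

Near the observations the posterior variance falls well below the prior variance, whereas far from any data the prediction reverts to the prior mean and variance. This is the behaviour that allows an uncertainty estimate to accompany each interpolated value.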
