In times of global change, we must closely monitor the state of our planet in order to understand gradual or abrupt changes early on. Each of the Earth's subsystems (i.e. the biosphere, atmosphere, hydrosphere, cryosphere, and anthroposphere) can be analyzed from a multitude of data streams. However, since it is very hard to jointly interpret multiple monitoring data streams in parallel, one often aims for some summarizing indicator. Climate indices, for example, summarize the state of atmospheric circulation in a region, e.g. the Multivariate ENSO (El Niño-Southern Oscillation) Index. Indicator approaches have also been used extensively to describe socioeconomic data, and a range of indices have been proposed to synthesize and interpret this information. For instance, the "Human Development Index" (HDI) by the United Nations Development Programme was designed to capture specific aspects of development.
"Dimensionality reduction" (DR) is a widely used approach to find low-dimensional and interpretable representations of data that are natively embedded in high-dimensional spaces. Here, we propose a robust method to create indicators using dimensionality reduction to better represent the terrestrial biosphere and the global socioeconomic system. We aim to explore the performance of the approach and to interpret the resulting indicators.
For biosphere indicators, the concept was tested using 12 explanatory variables representing the biophysical states of ecosystems and land-atmosphere water, energy, and carbon fluxes. We find that two indicators account for 73% of the variance of the state of the biosphere in space and time. While the first indicator summarizes productivity patterns, the second indicator summarizes variables representing water and energy availability. Anomalies in the indicators clearly identify extreme events, such as the Amazon droughts (2005 and 2010) and the Russian heatwave (2010); they also allow us to interpret the impacts of these events. The indicators also reveal changes in the seasonal cycle, e.g. increasing seasonal amplitudes of productivity in agricultural areas and in Arctic regions.
We also apply the method to the "World Development Indicators" (WDIs), a database with more than 1500 variables, to track socioeconomic development at the country level. The aim was to extract the core dimensions of development in a highly efficient way, using nonlinear dimensionality reduction. We find that over 90% of the variance in the WDIs can be represented by five uncorrelated and nonlinear dimensions. The first dimension (explaining 74%) represents the state of education, health, income, infrastructure, trade, population, and pollution. The second dimension (explaining 10%) differentiates countries by gender ratios, labor market, and energy production patterns. Overall, we find that the data contained in the WDIs are highly nonlinear and therefore require nonlinear methods to extract the main patterns of development. Globally, most countries show rather consistent temporal trends towards wealthier and aging societies. Our approach detects deviations from these long-term trajectories during warfare, environmental disasters, or fundamental political changes.
In general, we find that the indicator approach is able to extract general patterns from complex databases and that it can be applied to databases of varying characteristics. We also find that the indicators can detect different kinds of changes occurring in the system, such as extreme events, permanent changes, or trends. The approach is therefore a useful tool for general monitoring and exploratory data analysis. It is flexible and can be applied to complex datasets, including large datasets, nonlinear data, and data with many missing values.