    An information-theoretical perspective on weighted ensemble forecasts

    This paper presents an information-theoretical method for weighting ensemble forecasts with new information. Weighted ensemble forecasts can be used to adjust the distribution that an existing ensemble of time series represents, without modifying the values in the ensemble itself. The weighting can, for example, add new seasonal forecast information to an existing ensemble of historically measured time series that represents climatic uncertainty. A recent article in this journal compared several methods to determine the weights for the ensemble members and introduced the pdf-ratio method. In this article, a new method, the minimum relative entropy update (MRE-update), is presented. Based on the principle of minimum discrimination information, an extension of the principle of maximum entropy (POME), the method ensures that no more information is added to the ensemble than is present in the forecast. This is achieved by minimizing relative entropy, with the forecast information imposed as constraints. From this same perspective, an information-theoretical view on the various weighting methods is presented. The MRE-update is compared with the existing methods and the parallels with the pdf-ratio method are analysed. The paper provides a new, information-theoretical justification for one version of the pdf-ratio method, which turns out to be equivalent to the MRE-update. All other methods result in sets of ensemble weights that, seen from the information-theoretical perspective, add either too little or too much (i.e. fictitious) information to the ensemble.
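
    To make the mechanics concrete, the following is a minimal sketch of a minimum-relative-entropy weight update, assuming the forecast information is reduced to a single target for the ensemble mean (the paper handles more general constraints). The solution takes the familiar exponential-tilting form, with the Lagrange multiplier found by root finding; the names mre_weights and m_target are illustrative, not taken from the paper.

        import numpy as np
        from scipy.optimize import brentq

        def mre_weights(x, m_target):
            # Weights minimizing relative entropy to the uniform prior, subject to
            # sum(w) = 1 and sum(w * x) = m_target (the forecast-mean constraint).
            x = np.asarray(x, dtype=float)
            z = (x - x.mean()) / x.std()               # standardize for numerical stability

            def constraint_gap(lam):
                w = np.exp(lam * z)
                w /= w.sum()
                return w @ x - m_target

            lam = brentq(constraint_gap, -20.0, 20.0)  # Lagrange multiplier of the mean constraint
            w = np.exp(lam * z)
            return w / w.sum()

        # Example: nudge the mean of a 100-member historical ensemble up by 10 %.
        ens = np.random.default_rng(0).gamma(shape=2.0, scale=50.0, size=100)
        weights = mre_weights(ens, m_target=1.1 * ens.mean())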

    Technical note: “Bit by bit”: a practical and general approach for evaluating model computational complexity vs. model performance

    One of the main objectives of the scientific enterprise is the development of well-performing yet parsimonious models for all natural phenomena and systems. In the 21st century, scientists usually represent their models, hypotheses, and experimental observations using digital computers. Measuring performance and parsimony of computer models is therefore a key theoretical and practical challenge for 21st century science. “Performance” here refers to a model's ability to reduce predictive uncertainty about an object of interest. “Parsimony” (or complexity) comprises two aspects: descriptive complexity – the size of the model itself, which can be measured by the disk space it occupies – and computational complexity – the model's effort to provide output. Descriptive complexity is related to inference quality and generality; computational complexity is often a practical and economic concern for limited computing resources. In this context, this paper has two distinct but related goals. The first is to propose a practical method of measuring computational complexity with the utility software “Strace”, which counts the total number of memory visits while running a model on a computer. The second goal is to propose the “bit by bit” method, which combines measuring computational complexity by “Strace” and measuring model performance by information loss relative to observations, both in bit. For demonstration, we apply the “bit by bit” method to watershed models representing a wide diversity of modelling strategies (artificial neural network, auto-regressive, process-based, and others). We demonstrate that computational complexity as measured by “Strace” is sensitive to all aspects of a model, such as the size of the model itself, the input data it reads, its numerical scheme, and time stepping. We further demonstrate that for each model, the bit counts for computational complexity exceed those for performance by several orders of magnitude and that the differences among the models for both computational complexity and performance can be explained by their setup and are in accordance with expectations. We conclude that measuring computational complexity by “Strace” is practical, and it is also general in the sense that it can be applied to any model that can be run on a digital computer. We further conclude that the “bit by bit” approach is general in the sense that it measures two key aspects of a model in the single unit of bit. We suggest that it can be enhanced by additionally measuring a model's descriptive complexity – also in bit.
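
    As a rough illustration of the performance half of the approach, the sketch below scores a deterministic simulation by its information loss relative to observations, in bits, using binned values and a small probability spread to avoid infinite penalties. The binning, the spread eps, and the name information_loss_bits are assumptions for illustration, not the paper's exact scoring, and the Strace-based complexity count is not reproduced here.

        import numpy as np

        def information_loss_bits(obs, sim, bins=20, eps=1e-3):
            # Discretize the value range into bins, treat the simulated bin as a
            # near-deterministic forecast (probability 1 - eps, remainder spread
            # over the other bins), and sum -log2 p(observed bin) over all steps.
            obs, sim = np.asarray(obs, float), np.asarray(sim, float)
            edges = np.histogram_bin_edges(np.concatenate([obs, sim]), bins=bins)
            o_idx = np.clip(np.digitize(obs, edges) - 1, 0, bins - 1)
            s_idx = np.clip(np.digitize(sim, edges) - 1, 0, bins - 1)
            loss = 0.0
            for o, s in zip(o_idx, s_idx):
                p_obs_bin = (1.0 - eps) if o == s else eps / (bins - 1)
                loss += -np.log2(p_obs_bin)
            return loss      # total information loss in bits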

    Controls on the diurnal streamflow cycles in two subbasins of an alpine headwater catchment

    In high-altitude alpine catchments, diurnal streamflow cycles are typically dominated by snowmelt or ice melt. Evapotranspiration-induced diurnal streamflow cycles are less commonly observed in these catchments but may occur simultaneously. During a field campaign in the summer of 2012 in an alpine catchment in the Swiss Alps (Val Ferret catchment, 20.4 km², glacierized area: 2%), we observed a transition in the early season from a snowmelt-induced to an evapotranspiration-induced diurnal streamflow cycle in one of two monitored subbasins. The two cycles were of comparable amplitude, and the transition happened within a time span of several days. In the second monitored subbasin, we observed an ice melt-dominated diurnal cycle during the entire season due to the presence of a small glacier. Comparisons between the ice melt and evapotranspiration cycles showed that the two processes were happening at the same times of day but with a different sign and a different shape. The amplitude of the ice melt cycle decreased exponentially during the season and was larger than the amplitude of the evapotranspiration cycle, which was relatively constant during the season. Our study suggests that an evapotranspiration-dominated diurnal streamflow cycle could damp the ice melt-dominated diurnal streamflow cycle. The two types of diurnal streamflow cycles were separated using a method based on the identification of the active riparian area and measurements of evapotranspiration.

    Geomorphic signatures on Brutsaert base flow recession analysis

    This paper addresses the signatures of catchment geomorphology on base flow recession curves. Its relevance relates to the implied predictability of base flow features, which are central to catchment-scale transport processes and to ecohydrological function. Moving from the classical recession curve analysis method, originally applied in the Finger Lakes Region of New York, a large set of recession curves has been analyzed from Swiss streamflow data. For these catchments, digital elevation models have been precisely analyzed and a method aimed at identifying the geomorphic origins of recession curves has been applied to the Swiss data set. The method links river network morphology, epitomized by the time-varying distribution of contributing channel sites, with the classic parameterization of recession events. This is done by assimilating two scaling exponents, β and b_G, with |dQ/dt| ∝ Q^β, where Q is the at-a-station gauged flow rate, and N(l) ∝ G(l)^(b_G), where l is the downstream distance from the channel heads receding in time, N(l) is the number of draining channel reaches located at distance l from their heads, and G(l) is the total drainage network length at a distance greater than or equal to l (the active drainage network). We find that the method provides good results in catchments where drainage density can be regarded as spatially constant. A correction to the method is proposed which accounts for arbitrary local drainage densities affecting the local drainage inflow per unit channel length. Such corrections properly vanish when the drainage density becomes spatially constant. Overall, definite geomorphic signatures are recognizable in recession curves, with notable theoretical and practical implications. Key Points: signatures of catchment geomorphology on base flow recession curves; analysis of streamflow data and DEMs for 27 catchments in Switzerland; new conceptual model accounting for uneven drainage density.
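
    A minimal sketch of the at-a-station part of the analysis, assuming daily streamflow and the crudest possible recession selection (all strictly receding steps); the paper's extraction of recession events is more careful, and the function name recession_exponent is illustrative.

        import numpy as np

        def recession_exponent(q):
            # Estimate beta in |dQ/dt| ~ Q^beta from a streamflow series:
            # keep strictly receding steps (dQ < 0) and fit a straight line
            # in log-log space; the slope is the recession exponent beta.
            q = np.asarray(q, dtype=float)
            dq = np.diff(q)
            mask = (dq < 0) & (q[:-1] > 0)
            beta, log_k = np.polyfit(np.log(q[:-1][mask]), np.log(-dq[mask]), 1)
            return beta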

    Multi-ethnic genome-wide association study for atrial fibrillation

    Atrial fibrillation (AF) affects more than 33 million individuals worldwide and has a complex heritability. We conducted the largest meta-analysis of genome-wide association studies (GWAS) for AF to date, consisting of more than half a million individuals, including 65,446 with AF. In total, we identified 97 loci significantly associated with AF, including 67 that were novel in a combined-ancestry analysis, and 3 that were novel in a European-specific analysis. We sought to identify AF-associated genes at the GWAS loci by performing RNA-sequencing and expression quantitative trait locus analyses in 101 left atrial samples, the most relevant tissue for AF. We also performed transcriptome-wide analyses that identified 57 AF-associated genes, 42 of which overlap with GWAS loci. The identified loci implicate genes enriched within cardiac developmental, electrophysiological, contractile and structural pathways. These results extend our understanding of the biological pathways underlying AF and may facilitate the development of therapeutics for AF.

    Application of Entropy Ensemble Filter in Neural Network Forecasts of Tropical Pacific Sea Surface Temperatures

    Recently, the Entropy Ensemble Filter (EEF) method was proposed to mitigate the computational cost of the Bootstrap AGGregatING (bagging) method. This method uses the most informative training data sets in the model ensemble rather than all ensemble members created by conventional bagging. In this study, we evaluate, for the first time, the application of the EEF method in Neural Network (NN) modeling of the El Niño–Southern Oscillation. Specifically, we forecast the first five principal components (PCs) of the sea surface temperature monthly anomaly fields over the tropical Pacific, at different lead times (from 3 to 15 months, with a three-month increment) for the period 1979–2017. We apply the EEF method in a multiple linear regression (MLR) model and two NN models, one using Bayesian regularization and one using the Levenberg–Marquardt algorithm for training, and evaluate their performance and computational efficiency relative to the same models with conventional bagging. All models perform equally well at lead times of 3 and 6 months, while at longer lead times the MLR model's skill deteriorates faster than that of the nonlinear models. The neural network models with both bagging methods produce equally successful forecasts with the same computational efficiency. It remains to be shown whether this finding is sensitive to the dataset size.
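
    The selection idea can be sketched as follows, assuming the informativeness of a bootstrap resample is measured by the Shannon entropy of its binned target values and that the highest-entropy resamples are kept for training; the exact entropy measure and the names eef_select, n_boot and n_keep are assumptions for illustration, not the EEF definition itself.

        import numpy as np

        def eef_select(y, n_boot=100, n_keep=20, bins=10, seed=0):
            # Draw bootstrap resamples (index sets), score each by the entropy (bits)
            # of its binned target values, and return the highest-scoring index sets
            # to be used as training sets for the model ensemble.
            rng = np.random.default_rng(seed)
            y = np.asarray(y, dtype=float)
            edges = np.histogram_bin_edges(y, bins=bins)
            samples, scores = [], []
            for _ in range(n_boot):
                idx = rng.integers(0, len(y), size=len(y))
                counts, _ = np.histogram(y[idx], bins=edges)
                p = counts[counts > 0] / counts.sum()
                scores.append(-(p * np.log2(p)).sum())
                samples.append(idx)
            keep = np.argsort(scores)[-n_keep:]
            return [samples[i] for i in keep]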

    Accounting for Observational Uncertainty in Forecast Verification: An Information-Theoretical View on Forecasts, Observations, and Truth

    Recently, an information-theoretical decomposition of the Kullback–Leibler divergence into uncertainty, reliability, and resolution was introduced. In this article, this decomposition is generalized to the case where the observation is uncertain. Along with a modified decomposition of the divergence score, a second measure, the cross-entropy score, is presented, which measures the estimated information loss with respect to the truth instead of relative to the uncertain observations. The difference between the two scores is equal to the average observational uncertainty and vanishes when observations are assumed to be perfect. Not accounting for observational uncertainty can lead to both overestimation and underestimation of forecast skill, depending on the nature of the noise process.
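
    The relation behind the two scores can be written down in a few lines: cross-entropy equals KL divergence plus the entropy of the observation distribution, so the difference between the cross-entropy score and the divergence score is the average observational uncertainty and vanishes when observations are certain. Below is a minimal sketch for categorical forecasts, assuming both forecast and observation are given as probability vectors per case; the function name is illustrative and not from the paper.

        import numpy as np

        def divergence_and_cross_entropy(f, o):
            # f, o: arrays of shape (n_cases, n_categories) holding, per case, the
            # forecast pmf and the (possibly uncertain) observational pmf.
            f, o = np.asarray(f, float), np.asarray(o, float)
            with np.errstate(divide="ignore", invalid="ignore"):
                kl = np.where(o > 0, o * np.log2(o / f), 0.0).sum(axis=1)   # KL(o || f)
                h_o = np.where(o > 0, -o * np.log2(o), 0.0).sum(axis=1)     # H(o)
            ds = kl.mean()            # divergence score (relative to the uncertain observations)
            ce = (kl + h_o).mean()    # cross-entropy score; ce - ds = mean observational uncertainty
            return ds, ce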

    Kullback–Leibler Divergence as a Forecast Skill Score with Classic Reliability–Resolution–Uncertainty Decomposition

    This paper presents a score that can be used for evaluating probabilistic forecasts of multicategory events. The score is a reinterpretation of the logarithmic score or ignorance score, now formulated as the relative entropy or Kullback–Leibler divergence of the forecast distribution from the observation distribution. Using the information-theoretical concepts of entropy and relative entropy, a decomposition into three components is presented, analogous to the classic decomposition of the Brier score. The information-theoretical twins of the components uncertainty, resolution, and reliability provide diagnostic information about the quality of forecasts. The overall score measures the information conveyed by the forecast. As was shown recently, information theory provides a sound framework for forecast verification. The new decomposition, which has proven to be very useful for the Brier score and is widely used, can help the logarithmic score gain acceptance in meteorology.
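
    For a binary event the decomposition can be computed directly, grouping forecasts that share a probability value into bins exactly as in the classic Brier decomposition, with DS = REL - RES + UNC. The sketch below illustrates that structure for 0/1 outcomes; it is a simplified illustration rather than code from the paper, and it assumes forecast probabilities strictly between 0 and 1.

        import numpy as np

        def h2(p):
            # Binary entropy in bits.
            return sum(-x * np.log2(x) for x in (p, 1 - p) if x > 0)

        def kl2(p, q):
            # KL divergence in bits between binary distributions with event probabilities p and q.
            return sum(a * np.log2(a / b) for (a, b) in ((p, q), (1 - p, 1 - q)) if a > 0)

        def divergence_score_decomposition(f_prob, obs):
            # f_prob: forecast probabilities of the event, obs: 0/1 outcomes.
            # Forecasts sharing a probability value form the bins; then
            # DS = REL - RES + UNC, mirroring the classic Brier decomposition.
            f_prob, obs = np.asarray(f_prob, float), np.asarray(obs, float)
            n, obar = len(obs), obs.mean()
            rel = res = 0.0
            for fk in np.unique(f_prob):
                sel = f_prob == fk
                ok = obs[sel].mean()                      # conditional observed frequency
                rel += sel.sum() / n * kl2(ok, fk)
                res += sel.sum() / n * kl2(ok, obar)
            unc = h2(obar)
            return rel - res + unc, rel, res, unc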