35 research outputs found
Quantum statistical inference and communication
This thesis studies the limits on the performance of inference tasks with quantum data
and quantum operations. Our results fall into two main parts.
In the first part, we study how to infer relative properties of sets of quantum states,
given a certain number of copies of the states. We investigate the performance of optimal
inference strategies according to several figures of merit that quantify the precision of
the inference. Since we are not interested in a complete reconstruction of the
states, the optimal strategies do not require performing quantum tomography. In particular,
we address the following problems:
- We evaluate the asymptotic error probabilities of optimal learning machines for
quantum state discrimination. Here, a machine receives a number of copies of a
pair of unknown states, which can be seen as training data, together with a test
system initialized in one of the two states with equal probability.
The goal is to implement a measurement that identifies the state of the test
system while minimizing the error probability. We analyze the optimal strategies for
a number of settings that differ in the prior incomplete information about the
states available to the agent.
- We evaluate the limits on the precision of estimating the overlap between two
unknown pure states, given N copies of one state and M copies of the other. We find an asymptotic
expansion of a Fisher information associated with the estimation problem, which
yields a lower bound on the mean squared error of any estimator. We compute the
minimum average mean squared error for random pure states, and we evaluate the
effect of depolarizing noise on qubit states. We compare the performance of the
optimal estimation strategy with that of other intuitive strategies,
such as the swap test and measurements based on estimating the states.
- We evaluate how many samples from a collection of N d-dimensional states are
necessary to decide, with high probability, whether the collection consists of identical
states or the states differ by more than a threshold ε according to a motivated closeness
measure. Access to copies of the states in the collection is given as follows:
each time the agent asks for a copy, the agent receives one of the states with some fixed probability, together with a label identifying which state of the collection it is. We prove that the problem can be solved with O(√N d/ε²) copies, and
that this scaling is optimal up to a constant independent of d, N, and ε.
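For the state-discrimination problem above, a useful baseline is the case of two *known* pure states, where the minimum error probability is given by the classical Helstrom bound; the learning-machine results quantify how much is lost when the states are unknown and only training copies are available. A minimal numerical sketch (the helper name `helstrom_error` is ours, not from the thesis):

```python
import numpy as np

def helstrom_error(psi, phi, p=0.5):
    """Minimum error probability for discriminating two known pure states
    with priors p and 1-p (Helstrom bound):
    P_err = (1 - sqrt(1 - 4 p (1-p) |<psi|phi>|^2)) / 2."""
    overlap_sq = abs(np.vdot(psi, phi)) ** 2
    return 0.5 * (1.0 - np.sqrt(1.0 - 4.0 * p * (1 - p) * overlap_sq))

e0, e1 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
print(helstrom_error(e0, e1))   # 0.0 -- orthogonal states are perfectly distinguishable
print(helstrom_error(e0, e0))   # 0.5 -- identical states: pure guessing
```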
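The overlap-estimation problem above mentions the swap test as one of the intuitive strategies. In the swap test, an ancilla measurement yields outcome 0 with probability (1 + |⟨ψ|φ⟩|²)/2, so inverting the empirical frequency estimates the squared overlap. A minimal sketch that simulates only the outcome statistics, not the circuit (function name and shot count are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def swap_test_estimate(psi, phi, shots=100_000):
    """Estimate |<psi|phi>|^2 from simulated swap-test outcomes.

    The ancilla reads 0 with probability (1 + |<psi|phi>|^2) / 2;
    inverting the sample frequency gives the overlap estimate."""
    p0 = 0.5 * (1.0 + abs(np.vdot(psi, phi)) ** 2)
    zeros = rng.binomial(shots, p0)
    return max(0.0, 2.0 * zeros / shots - 1.0)

psi = np.array([1.0, 0.0])
phi = np.array([1.0, 1.0]) / np.sqrt(2)    # true squared overlap = 0.5
print(swap_test_estimate(psi, phi))         # close to 0.5
```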
In the second part, we study optimal classical and quantum communication rates for
several physically motivated noise models.
- The quantum and private capacities of most realistic channels cannot be evaluated
from their regularized expressions. We design several degradable extensions
for notable channels, obtaining upper bounds on the quantum and private capacities
of the original channels. We obtain sufficient conditions for the degradability
of flagged extensions of channels that are convex combinations of other channels.
These sufficient conditions are easy to verify and simplify the construction of
degradable extensions.
- We consider the problem of transmitting classical information with continuous-variable
systems under an energy constraint, when it is impossible to maintain a shared
reference frame and in the presence of losses. In contrast with phase-insensitive noise
models, we show that, in some regimes, squeezing improves the communication
rates with respect to coherent-state sources and with respect to sources producing
up to two-photon Fock states. We give upper and lower bounds on the optimal
coherent-state rate and show that using part of the energy to repeatedly restore a
phase reference is strictly suboptimal at high energies.
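As background for the energy-constrained rates discussed above: rates for bosonic channels are commonly expressed through the entropy g(x) of a thermal state with mean photon number x, and for a pure-loss channel with transmissivity η, mean input photon number N̄, and a shared phase reference, the Holevo-limited classical capacity is g(ηN̄). A minimal sketch of this standard background function (the thesis's phase-reference-free bounds are not reproduced here):

```python
import numpy as np

def g(x):
    """Entropy (in bits) of a thermal bosonic state with mean photon number x:
    g(x) = (x+1) log2(x+1) - x log2 x, with g(0) = 0."""
    x = np.asarray(x, dtype=float)
    safe = np.maximum(x, 1e-300)  # avoid log(0); the where() masks x = 0 anyway
    return np.where(x > 0, (x + 1) * np.log2(x + 1) - x * np.log2(safe), 0.0)

# With a shared phase reference, the pure-loss capacity at transmissivity eta
# and input energy nbar is g(eta * nbar); e.g. one mean photon gives
print(g(1.0))   # 2.0 bits
```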
Information Geometry
This Special Issue of the journal Entropy, titled “Information Geometry I”, contains a collection of 17 papers concerning the foundations and applications of information geometry. Based on a geometrical interpretation of probability, information geometry has become a rich mathematical field employing the methods of differential geometry. It has numerous applications to data science, physics, and neuroscience. Presenting original research, yet written in an accessible, tutorial style, this collection of papers will be useful for scientists who are new to the field, while providing an excellent reference for the more experienced researcher. Several papers are written by authorities in the field, and topics cover the foundations of information geometry, as well as applications to statistics, Bayesian inference, machine learning, complex systems, physics, and neuroscience.
Numerical aspects of uncertainty in the design of optimal experiments for model discrimination
This thesis investigates robust strategies of optimal experimental design for discrimination between several nonlinear regression models. It develops novel theory, efficient algorithms, and implementations of such strategies, and provides a framework for assessing and comparing their practical performance. The framework is employed to perform extensive case studies. Their results demonstrate the success of the novel strategies.
The thesis contributes advances over existing theory and techniques in various fields as follows:
The thesis proposes novel “misspecification-robust” data-based approximation formulas for the covariances of maximum-likelihood estimators and of Bayesian posterior distributions of parameters in nonlinear incorrect models. The formulas adequately quantify parameter uncertainty even if the model is both nonlinear and systematically incorrect.
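The thesis's formulas are derived for general nonlinear maximum-likelihood and Bayesian settings; the classical sandwich (Huber) covariance estimator for a linear least-squares fit conveys the flavor of misspecification-robustness. A minimal sketch, not the thesis's formulas (data and model are illustrative: a linear fit to data generated by a quadratic model):

```python
import numpy as np

rng = np.random.default_rng(1)

# Least-squares fit of a (systematically incorrect) linear model y ~ X b.
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = 1.0 + 2.0 * X[:, 1] + 0.5 * X[:, 1] ** 2 + rng.normal(size=n)  # truth is quadratic

b = np.linalg.lstsq(X, y, rcond=None)[0]
r = y - X @ b

# "Classic" covariance assumes the model is correct: sigma^2 (X'X)^-1.
XtX_inv = np.linalg.inv(X.T @ X)
cov_classic = (r @ r / (n - 2)) * XtX_inv

# Sandwich (misspecification-robust) covariance: H^-1 J H^-1 with
# H = X'X and J = sum_i r_i^2 x_i x_i'.
J = (X * r[:, None] ** 2).T @ X
cov_sandwich = XtX_inv @ J @ XtX_inv

print(np.sqrt(np.diag(cov_classic)))    # standard errors, correct-model assumption
print(np.sqrt(np.diag(cov_sandwich)))   # standard errors, robust to misspecification
```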
The thesis develops a framework of novel statistical measures and tailored efficient algorithms for the simulation-based assessment of covariance approximations for maximum-likelihood estimators of parameters. Fully parallelized variants of the algorithms are implemented in the software package DoeSim.
Using DoeSim, the misspecification-robust covariance formula for maximum-likelihood estimators (MLEs) and its “classic” alternative are compared in an extensive numerical case study. The results demonstrate the superiority of the misspecification-robust formula.
Two novel sequential design criteria for model discrimination are proposed. They take parameter uncertainty into account with the new misspecification-robust posterior covariance formula. It is shown that both design criteria constitute an improvement over a popular approximation of the Box-Hill-Hunter criterion. In contrast to the latter, they avoid overestimating the expected amount of information provided by an experiment.
The thesis clarifies that the popular Gauss-Newton method is generally not appropriate for finding least-squares parameter estimates in the context of model discrimination. Furthermore, it demonstrates that a large class of optimal experimental design optimization problems for model discrimination is intrinsically non-convex even under strong simplifying assumptions. Such problems are NP-hard and particularly difficult to solve numerically.
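The Gauss-Newton caveat can be seen directly: the method approximates the Hessian of the least-squares objective by the first-order term alone, dropping the term weighted by the residuals. In model discrimination the residuals of a wrong model are large by construction, so the dropped term is not negligible. A minimal one-parameter sketch (model and data are illustrative, not from the thesis):

```python
import numpy as np

# One-parameter model f(t; theta) = exp(theta * t), fitted by least squares
# to data generated by a *different* (linear) model, so residuals stay large.
t = np.linspace(0.0, 1.0, 20)
y = 1.0 + 2.0 * t
theta = 1.0

f   = np.exp(theta * t)
df  = t * f            # d f / d theta
d2f = t ** 2 * f       # d^2 f / d theta^2
r   = y - f            # residuals of the misspecified model

# Hessian of the objective 0.5 * sum(r^2) with respect to theta:
h_full = np.sum(df ** 2) - np.sum(r * d2f)   # exact
h_gn   = np.sum(df ** 2)                     # Gauss-Newton drops the residual term

print(h_full, h_gn)  # the gap is substantial because the residuals are large
```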
A framework is developed for the quantitative assessment and comparison of sequential optimal experimental design strategies for model discrimination. It consists of new statistical measures of their practical performance and problem-adapted algorithms to compute these measures. A state-of-the-art modular and parallelized implementation is provided in the software package DoeSim. The framework permits quantitative analyses of the broad range of behaviour that a design strategy shows under fluctuating data.
The practical performance of four established and three novel sequential design criteria for model discrimination is examined in an extensive simulation study. The study is performed with DoeSim and comprises a large number of model discrimination problems. The behaviour of the design criteria is examined under different magnitudes of measurement error and for different numbers of rival models.
Central results from the study are that a popular approximation of the Box-Hill-Hunter criterion is surprisingly inefficient, particularly in problems with three or more models, that all parameter-robust design criteria in fact outperform the basic Hunter-Reiner strategy, and that the newly proposed design criteria are among the most efficient ones. The latter show particularly strong advantages over their alternatives when facing demanding model discrimination problems with many rival models and large measurement errors.
The Statistical Foundations of Entropy
In the last two decades, the understanding of complex dynamical systems underwent important conceptual shifts. The catalyst was the infusion of new ideas from the theory of critical phenomena (scaling laws, renormalization group, etc.), (multi)fractals and trees, random matrix theory, network theory, and non-Shannonian information theory. The usual Boltzmann–Gibbs statistics were proven to be grossly inadequate in this context. While successful in describing stationary systems characterized by ergodicity or metric transitivity, Boltzmann–Gibbs statistics fail to reproduce the complex statistical behavior of many real-world systems in biology, astrophysics, geology, and the economic and social sciences. The aim of this Special Issue was to extend the state of the art with original contributions to the ongoing discussion on the statistical foundations of entropy, with a particular emphasis on non-conventional entropies that go significantly beyond the Boltzmann, Gibbs, and Shannon paradigms. The accepted contributions addressed various aspects, including information-theoretic, thermodynamic, and quantum aspects of complex systems, and found several important applications of generalized entropies in various systems.
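Two canonical examples of non-Shannonian entropies mentioned in this context are the Rényi and Tsallis families, both of which recover the Shannon entropy in the limit q → 1. A minimal sketch (natural-log units; the function names are ours):

```python
import numpy as np

def shannon(p):
    """Shannon entropy -sum p_i log p_i (natural log)."""
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def renyi(p, q):
    """Rényi entropy (1/(1-q)) log sum p_i^q; -> Shannon as q -> 1."""
    return np.log(np.sum(p ** q)) / (1.0 - q)

def tsallis(p, q):
    """Tsallis entropy (1 - sum p_i^q) / (q - 1); -> Shannon as q -> 1."""
    return (1.0 - np.sum(p ** q)) / (q - 1.0)

p = np.array([0.5, 0.25, 0.25])
print(shannon(p))        # 1.0397...
print(renyi(p, 0.999))   # close to the Shannon value
print(tsallis(p, 2))     # 1 - sum p_i^2 = 0.625
```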
Contributions to anomaly detection and correction in co-evolving data streams via subspace learning
For decades, estimation and detection tasks in many Signal Processing and Communications applications have been significantly improved by subspace and component-based techniques. More recently, subspace methods have been adopted in hot topics such as Machine Learning, Data Analytics, and smart MIMO communications, in order to give the problem a geometric interpretation. In this way, subspace-based algorithms often give rise to new approaches to already-explored problems, while offering the valuable advantage of making the procedures and solutions interpretable. On the other hand, in these recent topics one may also find applications where detecting unwanted or out-of-model artifacts and outliers is crucial. To this end, our previous work in the domain of GNSS PPP, detecting phase ambiguities, motivated the development of novel solutions for this application. After considering the applications and advantages of subspace-based approaches, this work focuses on exploring and extending the ideas of subspace learning in the context of anomaly detection, where we show promising and original results in the form of two new algorithms: the Dual Ascent for Sparse Anomaly Detection and the Subspace-based Dual Ascent for Anomaly Detection and Tracking.
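The core idea behind subspace-based anomaly detection can be illustrated with a generic PCA-residual detector: learn the signal subspace from the data, project each sample onto the orthogonal complement, and flag samples with abnormally large residual energy. This is a minimal sketch of the general principle, not the thesis's Dual Ascent algorithms:

```python
import numpy as np

rng = np.random.default_rng(2)

# Data lying near a low-dimensional subspace, with two injected anomalies.
n, d, r = 200, 10, 2
basis = rng.normal(size=(d, r))
X = rng.normal(size=(n, r)) @ basis.T + 0.01 * rng.normal(size=(n, d))
X[[5, 50]] += 3.0 * rng.normal(size=(2, d))

# Learn the signal subspace from the top-r principal directions.
U, _, _ = np.linalg.svd(X.T @ X)
P_residual = np.eye(d) - U[:, :r] @ U[:, :r].T   # projector onto residual subspace

# Score each sample by its residual energy; flag the largest scores.
score = np.linalg.norm(X @ P_residual, axis=1)
flagged = np.argsort(score)[-2:]
print(sorted(int(i) for i in flagged))   # recovers the two injected anomalies
```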
Estimation of Disaggregated Indicators with Application to the Household Finance and Consumption Survey
International institutions and national statistical institutes are increasingly expected to report disaggregated indicators, i.e., means, ratios or Gini coefficients for different regional levels, socio-demographic groups or other subpopulations. These subpopulations are called areas or domains in this thesis. The data sources that are used to estimate these disaggregated indicators are mostly national surveys which may have small sample sizes for the domains of interest. Therefore, direct estimates that are based only on the survey data might be unreliable. To overcome this problem, small area estimation (SAE) methods help to increase the precision of survey-based estimates without demanding larger and more costly surveys. In SAE, the collected survey data is combined with other data sources, e.g., administrative and register data or data that is a by-product of digital activities.
The data requirements for various SAE methods depend to a large extent on whether the indicator of interest is a linear or non-linear function of a quantitative variable. For the estimation of linear indicators, e.g., the mean, aggregated data is sufficient; that is, direct estimates and auxiliary information from other data sources only need to be available for each domain. One popular area-level approach in this context is the Fay-Herriot model, which is studied in Part 1 of this work. In Chapter 1, the Fay-Herriot model is used to estimate the regional distribution of mean household net wealth in Germany. The analysis is based on the Household Finance and Consumption Survey (HFCS) that was launched by the European Central Bank and several statistical institutes in 2010. The main challenge of applying the Fay-Herriot approach in this context is to handle the issues arising from the data: a) the skewness of the wealth distribution, b) informative weights due to, among others, unit non-response, and c) multiple imputation to deal with item non-response. For the latter, a modified Fay-Herriot model that accounts for the additional uncertainty due to multiple imputation is proposed in this thesis. It is combined with known solutions for the other two issues and applied to estimate mean net wealth at low regional levels.
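The standard Fay-Herriot EBLUP shrinks each domain's direct estimate toward a regression-synthetic estimate, with more shrinkage for noisier domains. A minimal sketch of the standard model only (the thesis's multiple-imputation modification is not reproduced; the helper name is ours):

```python
import numpy as np

def fay_herriot_eblup(y_direct, X, psi, sigma2_u):
    """EBLUP under the Fay-Herriot area-level model
        y_d = x_d' beta + u_d + e_d,  u_d ~ N(0, sigma2_u), e_d ~ N(0, psi_d).
    Shrinks each direct estimate toward the synthetic estimate x_d' beta
    with weight gamma_d = sigma2_u / (sigma2_u + psi_d)."""
    w = 1.0 / (sigma2_u + psi)                       # GLS weights
    beta = np.linalg.solve((X * w[:, None]).T @ X,
                           (X * w[:, None]).T @ y_direct)
    gamma = sigma2_u / (sigma2_u + psi)
    return gamma * y_direct + (1.0 - gamma) * (X @ beta)

# Tiny illustration: 4 domains, intercept-only model; the last direct
# estimate has a large sampling variance and is shrunk strongly.
y = np.array([10.0, 12.0, 8.0, 30.0])
X = np.ones((4, 1))
psi = np.array([1.0, 1.0, 1.0, 25.0])
print(fay_herriot_eblup(y, X, psi, sigma2_u=4.0))
```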
The Deutsche Bundesbank, which is responsible for reporting the wealth distribution in Germany, and many economic institutes predominantly work with the statistical software Stata. In order to provide the Fay-Herriot model and its extensions used in Chapter 1, a new Stata command called fayherriot is programmed in the context of this thesis to make the approach available for practitioners. Chapter 2 describes the functionality of the command with an application to income data from the Socio-Economic Panel, one of the largest panel surveys in Germany. The example application demonstrates how the Fay-Herriot approach helps to increase the reliability of estimates for mean household income compared to direct estimates at three different regional levels.
Extending beyond linear indicators, Part 2 deals with the estimation of non-linear income and wealth indicators. Since the mean is sensitive to outliers, the median and other quantiles are also of interest when estimating the income or wealth distribution. As a first approach, this thesis focuses on the direct estimation of quantiles, which is not as straightforward as for the mean. In Chapter 3, common quantile definitions implemented in standard statistical software are empirically evaluated, based on income and wealth distributions, with regard to their bias. The analysis shows that, especially for wealth data, which is typically heavily skewed, sample sizes need to be large in order to obtain unbiased direct estimates with the common quantile definitions.
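The dependence on the quantile definition is easy to reproduce: for the same small, skewed sample, different definitions return visibly different estimates. A minimal sketch using NumPy's `method` argument on a log-normal "wealth-like" sample (the specific distribution parameters are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)

# Heavily skewed wealth-like population: log-normal with known 0.9-quantile
# exp(mu + sigma * z_0.9), where z_0.9 is the standard normal 0.9-quantile.
true_q90 = np.exp(10.0 + 1.5 * 1.2815515655446004)

# Small direct sample: the quantile definition matters.
sample = rng.lognormal(mean=10.0, sigma=1.5, size=30)
for method in ["inverted_cdf", "linear", "median_unbiased"]:
    print(method, np.quantile(sample, 0.9, method=method))
print("true 0.9-quantile", true_q90)
```

With only 30 observations, the three definitions interpolate between different order statistics and so disagree; the gaps shrink as the sample size grows.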
Since a design-unbiased direct estimator is one assumption of the aforementioned Fay-Herriot model, further research would be necessary in order to use the Fay-Herriot approach for the estimation of quantiles when the underlying data is heavily skewed. More common methods for producing reliable estimates for non-linear indicators -- including quantiles, poverty indicators, and inequality indicators such as the Gini coefficient -- in small domains are unit-level SAE methods. However, for these methods the data requirements are more restrictive: both the survey data and the auxiliary data need to be available for each unit in each domain. Among others, the empirical best prediction (EBP), the World Bank method, and the M-Quantile approach are well-known methods for the estimation of non-linear indicators in small domains. However, these methods are either not available in statistical software or have limited user-friendliness. Therefore, in this work the R package emdi is developed, which focuses on a user-friendly application of the EBP. Chapter 4 describes how the package emdi supports the user beyond the estimation with tools for assessing and presenting the results.
Both area- and unit-level SAE models are based on linear mixed regression models that rely on a set of assumptions, particularly the linearity and normality of the error terms. If these assumptions are not fulfilled, transforming the response variable is one possible solution. Therefore, Part 3 provides a guideline for the usage of transformations. Chapter 5 gives an extensive overview of different transformations applicable in linear and linear mixed regression models and discusses practical challenges. The implementation of various transformations and estimation methods for transformation parameters is provided by the R package trafo, which is described in Chapter 6.
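The workhorse of such response transformations is the Box-Cox family, whose parameter λ is typically chosen by maximizing a profile log-likelihood under normality of the transformed data. A minimal sketch (grid search instead of the estimation methods in trafo; the data are illustrative):

```python
import numpy as np

def boxcox(y, lam):
    """Box-Cox transformation: (y^lam - 1)/lam for lam != 0, log(y) for lam == 0."""
    return np.log(y) if lam == 0 else (y ** lam - 1.0) / lam

def boxcox_loglik(y, lam):
    """Profile log-likelihood of lambda, assuming normality of the transformed data:
    -n/2 log(var(z)) + (lam - 1) * sum(log y), up to an additive constant."""
    z = boxcox(y, lam)
    return -0.5 * len(y) * np.log(np.var(z)) + (lam - 1.0) * np.sum(np.log(y))

rng = np.random.default_rng(4)
y = rng.lognormal(size=500)   # log-normal response: lambda = 0 (the log) is ideal
lams = np.linspace(-1.0, 1.0, 201)
best = lams[np.argmax([boxcox_loglik(y, lm) for lm in lams])]
print(best)   # close to 0
```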
Altogether, this work contributes to the literature by
a) combining SAE and multiple imputation proposing a modified Fay-Herriot approach,
b) showing the limitations of existing quantile definitions with regard to bias when data is skewed and the sample size is small,
c) closing the gap between academic research and practical applications by providing user-friendly software for the estimation of linear and non-linear indicators, and
d) giving a framework for the usage of transformations in linear and linear mixed regression models.
Advances in approximate Bayesian computation and trans-dimensional sampling methodology
Bayesian statistical models continue to grow in complexity, driven
in part by a few key factors: the massive computational resources
now available to statisticians; the substantial gains made in
sampling methodology and algorithms such as Markov chain
Monte Carlo (MCMC), trans-dimensional MCMC (TDMCMC), sequential
Monte Carlo (SMC), adaptive algorithms, stochastic
approximation methods, and approximate Bayesian computation (ABC);
and the development of more realistic models for real-world phenomena
as demonstrated in this thesis for financial models and
telecommunications engineering. Sophisticated statistical models
are increasingly proposed as practical solutions to real-world problems in order to better capture salient features of
ever more complex data. With this sophistication comes a
parallel requirement for more advanced and automated statistical
computational methodologies.
The key focus of this thesis revolves around innovation related to
the following three significant Bayesian research questions.
1. How can one develop practically useful Bayesian models and corresponding computationally efficient sampling methodology, when the likelihood model is intractable?
2. How can one develop methodology in order to automate Markov chain Monte Carlo sampling approaches to efficiently explore the support of a posterior distribution, defined across multiple Bayesian statistical models?
3. How can these sophisticated Bayesian modelling frameworks and sampling methodologies be utilized to solve practically relevant and important problems in the research fields of financial risk modeling and telecommunications engineering?
This thesis is split into three bodies of work represented in
three parts. Each part contains journal papers with novel
statistical model and sampling methodological development. The
coherent link between each part involves the novel
sampling methodologies developed in Part I and utilized in Part II and Part III. Papers contained in
each part make progress at addressing the core research
questions posed.
Part I of this thesis presents generally applicable key
statistical sampling methodologies that will be utilized and
extended in the subsequent two parts. In particular it presents
novel developments in statistical methodology pertaining to
likelihood-free or ABC and TDMCMC methodology.
The TDMCMC methodology focuses on several aspects of automating
the between-model proposal construction, including
approximation of the optimal between-model proposal kernel via a
conditional path-sampling density estimator. Then this methodology
is explored for several novel Bayesian model selection
applications, including cointegrated vector autoregression (CVAR)
models and mixture models in which there is an unknown number of
mixture components. The second area relates to the development of
ABC methodology with particular focus
on SMC Samplers methodology in an ABC context via Partial
Rejection Control (PRC). In addition to novel algorithmic
development, key theoretical properties are also studied for the
classes of algorithms developed. Then this methodology is
developed for a highly challenging practically significant
application relating to multivariate Bayesian α-stable
models.
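The starting point for all likelihood-free methods discussed above is basic ABC rejection: propose parameters from the prior, simulate data, and keep proposals whose summary statistic lands within a tolerance ε of the observed summary. A minimal sketch of that baseline (not the SMC-with-PRC samplers developed in this part; the toy model is ours):

```python
import numpy as np

rng = np.random.default_rng(5)

def abc_rejection(y_obs, prior_sample, simulate, summary, eps, n_prop=20_000):
    """Basic ABC rejection sampler: keep parameters whose simulated summary
    lies within eps of the observed summary."""
    s_obs = summary(y_obs)
    kept = []
    for _ in range(n_prop):
        theta = prior_sample()
        if abs(summary(simulate(theta)) - s_obs) < eps:
            kept.append(theta)
    return np.array(kept)

# Toy example: infer the mean of a normal with known unit variance.
y_obs = rng.normal(loc=2.0, scale=1.0, size=50)
post = abc_rejection(
    y_obs,
    prior_sample=lambda: rng.uniform(-10.0, 10.0),
    simulate=lambda th: rng.normal(loc=th, scale=1.0, size=50),
    summary=np.mean,
    eps=0.1,
)
print(post.mean())   # close to the sample mean of y_obs, i.e. about 2
```

The inefficiency of this baseline (most proposals are rejected) is precisely what the SMC-based ABC methodology with partial rejection control is designed to mitigate.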
Then Part II focuses on novel statistical model development
in the areas of financial risk and non-life insurance claims
reserving. In each of the papers in this part the focus is on
two aspects: foremost the development of novel statistical models
to improve the modeling of risk and insurance; and then the
associated problem of how to fit and sample from such statistical
models efficiently. In particular novel statistical models are
developed for Operational Risk (OpRisk) under a Loss Distributional
Approach (LDA) and for claims reserving in Actuarial non-life
insurance modelling. In each case the models developed include an
additional level of complexity which adds flexibility to the model
in order to better capture salient features observed in real data.
The additional complexity comes at the cost that standard
fitting and sampling methodologies are generally not applicable;
as a result, one is required to develop and apply the
methodology from Part I.
Part III focuses on novel statistical model development
in the area of statistical signal processing for wireless
communications engineering. Statistical models will be developed
or extended for two general classes of wireless communications
problem: the first relates to detection of transmitted symbols and
joint channel estimation in Multiple Input Multiple Output (MIMO)
systems coupled with Orthogonal Frequency Division Multiplexing
(OFDM); the second relates to co-operative wireless communications
relay systems in which the key focus is on detection of
transmitted symbols. Both these areas will require advanced
sampling methodology developed in Part I to find solutions to
these real-world engineering problems.