
    Performance Bounds for Parameter Estimation under Misspecified Models: Fundamental Findings and Applications

    Inferring information from a set of acquired data is the main objective of any signal processing (SP) method. In particular, the common problem of estimating the value of a vector of parameters from a set of noisy measurements is at the core of a plethora of scientific and technological advances of the last decades, for example in wireless communications, radar and sonar, biomedicine, image processing, and seismology, to name just a few. Developing an estimation algorithm often begins by assuming a statistical model for the measured data, i.e. a probability density function (pdf) which, if correct, fully characterizes the behaviour of the collected data/measurements. Experience with real data, however, often exposes the limitations of any assumed data model, since modelling errors at some level are always present. Consequently, the true data model and the model assumed to derive the estimation algorithm may differ. When this happens, the model is said to be mismatched or misspecified. Therefore, understanding the possible performance loss or regret that an estimation algorithm could experience under model misspecification is of crucial importance for any SP practitioner, as is understanding the limits on the performance of any estimator subject to model misspecification. Motivated by this widespread and practical need to assess the performance of a mismatched estimator, the goal of this paper is first to bring attention to the main theoretical findings on estimation theory, and in particular on lower bounds under model misspecification, that have been published in the statistical and econometric literature in the last fifty years. Secondly, some applications are discussed to illustrate the broad range of areas and problems to which this framework extends, and consequently the numerous opportunities available for SP researchers. Comment: To appear in the IEEE Signal Processing Magazine.
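    As a toy numerical illustration of the mismatch setting described above (not an example taken from the paper), the following sketch fits a Gaussian model by maximum likelihood to Laplace-distributed data; the estimates concentrate around the pseudo-true parameters that minimize the Kullback-Leibler divergence between the true and assumed pdfs, rather than around any "correct" Gaussian parameters. The distributions and sample sizes are illustrative assumptions.

```python
# Minimal sketch: maximum-likelihood estimation under a misspecified model.
# The data are Laplace-distributed, but the estimator assumes a Gaussian pdf.
# The Gaussian MLE converges to the pseudo-true parameters that minimize the
# KL divergence from the true pdf to the assumed family
# (here: mean = Laplace location, variance = 2 * b**2).
import numpy as np

rng = np.random.default_rng(0)
loc, b = 1.0, 2.0                      # true Laplace location and scale
pseudo_true_var = 2 * b**2             # KL-optimal Gaussian variance

n_trials, n = 500, 2000
mean_hat = np.empty(n_trials)
var_hat = np.empty(n_trials)
for t in range(n_trials):
    x = rng.laplace(loc, b, size=n)    # true data-generating model
    mean_hat[t] = x.mean()             # Gaussian MLE of the mean
    var_hat[t] = x.var()               # Gaussian MLE of the variance

print("avg. estimated mean:", mean_hat.mean(), "pseudo-true:", loc)
print("avg. estimated var :", var_hat.mean(), "pseudo-true:", pseudo_true_var)
```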

    Direction finding with partly calibrated uniform linear arrays

    A new method for direction finding with partly calibrated uniform linear arrays (ULAs) is presented. It is based on the conventional estimation of signal parameters via rotational invariance techniques (ESPRIT), modeling the imperfections of the ULA as gain and phase uncertainties. For a fully calibrated array, it reduces to the conventional ESPRIT algorithm. Moreover, the directions of arrival (DOAs) and the unknown gains and phases of the uncalibrated sensors can be estimated in closed form without performing a spectral search; hence, the method is computationally very attractive. The Cramér-Rao bounds (CRBs) for partly calibrated ULAs are also given. Simulation results show that the root mean squared error (RMSE) performance of the proposed algorithm is better than that of conventional methods when the number of uncalibrated sensors is large. It also achieves satisfactory performance even at low signal-to-noise ratios (SNRs). © 2011 IEEE.
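    For context, here is a minimal sketch of the conventional ESPRIT algorithm on a fully calibrated ULA, i.e. the special case to which the proposed method reduces. The half-wavelength spacing, subarray split, and simulation parameters are illustrative assumptions, not values from the paper.

```python
# Minimal sketch of conventional ESPRIT on a fully calibrated ULA
# (not the partly calibrated variant of the paper). Half-wavelength
# element spacing is assumed.
import numpy as np

rng = np.random.default_rng(1)
M, K, N, snr_db = 10, 2, 200, 10           # sensors, sources, snapshots, SNR
theta_true = np.deg2rad([-12.0, 25.0])     # true DOAs

# Array steering matrix and noisy snapshots
A = np.exp(1j * np.pi * np.outer(np.arange(M), np.sin(theta_true)))
S = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2)
noise_std = 10 ** (-snr_db / 20)
X = A @ S + noise_std * (rng.standard_normal((M, N)) +
                         1j * rng.standard_normal((M, N))) / np.sqrt(2)

# Signal subspace from the sample covariance matrix
R = X @ X.conj().T / N
eigval, eigvec = np.linalg.eigh(R)
Es = eigvec[:, -K:]                        # K dominant eigenvectors

# Rotational invariance between the two overlapping subarrays
Psi = np.linalg.pinv(Es[:-1]) @ Es[1:]
phases = np.angle(np.linalg.eigvals(Psi))
theta_hat = np.arcsin(phases / np.pi)

print("estimated DOAs (deg):", np.sort(np.rad2deg(theta_hat)))
```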

    A Spline LR Test for Goodness-of-Fit

    Keywords: goodness-of-fit tests, nuisance parameters, cubic spline, Neyman smooth test, Lagrange multiplier test, stable distributions, Student t distributions

    Inverse Probability Weighted Generalised Empirical Likelihood Estimators: Firm Size and R&D Revisited

    The inverse probability weighted Generalised Empirical Likelihood (IPW-GEL) estimator is proposed for the estimation of the parameters of a vector of possibly non-linear unconditional moment functions in the presence of conditionally independent sample selection or attrition. The estimator is applied to the estimation of the firm size elasticity of product and process R&D expenditures using a panel of German manufacturing firms, which is affected by attrition and selection into R&D activities. IPW-GEL and IPW-GMM estimators are compared in this application, as are identification assumptions based on independent and conditionally independent sample selection. The results are similar in all specifications. Keywords: research and development; generalised empirical likelihood; inverse probability weighting; propensity score; conditional independence; missing at random; selection; attrition
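    To illustrate the basic building block of such estimators (a generic sketch, not the paper's application or its GEL machinery), the code below estimates a population mean from a selectively observed sample: selection is conditionally independent of the outcome given a covariate, the propensity score is estimated by logistic regression, and the selected observations are reweighted by the inverse of the estimated selection probability. All variable names and parameter values are hypothetical.

```python
# Hedged sketch: inverse probability weighting for the moment condition
# E[y - mu] = 0 when selection depends on a covariate z that is also
# correlated with y. The complete-case mean is biased; weighting each
# selected observation by 1/p_hat(z) removes the bias.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n = 20000
z = rng.standard_normal(n)
y = 1.0 + 0.5 * z + rng.standard_normal(n)      # true mean of y is 1.0

# Selection is conditionally independent of y given z (missing at random)
p_true = 1 / (1 + np.exp(-(0.5 + 1.5 * z)))
s = rng.random(n) < p_true

# Propensity score estimated by logistic regression on z
p_hat = LogisticRegression().fit(z[:, None], s).predict_proba(z[:, None])[:, 1]

w = s / p_hat                                   # inverse probability weights
mean_cc = y[s].mean()                           # complete-case mean (biased)
mean_ipw = np.sum(w * y) / np.sum(w)            # IPW (Hajek) estimate

print(f"complete-case mean: {mean_cc:.3f}, IPW mean: {mean_ipw:.3f}, true: 1.0")
```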

    An Information Fusion Perspective

    A fundamental issue concerning the effectiveness of the Bayesian filter is raised. The observation-only (O2) inference is presented for dynamic state estimation. The "probability of filter benefit" is defined and quantitatively analyzed. Convincing simulations demonstrate that many filters can easily be ineffective. The general solution for dynamic state estimation is to model the system as a hidden Markov process and then employ a recursive estimator of the prediction-correction format (of which the best known is the Bayesian filter) to statistically fuse the time-series observations via the models. The performance of the estimator greatly depends on the quality of the statistical model assumed. In contrast, this paper presents a modeling-free solution, referred to as the observation-only (O2) inference, which infers the state directly from the observations. A Monte Carlo sampling approach is correspondingly proposed for unbiased nonlinear O2 inference. In addition to its faster computation, the O2 inference provides a benchmark for assessing the effectiveness of conventional recursive estimators, where an estimator is defined as effective only when it outperforms, on average, the O2 inference (when the latter is applicable). It is quantitatively demonstrated, from the perspective of information fusion, that "biased" prior information (which inevitably accompanies inaccurate modelling) can be counterproductive for a filter, resulting in an ineffective estimator. Classic state-space models show that a variety of Kalman filters and particle filters can easily be ineffective (inferior to the O2 inference) in certain situations, although this has been somewhat overlooked in the literature.
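    A minimal sketch of the kind of comparison described above is given below, under illustrative assumptions (scalar linear-Gaussian model, arbitrary parameter values, identity observation matrix): a Kalman filter built on a misspecified process model is compared with the observation-only estimate obtained by inverting the observation model, which here is just the raw measurement.

```python
# Hedged sketch: Kalman filter with a wrong process model vs. the
# "observation-only" (O2) estimate. With H = 1, the O2 estimate is simply
# the measurement itself; the misspecified filter can do worse.
import numpy as np

rng = np.random.default_rng(3)
T = 2000
a_true, q_true, r = 0.98, 0.5, 0.2        # true dynamics and noise variances
a_assumed, q_assumed = 0.5, 0.01          # misspecified model used by the filter

# Simulate the true state and the measurements y_k = x_k + v_k
x = np.zeros(T)
for k in range(1, T):
    x[k] = a_true * x[k - 1] + np.sqrt(q_true) * rng.standard_normal()
y = x + np.sqrt(r) * rng.standard_normal(T)

# Kalman filter under the (wrong) assumed model
xk, P = 0.0, 1.0
kf_est = np.zeros(T)
for k in range(T):
    xk, P = a_assumed * xk, a_assumed**2 * P + q_assumed      # predict
    K = P / (P + r)                                           # Kalman gain
    xk, P = xk + K * (y[k] - xk), (1 - K) * P                 # correct
    kf_est[k] = xk

o2_est = y                                                    # O2: invert H = 1
print("KF RMSE :", np.sqrt(np.mean((kf_est - x) ** 2)))
print("O2 RMSE :", np.sqrt(np.mean((o2_est - x) ** 2)))
```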

    Optimization of Model Parameters, Quantification of Uncertainties, and Experimental Design in Climate Research

    Several methods for the optimization of model parameters, uncertainty quantification, and uncertainty reduction by optimal experimental design are studied and applied to models of different computational complexity from climate research. The generalized least squares estimator and its special cases, the weighted and the ordinary least squares estimators, are described in detail together with their statistical properties. They are applied to several models using the SQP algorithm, a derivative-based local optimization algorithm, in combination with the OQNLP algorithm, a globalization algorithm. This combination is shown to find model parameters that fit the measurement data well with few function evaluations, which is especially important for computationally expensive models. The uncertainty in the estimated model parameters implied by the uncertainty in the measurement data, as well as the resulting uncertainty in the model output, is quantified in several ways using the first and second derivatives of the model with respect to its parameters. The advantages and disadvantages of the different methods are highlighted. The reduction of uncertainty by additional measurements is predicted using optimal experimental design methods. It is determined how many measurements are advisable and how their conditions, such as time, location, and which process to measure, should be chosen for an optimal uncertainty reduction. Robustification approaches, such as sequential optimal experimental design and approximate worst-case experimental designs, are used to mitigate the dependency of the predictions on the model parameter estimate. A detailed statistical description of the measurements is important for the applied methods. Therefore, a statistical analysis of millions of marine measurement data points is carried out. The climatological means, the variabilities, split into climatological and short-scale variabilities, and the correlations are estimated from the data. The associated probability distributions are examined for normality and log-normality using statistical testing and visual inspection. To determine the correlations, an algorithm was developed that generates valid correlation matrices, i.e., positive semidefinite matrices with ones as diagonal values, from estimated correlation matrices. The algorithm tries to keep the changes as small as possible and to achieve a matrix with a low condition number. Its (worst-case) execution time and memory consumption are asymptotically equal to those of the fastest algorithms for checking positive semidefiniteness, making the algorithm applicable to large matrices. It is also suitable for sparse matrices because it preserves sparsity patterns. In addition to statistics, it can also be useful in numerical optimization. In the context of this thesis, several software packages were developed or extended; they are freely available as open source and extensively tested. The results obtained from the models and data help to improve the understanding of the underlying processes. The applied methods are not limited to the application examples used here and can be applied to many data sets and models in climate research and beyond.
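    The correlation-matrix repair step mentioned above can be illustrated with a much simpler projection than the thesis's algorithm: clip negative eigenvalues of the estimated matrix and rescale the diagonal back to ones. The sketch below is only that naive baseline, under those assumptions; it does not attempt the thesis's goals of minimal change, low condition number, or sparsity preservation.

```python
# Hedged sketch (NOT the thesis's algorithm): turn an estimated correlation
# matrix into a valid one by clipping negative eigenvalues and rescaling
# the diagonal to ones.
import numpy as np

def make_valid_correlation(C, eps=1e-8):
    """Return a symmetric positive semidefinite matrix with unit diagonal."""
    C = 0.5 * (C + C.T)                         # symmetrize
    w, V = np.linalg.eigh(C)
    w = np.clip(w, eps, None)                   # clip negative eigenvalues
    A = (V * w) @ V.T                           # reassemble V diag(w) V^T
    d = np.sqrt(np.diag(A))
    return A / np.outer(d, d)                   # rescale diagonal to ones

# Example: an "estimated" matrix that is not positive semidefinite
C_est = np.array([[ 1.0, 0.9, -0.9],
                  [ 0.9, 1.0,  0.9],
                  [-0.9, 0.9,  1.0]])
C_valid = make_valid_correlation(C_est)
print(np.linalg.eigvalsh(C_est))                # contains a negative eigenvalue
print(np.linalg.eigvalsh(C_valid))              # all non-negative
print(np.diag(C_valid))                         # ones on the diagonal
```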

    Weight Adjustment Methods and Their Impact on Sample-based Inference

    Weighting samples is important to reflect not only sample design decisions made at the planning stage, but also practical issues that arise during data collection and cleaning and necessitate weighting adjustments. Adjustments to base weights are used to account for these planned and unplanned eventualities. Often these adjustments lead to variations of the survey weights from the original selection weights (i.e., the weights based solely on the sample units' probabilities of selection). Large variation in survey weights can cause inferential problems for data users. A few extremely large weights in a sample dataset can produce unreasonably large national- and domain-level estimates and variance estimates in particular samples, even when the estimators are unbiased over many samples. Design-based and model-based methods have been developed to adjust such extreme weights; both approaches aim to trim weights such that the overall mean square error (MSE) is lowered, by decreasing the variance more than the squared bias increases. Design-based methods tend to be ad hoc, while Bayesian model-based methods account for population structure but can be computationally demanding. I present three research papers that expand the current weight trimming approaches with the goal of developing a broader framework that fills gaps in and improves upon the existing alternatives. The first paper proposes more in-depth investigations of, and extensions to, a newly developed method called generalized design-based inference, where we condition on the realized sample and model the survey weight as a function of the response variables. This method has potential for reducing the MSE of a finite population total estimator in certain circumstances. However, there may be instances where the approach is inappropriate, so this paper includes an in-depth examination of the related theory. The second paper incorporates Bayesian prior assumptions into model-assisted penalized estimators to produce a more efficient yet robust calibration-type estimator. I also evaluate existing variance estimators for the proposed estimator, and comparisons to other estimators from the literature are included. In the third paper, I develop summary- and unit-level diagnostic tools that measure the impact of the variation of weights and of extreme individual weights on survey-based inference. I propose design effects to summarize the impact of variable weights produced by calibration weighting adjustments under single-stage and cluster sampling. A new diagnostic for identifying influential individual points is also introduced in the third paper.
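    As generic background for the design-effect and trimming ideas discussed above (this is not one of the dissertation's proposed estimators or diagnostics), the sketch below computes Kish's design effect due to unequal weighting, deff = 1 + CV^2 of the weights, and applies a simple trim-and-redistribute rule at an arbitrary cutoff; the cutoff choice and weight distribution are illustrative assumptions.

```python
# Hedged sketch: Kish's design effect due to unequal weighting and a simple
# trim-and-redistribute weight adjustment at an arbitrary cutoff.
import numpy as np

def kish_deff(w):
    """Kish design effect: n * sum(w^2) / (sum(w))^2 = 1 + CV^2 of the weights."""
    w = np.asarray(w, dtype=float)
    return len(w) * np.sum(w**2) / np.sum(w)**2

def trim_weights(w, cutoff):
    """Cap weights at `cutoff` and redistribute the excess proportionally
    over the uncapped units so that the weight total is preserved."""
    w = np.asarray(w, dtype=float)
    trimmed = np.minimum(w, cutoff)
    excess = np.sum(w) - np.sum(trimmed)
    under = trimmed < cutoff
    trimmed[under] *= 1 + excess / np.sum(trimmed[under])
    return trimmed

rng = np.random.default_rng(4)
w = np.exp(rng.normal(0.0, 1.0, size=1000))        # skewed, variable weights
w_t = trim_weights(w, cutoff=np.percentile(w, 95))

print("deff before trimming:", round(kish_deff(w), 3))
print("deff after  trimming:", round(kish_deff(w_t), 3))
print("weight totals equal :", np.isclose(w.sum(), w_t.sum()))
```

In practice the trimming threshold is chosen to balance the variance reduction against the bias it introduces, which is exactly the MSE trade-off described in the abstract; the single-pass redistribution above is only the simplest variant.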