
    Joint state-parameter estimation of a nonlinear stochastic energy balance model from sparse noisy data

    While nonlinear stochastic partial differential equations arise naturally in spatiotemporal modeling, inference for such systems often faces two major challenges: sparse noisy data and ill-posedness of the inverse problem of parameter estimation. To overcome these challenges, we introduce a strongly regularized posterior by normalizing the likelihood and by imposing physical constraints through priors of the parameters and states. We investigate joint parameter-state estimation by the regularized posterior in a physically motivated nonlinear stochastic energy balance model (SEBM) for paleoclimate reconstruction. The high-dimensional posterior is sampled by a particle Gibbs sampler that combines MCMC with an optimal particle filter exploiting the structure of the SEBM. In tests using either Gaussian or uniform priors based on the physical range of parameters, the regularized posteriors overcome the ill-posedness and lead to samples within physical ranges, quantifying the uncertainty in estimation. Due to the ill-posedness and the regularization, the posterior of parameters presents a relatively large uncertainty, and consequently, the maximum of the posterior, which is the minimizer in a variational approach, can have a large variation. In contrast, the posterior of states generally concentrates near the truth, substantially filtering out observation noise and reducing uncertainty in the unconstrained SEBM.

    Equitability, mutual information, and the maximal information coefficient

    Reshef et al. recently proposed a new statistical measure, the "maximal information coefficient" (MIC), for quantifying arbitrary dependencies between pairs of stochastic quantities. MIC is based on mutual information, a fundamental quantity in information theory that is widely understood to serve this need. MIC, however, is not an estimate of mutual information. Indeed, it was claimed that MIC possesses a desirable mathematical property called "equitability" that mutual information lacks. This was not proven; instead it was argued solely through the analysis of simulated data. Here we show that this claim, in fact, is incorrect. First we offer mathematical proof that no (non-trivial) dependence measure satisfies the definition of equitability proposed by Reshef et al. We then propose a self-consistent and more general definition of equitability that follows naturally from the Data Processing Inequality. Mutual information satisfies this new definition of equitability while MIC does not. Finally, we show that the simulation evidence offered by Reshef et al. was artifactual. We conclude that estimating mutual information is not only practical for many real-world applications, but also provides a natural solution to the problem of quantifying associations in large data sets.
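
To make concrete what "estimating mutual information" from samples can look like, here is a minimal sketch of a plug-in estimator on equal-width bins. It is not the estimator used in the paper (the abstract does not name one), and the function name `binned_mutual_information`, the bin count, and the synthetic data are all assumptions made for illustration. Plug-in estimates of this kind are biased upward for small samples.

```python
import math
import random
from collections import Counter


def binned_mutual_information(xs, ys, bins=16):
    """Plug-in estimate of I(X;Y) in nats from equal-width binned samples.

    Illustrative only: the name, the binning rule and the default bin
    count are assumptions, not taken from the abstract above.
    """
    def bin_index(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # guard against constant input
        return [min(int((v - lo) / width), bins - 1) for v in values]

    bx, by = bin_index(xs), bin_index(ys)
    n = len(xs)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    # Sum p(x,y) * log(p(x,y) / (p(x) p(y))) over occupied cells only.
    return sum(
        (c / n) * math.log(c * n / (px[i] * py[j]))
        for (i, j), c in joint.items()
    )


# Synthetic demonstration data (hypothetical, not from the paper).
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(5000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(5000)]

mi_independent = binned_mutual_information(x, noise)
# y = x^2 is a noiseless, non-monotonic dependence with zero linear correlation.
mi_parabola = binned_mutual_information(x, [v * v for v in x])
```

For independent inputs the estimate should sit close to zero (up to the small positive bias), while the parabolic relationship, which a linear correlation coefficient would miss, should give a clearly larger value.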

    Estimating the Spot Covariation of Asset Prices - Statistical Theory and Empirical Evidence

    We propose a new estimator for the spot covariance matrix of a multi-dimensional continuous semi-martingale log asset price process which is subject to noise and non-synchronous observations. The estimator is constructed based on a local average of block-wise parametric spectral covariance estimates. The latter originate from a local method of moments (LMM) which has recently been introduced. We prove consistency and a point-wise stable central limit theorem for the proposed spot covariance estimator in a very general setup with stochastic volatility, leverage effects and general noise distributions. Moreover, we extend the LMM estimator to be robust against autocorrelated noise and propose a method to adaptively infer the autocorrelations from the data. Based on simulations we provide empirical guidance on the effective implementation of the estimator and apply it to high-frequency data of a cross-section of Nasdaq blue chip stocks. Employing the estimator to estimate spot covariances, correlations and volatilities in normal but also unusual periods yields novel insights into intraday covariance and correlation dynamics. We show that intraday (co-)variations (i) follow underlying periodicity patterns, (ii) reveal substantial intraday variability associated with (co-)variation risk, and (iii) can increase strongly and nearly instantaneously if new information arrives.

    RascalC: A Jackknife Approach to Estimating Single and Multi-Tracer Galaxy Covariance Matrices

    To make use of clustering statistics from large cosmological surveys, accurate and precise covariance matrices are needed. We present a new code to estimate large scale galaxy two-point correlation function (2PCF) covariances in arbitrary survey geometries that, due to new sampling techniques, runs $\sim 10^4$ times faster than previous codes, computing finely-binned covariance matrices with negligible noise in less than 100 CPU-hours. As in previous works, non-Gaussianity is approximated via a small rescaling of shot-noise in the theoretical model, calibrated by comparing jackknife survey covariances to an associated jackknife model. The flexible code, RascalC, has been publicly released, and automatically takes care of all necessary pre- and post-processing, requiring only a single input dataset (without a prior 2PCF model). Deviations between large scale model covariances from a mock survey and those from a large suite of mocks are found to be indistinguishable from noise. In addition, the choice of input mock is shown to be irrelevant for desired noise levels below $\sim 10^5$ mocks. Coupled with its generalization to multi-tracer data-sets, this shows the algorithm to be an excellent tool for analysis, reducing the need for large numbers of mock simulations to be computed. Comment: 29 pages, 8 figures. Accepted by MNRAS. Code is available at http://github.com/oliverphilcox/RascalC with documentation at http://rascalc.readthedocs.io
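
The abstract refers to jackknife survey covariances without spelling out the construction. As a rough illustration, the sketch below implements the generic textbook delete-one jackknife covariance, $C_{ab} = \frac{n-1}{n}\sum_i (\theta^{(i)}_a - \bar\theta_a)(\theta^{(i)}_b - \bar\theta_b)$, where $\theta^{(i)}$ is the statistic recomputed with region $i$ left out. It is not RascalC's implementation and does not use its API; the names `jackknife_covariance` and `column_means` and the toy data are assumptions for illustration, and region weighting is ignored.

```python
def jackknife_covariance(samples, statistic):
    """Delete-one jackknife covariance of a vector-valued statistic.

    samples   : list of per-region data items (hypothetical layout)
    statistic : function mapping a list of items to a list of floats
    Generic unweighted estimator, not RascalC's own procedure.
    """
    n = len(samples)
    # Recompute the statistic with each region left out in turn.
    leave_one_out = [statistic(samples[:i] + samples[i + 1:]) for i in range(n)]
    dim = len(leave_one_out[0])
    mean = [sum(t[k] for t in leave_one_out) / n for k in range(dim)]
    factor = (n - 1) / n
    return [
        [
            factor * sum((t[a] - mean[a]) * (t[b] - mean[b]) for t in leave_one_out)
            for b in range(dim)
        ]
        for a in range(dim)
    ]


def column_means(rows):
    """Toy statistic: the mean of each column across the retained regions."""
    return [sum(col) / len(rows) for col in zip(*rows)]


# Four made-up "regions", each contributing a two-bin measurement.
regions = [[1.0, 2.0], [2.0, 1.0], [4.0, 3.0], [3.0, 5.0]]
cov = jackknife_covariance(regions, column_means)
```

When the statistic is a plain mean, as in this toy case, the jackknife result coincides with the sample covariance divided by the number of regions, which gives a convenient sanity check.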

    Statistical Properties of Galactic Starlight Polarization

    We present a statistical analysis of Galactic interstellar polarization from the largest compilation available of starlight data. The data comprises ~ 9300 stars of which we have selected ~ 5500 for our analysis. We find a nearly linear growth of mean polarization degree with extinction. The amplitude of this correlation shows that interstellar grains are not fully aligned with the Galactic magnetic field, which can be interpreted as the effect of a large random component of the field. In agreement with earlier studies of more limited scope, we estimate the ratio of the uniform to the random plane-of-the-sky components of the magnetic field to be B_u/B_r = 0.8. Moreover, a clear correlation exists between polarization degree and polarization angle, which provides evidence that the magnetic field geometry follows Galactic structures on large scales. The angular power spectrum C_l of the starlight polarization degree for Galactic plane data (|b| < 10 deg) is consistent with a power-law, C_l ~ l^{-1.5} (where l ~ 180 deg/\theta is the multipole order), for all angular scales \theta > 10 arcmin. An investigation of sparse and inhomogeneous sampling of the data shows that the starlight data analyzed traces an underlying polarized continuum that has the same power spectrum slope, C_l ~ l^{-1.5}. Our findings suggest that starlight data can be safely used for the modeling of Galactic polarized continuum emission at other wavelengths. Comment: 31 pages, 11 figures. Minor corrections and some clarifications included. Matches version accepted for publication by the Astrophysical Journal

    Cosmic shear analysis of archival HST/ACS data: I. Comparison of early ACS pure parallel data to the HST/GEMS Survey

    This is the first paper of a series describing our measurement of weak lensing by large-scale structure using archival observations from the Advanced Camera for Surveys (ACS) on board the Hubble Space Telescope (HST). In this work we present results from a pilot study testing the capabilities of the ACS for cosmic shear measurements with early parallel observations and presenting a re-analysis of HST/ACS data from the GEMS survey and the GOODS observations of the Chandra Deep Field South (CDFS). We describe our new correction scheme for the time-dependent ACS PSF based on observations of stellar fields. This is currently the only technique which takes the full time variation of the PSF between individual ACS exposures into account. We estimate that our PSF correction scheme reduces the systematic contribution to the shear correlation functions due to PSF distortions to < 2*10^{-6} for galaxy fields containing at least 10 stars. We perform a number of diagnostic tests indicating that the remaining level of systematics is consistent with zero for the GEMS and GOODS data, confirming the success of our PSF correction scheme. For the parallel data we detect a low level of remaining systematics which we interpret to be caused by a lack of sufficient dithering of the data. Combining the shear estimate of the GEMS and GOODS observations using 96 galaxies arcmin^{-2} with the photometric redshift catalogue of the GOODS-MUSIC sample, we determine a local single field estimate for the mass power spectrum normalisation sigma_{8,CDFS}=0.52^{+0.11}_{-0.15} (stat) +/- 0.07 (sys) (68% confidence assuming Gaussian cosmic variance) at fixed Omega_m=0.3 for a LambdaCDM cosmology. We interpret this exceptionally low estimate to be due to a local under-density of the foreground structures in the CDFS. Comment: Version accepted for publication in Astronomy & Astrophysics with 28 pages, 25 figures. A version with full resolution figures can be downloaded from http://www.astro.uni-bonn.de/~schrabba/papers/cosmic_shear_acs1_v2.pd

    FASTLens (FAst STatistics for weak Lensing) : Fast method for Weak Lensing Statistics and map making

    With increasingly large data sets, weak lensing measurements are able to measure cosmological parameters with ever greater precision. However, this increased accuracy also places greater demands on the statistical tools used to extract the available information. To date, the majority of lensing analyses use the two-point statistics of the cosmic shear field. These can either be studied directly using the two-point correlation function, or in Fourier space, using the power spectrum. But analyzing weak lensing data inevitably involves the masking out of regions, for example to remove bright stars from the field. Masking out the stars is common practice but the gaps in the data need proper handling. In this paper, we show how an inpainting technique allows us to properly fill in these gaps with only $N \log N$ operations, leading to a new image from which we can compute straightforwardly and with a very good accuracy both the power spectrum and the bispectrum. We then propose a new method to compute the bispectrum with a polar FFT algorithm, which has the main advantage of avoiding any interpolation in the Fourier domain. Finally we propose a new method for dark matter mass map reconstruction from shear observations which integrates this new inpainting concept. A range of examples based on 3D N-body simulations illustrates the results. Comment: Final version accepted by MNRAS. The FASTLens software is available from the following link : http://irfu.cea.fr/Ast/fastlens.software.ph

    Transform-based particle filtering for elliptic Bayesian inverse problems

    We introduce optimal transport based resampling in adaptive SMC. We consider elliptic inverse problems of inferring hydraulic conductivity from pressure measurements. We consider two parametrizations of hydraulic conductivity: by a Gaussian random field, and by a set of scalar (non-)Gaussian distributed parameters and Gaussian random fields. We show that for scalar parameters optimal transport based SMC performs comparably to monomial based SMC, but for Gaussian high-dimensional random fields optimal transport based SMC outperforms monomial based SMC. When comparing to ensemble Kalman inversion with mutation (EKI), we observe that for Gaussian random fields, optimal transport based SMC gives comparable or worse performance than EKI depending on the complexity of the parametrization. For non-Gaussian distributed parameters optimal transport based SMC outperforms EKI.