34 research outputs found

    Mining for cosmological information: Simulation-based methods for Redshift Space Distortions and Galaxy Clustering

    The standard model of cosmology describes the complex large-scale structure of the Universe with fewer than 10 free parameters. However, concordance with observations requires that about 95% of the energy content of the Universe is invisible to us. Most of this energy is postulated to be in the form of a cosmological constant, Λ, which drives the observed accelerated expansion of the Universe. Its nature is, however, unknown. This mystery forces cosmologists to look for inconsistencies between theory and data, searching for clues. But finding statistically significant contradictions requires extremely accurate measurements of the composition of the Universe, which are at present limited by our inability to extract all the information contained in the data, rather than by the data itself. In this Thesis, we study how we can overcome these limitations by i) modelling how galaxies cluster on small scales, where perturbation theory fails to provide accurate predictions, with simulation-based methods, and ii) developing summary statistics of the density field that are capable of extracting more information than the commonly used two-point functions. In the first half, we show how the real- to redshift-space mapping can be modelled accurately by going beyond the Gaussian approximation for the pairwise velocity distribution. We then show that simulation-based models can accurately predict the full shape of galaxy clustering in real space, increasing the constraining power on some of the cosmological parameters by a factor of 2 compared to perturbation theory methods. In the second half, we measure the information content of density-dependent clustering. We show that it can improve the constraints on all cosmological parameters by factors between 3 and 8 over the two-point function. In particular, exploiting the environment dependence can constrain the mass of neutrinos a factor of 8 better than the two-point correlation function alone.
We hope that the techniques described in this thesis will contribute to extracting all the cosmological information contained in ongoing and upcoming galaxy surveys, and provide insight into the nature of the accelerated expansion of the Universe.
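
    The real- to redshift-space mapping mentioned above can be illustrated with a toy streaming-model calculation, in which the redshift-space correlation function is obtained by integrating the real-space one against a pairwise line-of-sight velocity distribution. Everything below (the power-law correlation function, the velocity moments, the Laplace alternative to the Gaussian) is an illustrative assumption, not the model developed in the thesis.

```python
import numpy as np

def xi_real(r):
    # toy real-space correlation function: (r / r0)^-gamma with a small-r cap
    return (np.maximum(r, 0.1) / 5.0) ** -1.8

def gaussian_pdf(v, mean=-1.0, sigma=3.0):
    # Gaussian pairwise line-of-sight velocity distribution
    return np.exp(-0.5 * ((v - mean) / sigma) ** 2) / (np.sqrt(2.0 * np.pi) * sigma)

def laplace_pdf(v, mean=-1.0, sigma=3.0):
    # same mean and variance as the Gaussian, but heavier tails ("beyond Gaussian")
    b = sigma / np.sqrt(2.0)
    return np.exp(-np.abs(v - mean) / b) / (2.0 * b)

def xi_redshift(s_grid, v_grid, pdf):
    # streaming model (1-D toy): 1 + xi_s(s) = integral dv [1 + xi_r(|s - v|)] P(v)
    dv = v_grid[1] - v_grid[0]
    out = np.empty_like(s_grid)
    for i, s in enumerate(s_grid):
        out[i] = np.sum((1.0 + xi_real(np.abs(s - v_grid))) * pdf(v_grid)) * dv - 1.0
    return out

s = np.linspace(1.0, 40.0, 50)
v = np.linspace(-20.0, 20.0, 401)
xi_gauss = xi_redshift(s, v, gaussian_pdf)
xi_laplace = xi_redshift(s, v, laplace_pdf)
# the heavier-tailed velocity PDF redistributes clustering differently along the line of sight
```

Swapping the velocity PDF while keeping its first two moments fixed isolates the effect of non-Gaussianity on the redshift-space clustering, which is the kind of sensitivity the thesis exploits.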

    A point cloud approach to generative modeling for galaxy surveys at the field level

    We introduce a diffusion-based generative model that describes the distribution of galaxies in our Universe directly as a collection of points in 3-D space (coordinates), optionally with associated attributes (e.g., velocities and masses), without resorting to binning or voxelization. The custom diffusion model can be used both for emulation, reproducing essential summary statistics of the galaxy distribution, and for inference, by computing the conditional likelihood of a galaxy field. We demonstrate a first application to massive dark matter haloes in the Quijote simulation suite. This approach can be extended to enable a comprehensive analysis of cosmological data, circumventing limitations inherent to summary-statistic-based as well as neural simulation-based inference methods. Comment: 15+3 pages, 7+4 figures
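
    As a minimal sketch of the diffusion idea applied to a point cloud, the snippet below runs a standard DDPM-style forward noising process on a set of 3-D positions. The uniform toy "catalogue" and the schedule values are assumptions; the learned reverse (denoising) process, which is where the actual modeling happens, is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# toy "galaxy catalogue": N points in a unit box, positions only
x0 = rng.uniform(0.0, 1.0, size=(1000, 3))

# DDPM-style variance-preserving noising schedule (typical textbook values)
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alpha_bar = np.cumprod(1.0 - betas)

def q_sample(x0, t, rng):
    """Draw x_t ~ q(x_t | x_0) = N(sqrt(abar_t) * x0, (1 - abar_t) * I)."""
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise

x_mid = q_sample(x0, T // 2, rng)
x_end = q_sample(x0, T - 1, rng)
# by the final step the cloud is essentially isotropic Gaussian noise,
# which is where a learned reverse process would start from
```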

    Cosmological Field Emulation and Parameter Inference with Diffusion Models

    Cosmological simulations play a crucial role in elucidating the effect of physical parameters on the statistics of fields and in constraining parameters given information on density fields. We leverage diffusion generative models to address two tasks of importance to cosmology: as an emulator for cold dark matter density fields conditional on input cosmological parameters Ω_m and σ_8, and as a parameter inference model that can return constraints on the cosmological parameters of an input field. We show that the model is able to generate fields with power spectra that are consistent with those of the simulated target distribution, and to capture the subtle effect of each parameter on modulations in the power spectrum. We additionally explore their utility as parameter inference models and find that we can obtain tight constraints on cosmological parameters. Comment: 7 pages, 5 figures, Accepted at the Machine Learning and the Physical Sciences workshop, NeurIPS 202
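
    Checking that emulated fields reproduce the target power spectrum requires measuring P(k) from a density cube. A minimal FFT-based estimator is sketched below, applied to a white-noise field as a stand-in for an emulated sample; the binning choices and normalisation conventions are assumptions.

```python
import numpy as np

def power_spectrum(delta, box_size, nbins=15):
    """Isotropically binned power spectrum of a 3-D overdensity cube."""
    n = delta.shape[0]
    delta_k = np.fft.fftn(delta) * (box_size / n) ** 3   # volume-normalised FFT
    pk3d = (np.abs(delta_k) ** 2 / box_size ** 3).ravel()
    k1d = 2.0 * np.pi * np.fft.fftfreq(n, d=box_size / n)
    kx, ky, kz = np.meshgrid(k1d, k1d, k1d, indexing="ij")
    kmag = np.sqrt(kx**2 + ky**2 + kz**2).ravel()
    # bins from the fundamental mode to the Nyquist frequency
    edges = np.linspace(2.0 * np.pi / box_size, np.pi * n / box_size, nbins + 1)
    idx = np.digitize(kmag, edges)
    counts = np.bincount(idx, minlength=nbins + 2)[1:nbins + 1]
    sums = np.bincount(idx, weights=pk3d, minlength=nbins + 2)[1:nbins + 1]
    k_centres = 0.5 * (edges[1:] + edges[:-1])
    return k_centres, sums / np.maximum(counts, 1)

rng = np.random.default_rng(1)
delta = rng.standard_normal((32, 32, 32))   # white noise stand-in for an emulated field
k, pk = power_spectrum(delta, box_size=1000.0)
# for unit-variance white noise, P(k) should scatter around (box_size / n)^3
```

Comparing such binned spectra between emulated and simulated fields, bin by bin, is the consistency check described in the abstract.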

    Probabilistic reconstruction of Dark Matter fields from biased tracers using diffusion models

    Galaxies are biased tracers of the underlying cosmic web, which is dominated by dark matter components that cannot be directly observed. The relationship between dark matter density fields and galaxy distributions can be sensitive to assumptions about cosmology and about the astrophysical processes embedded in galaxy formation models, which remain uncertain in many aspects. Based on state-of-the-art galaxy formation simulation suites with varied cosmological parameters and sub-grid astrophysics, we develop a diffusion generative model to predict the unbiased posterior distribution of the underlying dark matter fields from given stellar mass fields, while marginalizing over the uncertainties in cosmology and galaxy formation.

    Learning an Effective Evolution Equation for Particle-Mesh Simulations Across Cosmologies

    Particle-mesh simulations trade small-scale accuracy for speed compared to traditional, computationally expensive N-body codes in cosmological simulations. In this work, we show how a data-driven model can be used to learn an effective evolution equation for the particles, by correcting the errors in the particle-mesh potential incurred on small scales during simulations. We find that our learnt correction yields evolution equations that generalize well to new, unseen initial conditions and cosmologies. We further demonstrate that the resulting corrected maps can be used in a simulation-based inference framework to yield an unbiased inference of cosmological parameters. The model, a network implemented in Fourier space, is exclusively trained on the particle positions and velocities. Comment: 7 pages, 4 figures, Machine Learning and the Physical Sciences Workshop, NeurIPS 202
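
    The idea of correcting the particle-mesh potential in Fourier space can be sketched as a mesh Poisson solve followed by a scale-dependent transfer function. The correction `boost(k)` below is a made-up placeholder standing in for the learned network, and units are dropped throughout.

```python
import numpy as np

def potential(delta, box_size, correction=None):
    """Poisson solve on the mesh in Fourier space, with optional k-dependent correction."""
    n = delta.shape[0]
    k1d = 2.0 * np.pi * np.fft.fftfreq(n, d=box_size / n)
    kx, ky, kz = np.meshgrid(k1d, k1d, k1d, indexing="ij")
    k2 = kx**2 + ky**2 + kz**2
    k2[0, 0, 0] = 1.0                        # avoid dividing by zero for the mean mode
    phi_k = -np.fft.fftn(delta) / k2         # Poisson equation: phi_k = -delta_k / k^2
    phi_k[0, 0, 0] = 0.0                     # potential is defined up to a constant
    if correction is not None:
        phi_k = phi_k * correction(np.sqrt(k2))   # stand-in for the learned transfer function
    return np.real(np.fft.ifftn(phi_k))

rng = np.random.default_rng(2)
delta = rng.standard_normal((16, 16, 16))
boost = lambda k: 1.0 + 0.1 * np.tanh(k)     # placeholder correction, not a trained model
phi_pm = potential(delta, 100.0)
phi_corrected = potential(delta, 100.0, correction=boost)
```

In the paper the multiplicative factor is replaced by a trained network acting on the Fourier modes; the structure of the computation (solve, correct, transform back) is the same.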

    MGLENS: Modified gravity weak lensing simulations for emulation-based cosmological inference

    We present MGLENS, a large series of modified gravity lensing simulations tailored for cosmic shear data analyses and forecasts in which cosmological and modified gravity parameters are varied simultaneously. Based on the FORGE and BRIDGE N-body simulation suites presented in companion papers, we construct 100 × 5000 deg² of mock Stage-IV lensing data from two 4D Latin hypercubes that sample cosmological and gravitational parameters in f(R) and nDGP gravity, respectively. These are then used to validate our inference analysis pipeline based on the lensing power spectrum, exploiting our implementation of these modified gravity models within the COSMOSIS cosmological inference package. Sampling this new likelihood, we find that cosmic shear can achieve 95 per cent CL constraints of 0.09 on the modified gravity parameter log10[fR0], after marginalizing over intrinsic alignments of galaxies and including scales up to ℓ = 5000. We also investigate the impact of photometric uncertainty, scale cuts, and covariance matrices. We finally explore the consequences of analysing MGLENS data with the wrong gravity model, and report catastrophic biases for a number of possible scenarios. The Stage-IV MGLENS simulations, the FORGE and BRIDGE emulators, and the COSMOSIS interface modules will be made publicly available upon journal acceptance.
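
    A pipeline of this kind ultimately samples a Gaussian likelihood in the band powers, chi² = (d − m(θ))ᵀ C⁻¹ (d − m(θ)). The sketch below uses a placeholder power-law "emulator" and a diagonal covariance standing in for the FORGE/BRIDGE emulators and the full analysis covariance; none of the numbers correspond to the paper.

```python
import numpy as np

rng = np.random.default_rng(4)
ells = np.geomspace(100.0, 5000.0, 10)   # band-power multipoles up to ell = 5000

def emulator(amp, tilt):
    """Placeholder power-law model standing in for an emulated C_ell(theta)."""
    return amp * (ells / 1000.0) ** tilt

truth = emulator(1.0, -1.0)
cov = np.diag((0.05 * truth) ** 2)       # assumed 5 per cent diagonal errors
cinv = np.linalg.inv(cov)
data = truth + rng.multivariate_normal(np.zeros(ells.size), cov)

def log_like(amp, tilt):
    """Gaussian band-power log-likelihood: -0.5 * chi^2."""
    r = data - emulator(amp, tilt)
    return -0.5 * r @ cinv @ r
```

Feeding this log-likelihood to a sampler recovers posteriors on (amp, tilt); analysing the same data with a deliberately wrong model family is how the "wrong gravity model" biases in the abstract are diagnosed.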

    Constraining νΛCDM with density-split clustering

    The dependence of galaxy clustering on local density provides an effective method for extracting non-Gaussian information from galaxy surveys. The two-point correlation function (2PCF) provides a complete statistical description of a Gaussian density field. However, the late-time density field becomes non-Gaussian due to non-linear gravitational evolution, and higher-order summary statistics are required to capture all of its cosmological information. Using a Fisher formalism based on halo catalogues from the Quijote simulations, we explore the possibility of retrieving this information using the density-split clustering (DS) method, which combines clustering statistics from regions of different environmental density. We show that DS provides more precise constraints on the parameters of the νΛCDM model compared to the 2PCF, and we provide suggestions for where the extra information may come from. DS improves the constraints on the sum of neutrino masses by a factor of 8, and by factors of 5, 3, 4, 6, and 6 for Ω_m, Ω_b, h, n_s, and σ_8, respectively. We compare DS statistics when the local density environment is estimated from the real- or redshift-space positions of haloes. The inclusion of DS autocorrelation functions, in addition to the cross-correlation functions between DS environments and haloes, recovers most of the information that is lost when using the redshift-space halo positions to estimate the environment. We discuss the possibility of constructing simulation-based methods to model DS clustering statistics in different scenarios. Comment: Submitted to MNRAS. Source code for all figures in the paper is provided in the caption
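
    The Fisher formalism used here reduces to F_ij = (∂μ/∂θ_i)ᵀ C⁻¹ (∂μ/∂θ_j), with marginalized errors read off the inverse Fisher matrix. The toy forecast below, with entirely synthetic derivatives and a unit covariance, illustrates why appending environment-split statistics to the 2PCF data vector can only tighten the marginalized constraints; the actual factors of 3 to 8 come from the Quijote measurements, not from anything here.

```python
import numpy as np

rng = np.random.default_rng(3)

def fisher(derivs, cov):
    """F_ij = (d mu / d theta_i)^T C^-1 (d mu / d theta_j)."""
    return derivs @ np.linalg.inv(cov) @ derivs.T

n_params, n_data = 2, 20
d_2pcf = rng.standard_normal((n_params, n_data))    # synthetic derivatives of the 2PCF
d_split = rng.standard_normal((n_params, n_data))   # synthetic derivatives of extra split bins
d_comb = np.hstack([d_2pcf, d_split])               # combined data vector

# marginalized 1-sigma forecasts: sqrt of the diagonal of the inverse Fisher matrix
sigma_2pcf = np.sqrt(np.diag(np.linalg.inv(fisher(d_2pcf, np.eye(n_data)))))
sigma_comb = np.sqrt(np.diag(np.linalg.inv(fisher(d_comb, np.eye(2 * n_data)))))
# since F_comb = F_2pcf + F_split (both positive semi-definite),
# the combined marginalized errors can only shrink: sigma_comb <= sigma_2pcf
```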

    Simulation-based Inference for Exoplanet Atmospheric Retrieval: Insights from winning the Ariel Data Challenge 2023 using Normalizing Flows

    Advancements in space telescopes have opened new avenues for gathering vast amounts of data on exoplanet atmosphere spectra. However, accurately extracting chemical and physical properties from these spectra poses significant challenges due to the non-linear nature of the underlying physics. This paper presents novel machine learning models developed by the AstroAI team for the Ariel Data Challenge 2023, where one of the models secured the top position among 293 competitors. Leveraging normalizing flows, our models predict the posterior probability distribution of atmospheric parameters under different atmospheric assumptions. Moreover, we introduce an alternative model that exhibits higher performance potential than the winning model, despite scoring lower in the challenge. These findings highlight the need to re-evaluate the evaluation metric, and prompt further exploration of more efficient and accurate approaches for analysing exoplanet atmosphere spectra. Finally, we present recommendations to enhance the challenge and the models, providing valuable insights for future applications on real observational data. These advancements pave the way for more effective and timely analysis of exoplanet atmospheric properties, advancing our understanding of these distant worlds. Comment: Conference proceeding for the ECML PKDD 202
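
    Normalizing flows model a density via the change-of-variables formula, log p(x) = log p_base(f⁻¹(x)) + log|det J_{f⁻¹}(x)|. The snippet below demonstrates this with a single fixed affine transform (illustrative values, far simpler than the learned, conditional flows in the paper), and checks that the transformed density is properly normalized.

```python
import numpy as np

mu, sigma = 2.0, 0.5   # illustrative parameters of the target density

def log_prob_base(z):
    """Standard normal base density."""
    return -0.5 * z**2 - 0.5 * np.log(2.0 * np.pi)

def log_prob_flow(x):
    """Density of x = f(z) = mu + sigma * z via the change-of-variables formula."""
    z = (x - mu) / sigma                       # inverse transform f^-1(x)
    return log_prob_base(z) - np.log(sigma)    # minus log|det Jacobian| of f

x = np.linspace(-1.0, 5.0, 601)
lp = log_prob_flow(x)
area = np.sum(np.exp(lp)) * (x[1] - x[0])      # should integrate to ~1
```

A real flow stacks many such invertible transforms with learned parameters conditioned on the observed spectrum, which is what turns this bookkeeping into a posterior estimator.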

    A Parameter-Masked Mock Data Challenge for Beyond-Two-Point Galaxy Clustering Statistics

    The last few years have seen the emergence of a wide array of novel techniques for analyzing high-precision data from upcoming galaxy surveys, which aim to extend the statistical analysis of galaxy clustering data beyond the linear regime and the canonical two-point (2pt) statistics. We test and benchmark some of these new techniques in a community data challenge, "Beyond-2pt", initiated during the Aspen 2022 Summer Program "Large-Scale Structure Cosmology beyond 2-Point Statistics", whose first round of results we present here. The challenge dataset consists of high-precision mock galaxy catalogs for clustering in real space, redshift space, and on a light cone. Participants in the challenge have developed end-to-end pipelines to analyze the mock catalogs and extract unknown ("masked") cosmological parameters of the underlying ΛCDM models with their methods. The methods represented are density-split clustering, nearest-neighbor statistics, the BACCO power spectrum emulator, void statistics, LEFTfield field-level inference using effective field theory (EFT), and joint power spectrum and bispectrum analyses using both EFT and simulation-based inference. In this work, we review the results of the challenge, focusing on problems solved, lessons learned, and future research needed to perfect the emerging beyond-2pt approaches. The unbiased parameter recovery demonstrated in this challenge by multiple statistics, and the associated modeling and inference frameworks, supports the credibility of cosmology constraints from these methods. The challenge data set is publicly available and we welcome future submissions from methods that are not yet represented. Comment: New submissions welcome! Challenge data available at https://github.com/ANSalcedo/Beyond2ptMoc