Constraining cosmology with machine learning and galaxy clustering: the CAMELS-SAM suite

Abstract

As the next generation of large galaxy surveys comes online, it is becoming increasingly important to develop and understand the machine learning tools that analyze big astronomical data. Neural networks are powerful and capable of probing deep patterns in data, but must be trained carefully on large and representative data sets. We developed and generated a new 'hump' of the Cosmology and Astrophysics with MachinE Learning Simulations (CAMELS) project: CAMELS-SAM, encompassing one thousand dark-matter-only simulations of $(100\ h^{-1}\ \mathrm{cMpc})^3$ with different cosmological parameters ($\Omega_m$ and $\sigma_8$), each run through the Santa Cruz semi-analytic model for galaxy formation over a broad range of astrophysical parameters. As a proof-of-concept for what this vast suite of simulated galaxies in a large volume and broad parameter space makes possible, we test how well simple clustering summary statistics can marginalize over astrophysics and constrain cosmology using neural networks. We use the two-point correlation function, count-in-cells, and the Void Probability Function, and probe non-linear and linear scales across $0.68 < R < 27\ h^{-1}$ cMpc. Our cosmological constraints cluster around 3-8% error on $\Omega_m$ and $\sigma_8$, and we explore the effect of various galaxy selections, galaxy sampling, and choice of clustering statistics on these constraints. We additionally explore how these clustering statistics constrain and inform key stellar and galactic feedback parameters in the Santa Cruz SAM. CAMELS-SAM has been publicly released alongside the rest of CAMELS, and offers great potential for many applications of machine learning in astrophysics: https://camels-sam.readthedocs.io.

Comment: 40 pages, 22 figures (11 made of subfigures
