Wiener filter reloaded: fast signal reconstruction without preconditioning
We present a high performance solution to the Wiener filtering problem via a
formulation that is dual to the recently developed messenger technique. This
new dual messenger algorithm, like its predecessor, efficiently calculates the
Wiener filter solution of large and complex data sets without preconditioning
and can account for inhomogeneous noise distributions and arbitrary mask
geometries. We demonstrate the capabilities of this scheme in signal
reconstruction by applying it to a simulated cosmic microwave background (CMB)
temperature data set. The performance of this new method is compared to that of
the standard messenger algorithm and the preconditioned conjugate gradient
(PCG) approach, using a series of well-known convergence diagnostics and their
processing times, for the particular problem under consideration. This variant
of the messenger algorithm matches the performance of the PCG method in terms
of the effectiveness of reconstruction of the input angular power spectrum and
converges smoothly to the final solution. The dual messenger algorithm
outperforms the standard messenger and PCG methods in terms of execution time,
as it runs to completion around 2 and 3-4 times faster than the respective
methods, for the specific problem considered.
Comment: 13 pages, 10 figures. Accepted for publication in MNRAS main journal
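To make the idea of preconditioner-free Wiener filtering concrete, here is a minimal sketch on a 1D periodic toy problem. It uses the standard messenger iteration (a messenger field with covariance tau times the identity), not the dual messenger variant described in the abstract, and compares the result against a dense direct solve. The power spectrum, noise variances, grid size, and all function names are assumptions chosen for illustration only.

```python
import numpy as np

def wiener_direct(d, power, noise_var):
    """Dense reference solution s = S (S + N)^-1 d (only feasible for small n)."""
    n = d.size
    # Signal covariance S is diagonal in Fourier space; build its pixel-space matrix.
    S = np.fft.ifft(power[:, None] * np.fft.fft(np.eye(n), axis=0), axis=0).real
    return S @ np.linalg.solve(S + np.diag(noise_var), d)

def wiener_messenger(d, power, noise_var, n_iter=300):
    """Standard messenger iteration: alternate a pixel-space and a Fourier-space update."""
    tau = noise_var.min()        # messenger covariance T = tau * identity
    nbar = noise_var - tau       # remaining noise covariance, N - T (diagonal, >= 0)
    s = np.zeros_like(d)
    for _ in range(n_iter):
        # Pixel space: combine data and current signal estimate into the messenger field.
        t = (tau * d + nbar * s) / noise_var
        # Fourier space: Wiener-filter the messenger field, whose noise is homogeneous.
        s = np.fft.ifft(power / (power + tau) * np.fft.fft(t)).real
    return s

# Toy data (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n = 64
k = np.abs(np.fft.fftfreq(n) * n)
power = 10.0 / (1.0 + k) ** 2                  # assumed signal power spectrum
noise_var = rng.uniform(0.5, 2.0, size=n)      # inhomogeneous noise variances
s_true = np.fft.ifft(np.sqrt(power) * np.fft.fft(rng.standard_normal(n))).real
d = s_true + np.sqrt(noise_var) * rng.standard_normal(n)

s_messenger = wiener_messenger(d, power, noise_var)
s_direct = wiener_direct(d, power, noise_var)
```

Neither update requires inverting S + N as a whole: each step only divides by a quantity that is diagonal in its own basis, which is why no preconditioner is needed. A production implementation would typically add a cooling schedule for tau and handle masked pixels, both of which this sketch omits.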
Simulation-based inference of dynamical galaxy cluster masses with 3D convolutional neural networks
We present a simulation-based inference framework using a convolutional
neural network to infer dynamical masses of galaxy clusters from their observed
3D projected phase-space distribution, which consists of the projected galaxy
positions in the sky and their line-of-sight velocities. By formulating the
mass estimation problem within this simulation-based inference framework, we
are able to quantify the uncertainties on the inferred masses in a
straightforward and robust way. We generate a realistic mock catalogue
emulating the Sloan Digital Sky Survey (SDSS) Legacy spectroscopic observations
(the main galaxy sample) for redshifts and explicitly
illustrate the challenges posed by interloper (non-member) galaxies for cluster
mass estimation from actual observations. Our approach constitutes the first
optimal machine learning-based exploitation of the information content of the
full 3D projected phase-space distribution, including both the virialized and
infall cluster regions, for the inference of dynamical cluster masses. We also
present, for the first time, the application of a simulation-based inference
machinery to obtain dynamical masses of around galaxy clusters found in
the SDSS Legacy Survey, and show that the resulting mass estimates are
consistent with mass measurements from the literature.
Comment: 14 pages, 11 figures. Accepted for publication in MNRAS. Contains
non-peer reviewed supplementary material on cluster mass function in appendix
AI-driven spatio-temporal engine for finding gravitationally lensed supernovae
We present a spatio-temporal AI framework that concurrently exploits both the
spatial and time-variable features of gravitationally lensed supernovae in
optical images to ultimately aid in the discovery of such exotic transients in
wide-field surveys. Our spatio-temporal engine is designed using recurrent
convolutional layers, while drawing from recent advances in variational
inference to quantify approximate Bayesian uncertainties via a confidence
score. Using simulated Young Supernova Experiment (YSE) images as a showcase,
we find that the use of time-series images yields a substantial gain of nearly
20 per cent in classification accuracy over single-epoch observations, with a
preliminary application to mock observations from the Legacy Survey of Space
and Time (LSST) yielding around 99 per cent accuracy. Our innovative deep
learning machinery adds an extra dimension in the search for gravitationally
lensed supernovae from current and future astrophysical transient surveys.
Comment: 6+8 pages, 10 figures, 2 tables. For submission to a peer-reviewed
journal. Comments welcome
Dynamical mass inference of galaxy clusters with neural flows
We present an algorithm for inferring the dynamical mass of galaxy clusters
directly from their respective phase-space distributions, i.e. the observed
line-of-sight velocities and projected distances of galaxies from the cluster
centre. Our method employs normalizing flows, a deep neural network capable of
learning arbitrary high-dimensional probability distributions, and inherently
accounts, to an adequate extent, for the presence of interloper galaxies which
are not bound to a given cluster, the primary contaminant of dynamical mass
measurements. We validate and showcase the performance of our neural flow
approach to robustly infer the dynamical mass of clusters from a realistic mock
cluster catalogue. A key aspect of our novel algorithm is that it yields the
probability density function of the mass of a particular cluster, thereby
providing a principled way of quantifying uncertainties, in contrast to
conventional machine learning approaches. The neural network mass predictions,
when applied to a contaminated catalogue with interlopers, have a mean overall
logarithmic residual scatter of 0.028 dex, with a log-normal scatter of 0.126
dex, which goes down to 0.089 dex for clusters in the intermediate to high mass
range. This is an improvement by nearly a factor of four relative to the
classical cluster mass scaling relation with the velocity dispersion, and
outperforms recently proposed machine learning approaches. We also apply our
neural flow mass estimator to a compilation of galaxy observations of some
well-studied clusters with robust dynamical mass estimates, further
substantiating the efficacy of our algorithm.
Comment: 14 pages, 9 figures, 1 table. Improved approach, saliency maps added
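The residual statistics quoted in dex can be computed along the following lines. This is a sketch under the assumption that the residual is defined as log10(M_pred / M_true), with its mean read as the overall offset and its standard deviation as the scatter; the abstract does not spell out the exact definition, and the masses below are made-up toy values, not results from the paper.

```python
import numpy as np

def log_mass_residual_stats(m_pred, m_true):
    """Return (mean, standard deviation) of log10(M_pred / M_true), both in dex."""
    m_pred = np.asarray(m_pred, dtype=float)
    m_true = np.asarray(m_true, dtype=float)
    residuals = np.log10(m_pred) - np.log10(m_true)
    return residuals.mean(), residuals.std()

# Toy catalogue: three clusters with residuals of +0.1, -0.1 and 0 dex.
m_true = np.array([1e14, 1e14, 1e15])
m_pred = m_true * 10.0 ** np.array([0.1, -0.1, 0.0])

bias_dex, scatter_dex = log_mass_residual_stats(m_pred, m_true)
```

Working in log space treats over- and under-estimates by the same factor symmetrically, which is why cluster mass errors are conventionally reported in dex.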
The Quijote simulations
The Quijote simulations are a set of 44,100 full N-body simulations spanning more than 7000 cosmological models in the hyperplane. At a single redshift, the simulations contain more than 8.5 trillion particles over a combined volume of 44,100 (h^-1 Gpc)^3; each simulation follows the evolution of 256^3, 512^3, or 1024^3 particles in a box of 1 h^-1 Gpc length. Billions of dark matter halos and cosmic voids have been identified in the simulations, whose runs required more than 35 million core hours. The Quijote simulations have been designed for two main purposes: (1) to quantify the information content on cosmological observables and (2) to provide enough data to train machine-learning algorithms. In this paper, we describe the simulations and show a few of their applications. We also release the petabyte of data generated, comprising hundreds of thousands of simulation snapshots at multiple redshifts; halo and void catalogs; and millions of summary statistics, such as power spectra, bispectra, correlation functions, marked power spectra, and estimated probability density functions.
Bayesian inference and deep learning: application to primordial cosmology and the measurement of cosmic acceleration
This doctoral research develops and applies novel Bayesian statistical inference and deep learning techniques to meet the statistical challenges of massive and complex data sets from next-generation cosmic microwave background (CMB) missions or galaxy surveys and to optimize their scientific returns, ultimately improving our understanding of the Universe. The first theme deals with the extraction of the E and B modes of the CMB polarization signal from the data. We have developed a high-performance hierarchical method, known as the dual messenger algorithm, for spin field reconstruction on the sphere and demonstrated its capabilities in reconstructing pure E and B maps, while accounting for complex and realistic noise models. The second theme lies in the development of various aspects of Bayesian forward modelling machinery for optimal exploitation of state-of-the-art galaxy redshift surveys. We have developed a large-scale Bayesian inference framework to constrain cosmological parameters via a novel implementation of the Alcock-Paczyński test and showcased our cosmological constraints on the matter density and dark energy equation of state. With the control of systematic effects being a crucial limiting factor for modern galaxy redshift surveys, we also presented an augmented likelihood which is robust to unknown foreground and target contaminations. Finally, with a view to building fast complex dynamics emulators in our above Bayesian hierarchical model, we have designed a novel halo painting network that learns to map approximate 3D dark matter fields to realistic halo distributions.
Painting halos from cosmic density fields of dark matter with physically motivated neural networks
We present a novel halo painting network that learns to map approximate 3D dark matter fields to realistic halo distributions. This map is provided via a physically motivated network with which we can learn the nontrivial local relation between dark matter density field and halo distributions without relying on a physical model. Unlike other generative or regressive models, a well-motivated prior and simple physical principles allow us to train the mapping network quickly and with relatively little data. In learning to paint halo distributions from computationally cheap, analytical and nonlinear density fields, we bypass the need for full particle mesh simulations and halo finding algorithms. Furthermore, by design, our halo painting network needs only local patches of dark matter density to predict the halos, and as such, it can predict the 3D halo distribution for any arbitrary simulation box size. Our neural network can be trained using small simulations and used to predict large halo distributions, as long as the resolutions are equivalent. We evaluate our model's ability to generate 3D halo count distributions which reproduce, to a high degree, summary statistics such as the power spectrum and bispectrum, of the input or reference realizations.
Wiener filtering and pure decomposition of CMB maps with anisotropic correlated noise
We present an augmented version of our dual messenger algorithm for spin field reconstruction on the sphere, while accounting for highly non-trivial and realistic noise models such as modulated correlated noise. We also describe an optimization method for the estimation of noise covariance from Monte Carlo simulations. Using simulated Planck polarized cosmic microwave background (CMB) maps as a showcase, we demonstrate the capabilities of the algorithm in reconstructing pure E and B maps, guaranteed to be free from ambiguous modes resulting from the leakage or coupling issue that plagues conventional methods of E/B separation. Due to its high-speed execution, coupled with lenient memory requirements, the algorithm can be optimized in exact global Bayesian analyses of state-of-the-art CMB data for a statistically optimal separation of pure E and B modes. Our algorithm, therefore, has a potentially key role in the data analysis of high-resolution and high-sensitivity CMB data, especially with the range of upcoming CMB experiments tailored for the detection of the elusive primordial B-mode signal.
Optimal and fast E/B separation with a dual messenger field
We adapt our recently proposed dual messenger algorithm for spin field reconstruction and showcase its efficiency and effectiveness in Wiener filtering polarized cosmic microwave background (CMB) maps. Unlike conventional preconditioned conjugate gradient (PCG) solvers, our preconditioner-free technique can deal with high-resolution joint temperature and polarization maps with inhomogeneous noise distributions and arbitrary mask geometries with relative ease. Various convergence diagnostics illustrate the high quality of the dual messenger reconstruction. In contrast, the PCG implementation fails to converge to a reasonable solution for the specific problem considered. The implementation of the dual messenger method is straightforward and guarantees numerical stability and convergence. We show how the algorithm can be modified to generate fluctuation maps, which, combined with the Wiener filter solution, yield unbiased constrained signal realizations, consistent with observed data. This algorithm presents a pathway to exact global analyses of high-resolution and high-sensitivity CMB data for a statistically optimal separation of E and B modes. It is therefore relevant for current and next-generation CMB experiments, in the quest for the elusive primordial B-mode signal.
Optimal machine-driven acquisition of future cosmological data
We present a set of maps classifying regions of the sky according to their information gain potential as quantified by Fisher information. These maps can guide the optimal retrieval of relevant physical information with targeted cosmological searches. Specifically, we calculated the response of observed cosmic structures to perturbative changes in the cosmological model and we charted their respective contributions to Fisher information. Our physical forward-modeling machinery transcends the limitations of contemporary analyses based on statistical summaries to yield detailed characterizations of individual 3D structures. We demonstrate this advantage using galaxy counts data and we showcase the potential of our approach by studying the information gain of the Coma cluster. We find that regions in the vicinity of the filaments and cluster core, where mass accretion ensues from gravitational infall, are the most informative with regard to our physical model of structure formation in the Universe. Hence, collecting data in those regions would be optimal for testing our model predictions. The results presented in this work are the first of their kind to elucidate the inhomogeneous distribution of cosmological information in the Universe. This study paves a new way forward for the performance of efficient targeted searches for the fundamental physics of the Universe, where search strategies are progressively refined with new cosmological data sets within an active learning framework.
Key words: galaxies: statistics / cosmology: observations / methods: data analysis / methods: statistical / large-scale structure of Universe
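The notion of charting per-region contributions to Fisher information can be illustrated with a deliberately simple sketch. It assumes independent Poisson-distributed galaxy counts per cell and a single model parameter theta, in which case each cell contributes (d lambda_i / d theta)^2 / lambda_i. The function names and the toy linear model are hypothetical; the forward model used in the paper is far more elaborate than this.

```python
import numpy as np

def fisher_per_cell(expected_counts, theta, eps=1e-4):
    """Per-cell Fisher information for independent Poisson counts.

    expected_counts: callable mapping theta to an array of expected counts lambda_i.
    The response d(lambda_i)/d(theta) is estimated with a central finite difference,
    i.e. a small perturbative change of the model parameter.
    """
    lam = expected_counts(theta)
    dlam = (expected_counts(theta + eps) - expected_counts(theta - eps)) / (2.0 * eps)
    return dlam ** 2 / lam

# Toy model (an assumption for illustration): expected counts scale linearly with theta.
amplitude = np.array([1.0, 4.0, 9.0])

def toy_model(theta):
    return theta * amplitude

fisher_map = fisher_per_cell(toy_model, theta=2.0)
total_fisher = fisher_map.sum()
```

For this linear toy model the per-cell contribution reduces analytically to amplitude / theta, so cells whose expected counts respond most strongly to the parameter carry the most information, which is the sense in which such a map can rank regions for targeted observations.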