
    Dark Quest. I. Fast and Accurate Emulation of Halo Clustering Statistics and Its Application to Galaxy Clustering

    We perform an ensemble of N-body simulations with 2048^3 particles for 101 flat wCDM cosmological models sampled based on a maximin-distance Sliced Latin Hypercube Design. By using the halo catalogs extracted at multiple redshifts in the range z=[0,1.48], we develop Dark Emulator, which enables fast and accurate computations of the halo mass function, halo-matter cross-correlation, and halo auto-correlation as a function of halo masses, redshift, separations and cosmological models, based on principal component analysis and Gaussian process regression for the large-dimensional input and output data vectors. We assess the performance of the emulator using a validation set of N-body simulations that are not used in training the emulator. We show that, for typical halos hosting CMASS galaxies in the Sloan Digital Sky Survey, the emulator predicts the halo-matter cross-correlation, relevant for galaxy-galaxy weak lensing, with an accuracy better than 2% and the halo auto-correlation, relevant for the galaxy clustering correlation, with an accuracy better than 4%. We give several demonstrations of the emulator. It can be used to study properties of halo mass density profiles such as the mass-concentration relation and splashback radius for different cosmologies. The emulator outputs can be combined with an analytical prescription of the halo-galaxy connection, such as the halo occupation distribution, at the equation level, instead of using mock catalogs, to make accurate predictions of galaxy clustering statistics such as galaxy-galaxy weak lensing and the projected correlation function for any model within the wCDM cosmologies, in a few CPU seconds. Comment: 46 pages, 47 figures; version accepted for publication in Ap
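
    A minimal sketch of the emulation strategy described above, assuming hypothetical training data: the tabulated halo statistics for each sampled cosmology are compressed with principal component analysis, and one Gaussian process regressor per principal-component coefficient maps cosmological parameters to the compressed outputs. The arrays, dimensions, and kernel choice below are illustrative placeholders, not the Dark Emulator pipeline itself.

        # Sketch: PCA compression + per-coefficient Gaussian process regression.
        import numpy as np
        from sklearn.decomposition import PCA
        from sklearn.gaussian_process import GaussianProcessRegressor
        from sklearn.gaussian_process.kernels import RBF, ConstantKernel

        rng = np.random.default_rng(0)
        n_models, n_params, n_bins = 101, 6, 40                 # e.g. wCDM parameters, output bins
        theta_train = rng.uniform(size=(n_models, n_params))    # hypothetical sampled cosmologies
        y_train = rng.normal(size=(n_models, n_bins))           # hypothetical tabulated halo statistic

        # Compress the high-dimensional output vectors with PCA ...
        pca = PCA(n_components=5)
        coeff_train = pca.fit_transform(y_train)                # (n_models, 5) PC coefficients

        # ... and fit one GP regressor per principal-component coefficient.
        gps = []
        for j in range(coeff_train.shape[1]):
            kernel = ConstantKernel() * RBF(length_scale=np.ones(n_params))
            gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
            gp.fit(theta_train, coeff_train[:, j])
            gps.append(gp)

        def emulate(theta_new):
            """Predict the output data vector for a new set of cosmological parameters."""
            theta_new = np.atleast_2d(theta_new)
            coeffs = np.column_stack([gp.predict(theta_new) for gp in gps])
            return pca.inverse_transform(coeffs)                # back to the original bins

        print(emulate(rng.uniform(size=n_params)).shape)        # (1, n_bins)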

    Stratification Trees for Adaptive Randomization in Randomized Controlled Trials

    This paper proposes an adaptive randomization procedure for two-stage randomized controlled trials. The method uses data from a first-wave experiment in order to determine how to stratify in a second wave of the experiment, where the objective is to minimize the variance of an estimator for the average treatment effect (ATE). We consider selection from a class of stratified randomization procedures which we call stratification trees: these are procedures whose strata can be represented as decision trees, with differing treatment assignment probabilities across strata. By using the first wave to estimate a stratification tree, we simultaneously select which covariates to use for stratification, how to stratify over these covariates, as well as the assignment probabilities within these strata. Our main result shows that using this randomization procedure with an appropriate estimator results in an asymptotic variance which is minimal in the class of stratification trees. Moreover, the results we present are able to accommodate a large class of assignment mechanisms within strata, including stratified block randomization. In a simulation study, we find that our method, paired with an appropriate cross-validation procedure, can improve on ad hoc choices of stratification. We conclude by applying our method to the study in Karlan and Wood (2017), where we estimate stratification trees using the first wave of their experiment.
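
    A simplified sketch of the stratification-tree idea, under loose assumptions: the strata are taken to be the leaves of a shallow regression tree fit to hypothetical first-wave data, and second-wave treatment is block-randomized within each leaf. The paper selects the tree and the within-stratum assignment probabilities by minimizing an asymptotic-variance criterion with cross-validation; the plain regression tree and the fixed 0.5 assignment probability below are only stand-ins.

        # Sketch: leaves of a first-wave tree define strata for second-wave randomization.
        import numpy as np
        from sklearn.tree import DecisionTreeRegressor

        rng = np.random.default_rng(1)

        # Hypothetical first-wave data: covariates X1, outcomes Y1.
        n1, p = 500, 4
        X1 = rng.normal(size=(n1, p))
        Y1 = X1[:, 0] + 0.5 * X1[:, 1] ** 2 + rng.normal(size=n1)

        # "Estimate" a stratification tree; the depth limit caps the number of strata.
        tree = DecisionTreeRegressor(max_depth=2, min_samples_leaf=50).fit(X1, Y1)

        def assign_second_wave(X2, treat_prob=0.5):
            """Stratified block randomization within the leaves of the fitted tree."""
            strata = tree.apply(X2)                  # leaf index = stratum label
            D = np.zeros(len(X2), dtype=int)
            for s in np.unique(strata):
                idx = np.flatnonzero(strata == s)
                n_treat = int(round(treat_prob * len(idx)))
                treated = rng.choice(idx, size=n_treat, replace=False)
                D[treated] = 1
            return strata, D

        X2 = rng.normal(size=(200, p))               # hypothetical second-wave covariates
        strata, D = assign_second_wave(X2)
        for s in np.unique(strata):
            print(f"stratum {s}: treated share {D[strata == s].mean():.2f}")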

    New statistical method identifies cytokines that distinguish stool microbiomes

    Regressing an outcome or dependent variable onto a set of input or independent variables allows the analyst to measure associations between the two, so that changes in the outcome can be described and predicted by changes in the inputs. While classical statistics offers many ways of doing this when the dependent variable has certain properties (e.g., a scalar, survival time, or count), little progress has been made on regression where the dependent variables are microbiome taxa counts without imposing extremely strict conditions on the data. In this paper, we propose and apply a new regression model combining the Dirichlet-multinomial distribution with recursive partitioning, providing a fully non-parametric regression model. This model, called DM-RPart, is applied to cytokine data and microbiome taxa count data; it is applicable to any microbiome taxa counts and metadata, is fit automatically, and is intuitively interpretable. The model can be applied to any microbiome or other compositional data, and software (R package HMP) is available through the R CRAN website.
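
    An illustrative sketch in Python (the authors' implementation is the R package HMP): the Dirichlet-multinomial log-likelihood that a recursive-partitioning scheme such as DM-RPart could use to score candidate splits of the samples on covariates such as cytokine levels. The data, the crude parameter estimate, and the single candidate split are hypothetical; a real fit would use maximum likelihood and search over many splits.

        # Sketch: Dirichlet-multinomial log-likelihood as a split criterion.
        import numpy as np
        from scipy.special import gammaln

        def dm_loglik(counts, alpha):
            """Dirichlet-multinomial log-likelihood of taxa count rows given parameters alpha."""
            counts = np.asarray(counts, dtype=float)
            n = counts.sum(axis=1)
            a0 = alpha.sum()
            return np.sum(
                gammaln(n + 1) - gammaln(counts + 1).sum(axis=1)
                + gammaln(a0) - gammaln(n + a0)
                + (gammaln(counts + alpha) - gammaln(alpha)).sum(axis=1)
            )

        def crude_alpha(counts):
            """Very rough stand-in for a proper DM parameter fit (fixed precision)."""
            props = counts / counts.sum(axis=1, keepdims=True)
            return np.maximum(props.mean(axis=0), 1e-6) * 10.0

        # Hypothetical data: 60 stool samples, 8 taxa, one cytokine covariate.
        rng = np.random.default_rng(2)
        counts = rng.poisson(lam=rng.uniform(5, 20, size=(60, 8)))
        cytokine = rng.normal(size=60)

        # Score one candidate split on the cytokine: does splitting improve the DM fit?
        mask = cytokine > 0.0
        ll_parent = dm_loglik(counts, crude_alpha(counts))
        ll_split = (dm_loglik(counts[mask], crude_alpha(counts[mask]))
                    + dm_loglik(counts[~mask], crude_alpha(counts[~mask])))
        print(f"log-likelihood gain from splitting: {ll_split - ll_parent:.1f}")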

    The composite absolute penalties family for grouped and hierarchical variable selection

    Extracting useful information from high-dimensional data is an important focus of today's statistical research and practice. Penalized loss function minimization has been shown to be effective for this task both theoretically and empirically. With the virtues of both regularization and sparsity, the L_1-penalized squared error minimization method Lasso has been popular in regression models and beyond. In this paper, we combine different norms including L_1 to form an intelligent penalty in order to add side information to the fitting of a regression or classification model to obtain reasonable estimates. Specifically, we introduce the Composite Absolute Penalties (CAP) family, which allows given grouping and hierarchical relationships between the predictors to be expressed. CAP penalties are built by defining groups and combining the properties of norm penalties at the across-group and within-group levels. Grouped selection occurs for nonoverlapping groups. Hierarchical variable selection is reached by defining groups with particular overlapping patterns. We propose using the BLASSO and cross-validation to compute CAP estimates in general. For a subfamily of CAP estimates involving only the L_1 and L_∞ norms, we introduce the iCAP algorithm to trace the entire regularization path for the grouped selection problem. Within this subfamily, unbiased estimates of the degrees of freedom (df) are derived so that the regularization parameter is selected without cross-validation. CAP is shown to improve on the predictive performance of the LASSO in a series of simulated experiments, including cases with p ≫ n and possibly mis-specified groupings. When the complexity of a model is properly calculated, iCAP is seen to be parsimonious in the experiments. Comment: Published in at http://dx.doi.org/10.1214/07-AOS584 the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org
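
    A minimal sketch of evaluating a CAP penalty under hypothetical group definitions: an outer L_1 norm combines per-group inner norms (here L_∞), with non-overlapping groups giving grouped selection and nested, overlapping groups giving hierarchical selection. This only evaluates the penalty; fitting would pair it with a squared-error or classification loss as in the paper.

        # Sketch: Composite Absolute Penalty = outer norm of per-group inner norms.
        import numpy as np

        def cap_penalty(beta, groups, inner_ord=np.inf, outer_ord=1):
            """Combine per-group norms ||beta_g||_inner with an outer norm across groups."""
            group_norms = np.array([np.linalg.norm(beta[list(g)], ord=inner_ord) for g in groups])
            return np.linalg.norm(group_norms, ord=outer_ord)

        beta = np.array([1.5, -0.2, 0.0, 0.7, 0.0, 0.0])

        # Non-overlapping groups: grouped selection (a group enters or leaves as a whole).
        grouped = [(0, 1), (2, 3), (4, 5)]

        # Nested, overlapping groups: coefficient 2 is penalized by both groups, so it
        # tends to become nonzero only after coefficients 0 and 1 are already active.
        hierarchical = [(0, 1, 2), (2,)]

        print(cap_penalty(beta, grouped))        # L_1 combination of per-group L_inf norms
        print(cap_penalty(beta, hierarchical))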

    Monte Carlo for the LHC

    I review the status of the general-purpose Monte Carlo event generators for the LHC, with emphasis on areas of recent physics developments. There has been great progress, especially in multi-jet simulation, but I mention some question marks that have recently arisen. Comment: 10 pages, to appear in the proceedings of Physics at the LHC 2010, DESY, Hamburg, 7-12 June 2010

    Evaluation of a Tree-based Pipeline Optimization Tool for Automating Data Science

    As the field of data science continues to grow, there will be an ever-increasing demand for tools that make machine learning accessible to non-experts. In this paper, we introduce the concept of tree-based pipeline optimization for automating one of the most tedious parts of machine learning---pipeline design. We implement an open source Tree-based Pipeline Optimization Tool (TPOT) in Python and demonstrate its effectiveness on a series of simulated and real-world benchmark data sets. In particular, we show that TPOT can design machine learning pipelines that provide a significant improvement over a basic machine learning analysis while requiring little to no input or prior knowledge from the user. We also address the tendency for TPOT to design overly complex pipelines by integrating Pareto optimization, which produces compact pipelines without sacrificing classification accuracy. As such, this work represents an important step toward fully automating machine learning pipeline design. Comment: 8 pages, 5 figures, preprint to appear in GECCO 2016, edits not yet made from reviewer comment
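
    A short usage sketch following TPOT's documented quick-start pattern (requires pip install tpot); the dataset and parameter values are illustrative. TPOT evolves tree-structured scikit-learn pipelines and can export the best one as plain Python code.

        # Sketch: automated pipeline search with TPOT on a small benchmark dataset.
        from sklearn.datasets import load_digits
        from sklearn.model_selection import train_test_split
        from tpot import TPOTClassifier

        X, y = load_digits(return_X_y=True)
        X_train, X_test, y_train, y_test = train_test_split(
            X, y, test_size=0.25, random_state=42)

        # Evolve candidate pipelines; Pareto optimization keeps the winners compact.
        tpot = TPOTClassifier(generations=5, population_size=20,
                              verbosity=2, random_state=42)
        tpot.fit(X_train, y_train)

        print(tpot.score(X_test, y_test))        # accuracy of the best discovered pipeline
        tpot.export('tpot_digits_pipeline.py')   # write that pipeline out as scikit-learn code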