Correcting for misclassification error in gross flows using double sampling: moment-based inference vs. likelihood-based inference
Gross flows are discrete longitudinal data defined as transition counts, between a finite number of states, from one point in time to another. We discuss the analysis of gross flows in the presence of misclassification error via double sampling methods. Traditionally, estimates adjusted for misclassification error are obtained using a moment-based estimator. We propose a likelihood-based approach that works by simultaneously modeling the true transition process and the misclassification error process within the context of a missing data problem. Monte Carlo simulation results indicate that the maximum likelihood estimator is more efficient than the moment-based estimator.
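As an illustrative sketch (not the paper's estimator), the moment-based correction can be viewed as inverting a known misclassification matrix applied to the observed flow table. The two-state setup, counts, and error rates below are all hypothetical:

```python
import numpy as np

# Hypothetical two-state example; states, counts and error rates are made up.
# True transition counts between states A and B:
true_flows = np.array([[400.0, 100.0],
                       [ 50.0, 450.0]])

# Misclassification matrix: M[i, j] = P(observed state j | true state i),
# as would be estimated from the accurate re-measurements in a double sample.
M = np.array([[0.9, 0.1],
              [0.2, 0.8]])

# Error-contaminated flows: both the origin and destination states are
# mixed through M (expected counts, ignoring sampling noise).
observed = M.T @ true_flows @ M

# Moment-based correction: undo the mixing on both margins.
Minv = np.linalg.inv(M)
corrected = Minv.T @ observed @ Minv
print(np.round(corrected, 6))
```

In practice the misclassification matrix is itself estimated from the double sample, so the inversion propagates its sampling error; that is one intuition for why jointly modelling both processes in a likelihood can be more efficient.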
Past and present cosmic structure in the SDSS DR7 main sample
We present a chrono-cosmography project, aiming at the inference of the
four-dimensional formation history of the observed large scale structure from its
origin to the present epoch. To do so, we perform a full-scale Bayesian
analysis of the northern galactic cap of the Sloan Digital Sky Survey (SDSS)
Data Release 7 main galaxy sample, relying on a fully probabilistic, physical
model of the non-linearly evolved density field. Besides inferring initial
conditions from observations, our methodology naturally and accurately
reconstructs non-linear features at the present epoch, such as walls and
filaments, corresponding to high-order correlation functions generated by
late-time structure formation. Our inference framework self-consistently
accounts for typical observational systematic and statistical uncertainties
such as noise, survey geometry and selection effects. We further account for
luminosity dependent galaxy biases and automatic noise calibration within a
fully Bayesian approach. As a result, this analysis provides highly-detailed
and accurate reconstructions of the present density field on scales larger than
Mpc, constrained by SDSS observations. This approach also leads to
the first quantitative inference of plausible formation histories of the
dynamic large scale structure underlying the observed galaxy distribution. The
results described in this work constitute the first full Bayesian non-linear
analysis of the cosmic large scale structure with the demonstrated capability
of uncertainty quantification. Some of these results will be made publicly
available along with this work. The level of detail of inferred results and the
high degree of control on observational uncertainties pave the path towards
high precision chrono-cosmography, the subject of simultaneously studying the
dynamics and the morphology of the inhomogeneous Universe.
Comment: 27 pages, 9 figures
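The idea of inferring initial conditions from the observed late-time field can be caricatured with a per-pixel Gaussian toy model; the growth factor, noise level, and prior below are invented for illustration and bear no relation to the paper's full non-linear physical model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear model (illustrative): final density = growth * initial + noise.
D = 1.5          # assumed linear growth factor
sigma_n = 0.3    # assumed observational noise std
sigma_p = 1.0    # Gaussian prior std on the initial field

delta_init_true = rng.normal(0.0, sigma_p, size=1000)
delta_obs = D * delta_init_true + rng.normal(0.0, sigma_n, size=1000)

# Conjugate Gaussian posterior for each pixel of the initial field:
post_var = 1.0 / (1.0 / sigma_p**2 + D**2 / sigma_n**2)
post_mean = post_var * (D / sigma_n**2) * delta_obs

# The posterior mean should beat the prior mean (zero) as an estimate:
rmse_prior = np.sqrt(np.mean(delta_init_true**2))
rmse_post = np.sqrt(np.mean((post_mean - delta_init_true)**2))
print(rmse_post < rmse_prior)
```

The real analysis replaces this linear map with a full gravitational structure-formation model and samples the joint posterior, but the logic of constraining initial conditions through a forward model is the same.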
On valid descriptive inference from non-probability sample
We examine the conditions under which descriptive inference can be based
directly on the observed distribution in a non-probability sample, under both
the super-population and quasi-randomisation modelling approaches. Review of
existing estimation methods reveals that the traditional formulation of these
conditions may be inadequate due to potential issues of under-coverage or
heterogeneous mean beyond the assumed model. We formulate unifying conditions
that are applicable to both types of modelling approaches. The difficulties of
empirically validating the required conditions are discussed, as well as valid
inference approaches using supplementary probability sampling. The key message
is that probability sampling may still be necessary in some situations, in
order to ensure the validity of descriptive inference, but it can be much less
resource-demanding given the presence of a large non-probability sample.
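A minimal quasi-randomisation-style sketch with a single binary covariate and entirely synthetic data: the non-probability sample over-represents one group, and reweighting its cells to known population shares removes most of the selection bias.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic population (illustrative): a binary group g affects both the
# outcome y and the chance of ending up in the non-probability sample.
N = 100_000
g = rng.binomial(1, 0.5, N)
y = 2.0 + 3.0 * g + rng.normal(0.0, 1.0, N)

# Self-selection over-represents g == 1 by a factor of five.
p_sel = np.where(g == 1, 0.10, 0.02)
sel = rng.random(N) < p_sel

naive = y[sel].mean()  # biased towards the g == 1 mean

# Quasi-randomisation / post-stratification: reweight the sampled cells to
# the population shares of g (here taken as known).
shares = np.array([np.mean(g == 0), np.mean(g == 1)])
cell_means = np.array([y[sel & (g == 0)].mean(), y[sel & (g == 1)].mean()])
adjusted = shares @ cell_means

true_mean = y.mean()
print(abs(adjusted - true_mean) < abs(naive - true_mean))
```

The adjustment only works because selection here depends on g alone; the paper's point is precisely that such conditions are hard to validate empirically without a supplementary probability sample.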
Exit polling and racial bloc voting: Combining individual-level and R×C ecological data
Despite its shortcomings, cross-level or ecological inference remains a
necessary part of some areas of quantitative inference, including in United
States voting rights litigation. Ecological inference suffers from a lack of
identification that, most agree, is best addressed by incorporating
individual-level data into the model. In this paper we test the limits of such
an incorporation by attempting it in the context of drawing inferences about
racial voting patterns using a combination of an exit poll and precinct-level
ecological data; accurate information about racial voting patterns is needed to
assess triggers in voting rights laws that can determine the composition of
United States legislative bodies. Specifically, we extend and study a hybrid
model that addresses two-way tables of arbitrary dimension. We apply the hybrid
model to an exit poll we administered in the City of Boston in 2008. Using the
resulting data as well as simulation, we compare the performance of a pure
ecological estimator, pure survey estimators using various sampling schemes and
our hybrid. We conclude that the hybrid estimator offers substantial benefits
by enabling substantive inferences about voting patterns not practicably
available without its use.
Comment: Published at http://dx.doi.org/10.1214/10-AOAS353 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)
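A drastically simplified stand-in for the hybrid estimator: treat the exit-poll and ecological estimates of one cell of the table (e.g., the share of one racial group supporting a candidate) as independent and combine them by inverse-variance weighting. All numbers are hypothetical; the paper's actual approach is a joint Bayesian model, not this two-step combination.

```python
# Hypothetical estimates of one voting-pattern cell (all numbers made up):
survey_est, survey_var = 0.62, 0.0025   # exit poll: precise but small n
eco_est, eco_var = 0.55, 0.0100         # ecological: weakly identified

# Inverse-variance (precision) weighting, a crude stand-in for a joint model:
w_s, w_e = 1.0 / survey_var, 1.0 / eco_var
hybrid = (w_s * survey_est + w_e * eco_est) / (w_s + w_e)
hybrid_var = 1.0 / (w_s + w_e)
print(hybrid, hybrid_var)
```

Note that the combined variance is smaller than either input variance, which is the basic reason a hybrid can support inferences that neither data source allows alone.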
Comment on Article by Ferreira and Gamerman
A utility-function approach to optimal spatial sampling design is a powerful
way to quantify what "optimality" means. The emphasis then should be to capture
all possible contributions to utility, including scientific impact and the cost
of sampling. The resulting sampling plan should contain a component of designed
randomness that would allow for a non-parametric design-based analysis if
model-based assumptions were in doubt. [arXiv:1509.03410]
Comment: Published at http://dx.doi.org/10.1214/15-BA944B in Bayesian Analysis (http://projecteuclid.org/euclid.ba) by the International Society of Bayesian Analysis (http://bayesian.org/)
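The utility-function idea can be sketched with a one-parameter design problem: choose the number of samples n that maximises an information term minus a linear sampling cost. The information expression and all constants are illustrative only:

```python
import numpy as np

# Toy expected-utility design: utility(n) = information gain - sampling cost.
sigma2 = 1.0             # assumed noise variance
cost_per_sample = 0.05   # assumed cost per observation (utility units)

candidates = np.arange(1, 101)
# Information about a mean from n iid samples (nats): 0.5 * log(1 + n/sigma2)
utility = 0.5 * np.log1p(candidates / sigma2) - cost_per_sample * candidates
best_n = candidates[np.argmax(utility)]
print(best_n)
```

The comment's point carries over: whatever enters the utility (scientific impact, cost), the chosen plan should still retain some designed randomness as insurance against the model being wrong.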
The SWELLS Survey. VI. hierarchical inference of the initial mass functions of bulges and discs
The long-standing assumption that the stellar initial mass function (IMF) is
universal has recently been challenged by a number of observations. Several
studies have shown that a "heavy" IMF (e.g., with a Salpeter-like abundance of
low mass stars and thus normalisation) is preferred for massive early-type
galaxies, while this IMF is inconsistent with the properties of less massive,
later-type galaxies. These discoveries motivate the hypothesis that the IMF may
vary (possibly very slightly) across galaxies and across components of
individual galaxies (e.g. bulges vs discs). In this paper we use a sample of 19
late-type strong gravitational lenses from the SWELLS survey to investigate the
IMFs of the bulges and discs in late-type galaxies. We perform a joint analysis
of the galaxies' total masses (constrained by strong gravitational lensing) and
stellar masses (constrained by optical and near-infrared colours in the context
of a stellar population synthesis [SPS] model, up to an IMF normalisation
parameter). Using minimal assumptions apart from the physical constraint that
the total stellar mass within any aperture must be less than the total mass
within the aperture, we find that the bulges of the galaxies cannot have IMFs
heavier (i.e. implying high mass per unit luminosity) than Salpeter, while the
disc IMFs are not well constrained by this data set. We also discuss the
necessity for hierarchical modelling when combining incomplete information
about multiple astronomical objects. This modelling approach allows us to place
upper limits on the size of any departures from universality. More data,
including spatially resolved kinematics (as in paper V) and stellar population
diagnostics over a range of bulge and disc masses, are needed to robustly
quantify how the IMF varies within galaxies.
Comment: Accepted for publication in MNRAS. 15 pages, 8 figures. Code available at https://github.com/eggplantbren/SWELLS_Hierarchica
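The benefit of hierarchical modelling over object-by-object fits can be sketched with a toy partial-pooling example: 19 noisy per-galaxy "IMF mismatch" measurements are shrunk towards a population mean, reducing their error. All numbers are synthetic and this is not the paper's model:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy hierarchy (illustrative): each galaxy i has a latent IMF mismatch
# alpha_i ~ Normal(mu, tau), observed with per-galaxy noise s_i.
n_gal = 19
mu_true, tau_true = 0.0, 0.1
alpha = rng.normal(mu_true, tau_true, n_gal)     # true per-galaxy offsets
s = rng.uniform(0.2, 0.5, n_gal)                 # heterogeneous noise levels
obs = alpha + rng.normal(0.0, 1.0, n_gal) * s    # noisy measurements

# Method-of-moments estimate of the population variance tau^2:
tau2_hat = max(np.var(obs, ddof=1) - np.mean(s**2), 1e-6)

# Pooled mean, weighting each galaxy by its total precision:
mu_hat = np.average(obs, weights=1.0 / (tau2_hat + s**2))

# Partial pooling: noisier galaxies are shrunk harder towards the pool.
w = tau2_hat / (tau2_hat + s**2)
shrunk = w * obs + (1.0 - w) * mu_hat

rmse_raw = np.sqrt(np.mean((obs - alpha)**2))
rmse_shrunk = np.sqrt(np.mean((shrunk - alpha)**2))
print(rmse_shrunk < rmse_raw)
```

The same pooling logic is what lets the paper place an upper limit on departures from a universal IMF: the inferred population scatter is itself a parameter of the hierarchy.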