Building and using semiparametric tolerance regions for parametric multinomial models
We introduce a semiparametric "tubular neighborhood" of a parametric model
in the multinomial setting. It consists of all multinomial distributions lying
in a distance-based neighborhood of the parametric model of interest. Fitting
such a tubular model allows one to use a parametric model while treating it as
an approximation to the true distribution. In this paper, the Kullback--Leibler
distance is used to build the tubular region. Based on this idea one can define
the distance between the true multinomial distribution and the parametric model
to be the index of fit. The paper develops a likelihood ratio test procedure
for testing the magnitude of the index. A semiparametric bootstrap method is
implemented to better approximate the distribution of the LRT statistic. The
approximation permits more accurate construction of a lower confidence limit
for the model fitting index.

Comment: Published in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org), at http://dx.doi.org/10.1214/08-AOS603.
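The index of fit described above can be sketched numerically: the Kullback--Leibler distance from the empirical multinomial to the nearest member of a parametric family. The helpers below, the grid search, and the binomial family used as the parametric model are illustrative choices, not the paper's actual procedure or code.

```python
import numpy as np

def kl_divergence(p, q):
    """Kullback--Leibler distance KL(p || q) between two multinomial
    probability vectors (terms with p_i = 0 contribute 0)."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def index_of_fit(counts, family):
    """Index of fit: KL distance from the empirical multinomial to the
    nearest member of a parametric family (a list of probability vectors)."""
    p_hat = np.asarray(counts, float) / np.sum(counts)
    return min(kl_divergence(p_hat, q) for q in family)

# Hypothetical parametric model: Binomial(2, theta) cell probabilities
# on a grid of theta values; the counts are made-up data.
thetas = np.linspace(0.01, 0.99, 99)
family = [((1 - t) ** 2, 2 * t * (1 - t), t ** 2) for t in thetas]
counts = [30, 50, 20]
fit = index_of_fit(counts, family)  # small when the family fits well
```

A semiparametric bootstrap would resample from a distribution in the tubular region around the fitted model to calibrate the LRT statistic; that step is omitted here.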
Efficiency in Spanish banking: A multistakeholder approach analysis
Searching for greater efficiency has been used as a reason to modify the Spanish banking system since 2009. This paper aims to contribute to quantifying the magnitude of efficiency – not only economic efficiency, but also social and overall efficiency – from 2000 to 2011. The case of Spain – compared to other banking systems – provides unique information for the stakeholder-governance banking literature, because over the last century savings banks have become rooted in Spanish culture. The results – confirmed by a two-stage frontier analysis and a DEA model combined with bootstrapped tests – indicate that Spanish savings banks are not less efficient globally than banks, and are more efficient socially. Moreover, our results – with potentially important implications – encourage the participation of stakeholders in banking systems and underline the importance of attaining long-term efficiency gains to support financial stability objectives.
Point process modeling for directed interaction networks
Network data often take the form of repeated interactions between senders and
receivers tabulated over time. A primary question to ask of such data is which
traits and behaviors are predictive of interaction. To answer this question, a
model is introduced for treating directed interactions as a multivariate point
process: a Cox multiplicative intensity model using covariates that depend on
the history of the process. Consistency and asymptotic normality are proved for
the resulting partial-likelihood-based estimators under suitable regularity
conditions, and an efficient fitting procedure is described. Multicast
interactions--those involving a single sender but multiple receivers--are
treated explicitly. The resulting inferential framework is then employed to
model message sending behavior in a corporate e-mail network. The analysis
gives a precise quantification of which static shared traits and dynamic
network effects are predictive of message recipient selection.

Comment: 36 pages, 13 figures; includes supplementary material.
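The partial-likelihood idea behind the Cox multiplicative intensity model can be sketched as follows: with intensities of the form lambda_0(t) exp(beta' x_ij(t)), the baseline cancels at each event time, and each observed interaction contributes a multinomial-logit term over the candidate receivers. The interface below (events given as covariate-matrix/receiver pairs) is a hypothetical simplification, not the paper's software.

```python
import numpy as np

def log_partial_likelihood(beta, events):
    """Cox-type log partial likelihood for recipient selection.  Each
    element of `events` is a pair (X, j): X is the (n_receivers,
    n_covariates) covariate matrix at that event time, j is the index of
    the receiver actually chosen.  The event contributes
    beta'x_j - log(sum_k exp(beta'x_k)) over the risk set."""
    ll = 0.0
    for X, j in events:
        scores = X @ beta
        ll += scores[j] - np.log(np.sum(np.exp(scores)))
    return ll

# Toy data: two events, three candidate receivers, one covariate that
# equals 1 exactly for the chosen receiver.
events = [(np.array([[1.0], [0.0], [0.0]]), 0),
          (np.array([[0.0], [1.0], [0.0]]), 1)]
beta = np.array([1.0])
ll = log_partial_likelihood(beta, events)
```

Maximizing this expression in beta (e.g. by Newton's method) yields the partial-likelihood estimators whose consistency and asymptotic normality the paper establishes; multicast events would add one term per recipient.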
On statistics, computation and scalability
How should statistical procedures be designed so as to be scalable
computationally to the massive datasets that are increasingly the norm? When
coupled with the requirement that an answer to an inferential question be
delivered within a certain time budget, this question has significant
repercussions for the field of statistics. With the goal of identifying
"time-data tradeoffs," we investigate some of the statistical consequences of
computational perspectives on scalability, in particular divide-and-conquer
methodology and hierarchies of convex relaxations.

Comment: Published in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm), at http://dx.doi.org/10.3150/12-BEJSP17.
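The divide-and-conquer side of such a time-data tradeoff can be illustrated with the simplest possible estimator; this is a toy sketch, not any of the paper's procedures.

```python
import numpy as np

def divide_and_conquer_mean(x, n_chunks):
    """Toy divide-and-conquer estimator: compute an estimate on each
    chunk of the data separately (an embarrassingly parallel step), then
    average the chunk estimates.  For the sample mean with equal-sized
    chunks the combined estimate is exact; for nonlinear estimators,
    averaging typically trades a little statistical efficiency for a
    large reduction in per-machine computation time."""
    chunks = np.array_split(np.asarray(x, float), n_chunks)
    return float(np.mean([c.mean() for c in chunks]))
```

Under a fixed time budget, increasing `n_chunks` shrinks the per-machine cost while the aggregation step stays cheap, which is the tradeoff the abstract refers to.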
A flat trend of star-formation rate with X-ray luminosity of galaxies hosting AGN in the SCUBA-2 Cosmology Legacy Survey
© 2019 The Author(s). Published by Oxford University Press on behalf of the Royal Astronomical Society.

Feedback processes from active galactic nuclei (AGN) are thought to play a crucial role in regulating star formation in massive galaxies. Previous studies using Herschel have resulted in conflicting conclusions as to whether star formation is quenched, enhanced, or not affected by AGN feedback. We use new deep 850 μm observations from the SCUBA-2 Cosmology Legacy Survey (S2CLS) to investigate star formation in a sample of X-ray selected AGN, probing galaxies up to L(0.5-7 keV) = 10^46 erg s^-1. Here, we present the results of our analysis on a sample of 1957 galaxies at 1 < z < 3, using both S2CLS and ancillary data at seven additional wavelengths (24-500 μm) from Herschel and Spitzer. We perform a stacking analysis, binning our sample by redshift and X-ray luminosity. By fitting analytical spectral energy distributions (SEDs) to decompose contributions from cold and warm dust, we estimate star formation rates (SFRs) for each 'average' source. We find that the average AGN in our sample resides in a star-forming host galaxy, with SFRs ranging from 80 to 600 M_⊙ yr^-1. Within each redshift bin, we see no trend of SFR with X-ray luminosity, instead finding a flat distribution of SFR across ∼3 orders of magnitude of AGN luminosity. By studying instantaneous X-ray luminosities and SFRs, we find no evidence that AGN activity affects star formation in host galaxies.

Peer reviewed. Final Accepted Version.
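A mean-stacking analysis of the kind described above (binning sources by redshift and X-ray luminosity, then averaging the flux in each bin) can be sketched as follows; the function name and array interface are invented for illustration and are not the survey's pipeline.

```python
import numpy as np

def stack_fluxes(z, lx, flux, z_edges, lx_edges):
    """Mean-stacking sketch: bin sources on a redshift x X-ray-luminosity
    grid and average the (e.g. 850 um) flux in each bin.  Returns a
    (n_z_bins, n_lx_bins) array of mean fluxes, NaN for empty bins."""
    z, lx, flux = (np.asarray(a, float) for a in (z, lx, flux))
    zi = np.digitize(z, z_edges) - 1   # redshift bin index per source
    li = np.digitize(lx, lx_edges) - 1 # luminosity bin index per source
    out = np.full((len(z_edges) - 1, len(lx_edges) - 1), np.nan)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            sel = (zi == i) & (li == j)
            if sel.any():
                out[i, j] = flux[sel].mean()
    return out
```

In the paper, an SED is then fitted to each binned "average" source to decompose warm (AGN-heated) and cold (star-formation-heated) dust; a flat run of the resulting SFRs across the luminosity bins is the reported result.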
Block-Conditional Missing at Random Models for Missing Data
Two major ideas in the analysis of missing data are (a) the EM algorithm
[Dempster, Laird and Rubin, J. Roy. Statist. Soc. Ser. B 39 (1977) 1--38] for
maximum likelihood (ML) estimation, and (b) the formulation of models for the
joint distribution of the data Y and missing-data indicators M, and the
associated "missing at random" (MAR) condition under which a model for M
is unnecessary [Rubin, Biometrika 63 (1976) 581--592]. Most previous work has
treated Y and M as single blocks, yielding selection or pattern-mixture
models depending on how their joint distribution is factorized. This paper
explores "block-sequential" models that interleave subsets of the variables
and their missing data indicators, and then make parameter restrictions based
on assumptions in each block. These include models that are not MAR. We examine
a subclass of block-sequential models we call block-conditional MAR (BCMAR)
models, and an associated block-monotone reduced likelihood strategy that
typically yields consistent estimates by selectively discarding some data.
Alternatively, full ML estimation can often be achieved via the EM algorithm.
We examine in some detail BCMAR models for the case of two multinomially
distributed categorical variables, and a two block structure where the first
block is categorical and the second block arises from a (possibly multivariate)
exponential family distribution.

Comment: Published in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org), at http://dx.doi.org/10.1214/10-STS344.
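The role of the EM algorithm here can be illustrated with a deliberately small example: two binary variables whose joint distribution is multinomial (a 2x2 cell-probability table), with the second variable missing for some units under MAR. This toy sketch is not the paper's BCMAR machinery, only the generic E-step/M-step pattern it builds on.

```python
import numpy as np

def em_two_way(complete, x_only, n_iter=200):
    """EM for a 2x2 multinomial when Y is MAR-missing for some units.
    `complete` is the 2x2 table of fully observed (X, Y) counts;
    `x_only` gives counts of units where only X was observed."""
    complete = np.asarray(complete, float)
    x_only = np.asarray(x_only, float)
    p = np.full((2, 2), 0.25)                       # uniform start
    for _ in range(n_iter):
        cond = p / p.sum(axis=1, keepdims=True)     # current P(Y | X)
        filled = complete + cond * x_only[:, None]  # E-step: allocate Y-missing counts
        p = filled / filled.sum()                   # M-step: re-estimate cell probs
    return p
```

Under MAR no model for the missingness indicators is needed, which is why the E-step can simply spread each Y-missing count across the Y cells in proportion to the current conditional P(Y | X).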