Partially linear additive quantile regression in ultra-high dimension
We consider a flexible semiparametric quantile regression model for analyzing
high dimensional heterogeneous data. This model has several appealing features:
(1) By considering different conditional quantiles, we may obtain a more
complete picture of the conditional distribution of a response variable given
high dimensional covariates. (2) The sparsity level is allowed to be different
at different quantile levels. (3) The partially linear additive structure
accommodates nonlinearity and circumvents the curse of dimensionality. (4) It
is naturally robust to heavy-tailed distributions. In this paper, we
approximate the nonlinear components using B-spline basis functions. We first
study estimation under this model when the nonzero components are known in
advance and the number of covariates in the linear part diverges. We then
investigate a nonconvex penalized estimator for simultaneous variable selection
and estimation. We derive its oracle property for a general class of nonconvex
penalty functions in the presence of ultra-high dimensional covariates under
relaxed conditions. To tackle the challenges of a nonsmooth loss function, a
nonconvex penalty function, and the presence of nonlinear components, we combine
a recently developed convex-differencing method with modern empirical process
techniques. Monte Carlo simulations and an application to a microarray study
demonstrate the effectiveness of the proposed method. We also discuss how the
method for a single quantile of interest can be extended to simultaneous
variable selection and estimation at multiple quantiles.
Comment: Published at http://dx.doi.org/10.1214/15-AOS1367 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org
A Cluster Elastic Net for Multivariate Regression
We propose a method for estimating coefficients in multivariate regression
when there is a clustering structure to the response variables. The proposed
method includes a fusion penalty, to shrink the difference in fitted values
from responses in the same cluster, and an L1 penalty for simultaneous variable
selection and estimation. The method can be used when the grouping structure of
the response variables is known or unknown. When the clustering structure is
unknown the method will simultaneously estimate the clusters of the response
and the regression coefficients. Theoretical results are presented for the
penalized least squares case, including asymptotic results allowing for p >> n.
We extend our method to the setting where the responses are binomial variables.
We propose a coordinate descent algorithm for both the normal and binomial
likelihood, which can easily be extended to other generalized linear model
(GLM) settings. Simulations and data examples from business operations and
genomics are presented to show the merits of both the least squares and
binomial methods.
Comment: 37 pages, 11 figures
David Sherwood: Invasive procedures
Sherwood uses a unique in vivo model to study how cells invade through extracellular barriers
On the Use of Minimum Penalties in Statistical Learning
Modern multivariate machine learning and statistical methodologies estimate parameters of interest while leveraging prior knowledge of the association between outcome variables. The methods that do allow for estimation of relationships typically do so through an error covariance matrix in multivariate regression, which does not scale to other types of models. In this article we propose the MinPen framework to simultaneously estimate regression coefficients associated with the multivariate regression model and the relationships between outcome variables under mild assumptions. The MinPen framework utilizes a novel penalty based on the minimum function to exploit detected relationships between responses. An iterative algorithm that generalizes current state-of-the-art methods is proposed as a solution to the non-convex optimization that is required to obtain estimates. Theoretical results such as high dimensional convergence rates, model selection consistency, and a framework for post selection inference are provided. We extend the proposed MinPen framework to other exponential family loss functions, with a specific focus on multiple binomial responses. Tuning parameter selection is also addressed. Finally, simulations and two data examples are presented to show the finite sample properties of this framework.
Does Narrative Impact Funding? Analyzing the Relationship Between Project Description and Pledged Amounts for Reward-based Crowdfunding Projects
In reward-based crowdfunding (RBC) campaigns, project description text plays a critical role in driving market demand by simplifying complex project information and providing clear backing signals. Indeed, well-crafted textual descriptions could persuade potential backers to better fund the respective projects. Through the lenses of theories on framing and resonance, we examine three attributes that are key to forming a compelling narrative: innovation disclosure, linguistic specificity, and shared phrase utilization. We posit that better communication regarding product innovation (i.e., innovation disclosure), employing phrases commonly found in comparable projects (i.e., shared phrase utilization), and incorporating concrete and precise language (i.e., linguistic specificity) are associated with higher funding. Using data from technology and product design project categories of a prominent RBC platform, our hypotheses are tested and largely supported. Our study contributes to information systems (IS) research by exploring creators’ resonance strategies and the role of the project description narratives in funding outcomes.
An Autonomous Earth Observing Sensorweb
We describe a network of sensors linked by software and the internet to an autonomous satellite observation response capability. This system of systems is designed with a flexible, modular architecture to facilitate expansion in sensors, customization of trigger conditions, and customization of responses. This system has been used to implement a global surveillance program of science phenomena including volcanoes, flooding, cryosphere events, and atmospheric phenomena. In this paper we describe the importance of the earth observing sensorweb application as well as the overall architecture for the network.
Sapper: A Language for Hardware-Level Security Policy Enforcement
Privacy and integrity are important security concerns. These concerns are addressed by controlling information flow, i.e., restricting how information can flow through a system. Most proposed systems that restrict information flow make the implicit assumption that the hardware used by the system is fully “correct” and that the hardware’s instruction set accurately describes its behavior in all circumstances. The truth is more complicated: modern hardware designs defy complete verification; many aspects of the timing and ordering of events are left totally unspecified; and implementation bugs present themselves with surprising frequency. In this work we describe Sapper, a novel hardware description language for designing security-critical hardware components. Sapper seeks to address these problems by using static analysis a
Dynamics of Nonequilibrium Deposition
In this work we survey selected theoretical developments for models of
deposition of extended particles, with and without surface diffusion, on linear
and planar substrates, of interest in colloid, polymer, and certain biological
systems.
Comment: 35 pages in plain TeX and 4 JPG figures, to appear in a special
volume entitled "Adhesion of Submicron Particles on Solid Surfaces" of
Colloids and Surfaces A, guest-edited by V. Privma