    Partially linear additive quantile regression in ultra-high dimension

    We consider a flexible semiparametric quantile regression model for analyzing high dimensional heterogeneous data. This model has several appealing features: (1) By considering different conditional quantiles, we may obtain a more complete picture of the conditional distribution of a response variable given high dimensional covariates. (2) The sparsity level is allowed to be different at different quantile levels. (3) The partially linear additive structure accommodates nonlinearity and circumvents the curse of dimensionality. (4) It is naturally robust to heavy-tailed distributions. In this paper, we approximate the nonlinear components using B-spline basis functions. We first study estimation under this model when the nonzero components are known in advance and the number of covariates in the linear part diverges. We then investigate a nonconvex penalized estimator for simultaneous variable selection and estimation. We derive its oracle property for a general class of nonconvex penalty functions in the presence of ultra-high dimensional covariates under relaxed conditions. To tackle the challenges of nonsmooth loss function, nonconvex penalty function and the presence of nonlinear components, we combine a recently developed convex-differencing method with modern empirical process techniques. Monte Carlo simulations and an application to a microarray study demonstrate the effectiveness of the proposed method. We also discuss how the method for a single quantile of interest can be extended to simultaneous variable selection and estimation at multiple quantiles. Comment: Published at http://dx.doi.org/10.1214/15-AOS1367 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
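
The quantile loss that this abstract calls nonsmooth is commonly written as the check function rho_tau(u) = u * (tau - 1{u < 0}). The following is a minimal sketch of that loss only, not the paper's penalized B-spline estimator; the function names `check_loss` and `constant_quantile_fit` are hypothetical and chosen for illustration.

```python
import numpy as np


def check_loss(residuals, tau):
    """Quantile check loss rho_tau(u) = u * (tau - 1{u < 0}), elementwise."""
    r = np.asarray(residuals, dtype=float)
    return r * (tau - (r < 0))


def constant_quantile_fit(y, tau):
    """Pick the observed value that minimizes the mean check loss.

    Illustrative only: a constant-only model searched over the data points,
    with no covariates, splines, or penalty.
    """
    y = np.asarray(y, dtype=float)
    mean_losses = [check_loss(y - c, tau).mean() for c in y]
    return y[int(np.argmin(mean_losses))]


# A single large outlier barely moves the tau = 0.5 fit, which is the
# robustness to heavy tails that the abstract refers to.
y = np.array([1.0, 2.0, 3.0, 4.0, 100.0])
median_fit = constant_quantile_fit(y, tau=0.5)
```

Because the loss is piecewise linear with a kink at zero, it has no derivative there, which is one reason specialized optimization methods are used for penalized versions.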

    A Cluster Elastic Net for Multivariate Regression

    We propose a method for estimating coefficients in multivariate regression when there is a clustering structure to the response variables. The proposed method includes a fusion penalty, to shrink the difference in fitted values from responses in the same cluster, and an L1 penalty for simultaneous variable selection and estimation. The method can be used when the grouping structure of the response variables is known or unknown. When the clustering structure is unknown, the method will simultaneously estimate the clusters of the response and the regression coefficients. Theoretical results are presented for the penalized least squares case, including asymptotic results allowing for p >> n. We extend our method to the setting where the responses are binomial variables. We propose a coordinate descent algorithm for both the normal and binomial likelihood, which can easily be extended to other generalized linear model (GLM) settings. Simulations and data examples from business operations and genomics are presented to show the merits of both the least squares and binomial methods. Comment: 37 pages, 11 figures
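
To make the two penalties concrete, here is a rough sketch of an objective with a least squares loss, an L1 penalty on the coefficients, and a fusion penalty on differences in fitted values within known clusters. The scaling constants, the function name `cluster_fusion_objective`, and the parameter names `lam` and `gamma` are assumptions for illustration; the paper's exact objective and its coordinate descent algorithm are not reproduced here.

```python
import numpy as np


def cluster_fusion_objective(X, Y, B, clusters, lam, gamma):
    """Evaluate a least squares loss plus L1 and within-cluster fusion penalties.

    X: (n, p) predictors, Y: (n, r) responses, B: (p, r) coefficients.
    clusters: list of lists of response (column) indices, assumed known.
    The 1 / (2n) scaling on the loss and fusion terms is an assumption.
    """
    n = X.shape[0]
    fitted = X @ B
    loss = np.sum((Y - fitted) ** 2) / (2 * n)
    l1 = lam * np.sum(np.abs(B))

    # Sum squared differences in fitted values over response pairs
    # that share a cluster.
    fusion = 0.0
    for members in clusters:
        for i, a in enumerate(members):
            for b in members[i + 1:]:
                fusion += np.sum((fitted[:, a] - fitted[:, b]) ** 2)

    return loss + l1 + gamma * fusion / (2 * n)


# Tiny example: two responses placed in one cluster.
X = np.eye(2)
Y = np.zeros((2, 2))
B = np.array([[1.0, 2.0], [3.0, 4.0]])
value = cluster_fusion_objective(X, Y, B, clusters=[[0, 1]], lam=0.1, gamma=1.0)
```

Setting `gamma` to zero reduces this sketch to an ordinary lasso-type objective, so the fusion term is what ties together responses in the same cluster.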

    David Sherwood: Invasive procedures

    Sherwood uses a unique in vivo model to study how cells invade through extracellular barriers.

    On the Use of Minimum Penalties in Statistical Learning

    Modern multivariate machine learning and statistical methodologies estimate parameters of interest while leveraging prior knowledge of the association between outcome variables. The methods that do allow for estimation of relationships typically do so through an error covariance matrix in multivariate regression, which does not scale to other types of models. In this article we propose the MinPen framework to simultaneously estimate regression coefficients associated with the multivariate regression model and the relationships between outcome variables using mild assumptions. The MinPen framework utilizes a novel penalty based on the minimum function to exploit detected relationships between responses. An iterative algorithm that generalizes current state-of-the-art methods is proposed as a solution to the non-convex optimization that is required to obtain estimates. Theoretical results such as high dimensional convergence rates, model selection consistency, and a framework for post selection inference are provided. We extend the proposed MinPen framework to other exponential family loss functions, with a specific focus on multiple binomial responses. Tuning parameter selection is also addressed. Finally, simulations and two data examples are presented to show the finite sample properties of this framework.

    Does Narrative Impact Funding? Analyzing the Relationship Between Project Description and Pledged Amounts for Reward-based Crowdfunding Projects

    In reward-based crowdfunding (RBC) campaigns, project description text plays a critical role in driving market demand by simplifying complex project information and providing clear backing signals. Indeed, well-crafted textual descriptions could persuade potential backers to better fund the respective projects. Through the lenses of theories on framing and resonance, we examine three attributes that are key to forming a compelling narrative: innovation disclosure, linguistic specificity, and shared phrase utilization. We posit that better communication regarding product innovation (i.e., innovation disclosure), employing phrases commonly found in comparable projects (i.e., shared phrase utilization), and incorporating concrete and precise language (i.e., linguistic specificity) are associated with higher funding. Using data from technology and product design project categories of a prominent RBC platform, our hypotheses are tested and largely supported. Our study contributes to information systems (IS) research by exploring creators’ resonance strategies and the role of the project description narratives in funding outcomes.

    An Autonomous Earth Observing Sensorweb

    We describe a network of sensors linked by software and the internet to an autonomous satellite observation response capability. This system of systems is designed with a flexible, modular architecture to facilitate expansion in sensors, customization of trigger conditions, and customization of responses. This system has been used to implement a global surveillance program of science phenomena including volcanoes, flooding, cryosphere events, and atmospheric phenomena. In this paper we describe the importance of the earth observing sensorweb application as well as the overall architecture for the network.

    Sapper: A Language for Hardware-Level Security Policy Enforcement

    Privacy and integrity are important security concerns. These concerns are addressed by controlling information flow, i.e., restricting how information can flow through a system. Most proposed systems that restrict information flow make the implicit assumption that the hardware used by the system is fully “correct” and that the hardware’s instruction set accurately describes its behavior in all circumstances. The truth is more complicated: modern hardware designs defy complete verification; many aspects of the timing and ordering of events are left totally unspecified; and implementation bugs present themselves with surprising frequency. In this work we describe Sapper, a novel hardware description language for designing security-critical hardware components. Sapper seeks to address these problems by using static analysis a

    Dynamics of Nonequilibrium Deposition

    In this work we survey selected theoretical developments for models of deposition of extended particles, with and without surface diffusion, on linear and planar substrates, of interest in colloid, polymer, and certain biological systems. Comment: 35 pages in plain TeX and 4 JPG figures, to appear in a special volume entitled "Adhesion of Submicron Particles on Solid Surfaces" of Colloids and Surfaces A, guest-edited by V. Privman