
    Longitudinal LASSO: Jointly Learning Features and Temporal Contingency for Outcome Prediction

    Longitudinal analysis is important in many disciplines, such as the study of behavioral transitions in social science. Only recently has feature selection drawn adequate attention in the context of longitudinal modeling. Standard techniques, such as generalized estimating equations, have been modified to select features by imposing sparsity-inducing regularizers. However, they do not explicitly model how a dependent variable relies on features measured at proximal time points. Recent graphical Granger modeling can select features in lagged time points but ignores the temporal correlations within an individual's repeated measurements. We propose an approach to automatically and simultaneously determine both the relevant features and the relevant temporal points that impact the current outcome of the dependent variable. Meanwhile, the proposed model takes into account the non-i.i.d. nature of the data by estimating the within-individual correlations. This approach decomposes the model parameters into a summation of two components and imposes separate block-wise LASSO penalties on each component when building a linear model in terms of the past $\tau$ measurements of the features. One component is used to select features, whereas the other is used to select temporally contingent points. An accelerated gradient descent algorithm is developed to efficiently solve the related optimization problem, with detailed convergence and asymptotic analyses. Computational results on both synthetic and real-world problems demonstrate the superior performance of the proposed approach over existing techniques.
    Comment: Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 201
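
    The decomposition can be sketched directly: with a response y and an array of the past $\tau$ measurements, the coefficient matrix is written as a sum A + B, and each term receives its own block-wise LASSO penalty (rows of A group a feature across all lags, columns of B group all features at one lag). Below is a minimal, hypothetical Python sketch of that idea using plain proximal gradient steps; the array shapes, step-size rule, and function names (longitudinal_lasso, group_shrink) are illustrative assumptions, and the paper's accelerated algorithm and within-individual correlation estimation are not reproduced.

```python
# Hypothetical sketch: decompose the lagged coefficients as W = A + B and apply
# block-wise (group) LASSO penalties -- rows of A group a feature across all lags,
# columns of B group all features at one lag. Plain proximal gradient only.
import numpy as np

def group_shrink(M, t, axis):
    """Block soft-thresholding of the rows (axis=1) or columns (axis=0) of M."""
    norms = np.linalg.norm(M, axis=axis, keepdims=True)
    scale = np.maximum(1.0 - t / np.maximum(norms, 1e-12), 0.0)
    return M * scale

def longitudinal_lasso(X, y, lam_feat, lam_time, n_iter=500):
    # X: (n, p, tau) past tau measurements of p features; y: (n,) outcomes.
    n, p, tau = X.shape
    Xf = X.reshape(n, p * tau)
    lr = n / (2.0 * np.linalg.norm(Xf, 2) ** 2)          # conservative Lipschitz step
    A = np.zeros((p, tau))                                # feature-selection component
    B = np.zeros((p, tau))                                # temporal-selection component
    for _ in range(n_iter):
        resid = Xf @ (A + B).ravel() - y
        grad = (Xf.T @ resid / n).reshape(p, tau)         # shared smooth-loss gradient
        A = group_shrink(A - lr * grad, lr * lam_feat, axis=1)  # rows: features
        B = group_shrink(B - lr * grad, lr * lam_time, axis=0)  # columns: lags
    return A, B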

    An Aggregation Method for Sparse Logistic Regression

    $L_1$-regularized logistic regression has now become a workhorse of data mining and bioinformatics: it is widely used for many classification problems, particularly ones with many features. However, $L_1$ regularization typically selects too many features, so that so-called false positives are unavoidable. In this paper, we demonstrate and analyze an aggregation method for sparse logistic regression in high dimensions. This approach linearly combines the estimators from a suitable set of logistic models with different underlying sparsity patterns and can balance predictive ability and model interpretability. The numerical performance of the proposed aggregation method is then investigated using simulation studies. We also analyze a published genome-wide case-control dataset to further evaluate the usefulness of the aggregation method in multilocus association mapping.
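
    A minimal sketch of the aggregation idea, under assumptions not taken from the paper: fit $L_1$-penalized logistic models along a path of penalty levels (so each candidate carries a different sparsity pattern), then linearly combine their predicted probabilities with weights derived from held-out log-likelihood. The exponential-weighting rule and the function name aggregate_sparse_logistic are illustrative, not the authors' scheme; binary 0/1 labels are assumed.

```python
# Hypothetical sketch: combine L1-penalized logistic fits with different sparsity
# patterns by exponentially weighting their held-out log-likelihoods.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def aggregate_sparse_logistic(X, y, Cs=np.logspace(-2, 1, 10), temp=1.0, seed=0):
    X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=seed)
    models, loglik = [], []
    for C in Cs:
        m = LogisticRegression(penalty="l1", C=C, solver="liblinear").fit(X_tr, y_tr)
        p = np.clip(m.predict_proba(X_val)[:, 1], 1e-12, 1 - 1e-12)
        loglik.append(np.sum(y_val * np.log(p) + (1 - y_val) * np.log(1 - p)))
        models.append(m)
    w = np.exp((np.array(loglik) - max(loglik)) / temp)   # exponential weights
    w /= w.sum()
    def predict_proba(X_new):
        probs = np.column_stack([m.predict_proba(X_new)[:, 1] for m in models])
        return probs @ w                                   # weighted linear combination
    return predict_proba, w
```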

    Market Structure and Entry: Where's the Beef?

    We study the effects of market structure on entry using data from the UK fast food (counter-service burger) industry over the years 1991-1995. Over this period, the market can be characterized as a duopoly. We find that market structure matters greatly: for both firms, rival presence increases the probability of entry. We control for market-specific time-invariant unobservables and their correlation with existing outlets of both firms through a variety of methods.

    Oracle Properties and Finite Sample Inference of the Adaptive Lasso for Time Series Regression Models

    We derive new theoretical results on the properties of the adaptive least absolute shrinkage and selection operator (adaptive lasso) for time series regression models. In particular, we investigate the question of how to conduct finite sample inference on the parameters given an adaptive lasso model for some fixed value of the shrinkage parameter. Central to this study is the test of the hypothesis that a given adaptive lasso parameter equals zero, which therefore tests for a false positive. To this end we construct a simple testing procedure and show, theoretically and empirically through extensive Monte Carlo simulations, that the adaptive lasso combines efficient parameter estimation, variable selection, and valid finite sample inference in one step. Moreover, we analytically derive a bias correction factor that is able to significantly improve the empirical coverage of the test on the active variables. Finally, we apply the introduced testing procedure to investigate the relation between the short rate dynamics and the economy, thereby providing a statistical foundation (from a model-choice perspective) for the classic Taylor rule monetary policy model.
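
    For reference, the two-step adaptive lasso itself can be sketched as follows, assuming a design matrix X of (possibly lagged) regressors and a response y; the ridge initial estimator, the weight exponent gamma, and the function name adaptive_lasso are illustrative choices, and the paper's finite-sample test and bias correction are not reproduced here.

```python
# Hypothetical sketch of the two-step adaptive lasso: an initial (ridge) estimate
# supplies weights 1/|beta_init|^gamma, and the weighted L1 problem is solved by
# rescaling the regressor columns.
import numpy as np
from sklearn.linear_model import Ridge, Lasso

def adaptive_lasso(X, y, lam=0.1, gamma=1.0, ridge_alpha=1.0):
    beta_init = Ridge(alpha=ridge_alpha, fit_intercept=False).fit(X, y).coef_
    w = 1.0 / (np.abs(beta_init) ** gamma + 1e-12)       # adaptive weights
    Xw = X / w                                            # column rescaling absorbs the weights
    fit = Lasso(alpha=lam, fit_intercept=False, max_iter=10000).fit(Xw, y)
    return fit.coef_ / w                                  # map back to the original scale
```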

    Technical Report: Compressive Temporal Higher Order Cyclostationary Statistics

    The application of nonlinear transformations to a cyclostationary signal for the purpose of revealing hidden periodicities has proven to be useful for applications requiring signal selectivity and noise tolerance. The fact that the hidden periodicities, referred to as cyclic moments, are often compressible in the Fourier domain motivates the use of compressive sensing (CS) as an efficient acquisition protocol for capturing such signals. In this work, we consider the class of Temporal Higher Order Cyclostationary Statistics (THOCS) estimators when CS is used to acquire the cyclostationary signal, assuming compressible cyclic moments in the Fourier domain. We develop a theoretical framework for estimating THOCS using the low-rate nonuniform sampling protocol from CS and illustrate the performance of this framework using simulated data.
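
    A toy sketch of the compressive estimation idea, under simplifying assumptions not taken from the paper: apply the simplest second-order nonlinear transform (the squared magnitude, i.e. a lag-0 lag product) to nonuniform low-rate samples and recover its sparse set of cyclic frequencies by $L_1$ regression against a cosine/sine dictionary. Higher-order lag products and the authors' THOCS estimator framework are omitted; all names and parameters below are hypothetical.

```python
# Hypothetical sketch: recover cyclic frequencies of |x[n]|^2 from nonuniform
# low-rate samples via L1 regression against a cosine/sine dictionary.
import numpy as np
from sklearn.linear_model import Lasso

def compressive_cyclic_moment(x_samples, sample_idx, N, lam=1e-3):
    # x_samples: signal values observed at integer times sample_idx (a subset of 0..N-1)
    z = np.abs(x_samples) ** 2                        # nonlinear (lag-0 second-order) transform
    freqs = np.arange(1, N // 2) / N                  # candidate cyclic frequencies
    t = np.asarray(sample_idx)[:, None]
    D = np.hstack([np.cos(2 * np.pi * freqs * t),     # real part of each cyclic tone
                   np.sin(2 * np.pi * freqs * t)])    # imaginary part
    fit = Lasso(alpha=lam, max_iter=50000).fit(D, z - z.mean())
    a, b = np.split(fit.coef_, 2)
    return freqs, np.sqrt(a ** 2 + b ** 2)            # magnitude of each recovered cyclic component
```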

    Quantile calculus and censored regression

    Quantile regression has been advocated in survival analysis to assess evolving covariate effects. However, challenges arise when the censoring time is not always observed and may be covariate-dependent, particularly in the presence of continuously distributed covariates. In spite of several recent advances, existing methods either involve algorithmic complications or impose a probability grid. The former leads to difficulties in implementation and asymptotics, whereas the latter introduces undesirable grid dependence. To resolve these issues, we develop a fundamental and general quantile calculus on the cumulative probability scale in this article, upon recognizing that the probability and time scales do not always have a one-to-one mapping given a survival distribution. These results give rise to a novel estimation procedure for censored quantile regression, based on estimating integral equations. A numerically reliable and efficient Progressive Localized Minimization (PLMIN) algorithm is proposed for the computation. This procedure reduces exactly to the Kaplan-Meier method in the $k$-sample problem, and to standard uncensored quantile regression in the absence of censoring. Under regularity conditions, the proposed quantile coefficient estimator is uniformly consistent and converges weakly to a Gaussian process. Simulations show good statistical and algorithmic performance. The proposal is illustrated in an application to a clinical study.
    Comment: Published at http://dx.doi.org/10.1214/09-AOS771 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
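
    As a point of reference for the no-censoring special case noted above, standard uncensored quantile regression at level $\tau$ can be written as a small linear program minimizing the check loss. The sketch below is that textbook formulation, not the paper's quantile calculus or PLMIN algorithm, and the function name quantile_regression is illustrative.

```python
# Hypothetical sketch: quantile regression at level tau as a linear program,
# min over beta of sum_i rho_tau(y_i - x_i' beta) with rho_tau the check loss.
import numpy as np
from scipy.optimize import linprog

def quantile_regression(X, y, tau=0.5):
    n, p = X.shape
    # variables: [beta (p, free), u_plus (n, >= 0), u_minus (n, >= 0)]
    c = np.concatenate([np.zeros(p), tau * np.ones(n), (1 - tau) * np.ones(n)])
    A_eq = np.hstack([X, np.eye(n), -np.eye(n)])      # X beta + u_plus - u_minus = y
    bounds = [(None, None)] * p + [(0, None)] * (2 * n)
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    return res.x[:p]                                   # fitted quantile coefficients
```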