
    Change-points Estimation in Statistical Inference and Machine Learning Problems

    Statistical inference plays an increasingly important role in science, finance and industry. Despite extensive research and wide application, most efforts focus on uniform models. This thesis instead considers statistical inference in models with abrupt changes. The task is to estimate the change-points at which the underlying model changes. We first study low dimensional linear regression problems in which the underlying model undergoes multiple changes. Our goal is to estimate the number and locations of the change-points that segment the available data into different regions, and to produce a sparse and interpretable model for each region. To address the challenges of existing approaches and to produce interpretable models, we propose a sparse group Lasso (SGL) based approach for linear regression problems with change-points. We then extend our method to high dimensional nonhomogeneous linear regression models. Under certain assumptions and with a properly chosen regularization parameter, we show several desirable properties of the method. We further extend our studies to generalized linear models (GLM) and prove similar results. In practice, change-point inference usually involves high dimensional data, so it is natural to tackle it with distributed learning on feature-partitioned data, in which each machine in the cluster stores a subset of the features. One bottleneck for distributed learning is communication. To address this implementation concern, we design a communication-efficient algorithm for feature-partitioned data sets that speeds up not only change-point inference but also other classes of machine learning problems, including Lasso, support vector machine (SVM) and logistic regression.
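
One common way to cast change-point estimation in linear regression as a sparse group Lasso problem is sketched below. This is an illustrative formulation, not necessarily the exact one used in the thesis. The symbols are assumptions introduced here: $y_t$ and $x_t$ are the response and covariates at time $t$, $\beta_t$ is the coefficient vector at time $t$, $\theta_1 = \beta_1$ and $\theta_t = \beta_t - \beta_{t-1}$ for $t \ge 2$ are the jumps, $\lambda > 0$ is the regularization parameter and $\alpha \in [0, 1]$ balances the two penalties.

```latex
\hat{\theta} = \arg\min_{\theta_1,\dots,\theta_n}
  \frac{1}{n}\sum_{t=1}^{n}\Bigl(y_t - x_t^{\top}\sum_{s=1}^{t}\theta_s\Bigr)^{2}
  + \lambda\sum_{t=2}^{n}\Bigl[(1-\alpha)\,\lVert\theta_t\rVert_2
  + \alpha\,\lVert\theta_t\rVert_1\Bigr],
\qquad
\hat{\beta}_t = \sum_{s=1}^{t}\hat{\theta}_s .
```

Under this parameterization, the group penalty $\lVert\theta_t\rVert_2$ sets most jumps to exactly zero, so the estimated change-points are the indices $t \ge 2$ with $\hat{\theta}_t \neq 0$, while the $\ell_1$ term encourages sparse coefficient changes within each nonzero jump.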

    A Hybrid Approach to Privacy-Preserving Federated Learning

    Federated learning facilitates the collaborative training of models without the sharing of raw data. However, recent attacks demonstrate that simply maintaining data locality during training does not provide sufficient privacy guarantees. Rather, we need a federated learning system capable of preventing inference over both the messages exchanged during training and the final trained model, while ensuring that the resulting model also has acceptable predictive accuracy. Existing federated learning approaches use either secure multiparty computation (SMC), which is vulnerable to inference, or differential privacy, which can lead to low accuracy given a large number of parties with relatively small amounts of data each. In this paper, we present an alternative approach that utilizes both differential privacy and SMC to balance these trade-offs. Combining differential privacy with secure multiparty computation enables us to reduce the growth of noise injection as the number of parties increases without sacrificing privacy, while maintaining a pre-defined rate of trust. Our system is therefore a scalable approach that protects against inference threats and produces models with high accuracy. Additionally, our system can be used to train a variety of machine learning models, which we validate with experimental results on three different machine learning algorithms. Our experiments demonstrate that our approach outperforms state-of-the-art solutions.
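
The noise-reduction argument in this abstract can be illustrated with a small variance calculation. The sketch below is a simplified model, not the paper's implementation: it assumes independent zero-mean Gaussian noise, a target aggregate noise variance `sigma2`, and a hypothetical parameter `min_honest` for the minimum number of parties assumed not to collude. All function names are invented for illustration, and the exact scaling used in the paper may differ.

```python
def local_dp_total_variance(n_parties: int, sigma2: float) -> float:
    """Without SMC, every party must add the full noise (variance sigma2)
    to its own message, so the noise in the sum grows with n_parties."""
    return n_parties * sigma2


def hybrid_per_party_variance(sigma2: float, min_honest: int) -> float:
    """With SMC, only the aggregate is revealed, so each party can add a
    smaller share of noise (assumed scaling: sigma2 / min_honest)."""
    if min_honest < 1:
        raise ValueError("min_honest must be at least 1")
    return sigma2 / min_honest


def hybrid_total_variance(n_parties: int, sigma2: float, min_honest: int) -> float:
    """Variance of the noise in the aggregate when all n_parties add
    their reduced share (variances of independent noises add up)."""
    if min_honest > n_parties:
        raise ValueError("min_honest cannot exceed n_parties")
    return n_parties * hybrid_per_party_variance(sigma2, min_honest)


def hybrid_honest_variance(sigma2: float, min_honest: int) -> float:
    """Noise contributed by the honest parties alone; this is the part
    an adversary controlling the other parties cannot subtract."""
    return min_honest * hybrid_per_party_variance(sigma2, min_honest)


if __name__ == "__main__":
    n, sigma2, t = 10, 4.0, 8
    print("local DP total variance:", local_dp_total_variance(n, sigma2))
    print("hybrid total variance:  ", hybrid_total_variance(n, sigma2, t))
    print("hybrid honest variance: ", hybrid_honest_variance(sigma2, t))
```

In this model the honest parties alone still contribute the full target variance, while the total noise in the aggregate no longer grows linearly with the number of parties when `min_honest` scales with it.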