Search CORE

446,721 research outputs found

Efficient semi-supervised inference for logistic regression under case-control studies

Author: Chen Kani
Lin Yuanyuan
Quan Zhuojun
Yu Wen
Publication venue
Publication date: 23/02/2024
Field of study

Semi-supervised learning has received increasingly attention in statistics and machine learning. In semi-supervised learning settings, a labeled data set with both outcomes and covariates and an unlabeled data set with covariates only are collected. We consider an inference problem in semi-supervised settings where the outcome in the labeled data is binary and the labeled data is collected by case-control sampling. Case-control sampling is an effective sampling scheme for alleviating imbalance structure in binary data. Under the logistic model assumption, case-control data can still provide consistent estimator for the slope parameter of the regression model. However, the intercept parameter is not identifiable. Consequently, the marginal case proportion cannot be estimated from case-control data. We find out that with the availability of the unlabeled data, the intercept parameter can be identified in semi-supervised learning setting. We construct the likelihood function of the observed labeled and unlabeled data and obtain the maximum likelihood estimator via an iterative algorithm. The proposed estimator is shown to be consistent, asymptotically normal, and semiparametrically efficient. Extensive simulation studies are conducted to show the finite sample performance of the proposed method. The results imply that the unlabeled data not only helps to identify the intercept but also improves the estimation efficiency of the slope parameter. Meanwhile, the marginal case proportion can be estimated accurately by the proposed method

arXiv.org e-Print Archive

Sparse Distributed Learning Based on Diffusion Adaptation

Author: Di Lorenzo Paolo
Sayed Ali H.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 12/11/2012
Field of study

This article proposes diffusion LMS strategies for distributed estimation over adaptive networks that are able to exploit sparsity in the underlying system model. The approach relies on convex regularization, common in compressive sensing, to enhance the detection of sparsity via a diffusive process over the network. The resulting algorithms endow networks with learning abilities and allow them to learn the sparse structure from the incoming data in real-time, and also to track variations in the sparsity of the model. We provide convergence and mean-square performance analysis of the proposed method and show under what conditions it outperforms the unregularized diffusion version. We also show how to adaptively select the regularization parameter. Simulation results illustrate the advantage of the proposed filters for sparse data recovery.Comment: to appear in IEEE Trans. on Signal Processing, 201

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Crossref