Sparse Bilinear Logistic Regression

Author: Baraniuk Richard G.
Shi Jianing V.
Xu Yangyang
Publication date: 15/04/2014

In this paper, we introduce the concept of sparse bilinear logistic regression for decision problems involving explanatory variables that are two-dimensional matrices. Such problems are common in computer vision, brain-computer interfaces, style/content factorization, and parallel factor analysis. The underlying optimization problem is bi-convex; we study its solution and develop an efficient algorithm based on block coordinate descent. We provide a theoretical guarantee for global convergence and estimate the asymptotic convergence rate using the Kurdyka-Łojasiewicz inequality. A range of experiments with simulated and real data demonstrate that sparse bilinear logistic regression outperforms current techniques in several important applications. Comment: 27 pages, 5 figures
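To make the bi-convex structure concrete, the following is a minimal sketch rather than the authors' algorithm. It assumes the model score is u^T X v + b for a matrix input X, uses l1 penalties for sparsity, and replaces exact block minimization with one backtracked proximal-gradient step per block. The names `fit`, `prox_step`, `objective`, and the toy data are hypothetical.

```python
import math
import random

def soft_threshold(x, t):
    """Proximal operator of t*|x|; it sets small weights exactly to zero."""
    return math.copysign(max(abs(x) - t, 0.0), x)

def block_objective(feats, ys, w, b, lam):
    """Mean logistic loss (labels in {0, 1}) plus lam * ||w||_1 for one block."""
    total = 0.0
    for f, y in zip(feats, ys):
        z = sum(wk * fk for wk, fk in zip(w, f)) + b
        total += max(z, 0.0) - y * z + math.log1p(math.exp(-abs(z)))
    return total / len(ys) + lam * sum(abs(wk) for wk in w)

def prox_step(feats, ys, w, b, lam, step):
    """One proximal-gradient step, halving the step until the objective does not rise."""
    n = len(ys)
    resid = []
    for f, y in zip(feats, ys):
        z = sum(wk * fk for wk, fk in zip(w, f)) + b
        resid.append(0.5 * (1.0 + math.tanh(0.5 * z)) - y)  # sigmoid(z) - y
    grad = [sum(r * f[k] for r, f in zip(resid, feats)) / n for k in range(len(w))]
    grad_b = sum(resid) / n
    old = block_objective(feats, ys, w, b, lam)
    for _ in range(30):
        new_w = [soft_threshold(wk - step * gk, step * lam) for wk, gk in zip(w, grad)]
        new_b = b - step * grad_b
        if block_objective(feats, ys, new_w, new_b, lam) <= old:
            return new_w, new_b
        step *= 0.5
    return w, b  # no improving step found; keep the current block

def fit(Xs, ys, lam=0.01, step=0.5, sweeps=100):
    m, n = len(Xs[0]), len(Xs[0][0])
    u, v, b = [1.0] * m, [1.0] * n, 0.0  # nonzero start: u = v = 0 is a stationary point
    for _ in range(sweeps):
        # v fixed: u^T X v is linear in u, with features X v (a convex subproblem).
        feats = [[sum(X[i][j] * v[j] for j in range(n)) for i in range(m)] for X in Xs]
        u, b = prox_step(feats, ys, u, b, lam, step)
        # u fixed: u^T X v is linear in v, with features X^T u (also convex).
        feats = [[sum(u[i] * X[i][j] for i in range(m)) for j in range(n)] for X in Xs]
        v, b = prox_step(feats, ys, v, b, lam, step)
    return u, v, b

def objective(Xs, ys, u, v, b, lam):
    """Full objective: mean logistic loss of u^T X v + b plus l1 penalties on u and v."""
    feats = [[sum(X[i][j] * v[j] for j in range(len(v))) for i in range(len(u))] for X in Xs]
    return block_objective(feats, ys, u, b, lam) + lam * sum(abs(vj) for vj in v)

# Toy data (hypothetical): 3x3 inputs whose label depends only on entry (0, 0).
rng = random.Random(0)
Xs = [[[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)] for _ in range(60)]
ys = [1.0 if X[0][0] > 0 else 0.0 for X in Xs]
u, v, b = fit(Xs, ys)
start = objective(Xs, ys, [1.0] * 3, [1.0] * 3, 0.0, 0.01)
final = objective(Xs, ys, u, v, b, 0.01)
print(start, final)
```

Because each block update only accepts a step that does not increase its convex subproblem, the full objective is non-increasing across sweeps. This is the property that block coordinate schemes of this kind rely on.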

arXiv.org e-Print Archive

CiteSeerX

Model selection in logistic regression

Author: Kwemou Marius
Taupin Marie-Luce
Tocquet Anne-Sophie
Publication date: 29/08/2015

This paper is devoted to model selection in logistic regression. We extend the model selection principle introduced by Birgé and Massart (2001) to the logistic regression model. The selection is done using penalized maximum likelihood criteria. In this context, we propose a completely data-driven criterion based on the slope heuristics. We prove non-asymptotic oracle inequalities for the selected estimators. Theoretical results are illustrated through simulation studies.
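A penalized maximum likelihood criterion of this kind can be written as follows. The notation is generic and assumed here, not taken from the paper: $\mathcal{M}$ is the collection of candidate models, $\gamma_n$ the empirical negative log-likelihood, $\hat f_m$ the maximum likelihood estimator in model $m$, and $D_m$ its dimension. The last expression shows one common form of the slope heuristics, in which the penalty constant is set to twice an estimated minimal constant $\hat\kappa_{\min}$. The abstract does not state the exact penalty shape the authors use.

```latex
\hat m \in \arg\min_{m \in \mathcal{M}} \left\{ \gamma_n(\hat f_m) + \operatorname{pen}(m) \right\},
\qquad
\gamma_n(f) = -\frac{1}{n} \sum_{i=1}^{n} \left[ Y_i f(x_i) - \log\left(1 + e^{f(x_i)}\right) \right],
\qquad
\operatorname{pen}(m) = 2\,\hat\kappa_{\min}\,\frac{D_m}{n}.
```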

arXiv.org e-Print Archive

HAL Evry

HAL Descartes

Structured Learning via Logistic Regression

Author: Domke Justin
Publication date: 02/07/2014

A successful approach to structured learning is to write the learning objective as a joint function of linear parameters and inference messages, and iterate between updates to each. This paper observes that if the inference problem is "smoothed" through the addition of entropy terms, for fixed messages, the learning objective reduces to a traditional (non-structured) logistic regression problem with respect to parameters. In these logistic regression problems, each training example has a bias term determined by the current set of messages. Based on this insight, the structured energy function can be extended from linear factors to any function class where an "oracle" exists to minimize a logistic loss. Comment: Advances in Neural Information Processing Systems 201

arXiv.org e-Print Archive

CiteSeerX

The Australian National University

Expectation-maximization for logistic regression

Author: Scott James G.
Sun Liang
Publication date: 31/05/2013

We present a family of expectation-maximization (EM) algorithms for binary and negative-binomial logistic regression, drawing a sharp connection with the variational-Bayes algorithm of Jaakkola and Jordan (2000). Indeed, our results allow a version of this variational-Bayes approach to be re-interpreted as a true EM algorithm. We study several interesting features of the algorithm, and of this previously unrecognized connection with variational Bayes. We also generalize the approach to sparsity-promoting priors, and to an online method whose convergence properties are easily established. This latter method compares favorably with stochastic-gradient descent in situations with marked collinearity.
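As an illustration of the binary case, here is a minimal sketch of an EM iteration of this type. It assumes a flat prior and a single covariate plus intercept, so the M-step reduces to a 2x2 weighted least-squares solve. The E-step weight tanh(psi/2) / (2 psi) is the same quantity that appears in the Jaakkola-Jordan bound. The function names and toy data are hypothetical, and the sparsity-promoting and online variants are not covered.

```python
import math

def em_weight(psi):
    """E-step weight tanh(psi/2) / (2*psi), with its limit 1/4 at psi = 0."""
    if abs(psi) < 1e-8:
        return 0.25
    return math.tanh(0.5 * psi) / (2.0 * psi)

def log_lik(xs, ys, b0, b1):
    """Bernoulli log-likelihood under the logit link, labels in {0, 1}."""
    total = 0.0
    for x, y in zip(xs, ys):
        z = b0 + b1 * x
        total += y * z - (max(z, 0.0) + math.log1p(math.exp(-abs(z))))
    return total

def em_logistic(xs, ys, iters=200):
    b0, b1 = 0.0, 0.0
    k0 = sum(y - 0.5 for y in ys)                   # X^T kappa, intercept row
    k1 = sum((y - 0.5) * x for x, y in zip(xs, ys))  # X^T kappa, slope row
    for _ in range(iters):
        # E-step: expected latent weights at the current linear predictor.
        ws = [em_weight(b0 + b1 * x) for x in xs]
        # M-step: solve (X^T W X) beta = X^T kappa, with kappa_i = y_i - 1/2.
        s0 = sum(ws)
        s1 = sum(w * x for w, x in zip(ws, xs))
        s2 = sum(w * x * x for w, x in zip(ws, xs))
        det = s0 * s2 - s1 * s1
        b0, b1 = (s2 * k0 - s1 * k1) / det, (s0 * k1 - s1 * k0) / det
    return b0, b1

# Toy data (hypothetical), chosen to be non-separable so the MLE exists.
xs = [-2.0, -1.0, -1.0, 0.0, 0.0, 1.0, 1.0, 2.0]
ys = [0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0]
b0, b1 = em_logistic(xs, ys)
print(b0, b1, log_lik(xs, ys, b0, b1))
```

Each iteration is a reweighted least-squares solve with no step size to tune. At a fixed point the usual logistic score equations hold, so under these assumptions the iteration targets the ordinary maximum likelihood estimate.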

arXiv.org e-Print Archive

CiteSeerX

Resampling Logistic Regression for Handling Class Imbalance in Software Defect Prediction

Author: Rianto H. (Harsih)
Wahono R. S. (Romi)
Publication venue: IlmuKomputer.com
Publication date: 01/01/2015

High-quality software is software that supports a company's business processes effectively and efficiently, and in which no defects are found during testing, inspection, and implementation. Fixing software after delivery and implementation costs far more than fixing it during development. Software testing consumes more than 50% of development costs. A software defect testing model is needed to reduce these costs. At present, no software defect prediction model is generally applicable in practice. Logistic Regression is the most effective and efficient model for software defect prediction. Its weakness is that it is prone to underfitting on datasets with imbalanced classes, which results in low accuracy. The NASA MDP dataset is a commonly used dataset in software defect prediction. One characteristic of software defect prediction datasets, including NASA MDP, is class imbalance. To handle class imbalance in software defect datasets, this study proposes a resampling method. Experiments were conducted to compare the performance of Logistic Regression before and after resampling was applied. Experiments were also conducted to compare the proposed method with other classifiers such as Naïve Bayes, Linear Discriminant Analysis, C4.5, Random Forest, Neural Network, and k-Nearest Neighbor. The experimental results show that the accuracy of Logistic Regression with resampling is higher than that of Logistic Regression without resampling, and also higher than that of the other classifiers. From these results, it can be concluded that resampling is effective in resolving class imbalance in software defect prediction with the Logistic Regression algorithm.
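The abstract does not say which resampling scheme is used. As one simple possibility, the sketch below shows random oversampling, which duplicates minority-class rows at random until every class matches the largest one. The function name `random_oversample` and the toy rows are hypothetical.

```python
import random
from collections import Counter

def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes are equal in size."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        # Draw with replacement only the number of extra rows this class needs.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels

# Toy data (hypothetical): 8 non-defective modules and 2 defective ones.
rows = [[i, i % 3] for i in range(10)]
labels = ["clean"] * 8 + ["defect"] * 2
bal_rows, bal_labels = random_oversample(rows, labels)
print(Counter(bal_labels))
```

Resampling of this kind is normally applied to the training split only, so that the evaluation data keeps the original class distribution.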

Neliti