High-Dimensional Feature Selection by Feature-Wise Kernelized Lasso
The goal of supervised feature selection is to find a subset of input
features that are responsible for predicting output values. The least absolute
shrinkage and selection operator (Lasso) allows computationally efficient
feature selection based on linear dependency between input features and output
values. In this paper, we consider a feature-wise kernelized Lasso for
capturing non-linear input-output dependency. We first show that, with
particular choices of kernel functions, non-redundant features with strong
statistical dependence on output values can be found in terms of kernel-based
independence measures. We then show that the globally optimal solution can be
efficiently computed; this makes the approach scalable to high-dimensional
problems. The effectiveness of the proposed method is demonstrated through
feature selection experiments with thousands of features. (18 pages.)
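The centered-Gram-matrix idea behind such kernel-based selection can be sketched as follows. This is a minimal illustration, not the paper's implementation: the Gaussian kernels, the regularization strength, and the toy data are all assumptions, and scikit-learn's non-negative Lasso over vectorized centered Gram matrices stands in for a dedicated solver.

```python
# Sketch of feature-wise kernelized Lasso-style selection: one Gram matrix
# per input feature, centered, then a non-negative Lasso against the
# centered output Gram matrix. Hyperparameters are illustrative.
import numpy as np
from sklearn.linear_model import Lasso

def gaussian_gram(v, sigma=1.0):
    d = (v[:, None] - v[None, :]) ** 2
    return np.exp(-d / (2 * sigma ** 2))

def center(K):
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

rng = np.random.default_rng(0)
n, p = 100, 10
X = rng.normal(size=(n, p))
# Only features 0 and 1 drive y, and non-linearly.
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + 0.1 * rng.normal(size=n)

Ky = center(gaussian_gram(y)).ravel()
Ks = np.column_stack([center(gaussian_gram(X[:, k])).ravel() for k in range(p)])

# Non-zero coefficients mark features with strong kernel-based (HSIC-like)
# dependence on the output; positivity keeps the measure interpretable.
model = Lasso(alpha=0.01, positive=True, fit_intercept=False).fit(Ks, Ky)
selected = np.flatnonzero(model.coef_)
print(selected)
```

Because the problem reduces to a convex Lasso over fixed Gram-matrix "features", the global optimum is computable efficiently, which is the scalability point the abstract makes.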
Reliability and validity in comparative studies of software prediction models
Empirical studies of software prediction models do not converge on the question "which prediction model is best?", and the reason for this lack of convergence is poorly understood. In this simulation study, we examined a frequently used research procedure comprising three main ingredients: a single data sample, an accuracy indicator, and cross validation. Typically, such empirical studies compare a machine learning model with a regression model, and our simulation does the same. The results suggest that it is the research procedure itself that is unreliable, and that this unreliability may strongly contribute to the lack of convergence. Our findings thus cast doubt on the conclusions of any study of competing software prediction models that used this research procedure as the basis of model comparison. We therefore need more reliable research procedures before we can have confidence in the conclusions of comparative studies of software prediction models.
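A toy simulation in this spirit (not the paper's actual design) can show why single-sample cross-validation comparisons are fragile: repeated samples from the same population may disagree about which of two models "wins". The data-generating process, sample size, and model choices below are all assumptions for illustration.

```python
# Draw many independent samples from one population; on each, compare a
# machine learning model (a decision tree) with a regression model via
# 5-fold cross-validated MAE and record which one "wins". Disagreement
# across samples illustrates the procedure's unreliability.
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
winners = []
for _ in range(20):
    X = rng.normal(size=(40, 3))
    y = X @ np.array([1.0, -0.5, 0.2]) + rng.normal(scale=2.0, size=40)
    tree_mae = -cross_val_score(
        DecisionTreeRegressor(max_depth=3, random_state=0), X, y,
        cv=5, scoring="neg_mean_absolute_error").mean()
    lin_mae = -cross_val_score(
        LinearRegression(), X, y,
        cv=5, scoring="neg_mean_absolute_error").mean()
    winners.append("tree" if tree_mae < lin_mae else "linear")
print(winners)
```

If a single study had drawn only one of these samples, its conclusion about the "best" model would hinge on which sample it happened to get, which is exactly the reliability problem the abstract describes.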
Penalized Composite Quasi-Likelihood for Ultrahigh-Dimensional Variable Selection
In high-dimensional model selection problems, penalized simple least-square
approaches have been extensively used. This paper addresses the question of
both robustness and efficiency of penalized model selection methods, and
proposes a data-driven weighted linear combination of convex loss functions,
together with a weighted L1-penalty. It is completely data-adaptive and does
not require prior knowledge of the error distribution. The weighted
L1-penalty is used both to ensure the convexity of the penalty term and to
ameliorate the bias caused by the L1-penalty. In the setting with
dimensionality much larger than the sample size, we establish a strong oracle
property of the proposed method that possesses both the model selection
consistency and estimation efficiency for the true non-zero coefficients. As
specific examples, we introduce a robust composite L1-L2 method and an
optimal composite quantile method, and evaluate their performance in both
simulated and real data examples.
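A rough sketch of the composite-loss idea is below. It uses equal weights over a few quantile levels and a plain L1 penalty, as a toy stand-in for the paper's data-driven weighted combination; the quantile levels, penalty strength, and optimizer are all assumptions, not the paper's estimator.

```python
# Composite quantile loss with an L1 penalty on the shared slopes:
# an average of check (pinball) losses at several quantile levels, with
# one intercept per level. Heavy-tailed errors motivate the robustness.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
n, p = 200, 5
X = rng.normal(size=(n, p))
beta_true = np.array([2.0, 0.0, -1.5, 0.0, 0.0])  # sparse truth
y = X @ beta_true + rng.standard_t(df=3, size=n)   # heavy-tailed noise

taus = np.array([0.25, 0.5, 0.75])  # equal-weight composite, for illustration
lam = 0.5

def check_loss(r, tau):
    # Pinball loss: tau * r if r >= 0 else (tau - 1) * r.
    return np.mean(np.maximum(tau * r, (tau - 1) * r))

def objective(params):
    b = params[:p]           # slopes shared across quantile levels
    a = params[p:]           # one intercept per quantile level
    comp = sum(check_loss(y - X @ b - a[j], t)
               for j, t in enumerate(taus)) / len(taus)
    return comp + lam * np.sum(np.abs(b)) / n

res = minimize(objective, np.zeros(p + len(taus)), method="Powell")
print(np.round(res.x[:p], 2))
```

The paper's contribution is to choose the combination weights from the data for efficiency while keeping the overall loss convex; the fixed equal weights here only show the shape of the objective.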
Estimating Blood Pressure from Photoplethysmogram Signal and Demographic Features using Machine Learning Techniques
Hypertension is a potentially dangerous health condition that is indicated
directly by blood pressure (BP) and often leads to other health
complications. Continuous monitoring of BP is therefore important; however,
cuff-based BP measurements are discrete and uncomfortable for the user. To
address this need, a cuff-less, continuous, and non-invasive BP measurement
system is proposed that combines Photoplethysmogram (PPG) signals and
demographic features with machine learning (ML) algorithms. PPG signals were
acquired from 219 subjects and underwent pre-processing and feature
extraction steps. Time,
frequency and time-frequency domain features were extracted from the PPG and
their derivative signals. Feature selection techniques were used to reduce the
computational complexity and to decrease the chance of over-fitting the ML
algorithms. The features were then used to train and evaluate ML algorithms.
The best regression models were selected for Systolic BP (SBP) and Diastolic BP
(DBP) estimation individually. Gaussian Process Regression (GPR) combined
with the ReliefF feature selection algorithm outperformed the other
algorithms, estimating SBP and DBP with root-mean-square errors (RMSE) of
6.74 and 3.59, respectively.
This ML model can be implemented in hardware systems to continuously monitor
BP and help avoid critical health conditions caused by sudden changes.
(Accepted for publication in Sensors; 14 figures, 14 tables.)
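The abstract's pipeline (extracted features, feature selection, then GPR) can be sketched schematically on synthetic data. ReliefF is assumed unavailable here, so scikit-learn's mutual-information `SelectKBest` stands in for it; the feature count, kernel, and all numbers are illustrative, not the paper's setup or results.

```python
# Schematic PPG-to-SBP pipeline on synthetic data:
# candidate features -> feature selection -> Gaussian Process Regression.
import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_regression
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(3)
n, p = 219, 30                     # 219 subjects, 30 candidate features
X = rng.normal(size=(n, p))        # stand-in for extracted PPG features
# Synthetic systolic BP driven by two of the features.
sbp = 120 + 8 * X[:, 0] - 5 * X[:, 1] + rng.normal(scale=3, size=n)

Xtr, Xte, ytr, yte = train_test_split(X, sbp, random_state=0)

# Feature selection reduces dimensionality and over-fitting risk
# (mutual information here is a stand-in for ReliefF).
sel = SelectKBest(
    lambda X, y: mutual_info_regression(X, y, random_state=0), k=10
).fit(Xtr, ytr)

gpr = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gpr.fit(sel.transform(Xtr), ytr)
pred = gpr.predict(sel.transform(Xte))
rmse = mean_squared_error(yte, pred) ** 0.5
print(round(rmse, 2))
```

A deployment would replace the synthetic matrix with the time-, frequency-, and time-frequency-domain features the abstract describes, and would fit separate models for SBP and DBP as the authors do.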