
    On Security and Sparsity of Linear Classifiers for Adversarial Settings

    Machine-learning techniques are widely used in security-related applications, such as spam and malware detection. However, in such settings they have been shown to be vulnerable to adversarial attacks, including the deliberate manipulation of data at test time to evade detection. In this work, we focus on the vulnerability of linear classifiers to evasion attacks. This is a relevant problem, as linear classifiers are increasingly used in embedded systems and mobile devices for their low processing time and memory requirements. We exploit recent findings in robust optimization to investigate the link between regularization and security of linear classifiers, depending on the type of attack. We also analyze the relationship between the sparsity of feature weights, which is desirable for reducing processing cost, and the security of linear classifiers. We further propose a novel octagonal regularizer that allows us to achieve a proper trade-off between the two. Finally, we empirically show how this regularizer can improve classifier security and sparsity in real-world applications, including spam and malware detection.
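    The abstract does not define the octagonal regularizer precisely; a minimal sketch, assuming it is a convex combination of the ℓ1 norm (which promotes sparsity) and the ℓ∞ norm (which spreads weights evenly, making evasion harder), trained by plain subgradient descent on a hinge loss. All names and the mixing parameter rho are illustrative, not the paper's:

```python
import numpy as np

def octagonal_penalty(w, rho=0.5):
    """Convex combination of the L1 (sparsity) and Linf (evenly
    spread weights) norms; its 2-D unit ball is an octagon."""
    return rho * np.abs(w).sum() + (1.0 - rho) * np.abs(w).max()

def subgrad_penalty(w, rho=0.5):
    """A subgradient of the octagonal penalty at w."""
    g = rho * np.sign(w)
    i = np.argmax(np.abs(w))  # Linf term is driven by the largest weight
    g[i] += (1.0 - rho) * np.sign(w[i])
    return g

def train_linear_classifier(X, y, lam=0.1, rho=0.5, lr=0.01, epochs=200):
    """Hinge-loss linear classifier with octagonal regularization,
    trained by subgradient descent. Labels y must be in {-1, +1}."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        mask = margins < 1  # points violating the margin
        gw = -(y[mask, None] * X[mask]).mean(axis=0) if mask.any() else 0.0
        gb = -y[mask].mean() if mask.any() else 0.0
        w -= lr * (gw + lam * subgrad_penalty(w, rho))
        b -= lr * gb
    return w, b
```

    Setting rho near 1 recovers a sparse ℓ1-regularized classifier, while rho near 0 approaches the ℓ∞-regularized classifier that robust optimization associates with robustness to sparse (ℓ1-bounded) feature manipulations; intermediate values trade the two off.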

    How Subprime Borrowers and Mortgage Brokers Shared the Pie

    We develop an equilibrium model for origination fees charged by mortgage brokers and show how the equilibrium fee distribution depends on borrowers' valuation for their loans and their information about fees. We use non-crossing quantile regressions and data from a large subprime lender to estimate conditional fee distributions. Given the fee distribution, we identify the distributions of borrower valuations and informedness. The level of informedness is higher for larger loans and in better-educated neighborhoods. We quantify the fraction of the surplus from the mortgage that goes to the broker, and show how it decreases as the borrower becomes more informed.
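    The abstract does not spell out the non-crossing estimator; one standard route to non-crossing conditional quantile curves is to fit ordinary quantile regressions level by level and then monotonically rearrange the fitted values. A sketch on synthetic, purely hypothetical fee data, with statsmodels' QuantReg standing in for the authors' estimator:

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical data: loan size as predictor, broker fee as response.
rng = np.random.default_rng(0)
loan = rng.uniform(50, 500, size=1000)               # loan size, $1000s
fee = 1.5 + 0.004 * loan + rng.gamma(2, 0.3, 1000)   # fee, % of loan

X = sm.add_constant(loan)
taus = [0.1, 0.25, 0.5, 0.75, 0.9]
fits = [sm.QuantReg(fee, X).fit(q=t) for t in taus]

# Predicted conditional quantiles on a grid of loan sizes.
grid = sm.add_constant(np.linspace(50, 500, 50))
q_hat = np.column_stack([f.predict(grid) for f in fits])

# Simple non-crossing repair: monotone rearrangement across quantile
# levels at each covariate value (sort each row over tau).
q_hat = np.sort(q_hat, axis=1)
```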

    Principal variable selection to explain grain yield variation in winter wheat from features extracted from UAV imagery

    Background: Automated phenotyping technologies are continually advancing the breeding process. However, collecting various secondary traits throughout the growing season and processing massive amounts of data still take great effort and time. Selecting a minimum number of secondary traits that have the maximum predictive power has the potential to reduce phenotyping effort. The objective of this study was to select the principal features extracted from UAV imagery, and the critical growth stages, that contributed the most to explaining winter wheat grain yield. Five dates of multispectral images and seven dates of RGB images were collected by a UAV system during the spring growing season in 2018. Two classes of features (variables), totaling 172 variables, were extracted for each plot from the vegetation index and plant height maps, including pixel statistics and dynamic growth rates. A parametric algorithm, LASSO regression (least absolute shrinkage and selection operator), and a non-parametric algorithm, random forest, were applied for variable selection. The regression coefficients estimated by LASSO and the permutation importance scores provided by random forest were used to determine the ten most important variables influencing grain yield from each algorithm.
    Results: Both selection algorithms assigned the highest importance scores to variables related to plant height around the grain filling stage. Some vegetation-index-related variables were also selected by the algorithms, mainly at early to mid growth stages and during senescence. Compared with yield prediction using all 172 variables derived from the measured phenotypes, using the selected variables performed comparably or even better. We also noticed that the prediction accuracy on the adapted NE lines (r = 0.58–0.81) was higher than that on the other lines (r = 0.21–0.59) included in this study with different genetic backgrounds.
    Conclusions: With the ultra-high-resolution plot imagery obtained by UAS-based phenotyping, we are now able to derive many more features, such as the within-plot variation of plant height or vegetation indices rather than just a plot average, that are potentially very useful for breeding purposes. However, a very large number of features or variables can be derived in this way. The promising results from this study suggest that a selected subset of those variables can achieve grain yield prediction accuracies comparable to the full set, while allowing a better allocation of effort and resources to phenotypic data collection and processing.
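    A sketch of the two selection routes the abstract describes, using scikit-learn with randomly generated stand-in features (the real 172 image-derived variables are not available here):

```python
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for the 172 image-derived features per plot.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 172))                  # plots x features
y = 2 * X[:, 0] + X[:, 5] - X[:, 40] + rng.normal(scale=0.5, size=300)

Xs = StandardScaler().fit_transform(X)

# Parametric route: LASSO with cross-validated penalty; rank by |coef|.
lasso = LassoCV(cv=5).fit(Xs, y)
lasso_top10 = np.argsort(np.abs(lasso.coef_))[::-1][:10]

# Non-parametric route: random forest + permutation importance.
rf = RandomForestRegressor(n_estimators=500, random_state=1).fit(X, y)
perm = permutation_importance(rf, X, y, n_repeats=10, random_state=1)
rf_top10 = np.argsort(perm.importances_mean)[::-1][:10]
```

    Variables appearing near the top of both rankings are the natural candidates for a reduced phenotyping protocol, mirroring the study's comparison of the two algorithms' top-ten lists.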

    Quantile regression for overdispersed count data: a hierarchical method

    Generalized Poisson regression is commonly applied to overdispersed count data, focusing on modelling the conditional mean of the response. However, conditional mean regression models may be sensitive to response outliers and provide no information on other features of the conditional distribution of the response. We consider instead a hierarchical approach to quantile regression of overdispersed count data. This approach has the benefits of effective outlier detection and robust estimation in the presence of outliers and, in health applications, of quantile estimates that can reflect risk factors. The technique is first illustrated with simulated overdispersed counts subject to contamination, such that estimates from conditional mean regression are adversely affected. A real application involves ambulatory care sensitive emergency admissions across 7518 English general practitioner (GP) practices. Predictors are GP practice deprivation, patient satisfaction with care and opening hours, and region. Impacts of deprivation are particularly important in policy terms, as they indicate the effectiveness of efforts to reduce inequalities in care sensitive admissions. Hierarchical quantile count regression is used to develop profiles of central and extreme quantiles according to specified predictor combinations.
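    The paper's hierarchical (Bayesian) formulation cannot be reconstructed from the abstract alone; a simpler, well-known alternative for count quantile regression is the jittering approach of Machado and Santos Silva, which smooths counts with uniform noise so that standard quantile regression applies. A sketch on hypothetical overdispersed admission counts:

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical overdispersed admission counts driven by deprivation.
rng = np.random.default_rng(2)
n = 5000
deprivation = rng.uniform(0, 1, n)
mu = np.exp(0.5 + 1.2 * deprivation)
y = rng.negative_binomial(n=2, p=2 / (2 + mu))   # overdispersed counts

# Jittering: add U(0,1) noise to make the response continuous, then
# run quantile regression on the log scale for central and tail levels.
z = y + rng.uniform(0, 1, n)
X = sm.add_constant(deprivation)
fits = {t: sm.QuantReg(np.log(z), X).fit(q=t) for t in (0.5, 0.9)}
# fits[0.9].params shows the deprivation effect at the upper tail,
# the quantile most relevant for identifying high-admission practices.
```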

    Projected Clustering with LASSO for High Dimensional Data Analysis


    Statistical Inference Based on Pooled Data: A Moment-Based Estimating Equation Approach

    We consider statistical inference on parameters of a distribution when only pooled data are observed. A moment-based estimating equation approach is proposed to deal with situations where likelihood functions based on pooled data are difficult to work with. We outline the method to obtain estimates and test statistics of the parameters of interest in the general setting. We demonstrate the approach on the family of distributions generated by the Box-Cox transformation model and, in the process, construct tests for goodness of fit based on the pooled data.
    Keywords: pooling biospecimens, set-based observations, moments, Box-Cox transformation, goodness-of-fit, lognormal distribution.
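    As an illustration of matching moments of pooled observations, here is a minimal method-of-moments sketch for the lognormal special case, assuming equal-size pools of k specimens with only pool means observed; the setup and all names are hypothetical, not the paper's general construction:

```python
import numpy as np
from scipy.optimize import fsolve

# Hypothetical setup: lognormal(mu, sigma) specimens measured only as
# means of pools of size k (e.g. assays run on pooled biospecimens).
rng = np.random.default_rng(3)
mu_true, sigma_true, k, n_pools = 1.0, 0.8, 4, 500
x = rng.lognormal(mu_true, sigma_true, size=(n_pools, k))
pool_means = x.mean(axis=1)                      # all we get to observe

def estimating_eqs(theta):
    """Moment conditions: a pool mean has expectation E[X] and
    variance Var(X)/k, both in closed form for the lognormal."""
    mu, sigma = theta
    m1 = np.exp(mu + sigma**2 / 2)                        # E[X]
    v = (np.exp(sigma**2) - 1) * np.exp(2 * mu + sigma**2)  # Var(X)
    return [pool_means.mean() - m1,
            pool_means.var(ddof=1) - v / k]

mu_hat, sigma_hat = fsolve(estimating_eqs, x0=[0.5, 0.5])
```

    The same recipe extends to other Box-Cox-generated distributions whenever the first two moments of a pool mean are available in closed form, which is the situation where the likelihood itself is hard to work with.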