180 research outputs found

Model selection and error estimation

Author: Gábor Lugosi
Peter L. Bartlett
Stéphane Boucheron

We study model selection strategies based on penalized empirical loss minimization. We point out a tight relationship between error estimation and data-based complexity penalization: any good error estimate may be converted into a data-based penalty function and the performance of the estimate is governed by the quality of the error estimate. We consider several penalty functions, involving error estimates on independent test data, empirical VC dimension, empirical VC entropy, and margin-based quantities. We also consider the maximal difference between the error on the first half of the training data and the second half, and the expected maximal discrepancy, a closely related capacity estimate that can be calculated by Monte Carlo integration. Maximal discrepancy penalty functions are appealing for pattern classification problems, since their computation is equivalent to empirical risk minimization over the training data with some labels flipped.
Keywords: complexity regularization, model selection, error estimation, concentration of measure
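The closing claim of this abstract can be checked numerically. The sketch below is only an illustration under stated assumptions: the class of 1-D threshold classifiers, the synthetic data, the orientation of the difference (first half minus second half), and all function names are hypothetical and not taken from the paper. It computes the maximal discrepancy directly and again via empirical risk minimization on the sample with the first-half labels flipped, using the identity D(f) = 1 - 2 * err_flipped(f), which holds when both halves have equal size.

```python
import random

def stump(threshold, sign):
    """1-D threshold classifier with labels in {-1, +1} (hypothetical model class)."""
    return lambda x: sign if x > threshold else -sign

def error(f, xs, ys):
    """Fraction of points misclassified by f."""
    return sum(f(x) != y for x, y in zip(xs, ys)) / len(xs)

def max_discrepancy_direct(classifiers, xs, ys):
    """Largest gap between first-half error and second-half error over the class."""
    half = len(xs) // 2
    return max(
        error(f, xs[:half], ys[:half]) - error(f, xs[half:], ys[half:])
        for f in classifiers
    )

def max_discrepancy_via_erm(classifiers, xs, ys):
    """Same quantity, obtained by minimizing the error on data whose
    first-half labels are flipped (requires halves of equal size)."""
    half = len(xs) // 2
    flipped = [-y for y in ys[:half]] + list(ys[half:])
    min_error = min(error(f, xs, flipped) for f in classifiers)
    return 1 - 2 * min_error

# Synthetic data (assumption): noisy labels around the threshold 0.
rng = random.Random(0)
n = 40  # even, so both halves have n/2 points
xs = [rng.uniform(-1, 1) for _ in range(n)]
ys = [1 if x + rng.gauss(0, 0.3) > 0 else -1 for x in xs]
classifiers = [stump(t, s) for t in sorted(xs) for s in (-1, 1)]

direct = max_discrepancy_direct(classifiers, xs, ys)
via_erm = max_discrepancy_via_erm(classifiers, xs, ys)
print(direct, via_erm)
```

The two values agree up to floating-point rounding, which is why any routine that minimizes empirical error over the class can be reused to compute the penalty.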

Research Papers in Economics

Structured Random Matrices

Author: F. Lust-Piquard
G. Pisier
G.W. Anderson
L. Mackey
M. Rudelson
M. Talagrand
N. Srivastava
N. Tomczak-Jaegermann
S. Boucheron
T. Tao
Y. Gordon
Y. Seginer
Z.D. Bai
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 17/10/2016

Random matrix theory is a well-developed area of probability theory that has numerous connections with other areas of mathematics and its applications. Much of the literature in this area is concerned with matrices that possess many exact or approximate symmetries, such as matrices with i.i.d. entries, for which precise analytic results and limit theorems are available. Much less well understood are matrices that are endowed with an arbitrary structure, such as sparse Wigner matrices or matrices whose entries possess a given variance pattern. The challenge in investigating such structured random matrices is to understand how the given structure of the matrix is reflected in its spectral properties. This chapter reviews a number of recent results, methods, and open problems in this direction, with a particular emphasis on sharp spectral norm inequalities for Gaussian random matrices. Comment: 46 pages; to appear in IMA Volume "Discrete Structures: Analysis and Applications" (Springer)

arXiv.org e-Print Archive

Crossref

PAC-Bayesian Bounds for Randomized Empirical Risk Minimizers

Author: A. Tsybakov
C. Cortes
D. A. McAllester
E. Mammen
J. H. Friedman
J. Rissanen
J.-Y. Audibert
L. Devroye
P. Alquier
R. Schapire
S. Boucheron
T. Zhang
W. Hoeffding
Publication venue: 'Allerton Press'
Publication date: 01/01/2008

The aim of this paper is to generalize the PAC-Bayesian theorems proved by Catoni in the classification setting to more general problems of statistical inference. We show how to control the deviations of the risk of randomized estimators. Particular attention is paid to randomized estimators drawn in a small neighborhood of classical estimators, whose study leads to control of the risk of the latter. These results allow us to bound the risk of very general estimation procedures, as well as to perform model selection.

arXiv.org e-Print Archive

Crossref

Hal-Diderot

HAL-Polytechnique

Mirror Descent and Convex Optimization Problems With Non-Smooth Inequality Constraints

Author: A Beck
A Ben-Tal
A Juditsky
A Nedic
A Nemirovski
A Nemirovskii
A Nemirovsky
B Polyak
L Xiao
NZ Shor
S Boucheron
Y Nesterov
Yurii Nesterov
Publication date: 29/01/2018

We consider the problem of minimizing a convex function on a simple set subject to a convex non-smooth inequality constraint, and describe first-order methods for solving such problems in different situations: smooth or non-smooth objective function; convex or strongly convex objective and constraint; deterministic or randomized information about the objective and constraint. We hope that it is convenient for the reader to have all the methods for the different settings in one place. The described methods are based on the Mirror Descent algorithm and the switching subgradient scheme. One of our aims is to propose, for each of the listed settings, a Mirror Descent method with adaptive stepsizes and an adaptive stopping rule. This means that neither the stepsize nor the stopping rule requires knowledge of the Lipschitz constant of the objective or constraint. We also construct Mirror Descent for problems whose objective function is not Lipschitz continuous, e.g. a quadratic function. In addition, we address the problem of recovering the solution of the dual problem.
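To make the switching subgradient idea concrete, here is a minimal sketch, not the authors' algorithm. It assumes a Euclidean prox-function, so the mirror step reduces to a projected subgradient step, and it uses the stepsize eps / ||d||^2, one common way to avoid needing a Lipschitz constant. The toy problem (minimize |x1| + |x2| subject to 1 - x1 <= 0 on a ball of radius 5), the fixed iteration budget used in place of an adaptive stopping rule, and all function names are illustrative assumptions.

```python
import math

def f(x):
    """Non-smooth convex objective (assumed toy example)."""
    return abs(x[0]) + abs(x[1])

def subgrad_f(x):
    sign = lambda t: (t > 0) - (t < 0)
    return [sign(x[0]), sign(x[1])]

def g(x):
    """Constraint g(x) <= 0, i.e. x[0] >= 1 (assumed toy example)."""
    return 1.0 - x[0]

def subgrad_g(x):
    return [-1.0, 0.0]

def project_ball(x, radius):
    """Euclidean projection onto the ball of the given radius (the 'simple set')."""
    norm = math.sqrt(sum(t * t for t in x))
    return list(x) if norm <= radius else [t * radius / norm for t in x]

def switching_subgradient(x0, eps, iterations, radius=5.0):
    x = list(x0)
    weighted_sum = [0.0] * len(x)
    total_weight = 0.0
    for _ in range(iterations):
        # Productive step if the constraint is nearly satisfied, else repair it.
        productive = g(x) <= eps
        d = subgrad_f(x) if productive else subgrad_g(x)
        norm_sq = sum(t * t for t in d)
        if norm_sq == 0.0:
            if productive:
                return x  # zero subgradient of f at a nearly feasible point
            raise ValueError("constraint appears infeasible")
        h = eps / norm_sq  # adaptive stepsize: no Lipschitz constant required
        if productive:
            weighted_sum = [s + h * t for s, t in zip(weighted_sum, x)]
            total_weight += h
        x = project_ball([t - h * dt for t, dt in zip(x, d)], radius)
    if total_weight == 0.0:
        raise ValueError("no productive steps were taken")
    # Output: stepsize-weighted average of the productive iterates.
    return [s / total_weight for s in weighted_sum]

x_bar = switching_subgradient([2.0, 1.0], eps=0.05, iterations=5000)
print(x_bar, f(x_bar), g(x_bar))
```

For this toy problem the constrained minimizer is (1, 0) with objective value 1, so the averaged point should land close to it and violate the constraint by at most roughly eps.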

arXiv.org e-Print Archive

Crossref

Utility of multispectral imaging for nuclear classification of routine clinical histopathology imagery

Author: A Neher
B Weyn
BS Manjunath
C Angeletti
CM Bishop
David L Rimm
DJ Zahniser
F Schnorrenberg
G van de Wouwer
H Stark
L Latson
Laura E Boucheron
LE Boucheron
MA Brewer
MA Roula
MS Bartlett
Neal R Harvey
NH Anderson
NR Harvey
R Jaganath
R Levenson
RM Levenson
S Theodoridis
SM Gentry
T Mairinger
Zhiqiang Bi
Publication venue: BioMed Central
Publication date: 01/07/2007

Background: We present an analysis of the utility of multispectral versus standard RGB imagery for routine H&E stained histopathology images, in particular for pixel-level classification of nuclei. Our multispectral imagery has 29 spectral bands, spaced 10 nm apart within the visual range of 420–700 nm. It has been hypothesized that the additional spectral bands contain further information useful for classification as compared to the 3 standard bands of RGB imagery. We present analyses of our data designed to test this hypothesis.
Results: For classification using all available image bands, we find the best performance (equal tradeoff between detection rate and false alarm rate) is obtained from either the multispectral or our "ccd" RGB imagery, with an overall increase in performance of 0.79% compared to the next best performing image type. For classification using single image bands, the single best multispectral band (in the red portion of the spectrum) gave a performance increase of 0.57%, compared to performance of the single best RGB band (red). Additionally, red bands had the highest coefficients/preference in our classifiers. Principal components analysis of the multispectral imagery indicates only two significant image bands, which is not surprising given the presence of two stains.
Conclusion: Our results indicate that, for a pixel-level nuclear classification task, multispectral imagery of routine H&E stained histopathology provides minimal additional spectral information compared with standard RGB imagery.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Maximum-Reward Motion in a Stochastic Environment: The Nonequilibrium Statistical Mechanics Perspective

Author: A Schrijver
C Urmson
CV Heer
DJ Bertsimas
DP Dubhashi
GF Mazonko
GR Fleming
JB Martin
K Johansson
L Ingber
MJ Steele
S Boucheron
S Scherer
T Antal
T Nagatani
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/04/2015

We consider the problem of computing the maximum-reward motion in a reward field in an online setting. We assume that the robot has a limited perception range, and it discovers the reward field on the fly. We analyze the performance of a simple, practical lattice-based algorithm with respect to the perception range. Our main result is that, with very little perception range, the robot can collect as much reward as if it could see the whole reward field, under certain assumptions. Along the way, we establish novel connections between this class of problems and certain fundamental problems of nonequilibrium statistical mechanics. We demonstrate our results in simulation examples.

CiteSeerX

DSpace@MIT

Crossref

Faster Hoeffding Racing: Bernstein Races via Jackknife Estimates

Author: A. Antos
B. Efron
C. McDiarmid
E. Even-Dar
J.-Y. Audibert
J.M. Steele
L. Paninski
M. Arcones
R. Jin
S. Boucheron
S.N. Bernstein
T. Peel
W. Hoeffding
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

Crossref

Performance of LoRa-WAN Sensors for Precision Livestock Tracking and Biosensing Applications

Author: Boucheron L.
Brandani C. B.
Cao H.
Chen H.
Cibils Andrés F.
Cox A.
Duff G.
Estell R. E.
Funk M.
Gong Q.
Gouvea V.
McIntosh M. M.
Nyamuryekung’e S.
Spiegal S.
Utsumi S. A.
Publication venue: UKnowledge
Publication date: 30/01/2022

This study investigated the integration of Long Range Wide Area Network (LoRa WAN) communication technology and sensors for use as an Internet of Things (IoT) platform for Precision Livestock Farming (PLF) applications. The research was conducted at New Mexico State University’s Clayton Livestock Research Centre. The functionality of LoRa WAN communication technology and the performance of LoRa WAN motion and GPS sensors were tested using static sensors that were either a) placed outdoors at incremental distances from the LoRa WAN gateway antenna (Field, n=6), or b) housed indoors close to the same LoRa WAN gateway antenna (Indoor, n=5). Accelerometer data, reported as a motion intensity index, and GPS location were acquired, transmitted and logged at 1 and 15 minute intervals, respectively. We evaluated the tracker's GPS accuracy (GPSBias, the Euclidean distance between the actual and projected tracker location) and variables associated with the tracker’s data transmission capabilities. The results indicate that field trackers had greater accuracy for remote sensing of GPS locations than indoor trackers, which faced increasing communication interference when acquiring satellite signals (GPSBias; 5.20 vs. 17.76 m; P < 0.01). Overall, the trackers and deployments appeared to have GPS accuracy comparable to that of other tracking devices and systems available on the market. The total number of data packets that were successfully transmitted was similar between the indoor and field trackers, but the number of data packets that were processed varied between the two deployments (P=0.02). Due to the static deployment of indoor and field trackers, activity data was almost non-existent for most devices. However, the same trackers embedded in collars mounted on mature cattle showed clear diurnal patterns consistent with the time budgets of grazing cattle.
The pilot testing of GPS and accelerometer sensors using LoRa WAN technology revealed reasonable sensor sensitivity and reliability for integration in PLF platforms.

University of Kentucky

Some inequalities on generalized entropies

Author: A El-Barakaty
A Rényi
C Tsallis
DI Cartwright
EW Weisstein
F Kittaneh
FC Mitroi
Flavia-Corina Mitroi
I Csiszár
J Aczél
JM Aldaz
L-H Sun
M Masi
M Sebawe Abdalla
N Minculete
Nicuşor Minculete
S Boucheron
S Dragomir
S Furuichi
Shigeru Furuichi
T Han
TM Cover
Z Daróczy
Z Ficek
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 15/12/2011

We give several inequalities on generalized entropies involving Tsallis entropies, using some inequalities obtained by improvements of Young's inequality. We also give a generalized Han's inequality. Comment: 15 pages
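For readers unfamiliar with the quantity named in this abstract, the sketch below evaluates the standard Tsallis entropy S_q(p) = (1 - sum_i p_i^q) / (q - 1) of a discrete distribution, which tends to the Shannon entropy (in nats) as q approaches 1. It illustrates the definition only; the paper's inequalities are not reproduced, and the function names are hypothetical.

```python
import math

def shannon_entropy(p):
    """Shannon entropy in nats of a discrete distribution p."""
    return -sum(x * math.log(x) for x in p if x > 0)

def tsallis_entropy(p, q):
    """Tsallis entropy S_q(p) = (1 - sum_i p_i**q) / (q - 1); Shannon at q = 1."""
    if abs(q - 1.0) < 1e-12:
        return shannon_entropy(p)
    return (1.0 - sum(x ** q for x in p if x > 0)) / (q - 1.0)

uniform = [0.25] * 4
print(tsallis_entropy(uniform, 2.0))       # 1 - 4 * (1/16) = 0.75
print(tsallis_entropy(uniform, 1.000001))  # close to log(4)
print(shannon_entropy(uniform))            # log(4)
```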

arXiv.org e-Print Archive

Crossref

Springer - Publisher Connector

Estimation in high dimensions: a geometric perspective

This tutorial provides an exposition of a flexible geometric framework for high dimensional estimation problems with constraints. The tutorial develops geometric intuition about high dimensional sets, justifies it with some results of asymptotic convex geometry, and demonstrates connections between geometric results and estimation problems. The theory is illustrated with applications to sparse recovery, matrix completion, quantization, linear and logistic regression and generalized linear models. Comment: 56 pages, 9 figures. Multiple minor changes

arXiv.org e-Print Archive

CiteSeerX

Crossref

eScholarship - University of California