Search CORE

17 research outputs found

Online Active Linear Regression via Thresholding

Author: Johari Ramesh
Riquelme Carlos
Zhang Baosen
Publication venue
Publication date: 21/12/2016
Field of study

We consider the problem of online active learning to collect data for regression modeling. Specifically, we consider a decision maker with a limited experimentation budget who must efficiently learn an underlying linear population model. Our main contribution is a novel threshold-based algorithm for selection of most informative observations; we characterize its performance and fundamental lower bounds. We extend the algorithm and its guarantees to sparse linear regression in high-dimensional settings. Simulations suggest the algorithm is remarkably robust: it provides significant benefits over passive random sampling in real-world datasets that exhibit high nonlinearity and high dimensionality --- significantly reducing both the mean and variance of the squared error.Comment: Published in AAAI 201

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

$\ell_1$ -regression with Heavy-tailed Distributions

Author: Dorney Kate E.
George Michael
Lee Michael
Publication venue
Publication date: 25/10/2018
Field of study

In this paper, we consider the problem of linear regression with heavy-tailed distributions. Different from previous studies that use the squared loss to measure the performance, we choose the absolute loss, which is capable of estimating the conditional median. To address the challenge that both the input and output could be heavy-tailed, we propose a truncated minimization problem, and demonstrate that it enjoys an

\widetilde{O}(\sqrt{d/n})

excess risk, where

d

is the dimensionality and

n

is the number of samples. Compared with traditional work on

\ell_1

-regression, the main advantage of our result is that we achieve a high-probability risk bound without exponential moment conditions on the input and output. Furthermore, if the input is bounded, we show that the classical empirical risk minimization is competent for

\ell_1

-regression even when the output is heavy-tailed

arXiv.org e-Print Archive

Directory of Open Access Journals

eScholarship - University of California

L2P: An Algorithm for Estimating Heavy-tailed Outcomes

Author: Eliassi-Rad Tina
Varol Onur
Wang Xindi
Publication venue
Publication date: 07/07/2021
Field of study

Many real-world prediction tasks have outcome variables that have characteristic heavy-tail distributions. Examples include copies of books sold, auction prices of art pieces, demand for commodities in warehouses, etc. By learning heavy-tailed distributions, "big and rare" instances (e.g., the best-sellers) will have accurate predictions. Most existing approaches are not dedicated to learning heavy-tailed distribution; thus, they heavily under-predict such instances. To tackle this problem, we introduce Learning to Place (L2P), which exploits the pairwise relationships between instances for learning. In its training phase, L2P learns a pairwise preference classifier: is instance A > instance B? In its placing phase, L2P obtains a prediction by placing the new instance among the known instances. Based on its placement, the new instance is then assigned a value for its outcome variable. Experiments on real data show that L2P outperforms competing approaches in terms of accuracy and ability to reproduce heavy-tailed outcome distribution. In addition, L2P provides an interpretable model by placing each predicted instance in relation to its comparable neighbors. Interpretable models are highly desirable when lives and treasure are at stake.Comment: 9 pages, 6 figures, 2 tables Nature of changes from previous version: 1. Added complexity analysis in Section 2.2 2. Datasets change 3. Added LambdaMART in the baseline methods, also a brief discussion on why LambdaMart failed in our problem. 4. Figure update

arXiv.org e-Print Archive