3,814 research outputs found
A Combinatorial Approach to Robust PCA
We study the problem of recovering Gaussian data under adversarial
corruptions when the noises are low-rank and the corruptions are at the
coordinate level. Concretely, we assume that the Gaussian noises lie in an
unknown low-dimensional subspace, and that a random subset of the coordinates
of each data point falls under the control of an adversary. This setting
models the scenario of learning from high-dimensional yet structured data
that are transmitted through a highly noisy channel, so that the data points
are unlikely to be entirely clean.
Our main result is an efficient algorithm that, under a mild condition
relating the subspace dimension and the corruption level to the ambient
dimension, recovers every single data point up to a nearly-optimal error in
expectation. At the core of our proof is a new analysis of the well-known
Basis Pursuit (BP) method for recovering a sparse signal, which is known to
succeed under additional assumptions (e.g., incoherence or the restricted
isometry property) on the underlying subspace. In contrast, we present a
novel approach via studying a natural combinatorial problem and show that,
over the randomness in the support of the sparse signal, a high-probability
error bound is possible even if the subspace is arbitrary.
Comment: To appear at ITCS 202
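Basis Pursuit itself is the classical convex program min ||x||_1 subject to
Ax = b. As a generic illustration of that subroutine only (not the paper's
algorithm, whose contribution is the combinatorial analysis), a minimal
sketch that solves BP as a linear program via SciPy; all names here are ours:

    import numpy as np
    from scipy.optimize import linprog

    def basis_pursuit(A, b):
        # Solve  min ||x||_1  s.t.  A x = b  as a linear program by
        # splitting x = u - v with u, v >= 0, so ||x||_1 = sum(u) + sum(v).
        m, n = A.shape
        c = np.ones(2 * n)                    # objective: sum(u) + sum(v)
        A_eq = np.hstack([A, -A])             # A (u - v) = b
        res = linprog(c, A_eq=A_eq, b_eq=b, bounds=[(0, None)] * (2 * n))
        if not res.success:
            raise RuntimeError(res.message)
        u, v = res.x[:n], res.x[n:]
        return u - v

Given a sparse x0 and b = A @ x0 with far fewer rows than columns,
basis_pursuit(A, b) recovers x0 whenever standard recovery conditions hold;
the paper's point is that randomness in the signal's support can substitute
for such conditions.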
Trimmed Maximum Likelihood Estimation for Robust Learning in Generalized Linear Models
We study the problem of learning generalized linear models under adversarial
corruptions. We analyze a classical heuristic called the iterative trimmed
maximum likelihood estimator which is known to be effective against label
corruptions in practice. Under label corruptions, we prove that this simple
estimator achieves minimax near-optimal risk on a wide range of generalized
linear models, including Gaussian regression, Poisson regression and Binomial
regression. Finally, we extend the estimator to the more challenging setting of
label and covariate corruptions and demonstrate its robustness and optimality
in that setting as well.
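As a concrete illustration of the heuristic being analyzed, a minimal sketch
of iterative trimmed maximum likelihood for the Gaussian-regression case,
where trimming by negative log-likelihood reduces to trimming by squared
residual; the trim fraction eps and the iteration count are illustrative
choices of ours, not the paper's:

    import numpy as np

    def trimmed_mle_gaussian(X, y, eps=0.1, n_iter=20):
        # Iterative trimmed MLE, Gaussian case: alternate between fitting
        # on the kept samples and re-selecting the (1 - eps) fraction of
        # samples with the smallest negative log-likelihood (here, the
        # smallest squared residual) under the current fit.
        n = len(y)
        keep = np.arange(n)                   # start from all samples
        for _ in range(n_iter):
            beta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
            resid = (y - X @ beta) ** 2
            keep = np.argsort(resid)[: int((1 - eps) * n)]
        return beta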
Transformers can optimally learn regression mixture models
Mixture models arise in many regression problems, but most methods have seen
limited adoption, partly due to these algorithms' highly tailored and
model-specific nature. On the other hand, transformers are flexible neural
sequence models that present the intriguing possibility of providing
general-purpose prediction methods, even in this mixture setting. In this work,
we investigate the hypothesis that transformers can learn an optimal predictor
for mixtures of regressions. We construct a generative process for a mixture of
linear regressions for which the decision-theoretic optimal procedure is given
by data-driven exponential weights on a finite set of parameters. We observe
that transformers achieve low mean-squared error on data generated via this
process. By probing the transformer's output at inference time, we also show
that transformers typically make predictions that are close to the optimal
predictor. Our experiments also demonstrate that transformers can learn
mixtures of regressions in a sample-efficient fashion and are somewhat robust
to distribution shifts. We complement our experimental observations by proving
constructively that the decision-theoretic optimal procedure is indeed
implementable by a transformer.
Comment: 24 pages, 9 figures
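The optimal procedure referenced above, data-driven exponential weights over
a finite set of regression parameters, can be sketched in a few lines.
Assuming Gaussian noise with known variance sigma and a known finite
candidate set betas (both simplifications of ours, for illustration):

    import numpy as np

    def exp_weights_predict(betas, X, y, x_new, sigma=1.0):
        # Exponential-weights predictor for a mixture of linear
        # regressions: weight each candidate parameter by its Gaussian
        # likelihood on the observed (X, y) prefix, then return the
        # weighted average of the candidates' predictions at x_new.
        resid = y[None, :] - betas @ X.T          # (k, n) residual matrix
        log_w = -0.5 * np.sum(resid ** 2, axis=1) / sigma ** 2
        w = np.exp(log_w - log_w.max())           # numerically stable softmax
        w /= w.sum()
        return w @ (betas @ x_new)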
Linear Regression using Heterogeneous Data Batches
In many learning applications, data are collected from multiple sources, each
providing a \emph{batch} of samples that by itself is insufficient to learn its
input-output relationship. A common approach assumes that the sources fall in
one of several unknown subgroups, each with an unknown input distribution and
input-output relationship. We consider one of this setup's most fundamental and
important manifestations where the output is a noisy linear combination of the
inputs, and there are $k$ subgroups, each with its own regression vector. Prior
work~\cite{kong2020meta} showed that with abundant small batches, the
regression vectors can be learned from only a few medium-size batches with a
modest number of samples each. However, the
paper requires that the input distribution for all subgroups be isotropic
Gaussian, and states that removing this assumption is an ``interesting and
challenging problem''. We propose a novel gradient-based algorithm that improves
on the existing results in several ways. It extends the applicability of the
algorithm by: (1) allowing the subgroups' underlying input distributions to be
different, unknown, and heavy-tailed; (2) recovering all subgroups that are
followed by a significant proportion of batches, even for infinite $k$; (3)
removing the separation requirement between the regression vectors; (4)
reducing the number of batches and allowing smaller batch sizes.
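For concreteness, a minimal sketch of the data-generating setup being studied
(the heterogeneous-batch model, not the paper's recovery algorithm); all
names and defaults here are illustrative:

    import numpy as np

    def make_batches(n_batches, batch_size, d, k, noise=0.1, seed=0):
        # Each batch comes from one of k latent subgroups, and each
        # subgroup has its own regression vector. The inputs are drawn
        # isotropic Gaussian here purely for simplicity; the paper allows
        # different, unknown, heavy-tailed input distributions.
        rng = np.random.default_rng(seed)
        betas = rng.normal(size=(k, d))       # one regression vector per subgroup
        batches = []
        for _ in range(n_batches):
            j = rng.integers(k)               # latent subgroup of this source
            X = rng.normal(size=(batch_size, d))
            y = X @ betas[j] + noise * rng.normal(size=batch_size)
            batches.append((X, y))
        return batches, betas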
Long-term Forecasting with TiDE: Time-series Dense Encoder
Recent work has shown that simple linear models can outperform several
Transformer-based approaches in long-term time-series forecasting. Motivated by
this, we propose a Multi-layer Perceptron (MLP) based encoder-decoder model,
Time-series Dense Encoder (TiDE), for long-term time-series forecasting that
enjoys the simplicity and speed of linear models while also being able to
handle covariates and non-linear dependencies. Theoretically, we prove that the
simplest linear analogue of our model can achieve near optimal error rate for
linear dynamical systems (LDS) under some assumptions. Empirically, we show
that our method can match or outperform prior approaches on popular long-term
time-series forecasting benchmarks while being 5-10x faster than the best
Transformer-based model.
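As a purely structural illustration (TiDE's actual architecture adds residual
blocks and covariate projections that we omit), a minimal dense
encoder-decoder forward pass with untrained random weights:

    import numpy as np

    def relu(x):
        return np.maximum(0.0, x)

    def dense_encoder_decoder(past, hidden=64, horizon=24, seed=0):
        # Flatten the look-back window, encode it with one dense layer,
        # and decode a fixed-length forecast with another dense layer.
        rng = np.random.default_rng(seed)
        x = past.ravel()
        W_enc = rng.normal(size=(hidden, x.size)) / np.sqrt(x.size)
        W_dec = rng.normal(size=(horizon, hidden)) / np.sqrt(hidden)
        z = relu(W_enc @ x)                   # dense encoding of the history
        return W_dec @ z                      # dense decoding of the forecast

    forecast = dense_encoder_decoder(np.sin(np.linspace(0, 12, 96)))
    print(forecast.shape)                     # (24,)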
Asymptotic results for fitting semiparametric transformation models to failure time data from case-cohort studies
Semiparametric transformation models are considered for failure time data from case-cohort studies, where the covariates are assembled only for a randomly selected subcohort from the entire cohort and additional cases outside the subcohort. We present estimating procedures for the regression parameters and the survival probability. The asymptotic properties of the resulting estimators are developed based on asymptotic results for U-statistics, martingales, stochastic processes and finite population sampling.
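For context, this model class links an unknown monotone transform of the
failure time T to a linear predictor in the covariates Z; in standard
notation (ours, not necessarily the paper's):

    % Semiparametric transformation model: H is an unspecified increasing
    % function, \beta the regression parameters, and \epsilon an error with
    % a known distribution (extreme-value errors give proportional hazards,
    % logistic errors give proportional odds).
    H(T) = -\beta^{\top} Z + \epsilon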
Affine equivariant rank-weighted L-estimation of multivariate location
In the multivariate one-sample location model, we propose a class of flexible,
robust, affine-equivariant L-estimators of location for distributions, invoking
the affine invariance of the Mahalanobis distances of individual observations.
An involved iteration process for their computation is numerically illustrated.
Comment: 16 pages, 4 figures, 6 tables
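One plausible form of such an iteration (a hedged sketch of ours, not
necessarily the authors' scheme): weight each observation by the rank of its
Mahalanobis distance, with closer points weighted more, and iterate weighted
location and scatter updates:

    import numpy as np

    def rank_weighted_location(X, n_iter=10):
        # Rank-weighted L-estimation sketch: the weights depend on the
        # data only through Mahalanobis-distance ranks, which is what
        # makes the resulting location estimate affine-equivariant.
        n, d = X.shape
        mu = np.median(X, axis=0)             # robust starting point
        cov = np.cov(X.T)
        for _ in range(n_iter):
            diff = X - mu
            dist = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
            ranks = np.argsort(np.argsort(dist))      # 0 = closest point
            w = (n - ranks).astype(float)             # decreasing in distance
            w /= w.sum()
            mu = w @ X                                # weighted mean update
            diff = X - mu
            cov = (w[:, None] * diff).T @ diff        # weighted scatter update
        return mu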
Phenomenology of a three-family model with gauge symmetry SU(3)_c X SU(4)_L X U(1)_X
We study an extension of the gauge group SU(3)_c X SU(2)_L X U(1)_Y of the
standard model to the symmetry group SU(3)_c X SU(4)_L X U(1)_X (3-4-1 for
short). This extension provides an interesting attempt to answer the question
of family replication in the sense that models for the electroweak interaction
can be constructed so that anomaly cancellation is achieved by an interplay
between generations, all of them under the condition that the number of
families must be divisible by the number of colours of SU(3)_c. This method of
anomaly cancellation requires a family of quarks transforming differently from
the other two, thus leading to tree-level flavour changing neutral currents
(FCNC) transmitted by the two extra neutral gauge bosons predicted by the
model. In a version of the 3-4-1 extension, which does not
contain particles with exotic electric charges, we study the fermion mass
spectrum and some aspects of the phenomenology of the neutral gauge boson
sector. In particular, we impose limits on the mixing angle and on the mass
scale of the corresponding physical new neutral gauge boson, and establish a
lower bound on the mass of the additional new neutral gauge boson. For the
analysis we use updated precision electroweak data at the Z-pole from the CERN
LEP and SLAC Linear Collider, and atomic parity violation data. The mass scale
of the additional new neutral gauge boson is constrained by using updated
experimental inputs from neutral meson mixing in the analysis of the sources
of FCNC in the model. The data constrain the mixing angle to a very small
value of O(0.001), and the lower bounds on the two new gauge boson masses are
found to be of O(1 TeV) and O(7 TeV), respectively.
Comment: 22 pages, 6 tables, 1 figure. To appear in J. Phys. G: Nuclear and
Particle Physics