
8 research outputs found

SOCP relaxation bounds for the optimal subset selection problem applied to robust linear regression

Author: Flores Salvador
Publication venue: Elsevier BV
Publication date: 01/01/2015

This paper deals with the problem of finding the globally optimal subset of h elements from a larger set of n elements in d space dimensions so as to minimize a quadratic criterion, with special emphasis on applications to computing the Least Trimmed Squares Estimator (LTSE) for robust regression. The computation of the LTSE is a challenging subset selection problem involving a nonlinear program with continuous and binary variables, linked in a highly nonlinear fashion. The selection of a globally optimal subset using the branch and bound (BB) algorithm is limited to problems in very low dimension, typically d<5, as the complexity of the problem increases exponentially with d. We introduce a bold pruning strategy in the BB algorithm that results in a significant reduction in computing time, at the price of a negligible loss of accuracy. The novelty of our algorithm is that the bounds at nodes of the BB tree come from pseudo-convexifications derived using a linearization technique with approximate bounds for the nonlinear terms. The approximate bounds are computed by solving an auxiliary semidefinite optimization problem. We show through a computational study that our algorithm performs well in a wide set of the most difficult instances of the LTSE problem. Comment: 12 pages, 3 figures, 2 tables
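To make the trimmed criterion concrete, here is a minimal sketch of the Least Trimmed Squares objective for a fixed candidate fit. It is not the paper's branch and bound algorithm: it only evaluates, for a given line, the sum of the h smallest squared residuals, which is the quantity the subset selection minimizes. The function name `lts_objective` and the toy data are assumptions made for illustration.

```python
def lts_objective(xs, ys, slope, intercept, h):
    """Sum of the h smallest squared residuals for the line y = slope*x + intercept."""
    squared = sorted((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    # Keeping only the h smallest terms is what "trims" the worst-fitting points.
    return sum(squared[:h])


# Hypothetical data: five points on y = 2x plus one gross outlier.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.0, 2.0, 4.0, 6.0, 8.0, 100.0]

trimmed = lts_objective(xs, ys, 2.0, 0.0, h=5)    # outlier excluded
untrimmed = lts_objective(xs, ys, 2.0, 0.0, h=6)  # ordinary sum of squares
print(trimmed, untrimmed)
```

For a fixed fit the best subset is found by sorting; the difficulty described in the abstract comes from optimizing over the fit and the subset jointly.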

arXiv.org e-Print Archive

Repositorio Académico de la Universidad de Chile

Least quantile regression via modern optimization

Author: Bertsimas Dimitris
Mazumder Rahul
Publication venue: Institute of Mathematical Statistics
Publication date: 01/03/2014

We address the Least Quantile of Squares (LQS) (and in particular the Least Median of Squares) regression problem using modern optimization methods. We propose a Mixed Integer Optimization (MIO) formulation of the LQS problem which allows us to find a provably global optimal solution for the LQS problem. Our MIO framework has the appealing characteristic that if we terminate the algorithm early, we obtain a solution with a guarantee on its sub-optimality. We also propose continuous optimization methods based on first-order subdifferential methods, sequential linear optimization and hybrid combinations of them to obtain near optimal solutions to the LQS problem. The MIO algorithm is found to benefit significantly from high quality solutions delivered by our continuous optimization based methods. We further show that the MIO approach leads to (a) an optimal solution for any dataset, where the data-points $(y_i,\mathbf{x}_i)$'s are not necessarily in general position, (b) a simple proof of the breakdown point of the LQS objective value that holds for any dataset and (c) an extension to situations where there are polyhedral constraints on the regression coefficient vector. We report computational results with both synthetic and real-world datasets showing that the MIO algorithm with warm starts from the continuous optimization methods solves small ($n=100$) and medium ($n=500$) size problems to provable optimality in under two hours, and outperforms all publicly available methods for large-scale ($n=10{,}000$) LQS problems. Comment: Published at http://dx.doi.org/10.1214/14-AOS1223 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
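For orientation, the LQS criterion referred to in this abstract can be written as follows. This is the standard textbook statement rather than the paper's MIO formulation; the symbols $r_i$, $\boldsymbol\beta$ and $q$ are notation assumed here for illustration.

```latex
% Residual of observation i for a coefficient vector beta:
r_i(\boldsymbol\beta) = y_i - \mathbf{x}_i^{\top}\boldsymbol\beta, \qquad i = 1,\dots,n.

% Order the absolute residuals:
|r_{(1)}(\boldsymbol\beta)| \le |r_{(2)}(\boldsymbol\beta)| \le \dots \le |r_{(n)}(\boldsymbol\beta)|.

% LQS minimizes the q-th smallest absolute residual:
\hat{\boldsymbol\beta}_{\mathrm{LQS}} \in \arg\min_{\boldsymbol\beta}\; |r_{(q)}(\boldsymbol\beta)|.
```

Taking $q$ close to $n/2$ gives the Least Median of Squares case; the exact convention for $q$ varies between references.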

arXiv.org e-Print Archive

DSpace@MIT

Crossref

All non-trivial variants of 3-LDT are equivalent

Author: Dudek Bartłomiej
Gawrychowski Paweł
Starikovskaya Tatiana
Publication date: 01/01/2020

The popular 3-SUM conjecture states that there is no strongly subquadratic time algorithm for checking if a given set of integers contains three distinct elements that sum up to zero. A closely related problem is to check if a given set of integers contains distinct $x_1, x_2, x_3$ such that $x_1+x_2=2x_3$. This can be reduced to 3-SUM in almost-linear time, but surprisingly a reverse reduction establishing 3-SUM hardness was not known. We provide such a reduction, thus resolving an open question of Erickson. In fact, we consider a more general problem called 3-LDT parameterized by integer parameters $\alpha_1, \alpha_2, \alpha_3$ and $t$. In this problem, we need to check if a given set of integers contains distinct elements $x_1, x_2, x_3$ such that $\alpha_1 x_1+\alpha_2 x_2 +\alpha_3 x_3 = t$. For some combinations of the parameters, every instance of this problem is a NO-instance or there exists a simple almost-linear time algorithm. We call such variants trivial. We prove that all non-trivial variants of 3-LDT are equivalent under subquadratic reductions. Our main technical contribution is an efficient deterministic procedure based on the famous Behrend's construction that partitions a given set of integers into few subsets that avoid a chosen linear equation.
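As a reference point for what "subquadratic" is measured against, the following is a minimal sketch of the straightforward quadratic-time check for a 3-LDT instance: enumerate ordered pairs and look the third element up in a hash set. It is not the reduction from the paper. The function name `has_3ldt_solution` is hypothetical, and the sketch assumes the third coefficient is nonzero.

```python
def has_3ldt_solution(values, a1, a2, a3, t):
    """Return True if distinct x1, x2, x3 in values satisfy a1*x1 + a2*x2 + a3*x3 == t."""
    if a3 == 0:
        raise ValueError("this sketch assumes a3 != 0")
    pool = set(values)
    for x1 in pool:
        for x2 in pool:
            if x1 == x2:
                continue
            rest = t - a1 * x1 - a2 * x2  # must equal a3 * x3
            if rest % a3 != 0:
                continue  # no integer x3 can satisfy the equation
            x3 = rest // a3
            if x3 in pool and x3 != x1 and x3 != x2:
                return True
    return False


# 3-SUM is the variant (1, 1, 1) with t = 0.
print(has_3ldt_solution([-3, 1, 2, 7], 1, 1, 1, 0))
# The "average" variant x1 + x2 = 2*x3 is (1, 1, -2) with t = 0.
print(has_3ldt_solution([1, 3, 5], 1, 1, -2, 0))
```

The two nested loops give roughly n^2 hash lookups, which is the running time the 3-SUM conjecture says cannot be beaten by a polynomial factor.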

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

Data Structures Meet Cryptography: 3SUM with Preprocessing

Author: Golovnev Alexander
Guo Siyao
Horel Thibaut
Park Sunoo
Vaikuntanathan Vinod
Publication date: 02/04/2021

This paper shows several connections between data structure problems and cryptography against preprocessing attacks. Our results span data structure upper bounds, cryptographic applications, and data structure lower bounds, as summarized next. First, we apply Fiat-Naor inversion, a technique with cryptographic origins, to obtain a data structure upper bound. In particular, our technique yields a suite of algorithms with space $S$ and (online) time $T$ for a preprocessing version of the $N$-input 3SUM problem where $S^3\cdot T = \widetilde{O}(N^6)$. This disproves a strong conjecture (Goldstein et al., WADS 2017) that there is no data structure that solves this problem for $S=N^{2-\delta}$ and $T = N^{1-\delta}$ for any constant $\delta>0$. Secondly, we show equivalence between lower bounds for a broad class of (static) data structure problems and one-way functions in the random oracle model that resist a very strong form of preprocessing attack. Concretely, given a random function $F: [N] \to [N]$ (accessed as an oracle) we show how to compile it into a function $G^F: [N^2] \to [N^2]$ which resists $S$-bit preprocessing attacks that run in query time $T$ where $ST=O(N^{2-\varepsilon})$ (assuming a corresponding data structure lower bound on 3SUM). In contrast, a classical result of Hellman tells us that $F$ itself can be more easily inverted, say with $N^{2/3}$-bit preprocessing in $N^{2/3}$ time. We also show that much stronger lower bounds follow from the hardness of kSUM. Our results can be equivalently interpreted as security against adversaries that are very non-uniform, or have large auxiliary input, or as security in the face of a powerfully backdoored random oracle. Thirdly, we give non-adaptive lower bounds for 3SUM and a range of geometric problems which match the best known lower bounds for static data structure problems.
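A back-of-the-envelope calculation shows why the stated trade-off contradicts the conjecture. This is arithmetic on the two formulas quoted in the abstract, with polylogarithmic factors ignored; it is not taken from the paper itself.

```latex
% Trade-off from the abstract, ignoring polylogarithmic factors:
S^3 \cdot T \approx N^6 .

% Choose the space budget named in the conjecture:
S = N^{2-\delta}
\;\Longrightarrow\;
T \approx \frac{N^6}{N^{3(2-\delta)}} = N^{3\delta} .

% The conjecture rules out T = N^{1-\delta}; the trade-off reaches it whenever
N^{3\delta} \le N^{1-\delta}
\;\Longleftrightarrow\;
3\delta \le 1-\delta
\;\Longleftrightarrow\;
\delta \le \tfrac{1}{4} .
```

Because the hidden polylogarithmic factors need some slack, the comparison is safest for constants strictly below $1/4$, which is already enough to contradict a claim about every constant $\delta>0$.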

arXiv.org e-Print Archive

DSpace@MIT

Segmentation and wake removal of seafaring vessels in optical satellite images

Author: Ali A Mohamoud
Henri Bouma
Rob J Dekker
Robin M Schoemaker
Publication date: 01/01/2013

This paper aims at the segmentation of seafaring vessels in optical satellite images, which allows an accurate length estimation. In maritime situation awareness, vessel length is an important parameter to classify a vessel. The proposed segmentation system consists of robust foreground-background separation, wake detection and ship-wake separation, simultaneous position and profile clustering and a special module for small vessel segmentation. We compared our system with a baseline implementation on 53 vessels that were observed with GeoEye-1. The results show that the relative L1 error in the length estimation is reduced from 3.9 to 0.5, which is an improvement of 87%. We learned that the wake removal is an important element for the accurate segmentation and length estimation of ships.

CiteSeerX

On the Least Median Square Problem

Author: David M. Mount
Jeff Erickson
Sariel Har-Peled
Publication date: 01/01/2003

We consider the exact and approximate computational complexity of the multivariate LMS linear regression estimator. The LMS estimator is among the most widely used robust linear statistical estimators. Given a set of n points in $\mathbb{R}^d$ and a parameter k, the problem is equivalent to computing the slab bounded by two parallel hyperplanes of minimum separation that contains k of the points. We present algorithms for the exact and approximate versions of the multivariate LMS problem. We also provide nearly matching lower bounds on the computational complexity of these problems.
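The slab formulation is easiest to see in the simplest case, a pure location problem on the real line, where the "slab" degenerates to an interval. The sketch below finds the shortest interval containing k of the points by sorting and sliding a window. It is an illustration of the geometric idea only, not one of the paper's multivariate algorithms, and the name `shortest_interval` is hypothetical.

```python
def shortest_interval(points, k):
    """Return (lo, hi) of the shortest closed interval containing k of the points."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    ordered = sorted(points)
    # An optimal interval starts and ends at data points, so it is enough to
    # compare the windows of k consecutive values in sorted order.
    best = min(
        range(len(ordered) - k + 1),
        key=lambda i: ordered[i + k - 1] - ordered[i],
    )
    return ordered[best], ordered[best + k - 1]


# Hypothetical data with one far-away point.
lo, hi = shortest_interval([0.0, 1.0, 1.5, 2.0, 10.0], k=3)
print(lo, hi, (lo + hi) / 2)  # interval endpoints and their midpoint
```

The midpoint of that interval plays the role of the fitted value in this one-dimensional case; sorting dominates the cost, so the sketch runs in O(n log n) time.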

CiteSeerX

Crossref