HopSkipJumpAttack: A Query-Efficient Decision-Based Attack

Author: Chen Jianbo
Jordan Michael I.
Wainwright Martin J.
Publication date: 27/04/2020

The goal of a decision-based adversarial attack on a trained model is to generate adversarial examples based solely on observing output labels returned by the targeted model. We develop HopSkipJumpAttack, a family of algorithms based on a novel estimate of the gradient direction using binary information at the decision boundary. The proposed family includes both untargeted and targeted attacks optimized for $\ell_2$ and $\ell_\infty$ similarity metrics respectively. Theoretical analysis is provided for the proposed algorithms and the gradient direction estimate. Experiments show HopSkipJumpAttack requires significantly fewer model queries than Boundary Attack. It also achieves competitive performance in attacking several widely-used defense mechanisms. (HopSkipJumpAttack was named Boundary Attack++ in a previous version of the preprint.)
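The abstract does not spell out the gradient-direction estimate, so the following is only a minimal sketch of the general idea, assuming a Monte Carlo average of random unit directions weighted by the binary (adversarial or not) label observed at a small probe distance. The function name `estimate_gradient_direction`, the parameters `delta` and `num_queries`, and the baseline subtraction are illustrative assumptions, not the authors' code.

```python
import math
import random


def estimate_gradient_direction(is_adversarial, x, delta=1e-2,
                                num_queries=2000, seed=0):
    """Estimate a gradient direction at a boundary point x from labels only.

    is_adversarial: callable returning True/False for a candidate input
                    (hypothetical stand-in for a label-only model query).
    """
    rng = random.Random(seed)
    d = len(x)
    directions, signs = [], []
    for _ in range(num_queries):
        # Draw a direction uniformly on the unit sphere.
        u = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(v * v for v in u))
        u = [v / norm for v in u]
        probe = [xi + delta * ui for xi, ui in zip(x, u)]
        # Only one bit of information is observed per query.
        signs.append(1.0 if is_adversarial(probe) else -1.0)
        directions.append(u)
    # Subtracting the mean sign is a variance-reduction assumption.
    baseline = sum(signs) / num_queries
    grad = [0.0] * d
    for s, u in zip(signs, directions):
        for i in range(d):
            grad[i] += (s - baseline) * u[i]
    norm = math.sqrt(sum(g * g for g in grad))
    return [g / norm for g in grad] if norm > 0 else grad
```

For a linear decision rule, the returned direction should align closely with the normal of the boundary, which is what makes such an estimate usable as a search direction when only labels are available.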

arXiv.org e-Print Archive

Crossref

ReSQueing Parallel and Private Stochastic Convex Optimization

Author: Carmon Yair
Jambulapati Arun
Jin Yujia
Lee Yin Tat
Liu Daogao
Sidford Aaron
Tian Kevin
Publication date: 27/10/2023

We introduce a new tool for stochastic convex optimization (SCO): a Reweighted Stochastic Query (ReSQue) estimator for the gradient of a function convolved with a (Gaussian) probability density. Combining ReSQue with recent advances in ball oracle acceleration [CJJJLST20, ACJJS21], we develop algorithms achieving state-of-the-art complexities for SCO in parallel and private settings. For a SCO objective constrained to the unit ball in $\mathbb{R}^d$, we obtain the following results (up to polylogarithmic factors). We give a parallel algorithm obtaining optimization error $\epsilon_{\text{opt}}$ with $d^{1/3}\epsilon_{\text{opt}}^{-2/3}$ gradient oracle query depth and $d^{1/3}\epsilon_{\text{opt}}^{-2/3} + \epsilon_{\text{opt}}^{-2}$ gradient queries in total, assuming access to a bounded-variance stochastic gradient estimator. For $\epsilon_{\text{opt}} \in [d^{-1}, d^{-1/4}]$, our algorithm matches the state-of-the-art oracle depth of [BJLLS19] while maintaining the optimal total work of stochastic gradient descent. Given $n$ samples of Lipschitz loss functions, prior works [BFTT19, BFGT20, AFKT21, KLL21] established that if $n \gtrsim d \epsilon_{\text{dp}}^{-2}$, $(\epsilon_{\text{dp}}, \delta)$-differential privacy is attained at no asymptotic cost to the SCO utility. However, these prior works all required a superlinear number of gradient queries. We close this gap for sufficiently large $n \gtrsim d^2 \epsilon_{\text{dp}}^{-3}$, by using ReSQue to design an algorithm with near-linear gradient query complexity in this regime.
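The abstract names the estimator but does not define it. One natural reading of "reweighted stochastic query" is importance reweighting: sample query points from a Gaussian centered at a fixed reference point, then reweight each stochastic gradient by a density ratio so the average targets the Gaussian-smoothed gradient at a nearby point. The sketch below illustrates that reading only; the function name `resque_gradient`, the reference point `x_bar`, and the smoothing radius `rho` are assumptions, not definitions taken from the paper.

```python
import math
import random


def resque_gradient(stochastic_grad, x, x_bar, rho,
                    num_samples=20000, seed=0):
    """Estimate the gradient at x of f convolved with N(0, rho^2 I).

    Query points are drawn around x_bar (not x) and each gradient is
    reweighted by the Gaussian density ratio gamma(z - x) / gamma(z - x_bar).
    """
    rng = random.Random(seed)
    d = len(x)
    total = [0.0] * d
    for _ in range(num_samples):
        # Sample z ~ N(x_bar, rho^2 I).
        z = [xb + rho * rng.gauss(0.0, 1.0) for xb in x_bar]
        sq_x = sum((zi - xi) ** 2 for zi, xi in zip(z, x))
        sq_bar = sum((zi - xb) ** 2 for zi, xb in zip(z, x_bar))
        # Log of the density ratio; normalizing constants cancel.
        weight = math.exp((sq_bar - sq_x) / (2.0 * rho * rho))
        g = stochastic_grad(z)
        for i in range(d):
            total[i] += weight * g[i]
    return [t / num_samples for t in total]
```

Under this reading, the same batch of samples around `x_bar` can be reused for every `x` close to `x_bar`, which is the property that would make the queries parallelizable. The weights stay well behaved only while `x` is within roughly `rho` of `x_bar`.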

arXiv.org e-Print Archive

Learning to Efficiently Rank

Author: Wang Lidan
Publication date: 01/01/2012

Web search engines allow users to find information on almost any topic imaginable. To be successful, a search engine must return relevant information to the user in a short amount of time. However, efficiency (speed) and effectiveness (relevance) are competing forces that often counteract each other. Methods developed for improving effectiveness often incur moderate-to-large computational costs; sustained effectiveness gains therefore typically have to be counter-balanced by buying more or faster hardware, implementing caching strategies if possible, or spending additional effort on low-level optimizations. This thesis describes the "Learning to Efficiently Rank" framework for building highly effective ranking models for Web-scale data without sacrificing run-time efficiency for returning results. It introduces new classes of ranking models that can be simultaneously fast and effective, and discusses how to optimize the models for speed and effectiveness. More specifically, a series of concrete instantiations of the general "Learning to Efficiently Rank" framework are illustrated in detail. First, given a desired tradeoff between effectiveness and efficiency, efficient linear models are introduced, which have a mechanism to directly optimize the tradeoff metric and achieve an optimal balance between the two. Second, temporally constrained models for returning the most effective ranked results possible under a time constraint are described. Third, a cascade ranking model for efficient top-K retrieval over Web-scale documents is proposed, where the ranking effectiveness and efficiency are simultaneously optimized. Finally, a constrained cascade for returning results within time constraints by simultaneously reducing document set size and unnecessary features is discussed in detail.
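The cascade idea mentioned in the abstract can be pictured as a sequence of stages in which cheap scorers prune the candidate set before costlier scorers run. The sketch below is a generic illustration of that pattern, not the thesis's model: the function `cascade_rank`, the `(scorer, keep_fraction)` stage format, and the pruning rule are all assumptions made for the example.

```python
def cascade_rank(documents, stages, top_k):
    """Rank documents through successive prune-and-rescore stages.

    stages: list of (scorer, keep_fraction) pairs, ordered from the
            cheapest scorer to the most expensive one (hypothetical format).
    """
    candidates = list(documents)
    for scorer, keep_fraction in stages:
        # Each stage scores only the documents that survived earlier stages.
        scored = sorted(candidates, key=scorer, reverse=True)
        # Never prune below top_k, so the final list can always be filled.
        keep = max(top_k, int(len(scored) * keep_fraction))
        candidates = scored[:keep]
    return candidates[:top_k]
```

The trade-off is visible in the pruning step: a document that an expensive scorer would rank highly can be discarded early by a cheap scorer, which is the cost paid for evaluating the expensive features on fewer documents.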

Digital Repository at the University of Maryland

Selective Query Processing: a Risk-Sensitive Selection of System Configurations

Author: Mothe Josiane
Ullah Md Zia
Publication date: 17/05/2023

In information retrieval systems, search parameters are optimized to ensure high effectiveness based on a set of past searches, and these optimized parameters are then used as the system configuration for all subsequent queries. A better approach, however, would be to adapt the parameters to fit the query at hand. Selective query expansion is one such approach, in which the system decides automatically whether or not to expand the query, resulting in two possible system configurations. This approach was extended recently to include many other parameters, leading to many possible system configurations where the system automatically selects the best configuration on a per-query basis. To determine the ideal configurations to use on a per-query basis in real-world systems, we developed a method in which a restricted number of possible configurations is pre-selected and then used in a meta-search engine that decides the best search configuration on a per-query basis. We define a risk-sensitive approach for configuration pre-selection that considers the risk-reward trade-off between the number of configurations kept and system effectiveness. For final configuration selection, the decision is based on query feature similarities. We find that a relatively small number of configurations (20) selected by our risk-sensitive model is sufficient to increase effectiveness by about 15% according to P@10 and nDCG@10 when compared to traditional grid search using a single configuration, and by about 20% when compared to learning to rank documents. Our risk-sensitive approach works for both diversity- and ad hoc-oriented searches. Moreover, the similarity-based selection method outperforms the more sophisticated approaches. Thus, we demonstrate the feasibility of developing per-query information retrieval systems, which will guide future research in this direction. Comment: 30 pages, 5 figures, 8 tables; submitted to TOIS ACM journal
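The abstract states that the final configuration is chosen from query feature similarities but does not give the procedure. A minimal sketch of one possible reading follows, assuming a nearest-neighbor lookup with cosine similarity over past queries whose best configuration is already known. The function names, the feature vectors, and the configuration labels are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_configuration(query_features, training_queries):
    """Return the configuration of the most similar past query.

    training_queries: list of (features, best_config) pairs, where
    best_config comes from a small pre-selected set (assumed format).
    """
    _, best_config = max(
        training_queries,
        key=lambda pair: cosine(query_features, pair[0]),
    )
    return best_config
```

For example, with past queries `([1, 0, 0], "no_expansion")` and `([0, 1, 0], "with_expansion")`, a new query with features `[0.9, 0.1, 0]` would be routed to the first configuration.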

arXiv.org e-Print Archive