
    On Profit-Maximizing Pricing for the Highway and Tollbooth Problems

    In the tollbooth problem, we are given a tree $T=(V,E)$ with $n$ edges and a set of $m$ customers, each of whom is interested in purchasing a path on the tree. Each customer has a fixed budget, and the objective is to price the edges of $T$ so that the total revenue made by selling the paths to the customers who can afford them is maximized. An important special case of this problem, known as the highway problem, is when $T$ is restricted to be a line. For the tollbooth problem, we present a randomized $O(\log n)$-approximation, improving on the current best $O(\log m)$-approximation. We also study a special case of the tollbooth problem in which all the paths that customers are interested in purchasing go towards a fixed root of $T$. In this case, we present an algorithm that returns a $(1-\epsilon)$-approximation, for any $\epsilon > 0$, and runs in quasi-polynomial time. On the other hand, we rule out the existence of an FPTAS by showing that even for the line case, the problem is strongly NP-hard. Finally, we show that in the coupon model, where some items may be priced below zero to improve the overall profit, the problem becomes APX-hard.
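
    To make the objective concrete, here is a minimal brute-force sketch of the highway special case in Python; the instance, the price grid, and the function names are illustrative assumptions, not the paper's algorithms.

    ```python
    from itertools import product

    def revenue(prices, customers):
        """Revenue of one pricing: a customer wanting the edge segment
        [i, j) buys it iff its total price fits their budget, and then
        pays exactly that price."""
        total = 0.0
        for (i, j), budget in customers:
            cost = sum(prices[i:j])
            if cost <= budget:
                total += cost
        return total

    def best_pricing(n_edges, customers, grid):
        """Exhaustive search over a small price grid -- exponential in
        the number of edges, for illustration only; the paper gives
        approximation algorithms instead."""
        return max(product(grid, repeat=n_edges),
                   key=lambda p: revenue(p, customers))

    # Hypothetical 3-edge line; customers are ((i, j), budget) pairs.
    customers = [((0, 2), 5.0), ((1, 3), 4.0), ((0, 3), 7.0)]
    prices = best_pricing(3, customers, grid=range(6))
    print(prices, revenue(prices, customers))
    ```

    Allowing negative values in the price grid models the coupon setting mentioned above.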

    Learning Kernel Perceptrons on Noisy Data and Random Projections

    In this paper, we address the issue of learning nonlinearly separable concepts with a kernel classifier when the data at hand are altered by uniform classification noise. Our approach relies on combining random or deterministic projections with a classification-noise-tolerant perceptron learning algorithm that assumes distributions defined over finite-dimensional spaces. Provided a sufficient separation margin characterizes the problem, this strategy makes it possible to learn from a noisy distribution in any separable Hilbert space, regardless of its dimension; learning with any appropriate Mercer kernel is therefore possible. We prove that the required sample complexity and running time of our algorithm are polynomial in the classical PAC learning parameters. Numerical simulations on toy datasets and on data from the UCI repository support the validity of our approach.
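
    A minimal sketch of the pipeline, assuming an RBF kernel, a landmark-based projection standing in for the projection step, and a plain perceptron in place of the paper's noise-tolerant variant.

    ```python
    import numpy as np

    def rbf(x, z, gamma=1.0):
        return np.exp(-gamma * np.sum((x - z) ** 2))

    def project(X, landmarks, gamma=1.0):
        """Finite-dimensional image of the data: each point is
        represented by its kernel values against random landmarks."""
        return np.array([[rbf(x, z, gamma) for z in landmarks] for x in X])

    def perceptron(X, y, epochs=50):
        """Ordinary perceptron on the projected data; the paper uses a
        classification-noise-tolerant update instead."""
        w = np.zeros(X.shape[1])
        for _ in range(epochs):
            for xi, yi in zip(X, y):
                if yi * (w @ xi) <= 0:
                    w += yi * xi
        return w

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 2))
    y = np.sign(X[:, 0] * X[:, 1])      # a nonlinearly separable concept
    y[rng.random(200) < 0.1] *= -1      # uniform classification noise
    landmarks = X[rng.choice(len(X), size=20, replace=False)]
    w = perceptron(project(X, landmarks), y)
    ```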

    On the Usefulness of Similarity Based Projection Spaces for Transfer Learning

    Talk: http://videolectures.net/simbad2011_morvant_transfer/ (16 pages)

    Similarity functions are widely used in many machine learning and pattern recognition tasks. We consider a recent framework for binary classification, proposed by Balcan et al., that allows learning in a potentially non-geometrical space based on good similarity functions. This framework generalizes the notion of kernels used in support vector machines in the sense that it allows one to use similarity functions that do not need to be positive semi-definite or symmetric. The similarities are then used to define an explicit projection space where a linear classifier with good generalization properties can be learned. In this paper, we study experimentally the usefulness of similarity-based projection spaces for transfer learning. More precisely, we consider the problem of domain adaptation, where the distributions generating the learning data and the test data are somewhat different, in the case where no information on the test labels is available. We show that a simple renormalization of a good similarity function, taking the test data into account, allows us to learn classifiers that perform better on the target distribution for difficult adaptation problems. Moreover, this normalization always improves the model when we regularize the similarity-based projection space in order to bring the two distributions closer. We provide experiments on a toy problem and on a real image annotation task.
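
    The explicit projection space is easy to sketch; the renormalization below is a hypothetical stand-in (rescaling each coordinate by its magnitude on the unlabeled test data), since the abstract does not spell out the exact formula.

    ```python
    import numpy as np

    def sim(x, z):
        # Any similarity is allowed: it need not be PSD nor symmetric.
        return np.tanh(x @ z)

    def phi(X, landmarks):
        """Explicit projection space of Balcan et al.: one coordinate
        per landmark point, holding the similarity to that landmark."""
        return np.array([[sim(x, z) for z in landmarks] for x in X])

    def renormalize(F_source, F_target):
        """Hypothetical renormalization: put source and target features
        on a common scale using only the unlabeled target data."""
        scale = np.abs(F_target).max(axis=0) + 1e-12
        return F_source / scale, F_target / scale
    ```

    A linear classifier (e.g., a regularized linear SVM) is then learned on the renormalized source features and applied to the target.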

    New Unsupervised Support Vector Machines


    When Does Co-Training Work in Real Data?

    Co-training, a paradigm of semi-supervised learning, promises to effectively alleviate the shortage of labeled examples in supervised learning. Standard two-view co-training requires the dataset to be described by two views of features, and previous studies have shown that co-training works well if the two views satisfy the sufficiency and independence assumptions. In practice, however, these two assumptions are often not known or ensured (even when the two views are given). More commonly, most supervised datasets are described by one set of attributes (one view), so they need to be split into two views in order to apply standard two-view co-training. In this paper, we first propose a novel approach to empirically verify the two assumptions of co-training given two views. Then, we design several methods to split single-view datasets into two views, in order to make co-training work reliably well. Our empirical results show that, given a whole or large labeled training set, our view verification and splitting methods are quite effective. Unfortunately, co-training is called for precisely when the labeled training set is small. Given small labeled training sets, we show that the two co-training assumptions are difficult to verify and that view splitting is unreliable. Our conclusions on co-training's effectiveness are therefore mixed: if two views are given and known to satisfy the two assumptions, co-training works well; otherwise, based on small labeled training sets, verifying the assumptions or splitting a single view into two views is unreliable, so it is uncertain whether standard co-training would work or not.
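
    A compact sketch of the standard two-view loop, assuming labels in {0, 1} and scikit-learn's LogisticRegression as the base learner; the splitting strategies and assumption tests from the paper are not reproduced here.

    ```python
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def co_train(X1, X2, y, labeled, rounds=10, k=5):
        """Each view's classifier labels its k most confident unlabeled
        examples and hands them to the other view; y holds 0/1 labels
        (values at unlabeled positions are placeholders)."""
        y, labeled = y.copy(), labeled.copy()
        h1, h2 = LogisticRegression(), LogisticRegression()
        for _ in range(rounds):
            if labeled.all():
                break
            h1.fit(X1[labeled], y[labeled])
            h2.fit(X2[labeled], y[labeled])
            for h, X in ((h1, X1), (h2, X2)):
                proba = h.predict_proba(X)
                conf = proba.max(axis=1)
                conf[labeled] = -1.0            # ignore labeled points
                picks = np.argsort(conf)[-k:]   # most confident unlabeled
                picks = picks[conf[picks] >= 0]
                y[picks] = proba[picks].argmax(axis=1)
                labeled[picks] = True
        return h1, h2

    # The simplest single-view split is by feature halves:
    # X1, X2 = X[:, :d // 2], X[:, d // 2:]
    ```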

    On the Complexity of the Highway Pricing Problem

    The highway pricing problem asks for prices to be determined for segments of a single highway so as to maximize the revenue obtainable from a given set of customers with known valuations. The problem is NP-hard, and a recent quasi-PTAS suggests that a PTAS might be within reach. Yet, so far it has resisted any attempt at constant-factor approximation algorithms. We relate the tractability of the problem to structural properties of customers' valuations. We show that the problem becomes NP-hard as soon as the average valuations of customers are not homogeneous, even under further restrictions such as monotonicity. Moreover, we derive an efficient approximation algorithm, parameterized by the inhomogeneity of customers' valuations. Finally, we discuss extensions of our results that go beyond the highway pricing problem.
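
    In this spirit, one can quantify how far customers' average per-unit valuations are from homogeneous; the measure below is a guessed illustration, not necessarily the paper's exact parameter.

    ```python
    def inhomogeneity(customers):
        """Spread of per-edge average valuations; customers are
        ((i, j), valuation) pairs for the edge segment [i, j).
        A value of 1.0 means perfectly homogeneous valuations."""
        per_edge = [v / (j - i) for (i, j), v in customers]
        return max(per_edge) / min(per_edge)

    print(inhomogeneity([((0, 2), 5.0), ((1, 3), 4.0), ((0, 3), 7.0)]))
    ```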

    A Quasi-PTAS for Profit-Maximizing Pricing on Line Graphs


    Semi-supervised On-Line Boosting for Robust Tracking

    Recently, on-line adaptation of binary classifiers for tracking has been investigated. On-line learning allows for simple classifiers, since only the current view of the object needs to be discriminated from its surrounding background. However, on-line adaptation faces one key problem: each update of the tracker may introduce an error which can eventually lead to tracking failure (drifting). The contribution of this paper is a novel on-line semi-supervised boosting method which significantly alleviates the drifting problem in tracking applications, limiting drift while still staying adaptive to appearance changes. The main idea is to formulate the update process in a semi-supervised fashion, as the combined decision of a given prior and an on-line classifier. This comes without any parameter tuning. In the experiments, we demonstrate real-time tracking with our SemiBoost tracker on several challenging test sequences, where it outperforms other on-line tracking methods.
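
    The combined-decision idea can be sketched as follows; `partial_fit` and the pseudo-labeling are simplified, hypothetical placeholders for SemiBoost's actual weak-learner updates.

    ```python
    def track_score(h_prior, h_online, patch):
        """Combined decision: a fixed prior classifier anchors the
        tracker while the on-line classifier adapts to appearance
        changes; both return real-valued confidences."""
        return h_prior(patch) + h_online(patch)

    def update(h_online, h_prior, patches):
        """New patches arrive unlabeled; the prior supplies the
        supervision signal, so the on-line part cannot drift
        arbitrarily far from it. (Simplified stand-in for the
        boosting update in the paper.)"""
        for patch in patches:
            pseudo_label = 1 if h_prior(patch) > 0 else -1
            h_online.partial_fit(patch, pseudo_label)  # hypothetical interface
    ```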

    Robust reductions from ranking to classification

    We reduce ranking, as measured by the Area Under the Receiver Operating Characteristic curve (AUC), to binary classification. The core theorem shows that a binary classification regret of $r$ on the induced binary problem implies an AUC regret of at most $2r$. This is a large improvement over approaches such as ordering according to regressed scores, which have a regret transform of $r \mapsto nr$, where $n$ is the number of elements.
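
    A minimal sketch of the pairwise reduction for bipartite instances, using a logistic model as the pair classifier and a degree-based decoding; the paper's exact construction and decoding may differ.

    ```python
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def pairwise_dataset(X, y):
        """One binary example per (positive, negative) pair, asking
        'should the first element be ranked above the second?'."""
        P, N = X[y == 1], X[y == 0]
        pairs, labels = [], []
        for a in P:
            for b in N:
                pairs.append(np.concatenate([a, b])); labels.append(1)
                pairs.append(np.concatenate([b, a])); labels.append(0)
        return np.array(pairs), np.array(labels)

    def rank_by_wins(clf, X):
        """Rank each element by its number of pairwise 'wins' under
        the learned pair classifier (degree-based decoding)."""
        wins = np.zeros(len(X))
        for i, a in enumerate(X):
            for j, b in enumerate(X):
                if i != j:
                    wins[i] += clf.predict(np.concatenate([a, b])[None])[0]
        return np.argsort(-wins)

    rng = np.random.default_rng(0)
    X = rng.normal(size=(40, 3)); y = (X[:, 0] > 0).astype(int)
    Xp, yp = pairwise_dataset(X, y)
    order = rank_by_wins(LogisticRegression().fit(Xp, yp), X)
    ```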