On Profit-Maximizing Pricing for the Highway and Tollbooth Problems
In the tollbooth problem, we are given a tree T = (V, E) and a set of
customers, each of whom is interested in purchasing a
path on the tree. Each customer has a fixed budget, and the objective is to
price the edges of T such that the total revenue made by selling the paths
to the customers that can afford them is maximized. An important special case
of this problem, known as the highway problem, is when T is
restricted to be a line.
For the tollbooth problem, we present a randomized approximation algorithm
whose guarantee improves on the current best approximation factor. We also study a
special case of the tollbooth problem, when all the paths that customers are
interested in purchasing go towards a fixed root of T. In this case, we
present an algorithm with a provable approximation guarantee that runs in
quasi-polynomial time. On the other hand, we rule
out the existence of an FPTAS by showing that even for the line case, the
problem is strongly NP-hard. Finally, we show that in the coupon model,
when we allow some items to be priced below zero to improve the overall profit,
the problem even becomes APX-hard.
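The objective described above can be made concrete for the line (highway) case. The following is a minimal sketch, not the paper's algorithm: it only evaluates the revenue of a given price vector. The customer encoding as (first_edge, last_edge, budget) and the function name highway_revenue are assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical encoding: a customer is (first_edge, last_edge, budget) and
# wants the contiguous edges first_edge..last_edge (inclusive) of the line.
Customer = Tuple[int, int, float]


def highway_revenue(prices: List[float], customers: List[Customer]) -> float:
    """Return the total revenue from customers who can afford their path."""
    # Prefix sums give the price of any interval of edges in O(1).
    prefix = [0.0]
    for price in prices:
        prefix.append(prefix[-1] + price)

    revenue = 0.0
    for first, last, budget in customers:
        path_price = prefix[last + 1] - prefix[first]
        # A customer buys only if the path price is within budget.
        if path_price <= budget:
            revenue += path_price
    return revenue
```

Evaluating a fixed pricing is the easy part; the hardness results in the abstract concern the search for the revenue-maximizing prices.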
Learning Kernel Perceptrons on Noisy Data and Random Projections
In this paper, we address the issue of learning nonlinearly separable concepts with a kernel classifier when the data at hand are altered by uniform classification noise. Our proposed approach combines random or deterministic projections with a classification-noise-tolerant perceptron learning algorithm that assumes distributions defined over finite-dimensional spaces. Provided that a sufficient separation margin characterizes the problem, this strategy makes it possible to learn from a noisy distribution in any separable Hilbert space, regardless of its dimension; learning with any appropriate Mercer kernel is therefore possible. We prove that the required sample complexity and running time of our algorithm are polynomial in the classical PAC learning parameters. Numerical simulations on toy datasets and on data from the UCI repository support the validity of our approach.
On the Usefulness of Similarity Based Projection Spaces for Transfer Learning
Talk: http://videolectures.net/simbad2011_morvant_transfer/, 16 pages. Similarity functions are widely used in many machine learning and pattern recognition tasks. We consider here a recent framework for binary classification, proposed by Balcan et al., that allows learning in a potentially non-geometrical space based on good similarity functions. This framework generalizes the notion of kernels used in support vector machines in the sense that it allows one to use similarity functions that need not be positive semi-definite or symmetric. The similarities are then used to define an explicit projection space where a linear classifier with good generalization properties can be learned. In this paper, we propose to study experimentally the usefulness of similarity-based projection spaces for transfer learning. More precisely, we consider the problem of domain adaptation, where the distributions generating the learning data and the test data are somewhat different. We consider the setting in which no information on the test labels is available. We show that a simple renormalization of a good similarity function that takes the test data into account allows us to learn classifiers that perform better on the target distribution for difficult adaptation problems. Moreover, this normalization always helps to improve the model when we try to regularize the similarity-based projection space in order to bring the two distributions closer together. We provide experiments on a toy problem and on a real image annotation task.
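The explicit projection space mentioned above can be sketched as follows: each point is mapped to the vector of its similarities to a set of landmark points, and a linear classifier is then trained on those vectors. This is only an illustration under assumptions. The names project and example_similarity, and the particular similarity function, are hypothetical and not taken from the paper, and the renormalization step studied in the paper is not shown.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]
Similarity = Callable[[Vector, Vector], float]


def example_similarity(a: Vector, b: Vector) -> float:
    """Hypothetical similarity in [0, 1] for inputs with coordinates in [0, 1].

    It is one minus the mean absolute coordinate difference (capped at 1).
    Any similarity function could be plugged in here, including ones that
    are not symmetric or not positive semi-definite.
    """
    mean_abs_diff = sum(abs(x - y) for x, y in zip(a, b)) / len(a)
    return 1.0 - min(1.0, mean_abs_diff)


def project(x: Vector, landmarks: List[Vector], similarity: Similarity) -> List[float]:
    """Map x to its vector of similarities to the landmark points."""
    return [similarity(x, landmark) for landmark in landmarks]
```

A linear classifier would then be trained on the projected vectors rather than on the raw inputs.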
When Does Co-Training Work in Real Data?
Co-training, a paradigm of semi-supervised learning, promises to effectively alleviate the shortage of labeled examples in supervised learning. The standard two-view co-training requires the dataset to be described by two views of features, and previous studies have shown that co-training works well if the two views satisfy the sufficiency and independence assumptions. In practice, however, these two assumptions are often not known or ensured (even when the two views are given). More commonly, most supervised datasets are described by one set of attributes (one view). Thus, they need to be split into two views in order to apply the standard two-view co-training. In this paper, we first propose a novel approach to empirically verify the two assumptions of co-training given two views. Then, we design several methods to split single-view datasets into two views, in order to make co-training work reliably well. Our empirical results show that, given a whole or a large labeled training set, our view verification and splitting methods are quite effective. Unfortunately, co-training is called for precisely when the labeled training set is small. Given small labeled training sets, we show that the two co-training assumptions are difficult to verify, and view splitting is unreliable. Our conclusions on the effectiveness of co-training are mixed. If two views are given and known to satisfy the two assumptions, co-training works well. Otherwise, based on small labeled training sets, verifying the assumptions or splitting a single view into two views is unreliable, so it is uncertain whether the standard co-training would work or not.
On the Complexity of the Highway Pricing Problem
The highway pricing problem asks for prices to be determined for segments of a single highway so as to maximize the revenue obtainable from a given set of customers with known valuations. The problem is NP-hard, and a recent quasi-PTAS suggests that a PTAS might be within reach. Yet, so far it has resisted all attempts at a constant-factor approximation algorithm. We relate the tractability of the problem to structural properties of customers' valuations. We show that the problem becomes NP-hard as soon as the average valuations of customers are not homogeneous, even under further restrictions such as monotonicity. Moreover, we derive an efficient approximation algorithm, parameterized along the inhomogeneity of customers' valuations. Finally, we discuss extensions of our results that go beyond the highway pricing problem.
Semi-supervised On-Line Boosting for Robust Tracking
Recently, on-line adaptation of binary classifiers for tracking has been investigated. On-line learning allows for simple classifiers, since only the current view of the object needs to be discriminated from its surrounding background. However, on-line adaptation faces one key problem: each update of the tracker may introduce an error which, eventually, can lead to tracking failure (drifting). The contribution of this paper is a novel on-line semi-supervised boosting method which significantly alleviates the drifting problem in tracking applications while still staying adaptive to appearance changes. The main idea is to formulate the update process in a semi-supervised fashion as a combined decision of a given prior and an on-line classifier. This comes without any parameter tuning. In the experiments, we demonstrate real-time tracking with our SemiBoost tracker on several challenging test sequences, where our tracker outperforms other on-line tracking methods.
Robust reductions from ranking to classification
We reduce ranking, as measured by the Area Under the Receiver Operating Characteristic Curve (AUC), to binary classification. The core theorem shows that a binary classification regret of r on the induced binary problem implies an AUC regret of at most 2r. This is a large improvement over approaches such as ordering according to regressed scores, whose regret transform depends on n, the number of elements.
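To make the ranking measure concrete, the sketch below computes AUC as the fraction of (positive, negative) pairs that a scoring places in the correct order, with ties counted as half. It illustrates only the quantity being bounded, not the reduction itself. The function name auc and the 0/1 label encoding are assumptions for illustration.

```python
from typing import List


def auc(scores: List[float], labels: List[int]) -> float:
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")

    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0  # pair ordered correctly
            elif p == n:
                correct += 0.5  # tie counts as half
    return correct / (len(positives) * len(negatives))
```

Each mis-ordered pair corresponds to one wrong answer to the binary question "should this element be ranked above that one?", which is the kind of pairwise view that links ranking quality to binary classification.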