Beyond Submodularity: A Unified Framework of Randomized Set Selection with Group Fairness Constraints
Machine learning algorithms play an important role in a variety of
decision-making processes, including targeted advertisement displays, home loan
approvals, and criminal behavior predictions. Given the far-reaching impact of
these algorithms, it is crucial that they operate fairly, free from bias or
prejudice towards certain groups in the population. Ensuring impartiality in
these algorithms is essential for promoting equality and avoiding
discrimination. To this end we introduce a unified framework for randomized
subset selection that incorporates group fairness constraints. Our problem
involves a global utility function and a set of group utility functions for
each group. Here, a group refers to a set of individuals (e.g., people)
sharing the same attributes (e.g., gender). Our aim is to generate a
distribution across feasible subsets, specifying the selection probability of
each feasible set, to maximize the global utility function while meeting a
predetermined quota for each group utility function in expectation. Note that
there may not necessarily be any direct connections between the global utility
function and each group utility function. We demonstrate that this framework
unifies and generalizes many significant applications in machine learning and
operations research. Our algorithmic results either improve the best known
results or provide the first approximation algorithms for new applications.
Comment: This paper has been accepted for publication in the Journal on
Combinatorial Optimizatio
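The problem described in this abstract can be written as an optimization over distributions. The block below is only a sketch under assumed notation that does not appear in the listing: $\mathcal{F}$ is the family of feasible subsets, $f$ the global utility, $g_i$ the utility of group $i$, $\alpha_i$ its quota, $m$ the number of groups, and $p_S$ the probability of selecting $S$. Requiring the probabilities to sum to exactly one is also an assumption.

```latex
\begin{align*}
\max_{p}\quad & \sum_{S \in \mathcal{F}} p_S \, f(S) \\
\text{s.t.}\quad & \sum_{S \in \mathcal{F}} p_S \, g_i(S) \ge \alpha_i, \quad i = 1, \dots, m, \\
& \sum_{S \in \mathcal{F}} p_S = 1, \qquad p_S \ge 0 \quad \forall S \in \mathcal{F}.
\end{align*}
```

Written this way, the program is linear in $p$, but it has one variable per feasible subset, which is typically exponential in the size of the ground set.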
Global optimization based on active preference learning with radial basis functions
This paper proposes a method for solving optimization problems in which the decision maker cannot evaluate the objective function, but rather can only express a preference such as "this is better than that" between two candidate decision vectors. The algorithm described in this paper aims at reaching the global optimizer by iteratively proposing to the decision maker a new comparison to make, based on actively learning a surrogate of the latent (unknown and perhaps unquantifiable) objective function from past sampled decision vectors and pairwise preferences. A radial-basis function surrogate is fit via linear or quadratic programming, satisfying if possible the preferences expressed by the decision maker on existing samples. The surrogate is used to propose a new sample of the decision vector for comparison with the current best candidate based on two possible criteria: minimize a combination of the surrogate and an inverse distance weighting function to balance between exploitation of the surrogate and exploration of the decision space, or maximize a function related to the probability that the new candidate will be preferred. Compared to active preference learning based on Bayesian optimization, we show that our approach is competitive in that, within the same number of comparisons, it usually approaches the global optimum more closely and is computationally lighter. Applications of the proposed algorithm to solve a set of benchmark global optimization problems, for multi-objective optimization, and for optimal tuning of a cost-sensitive neural network classifier for object recognition from images are described in the paper. MATLAB and Python implementations of the algorithms described in the paper are available at http://cse.lab.imtlucca.it/~bemporad/glis
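The first acquisition criterion in this abstract (surrogate combined with a distance-based exploration term) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the paper's definition: the inverse-quadratic kernel, the arctan form of the exploration term, the trade-off parameter `delta`, and the hard-coded `weights` (which the paper obtains from a linear or quadratic program, not shown).

```python
import math

def rbf_surrogate(x, centers, weights, eps=1.0):
    """Surrogate value at x: weighted sum of inverse-quadratic RBFs (assumed kernel)."""
    return sum(w / (1.0 + (eps * math.dist(x, c)) ** 2)
               for w, c in zip(weights, centers))

def idw_exploration(x, centers):
    """Exploration term: zero at sampled points, growing with distance from them."""
    inv_sum = 0.0
    for c in centers:
        d2 = math.dist(x, c) ** 2
        if d2 == 0.0:
            return 0.0  # x coincides with an existing sample
        inv_sum += 1.0 / d2
    return (2.0 / math.pi) * math.atan(1.0 / inv_sum)

def acquisition(x, centers, weights, delta=1.0):
    """Lower is better: exploit a low surrogate value, explore far from samples."""
    return rbf_surrogate(x, centers, weights) - delta * idw_exploration(x, centers)

def propose_next(candidates, centers, weights, delta=1.0):
    """Pick the candidate minimizing the acquisition (grid search for simplicity)."""
    return min(candidates, key=lambda x: acquisition(x, centers, weights, delta))

# Hypothetical data: three sampled points in 1-D and made-up surrogate weights.
centers = [(0.0,), (1.0,), (2.0,)]
weights = [0.5, -0.2, 0.3]
grid = [(i / 10.0,) for i in range(-10, 31)]
next_x = propose_next(grid, centers, weights, delta=1.0)
```

The proposed point `next_x` would then be shown to the decision maker for comparison with the current best candidate, and the recorded preference would be added to the constraints used to refit the surrogate.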
Fast Witness Extraction Using a Decision Oracle
The gist of many (NP-)hard combinatorial problems is to decide whether a
universe of n elements contains a witness consisting of k elements that
match some prescribed pattern. For some of these problems there are known
advanced algebra-based FPT algorithms which solve the decision problem but do
not return the witness. We investigate techniques for turning such a
YES/NO-decision oracle into an algorithm for extracting a single witness, with
an objective to obtain practical scalability for large values of n. By
relying on techniques from combinatorial group testing, we demonstrate that a
witness may be extracted with O(k log n) queries to either a deterministic or
a randomized set inclusion oracle with one-sided probability of error.
Furthermore, we demonstrate through implementation and experiments that the
algebra-based FPT algorithms are practical, in particular in the setting of the
k-path problem. Also discussed are engineering issues such as optimizing
finite field arithmetic.
Comment: Journal version, 16 pages. Extended abstract presented at ESA'1
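The idea of turning a YES/NO oracle into a witness extractor can be sketched with repeated binary search. This is a simple illustration, not necessarily the paper's algorithm. It assumes a deterministic inclusion oracle (a hypothetical callable `oracle(subset)` that returns True exactly when the subset contains a complete witness) and it does not handle the randomized, one-sided-error case. Each witness element costs one binary search over the remaining candidates.

```python
def extract_witness(universe, oracle):
    """Return one witness as a set, or None if the universe contains none.

    `oracle(subset)` must be monotone: True iff `subset` contains a witness.
    """
    items = list(universe)
    if not oracle(set(items)):
        return None
    required = []            # elements already known to belong to the witness
    bound = len(items)       # candidates are items[:bound]
    while not oracle(set(required)):
        # Find the shortest prefix that, together with `required`, holds a witness.
        lo, hi = 1, bound
        while lo < hi:
            mid = (lo + hi) // 2
            if oracle(set(items[:mid]) | set(required)):
                hi = mid
            else:
                lo = mid + 1
        # The last element of that prefix is in every witness it contains.
        required.append(items[lo - 1])
        bound = lo - 1
    return set(required)

# Hypothetical instance: a hidden 3-element witness in a universe of 20 elements.
hidden = {3, 11, 17}
queries = []

def counting_oracle(subset):
    queries.append(len(subset))   # record each query for inspection
    return hidden <= subset

found = extract_witness(range(20), counting_oracle)
print(sorted(found))              # prints [3, 11, 17]
```

Because every element added to `required` is forced by the shortest-prefix property, the returned set contains no superfluous elements, and the number of oracle calls grows with the witness size times the logarithm of the universe size.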
Ensemble of Example-Dependent Cost-Sensitive Decision Trees
Several real-world classification problems are example-dependent
cost-sensitive in nature, where the costs due to misclassification vary between
examples and not only within classes. However, standard classification methods
do not take these costs into account, and assume a constant cost of
misclassification errors. In previous works, some methods that incorporate
the financial costs into the training of different algorithms have been
proposed, with the example-dependent cost-sensitive decision tree algorithm
being the one that gives the highest savings. In this paper we propose a new
framework of ensembles of example-dependent cost-sensitive decision trees. The
framework consists of creating different example-dependent cost-sensitive
decision trees on random subsamples of the training set, and then combining
them using three different combination approaches. Moreover, we propose two new
cost-sensitive combination approaches: cost-sensitive weighted voting and
cost-sensitive stacking, the latter being based on the cost-sensitive logistic
regression method. Finally, using five different databases from four
real-world applications: credit card fraud detection, churn modeling, credit
scoring and direct marketing, we evaluate the proposed method against
state-of-the-art example-dependent cost-sensitive techniques, namely,
cost-proportionate sampling, Bayes minimum risk and cost-sensitive decision
trees. The results show that the proposed algorithms achieve higher savings
on all databases.
Comment: 13 pages, 6 figures, Submitted for possible publicatio
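The notions of example-dependent cost, savings, and cost-sensitive weighted voting mentioned in this abstract can be sketched in a few lines. The definitions below are assumptions for illustration, not taken from the listing: each example carries its own hypothetical cost tuple `(c_fp, c_fn, c_tp, c_tn)`, savings are measured against the cheaper of the two trivial classifiers (all negative or all positive), and each base classifier votes with a weight equal to its savings, clipped at zero.

```python
def total_cost(y_true, y_pred, costs):
    """Sum the per-example cost of each prediction outcome."""
    total = 0.0
    for y, p, (c_fp, c_fn, c_tp, c_tn) in zip(y_true, y_pred, costs):
        if y == 1:
            total += c_tp if p == 1 else c_fn
        else:
            total += c_fp if p == 1 else c_tn
    return total

def savings(y_true, y_pred, costs):
    """Relative cost reduction versus the cheaper trivial classifier.

    Assumes that baseline cost is strictly positive.
    """
    n = len(y_true)
    base = min(total_cost(y_true, [0] * n, costs),
               total_cost(y_true, [1] * n, costs))
    return (base - total_cost(y_true, y_pred, costs)) / base

def weighted_vote(predictions, weights):
    """Combine binary predictions; a weighted majority decides each example."""
    weights = [max(w, 0.0) for w in weights]   # ignore models with negative savings
    if sum(weights) == 0.0:
        weights = [1.0] * len(predictions)     # fall back to plain majority voting
    half = sum(weights) / 2.0
    n = len(predictions[0])
    return [1 if sum(w * p[i] for w, p in zip(weights, predictions)) >= half else 0
            for i in range(n)]

# Hypothetical data: false negatives cost much more than false positives.
y_true = [1, 0, 1, 0]
costs = [(1.0, 10.0, 0.0, 0.0), (1.0, 10.0, 0.0, 0.0),
         (1.0, 5.0, 0.0, 0.0), (1.0, 5.0, 0.0, 0.0)]
model_preds = [[1, 0, 1, 0], [1, 1, 1, 1], [0, 0, 0, 0]]
model_savings = [savings(y_true, p, costs) for p in model_preds]
ensemble = weighted_vote(model_preds, model_savings)
```

In practice the weights would be estimated on data not used to train each tree, so that a tree's vote reflects its out-of-sample savings rather than its fit to its own subsample.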