27,234 research outputs found
iFair: Learning Individually Fair Data Representations for Algorithmic Decision Making
People are rated and ranked, towards algorithmic decision making in an
increasing number of applications, typically based on machine learning.
Research on how to incorporate fairness into such tasks has prevalently pursued
the paradigm of group fairness: giving adequate success rates to specifically
protected groups. In contrast, the alternative paradigm of individual fairness
has received relatively little attention, and this paper advances this less
explored direction. The paper introduces a method for probabilistically mapping
user records into a low-rank representation that reconciles individual fairness
and the utility of classifiers and rankings in downstream applications. Our
notion of individual fairness requires that users who are similar in all
task-relevant attributes such as job qualification, and disregarding all
potentially discriminating attributes such as gender, should have similar
outcomes. We demonstrate the versatility of our method by applying it to
classification and learning-to-rank tasks on a variety of real-world datasets.
Our experiments show substantial improvements over the best prior work for this
setting.Comment: Accepted at ICDE 2019. Please cite the ICDE 2019 proceedings versio
Classification of protein interaction sentences via gaussian processes
The increase in the availability of protein interaction studies in textual format coupled with the demand for easier access to the key results has lead to a need for text mining solutions. In the text processing pipeline, classification is a key step for extraction of small sections of relevant text. Consequently, for the task of locating protein-protein interaction sentences, we examine the use of a classifier which has rarely been applied to text, the Gaussian processes (GPs). GPs are a non-parametric probabilistic analogue to the more popular support vector machines (SVMs). We find that GPs outperform the SVM and na\"ive Bayes classifiers on binary sentence data, whilst showing equivalent performance on abstract and multiclass sentence corpora. In addition, the lack of the margin parameter, which requires costly tuning, along with the principled multiclass extensions enabled by the probabilistic framework make GPs an appealing alternative worth of further adoption
Machine Learning in Automated Text Categorization
The automated categorization (or classification) of texts into predefined
categories has witnessed a booming interest in the last ten years, due to the
increased availability of documents in digital form and the ensuing need to
organize them. In the research community the dominant approach to this problem
is based on machine learning techniques: a general inductive process
automatically builds a classifier by learning, from a set of preclassified
documents, the characteristics of the categories. The advantages of this
approach over the knowledge engineering approach (consisting in the manual
definition of a classifier by domain experts) are a very good effectiveness,
considerable savings in terms of expert manpower, and straightforward
portability to different domains. This survey discusses the main approaches to
text categorization that fall within the machine learning paradigm. We will
discuss in detail issues pertaining to three different problems, namely
document representation, classifier construction, and classifier evaluation.Comment: Accepted for publication on ACM Computing Survey
- …