Predictive User Modeling with Actionable Attributes
Different machine learning techniques have been proposed and used for
modeling individual and group user needs, interests and preferences. In
traditional predictive modeling, instances are described by observable
variables, called attributes. The goal is to learn a model for predicting the
target variable for unseen instances. For example, for marketing purposes a
company may consider profiling a new user based on her observed web browsing
behavior, referral keywords or other relevant information. In many real world
applications the values of some attributes are not only observable, but can be
actively decided by a decision maker. Furthermore, in some such applications
the decision maker is interested not only in generating accurate predictions,
but also in maximizing the probability of the desired outcome. For example, a
direct marketing manager can choose which type of special offer to send to a client
(actionable attribute), hoping that the right choice will result in a positive
response with a higher probability. We study how to learn to choose the value
of an actionable attribute in order to maximize the probability of a desired
outcome in predictive modeling. We emphasize that not all instances are equally
sensitive to changes in actions. Accurate choice of an action is critical for
those instances that are on the borderline (e.g., users who do not have a
strong opinion one way or the other). We formulate three supervised learning
approaches for learning to select the value of an actionable attribute at an
instance level. We also introduce a focused training procedure which puts more
emphasis on the situations where varying the action is most likely to have an
effect. A proof-of-concept experimental validation on two real-world case
studies in web analytics and e-learning domains highlights the potential of the
proposed approaches.
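
The idea of choosing an actionable attribute at the instance level can be sketched as follows. This is a minimal illustration, not one of the three approaches from the abstract: it assumes a hand-written logistic model whose weights (`WEIGHTS`), feature name (`visits`), and action names (`discount`, `free_shipping`) are all hypothetical. Each candidate action is scored by its predicted probability of a positive response, and the spread of those probabilities serves as a rough measure of how sensitive an instance is to the choice.

```python
import math


def predict_positive(weights, features, action):
    """Hypothetical logistic model: P(positive response | features, action)."""
    x = dict(features)
    x["action=" + action] = 1.0
    # Interaction terms let the effect of an action depend on the instance.
    for name, value in features.items():
        x[name + "*action=" + action] = value
    score = sum(weights.get(key, 0.0) * value for key, value in x.items())
    return 1.0 / (1.0 + math.exp(-score))


def choose_action(weights, features, actions):
    """Return the action with the highest predicted probability, plus all scores."""
    probs = {a: predict_positive(weights, features, a) for a in actions}
    best = max(probs, key=probs.get)
    return best, probs


def sensitivity(probs):
    """Spread of predicted probabilities across actions for one instance."""
    return max(probs.values()) - min(probs.values())


# Illustrative weights only; in practice they would be learned from data.
WEIGHTS = {
    "action=discount": 0.2,
    "action=free_shipping": 0.1,
    "visits*action=discount": -0.5,
    "visits*action=free_shipping": 0.4,
}
ACTIONS = ["discount", "free_shipping"]

new_user = {"visits": 0.0}
frequent_user = {"visits": 2.0}

for user in (new_user, frequent_user):
    best, probs = choose_action(WEIGHTS, user, ACTIONS)
    print(user, best, round(sensitivity(probs), 3))
```

Under these made-up weights the preferred action differs between the two users, and the choice matters far more for one of them than for the other, which is the kind of instance-level difference the abstract emphasizes.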
Generalized Zero-Shot Learning via Synthesized Examples
We present a generative framework for generalized zero-shot learning where
the training and test classes are not necessarily disjoint. Built upon a
variational autoencoder based architecture, consisting of a probabilistic
encoder and a probabilistic conditional decoder, our model can generate novel
exemplars from seen/unseen classes, given their respective class attributes.
These exemplars can subsequently be used to train any off-the-shelf
classification model. One of the key aspects of our encoder-decoder
architecture is a feedback-driven mechanism in which a discriminator (a
multivariate regressor) learns to map the generated exemplars to the
corresponding class attribute vectors, leading to an improved generator. Our
model's ability to generate and leverage examples from unseen classes to train
the classification model naturally helps to mitigate the bias towards
predicting seen classes in generalized zero-shot learning settings. Through a
comprehensive set of experiments, we show that our model outperforms several
state-of-the-art methods, on several benchmark datasets, for both standard
and generalized zero-shot learning. Comment: Accepted in CVPR'1
An empirical comparison of supervised machine learning techniques in bioinformatics
Research in bioinformatics is driven by experimental data.
Current biological databases are populated by vast amounts of
experimental data. Machine learning has been widely applied to
bioinformatics and has achieved considerable success in this research
area. At present, with various learning algorithms available in the
literature, researchers face difficulties in choosing the best
method to apply to their data. We performed an empirical
study on 7 individual learning systems and 9 different combined
methods on 4 different biological data sets, and suggest some
issues to consider when answering the following
questions: (i) How does one choose which algorithm is best
suited to their data set? (ii) Are combined methods better than
a single approach? (iii) How does one compare the effectiveness
of a particular algorithm to the others?
Semi-Adversarial Networks: Convolutional Autoencoders for Imparting Privacy to Face Images
In this paper, we design and evaluate a convolutional autoencoder that
perturbs an input face image to impart privacy to a subject. Specifically, the
proposed autoencoder transforms an input face image such that the transformed
image can be successfully used for face recognition but not for gender
classification. In order to train this autoencoder, we propose a novel training
scheme, referred to as semi-adversarial training in this work. The training is
facilitated by attaching a semi-adversarial module consisting of a pseudo
gender classifier and a pseudo face matcher to the autoencoder. The objective
function utilized for training this network has three terms: one to ensure that
the perturbed image is a realistic face image; another to ensure that the
gender attributes of the face are confounded; and a third to ensure that
biometric recognition performance due to the perturbed image is not impacted.
Extensive experiments confirm the efficacy of the proposed architecture in
extending gender privacy to face images.
InfoScrub: Towards Attribute Privacy by Targeted Obfuscation
Personal photos of individuals, when shared online, apart from exhibiting a
myriad of memorable details, also reveal a wide range of private information
and potentially entail privacy risks (e.g., online harassment, tracking). To
mitigate such risks, it is crucial to study techniques that allow individuals
to limit the private information leaked in visual data. We tackle this problem
in a novel image obfuscation framework: to maximize entropy on inferences over
targeted privacy attributes, while retaining image fidelity. We approach the
problem based on an encoder-decoder style architecture, with two key novelties:
(a) introducing a discriminator to perform bi-directional translation
simultaneously from multiple unpaired domains; (b) predicting an image
interpolation which maximizes uncertainty over a target set of attributes. We
find our approach generates obfuscated images faithful to the original input
images, and additionally increases uncertainty by 6.2 (or up to 0.85
bits) over the non-obfuscated counterparts. Comment: 20 pages, 7 figures
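
Since the abstract reports uncertainty in bits, a small sketch of Shannon entropy may help in reading that figure. It assumes, without confirmation from the abstract, that uncertainty is measured as the entropy of a classifier's predicted distribution over an attribute; the function name `entropy_bits` and the example distributions are illustrative.

```python
import math


def entropy_bits(probs):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


# A confident prediction over a binary attribute carries little uncertainty.
confident = [0.9, 0.1]
# A maximally uncertain prediction over a binary attribute carries 1 bit.
uniform = [0.5, 0.5]

print(entropy_bits(confident))
print(entropy_bits(uniform))
```

For a binary attribute the entropy can be at most 1 bit, reached when both values are predicted with equal probability, so an obfuscation that pushes predictions toward the uniform distribution raises this quantity.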
Diacritic Restoration and the Development of a Part-of-Speech Tagset for the Māori Language
This thesis investigates two fundamental problems in natural language processing: diacritic restoration and part-of-speech tagging. Over the past three decades, statistical approaches to diacritic restoration and part-of-speech tagging have attracted growing interest as a consequence of the increasing availability of manually annotated training data in major languages such as English and French. However, these approaches are not practical for most minority languages, where appropriate training data is either non-existent or not publicly available. Furthermore, before developing a part-of-speech tagging system, a suitable tagset is required for that language. In this thesis, we make the following contributions to bridge this gap:
Firstly, we propose a method for diacritic restoration based on naive Bayes classifiers that act at word-level. Classifications are based on a rich set of features, extracted automatically from training data in the form of diacritically marked text. This method requires no additional resources, which makes it language independent. The algorithm was evaluated on one language, namely Māori, and an accuracy exceeding 99% was observed.
Secondly, we present our work on creating one of the necessary resources for the development of a part-of-speech tagging system in Māori, that of a suitable tagset. The tagset described was developed in accordance with the EAGLES guidelines for morphosyntactic annotation of corpora, and was the result of in-depth analysis of Māori grammar.
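
Word-level diacritic restoration with a naive Bayes classifier, as described in the first contribution, can be sketched roughly as below. The thesis's actual feature set is not given in the abstract, so the features here (the diacritic-stripped previous and next words) are an assumption, as are the class name `DiacriticRestorer`, the add-one smoothing, and the two-phrase toy training set.

```python
import math
import unicodedata
from collections import Counter, defaultdict


def strip_diacritics(word):
    """Remove combining marks, so a macronized vowel becomes a plain vowel."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def context_features(words, i):
    """Assumed feature set: the stripped previous and next words."""
    prev_w = strip_diacritics(words[i - 1]) if i > 0 else "<s>"
    next_w = strip_diacritics(words[i + 1]) if i + 1 < len(words) else "</s>"
    return ["prev=" + prev_w, "next=" + next_w]


class DiacriticRestorer:
    def __init__(self):
        self.class_counts = defaultdict(Counter)    # stripped form -> marked form -> count
        self.feature_counts = defaultdict(Counter)  # marked form -> feature -> count
        self.vocabulary = set()                     # all features seen in training

    def train(self, sentences):
        """Learn counts from diacritically marked text; no other resource is used."""
        for sentence in sentences:
            words = sentence.split()
            for i, word in enumerate(words):
                self.class_counts[strip_diacritics(word)][word] += 1
                for feature in context_features(words, i):
                    self.feature_counts[word][feature] += 1
                    self.vocabulary.add(feature)

    def restore_word(self, words, i):
        candidates = self.class_counts.get(strip_diacritics(words[i]))
        if not candidates:
            return words[i]  # unseen word: leave it unchanged
        total = sum(candidates.values())
        vocab_size = len(self.vocabulary)
        best, best_score = None, -math.inf
        for form, count in candidates.items():
            score = math.log(count / total)  # log prior of this marked form
            n = sum(self.feature_counts[form].values())
            for feature in context_features(words, i):
                # Add-one smoothed log likelihood of the feature given the form.
                score += math.log((self.feature_counts[form][feature] + 1) / (n + vocab_size))
            if score > best_score:
                best, best_score = form, score
        return best

    def restore(self, text):
        words = text.split()
        return " ".join(self.restore_word(words, i) for i in range(len(words)))


# Toy training data for illustration only.
restorer = DiacriticRestorer()
restorer.train(["te wahine", "ngā wāhine"])
print(restorer.restore("nga wahine"))
```

Because the candidate forms and features are read directly from marked training text, nothing in the sketch is specific to one language, which mirrors the language-independence claim in the abstract.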