Blind Multiclass Ensemble Classification
The rising interest in pattern recognition and data analytics has spurred the
development of innovative machine learning algorithms and tools. However, as
each algorithm has its strengths and limitations, one is motivated to
judiciously fuse multiple algorithms in order to find the "best" performing
one, for a given dataset. Ensemble learning aims at such a high-performance
meta-algorithm by combining the outputs from multiple algorithms. The present
work introduces a blind scheme for learning from ensembles of classifiers,
using a moment matching method that leverages joint tensor and matrix
factorization. Here, "blind" means that the combiner has no knowledge of the
ground-truth labels on which each classifier was trained. A rigorous
performance analysis is derived and the proposed scheme is evaluated on
synthetic and real datasets. Comment: To appear in IEEE Transactions in Signal Processin
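As a point of reference for what combining classifier outputs without ground-truth labels can look like, here is a minimal sketch of plain majority voting. This is a simple baseline only, not the moment-matching and tensor-factorization scheme described in the abstract; the function name `majority_vote` and the toy predictions are assumptions made for illustration.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine label predictions from several classifiers by majority vote.

    `predictions` is a list with one entry per classifier; each entry is a
    list of predicted labels, one per data point. No ground-truth labels are
    needed. Ties go to the label seen first among the tied ones.
    """
    combined = []
    # zip(*predictions) groups together the labels all classifiers gave
    # to the same data point.
    for labels in zip(*predictions):
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined


# Three hypothetical classifiers labelling four data points (classes 0..2).
toy_predictions = [
    [0, 1, 2, 1],
    [0, 2, 2, 1],
    [1, 1, 2, 0],
]
fused = majority_vote(toy_predictions)
```

Majority voting weights every classifier equally; a scheme that estimates each classifier's reliability, as the abstract describes, can in principle do better when the classifiers differ in quality.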
Speech-driven Animation with Meaningful Behaviors
Conversational agents (CAs) play an important role in human-computer
interaction. Creating believable movements for CAs is challenging, since the
movements have to be meaningful and natural, reflecting the coupling between
gestures and speech. Studies in the past have mainly relied on rule-based or
data-driven approaches. Rule-based methods focus on creating meaningful
behaviors conveying the underlying message, but the gestures cannot be easily
synchronized with speech. Data-driven approaches, especially speech-driven
models, can capture the relationship between speech and gestures. However, they
create behaviors disregarding the meaning of the message. This study proposes
to bridge the gap between these two approaches, overcoming their limitations.
The approach builds a dynamic Bayesian network (DBN), where a discrete variable
is added to condition the behaviors on the underlying constraint. The study
implements and evaluates the approach with two constraints: discourse functions
and prototypical behaviors. By constraining on the discourse functions (e.g.,
questions), the model learns the characteristic behaviors associated with a
given discourse class, learning the rules from the data. By constraining on
prototypical behaviors (e.g., head nods), the approach can be embedded in a
rule-based system as a behavior realizer, creating trajectories that are
synchronized in time with speech. The study proposes a DBN structure and a training
approach that (1) models the cause-effect relationship between the constraint
and the gestures, (2) initializes the state configuration models, increasing the
range of the generated behaviors, and (3) captures the differences in the
behaviors across constraints by enforcing sparse transitions between shared and
exclusive states per constraint. Objective and subjective evaluations
demonstrate the benefits of the proposed approach over an unconstrained model. Comment: 13 pages, 12 figures, 5 table
Globally Optimal Crowdsourcing Quality Management
We study crowdsourcing quality management: given worker responses to
a set of tasks, our goal is to jointly estimate the true answers for the tasks,
as well as the quality of the workers. Prior work on this problem relies
primarily on applying Expectation-Maximization (EM) on the underlying maximum
likelihood problem to estimate true answers as well as worker quality.
Unfortunately, EM only provides a locally optimal solution rather than a
globally optimal one. Other solutions to the problem (that do not leverage EM)
fail to provide global optimality guarantees as well. In this paper, we focus
on filtering, where tasks require the evaluation of a yes/no predicate, and
rating, where tasks elicit integer scores from a finite domain. We design
algorithms for finding the globally optimal estimates of correct task answers and
worker quality for the underlying maximum likelihood problem, and characterize
the complexity of these algorithms. Our algorithms conceptually consider all
mappings from tasks to true answers (typically a very large number), leveraging
two key ideas to reduce, by several orders of magnitude, the number of mappings
under consideration, while preserving optimality. We also demonstrate that
these algorithms often find more accurate estimates than EM-based algorithms.
This paper makes an important contribution towards understanding the inherent
complexity of globally optimal crowdsourcing quality management.
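To make the idea of conceptually considering all mappings from tasks to true answers concrete, the following is a minimal brute-force sketch for yes/no (filtering) tasks. It is not the paper's algorithm and includes none of its pruning ideas. It assumes a simplified model in which each worker has a single accuracy parameter constrained to be at least 0.5; the function names and toy responses are hypothetical.

```python
from itertools import product
from math import log


def log_likelihood(responses, truth):
    """Log-likelihood of worker responses given a candidate truth mapping.

    `responses` holds one dict per worker, mapping task index -> answer (0/1).
    Assumed model: each worker answers correctly with probability p >= 0.5.
    For a fixed mapping, the maximizing p is the worker's agreement rate,
    clipped from below at 0.5.
    """
    total = 0.0
    for answers in responses:
        n = len(answers)
        if n == 0:
            continue
        agree = sum(1 for task, ans in answers.items() if truth[task] == ans)
        disagree = n - agree
        p = max(agree / n, 0.5)
        if agree:
            total += agree * log(p)
        if disagree:
            # p < 1 whenever the worker disagrees at least once.
            total += disagree * log(1.0 - p)
    return total


def brute_force_mle(responses, num_tasks):
    """Enumerate all 2**num_tasks mappings and keep the most likely one."""
    best_truth, best_ll = None, float("-inf")
    for truth in product((0, 1), repeat=num_tasks):
        ll = log_likelihood(responses, truth)
        if ll > best_ll:
            best_truth, best_ll = truth, ll
    return best_truth, best_ll


# Three hypothetical workers answering three yes/no tasks.
toy_responses = [
    {0: 1, 1: 0, 2: 1},
    {0: 1, 1: 0, 2: 0},
    {0: 1, 1: 1, 2: 1},
]
best_truth, best_ll = brute_force_mle(toy_responses, num_tasks=3)
```

The enumeration grows as 2^n in the number of tasks, which is why reducing the number of mappings under consideration, as the abstract describes, matters in practice. The constraint p >= 0.5 is there so that a mapping and its complement do not tie in likelihood.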
Preference Learning
This report documents the program and the outcomes of Dagstuhl Seminar 14101 “Preference Learning”. Preferences have recently received considerable attention in disciplines such as machine learning, knowledge discovery, information retrieval, statistics, social choice theory, multiple criteria decision making, decision under risk and uncertainty, operations research, and others. The motivation for this seminar was to showcase recent progress in these different areas with the goal of working towards a common basis of understanding, which should help to facilitate future synergies.
Inter-annotator agreement using the Conversation Analysis Modelling Schema, for dialogue
We present the Conversation Analysis Modeling Schema (CAMS), a novel dialogue labeling schema that combines the Conversation Analysis concept of Adjacency Pairs with Dialogue Acts. The aim is to capture both the semantic and syntactic structure of dialogue, in a format that is independent of the domain or topic, and which facilitates the computational modeling of dialogue. A labeling task undertaken by novice annotators is used to evaluate its efficacy on a selection of task-oriented and non-task-oriented dialogs, and to measure inter-annotator agreement. To deepen the “human-factors” analysis we also record and examine users’ self-reported confidence scores and average utterance annotation times. Inter-annotator agreement is shown to be higher for task-oriented dialogs than for non-task-oriented ones, though the structure of the dialogue itself has a more significant impact. We further examine the assumptions around expected agreement for two weighted agreement coefficients, Alpha and Beta, and show that, although annotators assign labels using similar probability distributions, small variations can result in large differences in agreement values between biased and unbiased measures.
- …