Learning Representations of Emotional Speech with Deep Convolutional Generative Adversarial Networks
Automatically assessing emotional valence in human speech has historically
been a difficult task for machine learning algorithms. The subtle changes in
the voice of the speaker that are indicative of positive or negative emotional
states are often "overshadowed" by voice characteristics relating to emotional
intensity or emotional activation. In this work we explore a representation
learning approach that automatically derives discriminative representations of
emotional speech. In particular, we investigate two machine learning strategies
to improve classifier performance: (1) utilization of unlabeled data using a
deep convolutional generative adversarial network (DCGAN), and (2) multitask
learning. In our extensive experiments we leverage a multitask-annotated
emotional corpus as well as a large unlabeled meeting corpus (around 100
hours). Our speaker-independent classification experiments show that the use
of unlabeled data in particular improves classifier performance, with both
fully supervised baseline approaches outperformed considerably. We improve the
classification of emotional valence on a discrete 5-point scale to 43.88% and
on a 3-point scale to 49.80%, which is competitive with state-of-the-art
performance.
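Concretely, the two strategies can be combined by reusing the DCGAN discriminator's convolutional stack, pretrained adversarially on the unlabeled meeting audio, as a shared encoder feeding one classification head per annotation task. A minimal PyTorch sketch of this idea; the layer sizes, the 64x64 spectrogram input, and the valence/activation head pair are illustrative assumptions, not the authors' exact architecture:

```python
import torch
import torch.nn as nn

class SharedEncoder(nn.Module):
    """DCGAN-style convolutional stack, pretrainable as a GAN
    discriminator on unlabeled audio spectrograms."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, 4, stride=2, padding=1),   # 64x64 -> 32x32
            nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, 4, stride=2, padding=1),  # 32x32 -> 16x16
            nn.BatchNorm2d(64),
            nn.LeakyReLU(0.2),
        )

    def forward(self, x):
        return self.conv(x).flatten(1)

class MultiTaskValenceModel(nn.Module):
    """One shared encoder, one linear head per annotation task."""
    def __init__(self, feat_dim=64 * 16 * 16):
        super().__init__()
        self.encoder = SharedEncoder()
        self.valence_head = nn.Linear(feat_dim, 5)     # 5-point valence scale
        self.activation_head = nn.Linear(feat_dim, 5)  # auxiliary task

    def forward(self, x):
        h = self.encoder(x)
        return self.valence_head(h), self.activation_head(h)

model = MultiTaskValenceModel()
spec = torch.randn(8, 1, 64, 64)  # a batch of log-spectrogram patches
valence_logits, activation_logits = model(spec)
```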
Reading Wikipedia to Answer Open-Domain Questions
This paper proposes to tackle open-domain question answering using Wikipedia
as the unique knowledge source: the answer to any factoid question is a text
span in a Wikipedia article. This task of machine reading at scale combines the
challenges of document retrieval (finding the relevant articles) with that of
machine comprehension of text (identifying the answer spans from those
articles). Our approach combines a search component based on bigram hashing and
TF-IDF matching with a multi-layer recurrent neural network model trained to
detect answers in Wikipedia paragraphs. Our experiments on multiple existing QA
datasets indicate that (1) both modules are highly competitive with respect to
existing counterparts and (2) multitask learning using distant supervision on
their combination is an effective complete system on this challenging task.
Comment: ACL 2017, 10 pages.
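The retrieval half of the system can be approximated in a few lines: hash unigram and bigram counts into a fixed-width sparse vector, weight them with TF-IDF, and rank articles by inner product with the question vector. A scikit-learn sketch; the hash width and the toy documents are illustrative choices rather than the paper's configuration:

```python
from sklearn.feature_extraction.text import HashingVectorizer, TfidfTransformer
from sklearn.metrics.pairwise import linear_kernel

docs = [
    "Paris is the capital and most populous city of France.",
    "The mitochondrion is the powerhouse of the cell.",
]

# Unigrams + bigrams hashed into a fixed-width sparse vector
# (hashed n-grams stand in for an explicit vocabulary).
hasher = HashingVectorizer(ngram_range=(1, 2), n_features=2**20,
                           norm=None, alternate_sign=False)
tfidf = TfidfTransformer()
doc_vecs = tfidf.fit_transform(hasher.transform(docs))

query = "What is the capital of France?"
q_vec = tfidf.transform(hasher.transform([query]))
scores = linear_kernel(q_vec, doc_vecs).ravel()
print(docs[scores.argmax()])  # top-ranked article, passed on to the reader
```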
A hybrid representation based simile component extraction
A simile, a special type of metaphor, can help people express their ideas more clearly. Simile component extraction is the task of extracting tenors and vehicles from sentences. The task has practical significance, since it is useful for building cognitive knowledge bases. With the development of deep neural networks, researchers have begun to apply neural models to component extraction. Simile components should come from different domains, and according to our observations, words from different domains usually carry different concepts. Thus, concepts are important when identifying whether two words form a simile component pair. However, existing models do not integrate concepts, and it is difficult for them to identify the concept of a word. Moreover, corpora for simile component extraction are limited, so there are many rare or unseen words whose representations are often inadequate; existing models can hardly extract simile components accurately when sentences contain such low-frequency words. To solve these problems, we propose a hybrid representation-based component extraction (HRCE) model. Each word in HRCE is represented at three levels: the word level, the concept level and the character level. Concept representations (representations at the concept level) help HRCE identify cross-domain words more accurately. Moreover, with the help of character representations (representations at the character level), HRCE can represent the meaning of a word more adequately, since words consist of characters and these characters partly convey the meaning of the words. We conduct experiments comparing HRCE with existing models; the results show that HRCE significantly outperforms them.
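The hybrid representation itself is essentially a concatenation: each token's vector combines a word embedding, an embedding of its concept, and a character-level encoding that stays informative for rare or unseen words. A minimal PyTorch sketch; the vocabulary sizes, embedding dimensions, and the character BiLSTM are assumptions about how such a model could be assembled, not the published architecture:

```python
import torch
import torch.nn as nn

class HybridWordRepresentation(nn.Module):
    """Word-level + concept-level + character-level vectors, concatenated."""
    def __init__(self, n_words=10000, n_concepts=500, n_chars=100,
                 word_dim=100, concept_dim=50, char_dim=25, char_hidden=25):
        super().__init__()
        self.word_emb = nn.Embedding(n_words, word_dim)
        self.concept_emb = nn.Embedding(n_concepts, concept_dim)
        self.char_emb = nn.Embedding(n_chars, char_dim)
        # BiLSTM over characters; its final hidden states summarize the word.
        self.char_lstm = nn.LSTM(char_dim, char_hidden,
                                 batch_first=True, bidirectional=True)

    def forward(self, word_id, concept_id, char_ids):
        # char_ids: (batch, max_word_len)
        _, (h, _) = self.char_lstm(self.char_emb(char_ids))
        char_vec = torch.cat([h[0], h[1]], dim=-1)  # (batch, 2*char_hidden)
        return torch.cat([self.word_emb(word_id),
                          self.concept_emb(concept_id),
                          char_vec], dim=-1)        # (batch, 200)

rep = HybridWordRepresentation()
vec = rep(torch.tensor([3]), torch.tensor([7]),
          torch.randint(0, 100, (1, 12)))  # one word of 12 characters
```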
Transfer and Multi-Task Learning for Noun-Noun Compound Interpretation
In this paper, we empirically evaluate the utility of transfer and multi-task
learning on a challenging semantic classification task: semantic interpretation
of noun-noun compounds. Through a comprehensive series of experiments and
in-depth error analysis, we show that transfer learning via parameter
initialization and multi-task learning via parameter sharing can help a neural
classification model generalize over a highly skewed distribution of relations.
Further, we demonstrate how dual annotation with two distinct sets of relations
over the same set of compounds can be exploited to improve the overall accuracy
of a neural classifier and its F1 scores on the less frequent, but more
difficult relations.
Comment: EMNLP 2018: Conference on Empirical Methods in Natural Language Processing (EMNLP 2018).
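The two strategies differ only in how parameters move between tasks: transfer learning copies trained weights to initialize a second classifier, while multi-task learning ties the hidden layers and trains both relation inventories jointly. A schematic PyTorch sketch; the layer sizes and relation-set sizes are placeholders, not the paper's setup:

```python
import torch.nn as nn

def make_encoder():
    # maps a (hypothetical) 600-d compound representation to a hidden layer
    return nn.Sequential(nn.Linear(600, 300), nn.ReLU())

# Transfer via parameter initialization: train on relation inventory A,
# then copy the encoder weights to warm-start a model for inventory B.
encoder_a = make_encoder()
# ... train encoder_a with a softmax head over inventory A ...
encoder_b = make_encoder()
encoder_b.load_state_dict(encoder_a.state_dict())  # warm start, then fine-tune

# Multi-task via parameter sharing: one encoder, two softmax heads trained
# jointly on compounds that carry both annotations.
shared = make_encoder()
head_a = nn.Linear(300, 37)  # inventory A size (illustrative)
head_b = nn.Linear(300, 12)  # inventory B size (illustrative)
```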
Multi-Task Learning of Keyphrase Boundary Classification
Keyphrase boundary classification (KBC) is the task of detecting keyphrases
in scientific articles and labelling them with respect to predefined types.
Although important in practice, this task is so far underexplored, partly due
to the lack of labelled data. To overcome this, we explore several auxiliary
tasks, including semantic super-sense tagging and identification of multi-word
expressions, and cast the task as a multi-task learning problem with deep
recurrent neural networks. Our multi-task models perform significantly better
than previous state-of-the-art approaches on two scientific KBC datasets,
particularly for long keyphrases.
Comment: ACL 2017.
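Cast as multi-task sequence labelling, this amounts to a shared recurrent encoder with one tagging layer per task, where each batch updates the shared weights through whichever head matches its dataset. A PyTorch sketch under assumed tag-set sizes and dimensions; it is not the paper's exact model:

```python
import torch
import torch.nn as nn

class MultiTaskTagger(nn.Module):
    """Shared BiLSTM encoder; one softmax tagging layer per task."""
    def __init__(self, vocab=20000, emb=100, hidden=100, n_tags=None):
        super().__init__()
        # tag-set sizes are illustrative: keyphrase types, super-senses, MWEs
        n_tags = n_tags or {"keyphrase": 7, "supersense": 83, "mwe": 3}
        self.emb = nn.Embedding(vocab, emb)
        self.lstm = nn.LSTM(emb, hidden, batch_first=True, bidirectional=True)
        self.heads = nn.ModuleDict(
            {task: nn.Linear(2 * hidden, n) for task, n in n_tags.items()})

    def forward(self, token_ids, task):
        h, _ = self.lstm(self.emb(token_ids))
        return self.heads[task](h)  # (batch, seq_len, tag-set size)

model = MultiTaskTagger()
# each training batch comes from one task's dataset and uses that task's head
logits = model(torch.randint(0, 20000, (4, 30)), task="keyphrase")
```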
Kernels for Vector-Valued Functions: a Review
Kernel methods are among the most popular techniques in machine learning.
From a frequentist/discriminative perspective they play a central role in
regularization theory as they provide a natural choice for the hypotheses space
and the regularization functional through the notion of reproducing kernel
Hilbert spaces. From a Bayesian/generative perspective they are the key in the
context of Gaussian processes, where the kernel function is also known as the
covariance function. Traditionally, kernel methods have been used in supervised
learning problems with scalar outputs, and indeed there has been a considerable
amount of work devoted to designing and learning kernels. More recently there
has been an increasing interest in methods that deal with multiple outputs,
motivated partly by frameworks like multitask learning. In this paper, we
review different methods to design or learn valid kernel functions for multiple
outputs, paying particular attention to the connection between probabilistic
and functional methods.
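A canonical construction reviewed in this line of work is the separable (intrinsic coregionalization) kernel, which combines a scalar kernel k on the inputs with a positive semidefinite matrix B encoding output correlations: K((x, i), (x', j)) = k(x, x') B_ij. A NumPy sketch of the resulting multi-output Gram matrix; the RBF base kernel and the 2-output coregionalization matrix are chosen purely for illustration:

```python
import numpy as np

def rbf(X, Y, lengthscale=1.0):
    """Scalar base kernel k(x, x') on the inputs."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def separable_gram(X, B, lengthscale=1.0):
    """Multi-output Gram matrix as the Kronecker product B (x) k(X, X)."""
    return np.kron(B, rbf(X, X, lengthscale))

X = np.random.randn(5, 3)               # 5 inputs, 3 features
B = np.array([[1.0, 0.8],
              [0.8, 1.0]])              # coregionalization matrix (PSD)
K = separable_gram(X, B)                # shape (10, 10): 2 outputs x 5 inputs
assert np.all(np.linalg.eigvalsh(K) > -1e-9)  # a valid (PSD) covariance
```

The Kronecker structure makes validity easy to check: the Kronecker product of two positive semidefinite matrices is again positive semidefinite, so K is a legitimate covariance function for a two-output Gaussian process.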
Leveraging Low-Rank Relations Between Surrogate Tasks in Structured Prediction
We study the interplay between surrogate methods for structured prediction
and techniques from multitask learning designed to leverage relationships
between surrogate outputs. We propose an efficient algorithm based on trace
norm regularization which, differently from previous methods, does not require
explicit knowledge of the coding/decoding functions of the surrogate framework.
As a result, our algorithm can be applied to the broad class of problems in
which the surrogate space is large or even infinite dimensional. We study
excess risk bounds for trace norm regularized structured prediction, implying
the consistency and learning rates for our estimator. We also identify relevant
regimes in which our approach can enjoy better generalization performance than
previous methods. Numerical experiments on ranking problems indicate that
enforcing low-rank relations among surrogate outputs may indeed provide a
significant advantage in practice.
Comment: 42 pages, 1 table.
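Trace (nuclear) norm regularization over the matrix of surrogate predictors is typically optimized by proximal gradient descent, where the proximal step soft-thresholds singular values, so low-rank relations between surrogate outputs emerge as small singular values are shrunk to zero. A NumPy sketch for a least-squares surrogate loss; the step size, regularization weight, and random data are illustrative, and this is the generic estimator rather than the paper's exact algorithm:

```python
import numpy as np

def svt(W, tau):
    """Proximal operator of tau * nuclear norm: soft-threshold singular values."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def trace_norm_regression(X, Y, lam=0.1, step=None, iters=500):
    """min_W  0.5/n * ||XW - Y||_F^2 + lam * ||W||_*  via proximal gradient."""
    n, d = X.shape
    _, T = Y.shape                              # T surrogate outputs
    if step is None:
        step = n / np.linalg.norm(X, 2) ** 2    # 1/L for the smooth part
    W = np.zeros((d, T))
    for _ in range(iters):
        grad = X.T @ (X @ W - Y) / n
        W = svt(W - step * grad, step * lam)    # gradient step, then shrink
    return W

X = np.random.randn(100, 20)
Y = np.random.randn(100, 8)                     # 8 correlated surrogate outputs
W = trace_norm_regression(X, Y)
print(np.linalg.matrix_rank(W, tol=1e-6))       # typically low-rank
```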