335 research outputs found
Distributional Inclusion Vector Embedding for Unsupervised Hypernymy Detection
Modeling hypernymy, such as poodle is-a dog, is an important generalization
aid to many NLP tasks, such as entailment, coreference, relation extraction,
and question answering. Supervised learning from labeled hypernym sources, such
as WordNet, limits the coverage of these models, which can be addressed by
learning hypernyms from unlabeled text. Existing unsupervised methods either do
not scale to large vocabularies or yield unacceptably poor accuracy. This paper
introduces distributional inclusion vector embedding (DIVE), a
simple-to-implement unsupervised method of hypernym discovery via per-word
non-negative vector embeddings which preserve the inclusion property of word
contexts in a low-dimensional and interpretable space. In experimental
evaluations more comprehensive than any previous literature of which we are
aware (evaluating on 11 datasets using multiple existing as well as newly
proposed scoring functions), we find that our method provides up to double the
precision of previous unsupervised embeddings, and the highest average
performance, using a much more compact word representation, and yielding many
new state-of-the-art results. Comment: NAACL 201
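The inclusion property mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a generic inclusion ratio over non-negative vectors: the share of the narrower word's context mass that is also covered by the broader word. The function name `inclusion_score` and the toy vectors are hypothetical, and this is not presented as one of the scoring functions evaluated in the paper.

```python
def inclusion_score(hypo, hyper):
    """Share of the candidate hyponym's mass that the candidate hypernym covers.

    Both arguments are non-negative vectors of equal length (illustrative
    stand-ins for context-based word embeddings).
    """
    if len(hypo) != len(hyper):
        raise ValueError("vectors must have the same length")
    if any(x < 0 for x in hypo) or any(x < 0 for x in hyper):
        raise ValueError("vectors must be non-negative")
    total = sum(hypo)
    if total == 0:
        return 0.0
    # Element-wise minimum measures the overlap in each dimension.
    overlap = sum(min(a, b) for a, b in zip(hypo, hyper))
    return overlap / total


# Toy vectors (assumed values): "dog" dominates "poodle" in every dimension.
poodle = [2.0, 1.0, 0.0]
dog = [3.0, 2.0, 1.0]

print(inclusion_score(poodle, dog))  # poodle is fully included in dog
print(inclusion_score(dog, poodle))  # the reverse direction scores lower
```

The asymmetry between the two directions is what makes such a score usable for hypernymy, which is a directed relation.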
Mathematical Foundations for a Compositional Distributional Model of Meaning
We propose a mathematical framework for a unification of the distributional
theory of meaning in terms of vector space models, and a compositional theory
for grammatical types, for which we rely on the algebra of Pregroups,
introduced by Lambek. This mathematical framework enables us to compute the
meaning of a well-typed sentence from the meanings of its constituents.
Concretely, the type reductions of Pregroups are 'lifted' to morphisms in a
category, a procedure that transforms meanings of constituents into a meaning
of the (well-typed) whole. Importantly, meanings of whole sentences live in a
single space, independent of the grammatical structure of the sentence. Hence
the inner-product can be used to compare meanings of arbitrary sentences, as it
is for comparing the meanings of words in the distributional model. The
mathematical structure we employ admits a purely diagrammatic calculus which
exposes how the information flows between the words in a sentence in order to
make up the meaning of the whole sentence. A variation of our 'categorical
model' which involves constraining the scalars of the vector spaces to the
semiring of Booleans results in a Montague-style Boolean-valued semantics. Comment: to appea
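As a rough illustration of how a sentence meaning can be computed from its constituents in this kind of framework, the sketch below treats a transitive verb as an order-3 tensor indexed as `verb[i][s][j]` (subject dimension, sentence dimension, object dimension) and contracts it with the subject and object vectors. The function names, the index layout, and all numeric values are assumptions made for the example; they are not taken from the paper.

```python
def transitive_sentence(subj, verb, obj):
    """Contract subject and object vectors with a verb tensor.

    verb[i][s][j] is indexed by subject dimension i, sentence dimension s,
    and object dimension j. The result is a vector in the sentence space.
    """
    n_sentence = len(verb[0])
    return [
        sum(
            subj[i] * verb[i][s][j] * obj[j]
            for i in range(len(subj))
            for j in range(len(obj))
        )
        for s in range(n_sentence)
    ]


def inner(u, v):
    """Inner product, used to compare two sentence vectors."""
    return sum(a * b for a, b in zip(u, v))


# Toy 2-dimensional noun space and 2-dimensional sentence space (assumed values).
verb = [
    [[1, 0], [0, 2]],
    [[0, 3], [4, 0]],
]
sentence_a = transitive_sentence([1, 0], verb, [0, 1])
sentence_b = transitive_sentence([1, 1], verb, [1, 1])
print(sentence_a, sentence_b, inner(sentence_a, sentence_b))
```

Because both results live in the same sentence space, they can be compared with the inner product regardless of which words produced them.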
Energy-based Self-attentive Learning of Abstractive Communities for Spoken Language Understanding
Abstractive community detection is an important spoken language understanding
task, whose goal is to group utterances in a conversation according to whether
they can be jointly summarized by a common abstractive sentence. This paper
provides a novel approach to this task. We first introduce a neural contextual
utterance encoder featuring three types of self-attention mechanisms. We then
train it using the siamese and triplet energy-based meta-architectures.
Experiments on the AMI corpus show that our system outperforms multiple
energy-based and non-energy based baselines from the state-of-the-art. Code and
data are publicly available. Comment: Update baseline
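The triplet setup named in the abstract above can be sketched generically as a margin-based loss over an energy function. The sketch assumes Euclidean distance between utterance embeddings as the energy and a margin of 1.0. Both are illustrative choices, and the names `energy` and `triplet_loss` are hypothetical rather than taken from the authors' code.

```python
import math


def energy(u, v):
    """Energy between two embeddings; Euclidean distance is assumed here."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge loss: zero once the positive is closer than the negative by the margin."""
    return max(0.0, margin + energy(anchor, positive) - energy(anchor, negative))


# Toy embeddings (assumed values): utterances in the same community
# should end up closer to the anchor than utterances from other communities.
anchor = [0.0, 0.0]
same_community = [0.0, 1.0]
other_community = [3.0, 4.0]

print(triplet_loss(anchor, same_community, other_community))  # satisfied triplet
print(triplet_loss(anchor, other_community, same_community))  # violated triplet
```

A siamese variant would use pairs instead of triplets, penalising high energy for same-community pairs and low energy for different-community pairs.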
A survey on perceived speaker traits: personality, likability, pathology, and the first challenge
The INTERSPEECH 2012 Speaker Trait Challenge aimed at a unified test-bed for perceived speaker traits – the first challenge of this kind: personality in the five OCEAN personality dimensions, likability of speakers, and intelligibility of pathologic speakers. In the present article, we give a brief overview of the state of the art in these three fields of research and describe the three sub-challenges in terms of the challenge conditions, the baseline results provided by the organisers, and a new openSMILE feature set, which was used for computing the baselines and was provided to the participants. Furthermore, we summarise the approaches and the results presented by the participants to show the various techniques that are currently applied to solve these classification tasks.
Merging Business Process Models (Äriprotsessimudelite ühildamine)
The electronic version of the dissertation does not include the publications. Companies with years of experience in business process management often maintain process repositories that may contain hundreds or even thousands of business process models. These models come from different sources and have been created and modified by different parties with different modeling skills and practices. One common practice is to create new models from existing ones by copying fragments and then modifying them. This in turn leads to a situation where the process model repository contains models with identical fragments that refer to the same subprocess. If such fragments are left unconsolidated, inconsistencies can arise in the repository: one and the same subprocess may be described differently in different processes. Companies often have models that serve similar purposes but are intended for different customers, products, business units, or geographic regions. For example, the business processes for home insurance and car insurance have the same business goal. Naturally, the models of these processes contain several identical sub-fragments (such as checking policy data), yet the processes differ at several points. Managing these processes separately is inefficient and creates redundancy.
In the thesis we sought an answer to the question: how can recurring model fragments be identified in a process model repository, and more generally, how can similarities be found and consolidated in large repositories of business process models?
The thesis introduces two complementary methods for consolidating business process models: merging process models into a single model, and extracting model fragments. The first takes two or more process models as input and constructs from them a single consolidated process model that contains the behavior of all input models. This approach lets analysts manage a whole family of similar models at once and change them in a synchronized way. The second approach, subprocess extraction, involves identifying frequently occurring fragments (finding clones in process models) and encapsulating them as subprocesses.
Lexical measurements for information retrieval: a quantum approach
The problem of determining whether a document is about a loosely defined topic is at the core of text Information Retrieval (IR). An automatic IR system should be able to determine if a document is likely to convey information on a topic. In most cases, it has to do it solely based on measurements of the use of terms in the document (lexical measurements). In this work a novel scheme for measuring and representing lexical information from text documents is proposed. This scheme is inspired by the concept of ideal measurement as is described by Quantum Theory (QT). We apply it to Information Retrieval through formal analogies between text processing and physical measurements. The main contribution of this work is the development of a complete mathematical scheme to describe lexical measurements. These measurements encompass current ways of representing text, but also completely new representation schemes for it. For example, this quantum-like representation includes logical features such as non-Boolean behaviour that has been suggested to be a fundamental issue when extracting information from natural language text. This scheme also provides a formal unification of logical, probabilistic and geometric approaches to the IR problem.
From the concepts and structures in this scheme of lexical measurement, and using the principle of uncertain conditional, an “Aboutness Witness” is defined as a transformation that can detect documents that are relevant to a query. Mathematical properties of the Aboutness Witness are described in detail and related to other concepts from Information Retrieval. A practical application of this concept is also developed for ad hoc retrieval tasks, and is evaluated with standard collections. Even though the introduction of the model instantiated here does not lead to substantial performance improvements, it is shown how it can be extended and improved, as well as how it can generate a whole range of radically new models and methodologies. This work opens a number of research possibilities both theoretical and experimental, like new representations for documents in Hilbert spaces or other forms, methodologies for term weighting to be used either within the proposed framework or independently, ways to extend existing methodologies, and a new range of operator-based methods for several tasks in IR.
c
In this article, we describe and interpret a set of acoustic and linguistic features that characterise emotional/emotion-related user states – confined to the one database processed: four classes in a German corpus of children interacting with a pet robot. To this end, we collected a very large feature vector consisting of more than 4000 features extracted at different sites. We performed extensive feature selection (Sequential Forward Floating Search) for seven acoustic and four linguistic types of features, ending up in a small number of ‘most important’ features which we try to interpret by discussing the impact of different feature and extraction types. We establish different measures of impact and discuss the mutual influence of acoustics and linguistics.
Detecting Abnormal Social Robot Behavior through Emotion Recognition
Sharing characteristics with both the Internet of Things and the Cyber Physical Systems categories, a new type of device has arrived to claim a third category and raise its very own privacy concerns. Social robots are on the market, asking consumers to make them part of their daily routine and interactions. Although they vary in the level and method of communication with users, all social robots are able to collect, share and analyze a great variety and large volume of personal data. In this thesis, we draw the community's attention to this emerging area of interest for privacy and security research. We discuss the likely privacy issues, comment on current defense mechanisms that are applicable to this new category of devices, outline new forms of attack that are made possible through social robots, highlight paths that research on consumer perceptions could follow, and propose a system for detecting abnormal social robot behavior based on emotion detection.
Automatic social role recognition and its application in structuring multiparty interactions
Automatic processing of multiparty interactions is a research domain with important applications in content browsing, summarization and information retrieval. In recent years, several works have been devoted to finding regular patterns that speakers exhibit in a multiparty interaction, also known as social roles. Most of the research in the literature has focused on recognition of scenario-specific formal roles. More recently, role coding schemes based on informal social roles have been proposed in the literature, defining roles based on the behavior speakers have in the functioning of a small group interaction. Informal social roles represent a flexible classification scheme that can generalize across different scenarios of multiparty interaction. In this thesis, we focus on automatic recognition of informal social roles and exploit the influence of informal social roles on speaker behavior for structuring multiparty interactions. To model speaker behavior, we systematically explore various verbal and nonverbal cues extracted from turn-taking patterns, vocal expression and linguistic style. The influence of social roles on the behavior cues exhibited by a speaker is modeled using a discriminative approach based on conditional random fields. Experiments performed on several hours of meeting data reveal that classification using conditional random fields improves the role recognition performance. We demonstrate the effectiveness of our approach by evaluating it on previously unseen scenarios of multiparty interaction. Furthermore, we also consider whether formal roles and informal roles can be automatically predicted by the same verbal and nonverbal features. We exploit the influence of social roles on turn-taking patterns to improve speaker diarization under distant-microphone conditions.
Our work extends the hidden Markov model (HMM)/Gaussian mixture model (GMM) speaker diarization system, and is based on jointly estimating both the speaker segmentation and social roles in an audio recording. We modify the minimum duration constraint in the HMM-GMM diarization system by using role information to model the expected duration of a speaker's turn. We also use social role n-grams as prior information to model speaker interaction patterns. Finally, we demonstrate the application of social roles for the problem of topic segmentation in meetings. We exploit our findings that social roles can dynamically change in conversations and use this information to predict topic changes in meetings. We also present an unsupervised method for topic segmentation which combines social roles and lexical cohesion. Experimental results show that social roles improve the performance of both speaker diarization and topic segmentation.