Leveraging Text-to-Scene Generation for Language Elicitation and Documentation
Text-to-scene generation systems take input in the form of a natural language text and output a 3D scene illustrating the meaning of that text. A major benefit of text-to-scene generation is that it allows users to create custom 3D scenes without requiring them to have a background in 3D graphics or knowledge of specialized software packages. This makes text-to-scene generation useful in scenarios ranging from creative applications to education. The primary goal of this thesis is to explore how we can use text-to-scene generation in a new way: as a tool to facilitate the elicitation and formal documentation of language. In particular, we use text-to-scene generation (a) to assist field linguists studying endangered languages; (b) to provide a cross-linguistic framework for formally modeling spatial language; and (c) to collect language data using crowdsourcing. As a side effect of these goals, we also explore the problem of multilingual text-to-scene generation, that is, systems for generating 3D scenes from languages other than English.
The contributions of this thesis are the following. First, we develop a novel tool suite (the WordsEye Linguistics Tools, or WELT) that uses the WordsEye text-to-scene system to assist field linguists with eliciting and documenting endangered languages. WELT allows linguists to create custom elicitation materials and to document semantics in a formal way. We test WELT with two endangered languages, Nahuatl and Arrernte. Second, we explore the question of how to learn a syntactic parser for WELT. We show that an incremental learning method using a small number of annotated dependency structures can produce reasonably accurate results. We demonstrate that using a parser trained in this way can significantly decrease the time it takes an annotator to label a new sentence with dependency information. Third, we develop a framework that generates 3D scenes from spatial and graphical semantic primitives. We incorporate this system into the WELT tools for creating custom elicitation materials, allowing users to directly manipulate the underlying semantics of a generated scene. Fourth, we introduce a deep semantic representation of spatial relations and use this to create a new resource, SpatialNet, which formally declares the lexical semantics of spatial relations for a language. We demonstrate how SpatialNet can be used to support multilingual text-to-scene generation. Finally, we show how WordsEye and the semantic resources it provides can be used to facilitate elicitation of language using crowdsourcing.
Finding Emotion in Image Descriptions
In this paper, we approach the problem of classifying emotion in image descriptions. A method is proposed to perform 6-way emotion classification and is tested against two labeled datasets: a corpus of blog posts mined from LiveJournal and a corpus of descriptive texts of computer generated scenes. We perform feature selection using the mRMR technique and then use a multi-class linear predictor to classify posts among the Ekman Big Six emotions (happiness, sadness, anger, surprise, fear, and disgust) [9]. We find that TF-IDF scores on lexical features and LIWC scores are much more helpful in emotion classification than scores calculated from existing sentiment dictionaries, and that our proposed method performs significantly better than a baseline classifier that chooses the majority class. On the blog posts, we achieve 40% accuracy, and on the corpus of image descriptions, we achieve up to 63% accuracy.
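The pipeline described in the abstract (lexical TF-IDF features, feature selection, multi-class linear prediction over the six Ekman labels) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: scikit-learn's SelectKBest with chi-squared scoring stands in for mRMR, LogisticRegression stands in for the linear predictor, and the six one-sentence training documents are invented for demonstration.

```python
# Hedged sketch of 6-way Ekman emotion classification over text.
# SelectKBest(chi2) is a stand-in for mRMR feature selection, and
# LogisticRegression for the multi-class linear predictor; the tiny
# corpus below is invented purely for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

EKMAN = ["happiness", "sadness", "anger", "surprise", "fear", "disgust"]

docs = [
    "the sun is shining and everyone smiles",        # happiness
    "tears fall as the rain keeps pouring",          # sadness
    "he slammed the door shouting furiously",        # anger
    "she gasped when the lights suddenly went out",  # surprise
    "alone in the dark with footsteps behind her",   # fear
    "the rotten food made him turn away",            # disgust
]
labels = EKMAN  # one example per class, in order

pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),                # lexical TF-IDF features
    ("select", SelectKBest(chi2, k=10)),         # stand-in for mRMR
    ("clf", LogisticRegression(max_iter=1000)),  # multi-class linear model
])
pipeline.fit(docs, labels)

pred = pipeline.predict(["everyone smiles in the sunshine"])[0]
print(pred)
```

A real experiment would of course train on the full LiveJournal or scene-description corpus and evaluate against the majority-class baseline mentioned above.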
Finding Emotion in Image Descriptions: Crowdsourced Data
This dataset contains 660 images, each annotated with descriptions and mood labels.
The images were originally created by users of the WordsEye text-to-scene system (https://www.wordseye.com/) and were downloaded from the WordsEye gallery.
For each image, we used Amazon Mechanical Turk to obtain:
(a) a literal description that could function as a caption for the image,
(b) the most relevant mood for the picture (happiness, sadness, anger, surprise, fear, or disgust),
(c) a short explanation of why that mood was selected.
We published three AMT HITs for each picture, for a total of 1980 captions, mood labels, and explanations.
This data was used for the machine learning experiments presented in:
Morgan Ulinski, Victor Soto, and Julia Hirschberg. Finding Emotion in Image Descriptions. In Proceedings of the First International Workshop on Issues of Sentiment Discovery and Opinion Mining, WISDOM '12, pages 8:1-8:7.
Please cite this paper if you use this data.
Multilingual Spatial Relation and Motion Treebank
One-sentence descriptions of each picture from the Picture Series for Positional Verbs (Ameka et al., 1999) and each video clip from the Motion Verb Stimulus Kit (Levinson, 2001): 163 English sentences, 165 Spanish sentences, 157 German sentences, and 158 Egyptian Arabic sentences. All sentences are tokenized and annotated with lemma, part of speech, morphological features, and dependency label and head, using the universal POS tags, universal features, and universal dependency relations. The treebank is in CoNLL format.
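A file in this layout can be read with a few lines of standard Python. The sketch below assumes the 10-column CoNLL-U convention (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC); the example sentence is invented for illustration and is not taken from the treebank.

```python
# Hedged sketch of reading token lines in 10-column CoNLL-U layout:
# ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC (tab-separated).
# The sentence below is an invented example, not treebank data.
CONLL = """\
1\tThe\tthe\tDET\t_\tDefinite=Def|PronType=Art\t2\tdet\t_\t_
2\tbottle\tbottle\tNOUN\t_\tNumber=Sing\t3\tnsubj\t_\t_
3\tstands\tstand\tVERB\t_\tNumber=Sing|Person=3\t0\troot\t_\t_
4\ton\ton\tADP\t_\t_\t6\tcase\t_\t_
5\tthe\tthe\tDET\t_\tDefinite=Def|PronType=Art\t6\tdet\t_\t_
6\ttable\ttable\tNOUN\t_\tNumber=Sing\t3\tobl\t_\t_
"""

def read_tokens(block):
    """Yield (form, lemma, upos, head, deprel) for each token line."""
    for line in block.strip().splitlines():
        cols = line.split("\t")
        yield cols[1], cols[2], cols[3], int(cols[6]), cols[7]

for form, lemma, upos, head, deprel in read_tokens(CONLL):
    print(form, lemma, upos, head, deprel)
```

HEAD is the 1-based index of the governing token, with 0 marking the root, so the dependency tree can be reconstructed directly from these columns.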
Complex predicates in Arrernte
Using the example of Murrinh-Patha, Seiss [2011] illustrates how Australian Aboriginal languages can shed light on the morphology-syntax interface: one aspect of their polysynthetic nature is that information often encoded in phrases and clauses in other languages is instead found in a single morphological word. In this paper, we look at another instance, the Australian Aboriginal language Arrernte, and in particular at complex predicates within the language, to examine the implications for the morphology-syntax interface. Building on this consideration of the morphology-syntax interface, we show how a glue semantics-based approach can be applied to Arrernte complex predicates, in a way that fits neatly with the use of glue semantics to model lexical functions in LFG in a multilingual natural language generation environment.