118,827 research outputs found
Event Representations for Automated Story Generation with Deep Neural Nets
Automated story generation is the problem of automatically selecting a
sequence of events, actions, or words that can be told as a story. We seek to
develop a system that can generate stories by learning everything it needs to
know from textual story corpora. To date, recurrent neural networks that learn
language models at character, word, or sentence levels have had little success
generating coherent stories. We explore the question of event representations
that provide a mid-level of abstraction between words and sentences in order to
retain the semantic information of the original data while minimizing event
sparsity. We present a technique for preprocessing textual story data into
event sequences. We then present a technique for automated story generation
whereby we decompose the problem into the generation of successive events
(event2event) and the generation of natural language sentences from events
(event2sentence). We give empirical results comparing different event
representations and their effects on event successor generation and the
translation of events to natural language.Comment: Submitted to AAAI'1
A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference
This paper introduces the Multi-Genre Natural Language Inference (MultiNLI)
corpus, a dataset designed for use in the development and evaluation of machine
learning models for sentence understanding. In addition to being one of the
largest corpora available for the task of NLI, at 433k examples, this corpus
improves upon available resources in its coverage: it offers data from ten
distinct genres of written and spoken English--making it possible to evaluate
systems on nearly the full complexity of the language--and it offers an
explicit setting for the evaluation of cross-genre domain adaptation.Comment: 10 pages, 1 figures, 5 tables. v2 corrects a misreported accuracy
number for the CBOW model in the 'matched' setting. v3 adds a discussion of
the difficulty of the corpus to the analysis section. v4 is the version that
was accepted to NAACL201
Easing Embedding Learning by Comprehensive Transcription of Heterogeneous Information Networks
Heterogeneous information networks (HINs) are ubiquitous in real-world
applications. In the meantime, network embedding has emerged as a convenient
tool to mine and learn from networked data. As a result, it is of interest to
develop HIN embedding methods. However, the heterogeneity in HINs introduces
not only rich information but also potentially incompatible semantics, which
poses special challenges to embedding learning in HINs. With the intention to
preserve the rich yet potentially incompatible information in HIN embedding, we
propose to study the problem of comprehensive transcription of heterogeneous
information networks. The comprehensive transcription of HINs also provides an
easy-to-use approach to unleash the power of HINs, since it requires no
additional supervision, expertise, or feature engineering. To cope with the
challenges in the comprehensive transcription of HINs, we propose the HEER
algorithm, which embeds HINs via edge representations that are further coupled
with properly-learned heterogeneous metrics. To corroborate the efficacy of
HEER, we conducted experiments on two large-scale real-words datasets with an
edge reconstruction task and multiple case studies. Experiment results
demonstrate the effectiveness of the proposed HEER model and the utility of
edge representations and heterogeneous metrics. The code and data are available
at https://github.com/GentleZhu/HEER.Comment: 10 pages. In Proceedings of the 24th ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining, London, United Kingdom,
ACM, 201
Event detection in field sports video using audio-visual features and a support vector machine
In this paper, we propose a novel audio-visual feature-based framework for event detection in broadcast video of multiple different field sports. Features indicating significant events are selected and robust detectors built. These features are rooted in characteristics common to all genres of field sports. The evidence gathered by the feature detectors is combined by means of a support vector machine, which infers the occurrence of an event based on a model generated during a training phase. The system is tested generically across multiple genres of field sports including soccer, rugby, hockey, and Gaelic football and the results suggest that high event retrieval and content rejection statistics are achievable
Examining Variations of Prominent Features in Genre Classification.
This paper investigates the correlation between features of three types (visual, stylistic and topical types) and genre classes. The majority of previous studies in automated genre classification have created models based on an amalgamated representation of a document using a combination of features. In these models, the inseparable roles of different features make it difficult to determine a means of improving the classifier when it exhibits poor performance in detecting selected genres. In this paper we use classifiers independently modeled on three groups of features to examine six genre classes to show that the strongest features for making one classification is not necessarily the best features for carrying out another classification.
Event detection based on generic characteristics of field-sports
In this paper, we propose a generic framework for event detection in broadcast video of multiple different field-sports. Features indicating significant events are selected, and robust detectors built. These features are rooted in generic characteristics common to all genres of field-sports. The evidence gathered by the feature detectors is combined by means of a support vector machine, which infers the occurrence of an event based on a model generated during a training phase. The system is tested across multiple genres of field-sports including soccer, rugby, hockey and Gaelic football and the results suggest that high event retrieval and content rejection statistics are achievable
- …