6,387 research outputs found

    Learning Language from a Large (Unannotated) Corpus

    A novel approach to the fully automated, unsupervised extraction of dependency grammars and associated syntax-to-semantic-relationship mappings from large text corpora is described. The suggested approach builds on the authors' prior work with the Link Grammar, RelEx and OpenCog systems, as well as on a number of prior papers and approaches from the statistical language learning literature. If successful, this approach would enable the mining of all the information needed to power a natural language comprehension and generation system directly from a large, unannotated corpus. Comment: 29 pages, 5 figures, research proposal
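    As a rough illustration of the statistical starting point such proposals often rely on (a sketch, not the authors' actual OpenCog/Link Grammar pipeline), the snippet below computes pointwise mutual information between adjacent word pairs in an unannotated corpus; high-PMI pairs are a common seed signal for unsupervised dependency induction. The toy corpus and function names are invented for the example.

```python
import math
from collections import Counter

def pmi_table(sentences):
    """Score adjacent word pairs by pointwise mutual information.

    sentences: iterable of token lists from an unannotated corpus.
    Returns {(left, right): PMI} for every adjacent pair seen.
    """
    word_counts, pair_counts = Counter(), Counter()
    total_words = total_pairs = 0
    for tokens in sentences:
        word_counts.update(tokens)
        total_words += len(tokens)
        for left, right in zip(tokens, tokens[1:]):
            pair_counts[(left, right)] += 1
            total_pairs += 1
    scores = {}
    for (left, right), n in pair_counts.items():
        p_pair = n / total_pairs
        p_left = word_counts[left] / total_words
        p_right = word_counts[right] / total_words
        scores[(left, right)] = math.log(p_pair / (p_left * p_right))
    return scores

corpus = [["the", "cat", "sat"], ["the", "dog", "sat"], ["a", "cat", "ran"]]
for pair, score in sorted(pmi_table(corpus).items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 3))
```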

    Acquiring Word-Meaning Mappings for Natural Language Interfaces

    This paper focuses on a system, WOLFIE (WOrd Learning From Interpreted Examples), that acquires a semantic lexicon from a corpus of sentences paired with semantic representations. The lexicon learned consists of phrases paired with meaning representations. WOLFIE is part of an integrated system that learns to transform sentences into representations such as logical database queries. Experimental results are presented demonstrating WOLFIE's ability to learn useful lexicons for a database interface in four different natural languages. The usefulness of the lexicons learned by WOLFIE is compared to that of lexicons acquired by a similar system, with results favorable to WOLFIE. A second set of experiments demonstrates WOLFIE's ability to scale to larger and more difficult, albeit artificially generated, corpora. In natural language acquisition, it is difficult to gather the annotated data needed for supervised learning, whereas unannotated data is fairly plentiful. Active learning methods attempt to select for annotation and training only the most informative examples, and are therefore potentially very useful in natural language applications. However, most results to date for active learning have considered only standard classification tasks. To reduce annotation effort while maintaining accuracy, we apply active learning to semantic lexicons. We show that active learning can significantly reduce the number of annotated examples required to achieve a given level of performance.
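    The active-learning component can be pictured with a generic pool-based loop like the one below; this is a minimal sketch of uncertainty sampling, not WOLFIE's own selection metric, and the parameters (annotate, train, predict_proba) are hypothetical stand-ins for the oracle and learner.

```python
import random

def active_learning(pool, annotate, train, predict_proba, budget):
    """Generic pool-based active learning via uncertainty sampling.

    pool: mutable list of unlabeled examples.
    annotate: oracle mapping an example to its label.
    train: fits a model on a list of (example, label) pairs.
    predict_proba: model's confidence in [0, 1] for its favored label.
    """
    labeled = []
    # Seed with one random example so a first model can be trained.
    seed = pool.pop(random.randrange(len(pool)))
    labeled.append((seed, annotate(seed)))
    model = train(labeled)
    for _ in range(budget):
        if not pool:
            break
        # Ask for annotation of the example the model trusts least.
        idx = min(range(len(pool)),
                  key=lambda i: predict_proba(model, pool[i]))
        chosen = pool.pop(idx)
        labeled.append((chosen, annotate(chosen)))
        model = train(labeled)
    return model, labeled
```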

    Multi-Level Modeling of Quotation Families Morphogenesis

    This paper investigates cultural dynamics in social media by examining the proliferation and diversification of clearly-cut pieces of content: quoted texts. In line with the pioneering work of Leskovec et al. and Simmons et al. on meme dynamics, we investigate in depth the transformations that quotations published online undergo during their diffusion. We deliberately put aside the structure of the social network, as well as the dynamical patterns pertaining to the diffusion process, to focus on the way quotations are changed, how often they are modified, and how these changes shape more or less diverse families and sub-families of quotations. Following a biological metaphor, we try to understand in which ways mutations can transform quotations at different scales and how mutation rates depend on various properties of the quotations. Comment: Published in the Proceedings of the ASE/IEEE 4th Intl. Conf. on Social Computing "SocialCom 2012", Sep. 3-5, 2012, Amsterdam, Netherlands
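    One concrete way to picture the family-building step (a simplified sketch, not the paper's methodology) is greedy single-link grouping of quotations by word-level similarity, so that lightly "mutated" variants of a quote fall into the same family; the 0.7 threshold is an invented example value.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Word-level similarity in [0, 1] between two quotations."""
    return SequenceMatcher(None, a.split(), b.split()).ratio()

def quotation_families(quotes, threshold=0.7):
    """Group quotes whose similarity to any family member exceeds
    the threshold; each family collects one quote's close variants."""
    families = []
    for quote in quotes:
        for family in families:
            if any(similarity(quote, member) >= threshold
                   for member in family):
                family.append(quote)
                break
        else:
            families.append([quote])
    return families

quotes = [
    "a journey of a thousand miles begins with a single step",
    "a journey of a thousand miles starts with a single step",
    "to be or not to be",
]
for family in quotation_families(quotes):
    print(family)
```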

    A topic modeling based approach to novel document automatic summarization

    © 2017 Elsevier Ltd. Most existing automatic text summarization algorithms target collections of relatively short documents and are therefore difficult to apply directly to novels, which are long and loosely structured. In this paper, we propose a topic-modeling-based approach to extractive automatic summarization of novels that aims for a good balance among compression ratio, summarization quality and machine readability. First, based on topic modeling, we extract the candidate sentences associated with topic words from a preprocessed novel document. Second, with the goals of compression ratio and topic diversity, we design an importance evaluation function to select the most important sentences from the candidates and thus generate an initial summary. Finally, we smooth the initial summary to overcome the semantic confusion caused by ambiguous or synonymous words, improving its readability. We evaluate the proposed approach experimentally on a real novel dataset. The results show that, compared with other candidate algorithms, our approach generates summaries with both a higher compression ratio and better summarization quality.
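    The selection step can be sketched as a greedy loop that rewards sentences covering new topic words and discounts ones the summary already covers. This is a simplified stand-in for the paper's importance evaluation function: the topic words are assumed to come from a prior topic-modeling pass, and the smoothing stage is omitted.

```python
def summarize(sentences, topic_words, k=3, diversity_penalty=0.5):
    """Greedy extractive summary: pick up to k sentences, rewarding
    topic words not yet covered and discounting already-covered ones."""
    topic_words = set(topic_words)
    covered = set()          # topic words already in the summary
    summary = []
    candidates = list(sentences)
    for _ in range(min(k, len(candidates))):
        def score(sentence):
            hits = set(sentence.lower().split()) & topic_words
            fresh = hits - covered
            return len(fresh) + (1 - diversity_penalty) * len(hits & covered)
        best = max(candidates, key=score)
        if score(best) == 0:
            break            # no candidate adds topical content
        summary.append(best)
        covered |= set(best.lower().split()) & topic_words
        candidates.remove(best)
    return summary

chapter = ["the captain studied the storm charts all night",
           "breakfast was served at eight",
           "the storm broke the mast before dawn"]
print(summarize(chapter, ["storm", "mast", "captain"], k=2))
```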

    A constraint-based approach to noun phrase coreference resolution in German newspaper text

    In this paper, we investigate a wide range of features for their usefulness in the resolution of nominal coreference, both as hard constraints (which remove elements from the list of possible candidates outright) and as soft constraints (where an accumulation of violations makes it less likely that a candidate is chosen as the antecedent). We present a state-of-the-art system based on such constraints, with weights estimated by a maximum entropy model, which uses lexical information to resolve cases of coreferent bridging.
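    A skeleton of the hard/soft constraint scheme might look like the sketch below; in the paper the soft-constraint weights are estimated by a maximum entropy model, whereas here the constraints and weights are invented placeholders over toy dict-based mentions.

```python
def resolve_antecedent(mention, candidates, hard_constraints, soft_constraints):
    """Pick the best antecedent for a mention, or None.

    hard_constraints: predicates; any violation removes the candidate.
    soft_constraints: (predicate, weight) pairs; violations lower a score.
    """
    viable = [c for c in candidates
              if all(ok(mention, c) for ok in hard_constraints)]
    if not viable:
        return None
    def score(candidate):
        return sum(-weight for ok, weight in soft_constraints
                   if not ok(mention, candidate))
    return max(viable, key=score)

# Toy usage with hand-set weights (placeholders, not learned values).
hard = [lambda m, c: m["gender"] == c["gender"]]          # gender agreement
soft = [(lambda m, c: m["number"] == c["number"], 2.0)]   # number agreement
mention = {"gender": "f", "number": "sg"}
candidates = [{"gender": "f", "number": "pl"},
              {"gender": "f", "number": "sg"}]
print(resolve_antecedent(mention, candidates, hard, soft))
```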

    Exploring Metaphorical Senses and Word Representations for Identifying Metonyms

    A metonym is a word with a figurative meaning, similar to a metaphor. Because metonyms are closely related to metaphors, we apply features that have been used successfully for metaphor recognition to the task of detecting metonyms. On the ACL SemEval 2007 Task 8 data with gold-standard metonym annotations, our system achieved 86.45% accuracy on location metonyms. Our code can be found on GitHub. Comment: 9 pages, 8 pages of content
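    In the same spirit, a minimal feature-based classifier for metonym detection could be set up as below; the context-window features, training sentences and labels are all invented for illustration and are not the authors' actual feature set.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def features(sentence, target):
    """Toy context-window features for one target word."""
    tokens = sentence.lower().split()
    i = tokens.index(target)
    return {
        "prev": tokens[i - 1] if i > 0 else "<s>",
        "next": tokens[i + 1] if i + 1 < len(tokens) else "</s>",
        "target": target,
    }

train = [("talks with washington stalled", "washington", 1),     # metonymic
         ("he drove to washington yesterday", "washington", 0)]  # literal
X = [features(sentence, target) for sentence, target, _ in train]
y = [label for _, _, label in train]

clf = make_pipeline(DictVectorizer(), LogisticRegression())
clf.fit(X, y)
print(clf.predict([features("washington announced new sanctions",
                            "washington")]))
```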

    Towards an explication and description of synonymy in English

    The thesis begins by arguing for an a posteriori approach to synonymy, according to which synonymy should be treated as an empirical phenomenon which it is the task of linguistic semantics to explicate. Arguments are presented against the a priori approach often underlying treatments of synonymy, which makes it possible to define synonymy out of existence. A distinction is then drawn between three possible levels of synonymy (i.e. lexeme-synonymy, sense-synonymy and occurrence-synonymy), and it is argued that all three should be treated as legitimate levels - occurrence-synonymy as the basic level and the other two chiefly as a means of stating synonymy-relations more economically, where appropriate. This is followed by the establishment of two criteria of synonymy for all three levels. After discussion and (in some cases) re-definition of various types of acceptability and anomaly, the interchangeability criterion is defined as the mutual substitutability of words without causing either grammatical or collocational anomaly. The sameness of meaning criterion is based on the distinctions between pragmatic and analytic equivalence and between performance and judgement equivalence, and is defined in terms of the first alternative in each case. While my concern up to this point is with the explication of synonymy, the remainder of the thesis is devoted to its description. A distinction is drawn between two types of case where two senses are synonymous in some contexts but not in others. Two types of explanation are provided accordingly. The thesis ends by discussing various types of communicatively relevant difference between synonyms.