
    A Uniform Approach to Analogies, Synonyms, Antonyms, and Associations

    Recognizing analogies, synonyms, antonyms, and associations appears to involve four distinct tasks, requiring distinct NLP algorithms. In the past, the four tasks have been treated independently, using a wide variety of algorithms. These four semantic classes, however, are a tiny sample of the full range of semantic phenomena, and we cannot afford to create ad hoc algorithms for each semantic phenomenon; we need to seek a unified approach. We propose to subsume a broad range of phenomena under analogies. To limit the scope of this paper, we restrict our attention to the subsumption of synonyms, antonyms, and associations. We introduce a supervised corpus-based machine learning algorithm for classifying analogous word pairs, and we show that it can solve multiple-choice SAT analogy questions, TOEFL synonym questions, ESL synonym-antonym questions, and similar-associated-both questions from cognitive psychology.
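    The unifying idea above can be illustrated with a minimal sketch: one word-pair scorer answers multiple-choice questions of different semantic types by ranking each (stem, choice) pair. Here, cosine similarity over tiny hand-made co-occurrence vectors stands in for the paper's supervised pair classifier; the vocabulary, vectors, and scoring function are illustrative assumptions, not the authors' actual features.

    ```python
    # Toy sketch: a single pair-scoring function used for multiple-choice
    # questions, standing in for a supervised analogous-word-pair classifier.
    import math

    # Hypothetical context-count vectors for a handful of words (assumed data).
    VECTORS = {
        "big":   [9, 1, 0],
        "large": [8, 2, 0],
        "small": [1, 9, 0],
        "ocean": [0, 1, 9],
    }

    def cosine(u, v):
        """Cosine similarity between two count vectors."""
        dot = sum(a * b for a, b in zip(u, v))
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(b * b for b in v))
        return dot / (nu * nv) if nu and nv else 0.0

    def pair_score(w1, w2):
        """Score how strongly w1 and w2 stand in the target relation
        (here approximated by distributional similarity)."""
        return cosine(VECTORS[w1], VECTORS[w2])

    def answer(stem, choices):
        """Answer a multiple-choice question by scoring each (stem, choice) pair."""
        return max(choices, key=lambda c: pair_score(stem, c))

    print(answer("big", ["large", "small", "ocean"]))  # → large
    ```

    The point of the sketch is the architecture, not the scorer: synonym, antonym, and association questions all reduce to ranking candidate word pairs with one model.
    
    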

    How Controlled English can Improve Semantic Wikis

    The motivation of semantic wikis is to make acquisition, maintenance, and mining of formal knowledge simpler, faster, and more flexible. However, most existing semantic wikis have a very technical interface and are restricted to a relatively low level of expressivity. In this paper, we explain how AceWiki uses controlled English - concretely Attempto Controlled English (ACE) - to provide a natural and intuitive interface while supporting a high degree of expressivity. We introduce recent improvements of the AceWiki system and present user studies indicating that AceWiki is usable and useful.

    On Yao's method of translation

    Machine Translation, i.e., translating from one natural language to another by using a computer system, is a very important research branch in Artificial Intelligence. Yao developed a method of translation that he called "Lexical-Semantic Driven". In his system he introduced 49 "relation types" including case relations, event relations, semantic relations, and complex relations. The knowledge graph method is a new kind of method to represent an interlingua between natural languages. In this paper, we give a comparison of these two methods. We translate one Chinese sentence cited in Yao's book by using both methods. Finally, we use the relations in knowledge graph theory to represent the "relations" in Lexical-Semantic Driven, and partition the relations in Lexical-Semantic Driven into groups according to the relations in knowledge graph theory.

    In no uncertain terms : a dataset for monolingual and multilingual automatic term extraction from comparable corpora

    Automatic term extraction is a productive field of research within natural language processing, but it still faces significant obstacles regarding datasets and evaluation, which require manual term annotation. This is an arduous task, made even more difficult by the lack of a clear distinction between terms and general language, which results in low inter-annotator agreement. There is a great need for well-documented, manually validated datasets, especially in the rising field of multilingual term extraction from comparable corpora, which presents a unique new set of challenges. In this paper, a new approach is presented for both monolingual and multilingual term annotation in comparable corpora. The detailed guidelines with different term labels, the domain- and language-independent methodology, and the large volumes annotated in three different languages and four different domains make this a rich resource. The resulting datasets are not just suited for evaluation purposes but can also serve as a general source of information about terms and even as training data for supervised methods. Moreover, the gold standard for multilingual term extraction from comparable corpora contains information about term variants and translation equivalents, which allows an in-depth, nuanced evaluation.

    Determining the Characteristic Vocabulary for a Specialized Dictionary using Word2vec and a Directed Crawler

    Specialized dictionaries are used to understand concepts in specific domains, especially where those concepts are not part of the general vocabulary, or have meanings that differ from those in ordinary language. The first step in creating a specialized dictionary involves detecting the characteristic vocabulary of the domain in question. Classical methods for detecting this vocabulary involve gathering a domain corpus, calculating statistics on the terms found there, and then comparing these statistics to a background or general language corpus. Terms which are found significantly more often in the specialized corpus than in the background corpus are candidates for the characteristic vocabulary of the domain. Here we present two tools, a directed crawler and a distributional semantics package, that can be used together, circumventing the need for a background corpus. Both tools are available on the web.
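    The classical method the abstract contrasts with can be sketched in a few lines: compare each word's relative frequency in a small domain corpus against a background corpus, and keep words that are markedly over-represented. The toy corpora, the smoothing scheme, and the ratio threshold below are illustrative assumptions, not the paper's actual procedure.

    ```python
    # Toy sketch of frequency-ratio term detection: words much more frequent
    # in the domain corpus than in the background corpus become candidates.
    from collections import Counter

    domain_corpus = "the aorta carries blood from the heart the aorta branches".split()
    background_corpus = "the cat sat on the mat and the dog slept by the door".split()

    def relative_freq(counts, total):
        """Map each word to its relative frequency in the corpus."""
        return {w: c / total for w, c in counts.items()}

    def characteristic_vocabulary(domain, background, ratio_threshold=2.0):
        """Return domain words whose frequency ratio against the background
        corpus exceeds the threshold, highest ratio first."""
        d_freq = relative_freq(Counter(domain), len(domain))
        b_freq = relative_freq(Counter(background), len(background))
        candidates = {}
        for word, f in d_freq.items():
            # Smooth unseen background words to avoid division by zero.
            bg = b_freq.get(word, 1 / (len(background) + 1))
            if f / bg >= ratio_threshold:
                candidates[word] = f / bg
        return sorted(candidates, key=candidates.get, reverse=True)

    print(characteristic_vocabulary(domain_corpus, background_corpus))  # → ['aorta']
    ```

    Note how common function words such as "the" are filtered out automatically: they are frequent in both corpora, so their ratio stays near one. The directed crawler and word2vec approach in the paper removes the need for the background corpus that this baseline depends on.
    
    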

    Using Natural Language as Knowledge Representation in an Intelligent Tutoring System

    Knowledge used in an intelligent tutoring system to teach students is usually acquired from authors who are experts in the domain. A problem is that they cannot directly add and update knowledge unless they learn the formal language used in the system. Using natural language to represent knowledge can allow authors to update knowledge easily. This thesis presents a new approach that uses unconstrained natural language as the knowledge representation for a physics tutoring system, so that non-programmers can add knowledge without learning a new knowledge representation. This approach allows domain experts to add not only problem statements, but also background knowledge such as commonsense and domain knowledge, including principles, in natural language. Rather than being translated into a formal language, the natural language representation is used directly in inference, so that domain experts can understand the internal process, detect knowledge bugs, and revise the knowledge base easily. In authoring task studies with the new system based on this approach, it was shown that the size of added knowledge was small enough for a domain expert to add, and converged to near zero as more problems were added in one mental model test. After entering the no-new-knowledge state in the test, 5 out of 13 problems (38 percent) were automatically solved by the system without adding new knowledge.

    Investigating the timecourse of accessing conversational implicatures during incremental sentence interpretation

    Many contextual inferences in utterance interpretation are explained as following from the nature of conversation and the assumption that participants are rational. Recent psycholinguistic research has focussed on certain of these ‘Gricean’ inferences and has revealed that comprehenders can access them in online interpretation. However, there have been mixed results as to the time-course of access. Some results show that Gricean inferences can be accessed very rapidly, as rapidly as any other contextually specified information (Sedivy, 2003; Grodner, Klein, Carbery, & Tanenhaus, 2010), while other studies looking at the same kind of inference suggest that access to Gricean inferences is delayed relative to other aspects of semantic interpretation (Huang & Snedeker, 2009; in press). While previous timecourse research has focussed on Gricean inferences that support the online assignment of reference to definite expressions, the study reported here examines the timecourse of access to scalar implicatures, which enrich the meaning of an utterance beyond the semantic interpretation. Even if access to Gricean inference in support of reference assignment may be rapid, it is still unknown whether genuinely enriching scalar implicatures are delayed. Our results indicate that scalar implicatures are accessed as rapidly as other contextual inferences. The implications of our results are discussed in reference to the architecture of language comprehension.