Automatic extraction of paraphrastic phrases from medium size corpora
This paper presents a versatile system intended to acquire paraphrastic
phrases from a representative corpus. In order to decrease the time spent on
the elaboration of resources for NLP systems (for example Information
Extraction, IE hereafter), we suggest using a machine learning system that
helps define new templates and associated resources. This knowledge is
automatically derived from the text collection, in interaction with a large
semantic network.
Ontologies and Information Extraction
This report argues that, even in the simplest cases, IE is an ontology-driven
process. It is not a mere text filtering method based on simple pattern
matching and keywords, because the extracted pieces of texts are interpreted
with respect to a predefined partial domain model. This report shows that
depending on the nature and the depth of the interpretation to be done for
extracting the information, more or less knowledge must be involved. This
report is mainly illustrated with examples from biology, a domain in which
there are critical needs for content-based exploration of the scientific
literature and which is becoming a major application domain for IE.
Corpus-Driven Knowledge Acquisition for Discourse Analysis
The availability of large on-line text corpora provides a natural and
promising bridge between the worlds of natural language processing (NLP) and
machine learning (ML). In recent years, the NLP community has been aggressively
investigating statistical techniques to drive part-of-speech taggers, but
application-specific text corpora can be used to drive knowledge acquisition at
much higher levels as well. In this paper we will show how ML techniques can be
used to support knowledge acquisition for information extraction systems. It is
often very difficult to specify an explicit domain model for many information
extraction applications, and it is always labor intensive to implement
hand-coded heuristics for each new domain. We have discovered that it is
nevertheless possible to use ML algorithms in order to capture knowledge that
is only implicitly present in a representative text corpus. Our work addresses
issues traditionally associated with discourse analysis and intersentential
inference generation, and demonstrates the utility of ML algorithms at this
higher level of language analysis. The benefits of our work concern the
portability and scalability of information extraction (IE) technologies. When
hand-coded heuristics are used to manage discourse analysis in an information
extraction system, months of programming effort are easily needed to port a
successful IE system to a new domain. We will show how ML algorithms can reduce
this
Comment: 6 pages, AAAI-9
Tracing Linguistic Relations in Winning and Losing Sides of Explicit Opposing Groups
Linguistic relations in oral conversations present how opinions are
constructed and developed in a restricted time. The relations bond ideas,
arguments, thoughts, and feelings, re-shape them during a speech, and finally
build knowledge out of all information provided in the conversation. Speakers
share a common interest to discuss. It is expected that each speaker's reply
includes duplicated forms of words from previous speakers. However, linguistic
adaptation is observed and evolves in a more complex path than just
transferring slightly modified versions of common concepts. A conversation
aiming at a benefit in the end shows an emergent cooperation that induces the
adaptation. Not only cooperation but also competition drives the adaptation or
an opposite scenario, and one can capture the dynamic process by tracking how
the concepts are linguistically linked. To uncover salient complex dynamic
events in verbal communications, we attempt to discover self-organized
linguistic relations hidden in a conversation with explicitly stated winners
and losers. We examine open access data of the United States Supreme Court. Our
understanding is crucial in big data research to guide how transition states in
opinion mining and decision-making should be modeled and how this required
knowledge to guide the model should be pinpointed, by filtering large amounts of
data.
Comment: Full paper, Proceedings of FLAIRS-2017 (30th Florida Artificial
Intelligence Research Society), Special Track, Artificial Intelligence for
Big Social Data Analysi
Natural language processing
Beginning with the basic issues of NLP, this chapter aims to chart the major research activities in this area since the last ARIST Chapter in 1996 (Haas, 1996), including: (i) natural language text processing systems (text summarization, information extraction, information retrieval, etc.), including domain-specific applications; (ii) natural language interfaces; (iii) NLP in the context of the WWW and digital libraries; and (iv) evaluation of NLP systems.
Lexical typology : a programmatic sketch
The present paper is an attempt to lay the foundation for Lexical Typology as a new kind of linguistic typology. The goal of Lexical Typology is to investigate crosslinguistically significant patterns of interaction between lexicon and grammar.