Natural language processing
Beginning with the basic issues of NLP, this chapter aims to chart the major research activities in this area since the last ARIST chapter in 1996 (Haas, 1996), including: (i) natural language text processing systems (text summarization, information extraction, information retrieval, etc.), including domain-specific applications; (ii) natural language interfaces; (iii) NLP in the context of the WWW and digital libraries; and (iv) evaluation of NLP systems.
Chart-driven Connectionist Categorial Parsing of Spoken Korean
Most speech and natural language systems developed for English and other
Indo-European languages neglect morphological processing and integrate speech
and natural language at the word level. For agglutinative languages such as
Korean and Japanese, however, morphological processing plays a major role in
language processing, since these languages have very complex morphological
phenomena and relatively simple syntactic functionality. Degenerate
morphological processing limits the usable vocabulary size of the system, and a
word-level dictionary results in an exponential explosion in the number of
dictionary entries. For agglutinative languages, we need sub-word level
integration that leaves room for general morphological processing. In this
paper, we develop a phoneme-level integration model of speech and linguistic
processing through general morphological analysis for agglutinative languages,
and an efficient parsing scheme for that integration. Korean is modeled
lexically based on the categorial grammar formalism with unordered argument and
suppressed category extensions, and a chart-driven connectionist parsing method
is introduced.
Comment: 6 pages, Postscript file, Proceedings of ICCPOL'9
Morphological Cues for Lexical Semantics
Most natural language processing tasks require lexical semantic information.
Automated acquisition of this information would thus increase the robustness
and portability of NLP systems. This paper describes an acquisition method
which makes use of fixed correspondences between derivational affixes and
lexical semantic information. One advantage of this method, and of other
methods that rely only on surface characteristics of language, is that the
necessary input is currently available
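As a rough illustration of the idea in this abstract, the sketch below maps derivational suffixes to coarse semantic labels using only the surface form of a word. The table `SUFFIX_CUES`, its entries, and the function `guess_semantics` are hypothetical and are not taken from the paper; they only show what a "fixed correspondence" between an affix and lexical semantic information might look like.

```python
# Hypothetical affix-to-feature table (illustrative entries, not from the paper).
SUFFIX_CUES = {
    "ize": "change-of-state verb",
    "ness": "abstract noun (state or quality)",
    "er": "agent or instrument noun",
}


def guess_semantics(word):
    """Return (suffix, feature) pairs for every cue suffix that ends the word.

    Matches are sorted longest suffix first. A word that consists of
    nothing but the suffix itself is not counted as a match.
    """
    lowered = word.lower()
    hits = [
        (suffix, feature)
        for suffix, feature in SUFFIX_CUES.items()
        if lowered.endswith(suffix) and len(lowered) > len(suffix)
    ]
    return sorted(hits, key=lambda pair: len(pair[0]), reverse=True)


if __name__ == "__main__":
    for w in ["modernize", "happiness", "teacher", "cat"]:
        print(w, guess_semantics(w))
```

A purely surface-based lookup like this over-generates (for example, "water" also ends in "er"), so in practice such cues would presumably be treated as evidence to be weighed, not as definitive labels.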
Language Without Words: A Pointillist Model for Natural Language Processing
This paper explores two separate questions: Can we perform natural language
processing tasks without a lexicon? And should we? Existing natural language
processing techniques are either based on words as units or use units such as
grams only for basic classification tasks. How close can a machine come to
reasoning about the meanings of words and phrases in a corpus without using any
lexicon, based only on grams?
Our own motivation for posing this question is based on our efforts to find
popular trends in words and phrases from online Chinese social media. This form
of written Chinese uses so many neologisms, creative character placements, and
combinations of writing systems that it has been dubbed the "Martian Language."
Readers must often use visual cues, audible cues from reading out loud, and
their knowledge and understanding of current events to understand a post. For
analysis of popular trends, the specific problem is that it is difficult to
build a lexicon when the invention of new ways to refer to a word or concept is
easy and common. For natural language processing in general, we argue in this
paper that new uses of language in social media will challenge machines'
abilities to operate with words as the basic unit of understanding, not only in
Chinese but potentially in other languages.
Comment: 5 pages, 2 figure
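To make "based only on grams" concrete, here is a minimal sketch of counting character n-grams across a set of posts without any word list or segmenter. The function names `char_grams` and `gram_counts`, and the choice to drop whitespace before forming grams, are assumptions made for illustration; the abstract does not specify how the authors extract or score grams.

```python
from collections import Counter


def char_grams(text, n=2):
    """Return all overlapping character n-grams of text, ignoring whitespace."""
    chars = [c for c in text if not c.isspace()]
    return ["".join(chars[i:i + n]) for i in range(len(chars) - n + 1)]


def gram_counts(posts, n=2):
    """Count character n-grams over an iterable of posts; no lexicon is used."""
    counts = Counter()
    for post in posts:
        counts.update(char_grams(post, n))
    return counts


if __name__ == "__main__":
    posts = ["abab", "ab"]
    print(gram_counts(posts, n=2).most_common(3))
```

Because the grams are taken directly from the character stream, a newly coined expression is counted the first time it appears, with no dictionary update required. Whether such counts support anything close to "reasoning about meaning" is the open question the abstract raises.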
Sentence processing with incremental feedback
Utilizing recurrent network topologies to produce case/role meaning representations for single sentences has become common practice in connectionist natural language processing systems. Typically, these systems train with the complete sentence meaning as the target output for the entire period that the sentence is being processed; i.e., the complete meaning is available starting with the first word of the sentence. Thus, the context feedback provided by these systems is non-incremental in that they use information about the sentence that has not yet been encountered in order to aid in the processing and learning tasks. SAIL1 is a connectionist natural language processing system which builds the sentence meaning representation incrementally, incorporating into the meaning only the information derived from words already processed
The REVERE project: Experiments with the application of probabilistic NLP to systems engineering
Despite natural language’s well-documented shortcomings as a medium for precise technical description, its use in software-intensive systems engineering remains inescapable. This poses many problems for engineers who must derive problem understanding and synthesise precise solution descriptions from free text. This is true both for the largely unstructured textual descriptions from which system requirements are derived, and for more formal documents, such as standards, which impose requirements on system development processes. This paper describes experiments that we have carried out in the REVERE project to investigate the use of probabilistic natural language processing techniques to provide systems engineering support.
