Symbol Emergence in Robotics: A Survey
Humans can learn the use of language through physical interaction with their
environment and semiotic communication with other people. It is very important
to obtain a computational understanding of how humans can form a symbol system
and obtain semiotic skills through their autonomous mental development.
Recently, many studies have been conducted on the construction of robotic
systems and machine-learning methods that can learn the use of language through
embodied multimodal interaction with their environment and other systems.
Understanding human social interactions and developing a robot that can
smoothly communicate with human users in the long term requires an
understanding of the dynamics of symbol systems and is crucially important. The
embodied cognition and social interaction of participants gradually change a
symbol system in a constructive manner. In this paper, we introduce a field of
research called symbol emergence in robotics (SER). SER is a constructive
approach towards an emergent symbol system. The emergent symbol system is
socially self-organized through both semiotic communications and physical
interactions with autonomous cognitive developmental agents, i.e., humans and
developmental robots. Specifically, we describe some state-of-the-art research
topics concerning SER, e.g., multimodal categorization, word discovery, and a
double articulation analysis, that enable a robot to obtain words and their
embodied meanings from raw sensory-motor information, including visual
information, haptic information, auditory information, and acoustic speech
signals, in a totally unsupervised manner. Finally, we suggest future
directions of research in SER.
Comment: submitted to Advanced Robotic
Japanese/English Cross-Language Information Retrieval: Exploration of Query Translation and Transliteration
Cross-language information retrieval (CLIR), where queries and documents are
in different languages, has of late become one of the major topics within the
information retrieval community. This paper proposes a Japanese/English CLIR
system, in which we combine query translation and retrieval modules. We
currently target the retrieval of technical documents, and therefore the
performance of our system is highly dependent on the quality of the translation
of technical terms. However, technical term translation is still
problematic in that technical terms are often compound words, and thus new
terms are progressively created by combining existing base words. In addition,
Japanese often represents loanwords using its special phonograms.
Consequently, existing dictionaries find it difficult to achieve sufficient
coverage. To counter the first problem, we produce a Japanese/English
dictionary for base words, and translate compound words on a word-by-word
basis. We also use a probabilistic method to resolve translation ambiguity. For
the second problem, we use a transliteration method, which maps words
unlisted in the base word dictionary to their phonetic equivalents in the
target language. We evaluate our system using a test collection for CLIR, and
show that both the compound word translation and transliteration methods
improve the system performance.
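The word-by-word compound translation with probabilistic disambiguation described in this abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the paper's method or data: the romanized toy entries in `BASE_DICT`, the `COLLOCATION` scores, the `DEFAULT_COLLOCATION` fallback, and the scoring rule (product of per-word translation probabilities and adjacent-pair collocation scores) are all hypothetical. The transliteration step for unlisted words is not shown.

```python
from itertools import product

# Hypothetical base-word dictionary: source base word -> {candidate: probability}.
# Entries and probabilities are invented for illustration only.
BASE_DICT = {
    "joho": {"information": 0.9, "intelligence": 0.1},
    "kensaku": {"retrieval": 0.6, "search": 0.4},
}

# Hypothetical target-language scores for adjacent word pairs.
COLLOCATION = {
    ("information", "retrieval"): 0.8,
    ("information", "search"): 0.3,
}
DEFAULT_COLLOCATION = 0.01  # assumed fallback score for unseen pairs


def translate_compound(base_words):
    """Translate a compound word by word, keeping the best-scoring combination.

    Raises KeyError for a base word missing from BASE_DICT; in the system
    described above, such words would go to transliteration instead.
    """
    candidate_lists = [list(BASE_DICT[w].items()) for w in base_words]
    best_words, best_score = None, -1.0
    for combo in product(*candidate_lists):
        words = [word for word, _ in combo]
        score = 1.0
        for _, prob in combo:
            score *= prob  # per-word translation probability
        for pair in zip(words, words[1:]):
            score *= COLLOCATION.get(pair, DEFAULT_COLLOCATION)
        if score > best_score:
            best_words, best_score = words, score
    return best_words, best_score


if __name__ == "__main__":
    print(translate_compound(["joho", "kensaku"]))
```

Enumerating every combination is only practical for short compounds; a real system would presumably need a more efficient search and corpus-derived statistics.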
Paradigm Completion for Derivational Morphology
The generation of complex derived word forms has been an overlooked problem
in NLP; we fill this gap by applying neural sequence-to-sequence models to the
task. We overview the theoretical motivation for a paradigmatic treatment of
derivational morphology, and introduce the task of derivational paradigm
completion as a parallel to inflectional paradigm completion. State-of-the-art
neural models, adapted from the inflection task, are able to learn a range of
derivation patterns, and outperform a non-neural baseline by 16.4%. However,
due to semantic, historical, and lexical considerations involved in
derivational morphology, future work will be needed to achieve performance
parity with inflection-generating systems.
Comment: EMNLP 201
Automatic Segmentation and Part-Of-Speech Tagging For Tibetan: A First Step Towards Machine Translation
This paper presents what we believe to be the first reported work on Tibetan machine translation (MT). Of the three conceptually distinct components of an MT system (analysis, transfer, and generation), the first phase, consisting of POS tagging, has been successfully completed. The combined POS tagger / word-segmenter was manually constructed as a rule-based multi-tagger relying on the Wilson formulation of Tibetan grammar. Partial parsing was also performed in combination with POS-tag sequence disambiguation. The component was evaluated on the task of document indexing for Information Retrieval (IR). Preliminary analysis indicated slightly better (though statistically comparable) performance than n-gram based approaches at a known-item IR task. Although segmentation is application specific, error analysis placed segmentation accuracy at 99%; the accuracy of the POS tagger is also estimated at 99% based on IR error analysis and random sampling.