3,126 research outputs found
A Factoid Question Answering System for Vietnamese
In this paper, we describe the development of an end-to-end factoid question
answering system for the Vietnamese language. This system combines both
statistical models and ontology-based methods in a chain of processing modules
to provide high-quality mappings from natural language text to entities. We
present the challenges in the development of such an intelligent user interface
for an isolating language like Vietnamese and show that techniques developed
for inflectional languages cannot be applied "as is". Our question answering
system can answer a wide range of general knowledge questions with promising
accuracy on a test set.Comment: In the proceedings of the HQA'18 workshop, The Web Conference
Companion, Lyon, Franc
BAStat : New Statistical Resources at the Bavarian Archive for Speech Signals
A new type of language resource ’BAStat’ has been released by the Bavarian Archive for Speech Signals. In contrast to primary resources like speech and text corpora BAStat comprises statistical estimates based on a number of primary resources: first and second order occurrence probability of phones, syllables and words, duration statistics, probabilities of pronunciation variants of words and probabilities of context information. Unlike other statistical speech resources BAStat is based solely on recordings of conversational German and therefore models spoken language. It consists of 7-bit ASCII tables and matrices to maximize inter-operability between different platforms and can be downloaded from the BAS web-site. This paper gives a detailed description about the empirical basis, the contained data types, some interesting interpretations and a brief comparison to the text-based statistical resource CELEX
Memory-Based Lexical Acquisition and Processing
Current approaches to computational lexicology in language technology are
knowledge-based (competence-oriented) and try to abstract away from specific
formalisms, domains, and applications. This results in severe complexity,
acquisition and reusability bottlenecks. As an alternative, we propose a
particular performance-oriented approach to Natural Language Processing based
on automatic memory-based learning of linguistic (lexical) tasks. The
consequences of the approach for computational lexicology are discussed, and
the application of the approach on a number of lexical acquisition and
disambiguation tasks in phonology, morphology and syntax is described.Comment: 18 page
- …