Joint morphological-lexical language modeling for processing morphologically rich languages with application to dialectal Arabic
Language modeling for an inflected language
such as Arabic poses new challenges for speech recognition and
machine translation due to its rich morphology. Rich morphology
results in large increases in out-of-vocabulary (OOV) rate and
poor language model parameter estimation in the absence of large
quantities of data. In this study, we present a joint
morphological-lexical language model (JMLLM) that takes
advantage of Arabic morphology. JMLLM combines
morphological segments with the underlying lexical items and
additional available information sources with regard to
morphological segments and lexical items in a single joint model.
Joint representation and modeling of morphological and lexical
items reduces the OOV rate and provides smooth probability
estimates while keeping the predictive power of whole words.
Speech recognition and machine translation experiments in
dialectal Arabic show improvements over word- and morpheme-based
trigram language models. We also show that as the
tightness of integration between different information sources
increases, both speech recognition and machine translation
performance improve.
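The OOV effect described in this abstract can be illustrated with a minimal sketch. It is not the JMLLM itself. The affix lists, the `segment` helper, and the transliterated toy words are all hypothetical, chosen only to show why counting over morphological segments can lower the OOV rate compared with counting over whole words.

```python
def oov_rate(train_tokens, test_tokens):
    """Fraction of test tokens whose type never occurs in the training tokens."""
    vocab = set(train_tokens)
    if not test_tokens:
        return 0.0
    unseen = sum(1 for tok in test_tokens if tok not in vocab)
    return unseen / len(test_tokens)


def segment(word, prefixes=("wa", "al"), suffixes=("ha", "at")):
    """Hypothetical toy segmenter: strip at most one prefix and one suffix.

    Real Arabic segmenters are far more involved; the affix lists here
    are illustrative assumptions only.
    """
    parts = []
    for p in prefixes:
        if word.startswith(p) and len(word) > len(p) + 1:
            parts.append(p + "+")
            word = word[len(p):]
            break
    tail = None
    for s in suffixes:
        if word.endswith(s) and len(word) > len(s) + 1:
            tail = "+" + s
            word = word[:-len(s)]
            break
    parts.append(word)
    if tail:
        parts.append(tail)
    return parts


def segment_all(words):
    """Flatten the segments of every word into one token list."""
    return [seg for w in words for seg in segment(w)]


# Toy transliterated data (assumed, not from the paper).
train_words = ["kitab", "albayt", "baytha"]
test_words = ["alkitab", "kitabha"]

word_oov = oov_rate(train_words, test_words)
morph_oov = oov_rate(segment_all(train_words), segment_all(test_words))
print(f"word-level OOV: {word_oov:.2f}, segment-level OOV: {morph_oov:.2f}")
```

In this toy setting every test word is unseen as a whole word, while all of its segments appear in training. The abstract's joint model aims to gain this coverage without giving up the predictive power of whole words.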
New Method for Optimization of License Plate Recognition system with Use of Edge Detection and Connected Component
License plate recognition plays an important role in traffic monitoring
and parking management systems. This paper proposes a fast, real-time method
that is well suited to locating tilted and poor-quality plates. First, the
image is converted to binary using an adaptive threshold. Then edge
detection and morphological operations are used to locate the plate number.
Finally, if the plate is tilted, the tilt is corrected. The method was
tested on a data set from another paper, whose images vary in background,
distance, and viewing angle; the correct extraction rate of
plates reached 98.66%.
Comment: 3rd IEEE International Conference on Computer and Knowledge
Engineering (ICCKE 2013), October 31 & November 1, 2013, Ferdowsi Universit
Mashha
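The first step of the pipeline in this abstract, binarization with an adaptive threshold, can be sketched as follows. The abstract does not say which adaptive variant the authors used. This sketch assumes a simple local-mean threshold, and the function name, `window`, and `offset` parameters are illustrative choices rather than the paper's.

```python
def adaptive_threshold(gray, window=3, offset=0):
    """Binarize a grayscale image given as a list of rows of ints (0..255).

    A pixel becomes 1 when it is brighter than the mean of its
    window x window neighbourhood minus `offset`, otherwise 0.
    Neighbourhoods are clipped at the image border.
    (Local-mean thresholding is an assumption, not the paper's stated method.)
    """
    h = len(gray)
    w = len(gray[0]) if h else 0
    r = window // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0
            count = 0
            for yy in range(max(0, y - r), min(h, y + r + 1)):
                for xx in range(max(0, x - r), min(w, x + r + 1)):
                    total += gray[yy][xx]
                    count += 1
            local_mean = total / count
            out[y][x] = 1 if gray[y][x] > local_mean - offset else 0
    return out


# A single bright pixel on a dark background (toy input).
image = [
    [10, 10, 10],
    [10, 200, 10],
    [10, 10, 10],
]
binary = adaptive_threshold(image)
for row in binary:
    print(row)
```

Because the threshold is computed per neighbourhood rather than once for the whole image, this kind of binarization tends to cope better with uneven lighting, which is one plausible reason to use it on poor-quality plates.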
Towards Understanding Egyptian Arabic Dialogues
Labelling a user's utterances to understand their intent, known as
Dialogue Act (DA) classification, is considered the key component of the
dialogue language understanding layer in automatic dialogue systems. In this
paper, we propose a novel approach to labelling users' utterances in Egyptian
spontaneous dialogues and instant messages using a Machine Learning (ML)
approach without relying on any special lexicons, cues, or rules. Due to the
lack of an Egyptian dialect dialogue corpus, the system is evaluated on a
multi-genre corpus of 4725 utterances from three domains, collected and
annotated manually from Egyptian call centers. The system achieves an F1
score of 70.36% over all domains.
Comment: arXiv admin note: substantial text overlap with arXiv:1505.0308