Semi-Supervised Single- and Multi-Domain Regression with Multi-Domain Training
We address the problems of multi-domain and single-domain regression based on
distinct and unpaired labeled training sets for each of the domains and a large
unlabeled training set from all domains. We formulate these problems as a
Bayesian estimation with partial knowledge of statistical relations. We propose
a worst-case design strategy and study the resulting estimators. Our analysis
explicitly accounts for the cardinality of the labeled sets and includes the
special cases in which one of the labeled sets is very large or, in the other
extreme, completely missing. We demonstrate our estimators in the context of
removing expressions from facial images and in the context of audio-visual word
recognition, and provide comparisons to several recently proposed multi-modal
learning algorithms. Comment: 24 pages, 6 figures, 2 tables
Using Statistical Models of Morphology in the Search for Optimal Units of Representation in the Human Mental Lexicon
Determining the optimal units for representing morphologically complex words in the mental lexicon is a central question in psycholinguistics. Here, we utilize advances in computational sciences to study human morphological processing using statistical models of morphology, particularly the unsupervised Morfessor model that works on the principle of optimization. The aim was to see what kind of model structure corresponds best to human word recognition costs for multimorphemic Finnish nouns: a model incorporating units resembling linguistically defined morphemes, a whole-word model, or a model that seeks an optimal balance between these two extremes. Our results showed that human word recognition was predicted best by a combination of two models: a model that decomposes words at some morpheme boundaries while keeping others unsegmented and a whole-word model. The results support dual-route models that assume that both decomposed and full-form representations are utilized to optimally process complex words within the mental lexicon. Peer reviewed
Amharic Speech Recognition for Speech Translation
State-of-the-art speech translation can be seen as a cascade of Automatic Speech Recognition, Statistical Machine Translation and Text-To-Speech synthesis. In this study, we experiment with Amharic speech recognition for Amharic-English speech translation in the tourism domain. Since there is no Amharic speech corpus, we developed a read-speech corpus of 7.43 hours in the tourism domain. The Amharic speech corpus was recorded after translating the standard Basic Traveler Expression Corpus (BTEC), under a normal working environment. In our ASR experiments, phoneme and syllable units are used for acoustic models, while morphemes and words are used for language models. Encouraging ASR results are achieved using morpheme-based language models and phoneme-based acoustic models, with recognition accuracies of 89.1%, 80.9%, 80.6%, and 49.3% at the character, morph, word and sentence levels respectively. We are now working towards designing Amharic-English speech translation by cascading components under different error correction algorithms.
Efficacy of Communicative Reading Strategies as an Instructional Approach for Adult Low-Ability Readers.
Twelve adult low-ability readers participated in a pretest-posttest control group study investigating the efficacy of Communicative Reading Strategies (CRS) as an instructional reading approach. Six adults received CRS instruction and constituted the experimental group. The remaining six adults received skill-based instruction and served as the control group. All participants demonstrated instructional level reading skills at or below a fifth grade level and completed 40 hours of instruction. Changes in performance on measures of word recognition, comprehension, and reading rate from pretest to posttest were used to compare CRS and control groups. Results of Mann-Whitney U analyses revealed that both methods of instruction were effective in improving word recognition and comprehension abilities for most subjects. For individual subjects and mean group gains, the word recognition and comprehension results favored the CRS group, although these differences did not reach statistical significance. Further analyses of the reading performance of CRS subjects revealed additional findings. Scaffolding provided by CRS interactions increased both the assisted word recognition level and assisted comprehension scores for most subjects at both pretest and posttest. Furthermore, reading gains made under scaffolded conditions at pretest were highly predictive of actual unassisted reading gains demonstrated after 40 hours of instruction. Measures of reading accuracy, fluency, rate, comprehension, and story retelling ability obtained from CRS subjects after every 10 hours of instruction were not representative of actual gains demonstrated at posttest.
A scalable hybrid decision system (HDS) for Roman word recognition using ANN SVM: Study case on Malay word recognition
An off-line handwriting recognition (OFHR) system is a computerized system that is capable of intelligently converting human handwritten data extracted from scanned paper documents into an equivalent text format. This paper studies a proposed OFHR system for Malaysian bank cheques written in the Malay language. The proposed system comprises three components, namely a character recognition system (CRS), a hybrid decision system and a lexical word classification system. Two types of feature extraction techniques have been used in the system, namely statistical and geometrical. Experiments show that the statistical features are reliable and accessible, and offer more accurate results. The CRS in this system was implemented using two individual classifiers, namely an adaptive multilayer feed-forward back-propagation neural network and a support vector machine. The results of this study are very promising and could generalize to the entire Malay lexical dictionary in future work toward scaled-up applications.
Teaching high frequency words to poor readers using flashcards : its effects on novel word acquisition, skill transfer to in-text word reading, and passage reading competencies : a thesis in partial fulfillment of the requirements for the degree of Master of Educational Psychology, Massey University, Albany, New Zealand
Several literacy reports published in the last decade have emphasised the large gap in the reading
attainment of children in New Zealand. A common barrier that prevents poor readers from catching up
to their peers is difficulty in reading fluency, which is theorised to represent underlying difficulty
in rapid and automatic word recognition. The ability to rapidly recognise a few common words,
also known as high frequency words (HFWs), may increase the fluency of reading the majority
of novel text. As such, the National Standards for literacy achievement outline the development
of basic HFW vocabulary by the end of the first few years at school. However, past research that
has investigated single word training has rarely used HFWs, and studies that have used HFWs have
scarcely investigated their transfer to in-text reading. Therefore, the aims of the current research
were to investigate HFW training and its influence on word reading accuracy, in-text
word reading, and passage reading accuracy, speed, and comprehension. Experiment 1 was a
single-case design carried out with one 8-year-old participant and was largely used to inform the
second experiment. Experiment 2 was a multiple-baseline design carried out with five 8- to 9-year-old
participants using a modified training procedure. Experiment 1 utilised visual analysis and
Cohen’s d effect size analysis whereas Experiment 2 also used statistical analysis, made possible
through the Wampold-Worsham method of randomisation incorporated into the experimental
design. The results of both experiments indicated that training facilitated word reading accuracy
but the successful transfer of target words to in-text reading was only observed in Experiment 2.
Post-training increases to passage reading accuracy, speed, and comprehension scores were not
apparent in either experiment. The main contribution of the current research is its applicability
to classroom practice. Another important contribution of the study to research practice is the rare
application of the Wampold-Worsham method of randomisation.
Statistical models of morphology predict eye-tracking measures during visual word recognition
We studied how statistical models of morphology that are built on different kinds of representational units, i.e., models emphasizing either holistic units or decomposition, perform in predicting human word recognition. More specifically, we studied the predictive power of such models at early vs. late stages of word recognition by using eye-tracking during two tasks. The tasks included a standard lexical decision task and a word recognition task that presumably places less emphasis on postlexical reanalysis and decision processes. The lexical decision results showed good performance of Morfessor models based on the Minimum Description Length optimization principle. Models which segment words at some morpheme boundaries and keep other boundaries unsegmented performed well both at early and late stages of word recognition, supporting dual- or multiple-route cognitive models of morphological processing. Statistical models based on full forms fared better in late than early measures. The results of the second, multi-word recognition task showed that early and late stages of processing often involve accessing morphological constituents, with the exception of short complex words. Late stages of word recognition additionally involve predicting upcoming morphemes on the basis of previous ones in multimorphemic words. The statistical models based fully on whole words did not fare well in this task. Thus, we assume that the good performance of such models in global measures such as gaze durations or reaction times in lexical decision largely stems from postlexical reanalysis or decision processes. This finding highlights the importance of considering task demands in the study of morphological processing. Peer reviewed
The Statistics of Subtypes: A Proposed Study Investigating Statistical Learning Across Subtypes of Dyslexia
Current research regarding dyslexia and its subtypes is inconsistent. There are discrepancies in the literature surrounding the causes and manifestations of dyslexia. Furthermore, there is very little research concerning the role of statistical learning in differentiating between subtypes of dyslexia. The purpose of the proposed study is to quantify the differences in statistical learning ability across three subtypes of dyslexia (i.e., phonological dyslexia, surface dyslexia, and deep dyslexia). It is predicted that participants with a dyslexia diagnosis of any subtype will be worse at using statistics to find word boundaries than control participants. Additionally, it is hypothesized that participants with surface dyslexia will express the highest capacity for statistical learning among the three subtypes. Finally, it is hypothesized that participants belonging to the deep dyslexia subgroup will express the lowest capacity for statistical learning. Participants from each of the four treatments (i.e., phonological dyslexia, surface dyslexia, deep dyslexia, and control) will be exposed to the same auditory nonsense word stream. After finishing the listening phase, all participants will complete a forced-choice recognition task. The task will be to indicate which of two sound strings sounds more like a word from the nonsense language. If the results of this study show that there are differences in statistical learning ability between different subtypes of dyslexia, treatments and interventions can be tailored more appropriately to individuals belonging to each subtype. Additionally, it will be possible to highlight early risk factors that can help with early identification of dyslexia in children.
False memories in recognition memory: Recollection or familiarity?
False recollection refers to the retrieval of contextual information associated with an event that has not occurred. For instance, during a recognition task, one might identify a nonstudied word presented at test as old because she remembers the font color of the word during study. Although instances such as this are rare and typically occur at rates of 0-5%, current models of recognition such as the Complementary Learning Systems (CLS) model and the Dual-Process Signal-Detection (DPSD) model do not contain a mechanism to account for their occurrence. Although both the CLS and DPSD models have support from studies demonstrating functional dissociations, neurophysiological dissociations, and behavioral findings of process dissociation, their ability to explain false memories has been more elusive; neither theory specifically addresses false recollection. Instead, such models have ignored false recollection as inconsequential noise in the data. The purpose of this dissertation was to determine whether the false recognition effect obtained by the Payne-Eakin paradigm was due to false recollection or familiarity. The Payne-Eakin paradigm is based on the PIER2 model, which theorizes that targets implicitly activated during study lead to the false recognition of a false-target pair. Using a modified version of the Payne-Eakin paradigm, we investigated the nature of the false recognition effect using a priori behavioral analyses and statistical modeling. The findings of this dissertation provide a step toward a more solid understanding of the cognitive mechanisms involved in the recognition of nonstudied items. This dissertation demonstrates that modeling false recollection is possible. The results of this dissertation suggest that, because current models of recognition do not provide a mechanism to account for false recollection, our understanding of recognition is incomplete.
The results highlight that the contribution of false recollection to recognition performance is an area in need of further development.