    Semi-Supervised Single- and Multi-Domain Regression with Multi-Domain Training

    We address the problems of multi-domain and single-domain regression based on distinct and unpaired labeled training sets for each of the domains and a large unlabeled training set from all domains. We formulate these problems as Bayesian estimation with partial knowledge of statistical relations. We propose a worst-case design strategy and study the resulting estimators. Our analysis explicitly accounts for the cardinality of the labeled sets and includes the special cases in which one of the labeled sets is very large or, at the other extreme, completely missing. We demonstrate our estimators in the context of removing expressions from facial images and in the context of audio-visual word recognition, and provide comparisons to several recently proposed multi-modal learning algorithms. Comment: 24 pages, 6 figures, 2 tables

    Using Statistical Models of Morphology in the Search for Optimal Units of Representation in the Human Mental Lexicon

    Determining the optimal units for representing morphologically complex words in the mental lexicon is a central question in psycholinguistics. Here, we utilize advances in computational sciences to study human morphological processing using statistical models of morphology, particularly the unsupervised Morfessor model that works on the principle of optimization. The aim was to see what kind of model structure corresponds best to human word recognition costs for multimorphemic Finnish nouns: a model incorporating units resembling linguistically defined morphemes, a whole-word model, or a model that seeks an optimal balance between these two extremes. Our results showed that human word recognition was predicted best by a combination of two models: a model that decomposes words at some morpheme boundaries while keeping others unsegmented and a whole-word model. The results support dual-route models that assume that both decomposed and full-form representations are utilized to optimally process complex words within the mental lexicon. Peer reviewed

    Amharic Speech Recognition for Speech Translation

    International audience. State-of-the-art speech translation can be seen as a cascade of Automatic Speech Recognition (ASR), Statistical Machine Translation and Text-To-Speech synthesis. In this study we experiment with Amharic speech recognition for Amharic-English speech translation in the tourism domain. Since there is no Amharic speech corpus, we developed a read-speech corpus of 7.43 hours in the tourism domain. The Amharic speech corpus was recorded under a normal working environment after translating the standard Basic Traveler Expression Corpus (BTEC). In our ASR experiments, phoneme and syllable units are used for acoustic models, while morphemes and words are used for language models. Encouraging ASR results are achieved using morpheme-based language models and phoneme-based acoustic models, with recognition accuracies of 89.1%, 80.9%, 80.6%, and 49.3% at the character, morph, word and sentence levels, respectively. We are now working towards designing Amharic-English speech translation by cascading components under different error correction algorithms.

    Efficacy of Communicative Reading Strategies as an Instructional Approach for Adult Low-Ability Readers.

    Twelve adult low-ability readers participated in a pretest-posttest control group study investigating the efficacy of Communicative Reading Strategies (CRS) as an instructional reading approach. Six adults received CRS instruction and constituted the experimental group. The remaining six adults received skill-based instruction and served as the control group. All participants demonstrated instructional level reading skills at or below a fifth grade level and completed 40 hours of instruction. Changes in performance on measures of word recognition, comprehension, and reading rate from pretest to posttest were used to compare CRS and control groups. Results of Mann-Whitney U analyses revealed that both methods of instruction were effective in improving word recognition and comprehension abilities for most subjects. For individual subjects and mean group gains, the word recognition and comprehension results favored the CRS group, although these differences did not reach statistical significance. Further analyses of the reading performance of CRS subjects revealed additional findings. Scaffolding provided by CRS interactions increased both the assisted word recognition level and assisted comprehension scores for most subjects at both pretest and posttest. Furthermore, reading gains made under scaffolded conditions at pretest were highly predictive of actual unassisted reading gains demonstrated after 40 hours of instruction. Measures of reading accuracy, fluency, rate, comprehension, and story retelling ability obtained from CRS subjects after every 10 hours of instruction were not representative of actual gains demonstrated at posttest.
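
    As a rough illustration of the kind of two-group comparison the abstract above describes, the Mann-Whitney U statistic can be computed by counting, over all cross-group pairs, how often a score from one group exceeds a score from the other. The sketch below is not taken from the study: the function name `mann_whitney_u` and the gain scores are hypothetical, the abstract reports no raw data, and no p-value is computed here.

```python
def mann_whitney_u(x, y):
    """Return (U_x, U_y) for two independent samples.

    U_x counts the pairs (a, b), a from x and b from y, with a > b;
    tied pairs contribute 0.5. U_x + U_y always equals len(x) * len(y).
    """
    u_x = 0.0
    for a in x:
        for b in y:
            if a > b:
                u_x += 1.0
            elif a == b:
                u_x += 0.5
    u_y = len(x) * len(y) - u_x
    return u_x, u_y


# Hypothetical pretest-to-posttest gain scores, invented for illustration only.
crs_gains = [5, 7, 8]
control_gains = [1, 2, 3]
u_crs, u_control = mann_whitney_u(crs_gains, control_gains)
```

    With six participants per group, as in the study, the smaller of the two U values would typically be compared against a critical value from a table or passed to a statistics package for an exact p-value.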

    A scalable hybrid decision system (HDS) for Roman word recognition using ANN SVM: Study case on Malay word recognition

    An off-line handwriting recognition (OFHR) system is a computerized system that is capable of intelligently converting human handwritten data extracted from scanned paper documents into an equivalent text format. This paper studies a proposed OFHR system for Malaysian bank cheques written in the Malay language. The proposed system comprises three components, namely a character recognition system (CRS), a hybrid decision system and a lexical word classification system. Two types of feature extraction techniques have been used in the system, namely statistical and geometrical. Experiments show that the statistical features are reliable and accessible and yield more accurate results. The CRS in this system was implemented using two individual classifiers, namely an adaptive multilayer feed-forward back-propagation neural network and a support vector machine. The results of this study are very promising and could generalize to the entire Malay lexical dictionary in future work toward scaled-up applications.

    Teaching high frequency words to poor readers using flashcards : its effects on novel word acquisition, skill transfer to in-text word reading, and passage reading competencies : a thesis in partial fulfillment of the requirements for the degree of Master of Educational Psychology, Massey University, Albany, New Zealand

    Several literacy reports published in the last decade have emphasised the large gap in the reading attainment of children in New Zealand. A common barrier that prevents poor readers from catching up with their peers is difficulty in reading fluency, which is theorised to represent underlying difficulty in rapid and automatic word recognition. The ability to rapidly recognise a few common words, also known as high frequency words (HFWs), may increase the fluency of reading the majority of novel text. As such, the National Standards for literacy achievement outline the development of basic HFW vocabulary by the end of the first few years at school. However, past research on single word training has rarely used HFWs, and the studies that have used HFWs have scarcely investigated their transfer to in-text reading. Therefore, the aim of the current research was to investigate HFW training and its influence on word reading accuracy, in-text word reading, and passage reading accuracy, speed, and comprehension. Experiment 1 was a single case design carried out with one 8 year old participant and was largely used to inform the second experiment. Experiment 2 was a multiple baseline design carried out with five 8-9 year old participants using a modified training procedure. Experiment 1 utilised visual analysis and Cohen’s d effect size analysis, whereas Experiment 2 also used statistical analysis, made possible through the Wampold-Worsham method of randomisation incorporated into the experimental design. The results of both experiments indicated that training facilitated word reading accuracy, but the successful transfer of target words to in-text reading was only observed in Experiment 2. Post-training increases to passage reading accuracy, speed, and comprehension scores were not apparent in either experiment. The main contribution of the current research is its applicability to classroom practice. Another important contribution of the study to research practice is the rare application of the Wampold-Worsham method of randomisation.

    Statistical models of morphology predict eye-tracking measures during visual word recognition

    We studied how statistical models of morphology that are built on different kinds of representational units, i.e., models emphasizing either holistic units or decomposition, perform in predicting human word recognition. More specifically, we studied the predictive power of such models at early vs. late stages of word recognition by using eye-tracking during two tasks. The tasks included a standard lexical decision task and a word recognition task that presumably places less emphasis on postlexical reanalysis and decision processes. The lexical decision results showed good performance of Morfessor models based on the Minimum Description Length optimization principle. Models which segment words at some morpheme boundaries and keep other boundaries unsegmented performed well both at early and late stages of word recognition, supporting dual- or multiple-route cognitive models of morphological processing. Statistical models based on full forms fared better in late than early measures. The results of the second, multi-word recognition task showed that early and late stages of processing often involve accessing morphological constituents, with the exception of short complex words. Late stages of word recognition additionally involve predicting upcoming morphemes on the basis of previous ones in multimorphemic words. The statistical models based fully on whole words did not fare well in this task. Thus, we assume that the good performance of such models in global measures such as gaze durations or reaction times in lexical decision largely stems from postlexical reanalysis or decision processes. This finding highlights the importance of considering task demands in the study of morphological processing. Peer reviewed

    The Statistics of Subtypes: A Proposed Study Investigating Statistical Learning Across Subtypes of Dyslexia

    Current research regarding dyslexia and its subtypes is inconsistent. There are discrepancies in the literature surrounding the causes and manifestations of dyslexia. Furthermore, there is very little research concerning the role of statistical learning in differentiating between subtypes of dyslexia. The purpose of the proposed study is to quantify the differences in statistical learning ability across three subtypes of dyslexia (i.e., phonological dyslexia, surface dyslexia, and deep dyslexia). It is predicted that participants with a dyslexia diagnosis of any subtype will be worse at using statistics to find word boundaries than control participants. Additionally, it is hypothesized that participants with surface dyslexia will express the highest capacity for statistical learning among the three subtypes. Finally, it is hypothesized that participants belonging to the deep dyslexia subgroup will express the lowest capacity for statistical learning. Participants from each of the four groups (i.e., phonological dyslexia, surface dyslexia, deep dyslexia, and control) will be exposed to the same auditory nonsense word stream. After finishing the listening phase, all participants will complete a forced-choice recognition task. The task will be to indicate which of two sound strings sounds more like a word from the nonsense language. If the results of this study show that there are differences in statistical learning ability between different subtypes of dyslexia, treatments and interventions can be tailored more appropriately to individuals belonging to each subtype. Additionally, it will be possible to highlight early risk factors that can help with early identification of dyslexia in children.
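
    "Using statistics to find word boundaries" in a nonsense word stream is commonly modelled with transitional probabilities between adjacent syllables: within a word the next syllable is highly predictable, while across a word boundary it is not. The sketch below is an assumption about how such a computation could look, not the proposed study's procedure. The three nonsense words, the stream order, the 0.9 threshold, and the function names are all hypothetical.

```python
from collections import Counter


def transitional_probabilities(syllables):
    """Map each adjacent pair (a, b) to P(b follows a) in the stream."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])  # the last syllable has no successor
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}


def segment(syllables, threshold):
    """Split the stream wherever the transitional probability drops below threshold."""
    tp = transitional_probabilities(syllables)
    words, current = [], [syllables[0]]
    for a, b in zip(syllables, syllables[1:]):
        if tp[(a, b)] < threshold:
            words.append("".join(current))
            current = []
        current.append(b)
    words.append("".join(current))
    return words


# Hypothetical three-syllable nonsense words, invented for illustration.
LEXICON = {
    "A": ["bi", "da", "ku"],
    "B": ["pa", "do", "ti"],
    "C": ["go", "la", "bu"],
}
word_order = ["A", "B", "C", "B", "A", "C", "A", "B"]
stream = [syl for key in word_order for syl in LEXICON[key]]
recovered = segment(stream, threshold=0.9)
```

    In this toy stream every within-word transition has probability 1.0 and every between-word transition has probability 2/3 or lower, so a threshold of 0.9 recovers the original word sequence. Real stimuli are much longer and are heard rather than read, so this is only a model of the regularity that listeners are thought to track.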

    False memories in recognition memory: Recollection or familiarity?

    False recollection refers to the retrieval of contextual information associated with an event that has not occurred. For instance, during a recognition task, a participant might identify a nonstudied word presented at test as old because they remember the font color of the word during study. Although instances such as this are rare and typically occur at a rate of 0-5%, current models of recognition such as the Complementary Learning Systems (CLS) model and the Dual-Process Signal-Detection (DPSD) model do not contain a mechanism to account for their occurrence. Although both the CLS and DPSD models have support from studies demonstrating functional dissociations, neurophysiological dissociations, and behavioral findings of process dissociation, their ability to explain false memories has been more elusive; neither theory specifically addresses false recollection. Instead, such models have ignored false recollection as inconsequential noise in the data. The purpose of this dissertation was to determine whether the false recognition effect obtained by the Payne-Eakin paradigm was due to false recollection or familiarity. The Payne-Eakin paradigm is based on the PIER2 model, which theorizes that targets implicitly activated during study lead to the false recognition of a false-target pair. Using a modified version of the Payne-Eakin paradigm, we investigated the nature of the false recognition effect using a priori behavioral analyses and statistical modeling. The findings of this dissertation provide a step toward a more solid understanding of the cognitive mechanisms involved in the recognition of nonstudied items. This dissertation demonstrates that modeling false recollection is possible. The results of this dissertation suggest that, because current models of recognition do not provide a mechanism to account for false recollection, our understanding of recognition is incomplete. The results highlight that the contribution of false recollection to recognition performance is an area in need of further development.