    Off-line Arabic Handwriting Recognition System Using Fast Wavelet Transform

    In this research, an off-line handwriting recognition system for the Arabic alphabet is introduced. The system contains three main stages: preprocessing, segmentation, and recognition. In the preprocessing stage, the Radon transform was used in the design of algorithms for page, line, and word skew correction as well as for word slant correction. In the segmentation stage, a Hough transform approach was used for line extraction. For line-to-word and word-to-character segmentation, a statistical method using a mathematical representation of the binary images of lines and words was used. Unlike most current handwriting recognition systems, our system simulates the human mechanism for image recognition, in which images are encoded and saved in memory as groups according to their similarity to each other. Characters are decomposed into coefficient vectors using the fast wavelet transform; the vectors that represent a character in its different possible shapes are then saved as groups, with one representative for each group. Recognition is achieved by comparing the vector of the character to be recognized with the group representatives. Experiments showed that the proposed system achieves the recognition task with 90.26% accuracy. The system needs at most 3.41 seconds to recognize a single character in a text of 15 lines, where each line has 10 words on average.

    A survey of computer uses in music

    This thesis covers research into the mathematical basis inherent in music, including a review of projects related to optical character recognition (OCR) of musical symbols. Research was done on fractals, which can create new pieces by assigning pitches to numbers. Existing musical pieces can be taken apart and reassembled, creating new ideas for composers. The understanding of musical notation is covered, and its necessity for computer recognition of a music sheet for editing and reproduction purposes is explained. The first phase of a musical OCR system was created in this thesis: the recognition of staff lines on a good-quality image. Modifications will be needed to handle noise and tilted images that may result from scanning.

    The system integration and verification testing of an orbital maneuvering vehicle for an air bearing floor

    The Teleoperator and Robotics Evaluation Facility (TOREF) is composed of a 4,000-square-foot precision air bearing floor, the Teleoperator Motion Base, the Target Motion and Support Simulator, and mock-ups of the Hubble Space Telescope, the Multi-mission Modular Spacecraft, and the Orbital Maneuvering Vehicle (OMV). The TOREF and its general capabilities to support the OMV and other remote system simulations, the facility operating procedures and requirements, and the results of generic OMV investigations are summarized.

    Recognizing Characteristics of Individuals and Texts Based on Quantitative Analysis of Reading Behavior

    Degree type: doctorate (course-based). University of Tokyo (東京大学)

    Using neuro-cognitive modelling to link attention deficits to structural and functional brain changes

    ‘Visual attention’ is an emergent property of interconnected neural networks, in which the interconnections are biased to promote targets over distracting stimuli. It has been shown that the efficiency of the attention system is lost after many kinds of brain damage, with each presumably affecting different aspects of basic visual attention functions. Yet our understanding of these processes is limited by the methodological shortcomings of classical neuropsychological assessment. The overarching goal of the current thesis was to overcome these constraints and thereby extend the link between attention deficits and underlying brain changes. The approach used here incorporates parametric measurement of visual attention derived from the computational Theory of Visual Attention (TVA, Bundesen, 1990) and modern magnetic resonance imaging techniques. Project 1 of the current thesis applied a combined TVA-neuroimaging analysis in a neurodevelopmental model (preterm birth) to relate attention deficits to changes in functional connectivity networks. We found that preterm born adults, compared with full-term born adults, show a selective reduction of visual short-term memory capacity. The remarkable changes we observed in attention-related large-scale brain networks of the occipital and posterior parietal cortices were most pronounced in those preterm born individuals with the most preserved attention functions. This finding was interpreted as evidence for a compensatory reorganization of functional connectivity that ameliorates the adverse consequences of preterm birth on visual short-term memory. Project 2 of this thesis applied a combined TVA-neuroimaging analysis in a neurodegenerative model (posterior cortical atrophy) to relate attention deficits to structural changes in grey and white matter morphometry. Compared to healthy control participants, patients with posterior cortical atrophy suffered from a selective disturbance of visual processing speed. The individual degree of processing speed slowing was a valid predictor of the severity of simultanagnosia, the core symptom in this clinical condition. We further found widespread atrophy in occipital and parietal brain areas and, to a smaller degree, in temporal brain areas. White matter degeneration in the superior parietal lobe, rather than atrophy of any grey matter cluster, was significantly associated with patients’ impaired processing speed. Based on these results, we propose that disruption of white matter pathways, especially within the superior parietal lobe, leads to reduced processing speed, which then results in the overt clinical symptoms of simultanagnosia. Altogether, the projects of the current thesis expanded the link between specific attention deficits and underlying brain damage by using neuro-cognitive modelling. We demonstrated that parametric measurements of attention, in the role of intermediate cognitive constructs, facilitate the mapping between etiological factors and behavioral outcomes. Identifying predictable behavior-brain relationships in attention disorders may offer new perspectives for diagnosis and treatment. The clinical application of an integrated TVA-neuroimaging analysis could additionally complement insights from healthy participants toward understanding the principles of normal visual attention and identifying their neuronal basis.

    Visual-Linguistic Semantic Alignment: Fusing Human Gaze and Spoken Narratives for Image Region Annotation

    Advanced image-based application systems such as image retrieval and visual question answering depend heavily on semantic image region annotation. However, improvements in image region annotation are limited because of our inability to understand how humans, the end users, process these images and image regions. In this work, we expand a framework for capturing image region annotations in which interpreting an image is influenced by the end user's visual perception skills, conceptual knowledge, and task-oriented goals. Human image understanding is reflected by individuals' visual and linguistic behaviors, but the meaningful computational integration and interpretation of their multimodal representations (e.g. gaze, text) remain a challenge. Our work explores the hypothesis that eye movements can help us understand experts' perceptual processes and that spoken language descriptions can reveal conceptual elements of image inspection tasks. We propose that there exists a meaningful relation between gaze, spoken narratives, and image content. Using unsupervised bitext alignment, we create meaningful mappings between participants' eye movements (which reveal key areas of images) and spoken descriptions of those images. The resulting alignments are then used to annotate image regions with concept labels. Our alignment accuracy exceeds that of baseline alignments obtained using either simultaneous or fixed-delay temporal correspondence. Additionally, a comparison of alignment accuracy between a method that identifies clusters in the images based on eye movements and a method that identifies clusters using image features shows that the two approaches perform well on different types of images and concept labels. This suggests that an image annotation framework could integrate information from more than one technique to handle heterogeneous images. The resulting alignments can be used to create a database of low-level image features and high-level semantic annotations corresponding to perceptually important image regions. We demonstrate the applicability of the proposed framework with two datasets: one consisting of general-domain images and another with images from the domain of medicine. This work is an important contribution toward the highly challenging problem of fusing human-elicited multimodal data sources, a problem that will become increasingly important as low-resource scenarios become more common.

    Thalamic bursts modulate cortical synchrony locally to switch between states of global functional connectivity in a cognitive task

    Performing a cognitive task requires going through a sequence of functionally diverse stages. Although it is typically assumed that these stages are characterized by distinct states of cortical synchrony that are triggered by sub-cortical events, little reported evidence supports this hypothesis. To test this hypothesis, we first identified cognitive stages in single-trial MEG data of an associative recognition task, showing with a novel method that each stage begins with local modulations of synchrony followed by a state of directed functional connectivity. Second, we developed the first whole-brain model that can simulate cortical synchrony throughout a task. The model suggests that the observed synchrony is caused by thalamocortical bursts at the onset of each stage, targeted at cortical synapses and interacting with the structural anatomical connectivity. These findings confirm that cognitive stages are defined by distinct states of cortical synchrony and explain the network-level mechanisms necessary for reaching stage-dependent synchrony states.

    Cognitive control of attention, emotion, and memory : an ERP study

    Unwanted retrieval of negative memories can be problematic for many clinical populations. The Think/No-Think (T/NT) task (Anderson & Green, 2001) is a new paradigm for studying cognitive control during cued recall. In this task, participants view a cue item and are asked to consciously retrieve (think) or interrupt retrieval of (no-think) the associated target item. Eyer (2009) found that self-reported mindfulness was correlated with T/NT cued recall, suggesting a relationship between control of memory retrieval and a general cognitive control skill. The current study measured event-related potentials (ERPs; i.e., electrical brain responses time-locked to cue presentation) for negative and neutral stimuli on the T/NT task to assess cognitive control during retrieval. Method: Participants (N = 35) completed questionnaires (e.g., mindfulness, intrusive thoughts) and cognitive tasks related to cognitive control (e.g., attention, working memory span). Then, ERPs were recorded during the T/NT task, followed by a final cued recall test. Results: Analyses of ERPs found evidence to support somewhat separable neural networks for control of memory retrieval and for processing the emotional content of the target pictures, with some time windows exhibiting only a main effect of strategy or of emotional valence. However, there was widespread evidence for interactions of these subsystems across a range of latencies after cue presentation. Of particular note was a significant Strategy × Valence interaction for the early P1 component (125–164 ms). The overall size of the N2 peak (250–324 ms) at frontal electrode sites was correlated with a wide range of self-report and cognitive test measures of cognitive control. Discussion: The present study adds to knowledge of the timing of control processes during performance of the T/NT task through its use of ERP methodology. The effect of the emotional valence of the to-be-recalled target on the early P1 ERP component suggests surprisingly early emotional processing during memory retrieval. The present results also suggest that at least some of the control processes used during the T/NT task are part of a larger general-purpose cognitive control system. These results suggest that individual traits provide important and varying influences on the cognitive control of emotional memories.