14,473 research outputs found

    Speaker Normalization Using Cortical Strip Maps: A Neural Model for Steady State Vowel Identification

    Full text link
    Auditory signals of speech are speaker-dependent, but representations of language meaning are speaker-independent. Such a transformation enables speech to be understood from different speakers. A neural model is presented that performs speaker normalization to generate a pitchindependent representation of speech sounds, while also preserving information about speaker identity. This speaker-invariant representation is categorized into unitized speech items, which input to sequential working memories whose distributed patterns can be categorized, or chunked, into syllable and word representations. The proposed model fits into an emerging model of auditory streaming and speech categorization. The auditory streaming and speaker normalization parts of the model both use multiple strip representations and asymmetric competitive circuits, thereby suggesting that these two circuits arose from similar neural designs. The normalized speech items are rapidly categorized and stably remembered by Adaptive Resonance Theory circuits. Simulations use synthesized steady-state vowels from the Peterson and Barney [J. Acoust. Soc. Am. 24, 175-184 (1952)] vowel database and achieve accuracy rates similar to those achieved by human listeners. These results are compared to behavioral data and other speaker normalization models.National Science Foundation (SBE-0354378); Office of Naval Research (N00014-01-1-0624

    Speaker Normalization Using Cortical Strip Maps: A Neural Model for Steady State vowel Categorization

    Full text link
    Auditory signals of speech are speaker-dependent, but representations of language meaning are speaker-independent. The transformation from speaker-dependent to speaker-independent language representations enables speech to be learned and understood from different speakers. A neural model is presented that performs speaker normalization to generate a pitch-independent representation of speech sounds, while also preserving information about speaker identity. This speaker-invariant representation is categorized into unitized speech items, which input to sequential working memories whose distributed patterns can be categorized, or chunked, into syllable and word representations. The proposed model fits into an emerging model of auditory streaming and speech categorization. The auditory streaming and speaker normalization parts of the model both use multiple strip representations and asymmetric competitive circuits, thereby suggesting that these two circuits arose from similar neural designs. The normalized speech items are rapidly categorized and stably remembered by Adaptive Resonance Theory circuits. Simulations use synthesized steady-state vowels from the Peterson and Barney [J. Acoust. Soc. Am. 24, 175-184 (1952)] vowel database and achieve accuracy rates similar to those achieved by human listeners. These results are compared to behavioral data and other speaker normalization models.National Science Foundation (SBE-0354378); Office of Naval Research (N00014-01-1-0624

    Integration of geographic information system and RADARSAT synthetic aperture radar data using a self-organizing map network as compensation for realtime ground data in automatic image classification

    Get PDF
    The paper presents results of using advanced techniques such as Self-Organizing feature Map (SOM) to incorporate a GIS data layer to compensate for the limited amount of real-time ground-truth data available for land-use and land-cover mapping in wet-season conditions in Bangladesh based on multi-temporal RADARSAT-1 SAR images. The experimental results were compared with those of traditional statistical classifiers such as Maximum Likelihood, Mahalanobis Distance, and Minimum Distance, which are not suitable for incorporating low-level GIS data in the image classification process. The performances of the classifiers were evaluated in terms of the classification accuracy with respect to the collected real-time ground truth data. The SOM neural network provided the highest overall accuracy when a GIS layer of land type classification with respect to the depth and duration of regular flooding was used in the network. Using this method, the overall accuracy was around 15% higher than the previously mentioned traditional classifiers at 79.6% where the training data covered only 0.53% of the total image. It also achieved higher accuracies for more classes in comparison to the other classifiers

    Digital image processing techniques for detecting, quantifying and classifying plant diseases.

    Get PDF
    Abstract. This paper presents a survey on methods that use digital image processing techniques to detect, quantify and classify plant diseases from digital images in the visible spectrum. Although disease symptoms can manifest in any part of the plant, only methods that explore visible symptoms in leaves and stems were considered. This was done for two main reasons: to limit the length of the paper and because methods dealing with roots, seeds and fruits have some peculiarities that would warrant a specific survey. The selected proposals are divided into three classes according to their objective: detection, severity quantification, and classification. Each of those classes, in turn, are subdivided according to the main technical solution used in the algorithm. This paper is expected to be useful to researchers working both on vegetable pathology and pattern recognition, providing a comprehensive and accessible overview of this important field of research
    corecore