18 research outputs found

    Using a low-bit rate speech enhancement variable post-filter as a speech recognition system pre-filter to improve robustness to GSM speech

    Includes bibliographical references. The performance of speech recognition systems degrades when they are used to recognize speech that has been transmitted through GSM (Global System for Mobile Communications) voice communication channels (GSM speech). This degradation is mainly due to GSM speech coding and GSM channel noise on speech signals transmitted through the network. This poor recognition of GSM channel speech limits the use of speech recognition applications over GSM networks. If speech recognition technology is to be used without restriction over GSM networks, the recognition accuracy of GSM channel speech has to be improved. Different channel normalization techniques have been developed in an attempt to improve recognition accuracy of voice channel modified speech in general (not specifically for GSM channel speech). These techniques can be classified into three broad categories, namely, model modification, signal pre-processing and feature processing techniques. In this work, as a contribution toward improving the robustness of speech recognition systems to GSM speech, the use of a low-bit rate speech enhancement post-filter as a speech recognition system pre-filter is proposed. This filter is to be used in recognition systems in combination with channel normalization techniques.

    Analytic Assessment of Telephone Transmission Impact on ASR Performance Using a Simulation Model

    This paper addresses the impact of telephone transmission channels on automatic speech recognition (ASR) performance. A real-time simulation model is described and implemented, which allows impairments that are encountered in traditional as well as modern (mobile, IP-based) networks to be flexibly and efficiently generated. The model is based on input parameters which are known to telephone network planners; thus, it can be applied without measuring specific network characteristics. It can be used for an analytic assessment of the impact of channel impairments on ASR performance, for producing training material with defined transmission characteristics, or for testing spoken dialogue systems in realistic network environments. In the present paper, we present an investigation of the first point. Two speech recognizers which are integrated into a spoken dialogue system for information retrieval are assessed in relation to controlled amounts of transmission degradations. The measured ASR performance degradation is compared to speech quality degradation in human-human communication. It turns out that different behavior can be expected for some impairments. This fact has to be taken into account both in telephone network planning and in speech and language technology development.

    Acta Cybernetica : Volume 15. Number 2.


    Decision Support Systems

    Decision support systems (DSS) have evolved over the past four decades from theoretical concepts into real world computerized applications. DSS architecture contains three key components: knowledge base, computerized model, and user interface. DSS simulate cognitive decision-making functions of humans based on artificial intelligence methodologies (including expert systems, data mining, machine learning, connectionism, logistical reasoning, etc.) in order to perform decision support functions. The applications of DSS cover many domains, ranging from aviation monitoring, transportation safety, clinical diagnosis, weather forecast, business management to internet search strategy. By combining knowledge bases with inference rules, DSS are able to provide suggestions to end users to improve decisions and outcomes. This book is written as a textbook so that it can be used in formal courses examining decision support systems. It may be used by both undergraduate and graduate students from diverse computer-related fields. It will also be of value to established professionals as a text for self-study or for reference

    Intelligent Sensor Networks

    In the last decade, wireless or wired sensor networks have attracted much attention. However, most designs target general sensor network issues including protocol stack (routing, MAC, etc.) and security issues. This book focuses on the close integration of sensing, networking, and smart signal processing via machine learning. Based on their world-class research, the authors present the fundamentals of intelligent sensor networks. They cover sensing and sampling, distributed signal processing, and intelligent signal learning. In addition, they present cutting-edge research results from leading experts

    Learning Nonlinear Visual Processing from Natural Images (Epälineaarisen visuaalisen prosessoinnin oppiminen luonnollisista kuvista)

    The paradigm of computational vision hypothesizes that any visual function -- such as the recognition of your grandparent -- can be replicated by computational processing of the visual input. What are these computations that the brain performs? What should or could they be? Working on the latter question, this dissertation takes the statistical approach, in which we attempt to learn suitable computations from the natural visual data itself. In particular, we empirically study the computational processing that emerges from the statistical properties of the visual world and the constraints and objectives specified for the learning process. This thesis consists of an introduction and 7 peer-reviewed publications, where the purpose of the introduction is to illustrate the area of study to a reader who is not familiar with computational vision research. In the scope of the introduction, we will briefly overview the primary challenges to visual processing, as well as recall some of the current opinions on visual processing in the early visual systems of animals. Next, we describe the methodology we have used in our research, and discuss the presented results. We have included some additional remarks, speculations and conclusions to this discussion that were not featured in the original publications. We present the following results in the publications of this thesis. First, we empirically demonstrate that luminance and contrast are strongly dependent in natural images, contradicting previous theories suggesting that luminance and contrast were processed separately in natural systems due to their independence in the visual data. Second, we show that simple-cell-like receptive fields of the primary visual cortex can be learned in the nonlinear contrast domain by maximization of independence.
Further, we provide first-time reports of the emergence of conjunctive (corner-detecting) and subtractive (opponent orientation) processing due to nonlinear projection pursuit with simple objective functions related to sparseness and response energy optimization. Then, we show that attempting to extract independent components of nonlinear histogram statistics of a biologically plausible representation leads to projection directions that appear to differentiate between visual contexts. Such processing might be applicable for priming, i.e., the selection and tuning of later visual processing. We continue by showing that a different kind of thresholded low-frequency priming can be learned and used to make object detection faster with little loss in accuracy. Finally, we show that in a computational object detection setting, nonlinearly gain-controlled visual features of medium complexity can be acquired sequentially as images are encountered and discarded. We present two online algorithms to perform this feature selection, and propose the idea that for artificial systems, some processing mechanisms could be selectable from the environment without optimizing the mechanisms themselves. In summary, this thesis explores learning visual processing on several levels. The learning can be understood as interplay of input data, model structures, learning objectives, and estimation algorithms. The presented work adds to the growing body of evidence showing that statistical methods can be used to acquire intuitively meaningful visual processing mechanisms. The work also presents some predictions and ideas regarding biological visual processing.
The paradigm of computational vision proposes that any visual function, such as recognizing an object, can be reproduced artificially using computational methods. What could these computational methods be, or what should they be? This dissertation studies a statistical approach to forming mechanisms of vision. In the approach applied, computational processing is formed by optimization (or 'learning'), with various objectives set for the desired processing with respect to a given set of natural images. The dissertation consists of an introduction and seven research articles published in international forums. The introduction presents the dissertation's interdisciplinary research area to those who are not already familiar with computational vision research. It reviews the challenges of visual processing and sheds some light on current opinions about biological vision mechanisms. Next, the reader is introduced to the research methodology used in the work, which can largely be seen as an application of machine learning (statistics). The introduction closes with a review of the research articles. This review includes additional comments, observations, and criticisms that were not part of the original articles. The actual results of the dissertation concern what kinds of simple processing mechanisms emerge from combining different learning objectives, function classes, nonlinearities, and natural image data. The work examines in particular learning objectives aimed at the independence and sparseness of representations, but also objectives intended to aid object recognition. We present new findings on these topics, which are listed in more detail both in the English abstract and in the opening pages of the dissertation. The dissertation provides further evidence that intuitively meaningful visual processing mechanisms can be formed by statistical means. The work also offers some predictions and ideas concerning biological vision mechanisms.

    Proceedings of the Third Edition of the Annual Conference on Wireless On-demand Network Systems and Services (WONS 2006)

    This file gathers in a single document all the papers accepted for the WONS 2006 conference (http://citi.insa-lyon.fr/wons2006/index.html). This year, 56 papers were submitted. From the Open Call submissions we accepted 16 papers as full papers (up to 12 pages) and 8 papers as short papers (up to 6 pages). All the accepted papers will be presented orally in the Workshop sessions. More precisely, the selected papers have been organized in 7 sessions: Channel access and scheduling, Energy-aware Protocols, QoS in Mobile Ad-Hoc networks, Multihop Performance Issues, Wireless Internet, Applications and finally Security Issues. The papers (and authors) come from all parts of the world, confirming the international stature of this Workshop. The majority of the contributions are from Europe (France, Germany, Greece, Italy, Netherlands, Norway, Switzerland, UK). However, a significant number are from Australia, Brazil, Canada, Iran, Korea and USA. The proceedings also include two invited papers. We take this opportunity to thank all the authors who submitted their papers to WONS 2006. You helped make this event again a success.

    MediaSync: Handbook on Multimedia Synchronization

    This book provides an approachable overview of the most recent advances in the fascinating field of media synchronization (mediasync), gathering contributions from the most representative and influential experts. Understanding the challenges of this field in the current multi-sensory, multi-device, and multi-protocol world is not an easy task. The book revisits the foundations of mediasync, including theoretical frameworks and models, highlights ongoing research efforts, like hybrid broadband broadcast (HBB) delivery and users' perception modeling (i.e., Quality of Experience or QoE), and paves the way for the future (e.g., towards the deployment of multi-sensory and ultra-realistic experiences). Although many advances around mediasync have been devised and deployed, this area of research is getting renewed attention to overcome remaining challenges in the next-generation (heterogeneous and ubiquitous) media ecosystem. Given the significant advances in this research area, its current relevance and the multiple disciplines it involves, the availability of a reference book on mediasync becomes necessary. This book fills the gap in this context. In particular, it addresses key aspects and reviews the most relevant contributions within the mediasync research space, from different perspectives. Mediasync: Handbook on Multimedia Synchronization is the perfect companion for scholars and practitioners that want to acquire strong knowledge about this research area, and also approach the challenges behind ensuring the best mediated experiences, by providing the adequate synchronization between the media elements that constitute these experiences