80 research outputs found

    Swarm-Organized Topographic Mapping

    Get PDF
    Topographieerhaltende Abbildungen versuchen, hochdimensionale oder komplexe Datenbestände auf einen niederdimensionalen Ausgaberaum abzubilden, wobei die Topographie der Daten hinreichend gut wiedergegeben werden soll. Die Qualität solcher Abbildung hängt gewöhnlich vom eingesetzten Nachbarschaftskonzept des konstruierenden Algorithmus ab. Die Schwarm-Organisierte Projektion ermöglicht eine Lösung dieses Parametrisierungsproblems durch die Verwendung von Techniken der Schwarmintelligenz. Die praktische Verwendbarkeit dieser Methodik wurde durch zwei Anwendungen auf dem Feld der Molekularbiologie sowie der Finanzanalytik demonstriert

    Exploratory Cluster Analysis from Ubiquitous Data Streams using Self-Organizing Maps

    Get PDF
    This thesis addresses the use of Self-Organizing Maps (SOM) for exploratory cluster analysis over ubiquitous data streams, where two complementary problems arise: first, to generate (local) SOM models over potentially unbounded multi-dimensional non-stationary data streams; second, to extrapolate these capabilities to ubiquitous environments. Towards this problematic, original contributions are made in terms of algorithms and methodologies. Two different methods are proposed regarding the first problem. By focusing on visual knowledge discovery, these methods fill an existing gap in the panorama of current methods for cluster analysis over data streams. Moreover, the original SOM capabilities in performing both clustering of observations and features are transposed to data streams, characterizing these contributions as versatile compared to existing methods, which target an individual clustering problem. Also, additional methodologies that tackle the ubiquitous aspect of data streams are proposed in respect to the second problem, allowing distributed and collaborative learning strategies. Experimental evaluations attest the effectiveness of the proposed methods and realworld applications are exemplified, namely regarding electric consumption data, air quality monitoring networks and financial data, motivating their practical use. This research study is the first to clearly address the use of the SOM towards ubiquitous data streams and opens several other research opportunities in the future

    Multi-tier framework for the inferential measurement and data-driven modeling

    Get PDF
    A framework for the inferential measurement and data-driven modeling has been proposed and assessed in several real-world application domains. The architecture of the framework has been structured in multiple tiers to facilitate extensibility and the integration of new components. Each of the proposed four tiers has been assessed in an uncoupled way to verify their suitability. The first tier, dealing with exploratory data analysis, has been assessed with the characterization of the chemical space related to the biodegradation of organic chemicals. This analysis has established relationships between physicochemical variables and biodegradation rates that have been used for model development. At the preprocessing level, a novel method for feature selection based on dissimilarity measures between Self-Organizing maps (SOM) has been developed and assessed. The proposed method selected more features than others published in literature but leads to models with improved predictive power. Single and multiple data imputation techniques based on the SOM have also been used to recover missing data in a Waste Water Treatment Plant benchmark. A new dynamic method to adjust the centers and widths of in Radial basis Function networks has been proposed to predict water quality. The proposed method outperformed other neural networks. The proposed modeling components have also been assessed in the development of prediction and classification models for biodegradation rates in different media. The results obtained proved the suitability of this approach to develop data-driven models when the complex dynamics of the process prevents the formulation of mechanistic models. The use of rule generation algorithms and Bayesian dependency models has been preliminary screened to provide the framework with interpretation capabilities. Preliminary results obtained from the classification of Modes of Toxic Action (MOA) indicate that this could be a promising approach to use MOAs as proxy indicators of human health effects of chemicals.Finally, the complete framework has been applied to three different modeling scenarios. A virtual sensor system, capable of inferring product quality indices from primary process variables has been developed and assessed. The system was integrated with the control system in a real chemical plant outperforming multi-linear correlation models usually adopted by chemical manufacturers. A model to predict carcinogenicity from molecular structure for a set of aromatic compounds has been developed and tested. Results obtained after the application of the SOM-dissimilarity feature selection method yielded better results than models published in the literature. Finally, the framework has been used to facilitate a new approach for environmental modeling and risk management within geographical information systems (GIS). The SOM has been successfully used to characterize exposure scenarios and to provide estimations of missing data through geographic interpolation. The combination of SOM and Gaussian Mixture models facilitated the formulation of a new probabilistic risk assessment approach.Aquesta tesi proposa i avalua en diverses aplicacions reals, un marc general de treball per al desenvolupament de sistemes de mesurament inferencial i de modelat basats en dades. L'arquitectura d'aquest marc de treball s'organitza en diverses capes que faciliten la seva extensibilitat així com la integració de nous components. Cadascun dels quatre nivells en que s'estructura la proposta de marc de treball ha estat avaluat de forma independent per a verificar la seva funcionalitat. El primer que nivell s'ocupa de l'anàlisi exploratòria de dades ha esta avaluat a partir de la caracterització de l'espai químic corresponent a la biodegradació de certs compostos orgànics. Fruit d'aquest anàlisi s'han establert relacions entre diverses variables físico-químiques que han estat emprades posteriorment per al desenvolupament de models de biodegradació. A nivell del preprocés de les dades s'ha desenvolupat i avaluat una nova metodologia per a la selecció de variables basada en l'ús del Mapes Autoorganitzats (SOM). Tot i que el mètode proposat selecciona, en general, un major nombre de variables que altres mètodes proposats a la literatura, els models resultants mostren una millor capacitat predictiva. S'han avaluat també tot un conjunt de tècniques d'imputació de dades basades en el SOM amb un conjunt de dades estàndard corresponent als paràmetres d'operació d'una planta de tractament d'aigües residuals. Es proposa i avalua en un problema de predicció de qualitat en aigua un nou model dinàmic per a ajustar el centre i la dispersió en xarxes de funcions de base radial. El mètode proposat millora els resultats obtinguts amb altres arquitectures neuronals. Els components de modelat proposat s'han aplicat també al desenvolupament de models predictius i de classificació de les velocitats de biodegradació de compostos orgànics en diferents medis. Els resultats obtinguts demostren la viabilitat d'aquesta aproximació per a desenvolupar models basats en dades en aquells casos en els que la complexitat de dinàmica del procés impedeix formular models mecanicistes. S'ha dut a terme un estudi preliminar de l'ús de algorismes de generació de regles i de grafs de dependència bayesiana per a introduir una nova capa que faciliti la interpretació dels models. Els resultats preliminars obtinguts a partir de la classificació dels Modes d'acció Tòxica (MOA) apunten a que l'ús dels MOA com a indicadors intermediaris dels efectes dels compostos químics en la salut és una aproximació factible.Finalment, el marc de treball proposat s'ha aplicat en tres escenaris de modelat diferents. En primer lloc, s'ha desenvolupat i avaluat un sensor virtual capaç d'inferir índexs de qualitat a partir de variables primàries de procés. El sensor resultant ha estat implementat en una planta química real millorant els resultats de les correlacions multilineals emprades habitualment. S'ha desenvolupat i avaluat un model per a predir els efectes carcinògens d'un grup de compostos aromàtics a partir de la seva estructura molecular. Els resultats obtinguts desprès d'aplicar el mètode de selecció de variables basat en el SOM milloren els resultats prèviament publicats. Aquest marc de treball s'ha usat també per a proporcionar una nova aproximació al modelat ambiental i l'anàlisi de risc amb sistemes d'informació geogràfica (GIS). S'ha usat el SOM per a caracteritzar escenaris d'exposició i per a desenvolupar un nou mètode d'interpolació geogràfica. La combinació del SOM amb els models de mescla de gaussianes dona una nova formulació al problema de l'anàlisi de risc des d'un punt de vista probabilístic

    Proceedings of the Second International Mobile Satellite Conference (IMSC 1990)

    Get PDF
    Presented here are the proceedings of the Second International Mobile Satellite Conference (IMSC), held June 17-20, 1990 in Ottawa, Canada. Topics covered include future mobile satellite communications concepts, aeronautical applications, modulation and coding, propagation and experimental systems, mobile terminal equipment, network architecture and control, regulatory and policy considerations, vehicle antennas, and speech compression

    Proceedings of the Mobile Satellite Conference

    Get PDF
    A satellite-based mobile communications system provides voice and data communications to mobile users over a vast geographic area. The technical and service characteristics of mobile satellite systems (MSSs) are presented and form an in-depth view of the current MSS status at the system and subsystem levels. Major emphasis is placed on developments, current and future, in the following critical MSS technology areas: vehicle antennas, networking, modulation and coding, speech compression, channel characterization, space segment technology and MSS experiments. Also, the mobile satellite communications needs of government agencies are addressed, as is the MSS potential to fulfill them

    Technology 2002: The Third National Technology Transfer Conference and Exposition, volume 2

    Get PDF
    Proceedings from symposia of the Technology 2002 Conference and Exposition, December 1-3, 1992, Baltimore, MD. Volume 2 features 60 papers presented during 30 concurrent sessions

    Phase-Distortion-Robust Voice-Source Analysis

    Get PDF
    This work concerns itself with the analysis of voiced speech signals, in particular the analysis of the glottal source signal. Following the source-filter theory of speech, the glottal signal is produced by the vibratory behaviour of the vocal folds and is modulated by the resonances of the vocal tract and radiation characteristic of the lips to form the speech signal. As it is thought that the glottal source signal contributes much of the non-linguistic and prosodical information to speech, it is useful to develop techniques which can estimate and parameterise this signal accurately. Because of vocal tract modulation, estimating the glottal source waveform from the speech signal is a blind deconvolution problem which necessarily makes assumptions about the characteristics of both the glottal source and vocal tract. A common assumption is that the glottal signal and/or vocal tract can be approximated by a parametric model. Other assumptions include the causality of the speech signal: the vocal tract is assumed to be a minimum phase system while the glottal source is assumed to exhibit mixed phase characteristics. However, as the literature review within this thesis will show, the error criteria utilised to determine the parameters are not robust to the conditions under which the speech signal is recorded, and are particularly degraded in the common scenario where low frequency phase distortion is introduced. Those that are robust to this type of distortion are not well suited to the analysis of real-world signals. This research proposes a voice-source estimation and parameterisation technique, called the Power-spectrum-based determination of the Rd parameter (PowRd) method. Illustrated by theory and demonstrated by experiment, the new technique is robust to the time placement of the analysis frame and phase issues that are generally encountered during recording. The method assumes that the derivative glottal flow signal is approximated by the transformed Liljencrants-Fant model and that the vocal tract can be represented by an all-pole filter. Unlike many existing glottal source estimation methods, the PowRd method employs a new error criterion to optimise the parameters which is also suitable to determine the optimal vocal-tract filter order. In addition to the issue of glottal source parameterisation, nonlinear phase recording conditions can also adversely affect the results of other speech processing tasks such as the estimation of the instant of glottal closure. In this thesis, a new glottal closing instant estimation algorithm is proposed which incorporates elements from the state-of-the-art techniques and is specifically designed for operation upon speech recorded under nonlinear phase conditions. The new method, called the Fundamental RESidual Search or FRESS algorithm, is shown to estimate the glottal closing instant of voiced speech with superior precision and comparable accuracy as other existing methods over a large database of real speech signals under real and simulated recording conditions. An application of the proposed glottal source parameterisation method and glottal closing instant detection algorithm is a system which can analyse and re-synthesise voiced speech signals. This thesis describes perceptual experiments which show that, iunder linear and nonlinear recording conditions, the system produces synthetic speech which is generally preferred to speech synthesised based upon a state-of-the-art timedomain- based parameterisation technique. In sum, this work represents a movement towards flexible and robust voice-source analysis, with potential for a wide range of applications including speech analysis, modification and synthesis

    A Soft Computing Based Approach for Multi-Accent Classification in IVR Systems

    Get PDF
    A speaker's accent is the most important factor affecting the performance of Natural Language Call Routing (NLCR) systems because accents vary widely, even within the same country or community. This variation also occurs when non-native speakers start to learn a second language, the substitution of native language phonology being a common process. Such substitution leads to fuzziness between the phoneme boundaries and phoneme classes, which reduces out-of-class variations, and increases the similarities between the different sets of phonemes. Thus, this fuzziness is the main cause of reduced NLCR system performance. The main requirement for commercial enterprises using an NLCR system is to have a robust NLCR system that provides call understanding and routing to appropriate destinations. The chief motivation for this present work is to develop an NLCR system that eliminates multilayered menus and employs a sophisticated speaker accent-based automated voice response system around the clock. Currently, NLCRs are not fully equipped with accent classification capability. Our main objective is to develop both speaker-independent and speaker-dependent accent classification systems that understand a caller's query, classify the caller's accent, and route the call to the acoustic model that has been thoroughly trained on a database of speech utterances recorded by such speakers. In the field of accent classification, the dominant approaches are the Gaussian Mixture Model (GMM) and Hidden Markov Model (HMM). Of the two, GMM is the most widely implemented for accent classification. However, GMM performance depends on the initial partitions and number of Gaussian mixtures, both of which can reduce performance if poorly chosen. To overcome these shortcomings, we propose a speaker-independent accent classification system based on a distance metric learning approach and evolution strategy. This approach depends on side information from dissimilar pairs of accent groups to transfer data points to a new feature space where the Euclidean distances between similar and dissimilar points are at their minimum and maximum, respectively. Finally, a Non-dominated Sorting Evolution Strategy (NSES)-based k-means clustering algorithm is employed on the training data set processed by the distance metric learning approach. The main objectives of the NSES-based k-means approach are to find the cluster centroids as well as the optimal number of clusters for a GMM classifier. In the case of a speaker-dependent application, a new method is proposed based on the fuzzy canonical correlation analysis to find appropriate Gaussian mixtures for a GMM-based accent classification system. In our proposed method, we implement a fuzzy clustering approach to minimize the within-group sum-of-square-error and canonical correlation analysis to maximize the correlation between the speech feature vectors and cluster centroids. We conducted a number of experiments using the TIMIT database, the speech accent archive, and the foreign accent English databases for evaluating the performance of speaker-independent and speaker-dependent applications. Assessment of the applications and analysis shows that our proposed methodologies outperform the HMM, GMM, vector quantization GMM, and radial basis neural networks

    Hidden Markov Models

    Get PDF
    Hidden Markov Models (HMMs), although known for decades, have made a big career nowadays and are still in state of development. This book presents theoretical issues and a variety of HMMs applications in speech recognition and synthesis, medicine, neurosciences, computational biology, bioinformatics, seismology, environment protection and engineering. I hope that the reader will find this book useful and helpful for their own research
    corecore