5 research outputs found

    Nonlinear Hybrid System Identification with Kernel Models

    DOI: 10.1109/CDC.2010.5718011. This paper focuses on the identification of nonlinear hybrid systems involving unknown nonlinear dynamics. The proposed method extends the framework of [1] by introducing nonparametric models based on kernel functions in order to estimate arbitrary nonlinearities without prior knowledge. In comparison to the previous work of [2], which also dealt with unknown nonlinearities, the new algorithm takes the form of an unconstrained nonlinear continuous optimization problem, which can be solved efficiently for moderate numbers of model parameters, as is typically the case for linear hybrid systems. However, to maintain the efficiency of the method on large data sets with nonlinear kernel models, a preprocessing step is required in order to fix the model size and limit the number of optimization variables. A support vector selection procedure, based on a maximum entropy criterion, is proposed to perform this step. The efficiency of the resulting algorithm is demonstrated on large-scale experiments involving the identification of nonlinear switched dynamical systems.

    The EDAM Project: Mining Atmospheric Aerosol Datasets

    Data mining has been a very active area of research in the database, machine learning, and mathematical programming communities in recent years. EDAM (Exploratory Data Analysis and Management) is a joint project between researchers in Atmospheric Chemistry and Computer Science at Carleton College and the University of Wisconsin-Madison that aims to develop data mining techniques for advancing the state of the art in analyzing atmospheric aerosol datasets. There is a great need to better understand the sources, dynamics, and compositions of atmospheric aerosols. The traditional approach to particle measurement, which is the collection of bulk samples of particulates on filters, is not adequate for studying particle dynamics and real-time correlations. This has led to the development of a new generation of real-time instruments that provide continuous or semi-continuous streams of data about certain aerosol properties. However, these instruments have added a significant level of complexity to atmospheric aerosol data, and dramatically increased the amount of data to be collected, managed, and analyzed. Our ability to integrate the data from all of these new and complex instruments now lags far behind our data-collection capabilities, and severely limits our ability to understand the data and act upon it in a timely manner. In this paper, we present an overview of the EDAM project. The goal of the project, which is in its early stages, is to develop novel data mining algorithms and approaches to managing and monitoring multiple complex data streams. An important objective is data quality assurance, and real-time data mining offers great potential. The approach that we take should also provide good techniques to deal with gas-phase and semi-volatile data. While atmospheric aerosol analysis is an important and challenging domain that motivates us with real problems and serves as a concrete test of our results, our objective is to develop techniques that have broader applicability, and to explore some fundamental challenges in data mining that are not specific to any given application domain.

    Large Scale Kernel Regression via Linear Programming

    The problem of tolerant data fitting by a nonlinear surface, induced by a kernel-based support vector machine [24], is formulated as a linear program with fewer variables than other linear programming formulations [21]. A generalization of the linear programming chunking algorithm [2] for arbitrary kernels [13] is implemented for solving problems with very large datasets, wherein chunking is performed on both data points and problem variables. The proposed approach tolerates a small error, which is adjusted parametrically, while fitting the given data. This leads to improved fitting of noisy data (over ordinary least error solutions), as demonstrated computationally. Comparative numerical results indicate an average time reduction as high as 26.0% over other formulations, with a maximal time reduction of 79.7%. Additionally, linear programs with as many as 16,000 data points and more than a billion nonzero matrix elements are solved.
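To make the idea of tolerant fitting as a linear program concrete, the block below sketches one generic 1-norm regularized, ε-insensitive kernel regression program. It is an illustrative textbook-style formulation under stated assumptions, not necessarily the reduced-variable formulation of the paper. All symbols are assumptions introduced here: K is the kernel matrix with K_ij = k(x_i, x_j), y is the vector of targets, α and b are the model coefficients and offset, ε is the tolerance, ξ holds the errors beyond the tolerance, s bounds |α|, and C is a trade-off weight.

```latex
\[
\begin{aligned}
\min_{\alpha,\, b,\, s,\, \xi}\quad & \mathbf{1}^\top s + C\, \mathbf{1}^\top \xi \\
\text{s.t.}\quad & -\xi - \varepsilon \mathbf{1} \;\le\; K\alpha + b\,\mathbf{1} - y \;\le\; \xi + \varepsilon \mathbf{1}, \\
& -s \le \alpha \le s, \qquad \xi \ge 0 .
\end{aligned}
\]
```

Residuals whose magnitude is at most ε incur no penalty, which is the sense in which the fit is tolerant. At an optimum s equals |α| componentwise, so the first term is the 1-norm of α, and both the objective and the constraints remain linear.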

    Réseaux de neurones, SVM et approches locales pour la prévision de séries temporelles (Neural networks, SVMs, and local approaches for time series forecasting)

    Time series forecasting has been widely studied for many years. Researchers from various disciplines have addressed it in several application areas: finance, medicine, transportation, etc. In this thesis, we focus on machine learning methods: neural networks and SVMs. We are also interested in meta-methods for improving predictor performance, and more specifically in local models. Following a divide-and-conquer strategy, local models cluster the data before assigning a separate predictor to each resulting subset. We present a modification of the learning algorithm for recurrent neural networks that adapts them to this approach. We also propose two novel clustering techniques suitable for local models: the first is based on Kohonen maps, and the second on binary trees.
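The divide-and-conquer local approach described in this abstract can be illustrated with a minimal sketch: cluster the inputs, then fit one predictor per cluster. The code below is not the thesis's method. As loudly labeled assumptions, plain 1-D k-means stands in for the Kohonen-map and binary-tree clustering, a least-squares line stands in for the neural network and SVM predictors, and every function name is hypothetical.

```python
import random


def kmeans_1d(values, k, iters=50, seed=0):
    """Plain 1-D k-means (a stand-in clusterer); returns sorted cluster centers."""
    rng = random.Random(seed)
    centers = rng.sample(values, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            groups[nearest(centers, v)].append(v)
        # Keep the previous center if a cluster ends up empty.
        centers = [sum(g) / len(g) if g else centers[j]
                   for j, g in enumerate(groups)]
    return sorted(centers)


def nearest(centers, v):
    """Index of the center closest to v."""
    return min(range(len(centers)), key=lambda c: abs(v - centers[c]))


def fit_line(xs, ys):
    """Least-squares fit y = a*x + b; constant model if x has no spread."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return 0.0, my
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return a, my - a * mx


def fit_local_models(series, k=2):
    """Cluster one-step inputs, then fit one linear predictor per cluster."""
    inputs, targets = series[:-1], series[1:]
    centers = kmeans_1d(inputs, k)
    models = []
    for j in range(k):
        pairs = [(x, y) for x, y in zip(inputs, targets)
                 if nearest(centers, x) == j]
        if pairs:
            models.append(fit_line([p[0] for p in pairs],
                                   [p[1] for p in pairs]))
        else:
            # Empty cluster: fall back to the global mean of the targets.
            models.append((0.0, sum(targets) / len(targets)))
    return centers, models


def predict_next(centers, models, x):
    """Route x to its nearest cluster and apply that cluster's predictor."""
    a, b = models[nearest(centers, x)]
    return a * x + b


# Toy series with two regimes: x + 10 when x is small, x - 9 when x is large.
series = [0.0, 10.0, 1.0, 11.0, 2.0, 12.0, 3.0, 13.0]
centers, models = fit_local_models(series, k=2)
next_value = predict_next(centers, models, series[-1])
```

On this toy series a single global line cannot capture both regimes, whereas each cluster's local predictor only has to model one of them, which is the motivation for the local approach.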