110 research outputs found

    Data-driven Soft Sensors in the Process Industry

    In the last two decades Soft Sensors established themselves as a valuable alternative to the traditional means for the acquisition of critical process variables, process monitoring and other tasks which are related to process control. This paper discusses characteristics of the process industry data which are critical for the development of data-driven Soft Sensors. These characteristics are common to a large number of process industry fields, like the chemical industry, bioprocess industry, steel industry, etc. The focus of this work is put on the data-driven Soft Sensors because of their growing popularity, already demonstrated usefulness and huge, though yet not completely realised, potential. A comprehensive selection of case studies covering the three most important Soft Sensor application fields, a general introduction to the most popular Soft Sensor modelling techniques as well as a discussion of some open issues in the Soft Sensor development and maintenance and their possible solutions are the main contributions of this work

    Signal and data processing for machine olfaction and chemical sensing: A review

    Signal and data processing are essential elements in electronic noses as well as in most chemical sensing instruments. The multivariate responses obtained by chemical sensor arrays require signal and data processing to carry out the fundamental tasks of odor identification (classification), concentration estimation (regression), and grouping of similar odors (clustering). In the last decade, important advances have shown that proper processing can improve the robustness of the instruments against diverse perturbations, namely, environmental variables, background changes, drift, etc. This article reviews the advances made in recent years in signal and data processing for machine olfaction and chemical sensing

    Machine Learning in Resource-constrained Devices: Algorithms, Strategies, and Applications

    The ever-increasing growth of technologies is changing people's everyday life. As a major consequence: 1) the amount of available data is growing and 2) several applications rely on battery-supplied devices that are required to process data in real time. In this scenario, the need for ad hoc strategies for the development of low-power and low-latency intelligent systems capable of learning inductive rules from data using a modest amount of computational resources is becoming vital. At the same time, one needs to develop specific methodologies to manage complex patterns such as text and images. This thesis presents different approaches and techniques for the development of fast learning models explicitly designed to be hosted on embedded systems. The proposed methods proved able to achieve state-of-the-art performance in terms of the trade-off between generalization capabilities and area requirements when implemented in low-cost digital devices. In addition, advanced strategies for efficient sentiment analysis in text and images are proposed.

    Computational intelligence techniques for maximum energy efficiency of cogeneration processes based on internal combustion engines

    153 p. The aim of this thesis is to develop strategies for modelling and optimizing the energy efficiency of cogeneration plants based on internal combustion engines (ICE), using the latest computational intelligence technologies. To this end, real data are available from a cogeneration plant owned by the company EnergyWorks and located in Monzón (province of Huesca). The thesis was carried out as a joint effort between the Grupo de Diseño en Electrónica Digital (GDED) of the Universidad del País Vasco UPV/EHU and Optimitive S.L., a company dedicated to advanced software for the real-time improvement of industrial processes.

    Computational intelligence techniques for maritime and coastal remote sensing

    The aim of this thesis is to investigate the potential of computational intelligence techniques for some applications in the analysis of remotely sensed multi-spectral images. In particular, two problems are addressed. The first is the classification of oil spills at sea; the second is the estimation of sea bottom depth. In both cases, the exploitation of optical satellite data allows the development of operational tools for easily accessing and monitoring large marine areas in an efficient and cost-effective way. Regarding the oil spill problem, public opinion today is certainly aware of the huge impact that oil tanker accidents and oil rig leaks have on the marine and coastal environment. However, it is less well known that most of the oil released in our seas cannot be ascribed to accidental spills, but rather to illegal ballast water discharge and to pollutant dumping at sea during routine operations of oil tankers. For this reason, any effort to improve oil spill detection systems is of great importance. So far, Synthetic Aperture Radar (SAR) data have been preferred to multi-spectral data for oil spill detection applications because of their all-weather and all-day capabilities, while optical images require clear-sky conditions and daylight. On the other hand, many features make an optical approach desirable, such as lower cost and higher revisit frequency. Moreover, unlike SAR data, optical data are not affected by sea state and can reduce the false alarm rate, since they do not suffer from the main false alarm source in SAR data, namely the presence of calm sea regions. In this thesis, the problem of oil spill classification is tackled by applying different machine learning techniques to a significant dataset of regions of interest collected in multi-spectral satellite images acquired by the MODIS sensor. 
These regions are then classified into one of two possible classes, oil spills and look-alikes, where look-alikes include any phenomena other than oil spills (e.g. algal blooms). Results show that efficient and reliable oil spill classification systems based on optical data are feasible and could offer valuable support to existing satellite-based monitoring systems. The estimation of sea bottom depth from high-resolution multi-spectral satellite images is the second major topic of this thesis. The motivation for dealing with this problem arises from the need to limit expensive and time-consuming measurement campaigns. Since satellite data allow large areas to be analysed quickly, a solution is to employ intelligent techniques which, by exploiting a small set of depth measurements, are able to extend the bathymetry estimate to a much larger area covered by a multi-spectral satellite image. Once the training phase has been completed, such techniques achieve very accurate results and, thanks to their generalization capabilities, provide reliable bathymetric maps which cover wide areas. A crucial element is the training dataset, which is built by coupling a number of depth measurements, located in a limited part of the image, with the corresponding radiances acquired by the satellite sensor. A successful estimate essentially depends on how closely the training dataset resembles the rest of the scene. On the other hand, the result is not affected by model uncertainties and systematic errors, as results from model-based analytic approaches are. In this thesis, a neuro-fuzzy technique is applied to two case studies, more precisely, two high-resolution multi-spectral images of the same area, acquired in different years and in different meteorological conditions. 
Different situations of in-situ depth availability are considered in the study, and the effect of limited in-situ data availability on performance is evaluated. The effect of both meteorological conditions and training set size reduction on the overall performance is also taken into account. Results outperform previous studies on bathymetry estimation techniques and give indications of the optimal paths that can be adopted when planning data collection at sea.
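
The idea of pairing a few in-situ depth measurements with satellite radiances and then extending the estimate to other pixels can be illustrated with a minimal sketch. The thesis applies a neuro-fuzzy technique; the example below swaps in a much simpler log-linear least-squares fit purely for illustration. The function names, the exponential attenuation model, and all numbers are assumptions made for this sketch and are not taken from the thesis.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def synthetic_radiance(depth):
    """Hypothetical sensor response: radiance decays exponentially with depth."""
    return 100.0 * math.exp(-0.2 * depth)

# Training set: a handful of depth soundings (metres) coupled with the
# radiance observed at the same pixels (synthetic values, assumed here).
train_depths = [1.0, 2.0, 4.0, 6.0, 8.0]
train_features = [math.log(synthetic_radiance(d)) for d in train_depths]

# Regress depth on log-radiance using the training pixels only.
slope, intercept = fit_line(train_features, train_depths)

def estimate_depth(radiance):
    """Estimate depth for any pixel from its radiance using the fitted line."""
    return slope * math.log(radiance) + intercept

# Apply the fitted model to a pixel that was not part of the training set.
print(round(estimate_depth(synthetic_radiance(5.0)), 3))
```

Because the synthetic pixels all follow the same attenuation law as the training pixels, the estimate is essentially exact here. With real imagery the quality of the estimate would depend on how well the training pixels represent the rest of the scene, which is the point the abstract makes.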

    Enhanced clustering analysis pipeline for performance analysis of parallel applications

    Clustering analysis is widely used to stratify data in the same cluster when they are similar according to the specific metrics. We can use the cluster analysis to group the CPU burst of a parallel application, and the regions on each process in-between communication calls or calls to the parallel runtime. The resulting clusters obtained are the different computational trends or phases that appear in the application. These clusters are useful to understand the behavior of the computation part of the application and focus the analyses on those that present performance issues. Although density-based clustering algorithms are a powerful and efficient tool to summarize this type of information, their traditional user-guided clustering methodology has many shortcomings and deficiencies in dealing with the complexity of data, the diversity of data structures, high-dimensionality of data, and the dramatic increase in the amount of data. Consequently, the majority of DBSCAN-like algorithms have weaknesses to handle high-dimensionality and/or Multi-density data, and they are sensitive to their hyper-parameter configuration. Furthermore, extracting insight from the obtained clusters is an intuitive and manual task. To mitigate these weaknesses, we have proposed a new unified approach to replace the user-guided clustering with an automated clustering analysis pipeline, called Enhanced Cluster Identification and Interpretation (ECII) pipeline. To build the pipeline, we propose novel techniques including Robust Independent Feature Selection, Feature Space Curvature Map, Organization Component Analysis, and hyper-parameters tuning to feature selection, density homogenization, cluster interpretation, and model selection which are the main components of our machine learning pipeline. This thesis contributes four new techniques to the Machine Learning field with a particular use case in Performance Analytics field. 
The first contribution is a novel unsupervised approach for feature selection on noisy data, called Robust Independent Feature Selection (RIFS). Specifically, we choose a feature subset that contains most of the underlying information, using the same criteria as the Independent component analysis. Simultaneously, the noise is separated as an independent component. The second contribution of the thesis is a parametric multilinear transformation method to homogenize cluster densities while preserving the topological structure of the dataset, called Feature Space Curvature Map (FSCM). We present a new Gravitational Self-organizing Map to model the feature space curvature by plugging the concepts of gravity and fabric of space into the Self-organizing Map algorithm to mathematically describe the density structure of the data. To homogenize the cluster density, we introduce a novel mapping mechanism to project the data from the non-Euclidean curved space to a new Euclidean flat space. The third contribution is a novel topological-based method to study potentially complex high-dimensional categorized data by quantifying their shapes and extracting fine-grain insights from them to interpret the clustering result. We introduce our Organization Component Analysis (OCA) method for the automatic arbitrary cluster-shape study without an assumption about the data distribution. Finally, to tune the DBSCAN hyper-parameters, we propose a new tuning mechanism by combining techniques from machine learning and optimization domains, and we embed it in the ECII pipeline. Using this cluster analysis pipeline with the CPU burst data of a parallel application, we provide the developer/analyst with a high-quality SPMD computation structure detection with the added value that reflects the fine grain of the computation regions. Postprint (published version)
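
For readers unfamiliar with the density-based clustering that this abstract builds on, the following is a minimal sketch of plain DBSCAN in pure Python. It is not the ECII pipeline or any of its components; the two-dimensional "burst" features and the parameter values are hypothetical and chosen only to show how strongly the result depends on the `eps` and `min_pts` hyper-parameters.

```python
import math

def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (including i)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1  # points[i] is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)  # j is also a core point: expand
    return labels

# Hypothetical, already normalised features of CPU bursts
# (for example duration and instructions per cycle).
bursts = [
    (1.0, 1.0), (1.1, 1.0), (1.0, 1.1), (1.1, 1.1),  # one computation phase
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),  # another phase
    (9.0, 1.0),                                       # isolated burst
]

print(dbscan(bursts, eps=0.3, min_pts=3))
print(dbscan(bursts, eps=0.05, min_pts=3))
```

With the larger radius the two groups are found and the isolated burst is marked as noise; with the smaller radius every point is treated as noise. This kind of sensitivity is what motivates automated hyper-parameter tuning.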

    Approximation Theory and Related Applications

    In recent years, we have seen a growing interest in various aspects of approximation theory. This happened due to the increasing complexity of mathematical models that require computer calculations and the development of the theoretical foundations of the approximation theory. Approximation theory has broad and important applications in many areas of mathematics, including functional analysis, differential equations, dynamical systems theory, mathematical physics, control theory, probability theory and mathematical statistics, and others. Approximation theory is also of great practical importance, as approximate methods and estimation of approximation errors are used in physics, economics, chemistry, signal theory, neural networks and many other areas. This book presents the works published in the Special Issue "Approximation Theory and Related Applications". The research of the world’s leading scientists presented in this book reflect new trends in approximation theory and related topics

    Energy and Area Efficient Machine Learning Architectures using Spin-Based Neurons

    Recently, spintronic devices with low energy barrier nanomagnets such as spin orbit torque-Magnetic Tunnel Junctions (SOT-MTJs) and embedded magnetoresistive random access memory (MRAM) devices are being leveraged as a natural building block to provide probabilistic sigmoidal activation functions for RBMs. In this dissertation research, we use the Probabilistic Inference Network Simulator (PIN-Sim) to realize a circuit-level implementation of deep belief networks (DBNs) using memristive crossbars as weighted connections and embedded MRAM-based neurons as activation functions. Herein, a probabilistic interpolation recoder (PIR) circuit is developed for DBNs with probabilistic spin logic (p-bit)-based neurons to interpolate the probabilistic output of the neurons in the last hidden layer, which represent different output classes. Moreover, the impact of reducing the Magnetic Tunnel Junction's (MTJ's) energy barrier is assessed and optimized for the resulting stochasticity present in the learning system. In p-bit-based DBNs, different defects such as variation of the nanomagnet thickness can undermine functionality by decreasing the fluctuation speed of the p-bit realized using a nanomagnet. A method is developed and refined to control the fluctuation frequency of the output of a p-bit device by employing a feedback mechanism. The feedback can alleviate this process variation sensitivity of p-bit-based DBNs. This compact and low-complexity method, which is presented by introducing the self-compensating circuit, can alleviate the influences of process variation in fabrication and practical implementation. Furthermore, this research presents an innovative image recognition technique for the MNIST dataset on the basis of p-bit-based DBNs and TSK rule-based fuzzy systems. The proposed DBN-fuzzy system is introduced to benefit from the low energy and area consumption of p-bit-based DBNs and the high accuracy of TSK rule-based fuzzy systems. 
This system initially recognizes the top results through the p-bit-based DBN, and then the fuzzy system is employed to attain the top-1 recognition results from the obtained top outputs. Simulation results show that a DBN-Fuzzy neural network not only has lower energy and area consumption than bigger DBN topologies but also achieves higher accuracy.
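
The "probabilistic sigmoidal activation function" mentioned in this abstract can be illustrated in software with a stochastic binary neuron whose probability of outputting 1 follows a sigmoid of its weighted input. The sketch below is only a behavioural illustration under that assumption; it does not model the MRAM device, the PIN-Sim simulator, or the circuits described in the dissertation, and the function names are hypothetical.

```python
import math
import random

def sigmoid(x):
    """Logistic function mapping any real input to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def stochastic_neuron(weighted_input, rng):
    """Return 1 with probability sigmoid(weighted_input), otherwise 0."""
    return 1 if rng.random() < sigmoid(weighted_input) else 0

def firing_rate(weighted_input, samples=20000, seed=0):
    """Average output of the neuron over many independent samples."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    total = sum(stochastic_neuron(weighted_input, rng) for _ in range(samples))
    return total / samples

# The time-averaged output approaches the deterministic sigmoid value.
for x in (-3.0, 0.0, 1.0, 3.0):
    print(x, round(firing_rate(x), 3), round(sigmoid(x), 3))
```

Each individual output is a random bit, but the average over many samples tracks the sigmoid curve, which is the behaviour that sampling in a restricted Boltzmann machine relies on.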