41 research outputs found

    Ameliorating integrated sensor drift and imperfections: an adaptive "neural" approach


    Can deep-sub-micron device noise be used as the basis for probabilistic neural computation?

    This thesis explores the potential of probabilistic neural architectures for computation with future nanoscale Metal-Oxide-Semiconductor Field Effect Transistors (MOSFETs). In particular, the performance of a Continuous Restricted Boltzmann Machine (CRBM) implemented with generated noise of Random Telegraph Signal (RTS) and 1/f form has been studied with reference to the 'typical' Gaussian implementation. In this study, a time-domain RTS-based noise analysis capability has been developed for future nanoscale MOSFETs, to represent the effect of nanoscale MOSFET noise on circuit implementation, in particular the synaptic analogue multiplier that is subsequently used to implement the stochastic behaviour of the CRBM. The results of this thesis indicate little degradation in performance from that of the typical Gaussian CRBM. Through simulation experiments, the CRBM with nanoscale MOSFET noise shows the ability to reconstruct training data, although it takes longer to converge to equilibrium. The results in this thesis do not prove that nanoscale MOSFET noise can be exploited in all contexts, and with all data, for probabilistic computation. However, they indicate, for the first time, that nanoscale MOSFET noise has the potential to be used in hardware implementations of probabilistic neural computation. This thesis thus introduces a methodology for a form of technology-downstreaming and highlights the potential of probabilistic architectures for computation with future nanoscale MOSFETs.
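    A minimal sketch of the kind of noise source the study builds on: a two-level Random Telegraph Signal whose dwell times are exponentially distributed, injected into a CRBM-style stochastic unit in place of the usual Gaussian sample. All names and parameter values (rts_noise, tau_high, sigma, the time step dt) are illustrative assumptions, not the thesis's circuit-level model.

        import numpy as np

        def rts_noise(n_samples, dt=1e-6, tau_high=1e-4, tau_low=1e-4, amplitude=1.0, rng=None):
            # Two-level RTS: the signal flips between +amplitude and -amplitude,
            # with an exponentially distributed dwell time in each level, as in
            # trap capture/emission in a nanoscale MOSFET.
            rng = np.random.default_rng() if rng is None else rng
            signal = np.empty(n_samples)
            level, i = amplitude, 0
            while i < n_samples:
                tau = tau_high if level > 0 else tau_low
                dwell = max(1, int(rng.exponential(tau) / dt))
                signal[i:i + dwell] = level   # slice clips at the array end
                i += dwell
                level = -level
            return signal

        # CRBM-style stochastic unit, with RTS noise standing in for Gaussian noise.
        def stochastic_unit(activation, noise_sample, sigma=0.5):
            return np.tanh(activation + sigma * noise_sample)

        noise = rts_noise(10_000)
        print(stochastic_unit(0.2, noise[0]))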

    Distributed cloud-edge analytics and machine learning for transportation emissions estimation

    In recent years, IoT and Smart Cities have become a popular computing paradigm based on connected, network-enabled devices that provide different functionalities, from sensor measurements to domotic actions. With this paradigm, it is possible to provide stakeholders with near-real-time information from the field, e.g. the current pollution of the city. Along with the mentioned paradigms, Fog Computing enables computation near the sensors where the data is produced, i.e. at the Edge nodes. This paradigm provides low latency and fault tolerance given the possible independence of the sensor devices, and pushing computation to the Edge yields derived results in a near-real-time fashion. This ability to compute where the data is produced can be beneficial in many situations; however, it also requires including in the Edge the data preparation processes that ensure the data's fitness for use, as incoming data can be erroneous. Given this situation, Machine Learning can be useful to correct data and also to predict future values. Even though there have been studies regarding the use of data at the Edge, to our knowledge there is no evaluation of the different modeling situations and the viability of the approach. Therefore, this thesis aims to evaluate the possibility of building a distributed system that ensures the fitness for use of the incoming data through Machine-Learning-enabled data preparation, estimates emissions, and predicts the future status of the city in a near-real-time fashion. We evaluate this viability through three contributions. The first contribution focuses on forecasting in a distributed scenario, with a road traffic dataset for evaluation, and provides a robust solution for building a central model. This approach is based on Federated Learning, which allows training models at the Edge nodes and then merging them centrally; this way the models in the Edge can be independent but can also be synchronized. The results show the trade-off between accuracy and training time, and a comparison between low-powered devices and server-class machines. These analyses show that it is viable to use Machine Learning with this paradigm. The second contribution focuses on a particular use case of ship emission estimation. To estimate exhaust emissions, data must be correct, which is not always the case. This contribution explores the different techniques available to correct ship registry data and proposes the use of simple Machine Learning techniques to impute missing or erroneous values. It analyzes the different variables and their relationships to provide practitioners with guidelines for correction and data treatment. The results show that with classical Machine Learning it is possible to improve on the state-of-the-art results; moreover, these algorithms are simple enough to be used on an Edge device if required. The third contribution focuses on generating new variables from those available in a ship trace dataset obtained from the Automatic Identification System (AIS). We use a pipeline of two different methods, a neural network and a clustering algorithm, to group movements into movement patterns or behaviors. We test the predictive power of these behaviors for ship type, main engine power, and navigational status. The prediction of the main engine power is compared against the standard technique used in ship emission estimation when the ship registry is missing.
Our approach was able to detect 45% of the emissions that would otherwise go undetected if the baseline method were used. As a ship's navigational status is prone to error, the behaviors found are proposed as an alternative variable based on robust data. These contributions build a framework that can distribute the learning processes and that resists network failures on low-powered devices.
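    The merging step that the first contribution's Federated Learning approach relies on can be sketched as a parameter average over the Edge models. This is a minimal illustration assuming models share identical parameter shapes; the function names (federated_average, sync_round, local_train) are hypothetical, not the thesis's API.

        import numpy as np

        def federated_average(edge_models, weights=None):
            # Merge per-node parameter lists into a central model by weighted averaging.
            weights = np.ones(len(edge_models)) if weights is None else np.asarray(weights, float)
            weights = weights / weights.sum()
            return [sum(w * layer for w, layer in zip(weights, layers))
                    for layers in zip(*edge_models)]

        def sync_round(central, edge_datasets, local_train):
            # One synchronization round: each Edge node trains independently on its
            # local data, then the updated models are merged centrally, weighted by
            # how much data each node saw.
            updated = [local_train(central, data) for data in edge_datasets]
            return federated_average(updated, weights=[len(d) for d in edge_datasets])

        # e.g. three Edge nodes, two parameter arrays each, identical shapes:
        nodes = [[np.full((2, 2), k), np.full(2, k)] for k in (1.0, 2.0, 3.0)]
        central = federated_average(nodes)   # every entry averages to 2.0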

    Expressive movement generation with machine learning

    Movement is an essential aspect of our lives. Not only do we move to interact with our physical environment, but we also express ourselves and communicate with others through our movements. In an increasingly computerized world where various technologies and devices surround us, our movements are essential parts of our interaction with and consumption of computational devices and artifacts. In this context, incorporating an understanding of our movements within the design of the technologies surrounding us can significantly improve our daily experiences. This need has given rise to the field of movement computing – developing computational models of movement that can perceive, manipulate, and generate movements. In this thesis, we contribute to the field of movement computing by building machine-learning-based solutions for automatic movement generation. In particular, we focus on using machine learning techniques and motion capture data to create controllable, generative movement models. We also contribute to the field through the datasets, tools, and libraries that we developed during our research. We start by reviewing the work on building automatic movement generation systems using machine learning techniques and motion capture data. Our review covers background topics such as high-level movement characterization, training data, feature representation, machine learning models, and evaluation methods. Building on our literature review, we present WalkNet, an interactive agent walking movement controller based on neural networks. The expressivity of virtual, animated agents plays an essential role in their believability; WalkNet therefore integrates control over the expressive qualities of movement with the goal-oriented behaviour of an animated virtual agent. It allows us to control the generation in real time based on the valence and arousal levels of affect, the movement's walking direction, and the mover's movement signature. Following WalkNet, we look at controlling movement generation using more complex stimuli such as music represented by audio signals (i.e., non-symbolic music). Music-driven dance generation involves a highly non-linear mapping between temporally dense stimuli (i.e., the audio signal) and movements, which makes the modelling problem more challenging. To this end, we present GrooveNet, a real-time machine learning model for music-driven dance generation.
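    To make the control interface concrete, here is a minimal sketch of one autoregressive step of a WalkNet-like controller: the network maps the current pose plus a control vector (valence, arousal, walking direction) to the next pose, so the expressive qualities steer generation in real time. The architecture, dimensions, and names below are illustrative assumptions, not the thesis's actual model.

        import numpy as np

        def step(pose, controls, params):
            # One generation step: condition the next pose on the current pose
            # and the control vector [valence, arousal, direction_x, direction_z].
            x = np.concatenate([pose, controls])
            h = np.tanh(params["W1"] @ x + params["b1"])
            return params["W2"] @ h + params["b2"]

        rng = np.random.default_rng(0)
        pose_dim, ctrl_dim, hidden = 63, 4, 128    # e.g. 21 joints x 3 DoF
        params = {"W1": rng.normal(0, 0.1, (hidden, pose_dim + ctrl_dim)),
                  "b1": np.zeros(hidden),
                  "W2": rng.normal(0, 0.1, (pose_dim, hidden)),
                  "b2": np.zeros(pose_dim)}

        pose = np.zeros(pose_dim)
        controls = np.array([0.8, 0.3, 1.0, 0.0])  # pleasant, calm, walk along +x
        pose = step(pose, controls, params)        # untrained weights: structure only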

    Pulse-stream binary stochastic hardware for neural computation: the Helmholtz Machine


    Learning workload behaviour models from monitored time-series for resource estimation towards data center optimization

    In recent years there has been extraordinary growth in the demand for Cloud Computing resources executed in Data Centers. Modern Data Centers are complex systems that need management. As distributed computing systems grow, and workloads benefit from such computing environments, the management of such systems increases in complexity. The complexity of resource usage and power consumption in cloud-based applications makes understanding application behavior through expert examination difficult. The difficulty increases when applications are seen as "black boxes", where only external monitoring can be retrieved. Furthermore, given the wide variety of scenarios and applications, automation is required. To deal with such complexity, Machine Learning methods become crucial to facilitate tasks that can be automatically learned from data. Firstly, this thesis proposes an unsupervised learning technique to learn high-level representations from workload traces. This technique provides a fast methodology to characterize workloads as sequences of abstract phases. The learned phase representation is validated on a variety of datasets and used in an auto-scaling task, where we show that it can be applied in a production environment, achieving better performance than other state-of-the-art techniques. Secondly, this thesis proposes a neural architecture, based on Sequence-to-Sequence models, that provides the expected resource usage of applications sharing hardware resources. The proposed technique gives resource managers the ability to predict resource usage over time as well as the completion time of the running applications, and yields lower error when predicting usage than other popular Machine Learning methods. Thirdly, this thesis proposes a technique for auto-tuning Big Data workloads from the available tunable parameters. The proposed technique gathers information from the logs of an application to generate a feature descriptor that captures the relevant information of the application to be tuned. Using this information, we demonstrate that performance models can generalize up to 34% better than other state-of-the-art solutions. Moreover, the search time to find a suitable solution can be drastically reduced, with up to a 12x speedup and results of almost equal quality to modern solutions. These results prove that modern learning algorithms, with the right feature information, provide powerful techniques to manage resource allocation for applications running in cloud environments. This thesis demonstrates that learning algorithms allow relevant optimizations in Data Center environments, where applications are externally monitored and careful resource management is paramount to using computing resources efficiently. We demonstrate this thesis in three areas that orbit around resource management in server environments.
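    A minimal sketch of the second contribution's idea: a Sequence-to-Sequence estimator that reads a window of monitored metrics and rolls out expected usage over a horizon. Layer sizes, the metric count, and the GRU choice are illustrative assumptions rather than the thesis's architecture.

        import torch
        import torch.nn as nn

        class Seq2SeqUsage(nn.Module):
            def __init__(self, n_metrics=4, hidden=64):
                super().__init__()
                self.encoder = nn.GRU(n_metrics, hidden, batch_first=True)
                self.decoder = nn.GRU(n_metrics, hidden, batch_first=True)
                self.head = nn.Linear(hidden, n_metrics)

            def forward(self, history, horizon):
                _, state = self.encoder(history)   # summarize the monitored window
                step = history[:, -1:, :]          # seed with the last observation
                outputs = []
                for _ in range(horizon):           # autoregressive roll-out
                    out, state = self.decoder(step, state)
                    step = self.head(out)
                    outputs.append(step)
                return torch.cat(outputs, dim=1)

        model = Seq2SeqUsage()
        window = torch.randn(8, 60, 4)        # batch of 60-step monitoring windows
        future = model(window, horizon=15)    # expected usage for the next 15 steps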

    Active learning for data streams.

    With the exponential growth in the amount and sources of data, access to large collections of data has become easier and cheaper. However, data is generally unlabelled, and labels are often difficult, expensive, and time-consuming to obtain. Two learning paradigms have been used by the machine learning community to diminish the need for labels in training data: semi-supervised learning (SSL) and active learning (AL). AL is a reliable way to efficiently build up training sets with minimal supervision. By querying the class (label) of the most interesting samples, based upon previously seen data and some selection criteria, AL can produce a nearly optimal hypothesis while requiring the minimum possible quantity of labelled data. SSL, on the other hand, takes advantage of both labelled and unlabelled data to address the challenge of learning from a small number of labelled samples and a large amount of unlabelled data. In this thesis, we borrow the concept of SSL by allowing AL algorithms to make use of redundant unlabelled data, so that both labelled and unlabelled data are used in their querying criteria. Another common tradition within the AL community is to assume that data samples are already gathered in a pool and AL has the luxury of exhaustively searching that pool for the samples worth labelling. In this thesis, we go beyond that by applying AL to data streams. In a stream, data may grow infinitely, making its storage prior to processing impractical. Due to its dynamic nature, the underlying distribution of the data stream may change over time, resulting in so-called concept drift, or possibly the emergence and fading of classes, known as concept evolution. Another challenge associated with AL in general is sampling bias, where the sampled training set does not reflect the underlying data distribution. In the presence of concept drift, sampling bias is more likely to occur, as the training set needs to represent the underlying distribution of the evolving data. Given these challenges, the research questions that the thesis addresses are: can AL improve learning given that data comes in streams? Is it possible to harness AL to handle changes in streams (i.e., concept drift and concept evolution) by querying selected samples? How can sampling bias be attenuated while maintaining AL's advantages? Finally, applying AL to sequential data streams (like time series) requires new approaches, especially in the presence of concept drift and concept evolution; hence, the question is how to handle concept drift and concept evolution in sequential data online, and can AL be useful in such cases? In this thesis, we develop a set of stream-based AL algorithms to answer these questions in line with the aforementioned challenges. The core idea of these algorithms is to query samples that give the largest reduction of an expected loss function that measures the learning performance. Two types of AL are proposed: decision-theory-based AL, whose losses involve the prediction error, and information-theory-based AL, whose losses involve the model parameters. Although our work focuses on classification problems, AL algorithms for other problems such as regression and parameter estimation can be derived from the proposed AL algorithms. Several experiments have been performed to evaluate the performance of the proposed algorithms, and the results show that they outperform other state-of-the-art algorithms.
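    A minimal sketch of a stream-based AL loop under the above constraints, using a margin criterion as an illustrative stand-in for the thesis's expected-loss-reduction criteria; the stream protocol (pairs of a sample and a label oracle), the budget and threshold values, and the SGDClassifier choice are assumptions for the example only.

        import numpy as np
        from sklearn.linear_model import SGDClassifier

        def stream_active_learning(stream, classes, budget=0.1, threshold=0.2):
            model = SGDClassifier(loss="log_loss")
            queried = seen = 0
            for x, get_label in stream:          # get_label() is the costly oracle
                seen += 1
                if queried == 0:                 # bootstrap on the first sample
                    model.partial_fit([x], [get_label()], classes=classes)
                    queried += 1
                    continue
                p = np.sort(model.predict_proba([x])[0])
                margin = p[-1] - p[-2]           # small margin = uncertain model
                # Query only when labelling is expected to help most and the
                # labelling budget (a fraction of the stream) is not exhausted.
                if margin < threshold and queried < budget * seen:
                    model.partial_fit([x], [get_label()])
                    queried += 1
            return model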

    Incorporating complex cells into neural networks for pattern classification

    Computational neuroscientists have hypothesized that the visual system, from the retina to at least primary visual cortex, is continuously fitting a latent variable probability model to its stream of perceptions. It is not known exactly which probability model, nor exactly how the fitting takes place, but known algorithms for fitting such models require conditional estimates of the latent variables. This gives us a strong hint as to why the visual system might be fitting such a model: in the right kind of model, those conditional estimates can also serve as excellent features for analyzing the semantic content of the images perceived. The work presented here uses image classification performance (accurate discrimination between common classes of objects) as a basis for comparing visual system models, and algorithms for fitting those models as probability densities to images. This dissertation (a) finds that models based on visual area V1's complex cells generalize better from labeled training examples than conventional neural networks whose hidden units are more like V1's simple cells, (b) presents novel interpretations of complex-cell-based visual system models as probability distributions and novel algorithms for fitting them to data, and (c) demonstrates that these models form better features for image classification after they are first trained as probability models. Visual system models based on complex cells achieve some of the best results to date on the CIFAR-10 image classification benchmark, and samples from their probability distributions indicate that they have learnt to capture important aspects of natural images. Two auxiliary technical innovations that made this work possible are also described: a random search algorithm for selecting hyper-parameters, and an optimizing compiler for matrix-valued mathematical expressions that can target both CPU and GPU devices.
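    The random search contribution can be illustrated in a few lines: each hyper-parameter is drawn independently from its own distribution and the best trial is kept, so the budget is decoupled from the number of hyper-parameters (unlike grid search). The search space and ranges below are hypothetical examples, not those of the dissertation.

        import random

        def random_search(space, evaluate, n_trials=60, seed=0):
            # space maps each hyper-parameter name to a sampling function;
            # evaluate trains a model with a configuration and returns a score.
            rng = random.Random(seed)
            best_score, best_cfg = float("-inf"), None
            for _ in range(n_trials):
                cfg = {name: draw(rng) for name, draw in space.items()}
                score = evaluate(cfg)            # e.g. validation accuracy
                if score > best_score:
                    best_score, best_cfg = score, cfg
            return best_cfg, best_score

        space = {
            "learning_rate": lambda r: 10 ** r.uniform(-5, -1),   # log-uniform
            "hidden_units": lambda r: r.choice([128, 256, 512, 1024]),
        }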

    Energy Analytics for Infrastructure: An Application to Institutional Buildings

    Commercial buildings in the United States account for 19% of total annual energy consumption. The Commercial Building Energy Consumption Survey (CBECS), which serves as the benchmark for all commercial buildings, provides critical input for EnergyStar models. Smart energy management technologies, sensors, innovative demand response programs, and updated certification programs create opportunities to mitigate energy-related problems (blackouts and overproduction) and guide energy managers in optimizing consumption characteristics. With increasing advancements in technologies relying on 'Big Data', codes and certification programs such as those of the American Society of Heating, Refrigerating and Air-Conditioning Engineers (ASHRAE) and Leadership in Energy and Environmental Design (LEED) evaluate buildings during the pre-construction phase, mostly using assumed quantitative and qualitative values calculated from energy models such as EnergyPlus and eQUEST. However, energy consumption analysis through Knowledge Discovery in Databases (KDD) is not commonly used by energy managers, creating the need for a better energy analytics framework. This dissertation utilizes Interval Data (ID) and establishes three different frameworks to identify electricity losses, predict electricity consumption, and detect anomalies using data mining, deep learning, and mathematical models. The energy analytics process integrates with computational science and contributes to three objectives: 1. develop a framework to identify both technical and non-technical losses using clustering and semi-supervised learning techniques; 2. develop an integrated framework to predict electricity consumption using a wavelet-based data transformation model and deep learning algorithms; 3. develop a framework to detect anomalies using ensemble empirical mode decomposition and isolation forest algorithms. After a thorough research background, the first phase details data analytics performed on the demand-supply database to determine potential energy-loss reductions. The data preprocessing and electricity prediction framework of the second phase integrates mathematical models and deep learning algorithms to accurately predict consumption. The third phase employs a data decomposition model and data mining techniques to detect anomalies in institutional buildings.
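    As a concrete illustration of the third framework's detection stage, the sketch below runs an isolation forest over synthetic interval readings (the dissertation pairs it with ensemble empirical mode decomposition, which is omitted here; the data and contamination rate are invented for the example).

        import numpy as np
        from sklearn.ensemble import IsolationForest

        rng = np.random.default_rng(0)
        usage = rng.normal(50.0, 5.0, size=(96 * 30, 1))   # a month of 15-minute kWh intervals
        usage[-4:] = 120.0                                  # inject an abnormal spike

        detector = IsolationForest(contamination=0.01, random_state=0).fit(usage)
        labels = detector.predict(usage)                    # -1 flags anomalous intervals
        print(np.where(labels == -1)[0])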

    Masked Conditional Neural Networks for Sound Recognition

    Sound recognition has been studied for decades to grant machines the human hearing ability. Advances in this field help in a range of applications, from industrial ones such as fault detection in machines and noise monitoring, to household applications such as surveillance and hearing aids. The problem of sound recognition, like any pattern recognition task, involves the reliability of the extracted features and the recognition model. The problem has been approached through decades of crafted features used collaboratively with models based on neural networks or statistical models such as Gaussian Mixture and Hidden Markov models. Neural networks are currently being considered as a method to automate the feature extraction stage together with their already established role in recognition, and the performance of such models is approaching that of handcrafted features. However, current neural-network-based models are not primarily designed around the nature of the sound signal and may not optimally harness its distinctive properties. This thesis proposes neural network models that exploit the nature of the time-frequency representation of the sound signal. We propose the ConditionaL Neural Network (CLNN) and the Masked ConditionaL Neural Network (MCLNN). The CLNN is designed to account for the temporal dimension of a signal and behaves as the framework for the MCLNN. The MCLNN allows a filterbank-like behaviour to be embedded within the network using a specially designed binary mask. The masking subdivides the frequency range of a signal into bands and allows concurrent consideration of different feature combinations, analogous to the manual handcrafting of the optimum set of features for a recognition task. The proposed models have been evaluated through an extensive set of experiments using a range of publicly available datasets of music genres and environmental sounds, where they surpass state-of-the-art Convolutional Neural Networks and several hand-crafted attempts.
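    A minimal sketch of the filterbank-like masking idea: each hidden unit is connected only to a contiguous band of frequency bins, and successive units shift that band. The exact mask pattern, sizes, and names here are simplified assumptions, not the MCLNN's published design.

        import numpy as np

        def band_mask(n_bins, n_hidden, bandwidth, overlap):
            # Binary mask over a (frequency bins x hidden units) weight matrix:
            # unit j sees only a band of `bandwidth` bins, shifted per unit.
            mask = np.zeros((n_bins, n_hidden))
            stride = max(1, bandwidth - overlap)
            for j in range(n_hidden):
                start = (j * stride) % n_bins
                mask[start:start + bandwidth, j] = 1.0
            return mask

        # Element-wise masking keeps only in-band connections for each unit.
        W = np.random.randn(60, 40)            # 60 frequency bins -> 40 hidden units
        W_masked = W * band_mask(60, 40, bandwidth=8, overlap=4)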