164 research outputs found

    Adaptive Control of Arm Movement based on Cerebellar Model

    This study attempts to use a cerebellar model to control a biomimetic arm. Since a variety of cerebellar models with different levels of detail have been developed, we focused on a high-level model called MOSAIC, which is thought to describe cerebellar functionality without going into the details of the neural circuitry. To understand where this model fits, we briefly reviewed the biology of the cerebellum and a few alternative models. The arm control loop is, of course, composed of other components as well; we reviewed those elements with an emphasis on modeling them for our simulation. Among these models, the arm and the muscle system received the most attention. The musculoskeletal model was tested independently, and by means of optimization techniques a human-like control of the arm through muscle activations was achieved. We discuss how MOSAIC can solve a control problem and what drawbacks it has. Consequently, several ideas toward making practical use of the MOSAIC model were developed and tested. In this process we borrowed concepts and methods from control theory; specifically, known schemes for adaptive control of a manipulator, linearization, and approximation were utilized. Our final experiment used a modified MOSAIC model to adaptively control the arm. We call this model ORF-MOSAIC (Organized by Receptive Fields MOdular Selection And Identification for Control). With as few as 16 modules, we were able to control the arm in a workspace of 30 x 30 cm. The system was able to adapt to an external field as well as handle new objects despite delays. The discussion suggests that there are similarities between microzones in the cerebellum and the modules of this new model.
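
    The MOSAIC scheme pairs multiple forward/inverse models and blends their commands using a responsibility signal derived from each forward model's prediction error. The sketch below is a minimal, hypothetical illustration of that gating idea with linear modules; it is not the thesis's ORF-MOSAIC implementation, and all names, dimensions, and learning rates are assumptions.

```python
import numpy as np

class MosaicModule:
    """One paired forward/inverse model (linear, for illustration only)."""
    def __init__(self, dim_state, dim_cmd, lr=0.01):
        self.W_fwd = np.zeros((dim_state, dim_state + dim_cmd))  # forward model weights
        self.W_inv = np.zeros((dim_cmd, 2 * dim_state))          # inverse model weights
        self.lr = lr

    def predict(self, state, cmd):
        # Forward model: predicted next state given current state and command.
        return self.W_fwd @ np.concatenate([state, cmd])

    def command(self, state, target):
        # Inverse model: command intended to drive the state toward the target.
        return self.W_inv @ np.concatenate([state, target])

def responsibilities(modules, state, cmd, next_state, sigma=0.1):
    """Softmax of negative squared prediction error (MOSAIC-style gating)."""
    errs = np.array([np.sum((m.predict(state, cmd) - next_state) ** 2) for m in modules])
    w = np.exp(-errs / (2 * sigma ** 2)) + 1e-12
    return w / w.sum()

def control_step(modules, state, target, prev_state, prev_cmd):
    """Blend module commands by responsibility and adapt the forward models."""
    lam = responsibilities(modules, prev_state, prev_cmd, state)
    u = sum(l * m.command(state, target) for l, m in zip(lam, modules))
    for l, m in zip(lam, modules):
        # Responsibility-weighted gradient step on each forward model.
        x = np.concatenate([prev_state, prev_cmd])
        err = m.predict(prev_state, prev_cmd) - state
        m.W_fwd -= m.lr * l * np.outer(err, x)
    return u

# Toy usage: two modules, 2-D state (e.g. joint angles), 2-D command.
mods = [MosaicModule(2, 2) for _ in range(2)]
u = control_step(mods, state=np.zeros(2), target=np.ones(2),
                 prev_state=np.zeros(2), prev_cmd=np.zeros(2))
print(u)
```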

    Shallow and deep networks intrusion detection system: a taxonomy and survey

    Intrusion detection has attracted considerable interest from researchers and industry. After many years of research, the community still faces the problem of building reliable and efficient intrusion detection systems (IDS) capable of handling large quantities of data with changing patterns in real-time situations. The work presented in this manuscript classifies intrusion detection systems and presents a taxonomy and survey of shallow and deep network IDS based on previous and current work. The taxonomy and survey review machine learning techniques and their performance in detecting anomalies. Feature selection, which influences the effectiveness of machine learning (ML) based IDS, is discussed to explain its role in the classification and training phases of ML-based IDS. Finally, a discussion of false and true positive alarm rates is presented to help researchers build reliable and efficient machine learning based intrusion detection systems.
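
    Since the survey closes with false and true positive alarm rates, the small sketch below shows how those two rates are computed from labelled flows. It uses the standard textbook definitions, not anything specific to the surveyed systems, and the example data are invented.

```python
import numpy as np

def alarm_rates(y_true, y_pred):
    """True-positive (detection) and false-positive (false-alarm) rates.

    y_true, y_pred: 1 = attack, 0 = benign traffic.
    """
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    tpr = tp / (tp + fn) if (tp + fn) else 0.0   # detection rate
    fpr = fp / (fp + tn) if (fp + tn) else 0.0   # false-alarm rate
    return tpr, fpr

# Example: 6 flows; one missed attack and one false alarm.
print(alarm_rates([1, 1, 1, 0, 0, 0], [1, 0, 1, 0, 1, 0]))  # approx (0.67, 0.33)
```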

    Data-Based Modeling: Application in Process Identification, Monitoring and Fault Detection

    The present thesis explores the application of different data-based modeling techniques to process identification, product-quality monitoring, and fault detection. Biodegradation of the organic pollutant phenol is considered for the identification and fault detection case studies, while a wine data set is used to demonstrate the application of data-based models to product-quality monitoring. A comprehensive discussion of the theoretical and mathematical background of the data-based models, multivariate statistical models, and statistical models used in the thesis is provided. Identification of phenol biodegradation was carried out using Artificial Neural Networks (ANNs, namely Multi-Layer Perceptrons) and AutoRegressive models with eXogenous inputs (ARX), given the drawbacks and complications associated with a first-principles model. Both models identified the dynamics of the phenol biodegradation process well; the ANN outperformed the ARX models when trained with sufficient data, reaching an efficiency of almost 99.99%. A Partial Least Squares (PLS) based model was developed that can predict the steady-state process outcome at any level of the process variables within the range considered for model development. Three continuous process variables, namely temperature, pH, and RPM, were monitored using statistical process monitoring (SPM). Both univariate and multivariate SPM techniques were used for fault detection: X-bar charts together with range charts for univariate SPM, and Principal Component Analysis (PCA) for multivariate SPM. The advantage of multivariate over univariate statistical process monitoring is demonstrated.
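
    As a rough illustration of the ARX identification step mentioned above (not the thesis's actual phenol model; the model orders, coefficients, and noise level below are invented), an ARX model can be fitted by ordinary least squares on lagged outputs and inputs:

```python
import numpy as np

def fit_arx(y, u, na=2, nb=2):
    """Least-squares fit of an ARX(na, nb) model:
       y[k] = a1*y[k-1] + ... + a_na*y[k-na] + b1*u[k-1] + ... + b_nb*u[k-nb]."""
    n = max(na, nb)
    rows, targets = [], []
    for k in range(n, len(y)):
        rows.append(np.concatenate([y[k-na:k][::-1], u[k-nb:k][::-1]]))
        targets.append(y[k])
    theta, *_ = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)
    return theta[:na], theta[na:]   # AR coefficients, input coefficients

# Synthetic check: recover a known first-order response to a random input.
rng = np.random.default_rng(0)
u = rng.normal(size=500)
y = np.zeros(500)
for k in range(1, 500):
    y[k] = 0.8 * y[k-1] + 0.5 * u[k-1] + 0.01 * rng.normal()
a, b = fit_arx(y, u, na=1, nb=1)
print(a, b)   # close to [0.8] and [0.5]
```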

    Exploring Hyperspectral Imaging and 3D Convolutional Neural Network for Stress Classification in Plants

    Hyperspectral imaging (HSI) has emerged as a transformative imaging technology, characterized by its ability to capture a wide spectrum of light, including wavelengths beyond the visible range. This approach differs significantly from traditional imaging methods such as RGB imaging, which uses three color channels, and multispectral imaging, which captures several discrete spectral bands. HSI thus offers detailed spectral signatures for each pixel, facilitating a more nuanced analysis of the imaged subjects. This capability is particularly beneficial in applications such as agriculture, where it can detect changes in the physiological and structural characteristics of crops. Moreover, the ability of HSI to monitor these changes over time is advantageous for observing how subjects respond to different environmental conditions or treatments. However, the high-dimensional nature of hyperspectral data presents challenges in data processing and feature extraction, and traditional machine learning algorithms often struggle with such complexity. This is where 3D Convolutional Neural Networks (3D-CNNs) become valuable. Unlike 1D-CNNs, which extract features from the spectral dimension, and 2D-CNNs, which focus on the spatial dimensions, 3D-CNNs can process data across both spectral and spatial dimensions, making them adept at extracting complex features from hyperspectral data. In this thesis, we explored the potential of HSI combined with 3D-CNNs in the agricultural domain, where plant health and vitality are paramount. To evaluate this, we subjected lettuce plants to varying stress levels and assessed how well the method classifies stressed lettuce at early stages of growth into the respective stress-level groups. For this study, we created a dataset comprising 88 hyperspectral image samples of stressed lettuce. Utilizing Bayesian optimization, we developed 350 distinct 3D-CNN models to assess the method; the top-performing model achieved a test accuracy of 75.00%. Additionally, we addressed the challenge of generating valid 3D-CNN models in the Keras Tuner library through meticulous hyperparameter configuration. Our investigation also extends to the role of individual channels and channel groups within the color and near-infrared spectrum in predicting results for each stress-level group; we observed that the red and green spectra have a higher influence on the prediction results. Furthermore, we conducted a comprehensive review of 3D-CNN-based classification techniques for diseased and defective crops using non-UAV-based hyperspectral images.
    MITACS. Master of Science in Applied Computer Science.
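
    To make the 3D-CNN idea concrete, here is a minimal Keras sketch of a 3D convolutional classifier that convolves jointly over the spatial and spectral dimensions. It is not one of the 350 Bayesian-optimized models from the thesis; the input shape (64 x 64 pixels, 120 bands), layer sizes, and four-class output are assumptions chosen only for illustration.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical hyperspectral patch: 64 x 64 pixels, 120 spectral bands, 1 channel.
model = keras.Sequential([
    layers.Input(shape=(64, 64, 120, 1)),
    layers.Conv3D(8, kernel_size=(3, 3, 7), activation="relu"),   # joint spatial-spectral filters
    layers.MaxPooling3D(pool_size=(2, 2, 2)),
    layers.Conv3D(16, kernel_size=(3, 3, 5), activation="relu"),
    layers.MaxPooling3D(pool_size=(2, 2, 2)),
    layers.GlobalAveragePooling3D(),
    layers.Dense(32, activation="relu"),
    layers.Dense(4, activation="softmax"),   # e.g. four stress-level classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```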

    Representation learning on complex data

    Machine learning has enabled remarkable progress in various fields of research and application in recent years. The primary objective of machine learning consists of developing algorithms that can learn and improve through observation and experience. Machine learning algorithms learn from data, which may exhibit various forms of complexity that pose fundamental challenges. In this thesis, we address two major types of data complexity. First, data is often inherently connected and can be modeled by a single graph or by multiple graphs. Machine learning methods can potentially exploit these connections, for instance, to find groups of similar users in a social network for targeted marketing or to predict functional properties of proteins for drug design. Second, data is often high-dimensional, for instance, due to a large number of recorded features or induced by a quadratic pixel grid on images. Classical machine learning methods frequently fail when exposed to high-dimensional data, as several key assumptions cease to be satisfied. A major challenge associated with machine learning on graphs and high-dimensional data is therefore to derive meaningful representations of this data that allow models to learn effectively. In contrast to conventional manual feature engineering, representation learning aims at automatically learning data representations that are particularly suitable for the specific task at hand. Driven by a rapidly increasing availability of data, these methods have celebrated tremendous success for tasks such as object detection in images and speech recognition. However, there is still a considerable amount of research to be done to fully leverage such techniques for learning on graphs and high-dimensional data. In this thesis, we address the problem of learning meaningful representations for highly effective machine learning on complex data, in particular graph data and high-dimensional data. Additionally, most of our proposed methods are highly scalable, allowing them to learn from massive amounts of data. While we address a wide range of general learning problems with different modes of supervision, ranging from unsupervised problems on unlabeled data to (semi-)supervised learning on annotated data sets, we evaluate our models on specific tasks from fields such as social network analysis, information security, and computer vision. The first part of this thesis addresses representation learning on graphs. While existing graph neural network models commonly perform synchronous message passing between nodes and thus struggle with long-range dependencies and efficiency issues, our first proposed method performs fast asynchronous message passing and therefore supports adaptive and efficient learning, and additionally scales to large graphs. Another contribution is a novel graph-based approach to malware detection and classification based on network traffic. While existing methods classify individual network flows between two endpoints, our algorithm collects all traffic in a monitored network within a specific time frame and builds a communication graph, which is then classified using a novel graph neural network model. The developed model can also be applied to further graph classification or anomaly detection tasks. Two further contributions challenge a common assumption made by graph learning methods, termed homophily, which states that nodes with similar properties are usually closely connected in the graph.
To this end, we develop a method that predicts node-level properties leveraging the distribution of class labels appearing in the neighborhood of the respective node. This allows our model to learn general relations between a node and its neighbors that are not limited to homophily. Another proposed method captures structural similarity between nodes in order to model different roles, for instance, influencers and followers in a social network. In particular, we develop an unsupervised algorithm for deriving node descriptors based on how nodes spread probability mass to their neighbors, and we aggregate these descriptors to represent entire graphs. The second part of this thesis addresses representation learning on high-dimensional data. Specifically, we consider the problem of clustering high-dimensional data, such as images, texts, or gene expression profiles. Classical clustering algorithms struggle with this type of data since it usually cannot be assumed that data objects are similar with respect to all attributes, but only within a particular subspace of the full-dimensional ambient space. Subspace clustering is an approach to clustering high-dimensional data based on this assumption. While powerful neural network-based subspace clustering methods already exist, they commonly suffer from scalability issues and lack a theoretical foundation. To this end, we propose a novel metric learning approach to subspace clustering, which can provably recover linear subspaces under suitable assumptions and, at the same time, tremendously reduces the required number of model parameters and memory compared to existing algorithms.
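
    One of the contributions summarized above predicts node-level properties from the distribution of class labels in a node's neighborhood rather than assuming homophily. The sketch below is only a simplified, hypothetical feature extractor in that spirit; it is not the dissertation's actual model, and the toy graph and label encoding are invented.

```python
import numpy as np

def neighbor_label_histograms(adj, labels, num_classes):
    """For each node, the normalised distribution of class labels among its
    neighbours -- a feature that does not presuppose homophily."""
    n = adj.shape[0]
    feats = np.zeros((n, num_classes))
    for i in range(n):
        for j in np.flatnonzero(adj[i]):
            if labels[j] >= 0:              # -1 marks an unlabelled node
                feats[i, labels[j]] += 1
        s = feats[i].sum()
        if s > 0:
            feats[i] /= s
    return feats

# Toy graph: node 2's neighbours are split between classes 0 and 1.
adj = np.array([[0, 1, 1],
                [1, 0, 1],
                [1, 1, 0]])
labels = np.array([0, 1, -1])
print(neighbor_label_histograms(adj, labels, num_classes=2))
```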

    RBF-sítě s dynamickou architekturou (RBF Networks with Dynamic Architecture)

    This master thesis reviews several methods for clustering input data. Two well-known clustering algorithms, namely the K-means algorithm and the Fuzzy C-means (FCM) algorithm, are described, along with several methods for estimating the optimal number of clusters. Kohonen maps are then presented, together with two Kohonen-map models with dynamically changing structure: the Kohonen map with a growing grid and the growing neural gas model. Finally, the relatively new model of radial basis function (RBF) neural networks is described, along with several learning algorithms for this type of network. The thesis concludes with clustering experiments on real data describing international trade among the states of the world.
    Department of Theoretical Computer Science and Mathematical Logic, Faculty of Mathematics and Physics.
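
    Since the thesis combines clustering with RBF networks, the following sketch shows one common two-stage recipe: K-means selects the centres and least squares fits the linear output weights. It is a generic illustration under assumed data, widths, and centre counts, not the thesis's specific learning algorithms.

```python
import numpy as np
from sklearn.cluster import KMeans

def fit_rbf(X, y, n_centres=10, width=1.0):
    """Two-stage RBF network: K-means picks the centres, least squares the weights."""
    centres = KMeans(n_clusters=n_centres, n_init=10, random_state=0).fit(X).cluster_centers_
    # Gaussian design matrix: one basis function per centre.
    d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(-1)
    Phi = np.exp(-d2 / (2 * width ** 2))
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return centres, w

def predict_rbf(X, centres, w, width=1.0):
    d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * width ** 2)) @ w

# Toy regression: learn y = sin(x) from noisy samples.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + 0.05 * rng.normal(size=200)
c, w = fit_rbf(X, y, n_centres=15, width=0.7)
print(np.abs(predict_rbf(X, c, w, width=0.7) - y).mean())   # small mean error
```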

    Towards Improved Marketing Mix Decisions through Deep Learning and Human-AI Collaboration


    Domain-Specific Computing Architectures and Paradigms

    We live in an exciting era in which artificial intelligence (AI) is fundamentally shifting the dynamics of industries and businesses around the world. AI algorithms such as deep learning (DL) have drastically advanced state-of-the-art cognition and learning capabilities. However, the power of modern AI algorithms can only be unlocked if the underlying domain-specific computing hardware delivers orders of magnitude more performance and energy efficiency. This work focuses on that goal and explores three parts of the domain-specific computing acceleration problem, encapsulating specialized hardware and software architectures and paradigms that support the ever-growing processing demand of modern AI applications from the edge to the cloud. The first part of this work investigates the optimization of a sparse spatio-temporal (ST) cognitive system-on-a-chip (SoC). This design extracts ST features from videos and leverages sparse inference and kernel compression to efficiently perform action classification and motion tracking. The second part explores the significance of dataflows and reduction mechanisms for sparse deep neural network (DNN) acceleration. This design features a dynamic, look-ahead index matching unit in hardware to efficiently discover fine-grained parallelism, achieving high energy efficiency and low control complexity for a wide variety of DNN layers. Lastly, this work expands the scope to real-time machine learning (RTML) acceleration and proposes a new high-level architecture modeling framework. Specifically, the framework consists of a set of high-performance RTML-specific architecture design templates and a Python-based high-level modeling and compiler tool chain for efficient cross-stack architecture design and exploration.
    PhD, Electrical and Computer Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/162870/1/lchingen_1.pd
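
    The index matching idea mentioned above accumulates products only where the nonzero indices of compressed activations and weights coincide. The snippet below is a plain software analogue of that matching step (a simple two-pointer intersection), offered as an illustration of the concept rather than a description of the actual look-ahead hardware unit.

```python
def sparse_dot(idx_a, val_a, idx_b, val_b):
    """Multiply-accumulate only where the nonzero indices of two
    compressed operands match (two-pointer index matching)."""
    i = j = 0
    acc = 0.0
    while i < len(idx_a) and j < len(idx_b):
        if idx_a[i] == idx_b[j]:
            acc += val_a[i] * val_b[j]   # matched pair feeds the MAC
            i += 1
            j += 1
        elif idx_a[i] < idx_b[j]:
            i += 1                       # skip the unmatched activation
        else:
            j += 1                       # skip the unmatched weight
    return acc

# Sparse activation row and weight column in (index, value) form.
print(sparse_dot([1, 4, 7], [0.5, -2.0, 3.0],
                 [0, 4, 7, 9], [1.0, 0.25, 2.0, -1.0]))   # -0.5 + 6.0 = 5.5
```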

    A Comparative Analysis of Purkinje Cells Across Species Combining Modelling, Machine Learning and Information Theory

    A number of computational modelling studies have aimed to replicate the cerebellar Purkinje cell, though these typically use the morphology of rodent cells. While many species, including rodents, display intricate dendritic branching, it is not a universal feature of Purkinje cells. This study uses morphological reconstructions of 24 Purkinje cells from seven species to explore the changes that the cell has undergone through evolution and to examine whether these changes affect the processing capacity of the cell. This is achieved by combining several modes of study to gain a comprehensive overview of the variations between the cells in both morphology and behaviour. Passive and active computational models of the cells were created, using the same electrophysiological parameters and ion channels for all models, to characterise the voltage attenuation and electrophysiological behaviour of the cells. These results, together with several measures of branching and size, were then used to look for clusters in the data set using machine learning techniques and to visualise the differences within each species group. Information theory methods were also employed to compare the estimated information transfer from input to output across each cell. Together with a literature review of what is known about Purkinje cells and the cerebellum across the phylogenetic tree, these results show that while there are some obvious differences in morphology, the variation in electrophysiological behaviour within species groups is often as high as that between them. This suggests that morphological changes may occur in order to conserve behaviour in the face of other changes to the cerebellum.
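
    As a loose illustration of the information-theoretic comparison described above, the sketch below uses a plug-in histogram estimator of mutual information between paired input and output samples. It is a generic estimator applied to synthetic data, with bin counts and noise levels chosen arbitrarily; it is not the estimator or data used in the study.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Plug-in (histogram) estimate of I(X;Y) in bits for paired samples,
    e.g. binned synaptic input rates vs. output spike counts."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))

# Toy check: a noisy linear relationship carries measurable information.
rng = np.random.default_rng(1)
x = rng.normal(size=5000)
y = x + 0.5 * rng.normal(size=5000)
print(mutual_information(x, y))   # noticeably greater than 0 bits
```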