
    Merging chrominance and luminance in early, medium, and late fusion using Convolutional Neural Networks

    The field of machine learning has received extensive attention in recent years. In particular, computer vision problems have received abundant consideration as the use of images in our daily routines grows. Image classification is one of the most important tasks, used to organize, store, retrieve, and explain pictures. To that end, researchers have designed algorithms that automatically detect objects in images. Over the last decades, the common approach has been to create sets of manually designed features that image classification algorithms could exploit. More recently, researchers have designed algorithms that learn these feature sets automatically, surpassing state-of-the-art performance. However, learning optimal sets of features is computationally expensive; the cost can be reduced by adding prior knowledge about the task, which improves and accelerates the learning phase. Furthermore, for problems with a large feature space, the complexity of the models needs to be reduced to keep them computationally tractable (e.g., the recognition of human actions in videos). Consequently, we propose to use multimodal learning techniques to reduce the complexity of the learning phase in artificial neural networks by incorporating prior knowledge about the connectivity of the network. We also analyze state-of-the-art models for image classification and propose new architectures that can learn a locally optimal set of features more easily and quickly. In this thesis, we demonstrate that merging the luminance and the chrominance parts of images using multimodal learning techniques can improve the acquisition of good visual features. We compare the validation accuracy of several models and demonstrate that our approach outperforms the baseline model with statistically significant results.
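    To make the fusion idea concrete, the following is a minimal sketch (not the thesis code) of a late-fusion network in PyTorch with separate luminance (Y) and chrominance (CbCr) branches; the module names and layer sizes are illustrative assumptions.

    # Illustrative sketch: late fusion of luminance (Y) and chrominance (CbCr)
    # channels with two small convolutional branches in PyTorch.
    import torch
    import torch.nn as nn

    class LateFusionCNN(nn.Module):
        def __init__(self, num_classes=10):
            super().__init__()
            # Separate branches: 1-channel luminance, 2-channel chrominance.
            self.luma_branch = nn.Sequential(
                nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),
                nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1),
            )
            self.chroma_branch = nn.Sequential(
                nn.Conv2d(2, 16, 3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),
                nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1),
            )
            # Fusion happens only at the classifier ("late" fusion).
            self.classifier = nn.Linear(32 + 32, num_classes)

        def forward(self, y, cbcr):
            fy = self.luma_branch(y).flatten(1)
            fc = self.chroma_branch(cbcr).flatten(1)
            return self.classifier(torch.cat([fy, fc], dim=1))

    # Early fusion would instead concatenate Y and CbCr into a 3-channel input
    # and feed a single branch; medium fusion merges intermediate feature maps.
    model = LateFusionCNN()
    logits = model(torch.randn(4, 1, 32, 32), torch.randn(4, 2, 32, 32))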

    A Framework for Vector-Weighted Deep Neural Networks

    The vast majority of advances in deep neural network research operate on the basis of a real-valued weight space. Recent work in alternative spaces have challenged and complemented this idea; for instance, the use of complex- or binary-valued weights have yielded promising and fascinating results. We propose a framework for a novel weight space consisting of vector values which we christen VectorNet. We first develop the theoretical foundations of our proposed approach, including formalizing the requisite theory for forward and backpropagating values in a vector-weighted layer. We also introduce the concept of expansion and aggregation functions for conversion between real and vector values. These contributions enable the seamless integration of vector-weighted layers with conventional layers, resulting in network architectures exhibiting height in addition to width and depth, and consequently models which we might be inclined to call tall learning. As a means of evaluating its effect on model performance, we apply our framework on top of three neural network architectural families—the multilayer perceptron (MLP), convolutional neural network (CNN), and directed acyclic graph neural network (DAG-NN)—trained over multiple classic machine learning and image classification benchmarks. We also consider evolutionary algorithms for performing neural architecture search over the new hyperparameters introduced by our framework. Lastly, we solidify the case for the utility of our contributions by implementing our approach on real-world data in the domains of mental illness diagnosis and static malware detection, achieving state-of-the-art results in both. Our implementations are made publicly available to drive further investigation into the exciting potential of VectorNet
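    As a rough illustration of the general idea (not the paper's actual definitions), the sketch below implements a dense layer whose weights are length-k vectors, with a hypothetical expansion function lifting real inputs to vectors and an aggregation function collapsing vector outputs back to reals; all names and choices here are assumptions.

    # Hypothetical sketch of a vector-weighted dense layer in NumPy: each weight is
    # a length-k vector, real inputs are "expanded" to vectors (by replication) and
    # the vector-valued output is "aggregated" back to a real (by averaging).
    import numpy as np

    def expand(x, k):
        """Lift real inputs (n,) to vector values (n, k) by replication."""
        return np.repeat(x[:, None], k, axis=1)

    def aggregate(v):
        """Collapse vector values (m, k) back to reals (m,) by averaging components."""
        return v.mean(axis=1)

    def vector_dense(x, W, b):
        """x: (n,) real inputs; W: (m, n, k) vector weights; b: (m, k) vector biases."""
        k = W.shape[-1]
        xv = expand(x, k)                            # (n, k)
        out = np.einsum('mnk,nk->mk', W, xv) + b     # vector-valued pre-activations (m, k)
        return aggregate(np.tanh(out))               # real-valued outputs (m,)

    rng = np.random.default_rng(0)
    x = rng.normal(size=8)
    W = rng.normal(scale=0.1, size=(4, 8, 3))        # 4 units, 8 inputs, vector length k=3
    b = np.zeros((4, 3))
    print(vector_dense(x, W, b))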

    Review: Deep learning in electron microscopy

    Deep learning is transforming most areas of science and technology, including electron microscopy. This review offers a practical perspective aimed at developers with limited familiarity with deep learning. For context, we review popular applications of deep learning in electron microscopy. We then discuss the hardware and software needed to get started with deep learning and to interface with electron microscopes. Next, we review neural network components, popular architectures, and their optimization. Finally, we discuss future directions of deep learning in electron microscopy.

    Multimodal machine learning for intelligent mobility

    Scientific problems are solved by finding the optimal solution for a specific task. Some problems can be solved analytically, while others are solved using data-driven methods. The use of digital technologies to improve the transportation of people and goods, referred to as intelligent mobility, is one of the principal beneficiaries of data-driven solutions, and autonomous vehicles are at the heart of the developments that propel it. Because of the high dimensionality and complexity of real-world environments, it is near impossible to program decision-making logic for every eventuality manually, so data-driven solutions need to become commonplace in intelligent mobility. While recent developments such as deep learning allow machines to learn effectively from large datasets, applications of these techniques within safety-critical systems such as driverless cars remain scarce. Autonomous vehicles need to make context-driven decisions autonomously in the different environments in which they operate. The recent literature on driverless vehicle research focuses heavily on road and highway environments and has discounted pedestrianized areas and indoor environments. These unstructured environments tend to have more clutter and change rapidly over time. Therefore, for intelligent mobility to make a significant impact on human life, it is vital to extend its application beyond structured environments. To further advance intelligent mobility, researchers need to take cues from multiple sensor streams and multiple machine learning algorithms so that decisions can be robust and reliable; only then will machines be able to operate safely in unstructured and dynamic environments. Towards addressing these limitations, this thesis investigates data-driven solutions for crucial building blocks of intelligent mobility. Specifically, it investigates multimodal sensor data fusion, machine learning, multimodal deep representation learning, and their application to intelligent mobility. This work demonstrates that mobile robots can use multimodal machine learning to derive a driving policy and therefore make autonomous decisions. To facilitate the autonomous decisions necessary to derive safe driving algorithms, we present algorithms for free-space detection and human activity recognition. Driving these decision-making algorithms are datasets collected throughout this study: the Loughborough London Autonomous Vehicle dataset and the Loughborough London Human Activity Recognition dataset. The datasets were collected using an autonomous platform designed and developed in house as part of this research. The proposed framework for free-space detection is based on an active learning paradigm that leverages the relative uncertainty of multimodal sensor data streams (ultrasound and camera) and uses an online learning methodology to continuously update the learnt model whenever the vehicle experiences new environments. The proposed free-space detection algorithm enables an autonomous vehicle to self-learn, evolve, and adapt to environments never encountered before. The results illustrate that the online learning mechanism is superior to one-off training of deep neural networks, which requires large datasets to generalize to unfamiliar surroundings. The thesis takes the view that humans should be at the centre of any technological development related to artificial intelligence. This is imperative within the spectrum of intelligent mobility, where an autonomous vehicle should be aware of what humans are doing in its vicinity. Towards improving the robustness of human activity recognition, this thesis proposes a novel algorithm that classifies point-cloud data originating from Light Detection and Ranging (LiDAR) sensors. The proposed algorithm leverages multimodality by using camera data to identify humans and segment the region of interest in the point cloud. The corresponding 3-dimensional data are converted to a Fisher Vector representation before being classified by a deep convolutional neural network. The proposed algorithm classifies the indoor activities performed by a human subject with an average precision of 90.3%. When compared to an alternative point cloud classifier, PointNet [1], [2], the proposed framework outperformed it on all classes. The developed autonomous testbed for data collection and algorithm validation, together with the multimodal data-driven solutions for driverless cars, are the major contributions of this thesis. It is anticipated that these results and the testbed will have significant implications for the future of intelligent mobility by amplifying the development of intelligent driverless vehicles.
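    For readers unfamiliar with the Fisher Vector step in the pipeline above, the following is a simplified sketch (not the thesis code) of encoding a segmented 3-D point cloud with a GMM fitted via scikit-learn; it keeps only the gradients with respect to the GMM means, and all names and parameters are illustrative assumptions.

    # Simplified Fisher Vector encoding of a 3-D point cloud: fit a diagonal-covariance
    # GMM on point coordinates, then stack normalized gradients of the log-likelihood
    # with respect to the GMM means to obtain a fixed-length descriptor.
    import numpy as np
    from sklearn.mixture import GaussianMixture

    def fisher_vector_means(points, gmm):
        """points: (N, 3) array; gmm: fitted GaussianMixture with 'diag' covariances."""
        q = gmm.predict_proba(points)                     # (N, K) soft assignments
        diffs = points[:, None, :] - gmm.means_[None]     # (N, K, 3)
        sigma = np.sqrt(gmm.covariances_)                 # (K, 3) for 'diag' covariance
        grad_mu = (q[:, :, None] * diffs / sigma[None]).sum(axis=0)   # (K, 3)
        grad_mu /= points.shape[0] * np.sqrt(gmm.weights_)[:, None]
        fv = grad_mu.ravel()
        fv = np.sign(fv) * np.sqrt(np.abs(fv))            # power normalization
        return fv / (np.linalg.norm(fv) + 1e-12)          # L2 normalization

    cloud = np.random.randn(2048, 3)                      # stand-in for a segmented human ROI
    gmm = GaussianMixture(n_components=8, covariance_type='diag', random_state=0).fit(cloud)
    descriptor = fisher_vector_means(cloud, gmm)          # fixed-length input for a classifier
    print(descriptor.shape)                               # (24,) = 8 components x 3 dims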

    A Review of Findings from Neuroscience and Cognitive Psychology as Possible Inspiration for the Path to Artificial General Intelligence

    This review aims to contribute to the quest for artificial general intelligence by examining neuroscience and cognitive psychology methods for potential inspiration. Despite the impressive advancements achieved by deep learning models in various domains, they still have shortcomings in abstract reasoning and causal understanding. Such capabilities should ultimately be integrated into artificial intelligence systems in order to surpass data-driven limitations and support decision making in a way more similar to human intelligence. This work is a vertical review that attempts a wide-ranging exploration of brain function, spanning from lower-level biological neurons, spiking neural networks, and neuronal ensembles to higher-level concepts such as brain anatomy, vector symbolic architectures, cognitive and categorization models, and cognitive architectures. The hope is that these concepts may offer insights for solutions in artificial general intelligence. Comment: 143 pages, 49 figures, 244 references.
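    As a concrete anchor for the spiking-neural-network building block mentioned above, here is a minimal leaky integrate-and-fire (LIF) neuron simulation; it is purely illustrative, and the parameter values are arbitrary assumptions rather than anything taken from the review.

    # Minimal leaky integrate-and-fire (LIF) neuron: integrate input current with a
    # leak toward the resting potential, emit a spike and reset when the threshold
    # is crossed.
    import numpy as np

    def simulate_lif(input_current, dt=1e-3, tau=20e-3, v_rest=-65e-3,
                     v_reset=-70e-3, v_thresh=-50e-3, r_m=10e6):
        v = v_rest
        spike_times = []
        for t, i_in in enumerate(input_current):
            v += (-(v - v_rest) + r_m * i_in) * dt / tau   # leaky integration
            if v >= v_thresh:                              # threshold crossing -> spike
                spike_times.append(t * dt)
                v = v_reset                                # reset after spike
        return spike_times

    current = np.full(1000, 2.0e-9)                        # 1 s of constant 2 nA input
    print(len(simulate_lif(current)), "spikes in 1 s")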

    Visual perception: an information-based approach to understanding biological and artificial vision

    The central issues of this dissertation are (a) what we should be doing — what problems we should be trying to solve — in order to build computer vision systems, and (b) what relevance biological vision has to the solution of these problems. The approach taken to tackle these issues centres mostly on the clarification and use of information-based ideas, and an investigation into the nature of the processes underlying perception. The primary objective is to demonstrate that information theory and extensions of it, together with measurement theory, are powerful tools for finding solutions to these problems. The quantitative meaning of information is examined, from its origins in physical theories, through Shannon information theory, Gabor representations and codes, towards semantic interpretations of the term. The application of information theory to understanding the developmental and functional properties of biological visual systems is also discussed. This includes a review of the current state of knowledge of the architecture and function of the early visual pathways, particularly the retina, and a discussion of the possible coding functions of cortical neurons. The nature of perception is discussed from a number of points of view: the types and function of explanations of perceptual systems and how these relate to the operation of the system; the role of the observer in describing perceptual functions in other systems or organisms; the status and role of objectivist and representational viewpoints in understanding vision; the philosophical basis of perception; the relationship between pattern recognition and perception; and the interpretation of perception in terms of a theory of measurement. These two threads of research, information theory and measurement theory, are brought together in an overview and reinterpretation of the cortical role in mammalian vision. Finally, the application of some of the coding and recognition concepts to industrial inspection problems is described. The nature of the coding processes used is unusual in that coded images are used as the input to a simple neural network classifier, rather than a heuristic feature set. The relationship between the Karhunen-Loève transform and the singular value decomposition is clarified as background to the coding technique used to code the images. This coding technique has also been used to code long sequences of moving images to investigate the possibility of recognizing people on the basis of their gait or posture, and this application is briefly described.
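    The relationship between the Karhunen-Loève transform and the singular value decomposition noted above can be checked numerically; the short sketch below (not from the dissertation, data are random stand-ins) shows that the KLT/PCA basis of mean-centred data coincides with the right singular vectors of the data matrix, up to sign.

    # Numerical check: KLT basis (eigenvectors of the covariance matrix) vs. right
    # singular vectors of the mean-centred data matrix.
    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 16))            # 200 "images" of 16 pixels each
    Xc = X - X.mean(axis=0)                   # mean-centre

    # KLT basis: eigenvectors of the covariance matrix, sorted by decreasing eigenvalue.
    cov = Xc.T @ Xc / (len(Xc) - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)
    klt_basis = eigvecs[:, ::-1]

    # SVD of the centred data: right singular vectors span the same basis.
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    svd_basis = Vt.T

    # Agreement up to sign; eigenvalues relate to singular values by s^2 / (n - 1).
    print(np.allclose(np.abs(klt_basis), np.abs(svd_basis), atol=1e-6))
    print(np.allclose(eigvals[::-1], S**2 / (len(Xc) - 1)))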

    Machine Learning

    Machine learning can be defined in various ways; broadly, it is a scientific domain concerned with the design and development of the theoretical and implementation tools that allow building systems with some human-like intelligent behavior. More specifically, machine learning addresses the ability of systems to improve automatically through experience.

    Semiconductor Memory Applications in Radiation Environment, Hardware Security and Machine Learning System

    Semiconductor memory is a key component of computing systems. Beyond conventional memory and data storage applications, this dissertation explores both mainstream and emerging non-volatile memory (eNVM) technologies for radiation environments, hardware security systems, and machine learning applications. In radiation environments, e.g. aerospace, memory devices are exposed to energetic particles. The strike of these particles can generate electron-hole pairs (directly or indirectly) as they pass through the semiconductor device, resulting in photo-induced current that may change the memory state. First, the trend of radiation effects in mainstream memory technologies with technology node scaling is reviewed. Then, single-event effects in oxide-based resistive switching random access memory (RRAM), one of the eNVM technologies, are investigated from the circuit level to the system level. The Physical Unclonable Function (PUF) has been widely investigated as a promising hardware security primitive that employs the inherent randomness in a physical system (e.g. intrinsic semiconductor manufacturing variability). In this dissertation, two RRAM-based PUF implementations are proposed, for cryptographic key generation (weak PUF) and device authentication (strong PUF), respectively. The performance of the RRAM PUFs is evaluated with experiments and simulation. The impact of non-ideal circuit effects on the performance of the PUFs is also investigated, and optimization strategies are proposed to mitigate these effects. In addition, the resistance against modeling and machine learning attacks is analyzed. Deep neural networks (DNNs) have shown remarkable improvements in various intelligent applications such as image classification, speech classification, and object localization and detection, and increasing effort has been devoted to developing hardware accelerators for them. In this dissertation, two compute-in-memory (CIM) hardware accelerator designs, based on SRAM and eNVM technologies, are proposed for two binary neural networks, the hybrid BNN (HBNN) and the XNOR-BNN, respectively, targeting hardware-resource-limited platforms such as edge devices. These designs feature high throughput, scalability, low latency, and high energy efficiency. Finally, we have successfully taped out and validated the proposed designs with SRAM technology in TSMC 65 nm. Overall, this dissertation paves the path for new applications of memory technologies towards secure and energy-efficient artificial intelligence systems.
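    To illustrate the arithmetic that XNOR-style binary neural networks and compute-in-memory arrays accelerate, the sketch below shows how a dot product over {-1, +1} values reduces to XNOR plus a population count; this is a conceptual model only, not the taped-out design described above, and all names are illustrative.

    # With weights and activations constrained to {-1, +1}, a dot product equals
    # 2 * (number of matching bits) - n, i.e. XNOR followed by a popcount.
    import numpy as np

    def binarize(x):
        return np.where(x >= 0, 1, -1).astype(np.int8)

    def xnor_popcount_dot(a_bits, w_bits):
        """a_bits, w_bits: arrays over {0, 1} encoding -1 as 0 and +1 as 1."""
        n = a_bits.size
        matches = np.count_nonzero(a_bits == w_bits)   # XNOR = 1 where bits agree
        return 2 * matches - n                         # map popcount back to the +/-1 dot product

    rng = np.random.default_rng(0)
    a = binarize(rng.normal(size=128))
    w = binarize(rng.normal(size=128))

    reference = int(np.dot(a.astype(int), w.astype(int)))              # +/-1 dot product
    bitwise = xnor_popcount_dot((a > 0).astype(np.uint8), (w > 0).astype(np.uint8))
    assert reference == bitwise
    print(reference, bitwise)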

    Understanding the computational processes of the human brain through the interpretation of machine learning models. A data-driven approach to computational neuroscience

    Building a model of a complex phenomenon is an ancient way of gaining knowledge and understanding of the reality around us. Models of planetary motion, gravity, and particle physics are examples of the success of this approach. In neuroscience, there are two ways of coming up with explanations of reality: a traditional hypothesis-driven approach, where a model is first formulated and then tested using the data, and a more recent data-driven approach that relies on machine learning to generate models automatically. The hypothesis-driven approach provides full understanding of how the model works, but is time-consuming, as each model has to be conceived and tested manually. The data-driven approach requires only the data and the computational resources to sift through potential models, saving time, but leaving the resulting model itself a black box. Given the growing amount of neural data, we argue in favor of a more widespread adoption of the data-driven approach in neuroscience, reallocating part of the human effort from manual modeling to the interpretation of how the learned models work. The thesis is based on three examples of how interpretation of machine-learned models leads to neuroscientific insights at three different levels of neural organization. Our first interpretable model is used to characterize the neural dynamics of localized neural activity during a visual perceptual categorization task, identifying which time-varying frequency components of the human brain signal characterize the task. Next, we compare the activity of the human visual system with the activations of a convolutional neural network across its layers; this comparison allowed us to confirm the hypothesis that both systems rely on a hierarchical structure, revealing explanations about the functional organization of the human visual cortex. Lastly, we use topology-preserving dimensionality reduction and visualization techniques to understand the relative organization of mental concepts within a subject's mental state space and apply this in the context of brain-computer interfaces. Recent results in machine learning and AI show that some mechanisms in our brain are similar to the mechanisms that machine learning algorithms arrive at during training. This fact endorses the relevance of our approach: interpreting the mechanisms employed by machine learning models can shed light on the mechanisms employed by our brain. (https://www.ester.ee/record=b536057)
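    The kind of brain-vs-network comparison described above is often phrased as representational similarity analysis (RSA); the sketch below is a hypothetical illustration with random stand-in data (not the thesis recordings), showing how responses of a brain region and a CNN layer to the same stimuli can be summarized as dissimilarity matrices and correlated.

    # Representational similarity analysis: build a representational dissimilarity
    # matrix (RDM) for each system over the same stimuli, then correlate the RDMs.
    import numpy as np
    from scipy.spatial.distance import pdist
    from scipy.stats import spearmanr

    def rdm(responses):
        """responses: (n_stimuli, n_features) -> condensed correlation-distance matrix."""
        return pdist(responses, metric='correlation')

    rng = np.random.default_rng(0)
    n_stimuli = 50
    brain_responses = rng.normal(size=(n_stimuli, 200))    # e.g. voxel/sensor responses
    layer_activations = rng.normal(size=(n_stimuli, 512))  # e.g. CNN layer activations

    rho, p = spearmanr(rdm(brain_responses), rdm(layer_activations))
    print(f"RDM similarity (Spearman rho): {rho:.3f}, p = {p:.3f}")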
    • 

    corecore