885 research outputs found

    Mobile application to identify recyclable materials

    This dissertation proposes a system to help consumers recycle efficiently. The system consists of a mobile application that captures images of waste and classifies them by category using a machine learning model. The application can also communicate with a server to download improved versions of the model and to upload images that contribute to training more precise model versions. The system has been validated with a fully working prototype. Although the proof of concept has been achieved, with some types of waste items correctly categorized, the machine learning model produced is not precise enough to be used in real-life scenarios, that is, for any type of waste. The main contributions of this study are a compendium of information on computer vision and machine learning for waste categorization, and a working prototype system that uses crowdsourcing and machine learning to help consumers recycle more efficiently.

    IST Austria Thesis

    Deep learning is best known for its empirical success across a wide range of applications spanning computer vision, natural language processing and speech. Of equal significance, though perhaps less known, are its ramifications for learning theory: deep networks have been observed to perform surprisingly well in the high-capacity regime, also known as the overfitting or underspecified regime. Classically, this regime on the far right of the bias-variance curve is associated with poor generalisation; however, recent experiments with deep networks challenge this view. This thesis is devoted to investigating various aspects of underspecification in deep learning. First, we argue that deep learning models are underspecified on two levels: a) any given training dataset can be fit by many different functions, and b) any given function can be expressed by many different parameter configurations. We refer to the second kind of underspecification as parameterisation redundancy and we precisely characterise its extent. Second, we characterise the implicit criteria (the inductive bias) that guide learning in the underspecified regime. Specifically, we consider a nonlinear but tractable classification setting, and show that, given the choice, neural networks learn classifiers with a large margin. Third, we consider learning scenarios where the inductive bias is not by itself sufficient to deal with underspecification. We then study different ways of ‘tightening the specification’: i) in the setting of representation learning with variational autoencoders, we propose a hand-crafted regulariser based on mutual information; ii) in the setting of binary classification, we consider soft-label (real-valued) supervision. We derive a generalisation bound for linear networks supervised in this way and verify that soft labels facilitate fast learning. Finally, we explore an application of soft-label supervision to the training of multi-exit models.

    Towards Developing Computer Vision Algorithms and Architectures for Real-world Applications

    Computer vision technology automatically extracts high-level, meaningful information from visual data such as images or videos, and object recognition and detection algorithms are essential in most computer vision applications. In this dissertation, we focus on developing algorithms for real-life computer vision applications, presenting innovative algorithms for object segmentation and feature extraction for object and action recognition in video data, sparse feature selection algorithms for medical image analysis, and automated feature extraction using convolutional neural networks for blood cancer grading. To detect and classify objects in video, the objects have to be separated from the background, and discriminant features are then extracted from the region of interest (ROI) before being fed to a classifier. Effective object segmentation and feature extraction are often application specific and pose major challenges for object detection and classification tasks. In this dissertation, we present an effective object flow based ROI generation algorithm for segmenting moving objects in video data, which can be applied in surveillance and self-driving vehicles. Optical flow can also be used as a feature in human action recognition, and we present the use of optical flow features in a pre-trained convolutional neural network to improve the performance of human action recognition algorithms. Both algorithms outperformed the state of the art at the time. Medical images and videos pose unique challenges for image understanding, mainly because tissues and cells are often irregularly shaped, colored, and textured, and hand-selecting the most discriminant features is often difficult; an automated feature selection method is therefore desirable. Sparse learning is a technique for extracting the most discriminant and representative features from raw visual data. However, sparse learning with L1 regularization only takes sparsity in the feature dimension into consideration; we improve the algorithm so that it selects the type of features as well, and less important or noisy feature types are removed from the feature set entirely. We apply this algorithm to the analysis of endoscopy images to detect abnormalities in the esophagus and stomach, such as ulcers and cancer. Besides the sparsity constraint, other application-specific constraints and prior knowledge may also need to be incorporated into the loss function in sparse learning to obtain the desired results. We demonstrate how to incorporate a similar-inhibition constraint and gaze and attention priors in sparse dictionary selection for gastroscopic video summarization, which enables intelligent key frame extraction from gastroscopic video data. With recent advances in multi-layer neural networks, automatic end-to-end feature learning has become feasible. Convolutional neural networks mimic the mammalian visual cortex and can extract the most discriminant features automatically from training samples. We present the use of a convolutional neural network with a hierarchical classifier to grade the severity of Follicular Lymphoma, a type of blood cancer; it reaches 91% accuracy, on par with analysis by expert pathologists. Developing real-world computer vision applications is more than just developing core vision algorithms to extract and understand information from visual data; it is also subject to many practical requirements and constraints, such as hardware and computing infrastructure, cost, robustness to lighting changes and deformation, and ease of use and deployment. The general processing pipelines and system architectures of computer vision based applications share many similar design principles. We developed common processing components and a generic framework for computer vision applications, and a versatile scale-adaptive template matching algorithm for object detection. We demonstrate the design principles and best practices by developing and deploying a complete real-life computer vision application, a multi-channel water level monitoring system, whose techniques and design methodology can be generalized to other real-life applications. General software engineering principles, such as modularity, abstraction, robustness to requirement changes, and generality, are all demonstrated in this research. Dissertation/Thesis: Doctoral Dissertation, Computer Science 201

    Efficient hardware implementations of bio-inspired networks

    The human brain, with its massive computational capability and power efficiency in small form factor, continues to inspire the ultimate goal of building machines that can perform tasks without being explicitly programmed. In an effort to mimic the natural information processing paradigms observed in the brain, several neural network generations have been proposed over the years. Among the neural networks inspired by biology, second-generation Artificial or Deep Neural Networks (ANNs/DNNs) use memoryless neuron models and have shown unprecedented success surpassing humans in a wide variety of tasks. Unlike ANNs, third-generation Spiking Neural Networks (SNNs) closely mimic biological neurons by operating on discrete and sparse events in time called spikes, which are obtained by the time integration of previous inputs. Implementation of data-intensive neural network models on computers based on the von Neumann architecture is mainly limited by the continuous data transfer between the physically separated memory and processing units. Hence, non-von Neumann architectural solutions are essential for processing these memory-intensive bio-inspired neural networks in an energy-efficient manner. Among the non-von Neumann architectures, implementations employing non-volatile memory (NVM) devices are most promising due to their compact size and low operating power. However, it is non-trivial to integrate these nanoscale devices on conventional computational substrates due to their non-idealities, such as limited dynamic range, finite bit resolution, programming variability, etc. This dissertation demonstrates the architectural and algorithmic optimizations of implementing bio-inspired neural networks using emerging nanoscale devices. The first half of the dissertation focuses on the hardware acceleration of DNN implementations. A 4-layer stochastic DNN in a crossbar architecture with memristive devices at the cross point is analyzed for accelerating DNN training. 
This network is then used as a baseline to explore the impact of experimental memristive device behavior on network performance. Programming variability is found to play a more critical role in determining network performance than the other non-ideal characteristics of the devices. In addition, noise-resilient inference engines are demonstrated using stochastic memristive DNNs with 100 bits for stochastic encoding during inference and 10 bits for the expensive training. The second half of the dissertation focuses on a novel probabilistic framework for SNNs using Generalized Linear Model (GLM) neurons to capture neuronal behavior. This work demonstrates that probabilistic SNNs perform comparably to equivalent ANNs on two popular benchmarks: handwritten-digit classification and human activity recognition. Considering the potential of SNNs for energy-efficient implementations, a hardware accelerator for inference is proposed, termed the Spintronic Accelerator for Probabilistic SNNs (SpinAPS). The learning algorithm is optimized for a hardware-friendly implementation and uses a first-to-spike decoding scheme for low-latency inference. With binary spintronic synapses and digital CMOS logic neurons for computations, SpinAPS achieves a performance improvement of 4x in terms of GSOPS/W/mm^2 when compared to a conventional SRAM-based design. Collectively, this work demonstrates the potential of emerging memory technologies in building energy-efficient hardware architectures for deep and spiking neural networks. The design strategies adopted in this work can be extended to other spike and non-spike based systems for building embedded solutions with power/energy constraints.

    Training based segmentation for tissue extraction in whole slide image

    Reducing the time and storage memory required for scanning whole slide images (WSIs) is crucial. In this thesis work we tested and assessed the performance of two popular neural network architectures, namely DeepLabV3+ and Unet. In addition, a desktop application for annotating histopathology images was developed; this application ultimately provided the data needed to train the neural networks. Both DeepLabV3+ and Unet accurately separated the regions of interest from the WSIs; however, DeepLabV3+ outperformed Unet, reaching a pixel-wise accuracy of 96.3%, while Unet scored 94.7% on the same metric. Moreover, DeepLabV3+ also outscored Unet on the IoU metric, with values of 0.446 and 0.398 respectively. We showed the effectiveness of deep neural networks for semantic segmentation of histopathology images, more specifically for extracting tissue areas from WSIs, and how this can be used to improve the performance of WSI scanners.
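    The two metrics quoted in the abstract above can be illustrated with a minimal sketch. It assumes binary masks (1 = tissue, 0 = background) and the standard definitions of pixel-wise accuracy and intersection over union; the thesis may compute them differently (for example, averaged over classes or images), and the function names and toy masks below are hypothetical.

```python
def _pairs(pred, target):
    """Flatten two equally sized 2-D masks into (predicted, true) pixel pairs."""
    return [(p, t) for prow, trow in zip(pred, target) for p, t in zip(prow, trow)]


def pixel_accuracy(pred, target):
    """Fraction of pixels whose predicted label equals the ground truth."""
    pairs = _pairs(pred, target)
    return sum(1 for p, t in pairs if p == t) / len(pairs)


def iou(pred, target):
    """Intersection over union of the foreground (label 1) regions."""
    pairs = _pairs(pred, target)
    intersection = sum(1 for p, t in pairs if p and t)
    union = sum(1 for p, t in pairs if p or t)
    # Convention (an assumption): two empty masks count as a perfect match.
    return intersection / union if union else 1.0


# Toy 4x4 example: two true tissue pixels, two predicted, one overlapping.
target = [[1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
pred = [[0, 1, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]

acc = pixel_accuracy(pred, target)  # 14 of 16 pixels agree -> 0.875
score = iou(pred, target)           # 1 shared pixel / 3 in the union -> 0.333...
print(acc, score)
```

    In this toy case the accuracy is high because most pixels are background in both masks, while the IoU is much lower because it ignores the pixels that are background in both.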

    Systém na podporu analýzy biomedicínských dat (A System for Supporting the Analysis of Biomedical Data)

    The analysis of biomedical data is a topical task, mainly thanks to the ever-evolving technologies for obtaining and preprocessing biological samples. This diploma thesis presents a software environment supporting experiments with real-world data using machine learning methods based on neural networks. The first part of the work discusses core concepts of machine learning and neural networks. The second part describes the requirements, architecture, technologies used, and all capabilities of the application. Department: 460 - Katedra informatiky (Department of Computer Science).

    12th SC@RUG 2015 proceedings: Student Colloquium 2014-2015
