
    Deep Learning for Crowd Anomaly Detection

    Today, public areas across the globe are monitored by an increasing number of surveillance cameras. This widespread usage has produced an ever-growing volume of data that cannot realistically be examined in real time. Efforts to understand crowd dynamics have therefore drawn attention to automatic systems for detecting anomalies in crowds. This thesis surveys the methods used in the literature for this purpose, focusing on those that fuse dense optical flow into the feature extraction stage of the crowd anomaly detection pipeline. To this end, five deep learning architectures are trained using optical flow maps estimated by three deep learning-based techniques. More specifically, a 2D convolutional network, a 3D convolutional network, an LSTM-based convolutional recurrent network, a pre-trained variant of the latter, and a ConvLSTM-based autoencoder are trained using both regular frames and optical flow maps estimated by LiteFlowNet3, RAFT, and GMA on the UCSD Pedestrian 1 dataset. The experimental results show that, while prone to overfitting, the use of optical flow maps may improve the performance of supervised spatio-temporal architectures.
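
    The sketch below illustrates the general recipe the abstract describes, assuming PyTorch and torchvision: dense optical flow is estimated with a pre-trained RAFT model, and the resulting flow maps are fed to a small 3D convolutional classifier. The frame sizes and the toy classifier are illustrative assumptions, not the thesis's exact architectures.

        import torch
        import torch.nn as nn
        from torchvision.models.optical_flow import raft_large, Raft_Large_Weights

        weights = Raft_Large_Weights.DEFAULT
        raft = raft_large(weights=weights).eval()

        # Two consecutive frames; RAFT needs H and W divisible by 8.
        frame_t = torch.rand(1, 3, 240, 360)
        frame_t1 = torch.rand(1, 3, 240, 360)
        img1, img2 = weights.transforms()(frame_t, frame_t1)

        with torch.no_grad():
            flow = raft(img1, img2)[-1]            # (1, 2, H, W): dense (dx, dy) per pixel

        # Stack flow maps into a short clip and classify it as normal vs. anomalous.
        clip = flow.unsqueeze(2).repeat(1, 1, 8, 1, 1)  # toy (N, C=2, T=8, H, W) clip

        classifier = nn.Sequential(
            nn.Conv3d(2, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1), nn.Flatten(),
            nn.Linear(16, 2),                      # normal / anomaly logits
        )
        logits = classifier(clip)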

    Development of a recommendation system for scientific literature based on deep learning

    Master's dissertation in Bioinformatics. The previous few decades have seen an enormous volume of articles from the scientific community on the most diverse biomedical topics, making it extremely challenging for researchers to find relevant information. Methods like Machine Learning (ML) and Deep Learning (DL) have been used to create tools that can speed up this process. In that context, this work examines the performance of different ML and DL techniques when classifying biomedical documents, mainly regarding their relevance to given topics. To evaluate the different techniques, the dataset from the BioCreative VI Track 4 challenge was used. The objective of the challenge was to identify documents related to protein-protein interactions altered by mutations, an extremely important topic in precision medicine. Protein-protein interactions play a crucial role in the cellular mechanisms of all living organisms, and mutations at these interaction sites can be indicative of disease. To prepare the data for training, text processing methods were implemented in the Omnia package from OmniumAI, the host company of this work. Several preprocessing and feature extraction methods were implemented, such as stopword removal and TF-IDF, which may be reused in other case studies with either generic or biomedical text. These methods, in conjunction with ML pipelines already developed by the Omnia team, allowed the training of several traditional ML models. Applying these traditional ML models to the same dataset, we achieved a small improvement in performance compared to the challenge baseline. Regarding DL, testing with a CNN model showed that the BioWordVec pre-trained embedding achieved the best performance of all the pre-trained embeddings. Additionally, we explored the application of more complex DL models, which outperformed the best challenge submission: BioLinkBERT achieved an improvement of 0.4 percentage points in precision, 4.9 percentage points in recall, and 2.2 percentage points in F1.
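
    A minimal sketch of the traditional-ML baseline described above, assuming scikit-learn: TF-IDF features with stopword removal feeding a linear classifier. The toy corpus and labels are placeholders, not the BioCreative VI Track 4 data.

        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.linear_model import LogisticRegression
        from sklearn.pipeline import make_pipeline

        docs = [
            "Mutation R273H disrupts the p53-MDM2 interaction in tumour cells.",
            "A survey of weather patterns across coastal regions.",
        ]
        labels = [1, 0]  # 1 = relevant to mutation-altered protein-protein interactions

        model = make_pipeline(
            TfidfVectorizer(stop_words="english", ngram_range=(1, 2)),  # preprocessing + features
            LogisticRegression(),                                       # traditional ML classifier
        )
        model.fit(docs, labels)
        print(model.predict(["BRCA1 variants alter binding to BARD1."]))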

    Hardware Considerations for Signal Processing Systems: A Step Toward the Unconventional.

    As we progress into the future, signal processing algorithms are becoming more computationally intensive and power hungry, while the demand for mobile products and low-power devices continues to grow. An integrated ASIC solution is one of the primary ways chip developers can improve performance and add functionality while keeping the power budget low. This work discusses ASIC hardware for both conventional and unconventional signal processing systems, and how integration, error resilience, emerging devices, and new algorithms can be leveraged by signal processing systems to further improve performance and enable new applications. Specifically, this work presents three case studies: 1) a conventional and highly parallel mixed-signal cross-correlator ASIC for a weather satellite performing real-time synthetic aperture imaging, 2) an unconventional native stochastic computing architecture enabled by memristors, and 3) two unconventional sparse neural network ASICs for feature extraction and object classification. As improvements from technology scaling alone slow down, and the demand for energy-efficient mobile electronics increases, such optimization techniques at the device, circuit, and system level will become more critical to advancing signal processing capabilities in the future.
    PhD, Electrical Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/116685/1/knagphil_1.pd
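
    For readers unfamiliar with stochastic computing, the toy simulation below shows the core trick the second case study builds on: values in [0, 1] are encoded as the probability of a 1 in a random bitstream, so multiplication reduces to a bitwise AND. This is the textbook scheme, not the memristor-based architecture from the thesis.

        import numpy as np

        rng = np.random.default_rng(0)
        N = 100_000                      # bitstream length; accuracy scales like 1/sqrt(N)

        a, b = 0.6, 0.3
        stream_a = rng.random(N) < a     # Bernoulli(a) bitstream encoding a
        stream_b = rng.random(N) < b     # Bernoulli(b) bitstream encoding b

        # A single AND gate multiplies the two encoded values (streams independent).
        product = np.mean(stream_a & stream_b)
        print(f"stochastic: {product:.4f}, exact: {a * b:.4f}")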

    A machine learning approach to the unsupervised segmentation of mitochondria in subcellular electron microscopy data

    Recent advances in cellular and subcellular microscopy have demonstrated their potential for unravelling the mechanisms of various diseases at the molecular level. The biggest challenge in both human- and computer-based visual analysis of micrographs is the variety of nanostructures and mitochondrial morphologies. The state of the art is, however, dominated by supervised manual data annotation, and early attempts to automate the segmentation process were based on supervised machine learning techniques that require large datasets for training. Given a minimal number of training sequences, or none at all, unsupervised machine learning formulations, such as spectral dimensionality reduction, are known to be superior in detecting salient image structures. This thesis presents three major contributions developed around the spectral clustering framework, which is proven to capture perceptual organization features. Firstly, we approach the problem of mitochondria localization. We propose a novel grouping method for the extracted line segments that describes the normal mitochondrial morphology. Experimental findings show that the clusters obtained successfully model the inner mitochondrial membrane folding and can therefore be used as markers for subsequent segmentation approaches. Secondly, we developed an unsupervised mitochondria segmentation framework. This method mimics the ability of human vision to extrapolate salient membrane structures in a micrograph. Furthermore, we designed robust non-parametric similarity models according to the Gestalt laws of visual segregation. Experiments demonstrate that such models automatically adapt to the statistical structure of the biological domain and return optimal performance in pixel classification tasks under a wide variety of distributional assumptions. The last major contribution addresses the computational complexity of spectral clustering. Here, we introduce a new anticorrelation-based spectral clustering formulation with the objective of improving both the speed and the quality of segmentation. The experimental findings show the applicability of our dimensionality reduction algorithm to very large-scale problems, as well as to asymmetric, dense and non-Euclidean datasets.
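
    A minimal sketch of spectral clustering for pixel grouping, in the spirit of the framework described above and assuming scikit-learn. A synthetic blob stands in for an electron micrograph, and the exponential similarity on intensity gradients is a simple stand-in for the thesis's Gestalt-based similarity models.

        import numpy as np
        from sklearn.feature_extraction.image import img_to_graph
        from sklearn.cluster import spectral_clustering

        # Synthetic "micrograph": one bright structure on a noisy background.
        x, y = np.indices((60, 60))
        img = ((x - 30) ** 2 + (y - 30) ** 2 < 15 ** 2).astype(float)
        img += 0.2 * np.random.default_rng(0).standard_normal(img.shape)

        # Build a pixel affinity graph; similarity decays with the intensity gradient.
        graph = img_to_graph(img)
        graph.data = np.exp(-graph.data / graph.data.std())

        labels = spectral_clustering(graph, n_clusters=2, eigen_solver="arpack",
                                     random_state=0)
        segmentation = labels.reshape(img.shape)   # foreground/background partition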

    Image Classification of Marine-Terminating Outlet Glaciers using Deep Learning Methods

    A wealth of research has focused on elucidating the key controls on mass loss from the Greenland and Antarctic ice sheets in response to climate forcing, specifically in relation to the drivers of marine-terminating outlet glacier change. Despite the burgeoning availability of medium-resolution satellite data, the manual methods traditionally used to monitor change of marine-terminating outlet glaciers from satellite imagery are time-consuming and can be subjective, especially where a mélange of icebergs and sea ice exists at the terminus. To address this, recent advances in deep learning applied to image processing have created a new frontier in the field of automated delineation of glacier termini. At this stage, however, there remains a paucity of research on the use of deep learning for pixel-level semantic classification of outlet glacier environments. This project develops and tests a two-phase deep learning approach based on a well-established convolutional neural network (CNN) called VGG16 for the automated classification of Sentinel-2 satellite images. The novel workflow, termed CNN-Supervised Classification (CSC), was originally developed for fluvial settings but is adapted here to produce multi-class outputs for test imagery of glacial environments containing marine-terminating outlet glaciers in eastern Greenland. Results show mean F1 scores of up to 95% for in-sample test imagery and 93% for out-of-sample test imagery, with significant improvements over traditional pixel-based methods such as band ratio techniques. This demonstrates the robustness of the deep learning workflow for automated classification despite the complex characteristics of the imagery. Future research could focus on integrating deep learning classification workflows with platforms such as Google Earth Engine (GEE) to classify imagery more efficiently and produce datasets for a range of glacial applications without the need for substantial prior experience in coding or deep learning.
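
    A hedged sketch of the kind of transfer learning a VGG16-based classifier builds on, assuming Keras: an ImageNet-pre-trained backbone with a new multi-class head. The class count, input size, and head are illustrative assumptions, not the published CSC workflow.

        import tensorflow as tf
        from tensorflow.keras.applications import VGG16

        NUM_CLASSES = 5  # e.g. glacier ice, mélange, open water, bedrock, snow (assumed)

        base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
        base.trainable = False  # keep the ImageNet features, train only the new head

        model = tf.keras.Sequential([
            base,
            tf.keras.layers.GlobalAveragePooling2D(),
            tf.keras.layers.Dense(128, activation="relu"),
            tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
        ])
        model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                      metrics=["accuracy"])
        # model.fit(tiles, labels, ...)  # tiles cropped from Sentinel-2 scenes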

    Data driven approaches for investigating molecular heterogeneity of the brain

    It has been proposed that one of the clearest organizing principles for most sensory systems is the existence of parallel subcircuits and processing streams that form orderly and systematic mappings from stimulus space to neurons. Although the spatial heterogeneity of the early olfactory circuitry has long been recognized, we know comparatively little about the circuits that propagate sensory signals downstream. Investigating the potential modularity of the bulb's intrinsic circuits is a difficult task, as the termination patterns of converging projections, like those of the bulb's inputs, cannot feasibly be mapped. Thus, if such circuit motifs exist, their detection essentially relies on identifying differential gene expression, or "molecular signatures," that may demarcate functional subregions. With the arrival of comprehensive (whole-genome, cellular-resolution) datasets in biology and neuroscience, it is now possible to carry out large-scale investigations, and in particular to use the densely catalogued whole-genome expression maps of the Allen Brain Atlas for systematic investigations of the molecular topography of the olfactory bulb's intrinsic circuits. To address the challenges associated with high-throughput, high-dimensional datasets, a deep learning approach will form the backbone of our informatic pipeline. In the proposed work, we test the hypothesis that the bulb's intrinsic circuits are parceled into distinct, parallel modules that can be defined by genome-wide patterns of expression. In pursuit of this aim, our deep learning framework will facilitate the group registration of the mitral cell layers of ~50,000 in-situ olfactory bulb circuits.
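
    As a rough illustration of "modules defined by genome-wide patterns of expression," the sketch below clusters spatial locations by their expression profiles. Random data stands in for Allen Brain Atlas maps, and PCA plus k-means are simple stand-ins for the proposed deep learning pipeline.

        import numpy as np
        from sklearn.cluster import KMeans
        from sklearn.decomposition import PCA

        rng = np.random.default_rng(0)
        n_voxels, n_genes = 5000, 2000
        expression = rng.random((n_voxels, n_genes))   # voxel x gene expression matrix

        # Reduce dimensionality, then look for parallel modules as expression clusters.
        embedded = PCA(n_components=50).fit_transform(expression)
        modules = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(embedded)
        print(np.bincount(modules))                    # voxels assigned to each module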

    Automatic assessment of honey bee cells using deep learning

    Temporal assessment of honey bee colony strength is required for different applications in many research projects, and often involves counting the number of comb cells with brood and food reserves multiple times a year. There are thousands of cells in each comb, which makes manual counting a time-consuming, tedious and thereby error-prone task. The automation of this task using modern image processing techniques therefore represents a major advance. Herein, we developed software capable of (i) detecting each cell in comb images, (ii) classifying its content and (iii) displaying the results to the researcher in a simple way. The cells' contents typically display a high variation of patterns, which makes their classification by software a challenging endeavour. To address this challenge, we used Deep Neural Networks (DNNs). DNNs are known for achieving the state of the art in many fields of study, including image classification, because they can learn by themselves the features that best describe the content being classified. Our DNN model was trained with over 70,000 manually labelled cell images separated into seven classes. Our contribution is end-to-end software capable of automatic background removal, cell detection, and classification of cell content from an input comb image. With this software, colony assessment achieves an average accuracy of 94% across the seven classes in our dataset, representing substantial progress over the approximation methods (e.g. Lieberfeld) currently used by honey bee researchers, and over previous machine learning techniques that used handcrafted features like colour and texture.
    This research was conducted in the framework of the project BEEHOPE, funded through the 2013-2014 BiodivERsA/FACCE-JPI Joint call for research proposals, with the national funders FCT (Portugal), CNRS (France), and MEC (Spain).
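
    A simplified sketch of the detect-then-classify pipeline described above, assuming OpenCV and PyTorch. Hough circle detection is a classical stand-in for the thesis's DNN-based detection stage, and the small CNN, file path, and parameter values are illustrative assumptions rather than the actual model.

        import cv2
        import numpy as np
        import torch
        import torch.nn as nn

        NUM_CLASSES = 7  # seven cell-content classes, per the dataset described above

        cnn = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
            nn.Flatten(), nn.Linear(32, NUM_CLASSES),
        ).eval()

        img = cv2.imread("comb.jpg")                   # hypothetical comb image path
        gray = cv2.medianBlur(cv2.cvtColor(img, cv2.COLOR_BGR2GRAY), 5)
        circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, dp=1.2, minDist=18,
                                   param1=80, param2=30, minRadius=8, maxRadius=16)

        counts = np.zeros(NUM_CLASSES, dtype=int)      # cells per class for this comb
        if circles is not None:
            for x, y, r in np.round(circles[0]).astype(int):
                crop = cv2.resize(img[max(y - r, 0):y + r, max(x - r, 0):x + r], (32, 32))
                tensor = torch.from_numpy(crop).permute(2, 0, 1).float().unsqueeze(0) / 255
                with torch.no_grad():
                    counts[cnn(tensor).argmax(dim=1).item()] += 1
        print(counts)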

    A Method for Image Classification Using Low-Precision Analog Computing Arrays

    Computing with analog microelectronics can offer several advantages over standard digital technology, most notably low space and power consumption and massive parallelization. On the other hand, analog computation lacks the exactness of digital calculation, due to inevitable device variations introduced during chip production, but also due to electrical noise in the analog signals. Artificial neural networks are well suited to parallel analog implementations: first, because of their inherent parallelism, and second, because they can adapt to device imperfections by training. This thesis evaluates the feasibility of implementing a convolutional neural network for image classification on a massively parallel low-power hardware system. A particular mixed analog-digital hardware model is considered, featuring simple threshold neurons. Appropriate gradient-free training algorithms, combining self-organization and supervised learning, are developed and tested on two benchmark problems (MNIST hand-written digits and traffic signs). Software simulations evaluate the methods under various defined computation faults. A model-free closed-loop technique is shown to compensate for rather serious computation errors without the need for explicit error quantification. Last but not least, the developed networks and training techniques are verified on a real prototype chip.
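
    The toy simulation below illustrates the core idea of gradient-free training under analog imperfections: a threshold-neuron classifier whose weights see additive noise (standing in for device variation), trained by keeping random weight perturbations that help. It mirrors the closed-loop idea only loosely and is not the thesis's actual algorithm.

        import numpy as np

        rng = np.random.default_rng(0)
        X = rng.standard_normal((200, 8))            # toy inputs
        w_true = rng.standard_normal(8)
        y = (X @ w_true > 0).astype(int)             # toy binary labels

        def accuracy(w):
            noisy_w = w + 0.05 * rng.standard_normal(w.shape)   # analog device noise
            return np.mean((X @ noisy_w > 0).astype(int) == y)  # threshold neuron

        w = np.zeros(8)
        best = accuracy(w)
        for _ in range(2000):                        # random-perturbation hill climbing
            candidate = w + 0.1 * rng.standard_normal(8)
            score = accuracy(candidate)
            if score >= best:                        # keep perturbations that help
                w, best = candidate, score
        print(f"training accuracy under noise: {best:.2f}")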