71 research outputs found

    A Survey on Artificial Intelligence Techniques for Biomedical Image Analysis in Skeleton-Based Forensic Human Identification

    Get PDF
    This paper presents the first survey on the application of AI techniques to the analysis of biomedical images for forensic human identification purposes. Human identification is of great relevance in today’s society and, in particular, in medico-legal contexts. As a consequence, any technological advance introduced in this field can help meet the increasing need for accurate and robust tools to establish and verify human identity. We first describe the importance and applicability of forensic anthropology in many identification scenarios. We then present the main trends in the application of computer vision, machine learning and soft computing techniques to the estimation of the biological profile, identification through comparative radiography and craniofacial superimposition, trauma and pathology analysis, and facial reconstruction. The potentialities and limitations of the employed approaches are described, and we conclude with a discussion of methodological issues and future research.
    Funding: Spanish Ministry of Science, Innovation and Universities / European Union (EU) PGC2018-101216-B-I00; Regional Government of Andalusia under grant EXAISFI P18-FR-4262; Instituto de Salud Carlos III / European Union (EU) DTS18/00136; European Commission H2020-MSCA-IF-2016 through the Skeleton-ID Marie Curie Individual Fellowship 746592; Spanish Ministry of Science, Innovation and Universities-CDTI, Neotec program 2019 EXP-00122609/SNEO-20191236; European Union (EU) / Xunta de Galicia ED431G 2019/01; European Union (EU) RTI2018-095894-B-I0

    Deep Learning for Mobile Multimedia: A Survey

    Get PDF
    Deep Learning (DL) has become a crucial technology for multimedia computing. It offers a powerful instrument to automatically produce high-level abstractions of complex multimedia data, which can be exploited in a number of applications, including object detection and recognition, speech-to-text, media retrieval, multimodal data analysis, and so on. The availability of affordable large-scale parallel processing architectures, and the sharing of effective open-source code implementing the basic learning algorithms, have caused a rapid diffusion of DL methodologies, bringing a number of new technologies and applications that outperform, in most cases, traditional machine learning technologies. In recent years, the possibility of implementing DL technologies on mobile devices has attracted significant attention. Thanks to this technology, portable devices may become smart objects capable of learning and acting. The path toward these exciting future scenarios, however, entails a number of important research challenges. DL architectures and algorithms are not easily adapted to the storage and computation resources of a mobile device. Therefore, there is a need for new generations of mobile processors and chipsets, small-footprint learning and inference algorithms, new models of collaborative and distributed processing, and a number of other fundamental building blocks. This survey reports the state of the art in this exciting research area, looking back at the evolution of neural networks and arriving at the most recent results in terms of methodologies, technologies, and applications for mobile environments.
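    As a hedged illustration of the small-footprint inference techniques the survey points to, the sketch below applies post-training dynamic quantization to a toy model with PyTorch; the model, layer choice, and input data are assumptions for illustration and are not taken from the paper.

```python
# Minimal sketch: post-training dynamic quantization of a small model,
# one common way to reduce model size and inference cost on mobile/edge
# devices. The model and layer choices are illustrative only.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(128, 256), nn.ReLU(),
    nn.Linear(256, 10),
)
model.eval()

# Replace the Linear layers' float32 weights with int8 weights;
# activations are quantized dynamically at inference time.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 128)
with torch.no_grad():
    print(quantized(x).shape)  # torch.Size([1, 10])
```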

    Learning image features with convolutional networks under supervised-data constraints (Aprendendo características de imagens por redes convolucionais sob restrição de dados supervisionados)

    Get PDF
    Advisor: Alexandre Xavier Falcão. Master's dissertation, Universidade Estadual de Campinas, Instituto de Computação.
    Abstract: Image analysis has been widely employed in many areas of the Sciences and Engineering to extract and interpret high-level information from images, with applications ranging from simple bar-code reading to the automated diagnosis of diseases. However, state-of-the-art solutions based on deep neural networks usually require a training set with a large number of annotated (labeled) examples, implying considerable human effort in identifying, isolating, and labeling samples from large image databases. The problem is aggravated when annotation requires specialists in the application domain, such as Medicine and Agriculture, which constitutes a crucial drawback in those applications. In this context, Convolutional Networks (ConvNets) are among the most successful approaches for image feature extraction, and their combination with a Multi-Layer Perceptron (MLP) or a Support Vector Machine (SVM) allows effective sample classification. Another important problem with these techniques is the high dimensionality of the resulting feature space, which hampers the analysis of the sample distribution by commonly used distance-based methods, such as clustering and multidimensional data visualization. Considering both problems, we assess the main strategies for ConvNet design, namely Architecture Learning (AL), Filter Learning (FL), and Transfer Learning (TL), with respect to their capability of learning from a limited set of labeled examples, and we evaluate the impact of feature-space reduction on distance-based classification and visualization. To confirm the effectiveness of feature learning, we analyze the improvement of the classifier as the number of supervised samples increases during active learning. Data augmentation is also evaluated as a potential strategy to cope with the scarcity of labeled examples. Finally, we present the main results of the work for a real application, the diagnosis of intestinal parasites, in comparison with state-of-the-art image descriptors. We conclude that TL is the best strategy under supervised-data constraints whenever a previously trained network suited to the problem is available; otherwise, AL is the second-best alternative. We also observe the effectiveness of Linear Discriminant Analysis (LDA) in considerably reducing the feature space created by ConvNets, allowing experts to better understand the feature-learning and active-learning processes through multidimensional data visualization. This important result suggests that an interplay between feature learning, active learning, and expert intervention can considerably benefit machine learning.
    Master's degree in Computer Science. Funding: CNPq, CAPES.
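    To make the transfer-learning strategy concrete, the following is a minimal sketch, assuming a pretrained ResNet-18 as a fixed feature extractor, LDA for feature-space reduction, and an SVM classifier; the dataset path, backbone, and hyperparameters are illustrative assumptions rather than the thesis' exact configuration.

```python
# Minimal sketch of a transfer-learning + LDA pipeline: reuse a pretrained
# ConvNet as a fixed feature extractor, reduce the feature space with LDA,
# and classify with an SVM. Paths and hyperparameters are hypothetical.
import torch
import torch.nn as nn
from torchvision import models, transforms, datasets
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.svm import SVC

# Pretrained backbone with the classification head removed.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = nn.Identity()
backbone.eval()

preprocess = transforms.Compose([
    transforms.Resize(224),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])

# Hypothetical small labeled set organized as ImageFolder (one folder per class).
dataset = datasets.ImageFolder("data/labeled_subset", transform=preprocess)
loader = torch.utils.data.DataLoader(dataset, batch_size=32, shuffle=False)

features, labels = [], []
with torch.no_grad():
    for x, y in loader:
        features.append(backbone(x))   # 512-D feature vector per image
        labels.append(y)
features = torch.cat(features).numpy()
labels = torch.cat(labels).numpy()

# LDA projects the 512-D ConvNet features to at most (n_classes - 1) dims,
# which eases distance-based analysis and visualization.
lda = LinearDiscriminantAnalysis()
reduced = lda.fit_transform(features, labels)

clf = SVC(kernel="linear").fit(reduced, labels)
print("training accuracy:", clf.score(reduced, labels))
```

    In this setting only the LDA projection and the SVM are learned from the small supervised set, which mirrors the TL scenario discussed above.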

    Gesture passwords: concepts, methods and challenges

    Full text link
    Biometrics are a convenient alternative to traditional forms of access control such as passwords and pass-cards since they rely solely on user-specific traits. Unlike alphanumeric passwords, biometrics cannot be given or told to another person, and unlike pass-cards, they are always “on hand.” Perhaps the best-known biometrics with these properties are face, speech, iris, and gait. This dissertation proposes a new biometric modality: gestures. A gesture is a short body motion that contains static anatomical information and changing behavioral (dynamic) information. This work considers both full-body gestures, such as a large wave of the arms, and hand gestures, such as a subtle curl of the fingers and palm. For access control, a specific gesture can be selected as a “password” and used for identification and authentication of a user. If this particular motion were somehow compromised, a user could readily select a new motion as a “password,” effectively changing and renewing the behavioral aspect of the biometric. This thesis describes a novel framework for acquiring, representing, and evaluating gesture passwords for the purpose of general access control. The framework uses depth sensors, such as the Kinect, to record gesture information from which depth maps or pose features are estimated. First, various distance measures, such as the log-Euclidean distance between feature covariance matrices and distances based on feature sequence alignment via dynamic time warping, are used to compare two gestures and to train a classifier that either authenticates or identifies a user. In authentication, this framework yields an equal error rate on the order of 1-2% for body and hand gestures in non-adversarial scenarios. Next, through a novel decomposition of gestures into posture, build, and dynamic components, the relative importance of each component is studied. The dynamic portion of a gesture is shown to have the largest impact on biometric performance, with its removal causing a significant increase in error. In addition, the effects of two types of threats are investigated: one due to self-induced degradations (personal effects and the passage of time) and the other due to spoof attacks. For body gestures, both spoof attacks (with only the dynamic component) and self-induced degradations increase the equal error rate as expected. Further, the benefits of adding sensor viewpoints to this modality are empirically evaluated. Finally, a novel framework that leverages deep convolutional neural networks to learn a user-specific “style” representation from a set of known gestures is proposed and compared to a similar representation for gesture recognition. This deep convolutional neural network yields significantly improved performance over prior methods. A byproduct of this work is the creation and release of multiple publicly available, user-centric (as opposed to gesture-centric) datasets based on both body and hand gestures.
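    A minimal sketch of one of the distance measures mentioned above, the log-Euclidean distance between feature covariance matrices, follows; the feature dimensionality, the toy gesture sequences, and the verification threshold are illustrative assumptions.

```python
# Minimal sketch, assuming each gesture is a sequence of per-frame pose
# features: summarize the sequence by its feature covariance matrix and
# compare two gestures with the log-Euclidean distance.
import numpy as np

def covariance_descriptor(seq):
    """seq: (n_frames, n_features) pose features for one gesture."""
    cov = np.cov(seq, rowvar=False)
    # Small ridge keeps the matrix positive definite so the log is well behaved.
    return cov + 1e-6 * np.eye(cov.shape[0])

def spd_log(mat):
    """Matrix logarithm of a symmetric positive-definite matrix."""
    w, v = np.linalg.eigh(mat)
    return (v * np.log(w)) @ v.T

def log_euclidean_distance(c1, c2):
    """Frobenius norm between the matrix logs of two SPD matrices."""
    return float(np.linalg.norm(spd_log(c1) - spd_log(c2), ord="fro"))

# Toy example: two random gesture sequences with 20 pose features per frame.
rng = np.random.default_rng(0)
enrolled = covariance_descriptor(rng.normal(size=(120, 20)))
probe = covariance_descriptor(rng.normal(size=(110, 20)))

dist = log_euclidean_distance(enrolled, probe)
threshold = 5.0  # hypothetical operating point chosen on validation data
print("distance:", dist, "accept" if dist < threshold else "reject")
```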

    Weakly and Partially Supervised Learning Frameworks for Anomaly Detection

    Get PDF
    The automatic detection of abnormal events in surveillance footage is still a concern of the research community. Since protection is the primary purpose of installing video surveillance systems, maintaining public safety through monitoring, and responding rapidly enough to serve that purpose, is a significant challenge even for humans. Human capacity has not kept pace with the increased use of surveillance systems, which require extensive supervision to identify unusual events that could put a person or a company at risk; moreover, a substantial amount of labor and time is wasted because anomalous events are extremely rare compared to normal ones. Consequently, automatic detection of abnormal events has become crucial in video surveillance. Despite being the subject of numerous research works published in the last decade, state-of-the-art performance is still unsatisfactory and far below what is required for an effective deployment of this kind of technology in fully unconstrained scenarios. The automatic detection of abnormal events remains a challenge for many reasons: environmental diversity, the resemblance of movements across different actions, crowded scenes, and the difficulty, or impossibility, of accounting for every standard pattern that defines a normal action. Beyond these difficulties, the substantive problem lies in obtaining sufficient amounts of labeled abnormal samples, which is fundamental for computer vision algorithms. More importantly, obtaining an extensive set of different videos that satisfy the previously mentioned conditions is not a simple task. In addition to being effort- and time-consuming, the boundary between normal and abnormal actions is usually unclear. Hence, the main objective of this work is to provide several solutions to the problems mentioned above, by analyzing previous state-of-the-art methods and presenting an extensive overview to clarify the concepts employed in capturing normal and abnormal patterns. By exploring different strategies, we also develop new approaches that consistently advance state-of-the-art performance. Moreover, we announce the availability of a new large-scale, first-of-its-kind dataset, fully annotated at the frame level, targeting a specific anomaly detection event with a wide diversity of fighting scenarios, which can be freely used by the research community. With the purpose of requiring minimal supervision, two different proposals are described. The first method employs self-supervised learning to avoid the laborious task of annotation: the training set is labeled autonomously using an iterative learning framework composed of two independent experts that feed data to each other through a Bayesian framework. The second proposal explores a new method to learn an anomaly ranking model in the multiple instance learning paradigm by leveraging weakly labeled videos, where training labels are provided at the video level. The experiments were conducted on several well-known datasets, and our solutions solidly outperform the state of the art. Additionally, as a proof-of-concept system, we also present the results of real-world simulations collected in different environments to field-test our learned models.
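    The second proposal's multiple-instance ranking idea can be sketched generically as below; the scoring network, margin, and regularization terms are common MIL-ranking assumptions and may differ from the thesis' exact formulation.

```python
# Minimal sketch of a multiple-instance ranking objective for weakly labelled
# videos: each video is a "bag" of segment scores, labels exist only at the
# video level, and the highest-scoring segment of an anomalous video is pushed
# above the highest-scoring segment of a normal video.
import torch
import torch.nn as nn

class SegmentScorer(nn.Module):
    """Maps a per-segment feature vector to an anomaly score in [0, 1]."""
    def __init__(self, feat_dim: int = 1024):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, 512), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(512, 1), nn.Sigmoid(),
        )

    def forward(self, segments):               # (n_segments, feat_dim)
        return self.net(segments).squeeze(-1)  # (n_segments,)

def mil_ranking_loss(scores_anom, scores_norm, margin: float = 1.0):
    # Hinge on the max-scoring segments of the anomalous and normal bags,
    # plus mild temporal smoothness and sparsity on the anomalous bag.
    hinge = torch.relu(margin - scores_anom.max() + scores_norm.max())
    smooth = ((scores_anom[1:] - scores_anom[:-1]) ** 2).sum()
    sparse = scores_anom.sum()
    return hinge + 8e-5 * smooth + 8e-5 * sparse

# Toy usage with random segment features standing in for video features.
scorer = SegmentScorer()
anom_bag = torch.randn(32, 1024)   # segments from a video labelled "anomalous"
norm_bag = torch.randn(32, 1024)   # segments from a video labelled "normal"
loss = mil_ranking_loss(scorer(anom_bag), scorer(norm_bag))
loss.backward()
print(float(loss))
```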

    Prediction and detection of abnormal usage of an elevator

    Get PDF
    In this thesis we used machine learning to detect anomalous use of elevators by measuring their behavior at any specific time. We examined the data for unusual patterns in comparison with observed samples from the elevator's history. We investigated forecasting future use of the elevator in order to define abnormal elevator behavior, and addressed the problem with two approaches. First, we used a Long Short-Term Memory network to forecast future usage, and improved the result by extracting features and removing the noisy part of the data. We then compared actual usage with our prediction and used a 99.7% confidence interval (the three-sigma rule) to flag anomalies. The second approach used the Local Outlier Factor to measure how far each week's elevator usage lies from the other weeks. We then took the intersection of the two methods and performed a set of post-processing actions to decrease the rate of false positives and to remove anomalies that were not sustained.
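    A minimal sketch of the two detectors and their intersection, assuming synthetic weekly usage data and a simple per-hour average in place of the LSTM forecaster, is shown below.

```python
# Minimal sketch: (1) flag points whose forecast residual falls outside the
# 99.7% (three-sigma) band, and (2) flag weeks that the Local Outlier Factor
# marks as outliers; report the intersection of the two detectors.
# The synthetic data and the stand-in "forecast" are illustrative only.
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(1)
weeks, hours = 20, 24 * 7
usage = rng.poisson(lam=5.0, size=(weeks, hours)).astype(float)
usage[13] *= 3.0  # inject one anomalous week

# Detector 1: three-sigma rule on residuals against a simple per-hour forecast
# (the per-hour mean over all weeks stands in for the LSTM prediction here).
forecast = usage.mean(axis=0)
residuals = usage - forecast
sigma = residuals.std()
three_sigma_weeks = set(np.where((np.abs(residuals) > 3 * sigma).any(axis=1))[0])

# Detector 2: Local Outlier Factor over whole-week usage profiles.
lof = LocalOutlierFactor(n_neighbors=5)
lof_flags = lof.fit_predict(usage)             # -1 marks an outlying week
lof_weeks = set(np.where(lof_flags == -1)[0])

# Intersection of the two detectors, as in the post-processing step above.
print("anomalous weeks:", sorted(three_sigma_weeks & lof_weeks))
```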

    Fingerprint recognition system

    Get PDF
    Undergraduate thesis submitted to the Department of Computer Science and Information Systems, Ashesi University, in partial fulfillment of the Bachelor of Science degree in Management Information Systems, May 2022.
    Fingerprint recognition is one of the most popular biometric techniques in personal identification. The widespread use of fingerprint recognition as a biometric stems from the fact that each fingerprint's pattern of ridges and valleys is unique and does not vary with time or age. While there are several algorithms and methods for fingerprint recognition systems, developing a robust fingerprint recognition system remains a significant research area. One major challenge in designing such a system is its ability to perform well on both full and partial fingerprint images. Most fingerprint recognition systems developed so far use minutiae-based algorithms, which tend to perform well on full fingerprints but poorly under partial occlusion. Under partial occlusion, the minutiae, which are the core points of the fingerprint, get completely distorted, and this distortion makes minutiae-based algorithms perform poorly. Therefore, this study proposes a novel non-minutiae-based algorithm that adapts the Fisherface and Eigenface methods from facial recognition. The proposed algorithm is insensitive to partial occlusion. The Eigenface and Fisherface methods were tested on the FVC 2002 datasets and yielded accuracies of 86.67% and 90% respectively. These results indicate that it is possible to recognize fingerprint images using non-minutiae-based algorithms from a different domain.
    Ashesi University
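    The Eigenface/Fisherface adaptation can be sketched as follows, assuming flattened fingerprint images and scikit-learn's PCA and LDA; the random data stands in for the FVC 2002 images and the component counts are illustrative.

```python
# Minimal sketch of the Eigenface/Fisherface idea transferred to fingerprints:
# flatten each fingerprint image, learn a PCA ("eigenfinger") subspace, and
# optionally follow it with LDA (the Fisherface variant) before a
# nearest-neighbour match.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
n_subjects, imgs_per_subject, img_size = 10, 8, 64 * 64
X = rng.random((n_subjects * imgs_per_subject, img_size))   # flattened images
y = np.repeat(np.arange(n_subjects), imgs_per_subject)

# Eigenface-style pipeline: PCA subspace + 1-NN matching.
eigen = make_pipeline(PCA(n_components=40), KNeighborsClassifier(n_neighbors=1))
eigen.fit(X, y)

# Fisherface-style pipeline: PCA first (to avoid a singular scatter matrix),
# then LDA to at most (n_subjects - 1) discriminant directions.
fisher = make_pipeline(PCA(n_components=40),
                       LinearDiscriminantAnalysis(),
                       KNeighborsClassifier(n_neighbors=1))
fisher.fit(X, y)

print(eigen.predict(X[:1]), fisher.predict(X[:1]))
```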

    Design of Energy-Efficient Artificial Neural Networks (에너지 효율적 인공신경망 설계)

    Get PDF
    Doctoral dissertation, Seoul National University Graduate School, College of Engineering, Department of Electrical and Computer Engineering, February 2019. Advisor: 최기영.
    Recently, deep learning has shown astounding performance on tasks such as image classification, speech recognition, and reinforcement learning. Some state-of-the-art deep neural networks have already surpassed human ability. However, neural networks involve a tremendous number of high-precision computations and frequent off-chip memory accesses for millions of parameters. This incurs problems of large chip area and exploding energy consumption, which hinder neural networks from being exploited in embedded systems. To cope with these problems, this dissertation proposes techniques for designing energy-efficient neural networks. The first part addresses the design of spiking neural networks with weighted spikes, which have the advantages of shorter inference latency and smaller energy consumption compared to conventional spiking neural networks. Spiking neural networks are regarded as one of the promising alternatives for overcoming the high energy cost of artificial neural networks, and many studies have shown that a deep convolutional neural network can be converted into a spiking neural network with near-zero accuracy loss. However, the energy advantage of spiking neural networks comes at the cost of long classification latency due to the use of Poisson-distributed spike trains (rate coding), especially in deep networks. We propose to use weighted spikes, which greatly reduce the latency by assigning a different weight to a spike depending on the time phase to which it belongs. Experimental results on MNIST, SVHN, CIFAR-10, and CIFAR-100 show that the proposed spiking neural networks with weighted spikes achieve a significant reduction in classification latency and number of spikes, leading to faster and more energy-efficient operation than conventional rate-coded spiking neural networks. We also show that one of the state-of-the-art networks, the deep residual network, can be converted into a spiking neural network without accuracy loss.
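    A minimal sketch of the weighted-spike (phase) coding described above, under the simplifying assumption that an activation in [0, 1) is transmitted as a binary expansion over a fixed number of phases, is shown below; it is an illustration, not the dissertation's exact neuron model.

```python
# Minimal sketch of weighted-spike (phase) coding: within a period of K phases,
# a spike emitted at phase k carries weight 2^-(k+1), so an activation in
# [0, 1) can be transmitted in K time steps instead of a long rate-coded train.
import numpy as np

def encode_weighted_spikes(value: float, num_phases: int = 8) -> np.ndarray:
    """Greedy binary expansion: spike at phase k if its weight still fits."""
    spikes = np.zeros(num_phases, dtype=int)
    residual = value
    for k in range(num_phases):
        w = 2.0 ** -(k + 1)
        if residual >= w:
            spikes[k] = 1
            residual -= w
    return spikes

def decode_weighted_spikes(spikes: np.ndarray) -> float:
    weights = 2.0 ** -(np.arange(len(spikes)) + 1)
    return float(spikes @ weights)

activation = 0.6875
spikes = encode_weighted_spikes(activation)
print(spikes, decode_weighted_spikes(spikes))   # [1 0 1 1 0 0 0 0] 0.6875
```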
    The second part of this dissertation focuses on the design of highly energy-efficient analog neural networks in the presence of variations. Analog hardware accelerators for deep neural networks have taken center stage owing to their high parallelism and energy efficiency. However, a critical weakness of analog hardware systems is their vulnerability to noise. One of the biggest noise sources is process variation, which is a major obstacle to using analog circuits since it shifts various parameters of the circuits away from their correct operating points, causing severe performance degradation or even malfunction. To achieve high energy efficiency with analog neural networks, we propose a resistive random access memory (ReRAM) based analog implementation of binarized neural networks (BNNs) with a novel variation compensation technique through activation matching (VCAM); a small illustrative sketch of this idea follows the contents listing below. The proposed architecture consists of 1-transistor-1-resistor (1T1R) structured ReRAM synaptic arrays and differential-amplifier-based neurons, which leads to high-density integration and energy efficiency. To cope with the vulnerability of analog neurons to process variation, the biases of all neurons are adjusted in the direction that matches the average output activation of ideal, variation-free neurons. This technique effectively restores the classification accuracy degraded by the variation. Experimental results on a 32nm technology show that the proposed architecture achieves a classification accuracy of 98.55% on MNIST and 89.63% on CIFAR-10 in the presence of 50% threshold-voltage variation and 15% resistance variation at the 3-sigma point. It also achieves 970 TOPS/W energy efficiency with an MLP on MNIST.
    Contents:
    1 Introduction
      1.1 Deep Neural Networks with Weighted Spikes
      1.2 VCAM: Variation Compensation through Activation Matching for Analog Binarized Neural Networks
    2 Background
      2.1 Spiking neural network
      2.2 Spiking neuron model
      2.3 Rate coding in SNNs
      2.4 Binarized neural networks
      2.5 Resistive random access memory
    3 Related Work
      3.1 Training SNNs
      3.2 SNNs with various spike coding schemes
      3.3 BNN implementations
    4 Deep Neural Networks with Weighted Spikes
      4.1 SNN with weighted spikes (weighted spikes; spiking neuron model for weighted spikes; noise spike; approximation of the ReLU activation; ANN-to-SNN conversion)
      4.2 Optimization techniques (skipping initial input currents in the output layer; the number of phases in a period; accuracy-energy trade-off by early decision; consideration on hardware implementation)
      4.3 Experimental setup
      4.4 Results (comparison between SNN-RC and SNN-WS; trade-off by early decision; comparison with other algorithms)
      4.5 Summary
    5 VCAM: Variation Compensation through Activation Matching for Analog Binarized Neural Networks
      5.1 Modification of Binarized Neural Network (binarized neural network; use of 0 and 1 activations; removal of batch normalization layer)
      5.2 Hardware Architecture (ReRAM synaptic array; neuron circuit; issues with neuron circuit)
      5.3 Variation Compensation (variation modeling; impact of VT variation; variation compensation techniques)
      5.4 Experimental Results (experimental setup; accuracy of the modified BNN algorithm; variation compensation; performance comparison)
      5.5 Summary
    6 Conclusion
    Doctoral degree
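    A minimal sketch of variation compensation through activation matching, assuming a simplified binary neuron and a random threshold offset as the variation model, is shown below.

```python
# Minimal sketch of compensation through activation matching: each neuron's
# bias (here, its threshold) is shifted until its average output activation
# on a calibration set matches that of the ideal, variation-free neuron.
# The variation model and the binary activation are simplified assumptions.
import numpy as np

rng = np.random.default_rng(0)
n_neurons, n_samples = 64, 2000
pre_activations = rng.normal(size=(n_samples, n_neurons))   # calibration inputs

def binary_activation(x, threshold):
    return (x >= threshold).astype(float)

ideal_threshold = 0.0
ideal_mean = binary_activation(pre_activations, ideal_threshold).mean(axis=0)

# Process variation shifts each neuron's effective threshold.
varied_threshold = ideal_threshold + rng.normal(scale=0.3, size=n_neurons)

# Compensation: pick each neuron's threshold so that its mean activation on
# the calibration data reproduces the ideal firing rate.
compensated = varied_threshold.copy()
for i in range(n_neurons):
    compensated[i] = np.quantile(pre_activations[:, i], 1.0 - ideal_mean[i])

err_before = np.abs(binary_activation(pre_activations, varied_threshold).mean(axis=0) - ideal_mean).mean()
err_after = np.abs(binary_activation(pre_activations, compensated).mean(axis=0) - ideal_mean).mean()
print(f"mean activation error before: {err_before:.3f}, after: {err_after:.3f}")
```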