182 research outputs found

    On-chip memory reduction in CNN hardware design for image super-resolution

    Doctoral dissertation (Ph.D.), Seoul National University Graduate School, College of Engineering, Department of Electrical and Computer Engineering, February 2019. Advisor: 이혁재. Unlike a convolutional neural network (CNN) for image classification, a CNN for single-image super-resolution (SISR) receives a high-resolution image and generates high-resolution feature maps as intermediate results. Hardware that accelerates a SISR CNN is mainly deployed in display devices and therefore has a streaming architecture in which external memory access is impossible; the limited capacity of on-chip memory makes such hardware difficult to implement. Previous work reduces on-chip memory only at the cost of performance degradation or additional compression modules. This dissertation proposes methods for reducing the on-chip memory of SISR CNN hardware without degrading performance. The hardware is based on the very deep network for super-resolution (VDSR) architecture. First, replacing the raster-scan order, in which read and write accesses to the SRAM occur simultaneously, with a partially-vertical order separates the read and write access timings. This allows single-port SRAM to be used instead of the dual-port SRAM of conventional CNN hardware, halving the on-chip memory area. The second method changes the shape of the VDSR filters. The size of the on-chip memory is proportional to the height of the convolution filter, but the VDSR filter already has the smallest symmetric shape, so its height cannot simply be reduced. To solve this problem, a context-preserving 1D filter construction and a context-based vertical filter reduction are proposed, which halve the SRAM size again. Once the CNN hardware architecture is fixed, two CNN training methods are proposed to improve SISR quality, one for natural images and one for text images. Super-resolution generative adversarial networks (SRGAN) use a loss from a discriminator network to make the SISR CNN output realistic natural images, but SRGAN suffers from visual artifacts caused by over-sharpening. This dissertation proposes two methods to eliminate these artifacts. First, a resolution-preserving discriminator network structure prevents the loss of image detail inside the discriminator. Second, a resolution-preserving content loss addresses the loss of image detail caused by the structure of the VGG19 network that produces the content loss. A text image is a synthetic rather than a natural image, and the color combination of its font and background can vary widely; the conventional approach of training on many kinds of images cannot cover every possible combination. Borrowing the de-colorization method used in image compression, the training images are restricted to black fonts on white backgrounds, so that the CNN performs SISR without visual artifacts even on font and background color combinations it has never seen.
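The abstract's key sizing argument is that a streaming convolution must buffer (filter height − 1) rows of the input feature map on chip, so SRAM grows with filter height. A minimal sketch of that arithmetic follows; the Full-HD width, 64-channel count, and the effective height of 2 after the vertical-filter reduction are my own illustrative assumptions, not figures from the thesis:

```python
# Hypothetical illustration of the line-buffer sizing argument: a
# streaming convolution over a W-pixel-wide feature map must keep
# (filter_height - 1) previous rows on chip before it can produce
# an output row.

def line_buffer_words(width, channels, filter_height):
    """On-chip words needed to stream one convolution layer."""
    return width * channels * (filter_height - 1)

W, C = 1920, 64                        # assumed Full-HD width, 64-channel fmap
kh_2d = 3                              # standard 3x3 VDSR filter
kh_1d = 2                              # assumed effective height after the
                                       # vertical-filter reduction

full = line_buffer_words(W, C, kh_2d)      # two buffered rows
reduced = line_buffer_words(W, C, kh_1d)   # one buffered row
print(full, reduced, reduced / full)       # the buffer is halved
```

Under these assumptions the reduction from two buffered rows to one is exactly the factor-of-two SRAM saving the abstract claims.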

    Deep Learning Meets Hyperspectral Image Analysis: A Multidisciplinary Review

    Modern hyperspectral imaging systems produce huge datasets that potentially convey a great abundance of information; such a resource, however, poses many challenges for the analysis and interpretation of these data. Deep learning approaches offer a great variety of opportunities for solving classical imaging tasks and also for approaching new, stimulating problems in the spatial–spectral domain. This is fundamental in the driving sector of Remote Sensing, where hyperspectral technology was born and has largely developed, but it is perhaps even more true in the multitude of current and evolving application sectors that involve these imaging technologies. The present review develops along two fronts: on the one hand, it is aimed at domain professionals who want an updated overview of how hyperspectral acquisition techniques can be combined with deep learning architectures to solve specific tasks in different application fields. On the other hand, it targets machine learning and computer vision experts by giving them a picture of how deep learning technologies are applied to hyperspectral data from a multidisciplinary perspective. The presence of these two viewpoints and the inclusion of application fields other than Remote Sensing are the original contributions of this review, which also highlights some potentialities and critical issues related to the observed development trends.

    Code Generation and Global Optimization Techniques for a Reconfigurable PRAM-NUMA Multicore Architecture


    Implementazione ed ottimizzazione di algoritmi per l'analisi di Biomedical Big Data

    Big Data Analytics poses many challenges to the research community, which must handle numerous computational problems related to the vast amount of data. Interest in biomedical data is increasing, driven by the goal of so-called personalized medicine, in which therapy plans are designed around the specific genotype and phenotype of an individual patient; algorithm optimization plays a key role in this effort. In this work we discuss several topics related to Biomedical Big Data Analytics, with special attention to the numerical issues and algorithmic solutions involved. We introduce a novel feature selection algorithm tailored to omics datasets, demonstrating its efficiency on synthetic and real high-throughput genomic datasets. We tested our algorithm against other state-of-the-art methods, obtaining better or comparable results. We also implemented and optimized different types of deep learning models, testing their efficiency on biomedical image processing tasks. Three novel frameworks for developing deep learning models are discussed and used to describe the numerical improvements proposed on various topics. In the first implementation we optimize two super-resolution models, showing their results on NMR images and proving their efficiency at generalization without retraining. The second optimization involves a state-of-the-art object detection architecture, obtaining a significant speedup in computational performance. In the third application we discuss the femur head segmentation problem on CT images using deep learning algorithms. The last section of this work covers the implementation of a novel biomedical database, obtained by harmonizing multiple data sources, that provides network-like relationships between biomedical entities. Data related to diseases and other biological relations were mined using web-scraping methods, and a novel natural language processing pipeline was designed to maximize the overlap between the different data sources involved in this project.

    Convolutional neural networks for the segmentation of small rodent brain MRI

    Image segmentation is a common step in the analysis of preclinical brain MRI, often performed manually. This is a time-consuming procedure subject to inter- and intra-rater variability. A possible alternative is automated, registration-based segmentation, which suffers from a bias due to the limited capacity of registration to adapt to pathological conditions such as Traumatic Brain Injury (TBI). In this work a novel method is developed for the segmentation of small-rodent brain MRI based on Convolutional Neural Networks (CNNs). The experiments presented here show how CNNs provide a fast, robust and accurate alternative to both manual and registration-based methods. This is demonstrated by accurately segmenting three large datasets of MRI scans of healthy and Huntington-disease model mice, as well as TBI rats. MU-Net and MU-Net-R, the CNNs presented here, achieve human-level accuracy while eliminating intra-rater variability, alleviating the biases of registration-based segmentation, with an inference time of less than one second per scan. Using these segmentation masks I designed a geometric construction to extract 39 parameters describing the position and orientation of the hippocampus, and later used them to classify epileptic vs. non-epileptic rats with a balanced accuracy of 0.80, five months after TBI. This clinically transferable geometric approach detects subjects at high risk of post-traumatic epilepsy, paving the way towards subject stratification for antiepileptogenesis studies.
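The abstract reports a balanced accuracy of 0.80, the standard metric for imbalanced binary classification: the mean of sensitivity and specificity. A minimal sketch of the metric itself; the label vectors below are invented for illustration and are not the thesis's data:

```python
# Balanced accuracy: the mean of sensitivity (true-positive rate) and
# specificity (true-negative rate), so a trivial majority-class
# classifier scores 0.5 even on a skewed dataset.

def balanced_accuracy(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(y_true)                  # number of positive examples
    neg = len(y_true) - pos            # number of negative examples
    return 0.5 * (tp / pos + tn / neg)

# Invented labels: 4 epileptic (1) vs. 6 non-epileptic (0) animals.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
print(balanced_accuracy(y_true, y_pred))
```

Here sensitivity is 3/4 and specificity is 4/6, so the score averages the two rather than rewarding the larger class.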

    Data Mining

    Data mining is a branch of computer science used to automatically extract meaningful, useful knowledge and previously unknown, hidden, interesting patterns from large amounts of data to support the decision-making process. This book presents recent theoretical and practical advances in the field of data mining. It discusses a number of data mining methods, including classification, clustering, and association rule mining. The book brings together many successful data mining studies in areas such as health, banking, education, software engineering, animal science, and the environment.

    Tractographie de la matière blanche par réseaux de neurones récurrents

    The brain's white matter is still the subject of many studies. Thanks to diffusion MRI, brain connectivity can be studied non-invasively with unprecedented precision. The reconstruction of the white matter, tractography, is not perfect, however. Indeed, tractography tends to reconstruct all possible paths through the white matter; the expertise of neuroanatomists is therefore required to distinguish anatomically plausible paths from those that result from poor reconstruction. This knowledge is difficult to express and to codify as logical rules. Artificial intelligence resurfaced in the 1990s, following a remarkable improvement in processor speed, as a viable solution to several problems that had been considered fundamentally [...] and nearly impossible for a machine to solve. It offers a unique tool for integrating the expertise of neuroanatomists into the white-matter reconstruction process without having to provide explicit rules. A model can thus learn the definition of a valid path from valid examples and then reproduce what it has learned, without repeating the classic errors. In particular, recurrent neural networks are a family of models created specifically for processing sequences of data. Since a white-matter fiber is represented by a sequence of points, the connection is natural. Despite their enormous potential, applying recurrent networks to tractography faces several technical problems.
This thesis is deliberately exploratory; it details the beginnings of the use of recurrent neural networks for machine-learning tractography, the problems that arose from the creation of a multitude of algorithms based on artificial intelligence, and the solutions developed to address these problems. The results of this thesis demonstrate the potential of recurrent neural networks for white-matter reconstruction, and contribute to the advancement of the field through the creation of a public database for machine-learning tractography.
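The abstract's core observation is that a streamline is a sequence of 3-D points, which maps naturally onto a recurrent model. A minimal NumPy sketch of that mapping; the vanilla RNN cell, hidden size, and random weights are stand-ins of my own, not the thesis's architecture:

```python
import numpy as np

# Illustration only: a vanilla RNN cell consumes a streamline point by
# point, carrying a hidden state, and emits a unit direction that a
# tracker could follow for the next step. Weights are random stand-ins.

rng = np.random.default_rng(0)
H = 16                                   # assumed hidden-state size
Wx = rng.standard_normal((H, 3)) * 0.1   # input (x, y, z) -> hidden
Wh = rng.standard_normal((H, H)) * 0.1   # hidden -> hidden (recurrence)
Wo = rng.standard_normal((3, H)) * 0.1   # hidden -> raw direction

def track_step(h, point):
    h = np.tanh(Wx @ point + Wh @ h)     # recurrent state update
    d = Wo @ h
    return h, d / np.linalg.norm(d)      # normalize to a unit step

streamline = np.array([[0.0, 0.0, 0.0], [0.5, 0.1, 0.0], [1.0, 0.2, 0.1]])
h = np.zeros(H)
for p in streamline:                     # one direction per visited point
    h, direction = track_step(h, p)
print(direction.shape)
```

Training such a model on expert-validated streamlines is what lets it internalize anatomical plausibility without hand-written rules.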