500 research outputs found

    ์‹ฌ์ธต์‹ ๊ฒฝ๋ง์„ ์ด์šฉํ•œ ์ž๋™ํ™”๋œ ์น˜๊ณผ ์˜๋ฃŒ์˜์ƒ ๋ถ„์„

    ํ•™์œ„๋…ผ๋ฌธ(๋ฐ•์‚ฌ) -- ์„œ์šธ๋Œ€ํ•™๊ต๋Œ€ํ•™์› : ์น˜๊ณผ๋Œ€ํ•™ ์น˜์˜๊ณผํ•™๊ณผ, 2021.8. ํ•œ์ค‘์„.๋ชฉ ์ : ์น˜๊ณผ ์˜์—ญ์—์„œ๋„ ์‹ฌ์ธต์‹ ๊ฒฝ๋ง(Deep Neural Network) ๋ชจ๋ธ์„ ์ด์šฉํ•œ ๋ฐฉ์‚ฌ์„ ์‚ฌ์ง„์—์„œ์˜ ์ž„ํ”Œ๋ž€ํŠธ ๋ถ„๋ฅ˜, ๋ณ‘์†Œ ์œ„์น˜ ํƒ์ง€ ๋“ฑ์˜ ์—ฐ๊ตฌ๋“ค์ด ์ง„ํ–‰๋˜์—ˆ์œผ๋‚˜, ์ตœ๊ทผ ๊ฐœ๋ฐœ๋œ ํ‚คํฌ์ธํŠธ ํƒ์ง€(keypoint detection) ๋ชจ๋ธ ๋˜๋Š” ์ „์ฒด์  ๊ตฌํšํ™”(panoptic segmentation) ๋ชจ๋ธ์„ ์˜๋ฃŒ๋ถ„์•ผ์— ์ ์šฉํ•œ ์—ฐ๊ตฌ๋Š” ์•„์ง ๋ฏธ๋น„ํ•˜๋‹ค. ๋ณธ ์—ฐ๊ตฌ์˜ ๋ชฉ์ ์€ ์น˜๊ทผ๋‹จ ๋ฐฉ์‚ฌ์„ ์‚ฌ์ง„์—์„œ ํ‚คํฌ์ธํŠธ ํƒ์ง€๋ฅผ ์ด์šฉํ•ด ์ž„ํ”Œ๋ž€ํŠธ ๊ณจ ์†Œ์‹ค ์ •๋„๋ฅผ ํŒŒ์•…ํ•˜๋Š” ๋ชจ๋ธ๊ณผ panoptic segmentation์„ ํŒŒ๋…ธ๋ผ๋งˆ์˜์ƒ์— ์ ์šฉํ•˜์—ฌ ๋‹ค์–‘ํ•œ ๊ตฌ์กฐ๋ฌผ๋“ค์„ ๊ตฌํšํ™”ํ•˜๋Š” ๋ชจ๋ธ์„ ํ•™์Šต์‹œ์ผœ ์ง„๋ฃŒ์— ๋ณด์กฐ์ ์œผ๋กœ ํ™œ์šฉ๋˜๋„๋ก ๋งŒ๋“ค์–ด๋ณด๊ณ , ์ด ๋ชจ๋ธ๋“ค์˜ ์ถ”๋ก ๊ฒฐ๊ณผ๋ฅผ ํ‰๊ฐ€ํ•ด๋ณด๋Š” ๊ฒƒ์ด๋‹ค. ๋ฐฉ ๋ฒ•: ๊ฐ์ฒด ํƒ์ง€ ๋ฐ ๊ตฌํšํ™”์— ์žˆ์–ด ๋„๋ฆฌ ์—ฐ๊ตฌ๋œ ํ•ฉ์„ฑ๊ณฑ ์‹ ๊ฒฝ๋ง ๋ชจ๋ธ์ธ Mask-RCNN์„ ํ‚คํฌ์ธํŠธ ํƒ์ง€๊ฐ€ ๊ฐ€๋Šฅํ•œ ํ˜•ํƒœ๋กœ ์ค€๋น„ํ•˜์—ฌ ์น˜๊ทผ๋‹จ ๋ฐฉ์‚ฌ์„ ์‚ฌ์ง„์—์„œ ์ž„ํ”Œ๋ž€ํŠธ์˜ top, apex, ๊ทธ๋ฆฌ๊ณ  bone level ์ง€์ ์„ ์ขŒ์šฐ๋กœ ์ด 6์ง€์  ํƒ์ง€ํ•˜๊ฒŒ๋” ํ•™์Šต์‹œํ‚จ ๋’ค, ํ•™์Šต์— ์‚ฌ์šฉ๋˜์ง€ ์•Š์€ ์‹œํ—˜ ๋ฐ์ดํ„ฐ์…‹์„ ๋Œ€์ƒ์œผ๋กœ ํƒ์ง€์‹œํ‚จ๋‹ค. ํ‚คํฌ์ธํŠธ ํƒ์ง€ ํ‰๊ฐ€์šฉ ์ง€ํ‘œ์ธ object keypoint similarity (OKS) ๋ฐ ์ด๋ฅผ ์ด์šฉํ•œ average precision (AP) ๊ฐ’์„ ๊ณ„์‚ฐํ•˜๊ณ , ํ‰๊ท  OKS๊ฐ’์„ ํ†ตํ•ด ๋ชจ๋ธ ๋ฐ ์น˜๊ณผ์˜์‚ฌ์˜ ๊ฒฐ๊ณผ๋ฅผ ๋น„๊ตํ•œ๋‹ค. ๋˜ํ•œ, ํƒ์ง€๋œ ํ‚คํฌ์ธํŠธ๋ฅผ ๋ฐ”ํƒ•์œผ๋กœ ๋ฐฉ์‚ฌ์„ ์‚ฌ์ง„์ƒ์—์„œ์˜ ๊ณจ ์†Œ์‹ค ์ •๋„๋ฅผ ์ˆ˜์น˜ํ™”ํ•œ๋‹ค. Panoptic segmentation์„ ์œ„ํ•ด์„œ๋Š” ๊ธฐ์กด์˜ ๋ฒค์น˜๋งˆํฌ์—์„œ ์šฐ์ˆ˜ํ•œ ์„ฑ์ ์„ ๊ฑฐ๋‘” ์‹ ๊ฒฝ๋ง ๋ชจ๋ธ์ธ Panoptic DeepLab์„ ํŒŒ๋…ธ๋ผ๋งˆ์˜์ƒ์—์„œ ์ฃผ์š” ๊ตฌ์กฐ๋ฌผ(์ƒ์•…๋™, ์ƒ์•…๊ณจ, ํ•˜์•…๊ด€, ํ•˜์•…๊ณจ, ์ž์—ฐ์น˜, ์น˜๋ฃŒ๋œ ์น˜์•„, ์ž„ํ”Œ๋ž€ํŠธ)์„ ๊ตฌํšํ™”ํ•˜๋„๋ก ํ•™์Šต์‹œํ‚จ ๋’ค, ์‹œํ—˜ ๋ฐ์ดํ„ฐ์…‹์—์„œ์˜ ๊ตฌํšํ™” ๊ฒฐ๊ณผ์— panoptic / semantic / instance segmentation ๊ฐ๊ฐ์˜ ํ‰๊ฐ€์ง€ํ‘œ๋“ค์„ ์ ์šฉํ•˜๊ณ , ํ”ฝ์…€๋“ค์˜ ์ •๋‹ต(ground truth) ํด๋ž˜์Šค์™€ ๋ชจ๋ธ์ด ์ถ”๋ก ํ•œ ํด๋ž˜์Šค์— ๋Œ€ํ•œ confusion matrix๋ฅผ ๊ณ„์‚ฐํ•œ๋‹ค. ๊ฒฐ ๊ณผ: OKS๊ฐ’์„ ๊ธฐ๋ฐ˜์œผ๋กœ ๊ณ„์‚ฐํ•œ ํ‚คํฌ์ธํŠธ ํƒ์ง€ AP๋Š”, ๋ชจ๋“  OKS threshold์— ๋Œ€ํ•œ ํ‰๊ท ์˜ ๊ฒฝ์šฐ, ์ƒ์•… ์ž„ํ”Œ๋ž€ํŠธ์—์„œ๋Š” 0.761, ํ•˜์•… ์ž„ํ”Œ๋ž€ํŠธ์—์„œ๋Š” 0.786์ด์—ˆ๋‹ค. ํ‰๊ท  OKS๋Š” ๋ชจ๋ธ์ด 0.8885, ์น˜๊ณผ์˜์‚ฌ๊ฐ€ 0.9012๋กœ, ํ†ต๊ณ„์ ์œผ๋กœ ์œ ์˜๋ฏธํ•œ ์ฐจ์ด๊ฐ€ ์—†์—ˆ๋‹ค (p = 0.41). ๋ชจ๋ธ์˜ ํ‰๊ท  OKS ๊ฐ’์€ ์‚ฌ๋žŒ์˜ ํ‚คํฌ์ธํŠธ ์–ด๋…ธํ…Œ์ด์…˜ ์ •๊ทœ๋ถ„ํฌ์ƒ์—์„œ ์ƒ์œ„ 66.92% ์ˆ˜์ค€์ด์—ˆ๋‹ค. ํŒŒ๋…ธ๋ผ๋งˆ์˜์ƒ ๊ตฌ์กฐ๋ฌผ ๊ตฌํšํ™”์—์„œ๋Š”, panoptic segmentation ํ‰๊ฐ€์ง€ํ‘œ์ธ panoptic quality ๊ฐ’์˜ ๊ฒฝ์šฐ ๋ชจ๋“  ํด๋ž˜์Šค์˜ ํ‰๊ท ์€ 80.47์ด์—ˆ์œผ๋ฉฐ, ์น˜๋ฃŒ๋œ ์น˜์•„๊ฐ€ 57.13์œผ๋กœ ๊ฐ€์žฅ ๋‚ฎ์•˜๊ณ  ํ•˜์•…๊ด€์ด 65.97๋กœ ๋‘๋ฒˆ์งธ๋กœ ๋‚ฎ์€ ๊ฐ’์„ ๋ณด์˜€๋‹ค. Semantic segmentation ํ‰๊ฐ€์ง€ํ‘œ์ธ globalํ•œ Intersection over Union (IoU) ๊ฐ’์€ ๋ชจ๋“  ํด๋ž˜์Šค ํ‰๊ท  0.795์˜€์œผ๋ฉฐ, ํ•˜์•…๊ด€์ด 0.639๋กœ ๊ฐ€์žฅ ๋‚ฎ์•˜๊ณ  ์น˜๋ฃŒ๋œ ์น˜์•„๊ฐ€ 0.656์œผ๋กœ ๋‘๋ฒˆ์งธ๋กœ ๋‚ฎ์€ ๊ฐ’์„ ๋ณด์˜€๋‹ค. Confusion matrix ๊ณ„์‚ฐ ๊ฒฐ๊ณผ, ground truth ํ”ฝ์…€๋“ค ์ค‘ ์˜ฌ๋ฐ”๋ฅด๊ฒŒ ์ถ”๋ก ๋œ ํ”ฝ์…€๋“ค์˜ ๋น„์œจ์€ ํ•˜์•…๊ด€์ด 0.802๋กœ ๊ฐ€์žฅ ๋‚ฎ์•˜๋‹ค. ๊ฐœ๋ณ„ ๊ฐ์ฒด์— ๋Œ€ํ•œ IoU๋ฅผ ๊ธฐ๋ฐ˜์œผ๋กœ ๊ณ„์‚ฐํ•œ Instance segmentation ํ‰๊ฐ€์ง€ํ‘œ์ธ AP๊ฐ’์€, ๋ชจ๋“  IoU threshold์— ๋Œ€ํ•œ ํ‰๊ท ์˜ ๊ฒฝ์šฐ, ์น˜๋ฃŒ๋œ ์น˜์•„๊ฐ€ 0.316, ์ž„ํ”Œ๋ž€ํŠธ๊ฐ€ 0.414, ์ž์—ฐ์น˜๊ฐ€ 0.520์ด์—ˆ๋‹ค. 
๊ฒฐ ๋ก : ํ‚คํฌ์ธํŠธ ํƒ์ง€ ์‹ ๊ฒฝ๋ง ๋ชจ๋ธ์„ ์ด์šฉํ•˜์—ฌ, ์น˜๊ทผ๋‹จ ๋ฐฉ์‚ฌ์„ ์‚ฌ์ง„์—์„œ ์ž„ํ”Œ๋ž€ํŠธ์˜ ์ฃผ์š” ์ง€์ ์„ ์‚ฌ๋žŒ๊ณผ ๋‹ค์†Œ ์œ ์‚ฌํ•œ ์ˆ˜์ค€์œผ๋กœ ํƒ์ง€ํ•  ์ˆ˜ ์žˆ์—ˆ๋‹ค. ๋˜ํ•œ, ํƒ์ง€๋œ ์ง€์ ๋“ค์„ ๊ธฐ๋ฐ˜์œผ๋กœ ๋ฐฉ์‚ฌ์„ ์‚ฌ์ง„์ƒ์—์„œ์˜ ์ž„ํ”Œ๋ž€ํŠธ ์ฃผ์œ„ ๊ณจ ์†Œ์‹ค ๋น„์œจ ๊ณ„์‚ฐ์„ ์ž๋™ํ™”ํ•  ์ˆ˜ ์žˆ๊ณ , ์ด ๊ฐ’์€ ์ž„ํ”Œ๋ž€ํŠธ ์ฃผ์œ„์—ผ์˜ ์‹ฌ๋„ ๋ถ„๋ฅ˜์— ์‚ฌ์šฉ๋  ์ˆ˜ ์žˆ๋‹ค. ํŒŒ๋…ธ๋ผ๋งˆ ์˜์ƒ์—์„œ๋Š” panoptic segmentation์ด ๊ฐ€๋Šฅํ•œ ์‹ ๊ฒฝ๋ง ๋ชจ๋ธ์„ ์ด์šฉํ•˜์—ฌ ์ƒ์•…๋™๊ณผ ํ•˜์•…๊ด€์„ ํฌํ•จํ•œ ์ฃผ์š” ๊ตฌ์กฐ๋ฌผ๋“ค์„ ๊ตฌํšํ™”ํ•  ์ˆ˜ ์žˆ์—ˆ๋‹ค. ๋”ฐ๋ผ์„œ, ์ด์™€ ๊ฐ™์ด ๊ฐ ์ž‘์—…์— ๋งž๋Š” ์‹ฌ์ธต์‹ ๊ฒฝ๋ง์„ ์ ์ ˆํ•œ ๋ฐ์ดํ„ฐ๋กœ ํ•™์Šต์‹œํ‚จ๋‹ค๋ฉด ์ง„๋ฃŒ ๋ณด์กฐ ์ˆ˜๋‹จ์œผ๋กœ ํ™œ์šฉ๋  ์ˆ˜ ์žˆ๋‹ค.Purpose: In dentistry, deep neural network models have been applied in areas such as implant classification or lesion detection in radiographs. However, few studies have applied the recently developed keypoint detection model or panoptic segmentation model to medical or dental images. The purpose of this study is to train two neural network models to be used as aids in clinical practice and evaluate them: a model to determine the extent of implant bone loss using keypoint detection in periapical radiographs and a model that segments various structures on panoramic radiographs using panoptic segmentation. Methods: Mask-RCNN, a widely studied convolutional neural network for object detection and instance segmentation, was constructed in a form that is capable of keypoint detection, and trained to detect six points of an implant in a periapical radiograph: left and right of the top, apex, and bone level. Next, a test dataset was used to evaluate the inference results. Object keypoint similarity (OKS), a metric to evaluate the keypoint detection task, and average precision (AP), based on the OKS values, were calculated. Furthermore, the results of the model and those arrived at by a dentist were compared using the mean OKS. Based on the detected keypoint, the peri-implant bone loss ratio was obtained from the radiograph. For panoptic segmentation, Panoptic DeepLab, a neural network model ranked high in the previous benchmark, was trained to segment key structures in panoramic radiographs: maxillary sinus, maxilla, mandibular canal, mandible, natural tooth, treated tooth, and dental implant. Then, each evaluation metric of panoptic, semantic, and instance segmentation was applied to the inference results of the test dataset. Finally, the confusion matrix for the ground truth class of pixels and the class inferred by the model was obtained. Results: The AP of keypoint detection for the average of all OKS thresholds was 0.761 for the upper implants and 0.786 for the lower implants. The mean OKS was 0.8885 for the model and 0.9012 for the dentist; thus, the difference was not statistically significant (p = 0.41). The mean OKS of the model was in the top 66.92% of the normal distribution of human keypoint annotations. In panoramic radiograph segmentation, the average panoptic quality (PQ) of all classes was 80.47. The treated teeth showed the lowest PQ of 57.13, and the mandibular canal showed the second lowest PQ of 65.97. The Intersection over Union (IoU) was 0.795 on average for all classes, where the mandibular canal showed the lowest IoU of 0.639, and the treated tooth showed the second lowest IoU of 0.656. In the confusion matrix, the proportion of correctly inferred pixels among the ground truth pixels was the lowest in the mandibular canal at 0.802. 
The AP, averaged for all IoU thresholds, was 0.316 for the treated tooth, 0.414 for the dental implant, and 0.520 for the normal tooth. Conclusion: Using the keypoint detection neural network model, it was possible to detect major landmarks around dental implants in periapical radiographs to a degree similar to that of human experts. In addition, it was possible to automate the calculation of the peri-implant bone loss ratio on periapical radiographs based on the detected keypoints, and this value could be used to classify the degree of peri-implantitis. In panoramic radiographs, the major structures including the maxillary sinus and the mandibular canal could be segmented using a neural network model capable of panoptic segmentation. Thus, if deep neural networks suitable for each task are trained using suitable datasets, the proposed approach can be used to assist dental clinicians.Chapter 1. Introduction 1 Chapter 2. Materials and methods 5 Chapter 3. Results 23 Chapter 4. Discussion 32 Chapter 5. Conclusions 45 Published papers related to this study 46 References 47 Abbreviations 52 Abstract in Korean 53 Acknowledgements 56๋ฐ•
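The two measurements above are standard enough to sketch in code. Below is a minimal Python sketch of the COCO-style OKS computation and of a peri-implant bone loss ratio derived from the detected top, apex, and bone-level keypoints; the per-keypoint constants `k` and the exact ratio definition are illustrative assumptions, not values taken from the thesis.

```python
import numpy as np

def oks(pred, gt, scale, k):
    """COCO-style object keypoint similarity.

    pred, gt: (N, 2) arrays of keypoint coordinates.
    scale:    object scale, e.g. sqrt of the implant bounding-box area.
    k:        (N,) per-keypoint falloff constants (assumed values).
    """
    d2 = np.sum((pred - gt) ** 2, axis=1)            # squared pixel distances
    return float(np.mean(np.exp(-d2 / (2 * scale**2 * k**2))))

def bone_loss_ratio(top, apex, bone_level):
    """Bone loss ratio along one side of the implant: distance from the
    implant top to the detected bone level, normalized by the implant
    length (top to apex). An assumed definition for illustration."""
    implant_len = np.linalg.norm(apex - top)
    loss = np.linalg.norm(bone_level - top)
    return loss / implant_len
```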

    Scene understanding from 3D point clouds and RGB images for autonomous driving

    Autonomous cars are often equipped with 3D data acquisition sensors, e.g., LiDAR, which provide a 3D point cloud describing the surroundings. Direct acquisition of 3D data from these sensors is commonly used for obstacle avoidance and mapping. Analysing 3D point clouds is complex, since point clouds are unstructured, unordered, and contain a varying number of points. The most common approach to scene understanding in images is the Convolutional Neural Network (CNN). Although CNNs achieve high performance in image analysis, they cannot be applied naturally to point clouds. Several methods for extending CNNs to 3D point cloud analysis have been proposed, such as rasterizing the cloud into a 3D voxel grid so a CNN can be applied directly, or using a Graph Convolutional Network. The main goal of this dissertation is to study and compare different approaches for scene understanding from 3D point clouds within the scope of driving automation systems. Moreover, the project contemplates the study of sensor fusion approaches, namely how to combine 3D point clouds and images. In light of this, the project uses a sensor fusion technique called PointPainting, which uses image segmentation to enhance 3D object detection on point clouds.
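As a rough illustration of the PointPainting idea, the sketch below (assuming a calibrated camera and a known LiDAR-to-camera extrinsic transform) projects each LiDAR point into the image plane and appends the per-class segmentation scores of the pixel it lands on; the names and matrices here are placeholders, not the dissertation's actual code.

```python
import numpy as np

def paint_points(points, seg_scores, T_cam_lidar, K):
    """Decorate LiDAR points with image segmentation scores (PointPainting).

    points:      (N, 3) LiDAR points in the sensor frame.
    seg_scores:  (H, W, C) per-pixel class scores from an image segmenter.
    T_cam_lidar: (4, 4) LiDAR-to-camera extrinsic transform (assumed known).
    K:           (3, 3) camera intrinsic matrix.
    Returns (M, 3 + C) painted points that project inside the image.
    """
    ones = np.ones((points.shape[0], 1))
    pts_cam = (T_cam_lidar @ np.hstack([points, ones]).T).T[:, :3]
    in_front = pts_cam[:, 2] > 0                       # keep points ahead of the camera
    pts_cam = pts_cam[in_front]
    uvw = (K @ pts_cam.T).T
    uv = (uvw[:, :2] / uvw[:, 2:3]).astype(int)        # pixel coordinates (u, v)
    H, W, _ = seg_scores.shape
    valid = (uv[:, 0] >= 0) & (uv[:, 0] < W) & (uv[:, 1] >= 0) & (uv[:, 1] < H)
    scores = seg_scores[uv[valid, 1], uv[valid, 0]]    # sample scores at (v, u)
    return np.hstack([points[in_front][valid], scores])
```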

    Computer vision for plant and animal inventory

    The population, composition, and spatial distribution of plants and animals in a region are important data for natural resource management, conservation, and farming. Traditional ways to acquire such data require human participation, and data processing by humans is usually cumbersome, expensive, and time-consuming. Hence, algorithms for automatic animal and plant inventory are valuable and have become a hot topic. We propose a series of computer vision methods for automated plant and animal inventory to recognize, localize, categorize, track, and count different objects of interest, including vegetation, trees, fish, and livestock. We make use of different sensors, hardware platforms, neural network architectures, and pipelines to deal with the varied properties and challenges of these objects. (1) For vegetation analysis, we propose a fast multistage method to estimate coverage: the reference board is localized based on its edge and texture features, a K-means color model of the board is generated, and the vegetation is then segmented at the pixel level using the color model. The proposed method is robust to changes in lighting conditions. (2) For tree counting in aerial images, we propose a novel method called density transformer, or DENT, to learn and predict the density of trees at different positions. DENT uses an efficient multi-receptive-field network to extract visual features from different positions, and a transformer encoder filters and transfers useful contextual information across spatial positions. DENT significantly outperformed existing state-of-the-art CNN detectors and regressors on both our own dataset and an existing cross-site dataset. (3) We propose a framework for fish classification using boat cameras. The framework contains two branches: one extracts contextual information from the whole image, while the other localizes individual fish and normalizes their poses. The classification results from the two branches are weighted based on the clearness of the image and the familiarity of the context. Our system ranked in the top 1 percent in The Nature Conservancy Fisheries Monitoring competition. (4) We also propose a video-based pig counting algorithm using an inspection robot, adopting a novel bottom-up keypoint tracking method and a novel spatial-aware temporal response filtering method to count the pigs. The proposed approach outperformed the other methods, and even human competitors, in our experiments.
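The vegetation-coverage step in (1) can be sketched as follows, assuming the reference-board pixels have already been extracted by the localization stage; the cluster count and distance threshold are illustrative assumptions, and the actual method may use the color model differently.

```python
import numpy as np
from sklearn.cluster import KMeans

def fit_color_model(reference_pixels, n_clusters=4):
    """Fit a K-means color model on (M, 3) RGB pixels sampled from the
    localized reference board (board extraction assumed done upstream)."""
    return KMeans(n_clusters=n_clusters, n_init=10).fit(reference_pixels)

def segment_vegetation(image, color_model, threshold=40.0):
    """Label a pixel as vegetation when it is far from every cluster
    center of the board's color model. The distance threshold is an
    assumed value for illustration."""
    h, w, _ = image.shape
    pixels = image.reshape(-1, 3).astype(float)
    centers = color_model.cluster_centers_
    d = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
    mask = d.min(axis=1) > threshold           # far from all board colors
    return mask.reshape(h, w)
```

Coverage can then be estimated as the mean of the returned mask.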

    Detection and Mosaicing through Deep Learning Models for Low-Quality Retinal Images

    Glaucoma is a severe eye disease that is asymptomatic in its initial stages and, because it is degenerative, can lead to blindness. There is no available cure for it, and it is the second most common cause of blindness in the world. Most people affected by it only discover the disease when it is already too late. Regular visits to the ophthalmologist, with a precise diagnosis performed on professional equipment, are the best way to prevent or contain it. For some individuals or populations, however, this can be difficult to accomplish due to several restrictions, such as low income, geographical adversities, and travelling constraints (distance, lack of means of transportation, etc.). Logistically, relocating the professional equipment is expensive because of its dimensions, so bringing it to remote areas is often not viable. Low-cost products like the D-Eye lens offer an alternative to meet this need: the D-Eye lens can be attached to a smartphone to capture fundus images, but it produces lower-quality images than professional equipment. This work presents and evaluates methods for retina reading from D-Eye recordings, exposing the retina in two steps: object detection and summarization via object mosaicing. Deep learning models of the YOLO family were used as object detectors for retina registration. The summarization methods presented in this work mosaic the best retina images together to produce a more detailed resultant image. After selecting the best workflow, a final inference was performed and visually evaluated; the results were not rich enough to serve as a pre-screening medical assessment, indicating that improvements in the algorithm and imaging technology are needed to retrieve better images.
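The mosaicing step can be illustrated with a generic feature-based stitcher; the sketch below uses ORB matches and a RANSAC homography via OpenCV, which is a common approach but not necessarily the exact pipeline evaluated in this work.

```python
import cv2
import numpy as np

def mosaic_pair(img_a, img_b, min_matches=10):
    """Stitch two retina crops by estimating a homography from ORB
    feature matches (a generic sketch, not the thesis pipeline)."""
    orb = cv2.ORB_create(2000)
    kp_a, des_a = orb.detectAndCompute(img_a, None)
    kp_b, des_b = orb.detectAndCompute(img_b, None)
    if des_a is None or des_b is None:
        return None                              # no usable features found
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des_a, des_b), key=lambda m: m.distance)
    if len(matches) < min_matches:
        return None                              # not enough overlap to stitch
    src = np.float32([kp_b[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp_a[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    h, w = img_a.shape[:2]
    canvas = cv2.warpPerspective(img_b, H, (w * 2, h * 2))
    canvas[:h, :w] = img_a                       # overlay the reference image
    return canvas
```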

    Nucleus segmentation : towards automated solutions

    Single-nucleus segmentation is a frequent challenge in microscopy image processing, since it is the first step of many quantitative data analysis pipelines. The quality of tracking single cells, extracting features, or classifying cellular phenotypes strongly depends on segmentation accuracy. Worldwide competitions have been held aiming to improve segmentation, and recent years have brought significant improvements: large annotated datasets are now freely available, several 2D segmentation strategies have been extended to 3D, and deep learning approaches have increased accuracy. However, even today, no generally accepted solution or benchmarking platform exists. We review the most recent single-cell segmentation tools and provide an interactive method browser to select the most appropriate solution.
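Most benchmarks of the kind this review surveys score a method by matching predicted nuclei to ground-truth nuclei through their intersection over union (IoU). A minimal sketch of that matching, assuming integer-labeled instance masks:

```python
import numpy as np

def match_nuclei(gt_labels, pred_labels, iou_thresh=0.5):
    """Count true-positive nucleus matches between two label masks,
    where each integer > 0 marks one nucleus instance. A minimal sketch
    of the IoU-based matching common to segmentation benchmarks."""
    tp = 0
    for g in np.unique(gt_labels):
        if g == 0:
            continue                                   # 0 is background
        gt_mask = gt_labels == g
        for p in np.unique(pred_labels[gt_mask]):      # only overlapping candidates
            if p == 0:
                continue
            pred_mask = pred_labels == p
            iou = (gt_mask & pred_mask).sum() / (gt_mask | pred_mask).sum()
            if iou >= iou_thresh:
                tp += 1
                break                                  # one match per GT nucleus
    return tp
```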

    Deep learning for real-world object detection


    Hybrid model for Single-Stage Multi-Person Pose Estimation

    In general, human pose estimation methods are categorized into two approaches according to their architectures: regression (i.e., heatmap-free) and heatmap-based methods. The former directly estimates precise coordinates of each keypoint using convolutional and fully connected layers. Although this approach can detect overlapped and dense keypoints, it can produce unexpected results for keypoints that do not exist in the scene. The latter, on the other hand, can filter out non-existent keypoints by utilizing the predicted heatmap of each keypoint. Nevertheless, it suffers from quantization error when the keypoint coordinates are recovered from the heatmaps, and, unlike regression, it has difficulty distinguishing densely placed keypoints in an image. To this end, we propose a hybrid model for single-stage multi-person pose estimation, named HybridPose, which mutually overcomes the drawbacks of both approaches by maximizing their strengths. Furthermore, we introduce a self-correlation loss to inject spatial dependencies between keypoint coordinates and their visibility. HybridPose is therefore capable of not only detecting densely placed keypoints but also filtering out non-existent keypoints in an image. Experimental results demonstrate that the proposed HybridPose exhibits keypoint visibility without performance degradation in pose estimation accuracy.
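To make the quantization-error point concrete, the sketch below decodes one keypoint from a heatmap: a plain argmax snaps the coordinate to the heatmap grid, and the common quarter-pixel shift toward the larger neighbor only partially compensates. This is a standard decoding trick, not HybridPose's own decoder.

```python
import numpy as np

def decode_keypoint(heatmap, stride=4):
    """Decode one keypoint from a (H, W) heatmap. A plain argmax snaps
    the coordinate to the heatmap grid (quantization error); shifting
    0.25 px toward the larger neighbor is a common mitigation."""
    h, w = heatmap.shape
    y, x = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    fx, fy = float(x), float(y)
    if 0 < x < w - 1:                          # horizontal sub-pixel shift
        fx += 0.25 * np.sign(heatmap[y, x + 1] - heatmap[y, x - 1])
    if 0 < y < h - 1:                          # vertical sub-pixel shift
        fy += 0.25 * np.sign(heatmap[y + 1, x] - heatmap[y - 1, x])
    return fx * stride, fy * stride            # map back to input-image pixels
```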
    • โ€ฆ
    corecore