20 research outputs found

    Fast object detection in compressed JPEG Images

    Object detection in still images has drawn a lot of attention over the past few years, and with the advent of deep learning, impressive performance has been achieved, with numerous industrial applications. Most of these deep learning models rely on RGB images to localize and identify objects in the image. However, in some application scenarios, images are compressed either for storage savings or for fast transmission. A time-consuming image decompression step is therefore compulsory before the aforementioned deep models can be applied. To alleviate this drawback, we propose a fast deep architecture for object detection in JPEG images, one of the most widespread compression formats. We train a neural network to detect objects based on the blockwise DCT (discrete cosine transform) coefficients produced by the JPEG compression algorithm. We modify the well-known Single Shot MultiBox Detector (SSD) by replacing its first layers with one convolutional layer dedicated to processing the DCT inputs. Experimental evaluations on PASCAL VOC and an industrial dataset comprising images of road traffic surveillance show that the model is about 2× faster than regular SSD, with promising detection performance. To the best of our knowledge, this paper is the first to address detection in compressed JPEG images.
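To make the "blockwise DCT coefficients" in the abstract above concrete, here is a minimal pure-Python sketch of an 8×8 orthonormal DCT-II applied block by block to a grayscale image. It is only an illustration: the function names (`dct_1d`, `dct_2d`, `blockwise_dct`) are hypothetical, real JPEG files additionally involve colour conversion, quantization and entropy coding, and a detector like the one described would presumably read the stored coefficients from the file rather than recompute them from pixels.

```python
import math

BLOCK = 8  # JPEG partitions each image plane into 8x8 blocks


def dct_1d(vec):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(vec)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        total = sum(
            v * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i, v in enumerate(vec)
        )
        out.append(scale * total)
    return out


def dct_2d(block):
    """Separable 2-D DCT: transform every row, then every column."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    # Transpose back so the result is indexed [vertical_freq][horizontal_freq].
    return [list(r) for r in zip(*cols)]


def blockwise_dct(image):
    """Return grid[block_row][block_col] -> 8x8 coefficient block.

    `image` is a list of rows of 0..255 intensities whose height and
    width are multiples of 8 (an assumption made to keep the sketch short).
    """
    h, w = len(image), len(image[0])
    if h % BLOCK or w % BLOCK:
        raise ValueError("image sides must be multiples of 8")
    grid = []
    for by in range(0, h, BLOCK):
        grid_row = []
        for bx in range(0, w, BLOCK):
            # JPEG level-shifts samples by 128 before the transform.
            block = [
                [image[y][x] - 128 for x in range(bx, bx + BLOCK)]
                for y in range(by, by + BLOCK)
            ]
            grid_row.append(dct_2d(block))
        grid.append(grid_row)
    return grid
```

For a flat block, all the energy ends up in the DC coefficient at index `[0][0]` and the remaining 63 coefficients are numerically zero, which is why a network fed DCT inputs sees a compact, frequency-ordered description of each 8×8 region.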

    An Approach to Image Analysis in Monitoring Systems

    This paper proposes an approach to image analysis in monitoring systems. The main focus is on constructing a semantic model of the image. Experiments on the language interpretation of the resulting model show improvements in processing speed and in image captioning quality of up to 60% (METEOR metric) compared with purely neural-network-based methods. Using the model also makes it possible to clean and normalize training data, including data for the neural network architectures used in image analysis. The prospects for applying this technique in situational monitoring are considered.

    RSS-Net: Weakly-Supervised Multi-Class Semantic Segmentation with FMCW Radar

    This paper presents an efficient annotation procedure and an application thereof to end-to-end, rich semantic segmentation of the sensed environment using FMCW scanning radar. We advocate radar over the traditional sensors used for this task, as it operates at longer ranges and is substantially more robust to adverse weather and illumination conditions. We avoid laborious manual labelling by exploiting the largest radar-focused urban autonomy dataset collected to date, correlating radar scans with RGB cameras and LiDAR sensors, for which semantic segmentation is an already consolidated procedure. The training procedure leverages a publicly available, state-of-the-art natural image segmentation system and, in contrast to previous approaches, allows for the production of copious labels for the radar stream by incorporating four camera and two LiDAR streams. Additionally, the losses are computed taking into account labels out to the radar sensor horizon, by accumulating LiDAR returns along a pose-chain ahead of and behind the current vehicle position. Finally, we present the network with multi-channel radar scan inputs in order to deal with ephemeral and dynamic scene objects. Comment: submitted to IEEE Intelligent Vehicles Symposium (IV) 202

    DC-SPP-YOLO: Dense Connection and Spatial Pyramid Pooling Based YOLO for Object Detection

    Although the YOLOv2 approach is extremely fast at object detection, its backbone network has a weak feature extraction ability and fails to make full use of multi-scale local region features, which limits further improvement of detection accuracy. This paper therefore proposes DC-SPP-YOLO (Dense Connection and Spatial Pyramid Pooling Based YOLO), an approach for improving the object detection accuracy of YOLOv2. Specifically, dense connection of convolution layers is employed in the backbone network of YOLOv2 to strengthen feature extraction and alleviate the vanishing-gradient problem. Moreover, an improved spatial pyramid pooling is introduced to pool and concatenate the multi-scale local region features, so that the network can learn the object features more comprehensively. The DC-SPP-YOLO model is built and trained with a new loss function composed of mean square error and cross entropy. Experiments demonstrate that the mAP (mean Average Precision) of the proposed DC-SPP-YOLO on the PASCAL VOC and UA-DETRAC datasets is higher than that of YOLOv2; by strengthening feature extraction and using the multi-scale local region features, DC-SPP-YOLO achieves better object detection accuracy than YOLOv2. Comment: 23 pages, 9 figures, 9 table
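The "pool and concatenate" step in the abstract above can be illustrated with a small pure-Python sketch of YOLO-style spatial pyramid pooling: each channel is max-pooled at several window sizes with stride 1 and border clipping (so the spatial size is preserved), and the pooled maps are stacked next to the original channels. The kernel sizes `(3, 5)` and the function names are placeholders chosen for this sketch; the abstract does not state the configuration of the paper's "improved" variant.

```python
def max_pool_same(fmap, k):
    """Stride-1 max pooling with an odd window k; output keeps the input size.

    Windows are clipped at the borders, which is equivalent to padding
    with negative infinity.
    """
    h, w = len(fmap), len(fmap[0])
    r = k // 2
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [
                fmap[yy][xx]
                for yy in range(max(0, y - r), min(h, y + r + 1))
                for xx in range(max(0, x - r), min(w, x + r + 1))
            ]
            row.append(max(window))
        out.append(row)
    return out


def spp_concat(channels, kernel_sizes=(3, 5)):
    """Concatenate the input channels with their pooled versions.

    `channels` is a list of 2-D maps; the result has
    len(channels) * (1 + len(kernel_sizes)) maps of the same spatial size.
    """
    out = list(channels)
    for k in kernel_sizes:
        out.extend(max_pool_same(c, k) for c in channels)
    return out
```

Because every branch keeps the spatial resolution, the concatenated tensor lets the following convolution see the same location summarized over several neighbourhood sizes at once.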

    The Use of a Convolutional Neural Network in Detecting Soldering Faults from a Printed Circuit Board Assembly

    Automatic Optical Inspection (AOI) is any method of detecting defects during a Printed Circuit Board (PCB) manufacturing process. Early AOI methods were based on classic image processing algorithms using a reference PCB. These traditional methods require very complex and inflexible preprocessing stages. With recent advances in deep learning, especially Convolutional Neural Networks (CNNs), the automation of various computer vision tasks has become established. Limited research has been carried out in the past on using CNNs for AOI. The present systems are inflexible and require many preprocessing steps or a complex illumination system to improve accuracy. This paper studies the effectiveness of using a CNN to detect soldering bridge faults in a PCB assembly (PCBA) and presents a method for designing an optimized CNN architecture for this task. The proposed CNN architecture is compared with a state-of-the-art object detection architecture, namely YOLO, with respect to detection accuracy, processing time, and memory requirement. The results of our experiments show that the proposed CNN architecture has a 3.0% better average precision, has 50% fewer parameters, and infers in half the time of YOLO. The experimental results demonstrate the effectiveness of using a CNN in AOI on images of a PCB assembly without any reference image, complex preprocessing stage, or complex illumination system. Doi: 10.28991/HIJ-2022-03-01-01

    A New Deep Architecture for Image Segmentation in Photolithography Inspection Systems

    Doctoral thesis -- Seoul National University Graduate School: Graduate School of Convergence Science and Technology, Department of Transdisciplinary Studies (Intelligent Convergence Systems major), 2021.8. 홍성수.
    In semiconductor manufacturing, defect detection is critical to maintaining high yield. Typically, defects in a semiconductor wafer arise during the manufacturing process. Most computer vision systems used in semiconductor photolithography process inspection still rely on image processing algorithms, which often produce inspection faults because they are sensitive to changes in the external environment. We therefore intend to tackle this problem by combining the advantages of image processing algorithms and deep learning. In this dissertation, we propose the Image Segmentation Detector (ISD) to extract enhanced feature maps in situations where the training dataset is limited, as in a specific industry domain such as semiconductor photolithography inspection. ISD is used as a novel backbone network of the state-of-the-art Mask R-CNN framework for image segmentation. ISD consists of four dense blocks and four transition layers. In particular, each dense block in ISD has a shortcut connection and concatenates the feature maps produced in each layer with a dynamic growth rate for greater compactness. ISD is trained from scratch, without using the transfer learning methods adopted in recent work. Additionally, ISD is trained on an image dataset pre-processed with an image filter of our own design, in order to extract a better enhanced feature map from the Convolutional Neural Network (CNN). One of the key design principles of ISD is compactness, which plays a critical role in addressing real-time requirements and in deployment on resource-bounded devices. To demonstrate the model empirically, this dissertation uses existing images obtained from the computer vision system embedded in currently operating semiconductor manufacturing equipment.
    ISD achieves consistently better results than state-of-the-art methods on standard mean average precision, the most common metric used to measure the accuracy of instance detection. Significantly, ISD outperforms the baseline method DenseNet while requiring only 1/4 of the parameters. We also observe that ISD can achieve comparable or better performance than ResNet with only 1/268 of the parameters, using no extra data or pre-trained models. Our experimental results show that ISD can be useful to many future image segmentation research efforts in diverse fields of the semiconductor industry that require real-time operation and good performance with only a limited training dataset.
    Contents: Chapter 1. Introduction (1.1 Background and Motivation). Chapter 2. Related Work (2.1 Inspection Method; 2.2 Instance Segmentation; 2.3 Backbone Structure; 2.4 Enhanced Feature Map; 2.5 Detection Performance Evaluation; 2.6 Learning Network Model from Scratch). Chapter 3. Proposed Method (3.1 ISD Architecture; 3.2 Pre-processing; 3.3 Model Training; 3.4 Training Objective; 3.5 Setting and Configurations). Chapter 4. Experimental Evaluation (4.1 Classification Results on ISD; 4.2 Comparison with Pre-processing; 4.3 Image Segmentation Results on ISD: 4.3.1 Results on Suck-back State, 4.3.2 Results on Dispensing State; 4.4 Comparison with State-of-the-art Methods). Chapter 5. Conclusion. Bibliography. Abstract (in Korean).
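As a rough illustration of why the growth rate matters for the compactness of a dense block such as those described in the ISD abstract, the sketch below tracks how many channels each layer receives when every layer's output is concatenated onto its input. The helper name and the example growth rates are hypothetical; the abstract does not say how ISD schedules its dynamic growth rate, so the per-layer list is simply an assumed way of expressing one.

```python
def dense_block_channels(in_channels, growth_rates):
    """Channel bookkeeping for one dense block.

    Each layer sees the concatenation of the block input and all earlier
    layer outputs, then contributes `g` new channels of its own.
    Returns (input width seen by each layer, output width of the block).
    """
    widths = []
    current = in_channels
    for g in growth_rates:
        if g <= 0:
            raise ValueError("growth rates must be positive")
        widths.append(current)
        current += g
    return widths, current


# Fixed growth rate, as in a standard DenseNet-style block.
fixed = dense_block_channels(16, [8, 8, 8])

# Per-layer growth rates, one possible reading of a "dynamic" growth rate.
varying = dense_block_channels(16, [4, 8, 16])
```

Since the parameter count of each convolution scales with the number of input channels it sees, keeping early growth rates small limits how quickly later layers widen.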