Search CORE

2,467 research outputs found

A CNN based hybrid approach towards automatic image registration

Author: Arun Pattathal Vijayakumar
Publication venue: 'Vilnius Gediminas Technical University'
Publication date: 26/09/2013
Field of study

Image registration is a key component of spatial analyses that involve different data sets of the same area. Automatic approaches in this domain have witnessed the application of several intelligent methodologies over the past decade; however accuracy of these approaches have been limited due to the inability to properly model shape as well as contextual information. In this paper, we investigate the possibility of an evolutionary computing based framework towards automatic image registration. Cellular Neural Network has been found to be effective in improving feature matching as well as resampling stages of registration, and complexity of the approach has been considerably reduced using corset optimization. CNN-prolog based approach has been adopted to dynamically use spectral and spatial information for representing contextual knowledge. The salient features of this work are feature point optimisation, adaptive resampling and intelligent object modelling. Investigations over various satellite images revealed that considerable success has been achieved with the procedure. Methodology also illustrated to be effective in providing intelligent interpretation and adaptive resampling

VGTU Journals (Vilnius Gediminas Technical University - Vilnius Tech)

Object Detection in 20 Years: A Survey

Author: Guo Yuhong
Shi Zhenwei
Ye Jieping
Zou Zhengxia
Publication venue
Publication date: 15/05/2019
Field of study

Object detection, as of one the most fundamental and challenging problems in computer vision, has received great attention in recent years. Its development in the past two decades can be regarded as an epitome of computer vision history. If we think of today's object detection as a technical aesthetics under the power of deep learning, then turning back the clock 20 years we would witness the wisdom of cold weapon era. This paper extensively reviews 400+ papers of object detection in the light of its technical evolution, spanning over a quarter-century's time (from the 1990s to 2019). A number of topics have been covered in this paper, including the milestone detectors in history, detection datasets, metrics, fundamental building blocks of the detection system, speed up techniques, and the recent state of the art detection methods. This paper also reviews some important detection applications, such as pedestrian detection, face detection, text detection, etc, and makes an in-deep analysis of their challenges as well as technical improvements in recent years.Comment: This work has been submitted to the IEEE TPAMI for possible publicatio

arXiv.org e-Print Archive

CPSeg: Finer-grained Image Semantic Segmentation via Chain-of-Thought Language Prompting

Author: Li Lei
Publication venue
Publication date: 26/10/2023
Field of study

Natural scene analysis and remote sensing imagery offer immense potential for advancements in large-scale language-guided context-aware data utilization. This potential is particularly significant for enhancing performance in downstream tasks such as object detection and segmentation with designed language prompting. In light of this, we introduce the CPSeg, Chain-of-Thought Language Prompting for Finer-grained Semantic Segmentation), an innovative framework designed to augment image segmentation performance by integrating a novel "Chain-of-Thought" process that harnesses textual information associated with images. This groundbreaking approach has been applied to a flood disaster scenario. CPSeg encodes prompt texts derived from various sentences to formulate a coherent chain-of-thought. We propose a new vision-language dataset, FloodPrompt, which includes images, semantic masks, and corresponding text information. This not only strengthens the semantic understanding of the scenario but also aids in the key task of semantic segmentation through an interplay of pixel and text matching maps. Our qualitative and quantitative analyses validate the effectiveness of CPSeg.Comment: WACV 202

arXiv.org e-Print Archive

From Blurry to Brilliant Detection: YOLOv5-Based Aerial Object Detection with Super Resolution

Author: Itoyama Katsutoshi
Nakadai Kazuhiro
Nihal Ragib Amin
Yen Benjamin
Publication venue
Publication date: 26/01/2024
Field of study

The demand for accurate object detection in aerial imagery has surged with the widespread use of drones and satellite technology. Traditional object detection models, trained on datasets biased towards large objects, struggle to perform optimally in aerial scenarios where small, densely clustered objects are prevalent. To address this challenge, we present an innovative approach that combines super-resolution and an adapted lightweight YOLOv5 architecture. We employ a range of datasets, including VisDrone-2023, SeaDroneSee, VEDAI, and NWPU VHR-10, to evaluate our model's performance. Our Super Resolved YOLOv5 architecture features Transformer encoder blocks, allowing the model to capture global context and context information, leading to improved detection results, especially in high-density, occluded conditions. This lightweight model not only delivers improved accuracy but also ensures efficient resource utilization, making it well-suited for real-time applications. Our experimental results demonstrate the model's superior performance in detecting small and densely clustered objects, underlining the significance of dataset choice and architectural adaptation for this specific task. In particular, the method achieves 52.5% mAP on VisDrone, exceeding top prior works. This approach promises to significantly advance object detection in aerial imagery, contributing to more accurate and reliable results in a variety of real-world applications

arXiv.org e-Print Archive

Towards Large-Scale Small Object Detection: Survey and Benchmarks

Author: Cheng Gong
Han Junwei
Yan Kebing
Yao Xiwen
Yuan Xiang
Zeng Qinghua
Publication venue
Publication date: 24/12/2022
Field of study

With the rise of deep convolutional neural networks, object detection has achieved prominent advances in past years. However, such prosperity could not camouflage the unsatisfactory situation of Small Object Detection (SOD), one of the notoriously challenging tasks in computer vision, owing to the poor visual appearance and noisy representation caused by the intrinsic structure of small targets. In addition, large-scale dataset for benchmarking small object detection methods remains a bottleneck. In this paper, we first conduct a thorough review of small object detection. Then, to catalyze the development of SOD, we construct two large-scale Small Object Detection dAtasets (SODA), SODA-D and SODA-A, which focus on the Driving and Aerial scenarios respectively. SODA-D includes 24828 high-quality traffic images and 278433 instances of nine categories. For SODA-A, we harvest 2513 high resolution aerial images and annotate 872069 instances over nine classes. The proposed datasets, as we know, are the first-ever attempt to large-scale benchmarks with a vast collection of exhaustively annotated instances tailored for multi-category SOD. Finally, we evaluate the performance of mainstream methods on SODA. We expect the released benchmarks could facilitate the development of SOD and spawn more breakthroughs in this field. Datasets and codes are available at: \url{https://shaunyuan22.github.io/SODA}

arXiv.org e-Print Archive