
    2D Reconstruction of Small Intestine's Interior Wall

    Examining and interpreting a large number of wireless endoscopic images from the gastrointestinal tract is a tiresome task for physicians. A practical solution is to automatically construct a two dimensional representation of the gastrointestinal tract for easy inspection. However, little has been done on wireless endoscopic image stitching, let alone systematic investigation. The proposed new wireless endoscopic image stitching method consists of two main steps to improve the accuracy and efficiency of image registration. First, the keypoints are extracted by the Principal Component Analysis and Scale Invariant Feature Transform (PCA-SIFT) algorithm and refined with Maximum Likelihood Estimation SAmple Consensus (MLESAC) outlier removal to find the most reliable keypoints. Second, the optimal transformation parameters obtained from the first step are fed to the Normalised Mutual Information (NMI) algorithm as an initial solution. With a modified Marquardt-Levenberg search strategy in a multiscale framework, the NMI can find the optimal transformation parameters in the shortest time. The proposed methodology has been tested on two different datasets - one with real wireless endoscopic images and another with images obtained from Micro-Ball (a new wireless cubic endoscopy system with six image sensors). The results have demonstrated the accuracy and robustness of the proposed methodology both visually and quantitatively. Comment: Journal draft
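    As a rough illustration of the similarity measure named in this abstract, the sketch below computes normalised mutual information between two equally sized images given as flat lists of integer intensities. It assumes the common definition NMI = (H(A) + H(B)) / H(A, B); the abstract does not say which variant the authors use, and the function names, bin count, and intensity range here are hypothetical choices made for the example.

```python
import math
from collections import Counter


def entropy(counts, total):
    """Shannon entropy in bits of a histogram given as raw counts."""
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def normalised_mutual_information(a, b, bins=8, max_value=255):
    """NMI = (H(A) + H(B)) / H(A, B) for two flat integer intensity lists.

    Assumption: this is one common definition of NMI, not necessarily
    the variant used in the paper.
    """
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")

    # Quantise intensities into a small number of histogram bins.
    def quantise(v):
        return min(v * bins // (max_value + 1), bins - 1)

    qa = [quantise(v) for v in a]
    qb = [quantise(v) for v in b]
    n = len(qa)

    h_a = entropy(Counter(qa).values(), n)
    h_b = entropy(Counter(qb).values(), n)
    h_ab = entropy(Counter(zip(qa, qb)).values(), n)

    if h_ab == 0.0:
        # Both images are constant, so the ratio is undefined.
        # Returning 1.0 is a convention chosen for this sketch only.
        return 1.0
    return (h_a + h_b) / h_ab


# Identical images share all their information; independent ones share none.
same = [0, 64, 128, 192] * 4
nmi_identical = normalised_mutual_information(same, same)
nmi_independent = normalised_mutual_information([0, 0, 255, 255], [0, 255, 0, 255])
```

    Under this definition the score ranges from 1 (no shared information) to 2 (one image fully determines the other), so a registration search would try to maximise it over the transformation parameters.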

    ClusterNet: Detecting Small Objects in Large Scenes by Exploiting Spatio-Temporal Information

    Object detection in wide area motion imagery (WAMI) has drawn the attention of the computer vision research community for a number of years. WAMI poses a number of unique challenges including extremely small object sizes, both sparse and densely-packed objects, and extremely large search spaces (large video frames). Nearly all state-of-the-art methods in WAMI object detection report that appearance-based classifiers fail on this challenging data and instead rely almost entirely on motion information in the form of background subtraction or frame-differencing. In this work, we experimentally verify the failure of appearance-based classifiers in WAMI, such as Faster R-CNN and a heatmap-based fully convolutional neural network (CNN), and propose a novel two-stage spatio-temporal CNN which effectively and efficiently combines both appearance and motion information to significantly surpass the state-of-the-art in WAMI object detection. To reduce the large search space, the first stage (ClusterNet) takes in a set of extremely large video frames, combines the motion and appearance information within the convolutional architecture, and proposes regions of objects of interest (ROOBI). These ROOBI can contain from one to clusters of several hundred objects due to the large video frame size and varying object density in WAMI. The second stage (FoveaNet) then estimates the centroid location of all objects in that given ROOBI simultaneously via heatmap estimation. The proposed method exceeds state-of-the-art results on the WPAFB 2009 dataset by 5-16% for moving objects and nearly 50% for stopped objects, as well as being the first proposed method in wide area motion imagery to detect completely stationary objects. Comment: Main paper is 8 pages. Supplemental section contains a walk-through of our method (using a qualitative example) and qualitative results for WPAFB 2009 dataset
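    To make the idea of reading object centroids off a heatmap concrete, here is a minimal sketch that picks thresholded local maxima from a small 2D grid. It is not the FoveaNet post-processing, which the abstract does not describe; the function name, the 8-neighbour rule, and the threshold are assumptions made for illustration.

```python
def heatmap_peaks(heatmap, threshold=0.5):
    """Return (row, col) cells that are >= threshold and strictly greater
    than all of their 8-connected neighbours.

    Assumption: a generic peak-picking rule, not the paper's procedure.
    """
    rows, cols = len(heatmap), len(heatmap[0])
    peaks = []
    for r in range(rows):
        for c in range(cols):
            value = heatmap[r][c]
            if value < threshold:
                continue
            # Collect the neighbours that exist inside the grid bounds.
            neighbours = [
                heatmap[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(value > n for n in neighbours):
                peaks.append((r, c))
    return peaks


# Hypothetical 4x4 heatmap with two well-separated responses.
example_heatmap = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.2, 0.0],
    [0.1, 0.2, 0.1, 0.6],
    [0.0, 0.0, 0.0, 0.1],
]
example_peaks = heatmap_peaks(example_heatmap)
```

    Because the comparison is strict, a flat plateau of equal values yields no peak; a production system would typically need a tie-breaking rule or non-maximum suppression.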

    3D Registration of Aerial and Ground Robots for Disaster Response: An Evaluation of Features, Descriptors, and Transformation Estimation

    Global registration of heterogeneous ground and aerial mapping data is a challenging task. This is especially difficult in disaster response scenarios when we have no prior information on the environment and cannot assume the regular order of man-made environments or meaningful semantic cues. In this work we extensively evaluate different approaches to globally register UGV-generated 3D point-cloud data from LiDAR sensors with UAV-generated point-cloud maps from vision sensors. The approaches are realizations of different selections for: a) local features: key-points or segments; b) descriptors: FPFH, SHOT, or ESF; and c) transformation estimations: RANSAC or FGR. Additionally, we compare the results against standard approaches like applying ICP after a good prior transformation has been given. The evaluation criteria include the distance which a UGV needs to travel to successfully localize, the registration error, and the computational cost. In this context, we report our findings on effectively performing the task on two new Search and Rescue datasets. Our results have the potential to help the community take informed decisions when registering point-cloud maps from ground robots to those from aerial robots. Comment: Awarded Best Paper at the 15th IEEE International Symposium on Safety, Security, and Rescue Robotics 2017 (SSRR 2017)

    Online Video Deblurring via Dynamic Temporal Blending Network

    State-of-the-art video deblurring methods are capable of removing non-uniform blur caused by unwanted camera shake and/or object motion in dynamic scenes. However, most existing methods are based on batch processing and thus need access to all recorded frames, rendering them computationally demanding and time-consuming and thus limiting their practical use. In contrast, we propose an online (sequential) video deblurring method based on a spatio-temporal recurrent network that allows for real-time performance. In particular, we introduce a novel architecture which extends the receptive field while keeping the overall size of the network small to enable fast execution. In doing so, our network is able to remove even large blur caused by strong camera shake and/or fast-moving objects. Furthermore, we propose a novel network layer that enforces temporal consistency between consecutive frames by dynamic temporal blending which compares and adaptively (at test time) shares features obtained at different time steps. We show the superiority of the proposed method in an extensive experimental evaluation. Comment: 10 pages

    Aerial Vehicle Tracking by Adaptive Fusion of Hyperspectral Likelihood Maps

    Hyperspectral cameras can provide unique spectral signatures for consistently distinguishing materials that can be used to solve surveillance tasks. In this paper, we propose a novel real-time hyperspectral likelihood maps-aided tracking method (HLT) inspired by an adaptive hyperspectral sensor. A moving object tracking system generally consists of registration, object detection, and tracking modules. We focus on the target detection part and remove the necessity to build any offline classifiers and tune a large number of hyperparameters, instead learning a generative target model in an online manner for hyperspectral channels ranging from visible to infrared wavelengths. The key idea is that our adaptive fusion method can combine likelihood maps from multiple bands of hyperspectral imagery into a single, more distinctive representation, increasing the margin between the mean values of foreground and background pixels in the fused map. Experimental results show that the HLT not only outperforms all established fusion methods but is on par with the current state-of-the-art hyperspectral target tracking frameworks. Comment: Accepted at the International Conference on Computer Vision and Pattern Recognition Workshops, 201
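    The fusion idea in this abstract can be sketched, under strong simplifying assumptions, as a weighted sum of per-band likelihood maps in which each band's weight is its own foreground/background margin. This is not the HLT algorithm, whose details the abstract does not give; the weighting rule, the function names, and the use of a known foreground mask are all hypothetical.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def fuse_likelihood_maps(maps, fg_mask):
    """Fuse per-band likelihood maps into one map.

    maps:    list of 2D lists with identical shape, values in [0, 1].
    fg_mask: 2D list of bools marking assumed foreground pixels.

    Assumption: each band is weighted by max(mean_fg - mean_bg, 0),
    an illustrative rule rather than the paper's fusion method.
    """
    rows, cols = len(fg_mask), len(fg_mask[0])
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    fg_cells = [rc for rc in cells if fg_mask[rc[0]][rc[1]]]
    bg_cells = [rc for rc in cells if not fg_mask[rc[0]][rc[1]]]
    if not fg_cells or not bg_cells:
        raise ValueError("mask must contain both foreground and background")

    weights = []
    for band in maps:
        margin = (mean([band[r][c] for r, c in fg_cells])
                  - mean([band[r][c] for r, c in bg_cells]))
        weights.append(max(margin, 0.0))

    total = sum(weights)
    if total == 0.0:
        # No band separates the target; fall back to a plain average.
        weights = [1.0 / len(maps)] * len(maps)
    else:
        weights = [w / total for w in weights]

    fused = [
        [sum(w * band[r][c] for w, band in zip(weights, maps))
         for c in range(cols)]
        for r in range(rows)
    ]
    return fused, weights


# Hypothetical two-band example: band_a separates the target, band_b does not.
band_a = [[0.9, 0.1], [0.1, 0.1]]
band_b = [[0.5, 0.5], [0.5, 0.5]]
mask = [[True, False], [False, False]]
fused_map, band_weights = fuse_likelihood_maps([band_a, band_b], mask)
```

    In this toy case the uninformative band receives zero weight, so the fused map keeps the full margin of the discriminative band instead of diluting it as a plain average would.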