70,525 research outputs found
Laparoscopic image analysis for automatic tracking of surgical tools
Laparoscopy is a surgical technique now embedded in clinical routine. Recent research has focused on analysing the video captured by the endoscope to extract cues useful to surgeons, such as depth information. In particular, 3D pose estimation of the surgical tools offers three important benefits: (1) extracting objective parameters for the surgical training stage, (2) developing image-guided surgery based on knowledge of the surgical tools' location, and (3) designing new robotic systems for automatic laparoscope positioning according to visual feedback. The tool's shape and orientation in the image are the key to obtaining its 3D position. This work presents an image analysis method for automatic detection of laparoscopic tools throughout the recorded video without extra tool markers, using an edge detection strategy. The analysis also includes a preliminary stage of barrel distortion correction for videoendoscopic image
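The barrel distortion correction stage mentioned above can be illustrated with a minimal sketch. It assumes a one-coefficient radial model about the image centre, which the abstract does not specify; the function names `distort_point` and `undistort_image` and the coefficient `k` are hypothetical, and nearest-neighbour sampling is used only to keep the example short.

```python
def distort_point(xu, yu, k, cx, cy):
    """Map an undistorted point to its position in the distorted image.

    Assumed model: r_d = r_u * (1 + k * r_u**2), measured from (cx, cy).
    Under this convention barrel distortion typically corresponds to k < 0.
    """
    dx, dy = xu - cx, yu - cy
    scale = 1.0 + k * (dx * dx + dy * dy)
    return cx + dx * scale, cy + dy * scale


def undistort_image(img, k, fill=0):
    """Return a corrected copy of img (a list of rows of pixel values).

    Each output pixel looks up where it came from in the distorted input
    (inverse mapping) and copies the nearest source pixel.
    """
    h, w = len(img), len(img[0])
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            xs, ys = distort_point(x, y, k, cx, cy)
            xi, yi = int(round(xs)), int(round(ys))
            if 0 <= xi < w and 0 <= yi < h:
                out[y][x] = img[yi][xi]
    return out
```

With `k = 0.0` the image is returned unchanged; in practice the coefficient would be estimated by calibrating the endoscope, and interpolation would replace the nearest-neighbour lookup.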
Identifying person re-occurrences for personal photo management applications
Automatic identification of "who" is present in individual digital images within a photo management system using only content-based analysis is an extremely difficult problem. The authors present a system which enables identification of person re-occurrences within a personal photo management application by combining image content-based analysis tools with context data from image capture. This combined system employs automatic face detection and body-patch matching techniques, which collectively facilitate identifying person re-occurrences within images grouped into events based on context data. The authors introduce a face detection approach combining a histogram-based skin detection model and a modified BDF face detection method to detect multiple frontal faces in colour images. Corresponding body patches are then automatically segmented relative to the size, location and orientation of the detected faces in the image. The authors investigate the suitability of using different colour descriptors, including MPEG-7 colour descriptors, colour coherence vectors (CCV) and colour correlograms, for effective body-patch matching. The system has been successfully integrated into the MediAssist platform, a prototype Web-based system for personal photo management, and runs on over 13,000 personal photos
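As a rough illustration of body-patch matching by colour, the sketch below compares two patches with a plain quantised RGB histogram and histogram intersection. This is a simpler stand-in, not one of the descriptors named in the abstract (MPEG-7, CCV, correlograms); the function names and the `bins_per_channel` parameter are assumptions made for the example.

```python
from collections import Counter


def colour_histogram(pixels, bins_per_channel=4):
    """Normalised histogram of (r, g, b) pixels with 8-bit channels."""
    if not pixels:
        raise ValueError("patch contains no pixels")
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = sum(counts.values())
    return {bin_id: n / total for bin_id, n in counts.items()}


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means the two distributions are identical."""
    return sum(min(v, h2.get(bin_id, 0.0)) for bin_id, v in h1.items())
```

Two patches cut from below the detected faces could then be treated as the same person within an event when the intersection exceeds a threshold chosen on validation data.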
Towards Multi-class Object Detection in Unconstrained Remote Sensing Imagery
Automatic multi-class object detection in remote sensing images in
unconstrained scenarios is of high interest for several applications including
traffic monitoring and disaster management. The huge variation in object scale,
orientation, category, and complex backgrounds, as well as the different camera
sensors pose great challenges for current algorithms. In this work, we propose
a new method consisting of a novel joint image cascade and feature pyramid
network with multi-size convolution kernels to extract multi-scale strong and
weak semantic features. These features are fed into rotation-based region
proposal and region of interest networks to produce object detections. Finally,
rotational non-maximum suppression is applied to remove redundant detections.
During training, we minimize joint horizontal and oriented bounding box loss
functions, as well as a novel loss that enforces oriented boxes to be
rectangular. Our method achieves 68.16% mAP on horizontal and 72.45% mAP on
oriented bounding box detection tasks on the challenging DOTA dataset,
outperforming all published methods by a large margin (+6% and +12% absolute
improvement, respectively). Furthermore, it generalizes to two other datasets,
NWPU VHR-10 and UCAS-AOD, and achieves competitive results with the baselines
even when trained on DOTA. Our method can be deployed in multi-class object
detection applications, regardless of the image and object scales and
orientations, making it a great choice for unconstrained aerial and satellite
imagery. Comment: ACCV 201
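The rotational non-maximum suppression step can be sketched as follows. This is a generic greedy version, not the authors' implementation: it assumes boxes given as `(cx, cy, w, h, angle)` with the angle in radians, computes the overlap of two rotated rectangles by convex polygon clipping (Sutherland-Hodgman), and uses a hypothetical `iou_threshold` of 0.5.

```python
import math


def box_corners(cx, cy, w, h, angle):
    """Corners of a rotated rectangle, in counter-clockwise order."""
    c, s = math.cos(angle), math.sin(angle)
    offsets = ((-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2))
    return [(cx + dx * c - dy * s, cy + dx * s + dy * c) for dx, dy in offsets]


def polygon_area(poly):
    """Shoelace formula."""
    total = 0.0
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def clip_polygon(subject, clipper):
    """Intersect `subject` with the convex, counter-clockwise `clipper`."""
    output = subject
    for i in range(len(clipper)):
        ax, ay = clipper[i]
        bx, by = clipper[(i + 1) % len(clipper)]

        def side(p):
            # Positive when p lies to the left of edge a->b (inside).
            return (bx - ax) * (p[1] - ay) - (by - ay) * (p[0] - ax)

        current, output = output, []
        for j in range(len(current)):
            p, q = current[j], current[(j + 1) % len(current)]
            sp, sq = side(p), side(q)
            if sp >= 0:
                output.append(p)
            if (sp > 0 and sq < 0) or (sp < 0 and sq > 0):
                t = sp / (sp - sq)
                output.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
        if not output:
            break
    return output


def rotated_iou(a, b):
    """Intersection over union of two (cx, cy, w, h, angle) boxes."""
    overlap = clip_polygon(box_corners(*a), box_corners(*b))
    inter = polygon_area(overlap) if len(overlap) >= 3 else 0.0
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def rotated_nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the best box, drop others overlapping it too much."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(rotated_iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep
```

Using the rotated overlap rather than the axis-aligned one matters for elongated, densely packed objects: two boxes crossing at right angles share little area even when their horizontal bounding boxes overlap heavily.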
GRAPHOS - An open-source software for photogrammetric applications
19 p. This paper reports the latest developments for the photogrammetric open-source tool called GRAPHOS (inteGRAted PHOtogrammetric Suite). GRAPHOS includes some recent innovations in the image-based 3D reconstruction pipeline, from automatic feature detection/description and network orientation to dense image matching and quality control. GRAPHOS also has a strong educational component beyond its automated processing functions, reinforced with tutorials and didactic explanations about algorithms and performance. The paper highlights recent developments carried out at different levels: graphical user interface (GUI), didactic simulators for image processing, photogrammetric processing with weight parameters, dataset creation and system evaluation
Adaptive Methods for Robust Document Image Understanding
A vast amount of digital document material is continuously being produced as part of major digitization efforts around the world. In this context, generic and efficient automatic solutions for document image understanding are urgently needed. We propose a generic framework for document image understanding systems, usable for practically any document type available in digital form. Following the introduced workflow, we turn our attention to each of the following processing stages in turn: quality assurance, image enhancement, color reduction and binarization, skew and orientation detection, page segmentation and logical layout analysis. We review the state of the art in each area, identify current deficiencies, point out promising directions and give specific guidelines for future investigation. We address some of the identified issues by means of novel algorithmic solutions, putting special focus on generality, computational efficiency and the exploitation of all available sources of information. More specifically, we introduce the following original methods: a fully automatic detection of color reference targets in digitized material, accurate foreground extraction from color historical documents, font enhancement for hot metal typeset prints, a theoretically optimal solution for the document binarization problem from both the computational complexity and the threshold selection point of view, a layout-independent skew and orientation detection, a robust and versatile page segmentation method, a semi-automatic front page detection algorithm and a complete framework for article segmentation in periodical publications. The proposed methods are experimentally evaluated on large datasets consisting of real-life heterogeneous document scans. The obtained results show that a document understanding system combining these modules is able to robustly process a wide variety of documents with good overall accuracy
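To make the binarization stage concrete, here is a minimal sketch of global thresholding with Otsu's method. It is a well-known baseline used purely for illustration and is not the binarization method proposed in the work; the function names are assumptions, and pixels are taken to be 8-bit grey values in a flat list.

```python
def otsu_threshold(pixels):
    """Return the grey level t that maximises between-class variance.

    Pixels with value <= t form the dark class, the rest the bright class.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * n for level, n in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t


def binarize(pixels, threshold):
    """Mark dark pixels (ink) as 1 and bright pixels (paper) as 0."""
    return [1 if p <= threshold else 0 for p in pixels]
```

A single global threshold like this tends to fail on unevenly lit or stained historical scans, which is one reason adaptive methods are studied for this stage.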
Robust Minutiae Extractor: Integrating Deep Networks and Fingerprint Domain Knowledge
We propose a fully automatic minutiae extractor, called MinutiaeNet, based on
deep neural networks with compact feature representation for fast comparison of
minutiae sets. Specifically, first a network, called CoarseNet, estimates the
minutiae score map and minutiae orientation based on convolutional neural
network and fingerprint domain knowledge (enhanced image, orientation field,
and segmentation map). Subsequently, another network, called FineNet, refines
the candidate minutiae locations based on score map. We demonstrate the
effectiveness of using the fingerprint domain knowledge together with the deep
networks. Experimental results on both latent (NIST SD27) and plain (FVC 2004)
public domain fingerprint datasets provide comprehensive empirical support for
the merits of our method. Further, our method finds minutiae sets that are
better in terms of precision and recall in comparison with state-of-the-art on
these two datasets. Given the lack of annotated fingerprint datasets with
minutiae ground truth, the proposed approach to robust minutiae detection will
be useful to train network-based fingerprint matching algorithms as well as for
evaluating fingerprint individuality at scale. MinutiaeNet is implemented in
TensorFlow: https://github.com/luannd/MinutiaeNet Comment: Accepted to International Conference on Biometrics (ICB 2018)
Pre-classification for automatic image orientation
In this paper, we propose a novel method for automatic orientation of digital images. The approach is based on exploiting the properties of local statistics of natural scenes. In this way, we address some of the difficulties encountered in previous work in this area. The main contribution of this paper is to introduce a pre-classification step into carefully defined categories in order to simplify subsequent orientation detection. The proposed algorithm was tested on 9068 images and compared with the existing state of the art in the area. Results show a significant improvement over previous work