Interactive Image Processing demonstrations for the web
The main goal of this project is to improve the way image processing developers test their algorithms and show them to other people to demonstrate their performance. This diploma thesis aims to provide a framework for developing web applications for ImagePlus, the C++ software development platform of the Image Processing Group of the Technical University of Catalonia (UPC).
These web applications demonstrate the functionality of the image processing algorithms to any visitor to the group website. Developers also benefit, because they can easily create graphical user interfaces (GUIs) for the processing algorithms.
UPC at MediaEval 2013 Hyperlinking Task
This working notes paper presents the contribution of the
UPC team to the Hyperlinking sub-task of the Search and
Hyperlinking Task in MediaEval 2013. Our contribution explores
the potential of a solution based only on visual cues.
In particular, every automatically generated shot is represented
by a keyframe. The linking between video segments is
based on the visual similarity of the keyframes they contain.
Visual similarity is assessed with the intersection of bag-of-features
histograms generated with the SURF descriptor. Postprint (published version)
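The similarity measure named in this abstract can be illustrated with a minimal sketch. It assumes each keyframe has already been reduced to a bag-of-features histogram (SURF extraction and vocabulary quantisation are not shown). The function names `histogram_intersection` and `rank_by_similarity` are hypothetical and do not come from the paper.

```python
def histogram_intersection(h1, h2):
    """Intersection of two histograms after L1 normalisation.

    Returns a value in [0, 1]; 1.0 means the normalised histograms are identical.
    Normalising first is an assumption of this sketch, not a detail from the paper.
    """
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    s1, s2 = sum(h1), sum(h2)
    if s1 == 0 or s2 == 0:
        return 0.0  # an empty histogram shares nothing with any other
    return sum(min(a / s1, b / s2) for a, b in zip(h1, h2))


def rank_by_similarity(query, candidates):
    """Sort candidate keyframe names by decreasing intersection with the query."""
    scores = {name: histogram_intersection(query, hist)
              for name, hist in candidates.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

Under these assumptions, linking a video segment amounts to ranking the keyframes of all other segments against its own keyframe and keeping the top matches.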
Retrieval and Registration of Long-Range Overlapping Frames for Scalable Mosaicking of In Vivo Fetoscopy
Purpose: The standard clinical treatment of Twin-to-Twin Transfusion Syndrome
consists in the photo-coagulation of undesired anastomoses located on the
placenta, which are responsible for blood transfer between the two twins. While
being the standard of care procedure, fetoscopy suffers from a limited
field-of-view of the placenta resulting in missed anastomoses. To facilitate
the task of the clinician, building a global map of the placenta providing a
larger overview of the vascular network is highly desired. Methods: To overcome
the challenging visual conditions inherent to in vivo sequences (low contrast,
obstructions or presence of artifacts, among others), we propose the following
contributions: (i) robust pairwise registration is achieved by aligning the
orientation of the image gradients, and (ii) difficulties regarding long-range
consistency (e.g. due to the presence of outliers) are tackled via a bag-of-words
strategy, which identifies overlapping frames of the sequence to be registered
regardless of their respective location in time. Results: In addition to visual
difficulties, in vivo sequences are characterised by the intrinsic absence of
gold standard. We present mosaics that qualitatively motivate our methodological
choices and show their promise. We also demonstrate
semi-quantitatively, via visual inspection of registration results, the
efficacy of our registration approach in comparison to two standard baselines.
Conclusion: This paper proposes the first approach for the construction of
mosaics of placenta in in vivo fetoscopy sequences. The approach provides
robustness to visual challenges during registration and long-range temporal
consistency, offering first positive results on in vivo data for which standard
mosaicking techniques are not applicable. Comment: Accepted for publication in International Journal of Computer
Assisted Radiology and Surgery (IJCARS)
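The idea of registering frames by aligning gradient orientations can be sketched as a similarity score between two already-overlaid images. This is only an illustration under stated assumptions: the paper's actual cost function and optimiser are not given in the abstract, and the names `gradients` and `orientation_alignment` are hypothetical. The score is the mean squared cosine of the angle between gradients, which ignores gradient magnitude and contrast inversion.

```python
def gradients(img):
    """Central-difference gradients (gx, gy) at interior pixels of a 2D list."""
    h, w = len(img), len(img[0])
    out = {}
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y][x + 1] - img[y][x - 1]) / 2.0
            gy = (img[y + 1][x] - img[y - 1][x]) / 2.0
            out[(y, x)] = (gx, gy)
    return out


def orientation_alignment(img_a, img_b, min_sq_mag=1e-6):
    """Mean cos^2 of the angle between gradients of two same-sized images.

    Returns 1.0 when gradient orientations agree everywhere and 0.0 when they
    are orthogonal. Pixels with near-zero gradient in either image are skipped.
    The threshold min_sq_mag is an assumed parameter of this sketch.
    """
    ga, gb = gradients(img_a), gradients(img_b)
    total, count = 0.0, 0
    for key, (ax, ay) in ga.items():
        bx, by = gb[key]
        na = ax * ax + ay * ay  # squared gradient magnitude in image A
        nb = bx * bx + by * by  # squared gradient magnitude in image B
        if na < min_sq_mag or nb < min_sq_mag:
            continue
        dot = ax * bx + ay * by
        total += (dot * dot) / (na * nb)  # cos^2 of the angle between gradients
        count += 1
    return total / count if count else 0.0
```

A registration routine would evaluate such a score for candidate transformations of one image and keep the transformation with the highest value. Because the score depends on orientation only, it is less sensitive to the low contrast and illumination changes mentioned in the abstract than a raw intensity difference would be.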
Deep Sequential Mosaicking of Fetoscopic Videos
Twin-to-twin transfusion syndrome treatment requires fetoscopic laser
photocoagulation of placental vascular anastomoses to regulate blood flow to
both fetuses. Limited field-of-view (FoV) and low visual quality during
fetoscopy make it challenging to identify all vascular connections. Mosaicking
can align multiple overlapping images to generate an image with increased FoV.
However, existing techniques apply poorly to fetoscopy due to the low visual
quality and texture paucity, and hence fail in longer sequences due to the drift
accumulated over time. Deep learning techniques can help overcome
these challenges. Therefore, we present a new generalized Deep Sequential
Mosaicking (DSM) framework for fetoscopic videos captured from different
settings such as simulation, phantom, and real environments. DSM extends an
existing deep image-based homography model to sequential data by proposing
controlled data augmentation and outlier rejection methods. Unlike existing
methods, DSM can handle visual variations due to specular highlights and
reflection across adjacent frames, hence reducing the accumulated drift. We
perform experimental validation and comparison using 5 diverse fetoscopic
videos to demonstrate the robustness of our framework. Comment: Accepted at MICCAI 201
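The drift mentioned in this abstract arises because sequential mosaicking chains pairwise homographies, so a small error in each pair compounds along the sequence. The sketch below illustrates that effect only. It is not the DSM framework, and the helper names `matmul3`, `translation` and `chain_to_reference` are hypothetical.

```python
def matmul3(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def translation(tx, ty):
    """Homography that is a pure translation (simplest possible case)."""
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]


def chain_to_reference(pairwise):
    """Compose pairwise homographies into frame-to-reference homographies.

    pairwise[k] maps frame k+1 into frame k. The result has one entry per
    frame, each mapping that frame into frame 0 (the mosaic reference).
    """
    to_ref = [translation(0.0, 0.0)]  # frame 0 maps to itself
    for h in pairwise:
        to_ref.append(matmul3(to_ref[-1], h))
    return to_ref


# Assumed toy data: the camera moves 10 px per frame, but every pairwise
# estimate is off by 0.5 px. The per-pair error is small, yet it accumulates.
true_chain = chain_to_reference([translation(10.0, 0.0)] * 20)
estimated_chain = chain_to_reference([translation(10.5, 0.0)] * 20)
drift_px = estimated_chain[-1][0][2] - true_chain[-1][0][2]
```

In this toy case the accumulated error grows linearly with the number of frames, which is why rejecting outlier pairs and reducing per-pair error matters for longer sequences.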
Reconeixement d'objectes sense context amb eSIFT i BoW+S (Object recognition without context using eSIFT and BoW+S)
Currently, there are highly competitive results in the field of object recognition based on the aggregation of point-based features [4, 26, 5, 6]. The aggregation process, typically an average or max-pooling of the features, generates a single vector that represents the image or region that contains the object [7]. The aggregated point-based features typically describe the texture around the points with descriptors such as SIFT. These descriptors present limitations for wiry and textureless objects. A possible solution is the addition of shape-based information [9, 6, 2, 12]. Shape descriptors have previously been used to encode shape information and thus recognise those types of objects. However, an alignment step is generally required in order to match every point from one shape to another. The computational cost of the similarity assessment is high. We propose to enrich location and texture-based features with shape-based ones. Two main architectures are explored: on the one hand, enriching the SIFT descriptors with shape information before they are aggregated; on the other hand, creating the standard Bag of Words [7] histogram and concatenating a shape histogram, so that both are classified as a single vector. We evaluate the proposed techniques and the novel features on the Caltech-101 dataset. Results show that shape features increase the final performance. Our extension of the Bag of Words with a shape-based histogram (BoW+S) results in better performance. However, for a high number of shape features, the BoW+S and enriched SIFT architectures tend to converge.
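The two architectures described in this abstract differ in where the shape information is attached. A minimal sketch, assuming the SIFT descriptors, the Bag of Words histogram and the shape histogram have already been computed elsewhere, is given below. The function names and the `shape_weight` parameter are hypothetical and are not taken from the thesis.

```python
def l1_normalise(hist):
    """Scale a histogram so its bins sum to 1 (all zeros if it is empty)."""
    total = float(sum(hist))
    if total == 0:
        return [0.0] * len(hist)
    return [v / total for v in hist]


def enrich_descriptor(sift_descriptor, shape_descriptor):
    """First architecture: append shape values to one local descriptor
    before any aggregation takes place."""
    return list(sift_descriptor) + list(shape_descriptor)


def bow_plus_shape(bow_hist, shape_hist, shape_weight=1.0):
    """Second architecture (BoW+S): concatenate the image-level Bag of Words
    histogram with a shape histogram into one vector for the classifier.

    Normalising each part separately and weighting the shape part are
    assumptions of this sketch.
    """
    shape_part = [shape_weight * v for v in l1_normalise(shape_hist)]
    return l1_normalise(bow_hist) + shape_part
```

In the first case the classifier sees shape only through the aggregated local descriptors. In the second case the shape histogram occupies its own block of dimensions in the final vector.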
Efficient Drift-Free Mosaicking of Fetoscopic Videos Using an Electromagnetic Tracker
Twin-to-Twin Transfusion Syndrome is a fetal illness in which twins share vascular connections in the placenta. This results in an imbalance in the blood flow that might be fatal for both twins. Surgeons need to be very well trained to localise and then photo-coagulate the anastomoses in the placenta, given the small field-of-view of the fetoscope and the lack of texture in the imagery of the fetoscopic site. We investigate mosaicking as a means of expanding the field-of-view into a larger image of the explored area as the camera moves. However, the complexity of fetoscopic data makes current state-of-the-art algorithms lack robustness. Additionally, vision-only mosaicking algorithms suffer from drift. The main focus of this thesis is to study the incorporation of an electromagnetic tracker into the mosaicking pipeline. We demonstrate that the guidance of the electromagnetic tracker can mitigate the drift in two steps. First, we explore a dynamic state-space model that reduces the drift and is suitable for online operation. Second, we investigate a fully probabilistic model that completely eliminates the drift. Then, we use the proposed algorithms to investigate pruning strategies, aiming to achieve mosaics built at clinically acceptable update rates. The results suggest that the joint use of the electromagnetic tracker with the imagery can improve the accuracy, robustness, and efficiency with respect to algorithms that use imagery exclusively. We believe the inclusion of the electromagnetic tracker is a step towards clinical translation of mosaicking for fetoscopy.
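How an absolute tracker reading can limit vision drift in a state-space model can be illustrated with a one-dimensional Kalman filter. This is a toy sketch under strong assumptions and is not the model of the thesis: the state is a single camera coordinate, the visual registration supplies a biased frame-to-frame step, and the tracker supplies an absolute position. The name `fuse` and the noise variances `q` and `r` are hypothetical.

```python
def fuse(visual_steps, tracker_positions, q=0.25, r=1.0):
    """Scalar Kalman filter: predict with vision, correct with the tracker.

    visual_steps[k]      frame-to-frame displacement estimated from images
    tracker_positions[k] absolute position reported by the tracker
    q, r                 assumed variances of the visual step and the tracker
    """
    x, p = tracker_positions[0], r  # initialise from the first tracker reading
    estimates = [x]
    for step, z in zip(visual_steps, tracker_positions[1:]):
        # Predict: apply the visual displacement; uncertainty grows by q.
        x, p = x + step, p + q
        # Update: pull the estimate towards the absolute tracker measurement.
        gain = p / (p + r)
        x, p = x + gain * (z - x), (1.0 - gain) * p
        estimates.append(x)
    return estimates


# Assumed toy data: true motion is 1.0 per frame, vision overestimates it by
# 10 percent, and the tracker is taken as noise-free to keep the run repeatable.
steps = [1.1] * 50
tracker = [float(i) for i in range(51)]
vision_only_end = sum(steps)        # accumulates the bias of every step
fused_end = fuse(steps, tracker)[-1]
```

In this toy setting the vision-only estimate drifts further from the true position with every frame, while the fused estimate stays within a bounded error because each update is anchored to an absolute measurement.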