Cross Pixel Optical Flow Similarity for Self-Supervised Learning
We propose a novel method for learning convolutional neural image
representations without manual supervision. We use motion cues in the form of
optical flow to supervise representations of static images. The obvious
approach of training a network to predict flow from a single image can be
needlessly difficult due to intrinsic ambiguities in this prediction task. We
instead propose a much simpler learning goal: embed pixels such that the
similarity between their embeddings matches that between their optical flow
vectors. At test time, the learned deep network can be used without access to
video or flow information and transferred to tasks such as image
classification, detection, and segmentation. Our method, which significantly
simplifies previous attempts at using motion for self-supervision, achieves
state-of-the-art results in self-supervision using motion cues, competitive
results for self-supervision in general, and is overall state of the art in
self-supervised pretraining for semantic image segmentation, as demonstrated on
standard benchmarks.
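The learning goal described above can be sketched numerically. The following is a minimal illustration, not the authors' implementation: it reduces the objective to a plain squared gap between two cosine-similarity matrices, one over pixel embeddings and one over the pixels' optical-flow vectors (the function names and the toy data are assumptions for illustration).

```python
# Minimal sketch (assumed, not the paper's code) of the cross-pixel
# flow-similarity objective: train pixel embeddings so that the similarity
# between two embeddings matches the similarity between the corresponding
# optical-flow vectors.
import numpy as np

def cosine_sim(a, b, eps=1e-8):
    """Cosine similarity between all row pairs of a and b."""
    a = a / (np.linalg.norm(a, axis=-1, keepdims=True) + eps)
    b = b / (np.linalg.norm(b, axis=-1, keepdims=True) + eps)
    return a @ b.T

def cross_pixel_loss(embeddings, flows):
    """Mean squared gap between embedding and flow similarity matrices.

    embeddings: (N, D) pixel embeddings produced by the network.
    flows:      (N, 2) optical-flow vectors for the same N pixels.
    """
    s_emb = cosine_sim(embeddings, embeddings)
    s_flow = cosine_sim(flows, flows)
    return float(np.mean((s_emb - s_flow) ** 2))

# Toy check: if the embeddings are the flow vectors themselves, the two
# similarity matrices coincide and the loss is exactly zero.
flows = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
print(cross_pixel_loss(flows, flows))  # 0.0
```

In the real setting the loss would be backpropagated through the embedding network; the flow vectors only supervise training and are not needed at test time, matching the abstract.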
Point Cloud Registration for LiDAR and Photogrammetric Data: a Critical Synthesis and Performance Analysis on Classic and Deep Learning Algorithms
Recent advances in computer vision and deep learning have shown promising
performance in estimating rigid/similarity transformation between unregistered
point clouds of complex objects and scenes. However, their performances are
mostly evaluated using a limited number of datasets from a single sensor (e.g.
Kinect or RealSense cameras), lacking a comprehensive overview of their
applicability in photogrammetric 3D mapping scenarios. In this work, we provide
a comprehensive review of the state-of-the-art (SOTA) point cloud registration
methods, where we analyze and evaluate these methods using a diverse set of
point cloud data from indoor to satellite sources. The quantitative analysis
allows for exploring the strengths, applicability, challenges, and future
trends of these methods. In contrast to existing analysis works that introduce
point cloud registration as a holistic process, our experimental analysis is
based on its inherent two-step process to better comprehend these approaches
including feature/keypoint-based initial coarse registration and dense fine
registration through cloud-to-cloud (C2C) optimization. More than ten methods,
including classic hand-crafted, deep-learning-based feature correspondence, and
robust C2C methods were tested. We observed that the success rate of most of
the algorithms is below 40% on the datasets we tested, and that there is still
a large margin for improvement over existing algorithms concerning 3D sparse
correspondence search and the ability to register point clouds with complex
geometry and occlusions. Based on the evaluated statistics on three datasets,
we identify the best-performing methods for each step, provide our
recommendations, and offer an outlook on future efforts.
Comment: 7 figures
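The two-step process analyzed above starts from putative feature correspondences and ends with dense C2C refinement. A minimal numpy sketch of the closed-form rigid alignment used once correspondences are available (the Kabsch/Procrustes solution, not any specific surveyed method; the toy point clouds are synthetic) looks like this:

```python
# Illustrative sketch, not one of the surveyed implementations: estimate the
# rigid transform (R, t) from known correspondences, as a coarse-registration
# step that a dense cloud-to-cloud (C2C) method such as ICP would then refine.
import numpy as np

def kabsch(src, dst):
    """Best-fit rotation R and translation t mapping src points onto dst."""
    src_c, dst_c = src.mean(0), dst.mean(0)
    H = (src - src_c).T @ (dst - dst_c)          # cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.diag([1.0, 1.0, d])                   # guard against reflections
    R = Vt.T @ D @ U.T
    t = dst_c - R @ src_c
    return R, t

# Toy example: recover a known rotation about z plus a translation.
rng = np.random.default_rng(0)
src = rng.normal(size=(20, 3))
theta = np.pi / 6
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([0.5, -1.0, 2.0])
dst = src @ R_true.T + t_true

R, t = kabsch(src, dst)
print(np.allclose(R, R_true), np.allclose(t, t_true))  # True True
```

In practice the correspondences come from hand-crafted or learned feature matching and contain outliers, which is why the surveyed pipelines wrap this step in robust estimators before the fine C2C optimization.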
The Revisiting Problem in Simultaneous Localization and Mapping: A Survey on Visual Loop Closure Detection
Where am I? This is one of the most critical questions that any intelligent
system should answer to decide whether it navigates to a previously visited
area. This problem has long been acknowledged for its challenging nature in
simultaneous localization and mapping (SLAM), wherein the robot needs to
correctly associate the incoming sensory data with the database, allowing
consistent map generation. The significant advances in computer vision achieved
over the last 20 years, the increased computational power, and the growing
demand for long-term exploration contributed to efficiently performing such a
complex task with inexpensive perception sensors. In this article, visual loop
closure detection, which formulates a solution based solely on appearance input
data, is surveyed. We start by briefly introducing place recognition and SLAM
concepts in robotics. Then, we describe a loop closure detection system's
structure, covering an extensive collection of topics, including the feature
extraction, the environment representation, the decision-making step, and the
evaluation process. We conclude by discussing open and new research challenges,
particularly concerning the robustness in dynamic environments, the
computational complexity, and scalability in long-term operations. The article
aims to serve as a tutorial and a position paper for newcomers to visual loop
closure detection.
Comment: 25 pages, 15 figures
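The structure surveyed above (feature extraction, environment representation, decision making) can be reduced to a toy appearance-only pipeline. The sketch below is hypothetical and not from the survey: each frame is summarized by a global descriptor (here a simple visual-word histogram), queried against a database of past places, and a similarity threshold drives the loop/no-loop decision; the class name and the 0.9 threshold are assumptions.

```python
# Toy sketch (hypothetical) of appearance-only visual loop closure detection:
# global descriptors are matched against previously stored places and a
# cosine-similarity threshold makes the revisiting decision.
import numpy as np

def normalize(h, eps=1e-8):
    return h / (np.linalg.norm(h) + eps)

class LoopCloser:
    def __init__(self, threshold=0.9):
        self.db = []              # descriptors of previously visited places
        self.threshold = threshold

    def query(self, hist):
        """Return (matched_index, score); matched_index is None if no loop."""
        d = normalize(np.asarray(hist, float))
        if not self.db:
            self.db.append(d)
            return None, 0.0
        scores = np.array([float(d @ past) for past in self.db])
        best = int(np.argmax(scores))
        self.db.append(d)
        if scores[best] >= self.threshold:
            return best, float(scores[best])
        return None, float(scores[best])

lc = LoopCloser(threshold=0.9)
lc.query([5, 0, 1, 0])               # place A; database empty -> no loop
lc.query([0, 4, 0, 2])               # place B; dissimilar -> no loop
idx, score = lc.query([5, 0, 1, 0])  # revisiting place A
print(idx, round(score, 3))          # 0 1.0
```

Real systems replace the raw histogram with bag-of-visual-words or learned global descriptors and add temporal-consistency and geometric-verification checks before accepting a loop, as the survey's decision-making section discusses.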
Brain Tumor Detection and Segmentation in Multisequence MRI
This work deals with brain tumor detection and segmentation in multisequence MR images with a particular focus on high- and low-grade gliomas.
Three methods are proposed for this purpose. The first method detects the presence of brain tumor structures in axial and coronal slices. It is based on multi-resolution symmetry analysis and was tested on T1, T2, T1C, and FLAIR images. The second method extracts the whole brain tumor region, including the tumor core and edema, in FLAIR and T2 images, and can do so from both 2D and 3D data. It also uses the symmetry analysis approach, followed by automatic determination of the intensity threshold from the most asymmetric parts. The third method is based on local structure prediction and is able to segment the whole tumor region as well as the tumor core and the active tumor. It takes advantage of the fact that most medical images feature a high similarity in intensities of nearby pixels and a strong correlation of intensity profiles across different image modalities. One way of dealing with -- and even exploiting -- this correlation is the use of local image patches. In the same way, there is a high correlation between nearby labels in image annotation, a feature that has been used in the "local structure prediction" of local label patches. A convolutional neural network is chosen as the learning algorithm, as it is known to be suited to dealing with correlation between features. All three methods were evaluated on a public data set of 254 multisequence MR volumes and reached results comparable to state-of-the-art methods in much shorter computing time (on the order of seconds on a CPU), providing means, for example, to do online updates when aiming at interactive segmentation.
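The symmetry-analysis idea behind the first two methods can be illustrated with a toy numpy sketch. This is an assumption-laden simplification, not the thesis implementation: it compares intensity histograms of the left and right halves of an axial slice, and a large histogram distance flags an asymmetric, potentially tumorous slice.

```python
# Simplified sketch (assumed, not the thesis code) of symmetry analysis for
# tumor presence detection: mirror one half of the slice and measure the
# total-variation distance between the two halves' intensity histograms.
import numpy as np

def asymmetry_score(slice2d, bins=32):
    """0 for a perfectly mirror-symmetric slice, larger when asymmetric."""
    h, w = slice2d.shape
    left, right = slice2d[:, : w // 2], slice2d[:, w - w // 2 :]
    value_range = (slice2d.min(), slice2d.max() + 1e-8)
    hl, _ = np.histogram(left, bins=bins, range=value_range)
    hr, _ = np.histogram(np.fliplr(right), bins=bins, range=value_range)
    hl = hl / hl.sum()
    hr = hr / hr.sum()
    return 0.5 * float(np.abs(hl - hr).sum())   # total variation distance

rng = np.random.default_rng(1)
half = rng.random((64, 32))
healthy = np.hstack([half, np.fliplr(half)])    # mirror-symmetric slice
lesioned = healthy.copy()
lesioned[20:40, 40:60] += 3.0                   # bright "lesion" on one side
print(asymmetry_score(healthy) < asymmetry_score(lesioned))  # True
```

The actual method works at multiple image resolutions and localizes the most asymmetric parts to seed the threshold estimation, but the core signal is this left/right intensity asymmetry.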
A Computational Framework for Learning from Complex Data: Formulations, Algorithms, and Applications
Many real-world processes are dynamically changing over time. As a consequence, the observed complex data generated by these processes also evolve smoothly. For example, in computational biology, the expression data matrices are evolving, since gene expression controls are deployed sequentially during development in many biological processes. Investigations into the spatial and temporal gene expression dynamics are essential for understanding the regulatory biology governing development. In this dissertation, I mainly focus on two types of complex data: genome-wide spatial gene expression patterns in the model organism fruit fly and Allen Brain Atlas mouse brain data. I provide a framework to explore spatiotemporal regulation of gene expression during development. I develop an evolutionary co-clustering formulation to identify co-expressed domains and the associated genes simultaneously over different temporal stages using a mesh-generation pipeline. I also propose to employ deep convolutional neural networks as a multi-layer feature extractor to generate generic representations for gene expression pattern images from in situ hybridization (ISH). Furthermore, I employ the multi-task learning method to fine-tune the pre-trained models with labeled ISH images. My proposed computational methods are evaluated using synthetic data sets and real biological data sets, including the gene expression data from the fruit fly BDGP data sets and the Allen Developing Mouse Brain Atlas, in comparison with existing baseline methods. Experimental results indicate that the proposed representations, formulations, and methods are efficient and effective in annotating and analyzing the large-scale biological data sets.
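The shared-extractor-plus-multi-task setup described above can be sketched in a few lines of numpy. Everything here is a hypothetical stand-in: a fixed random projection plays the role of the pre-trained CNN feature extractor, and two synthetic binary annotation terms play the role of labeled ISH annotation tasks trained jointly.

```python
# Minimal hypothetical sketch of multi-task learning over shared features:
# a frozen extractor (random projection standing in for a pre-trained CNN)
# feeds one linear head per annotation task, trained jointly.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))                    # 200 "images", 50 raw dims
W_shared = rng.normal(size=(50, 16)) / 7.0        # frozen shared extractor
F = np.maximum(X @ W_shared, 0.0)                 # ReLU features
F = np.hstack([F, np.ones((len(F), 1))])          # bias feature

# Two synthetic binary annotation terms derived from the shared features.
y = np.stack([(F[:, 0] > F[:, 1]).astype(float),
              (F[:, 2] > 0.2).astype(float)], axis=1)

heads = np.zeros((17, 2))                         # one weight column per task
for _ in range(2000):                             # joint multi-task training
    p = 1.0 / (1.0 + np.exp(-F @ heads))          # per-task sigmoid
    heads -= 0.5 * F.T @ (p - y) / len(F)         # averaged gradient step

acc = ((1.0 / (1.0 + np.exp(-F @ heads)) > 0.5) == y).mean(axis=0)
print(acc)  # per-task training accuracy, well above chance
```

In the dissertation's setting, fine-tuning would also update the shared extractor on labeled ISH images; here it is kept frozen to keep the sketch short.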