    Sublinear Computation Paradigm

    This open access book gives an overview of cutting-edge work on a new paradigm called the “sublinear computation paradigm,” which was proposed in the large multiyear academic research project “Foundations of Innovative Algorithms for Big Data.” That project ran from October 2014 to March 2020 in Japan. To handle the unprecedented explosion of big data sets in research, industry, and other areas of society, there is an urgent need to develop novel methods and approaches for big data analysis. To meet this need, innovative changes in algorithm theory for big data are being pursued. For example, polynomial-time algorithms have thus far been regarded as “fast,” but if a quadratic-time algorithm is applied to a petabyte-scale or larger big data set, problems are encountered in terms of computational resources or running time. To deal with this critical computational and algorithmic bottleneck, linear, sublinear, and constant time algorithms are required. The sublinear computation paradigm is proposed here in order to support innovation in the big data era. A foundation of innovative algorithms has been created by developing computational procedures, data structures, and modelling techniques for big data. The project was organized into three teams that focus on sublinear algorithms, sublinear data structures, and sublinear modelling. The work has provided high-level academic research results of strong computational and algorithmic interest, which are presented in this book. The book consists of five parts: Part I consists of a single chapter on the concept of the sublinear computation paradigm; Parts II, III, and IV review results on sublinear algorithms, sublinear data structures, and sublinear modelling, respectively; Part V presents application results. The information presented here will inspire the researchers who work in the field of modern algorithms.
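    To make the scale argument concrete, the following sketch estimates running times for a linear and a quadratic algorithm on a petabyte-scale input. The throughput of 10^9 elementary operations per second and the cost model of one operation per byte are illustrative assumptions, not figures from the book.

```python
# Rough running-time estimate; every constant here is an illustrative assumption.
N = 10**15                          # input size: one petabyte, counted in bytes
OPS_PER_SECOND = 10**9              # assumed throughput of a single machine
SECONDS_PER_DAY = 24 * 3600
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY


def seconds_needed(operations):
    """Time to perform `operations` elementary steps at the assumed rate."""
    return operations / OPS_PER_SECOND


linear_days = seconds_needed(N) / SECONDS_PER_DAY
quadratic_years = seconds_needed(N**2) / SECONDS_PER_YEAR

print(f"linear:    about {linear_days:.1f} days")
print(f"quadratic: about {quadratic_years:.2e} years")
```

    Under these assumptions a single linear pass already takes more than eleven days, while the quadratic algorithm needs on the order of 10^13 years, which is why linear, sublinear, and constant time algorithms are sought.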

    Optimisation for image processing

    The main purpose of optimisation in image processing is to compensate for missing or corrupted image data, or to find good correspondences between input images. We note that image data essentially has infinite dimensionality that needs to be discretised at certain levels of resolution. Most image processing methods find a suboptimal solution, given the characteristics of the problem. While the general optimisation literature is vast, there does not seem to be an accepted universal method for all image problems. In this thesis, we consider three interrelated optimisation approaches to exploit problem structures of various relaxations to three common image processing problems: 1. The first approach to the image registration problem is based on the nonlinear programming model. Image registration is an ill-posed problem and suffers from many undesired local optima. In order to remove these unwanted solutions, certain regularisers or constraints are needed. In this thesis, prior knowledge of rigid structures of the images is included in the problem using linear and bilinear constraints. The aim is to match two images while maintaining the rigid structure of certain parts of the images. A sequential quadratic programming algorithm is used, employing dimensional reduction, to solve the resulting discretised constrained optimisation problem. We show that pre-processing of the constraints can reduce problem dimensionality. Experimental results demonstrate better performance of our proposed algorithm compared with current methods. 2. The second approach is based on discrete Markov Random Fields (MRF). MRFs have been used successfully in machine learning, artificial intelligence, and image processing, including the image registration problem. In the discrete MRF model, the domain of the image problem is fixed (relaxed) to a certain range. Therefore, the optimal solution to the relaxed problem could be found in the predefined domain. 
The original discrete MRF is NP-hard and relaxations are needed to obtain a suboptimal solution in polynomial time. One popular approach is the linear programming (LP) relaxation. However, the LP relaxation of MRF (LP-MRF) is excessively high dimensional and contains sophisticated constraints. Therefore, even one iteration of a standard LP solver (e.g. an interior-point algorithm) may take too long to terminate. A dual decomposition technique has been used to formulate a convex-nondifferentiable dual LP-MRF that has geometrical advantages. This has led to the development of first order methods that take into account the MRF structure. The methods considered in this thesis for solving the dual LP-MRF are the projected subgradient and mirror descent using nonlinear weighted distance functions. An analysis of the convergence properties of the method is provided, along with improved convergence rate estimates. The experiments on synthetic data and an image segmentation problem show promising results. 3. The third approach employs a hierarchy of models of the problem for computing the search directions. The first two approaches are specialised methods for image problems at a certain level of discretisation. As input images are infinite-dimensional, all computational methods require their discretisation at some level. Clearly, high resolution images carry more information but they lead to very large scale and ill-posed optimisation problems. By contrast, although low level discretisation suffers from the loss of information, it benefits from low computational cost. In addition, a coarser representation of a fine image problem could be treated as a relaxation to the problem, i.e. the coarse problem is less ill-conditioned. Therefore, propagating a solution of a good coarse approximation to the fine problem could potentially improve the fine level. 
With the aim of utilising low level information within the high level process, we propose a multilevel optimisation method to solve the convex composite optimisation problem. This problem consists of the minimisation of the sum of a smooth convex function and a simple non-smooth convex function. The method iterates between fine and coarse levels of discretisation in the sense that the search direction is computed using information from either the gradient or a solution of the coarse model. We show that the proposed algorithm is a contraction on the optimal solution and demonstrate excellent performance on experiments with image restoration problems.
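    As a point of reference for the convex composite problem described above (a smooth convex term plus a simple non-smooth convex term), the sketch below runs the standard single-level proximal gradient iteration on a toy instance. It is not the multilevel method of the thesis: there is no coarse model here, and the objective 0.5 * ||x - b||^2 + lam * ||x||_1, the data `b`, the step size, and all function names are assumptions chosen so that the answer can be checked by hand.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |.| applied to a scalar v."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def proximal_gradient(b, lam, step=0.5, iters=100):
    """Minimise 0.5 * ||x - b||^2 + lam * ||x||_1 with proximal gradient steps."""
    x = [0.0] * len(b)
    for _ in range(iters):
        # Gradient of the smooth part 0.5 * ||x - b||^2.
        grad = [xi - bi for xi, bi in zip(x, b)]
        # Gradient step followed by the prox of the non-smooth part.
        x = [soft_threshold(xi - step * gi, step * lam)
             for xi, gi in zip(x, grad)]
    return x


b = [3.0, -0.2, 1.5]                      # hypothetical data
x_star = proximal_gradient(b, lam=1.0)
# For this objective the minimiser is known in closed form.
closed_form = [soft_threshold(bi, 1.0) for bi in b]
```

    For this toy objective the iterates converge to the soft-thresholded data, so `x_star` can be compared against `closed_form`. A multilevel scheme of the kind the abstract describes would, at some iterations, replace the gradient-based direction with one obtained from a coarse model.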

    Study of Computational Image Matching Techniques: Improving Our View of Biomedical Image Data

    Image matching techniques are proven to be necessary in various fields of science and engineering, with many new methods and applications introduced over the years. In this PhD thesis, several computational image matching methods are introduced and investigated for improving the analysis of various biomedical image data. These improvements include the use of matching techniques for enhancing visualization of cross-sectional imaging modalities such as Computed Tomography (CT) and Magnetic Resonance Imaging (MRI), denoising of retinal Optical Coherence Tomography (OCT), and high quality 3D reconstruction of surfaces from Scanning Electron Microscope (SEM) images. This work greatly improves the process of data interpretation of image data with far reaching consequences for basic sciences research. The thesis starts with a general notion of the problem of image matching followed by an overview of the topics covered in the thesis. This is followed by the introduction and investigation of several applications of image matching/registration in biomedical image processing: a) registration-based slice interpolation, b) fast mesh-based deformable image registration and c) use of simultaneous rigid registration and Robust Principal Component Analysis (RPCA) for speckle noise reduction of retinal OCT images. Moving towards a different notion of image matching/correspondence, the problem of view synthesis and 3D reconstruction, with a focus on 3D reconstruction of microscopic samples from 2D images captured by SEM, is considered next. Starting from sparse feature-based matching techniques, an extensive analysis is provided for using several well-known feature detector/descriptor techniques, namely ORB, BRIEF, SURF and SIFT, for the problem of multi-view 3D reconstruction. This chapter contains qualitative and quantitative comparisons in order to reveal the shortcomings of the sparse feature-based techniques. 
This is followed by the introduction of a novel framework using sparse-dense matching/correspondence for high quality 3D reconstruction of SEM images. As will be shown, the proposed framework results in better reconstructions when compared with state-of-the-art sparse feature-based techniques. Even though the proposed framework produces satisfactory results, there is room for improvement. These improvements become more necessary when dealing with higher complexity microscopic samples imaged by SEM as well as in cases with large displacements between corresponding points in micrographs. Therefore, based on the proposed framework, a new approach is proposed for high quality 3D reconstruction of microscopic samples. While the performance of the two proposed techniques is comparable for simpler microscopic samples, the new technique results in more truthful reconstruction of highly complex samples. The thesis concludes with an overview of the work and pointers to future research directions using both multi-view and photometric techniques for 3D reconstruction of SEM images.

    Video Object Segmentation by Tracking Structured Key Points and Contours

    In this thesis, we tackle the problem of video object segmentation where we have to classify every pixel of every frame in a video sequence into background and foreground classes. Our algorithms fall in the semi-supervised category, i.e., they start with the object of interest annotated in the first frame and then they track and segment that object in the following frames. The first algorithm that we have implemented describes the object of interest in terms of a set of points distributed on the object and then tracks them in the following frames. To make the tracking robust, we impose that the spatial distribution of these points is stable along the frames. To do so, we place a mesh on top of the mask of the object, whose vertices are the interest points to track, and the edges define the spatial structure within them. We then compute a descriptor of the appearance of each of the points and look for the displacements that bring those points in the following frame to a point with a similar descriptor. We enforce that the displacements of neighboring points are similar, which favors coherent deformations of the object. This algorithm may experience difficulties at the contours of the objects as the point descriptors might be influenced by the background. To overcome this problem, our second algorithm is based on the idea of tracking the contour of the object by imposing smooth deformations between frames. Starting from a polygonal representation of the contour of the object, we look for the locations at the following frame that have a strong response of an edge detector while minimizing the deformation of the shape. Specifically, we build a multiscale pyramid of segments of the contour polygon and look for the displacement of every segment that matches the edge response while being coherent with the rest of the elements of the pyramid. 
This second algorithm can be understood as complementary to the first one, since it might fail on objects with low-contrast contours or cluttered backgrounds. As an overall trade-off, we propose a combination of the two algorithms that tries to make the most out of each of them and compensate for their weaknesses. In order to validate our approaches, we perform an extensive validation on a recently published database called DAVIS that provides fifty sequences with the ground truth annotated in each of their frames. We sweep all the different parameters of the algorithms in order to achieve the best performance in this database. The results show that the contour algorithm outperforms the mesh algorithm, so the weaknesses presented in the previous paragraph are more prominent in the mesh algorithm. Once we combine both of them, although we have not been able to do a full search in the parameter space, the results obtained are promising and suggest that a broader parameter search would outperform either of the standalone methods. We also perform a comparison against six state-of-the-art algorithms which shows that although we are still behind the better-performing ones, our approach might be competitive with further tuning and experimentation.

    Past, Present, and Future of Simultaneous Localization And Mapping: Towards the Robust-Perception Age

    Simultaneous Localization and Mapping (SLAM) consists in the concurrent construction of a model of the environment (the map), and the estimation of the state of the robot moving within it. The SLAM community has made astonishing progress over the last 30 years, enabling large-scale real-world applications, and witnessing a steady transition of this technology to industry. We survey the current state of SLAM. We start by presenting what is now the de-facto standard formulation for SLAM. We then review related work, covering a broad set of topics including robustness and scalability in long-term mapping, metric and semantic representations for mapping, theoretical performance guarantees, active SLAM and exploration, and other new frontiers. This paper simultaneously serves as a position paper and tutorial to those who are users of SLAM. By looking at the published research with a critical eye, we delineate open challenges and new research issues that still deserve careful scientific investigation. The paper also contains the authors' take on two questions that often animate discussions during robotics conferences: Do robots need SLAM? and Is SLAM solved?

    Resilient visual perception for multiagent systems

    There has been an increasing interest in visual sensors and vision-based solutions for single and multi-robot systems. Vision-based sensors, e.g., traditional RGB cameras, grant rich semantic information and accurate directional measurements at a relatively low cost; however, such sensors have two major drawbacks. They do not generally provide reliable depth estimates, and typically have a limited field of view. These limitations considerably increase the complexity of controlling multiagent systems. This thesis studies some of the underlying problems in vision-based multiagent control and mapping. The first contribution of this thesis is a method for restoring bearing rigidity in non-rigid networks of robots. We introduce means to determine which bearing measurements can improve bearing rigidity in non-rigid graphs and provide a greedy algorithm that restores rigidity in 2D with a minimum number of added edges. The focus of the second part is on the formation control problem using only bearing measurements. We address the control problem for consensus and formation control through non-smooth Lyapunov functions and differential inclusion. We provide a stability analysis for undirected graphs and investigate the derived controllers for directed graphs. We also introduce a newer notion of bearing persistence for pure bearing-based control in directed graphs. The third part is concerned with the bearing-only visual homing problem with a limited field of view sensor. In essence, this problem is a special case of the formation control problem where there is a single moving agent with fixed neighbors. We introduce a navigational vector field composed of two orthogonal vector fields that converges to the goal position and does not violate the field of view constraints. Our method does not require the landmarks' locations and is robust to the landmarks' tracking loss. 
The last part of this dissertation considers outlier detection in pose graphs for Structure from Motion (SfM) and Simultaneous Localization and Mapping (SLAM) problems. We propose a method for detecting incorrect orientation measurements before pose graph optimization by checking their geometric consistency in cycles. We use Expectation-Maximization to fine-tune the parameters of the noise distribution and propose a new approximate graph inference procedure specifically designed to take advantage of evidence on cycles with better performance than standard approaches. These works will help enable multi-robot systems to overcome visual sensors' limitations in collaborative tasks such as navigation and mapping.
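    The idea of checking geometric consistency in cycles can be illustrated in a simplified planar setting: relative orientations composed around a closed loop should return to the identity, so a large residual signals that at least one measurement in the cycle is wrong. The sketch below is an assumption-laden toy, not the thesis's procedure: it uses 2D angles, made-up measurements, and a fixed hypothetical threshold in place of the Expectation-Maximization and graph inference steps.

```python
import math


def wrap(angle):
    """Wrap an angle in radians to the interval [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def cycle_error(relative_angles):
    """Absolute deviation from the identity after composing a cycle of rotations."""
    return abs(wrap(sum(relative_angles)))


# Hypothetical relative orientation measurements (radians) around a 3-cycle.
consistent = [0.5, 1.0, -1.5]       # composes to the identity rotation
with_outlier = [0.5, 1.0, -0.7]     # last measurement is corrupted

THRESHOLD = 0.1                     # assumed tolerance for measurement noise
flags = [cycle_error(c) > THRESHOLD for c in (consistent, with_outlier)]
```

    A single cycle only indicates that some edge in it is inconsistent, not which one. Localising the faulty measurement requires combining evidence from several overlapping cycles, which is the role of the inference procedure mentioned in the abstract.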

    Semantic Localization and Mapping in Robot Vision

    Integration of human semantics plays an increasing role in robotics tasks such as mapping, localization and detection. Increased use of semantics serves multiple purposes, including giving computers the ability to process and present data containing human meaningful concepts, and allowing computers to employ human reasoning to accomplish tasks. This dissertation presents three solutions which incorporate semantics into visual data in order to address these problems. The first addresses the problem of constructing topological maps from a sequence of images. The proposed solution includes a novel image similarity score which uses dynamic programming to match images using both appearance and relative positions of local features simultaneously. An MRF is constructed to model the probability of loop-closures and a locally optimal labeling is found using Loopy-BP. The recovered loop closures are then used to generate a topological map. Results are presented on four urban sequences and one indoor sequence. The second system uses video and annotated maps to solve localization. Data association is achieved through detection of object classes, annotated in prior maps, rather than through detection of visual features. To avoid the caveats of object recognition, a new representation of query images is introduced consisting of a vector of detection scores for each object class. Using soft object detections, hypotheses about pose are refined through particle filtering. Experiments include both small office spaces, and a large open urban rail station with semantically ambiguous places. This approach showcases a representation that is both robust and can exploit the plethora of existing prior maps for GPS-denied environments while avoiding the data association problems encountered when matching point clouds or visual features. Finally, a purely vision-based approach is presented for constructing semantic maps given camera pose and simple object exemplar images. 
Object response heatmaps are combined with known pose to back-project detection information onto the world. These update the world model, integrating information over time as the camera moves. The approach avoids making hard decisions on object recognition, and aggregates evidence about objects in the world coordinate system. These solutions simultaneously showcase the contribution of semantics in robotics and provide state-of-the-art solutions to these fundamental problems.

    Deformable Medical Image Registration: A Survey

    Deformable image registration is a fundamental task in medical image processing. Among its most important applications, one may cite: i) multi-modality fusion, where information acquired by different imaging devices or protocols is fused to facilitate diagnosis and treatment planning; ii) longitudinal studies, where temporal structural or anatomical changes are investigated; and iii) population modeling and statistical atlases used to study normal anatomical variability. In this technical report, we attempt to give an overview of deformable registration methods, putting emphasis on the most recent advances in the domain. Additional emphasis has been given to techniques applied to medical images. In order to study image registration methods in depth, their main components are identified and studied independently. The most recent techniques are presented in a systematic fashion. The contribution of this technical report is to provide an extensive account of registration techniques in a systematic manner.