    Optimizing Geometry Compression using Quantum Annealing

    The compression of geometry data is an important aspect of bandwidth-efficient data transfer for distributed 3D computer vision applications. We propose a quantum-enabled lossy 3D point cloud compression pipeline based on the constructive solid geometry (CSG) model representation. Key parts of the pipeline are mapped to NP-complete problems for which an efficient Ising formulation suitable for execution on a quantum annealer exists. We describe existing Ising formulations for the maximum clique search problem and the smallest exact cover problem, both of which are important building blocks of the proposed compression pipeline. Additionally, we discuss the properties of the overall pipeline with regard to result optimality and the described Ising formulations. Comment: 6 pages, 3 figures
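As an illustration of the kind of mapping the abstract above refers to, the following minimal sketch encodes maximum clique search as a QUBO (equivalent to an Ising model under the substitution s_i = 2x_i - 1). The penalty formulation, the function names, and the brute-force solver standing in for an annealer are all assumptions made for this example; the paper's own formulation may differ.

```python
from itertools import combinations, product


def max_clique_qubo(n, edges, penalty=2.0):
    """Build QUBO coefficients for: minimise -sum_i x_i + penalty * sum_{(i,j) not in E} x_i x_j.

    With penalty > 1, every minimiser selects a clique of maximum size.
    """
    edge_set = {frozenset(e) for e in edges}
    Q = {(i, i): -1.0 for i in range(n)}  # reward each selected vertex
    for i, j in combinations(range(n), 2):
        if frozenset((i, j)) not in edge_set:
            Q[(i, j)] = penalty  # punish selecting two non-adjacent vertices
    return Q


def energy(Q, x):
    """Evaluate the QUBO objective for a binary assignment x."""
    return sum(c * x[i] * x[j] for (i, j), c in Q.items())


def brute_force_minimum(Q, n):
    """Exhaustive search over all 2^n assignments (a stand-in for an annealer)."""
    best = min(product((0, 1), repeat=n), key=lambda x: energy(Q, x))
    return best, energy(Q, best)


# Hypothetical test graph: triangle 0-1-2 plus a pendant vertex 3 attached to 2.
EDGES = [(0, 1), (0, 2), (1, 2), (2, 3)]
Q = max_clique_qubo(4, EDGES)
BEST, BEST_ENERGY = brute_force_minimum(Q, 4)
CLIQUE = [i for i, bit in enumerate(BEST) if bit]
print(CLIQUE, BEST_ENERGY)  # [0, 1, 2] -3.0
```

The exhaustive search only scales to toy graphs; it is here to make the objective checkable by hand, not to suggest how the pipeline would be run in practice.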

    A Survey of Methods for Converting Unstructured Data to CSG Models

    The goal of this document is to survey existing methods for recovering CSG representations from unstructured data such as 3D point clouds or polygon meshes. We review and discuss related topics such as the segmentation and fitting of the input data. We cover techniques from solid modeling and CAD for polyhedron-to-CSG and B-rep-to-CSG conversion. We look at approaches coming from program synthesis, evolutionary techniques (such as genetic programming or genetic algorithms), and deep learning methods. Finally, we conclude with a discussion of techniques for the generation of computer programs representing solids (not just CSG models) and higher-level representations (for example, those based on sketch and extrusion or feature-based operations). Comment: 29 pages

    Analysis and development of the Bees Algorithm for primitive fitting in point cloud models

    This work addresses the problem of fitting a geometrical primitive to a point cloud as a numerical optimisation problem. Intelligent Optimisation Techniques like Evolutionary Algorithms and the Bees Algorithm were adapted here to select the fittest primitive out of a population of solutions, and the results were compared. The necessity of understanding the dynamics of the Bees Algorithm to improve its performance and applicability led to an in-depth analysis of its key parts. A new mathematical definition of the algorithm led to the discovery and formalisation of several properties, many of which provided a mathematical answer to behaviours so far only observed in empirical tests. The implications of heuristics commonly used in the Bees Algorithm, like site abandonment and neighbourhood shrinking, were statistically analysed. The probability of a premature stalling of the local search at a site has been quantified under certain conditions. The effect of the choice of shape for the local neighbourhood on the exploitative search of the Bees Algorithm was analysed. The study revealed that this commonly overlooked aspect has profound consequences on the effectiveness of the local search, and practical applications have been suggested to address specific search problems. The results of the primitive fitting study, and the analysis of the Bees Algorithm, inspired the creation of a new algorithm for problems where multiple solutions are sought (multi-solution optimisation). This new algorithm is an extension of the Bees Algorithm to multi-solution optimisation. It uses topological information on the search space gathered during the cycles of local search at a site, which is normally discarded, to alter the fitness function. The function is altered to discourage further search in already explored regions of the fitness landscape, and to force the algorithm to discover new optima.
This new algorithm found immediate application in the multi-shape variant of the primitive fitting problem. In a series of experimental tests, the new algorithm obtained promising results, showing its ability to find many shapes in a point cloud. It also showed its suitability as a general technique for the multi-solution optimisation problem.
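To make the two heuristics named in the abstract above concrete, here is a minimal sketch of a simplified, one-dimensional Bees Algorithm with neighbourhood shrinking and site abandonment. It is not the thesis's formulation or its multi-solution extension; all parameter names and default values (`shrink`, `stall_limit`, and so on) and the quadratic test function are assumptions chosen for illustration.

```python
import random


def bees_algorithm(f, lo, hi, n_scouts=10, n_sites=3, n_recruits=5,
                   shrink=0.8, stall_limit=5, iterations=100, seed=0):
    """Minimise f on [lo, hi] with a simplified one-dimensional Bees Algorithm."""
    rng = random.Random(seed)

    def new_site():
        # A scout lands at a random position with a fresh neighbourhood.
        return {"x": rng.uniform(lo, hi), "radius": (hi - lo) / 4, "stall": 0}

    sites = sorted((new_site() for _ in range(n_scouts)), key=lambda s: f(s["x"]))
    best = sites[0]["x"]
    for _ in range(iterations):
        kept = []
        for site in sites[:n_sites]:
            r = site["radius"]
            # Local search: recruits sample the neighbourhood of the site.
            recruits = [min(hi, max(lo, site["x"] + rng.uniform(-r, r)))
                        for _ in range(n_recruits)]
            candidate = min(recruits, key=f)
            if f(candidate) < f(site["x"]):
                site = {"x": candidate, "radius": r, "stall": 0}
            else:
                # Neighbourhood shrinking: no recruit improved on the site.
                site = {"x": site["x"], "radius": r * shrink,
                        "stall": site["stall"] + 1}
            if f(site["x"]) < f(best):
                best = site["x"]
            if site["stall"] >= stall_limit:
                # Site abandonment: the local search has stalled here.
                site = new_site()
            kept.append(site)
        # Remaining bees scout the search space at random (global search).
        scouts = [new_site() for _ in range(n_scouts - n_sites)]
        sites = sorted(kept + scouts, key=lambda s: f(s["x"]))
    return best


# Hypothetical objective with its minimum at x = 2.
BEST_X = bees_algorithm(lambda x: (x - 2.0) ** 2, -10.0, 10.0)
print(BEST_X)
```

In a primitive-fitting setting, `f` would presumably measure the misfit between a candidate primitive's parameters and the point cloud, and the search space would be multi-dimensional rather than a single interval.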

    Cell Image Segmentation with Kernel-Based Dynamic Clustering and an Ellipsoidal Cell Shape Model

    In this paper, we propose a novel approach to cell image segmentation under severe noise conditions by combining kernel-based dynamic clustering and a genetic algorithm. Our method incorporates a priori knowledge about cell shape: an elliptical cell contour model is introduced to describe the boundary of the cell. Our method consists of the following components: (1) obtain the gradient image; (2) use the gradient image to obtain points that possibly belong to cell boundaries; (3) adjust the parameters of the elliptical cell boundary model to match the cell contour using a genetic algorithm. The method is tested on noisy images of human thyroid and small intestine cells.

    FACE IMAGE RECOGNITION BASED ON PARTIAL FACE MATCHING USING GENETIC ALGORITHM

    In various real-world face recognition applications such as forensics and surveillance, only a partial face image is available. Hence, template matching and recognition are strongly needed. In this paper, we propose a genetic algorithm that matches a pattern within an image and then recognizes the image from that pattern. The algorithm can use any pattern in an image, such as an eye, mouth, or ear, to recognize the image. The proposed genetic algorithm uses a short chromosome to reduce the search space, so results can be obtained in a short time. Two datasets were used to test the proposed method: the AR Face database and the LFW face database. The overall matching and recognition accuracy, calculated from sequences of experiments on random sub-datasets, was 91.7% and 90% respectively. The results demonstrate the robustness and efficiency of the proposed algorithm compared with other state-of-the-art algorithms.

    Multi-view human action recognition using 2D motion templates based on MHIs and their HOG description

    In this study, a new multi-view human action recognition approach is proposed that exploits low-dimensional motion information of actions. Before feature extraction, pre-processing steps are performed to remove noise from silhouettes, incurred due to imperfect but realistic segmentation. Two-dimensional motion templates based on the motion history image (MHI) are computed for each view/action video. Histograms of oriented gradients (HOGs) are used as an efficient description of the MHIs, which are classified using a nearest neighbor (NN) classifier. Compared with existing approaches, the proposed method has three advantages: (i) it does not require a fixed number of cameras during the training and testing stages, hence missing camera views can be tolerated; (ii) it requires less memory and bandwidth; and hence (iii) it is computationally efficient, which makes it suitable for real-time action recognition. As far as the authors know, this is the first report of results on the MuHAVi-uncut dataset, which has a large number of action categories and a large set of camera views with noisy silhouettes, and which can be used by future workers as a baseline to improve on. Multi-view experiments on this dataset give a high accuracy rate of 95.4% using the leave-one-sequence-out cross-validation technique, which compares well to similar state-of-the-art approaches.
    Sergio A Velastin acknowledges the Chilean National Science and Technology Council (CONICYT) for its funding under grant CONICYT-Fondecyt Regular no. 1140209 (“OBSERVE”). He is currently funded by the Universidad Carlos III de Madrid, the European Union’s Seventh Framework Programme for research, technological development and demonstration under grant agreement nº 600371, the Ministerio de Economía y Competitividad (COFUND2013-51509) and Banco Santander.
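The motion history image mentioned in the abstract above can be sketched with the commonly used update rule: a pixel that moves in the current frame is set to a duration `tau`, and every other pixel decays by one towards zero. This is a minimal sketch of that general rule on plain nested lists, not the paper's pipeline; the binary motion masks, the value of `tau`, and the function names are assumptions, and the HOG description and NN classification stages are not covered.

```python
def update_mhi(mhi, motion_mask, tau):
    """One MHI step: moving pixels are set to tau, all others decay by 1 down to 0."""
    return [
        [tau if moving else max(0, h - 1) for h, moving in zip(h_row, m_row)]
        for h_row, m_row in zip(mhi, motion_mask)
    ]


def motion_history_image(masks, tau):
    """Accumulate a sequence of binary motion masks into a single MHI."""
    height, width = len(masks[0]), len(masks[0][0])
    mhi = [[0] * width for _ in range(height)]
    for mask in masks:
        mhi = update_mhi(mhi, mask, tau)
    return mhi


# Hypothetical input: a one-pixel blob moving left to right across a 1x4 strip.
MASKS = [
    [[1, 0, 0, 0]],
    [[0, 1, 0, 0]],
    [[0, 0, 1, 0]],
    [[0, 0, 0, 1]],
]
MHI = motion_history_image(MASKS, tau=4)
print(MHI)  # [[1, 2, 3, 4]]
```

The resulting intensity ramp records both where motion occurred and how recently, which is what makes a single 2D template usable as a summary of a whole action sequence.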

    Mapping beyond what you can see: Predicting the layout of rooms behind closed doors

    The availability of maps of indoor environments is often fundamental for autonomous mobile robots to operate efficiently in industrial, office, and domestic applications. When robots build such maps, some areas of interest may be inaccessible, for instance, due to closed doors. As a consequence, these areas are not represented in the maps, possibly limiting robot localization and navigation. In this paper, we provide a method that completes 2D grid maps by adding the predicted layout of the rooms behind closed doors. The main idea of our approach is to exploit the underlying geometrical structure of indoor environments to estimate the shape of unobserved rooms. Results show that our method completes maps accurately even when large portions of environments cannot be accessed by the robot during map building. We experimentally validate the quality of the completed maps by using them to perform path planning tasks. © 2022 Elsevier B.V. All rights reserved.

    Toward Understanding Human Expression in Human-Robot Interaction

    Intelligent devices are quickly becoming necessities to support our activities during both work and play. We are already bound in a symbiotic relationship with these devices. An unfortunate effect of the pervasiveness of intelligent devices is the substantial investment of our time and effort to communicate intent. Even though our increasing reliance on these intelligent devices is inevitable, the limits of conventional methods for devices to perceive human expression hinder communication efficiency. These constraints restrict the usefulness of intelligent devices to support our activities. Our communication time and effort must be minimized to leverage the benefits of intelligent devices and seamlessly integrate them into society. Minimizing the time and effort needed to communicate our intent will allow us to concentrate on tasks in which we excel, including creative thought and problem solving. An intuitive method to minimize human communication effort with intelligent devices is to take advantage of our existing interpersonal communication experience. Recent advances in speech, hand gesture, and facial expression recognition provide alternate viable modes of communication that are more natural than conventional tactile interfaces. Use of natural human communication eliminates the need to adapt and invest time and effort using the less intuitive techniques required for traditional keyboard and mouse based interfaces. Although the state of the art in natural but isolated modes of communication achieves impressive results, significant hurdles must be overcome before communication with devices in our daily lives will feel natural and effortless. Research has shown that combining information from multiple noise-prone modalities improves accuracy. Leveraging this complementary and redundant content will improve communication robustness and relax current unimodal limitations.
This research presents and evaluates a novel multimodal framework to help reduce the total human effort and time required to communicate with intelligent devices. This reduction is realized by determining human intent using a knowledge-based architecture that combines and leverages conflicting information available across multiple natural communication modes and modalities. The effectiveness of this approach is demonstrated using dynamic hand gestures and simple facial expressions characterizing basic emotions. It is important to note that the framework is not restricted to these two forms of communication. The framework presented in this research provides the flexibility necessary to include additional or alternate modalities and channels of information in future research, including improving the robustness of speech understanding. The primary contributions of this research include the leveraging of conflicts in a closed-loop multimodal framework, explicit use of uncertainty in knowledge representation and reasoning across multiple modalities, and a flexible approach for leveraging domain-specific knowledge to help understand multimodal human expression. Experiments using a manually defined knowledge base demonstrate an improved average accuracy of individual concepts and an improved average accuracy of overall intents when leveraging conflicts, as compared to an open-loop approach.