10,344 research outputs found

    Ball-Scale Based Hierarchical Multi-Object Recognition in 3D Medical Images

    Full text link
    This paper investigates, using prior shape models and the concept of ball scale (b-scale), ways of automatically recognizing objects in 3D images without performing elaborate searches or optimization. That is, the goal is to place the model in a single shot close to the right pose (position, orientation, and scale) in a given image so that the model boundaries fall in the close vicinity of object boundaries in the image. This is achieved via the following set of key ideas: (a) A semi-automatic way of constructing a multi-object shape model assembly. (b) A novel strategy of encoding, via b-scale, the pose relationship between objects in the training images and their intensity patterns captured in b-scale images. (c) A hierarchical mechanism of positioning the model, in a one-shot way, in a given image from a knowledge of the learnt pose relationship and the b-scale image of the given image to be segmented. The evaluation results on a set of 20 routine clinical abdominal female and male CT data sets indicate the following: (1) Incorporating a large number of objects improves the recognition accuracy dramatically. (2) The recognition algorithm can be thought as a hierarchical framework such that quick replacement of the model assembly is defined as coarse recognition and delineation itself is known as finest recognition. (3) Scale yields useful information about the relationship between the model assembly and any given image such that the recognition results in a placement of the model close to the actual pose without doing any elaborate searches or optimization. (4) Effective object recognition can make delineation most accurate.Comment: This paper was published and presented in SPIE Medical Imaging 201

    Gait recognition and understanding based on hierarchical temporal memory using 3D gait semantic folding

    Get PDF
    Gait recognition and understanding systems have shown a wide-ranging application prospect. However, their use of unstructured data from image and video has affected their performance, e.g., they are easily influenced by multi-views, occlusion, clothes, and object carrying conditions. This paper addresses these problems using a realistic 3-dimensional (3D) human structural data and sequential pattern learning framework with top-down attention modulating mechanism based on Hierarchical Temporal Memory (HTM). First, an accurate 2-dimensional (2D) to 3D human body pose and shape semantic parameters estimation method is proposed, which exploits the advantages of an instance-level body parsing model and a virtual dressing method. Second, by using gait semantic folding, the estimated body parameters are encoded using a sparse 2D matrix to construct the structural gait semantic image. In order to achieve time-based gait recognition, an HTM Network is constructed to obtain the sequence-level gait sparse distribution representations (SL-GSDRs). A top-down attention mechanism is introduced to deal with various conditions including multi-views by refining the SL-GSDRs, according to prior knowledge. The proposed gait learning model not only aids gait recognition tasks to overcome the difficulties in real application scenarios but also provides the structured gait semantic images for visual cognition. Experimental analyses on CMU MoBo, CASIA B, TUM-IITKGP, and KY4D datasets show a significant performance gain in terms of accuracy and robustness

    A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community

    Full text link
    In recent years, deep learning (DL), a re-branding of neural networks (NNs), has risen to the top in numerous areas, namely computer vision (CV), speech recognition, natural language processing, etc. Whereas remote sensing (RS) possesses a number of unique challenges, primarily related to sensors and applications, inevitably RS draws from many of the same theories as CV; e.g., statistics, fusion, and machine learning, to name a few. This means that the RS community should be aware of, if not at the leading edge of, of advancements like DL. Herein, we provide the most comprehensive survey of state-of-the-art RS DL research. We also review recent new developments in the DL field that can be used in DL for RS. Namely, we focus on theories, tools and challenges for the RS community. Specifically, we focus on unsolved challenges and opportunities as it relates to (i) inadequate data sets, (ii) human-understandable solutions for modelling physical phenomena, (iii) Big Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and learning algorithms for spectral, spatial and temporal data, (vi) transfer learning, (vii) an improved theoretical understanding of DL systems, (viii) high barriers to entry, and (ix) training and optimizing the DL.Comment: 64 pages, 411 references. To appear in Journal of Applied Remote Sensin

    Deep Learning based 3D Segmentation: A Survey

    Full text link
    3D object segmentation is a fundamental and challenging problem in computer vision with applications in autonomous driving, robotics, augmented reality and medical image analysis. It has received significant attention from the computer vision, graphics and machine learning communities. Traditionally, 3D segmentation was performed with hand-crafted features and engineered methods which failed to achieve acceptable accuracy and could not generalize to large-scale data. Driven by their great success in 2D computer vision, deep learning techniques have recently become the tool of choice for 3D segmentation tasks as well. This has led to an influx of a large number of methods in the literature that have been evaluated on different benchmark datasets. This paper provides a comprehensive survey of recent progress in deep learning based 3D segmentation covering over 150 papers. It summarizes the most commonly used pipelines, discusses their highlights and shortcomings, and analyzes the competitive results of these segmentation methods. Based on the analysis, it also provides promising research directions for the future.Comment: Under review of ACM Computing Surveys, 36 pages, 10 tables, 9 figure
    • …
    corecore