11 research outputs found

    Deep Occlusion Reasoning for Multi-Camera Multi-Target Detection

    Full text link
    People detection in single 2D images has improved greatly in recent years. However, comparatively little of this progress has percolated into multi-camera multi-people tracking algorithms, whose performance still degrades severely when scenes become very crowded. In this work, we introduce a new architecture that combines Convolutional Neural Nets and Conditional Random Fields to explicitly model those ambiguities. One of its key ingredients are high-order CRF terms that model potential occlusions and give our approach its robustness even when many people are present. Our model is trained end-to-end and we show that it outperforms several state-of-art algorithms on challenging scenes

    A Projected Gradient Descent Method for CRF Inference allowing End-To-End Training of Arbitrary Pairwise Potentials

    Full text link
    Are we using the right potential functions in the Conditional Random Field models that are popular in the Vision community? Semantic segmentation and other pixel-level labelling tasks have made significant progress recently due to the deep learning paradigm. However, most state-of-the-art structured prediction methods also include a random field model with a hand-crafted Gaussian potential to model spatial priors, label consistencies and feature-based image conditioning. In this paper, we challenge this view by developing a new inference and learning framework which can learn pairwise CRF potentials restricted only by their dependence on the image pixel values and the size of the support. Both standard spatial and high-dimensional bilateral kernels are considered. Our framework is based on the observation that CRF inference can be achieved via projected gradient descent and consequently, can easily be integrated in deep neural networks to allow for end-to-end training. It is empirically demonstrated that such learned potentials can improve segmentation accuracy and that certain label class interactions are indeed better modelled by a non-Gaussian potential. In addition, we compare our inference method to the commonly used mean-field algorithm. Our framework is evaluated on several public benchmarks for semantic segmentation with improved performance compared to previous state-of-the-art CNN+CRF models.Comment: Presented at EMMCVPR 2017 conferenc

    Combinatorial Optimization for Panoptic Segmentation: A Fully Differentiable Approach

    Full text link
    We propose a fully differentiable architecture for simultaneous semantic and instance segmentation (a.k.a. panoptic segmentation) consisting of a convolutional neural network and an asymmetric multiway cut problem solver. The latter solves a combinatorial optimization problem that elegantly incorporates semantic and boundary predictions to produce a panoptic labeling. Our formulation allows to directly maximize a smooth surrogate of the panoptic quality metric by backpropagating the gradient through the optimization problem. Experimental evaluation shows improvement by backpropagating through the optimization problem w.r.t. comparable approaches on Cityscapes and COCO datasets. Overall, our approach shows the utility of using combinatorial optimization in tandem with deep learning in a challenging large scale real-world problem and showcases benefits and insights into training such an architecture.Comment: To be presented at NeurIPS 202

    An improved data classification framework based on fractional particle swarm optimization

    Get PDF
    Particle Swarm Optimization (PSO) is a population based stochastic optimization technique which consist of particles that move collectively in iterations to search for the most optimum solutions. However, conventional PSO is prone to lack of convergence and even stagnation in complex high dimensional-search problems with multiple local optima. Therefore, this research proposed an improved Mutually-Optimized Fractional PSO (MOFPSO) algorithm based on fractional derivatives and small step lengths to ensure convergence to global optima by supplying a fine balance between exploration and exploitation. The proposed algorithm is tested and verified for optimization performance comparison on ten benchmark functions against six existing established algorithms in terms of Mean of Error and Standard Deviation values. The proposed MOFPSO algorithm demonstrated lowest Mean of Error values during the optimization on all benchmark functions through all 30 runs (Ackley = 0.2, Rosenbrock = 0.2, Bohachevsky = 9.36E-06, Easom = -0.95, Griewank = 0.01, Rastrigin = 2.5E-03, Schaffer = 1.31E-06, Schwefel 1.2 = 3.2E-05, Sphere = 8.36E-03, Step = 0). Furthermore, the proposed MOFPSO algorithm is hybridized with Back-Propagation (BP), Elman Recurrent Neural Networks (RNN) and Levenberg-Marquardt (LM) Artificial Neural Networks (ANNs) to propose an enhanced data classification framework, especially for data classification applications. The proposed classification framework is then evaluated for classification accuracy, computational time and Mean Squared Error on five benchmark datasets against seven existing techniques. It can be concluded from the simulation results that the proposed MOFPSO-ERNN classification algorithm demonstrated good classification performance in terms of classification accuracy (Breast Cancer = 99.01%, EEG = 99.99%, PIMA Indian Diabetes = 99.37%, Iris = 99.6%, Thyroid = 99.88%) as compared to the existing hybrid classification techniques. Hence, the proposed technique can be employed to improve the overall classification accuracy and reduce the computational time in data classification applications

    Joint training of generic CNN-CRF models with stochastic optimization

    No full text
    We propose a new CNN-CRF end-to-end learning framework, which is based on joint stochastic optimization with respect to both Convolutional Neural Network (CNN) and Conditional Random Field (CRF) parameters. While stochastic gradient descent is a standard technique for CNN training, it was not used for joint models so far. We show that our learning method is (i) general, i.e. it applies to arbitrary CNN and CRF architectures and potential functions; (ii) scalable, i.e. it has a low memory footprint and straightforwardly parallelizes on GPUs; (iii) easy in implementation. Additionally, the unified CNN-CRF optimization approach simplifies a potential hardware implementation. We empirically evaluate our method on the task of semantic labeling of body parts in depth images and show that it compares favorably to competing techniques

    End-to-End Learning of Deep Structured Models for Semantic Segmentation

    Get PDF
    The task of semantic segmentation aims at understanding an image at a pixel level. This means assigning a label to each pixel of an image, describing the object it is depicting. Due to its applicability in many areas, such as autonomous vehicles, robotics and medical surgery assistance, semantic segmentation has become an essential task in image analysis. During the last few years a lot of progress have been made for image segmentation algorithms, mainly due to the introduction of deep learning methods, in particular the use of Convolutional Neural Networks (CNNs). CNNs are powerful for modeling complex connections between input and output data but lack the ability to directly model dependent output structures, for instance, enforcing properties such as label smoothness and coherence. This drawback motivates the use of Conditional Random Fields (CRFs), widely applied as a post-processing step in semantic segmentation.This thesis summarizes the content of three papers, all of them presenting solutions to semantic segmentation problems. The applications have varied widely and several different types of data have been considered, ranging from 3D CT images to RGB images of horses. The main focus has been on developing robust and accurate models to solve these problems. The models consist of a CNN capable of learning complex image features coupled with a CRF capable of learning dependencies between output variables. Emphasis has been on creating models that are possible to train end-to-end, as well as developing corresponding optimization methods needed to enable efficient training. End-to-end training gives the CNN and the CRF a chance to learn how to interact and exploit complementary information to achieve better performance
    corecore