593,587 research outputs found

    Greedy Structure Learning of Hierarchical Compositional Models

    In this work, we consider the problem of learning a hierarchical generative model of an object from a set of images which show examples of the object in the presence of variable background clutter. Existing approaches to this problem are limited by making strong a-priori assumptions about the object’s geometric structure and require segmented training data for learning. In this paper, we propose a novel framework for learning hierarchical compositional models (HCMs) which do not suffer from the mentioned limitations. We present a generalized formulation of HCMs and describe a greedy structure learning framework that consists of two phases: bottom-up part learning and top-down model composition. Our framework integrates the foreground-background segmentation problem into the structure learning task via a background model. As a result, we can jointly optimize for the number of layers in the hierarchy, the number of parts per layer and a foreground-background segmentation based on class labels only. We show that the learned HCMs are semantically meaningful and achieve competitive results when compared to other generative object models at object classification on a standard transfer learning dataset.

    Learning objects and learning designs: an integrated system for reusable, adaptive and shareable learning content

    This paper proposes a system, the Smart Learning Design Framework, designed to support the development of pedagogically sound learning material within an integrated, platform-independent data structure. The system supports sharing, reuse and adaptation of learning material via a metadata-driven philosophy that enables the technicalities of the system to be imperceptible to the author and consumer. The system proposes the use of pedagogically focused metadata to support and guide the author and to adapt and deliver the content to the targeted consumer. A prototype of the proposed system, which provides proof of concept for the novel processes involved, has been developed. The paper describes the Smart Learning Design Framework and places it within the context of alternative learning object models and frameworks to highlight similarities, differences and advantages of the proposed system.

    Multi-Object Classification and Unsupervised Scene Understanding Using Deep Learning Features and Latent Tree Probabilistic Models

    Deep learning has shown state-of-the-art classification performance on datasets such as ImageNet, which contain a single object in each image. However, multi-object classification is far more challenging. We present a unified framework which leverages the strengths of multiple machine learning methods, viz. deep learning, probabilistic models and kernel methods, to obtain state-of-the-art performance on Microsoft COCO, which consists of non-iconic images. We incorporate contextual information in natural images through a conditional latent tree probabilistic model (CLTM), where the object co-occurrences are conditioned on the fc7 features extracted from a pre-trained ImageNet CNN. We learn the CLTM tree structure using conditional pairwise probabilities for object co-occurrences, estimated through kernel methods, and we learn its node and edge potentials by training a new 3-layer neural network, which takes fc7 features as input. Object classification is carried out via inference on the learnt conditional tree model, and we obtain significant gains in precision-recall and F-measures on MS-COCO, especially for difficult object categories. Moreover, the latent variables in the CLTM capture scene information: the images with top activations for a latent node have common themes, such as grasslands or food scenes, and so on. In addition, we show that a simple k-means clustering of the inferred latent nodes alone significantly improves scene classification performance on the MIT-Indoor dataset, without the need for any retraining, and without using scene labels during training. Thus, we present a unified framework for multi-object classification and unsupervised scene understanding.

    Multidimensional Membership Mixture Models

    We present the multidimensional membership mixture (M3) models where every dimension of the membership represents an independent mixture model and each data point is generated from the selected mixture components jointly. This is helpful when the data has a certain shared structure. For example, three unique means and three unique variances can effectively form a Gaussian mixture model with nine components, while requiring only six parameters to fully describe it. In this paper, we present three instantiations of M3 models (together with the learning and inference algorithms): infinite, finite, and hybrid, depending on whether the number of mixtures is fixed or not. They are built upon Dirichlet process mixture models, latent Dirichlet allocation, and a combination respectively. We then consider two applications: topic modeling and learning 3D object arrangements. Our experiments show that our M3 models achieve better performance using fewer topics than many classic topic models. We also observe that topics from the different dimensions of M3 models are meaningful and orthogonal to each other. Comment: 9 pages, 7 figures
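The parameter-sharing example in the abstract above (three means and three variances yielding nine Gaussian components) can be illustrated with a small sketch. This is not the authors' learning or inference algorithm. It only enumerates the joint components and draws one sample, and it assumes uniform selection in each dimension. The numeric values and the names `m3_gaussian_components` and `sample_point` are hypothetical.

```python
import math
import random
from itertools import product


def m3_gaussian_components(means, variances):
    """Cross every mean with every variance to list the joint components."""
    return list(product(means, variances))


def sample_point(means, variances, rng):
    """Pick one mean and one variance independently, then draw a Gaussian sample.

    Uniform selection per dimension is an assumption made for illustration;
    mixing weights are not modeled here.
    """
    mean = rng.choice(means)
    variance = rng.choice(variances)
    return rng.gauss(mean, math.sqrt(variance))


# Hypothetical parameter values, chosen only for illustration.
means = [-2.0, 0.0, 2.0]
variances = [0.5, 1.0, 2.0]

components = m3_gaussian_components(means, variances)
n_components = len(components)                 # 3 x 3 joint components
n_shared_params = len(means) + len(variances)  # 3 + 3 shared parameters

print(n_components, n_shared_params)  # prints: 9 6

rng = random.Random(0)
print(sample_point(means, variances, rng))
```

The count of six covers the means and variances only, matching the abstract's example. A separate nine-component mixture would need eighteen such parameters.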

    Generative Model with Coordinate Metric Learning for Object Recognition Based on 3D Models

    Given a large amount of real photos for training, convolutional neural networks show excellent performance on object recognition tasks. However, collecting the data is tedious and the available backgrounds are limited, which makes it hard to build a perfect database. In this paper, our generative model, trained with synthetic images rendered from 3D models, reduces the workload of data collection and the limitations on conditions. Our structure is composed of two sub-networks: a semantic foreground object reconstruction network based on Bayesian inference, and a classification network based on a multi-triplet cost function. The cost function is designed to avoid over-fitting on monotone surfaces and to fully utilize pose information by establishing a sphere-like distribution of descriptors in each category, which helps recognition on regular photos according to the poses, lighting conditions, background and category information of the rendered images. First, our conjugate structure, called generative model with metric learning, uses additional foreground object channels generated from Bayesian rendering as the joint between the two sub-networks. The pose-based multi-triplet cost function for object recognition is used for metric learning, which makes it possible to train a category classifier purely on synthetic data. Second, we design a coordinate training strategy in which adaptive noise corrupts the input images, helping both sub-networks benefit from each other and avoiding inharmonious parameter tuning caused by the different convergence speeds of the two sub-networks. Our structure achieves state-of-the-art accuracy of over 50% on the ShapeNet database despite the data migration obstacle from synthetic images to real photos. This pipeline makes it possible to perform recognition on real images based only on 3D models. Comment: 14 pages
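The abstract above does not define its multi-triplet cost function, so the sketch below shows only the standard single-triplet hinge loss that such metric-learning objectives commonly build on. It is a generic illustration, not the authors' formulation. The function names, the Euclidean distance, and the margin value are all assumptions.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet hinge loss.

    The loss is zero once the negative is farther from the anchor than the
    positive by at least `margin`; otherwise it grows linearly.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Hypothetical 2-D descriptors, chosen only for illustration.
anchor = [0.0, 0.0]
positive = [0.0, 1.0]       # distance 1 from the anchor
far_negative = [3.0, 4.0]   # distance 5 from the anchor
near_negative = [1.0, 0.0]  # distance 1 from the anchor

print(triplet_loss(anchor, positive, far_negative))   # prints: 0.0
print(triplet_loss(anchor, positive, near_negative))  # prints: 1.0
```

In the first call the negative is already well separated, so the triplet contributes nothing. In the second the negative is as close as the positive, so the full margin is incurred.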