766 research outputs found

    To go deep or wide in learning?

    To achieve acceptable performance on AI tasks, one can either use sophisticated feature extraction methods as the first layer in a two-layered supervised learning model, or learn the features directly using a deep (multi-layered) model. While the first approach is very problem-specific, the second incurs computational overhead in learning multiple layers and fine-tuning the model. In this paper, we propose an approach called wide learning, based on arc-cosine kernels, that learns a single layer of infinite width. We propose exact and inexact learning strategies for wide learning and show that wide learning with a single layer outperforms both single-layer and deep architectures of finite width on some benchmark datasets.
    Comment: 9 pages, 1 figure, Accepted for publication in Seventeenth International Conference on Artificial Intelligence and Statistic

    Learning to segment with image-level supervision

    Deep convolutional networks have achieved the state of the art for semantic image segmentation tasks. However, training these networks requires access to densely labeled images, which are known to be very expensive to obtain. On the other hand, the web provides an almost unlimited source of images annotated at the image level. How can one utilize this much larger weakly annotated set for tasks that require dense labeling? Prior work often relied on localization cues, such as saliency maps, objectness priors, and bounding boxes, to address this challenging problem. In this paper, we propose a model that generates auxiliary labels for each image, while simultaneously forcing the output of the CNN to satisfy the mean-field constraints imposed by a conditional random field. We show that one can enforce the CRF constraints by forcing the distribution at each pixel to be close to the distribution of its neighbors. This is in stark contrast with methods that compute a recursive expansion of the mean-field distribution using a recurrent architecture and train the resultant distribution. Instead, the proposed model adds an extra loss term to the output of the CNN, and hence is faster than recursive implementations. We achieve the state of the art for weakly supervised semantic image segmentation on the VOC 2012 dataset, assuming no manually labeled pixel-level information is available. Furthermore, the incorporation of conditional random fields in the CNN incurs little extra time during training.
    Comment: Published in WACV 201
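
The abstract above describes an extra loss term that pushes the class distribution at each pixel toward the distribution of its neighbors. The sketch below is one hypothetical way such a term could look: the average KL divergence between each pixel's distribution and the mean distribution of its 4-connected neighbors. The choice of KL divergence, the 4-neighborhood, the uniform neighbor weighting, and the name `neighbor_consistency_loss` are all assumptions made for illustration. The abstract does not specify any of them, and a CRF would typically weight neighbors by appearance and position.

```python
import math

def neighbor_consistency_loss(probs, eps=1e-12):
    """Mean KL(pixel || average of 4-connected neighbors) over an image.

    probs[i][j] is a list of class probabilities for pixel (i, j).
    Illustrative sketch only: not the loss defined in the paper.
    """
    height, width = len(probs), len(probs[0])
    total = 0.0
    for i in range(height):
        for j in range(width):
            # Collect the neighbors that fall inside the image bounds.
            neighbors = [
                probs[a][b]
                for a, b in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
                if 0 <= a < height and 0 <= b < width
            ]
            if not neighbors:
                continue  # a 1x1 image has no neighbors to agree with
            num_classes = len(probs[i][j])
            mean = [
                sum(n[c] for n in neighbors) / len(neighbors)
                for c in range(num_classes)
            ]
            # eps guards against log(0) and division by zero.
            total += sum(
                p * math.log((p + eps) / (q + eps))
                for p, q in zip(probs[i][j], mean)
            )
    return total / (height * width)
```

In training, a term like this would be added to the main segmentation loss with some weighting factor. It is computed in a single pass over the output, which is consistent with the abstract's point that an extra loss term avoids the cost of a recurrent mean-field expansion.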