71,230 research outputs found

    DeViL: Decoding Vision features into Language

    Post-hoc explanation methods have often been criticised for abstracting away the decision-making process of deep neural networks. In this work, we would like to provide natural language descriptions for what different layers of a vision backbone have learned. Our DeViL method decodes vision features into language, not only highlighting the attribution locations but also generating textual descriptions of visual features at different layers of the network. We train a transformer network to translate individual image features of any vision layer into a prompt that a separate off-the-shelf language model decodes into natural language. By employing dropout both per-layer and per-spatial-location, our model can generalize training on image-text pairs to generate localized explanations. As it uses a pre-trained language model, our approach is fast to train, can be applied to any vision backbone, and produces textual descriptions at different layers of the vision network. Moreover, DeViL can create open-vocabulary attribution maps corresponding to words or phrases even outside the training scope of the vision model. We demonstrate that DeViL generates textual descriptions relevant to the image content on CC3M, surpassing previous lightweight captioning models, and attribution maps uncovering the learned concepts of the vision backbone. Finally, we show DeViL also outperforms the current state-of-the-art on the neuron-wise descriptions of the MILANNOTATIONS dataset. Code available at https://github.com/ExplainableML/DeViL. Comment: Accepted at GCPR 2023 (Oral).
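    The mechanism the abstract describes can be sketched minimally as follows. All dimensions, the plain linear "translator", and the pooling scheme are illustrative assumptions for exposition, not DeViL's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (assumptions, not DeViL's actual sizes):
d_vis, d_lm, n_prefix = 512, 768, 10   # vision feature dim, LM embedding dim, prompt length
H = W = 7                              # spatial grid of the chosen vision layer

# A frozen vision layer yields one feature vector per spatial location.
features = rng.standard_normal((H, W, d_vis))

# The trainable "translator" maps a pooled feature to n_prefix soft-prompt
# vectors in the language model's embedding space.
W_translate = rng.standard_normal((d_vis, n_prefix * d_lm)) * 0.02

def spatial_dropout_pool(feats, keep_prob=0.5, rng=rng):
    """Randomly drop spatial locations, then average-pool the survivors.
    Training under this dropout is what lets the model describe a single
    location at test time (localized explanations)."""
    mask = rng.random((feats.shape[0], feats.shape[1])) < keep_prob
    if not mask.any():                 # keep at least one location
        mask.flat[0] = True
    return feats[mask].mean(axis=0)

pooled = spatial_dropout_pool(features)
soft_prompt = (pooled @ W_translate).reshape(n_prefix, d_lm)
# `soft_prompt` would be prepended to the token embeddings of a frozen
# off-the-shelf language model, which then decodes a textual description.
print(soft_prompt.shape)  # (10, 768)
```

    At inference, restricting the mask to one spatial location yields a description of what the backbone encodes there, which is where the attribution maps come from.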

    An algorithm for energy-efficient bluetooth scatternet formation and maintenance

    We discuss an energy-efficient, distributed Bluetooth Scatternet Formation algorithm based on Device and Link characteristics (SF-DeviL). SF-DeviL forms multihop scatternets with tree topologies and increases the battery lifetimes of devices by taking device types, battery levels, and received signal strengths into account. The topology is dynamically reconfigured in SF-DeviL as battery levels deplete, and simulations show that network lifetime is increased by at least 32% compared to the LMS algorithm [1].
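    The three selection criteria named above might be combined into a single master-suitability score along these lines. The device classes, weights, and normalisation ranges are hypothetical illustrations, not values from the paper:

```python
def master_suitability(device_class, battery_level, rssi_dbm):
    """Hypothetical SF-DeviL-style score: prefer high-capability device
    classes, fuller batteries, and stronger links. Weights (0.5/0.3/0.2)
    and ranges are illustrative, not taken from the paper."""
    type_weight = {"mains": 1.0, "laptop": 0.7, "phone": 0.4, "sensor": 0.2}[device_class]
    battery = battery_level / 100.0                 # 0..100 %  -> 0..1
    link = (rssi_dbm + 100) / 60.0                  # ~-100..-40 dBm -> 0..1
    link = min(max(link, 0.0), 1.0)
    return 0.5 * type_weight + 0.3 * battery + 0.2 * link

# Among neighbouring candidates, the highest-scoring device becomes master.
candidates = [("mains", 100, -60), ("phone", 80, -45), ("sensor", 30, -90)]
master = max(candidates, key=lambda c: master_suitability(*c))
print(master[0])  # mains
```

    Re-evaluating the score as battery levels drop is what would drive the dynamic reconfiguration the abstract mentions.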

    The Devil is in the Tails: Fine-grained Classification in the Wild

    The world is long-tailed. What does this mean for computer vision and visual recognition? The main two implications are (1) the number of categories we need to consider in applications can be very large, and (2) the number of training examples for most categories can be very small. Current visual recognition algorithms have achieved excellent classification accuracy. However, they require many training examples to reach peak performance, which suggests that long-tailed distributions will not be dealt with well. We analyze this question in the context of eBird, a large fine-grained classification dataset, and a state-of-the-art deep network classification algorithm. We find that (a) peak classification performance on well-represented categories is excellent, (b) given enough data, classification performance suffers only minimally from an increase in the number of classes, (c) classification performance decays precipitously as the number of training examples decreases, (d) surprisingly, transfer learning is virtually absent in current methods. Our findings suggest that our community should come to grips with the question of long tails.

    The Devil is in the Decoder: Classification, Regression and GANs

    Many machine vision applications, such as semantic segmentation and depth prediction, require predictions for every pixel of the input image. Models for such problems usually consist of encoders, which decrease spatial resolution while learning a high-dimensional representation, followed by decoders, which recover the original input resolution and produce low-dimensional predictions. While encoders have been studied rigorously, relatively few studies address the decoder side. This paper presents an extensive comparison of a variety of decoders for a variety of pixel-wise tasks ranging from classification and regression to synthesis. Our contributions are: (1) Decoders matter: we observe significant variance in results between different types of decoders on various problems. (2) We introduce new residual-like connections for decoders. (3) We introduce a novel decoder: bilinear additive upsampling. (4) We explore prediction artifacts.
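    A minimal sketch of bilinear additive upsampling as a parameter-free decoder step: bilinearly upsample the feature map, then reduce each group of factor² consecutive channels, so the total activation volume is preserved. The numpy implementation below is an illustrative reconstruction from the idea's name, not the authors' code; whether the paper sums or averages channel groups differs from this sketch only by a constant scale:

```python
import numpy as np

def bilinear_upsample(x, factor):
    """Bilinearly resize a (H, W, C) feature map by an integer factor
    (half-pixel-centre convention, edges clamped)."""
    H, W, _ = x.shape
    ys = (np.arange(H * factor) + 0.5) / factor - 0.5
    xs = (np.arange(W * factor) + 0.5) / factor - 0.5
    y0 = np.clip(np.floor(ys).astype(int), 0, H - 1)
    y1 = np.clip(y0 + 1, 0, H - 1)
    x0 = np.clip(np.floor(xs).astype(int), 0, W - 1)
    x1 = np.clip(x0 + 1, 0, W - 1)
    wy = np.clip(ys - y0, 0, 1)[:, None, None]   # vertical interpolation weights
    wx = np.clip(xs - x0, 0, 1)[None, :, None]   # horizontal interpolation weights
    top = x[y0][:, x0] * (1 - wx) + x[y0][:, x1] * wx
    bot = x[y1][:, x0] * (1 - wx) + x[y1][:, x1] * wx
    return top * (1 - wy) + bot * wy

def bilinear_additive_upsample(x, factor):
    """Upsample spatially by `factor`, then add every factor**2 consecutive
    channels, keeping total activation volume constant with no learned
    parameters -- which is what makes it usable as a residual connection."""
    up = bilinear_upsample(x, factor)
    H, W, C = up.shape
    r = factor ** 2
    assert C % r == 0, "channel count must be divisible by factor**2"
    return up.reshape(H, W, C // r, r).sum(axis=3)

feat = np.random.default_rng(0).standard_normal((4, 4, 8))
out = bilinear_additive_upsample(feat, 2)
print(feat.shape, "->", out.shape)  # (4, 4, 8) -> (8, 8, 2)
```

    Because the operation is deterministic and parameter-free, it can be added to any learned decoder path as a skip connection without increasing model size.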

    Field Measurements of Terrestrial and Martian Dust Devils

    Surface-based measurements of terrestrial and martian dust devils/convective vortices provided from mobile and stationary platforms are discussed. Imaging of terrestrial dust devils has quantified their rotational and vertical wind speeds, translation speeds, dimensions, dust load, and frequency of occurrence. Imaging of martian dust devils has provided translation speeds and constraints on dimensions, but only limited constraints on vertical motion within a vortex. The longer mission durations on Mars afforded by long-operating robotic landers and rovers have provided statistical quantification of vortex occurrence (time-of-sol, and recently seasonal) that has until recently not been a primary outcome of more temporally limited terrestrial dust devil measurement campaigns. Terrestrial measurement campaigns have included a more extensive range of measured vortex parameters (pressure, wind, morphology, etc.) than have martian opportunities, with electric field and direct measure of dust abundance not yet obtained on Mars. No martian robotic mission has yet provided contemporaneous high-frequency wind and pressure measurements. Comparison of measured terrestrial and martian dust devil characteristics suggests that martian dust devils are larger and possess faster maximum rotational wind speeds, that the absolute magnitude of the pressure deficit within a terrestrial dust devil is an order of magnitude greater than within a martian dust devil, and that the time-of-day variation in vortex frequency is similar. Recent terrestrial investigations have demonstrated the presence of diagnostic dust devil signals within seismic and infrasound measurements; an upcoming Mars robotic mission will obtain similar measurement types.

    Spartan Daily September 22, 2010

    Volume 135, Issue 13
    https://scholarworks.sjsu.edu/spartandaily/1176/thumbnail.jp

    The Devil of Face Recognition is in the Noise

    The growing scale of face recognition datasets empowers us to train strong convolutional networks for face recognition. While a variety of architectures and loss functions have been devised, we still have a limited understanding of the source and consequence of label noise inherent in existing datasets. We make the following contributions: 1) We contribute cleaned subsets of popular face databases, i.e., the MegaFace and MS-Celeb-1M datasets, and build a new large-scale noise-controlled IMDb-Face dataset. 2) With the original datasets and cleaned subsets, we profile and analyze the label noise properties of MegaFace and MS-Celeb-1M. We show that a few orders of magnitude more samples are needed to achieve the same accuracy yielded by a clean subset. 3) We study the association between different types of noise, i.e., label flips and outliers, and the accuracy of face recognition models. 4) We investigate ways to improve data cleanliness, including a comprehensive user study on the influence of data labeling strategies on annotation accuracy. The IMDb-Face dataset has been released at https://github.com/fwang91/IMDb-Face. Comment: accepted to ECCV'18.
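    One of the noise types studied above, label flips, can be simulated at a controlled rate as in the toy sketch below; the function and rate are illustrative and unrelated to the actual IMDb-Face annotation pipeline:

```python
import numpy as np

rng = np.random.default_rng(1)

def flip_labels(labels, noise_rate, n_classes, rng=rng):
    """Symmetric label-flip noise: each label is replaced, with probability
    `noise_rate`, by a different class chosen uniformly at random. Adding a
    nonzero offset modulo n_classes guarantees the new label differs."""
    labels = labels.copy()
    flip = rng.random(labels.shape) < noise_rate
    offsets = rng.integers(1, n_classes, size=labels.shape)
    labels[flip] = (labels[flip] + offsets[flip]) % n_classes
    return labels

y = rng.integers(0, 10, size=10_000)
noisy = flip_labels(y, noise_rate=0.3, n_classes=10)
print((noisy != y).mean())  # ≈ 0.3, since every flipped label differs
```

    Retraining a fixed model at several such rates is one way to chart how accuracy degrades with label-flip noise, complementing the outlier-noise analysis the abstract describes.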