15,669 research outputs found

    A Reverse Hierarchy Model for Predicting Eye Fixations

    Full text link
    A number of psychological and physiological evidences suggest that early visual attention works in a coarse-to-fine way, which lays a basis for the reverse hierarchy theory (RHT). This theory states that attention propagates from the top level of the visual hierarchy that processes gist and abstract information of input, to the bottom level that processes local details. Inspired by the theory, we develop a computational model for saliency detection in images. First, the original image is downsampled to different scales to constitute a pyramid. Then, saliency on each layer is obtained by image super-resolution reconstruction from the layer above, which is defined as unpredictability from this coarse-to-fine reconstruction. Finally, saliency on each layer of the pyramid is fused into stochastic fixations through a probabilistic model, where attention initiates from the top layer and propagates downward through the pyramid. Extensive experiments on two standard eye-tracking datasets show that the proposed method can achieve competitive results with state-of-the-art models.Comment: CVPR 2014, 27th IEEE Conference on Computer Vision and Pattern Recognition (CVPR). CVPR 201

    Scale Stain: Multi-Resolution Feature Enhancement in Pathology Visualization

    Full text link
    Digital whole-slide images of pathological tissue samples have recently become feasible for use within routine diagnostic practice. These gigapixel sized images enable pathologists to perform reviews using computer workstations instead of microscopes. Existing workstations visualize scanned images by providing a zoomable image space that reproduces the capabilities of the microscope. This paper presents a novel visualization approach that enables filtering of the scale-space according to color preference. The visualization method reveals diagnostically important patterns that are otherwise not visible. The paper demonstrates how this approach has been implemented into a fully functional prototype that lets the user navigate the visualization parameter space in real time. The prototype was evaluated for two common clinical tasks with eight pathologists in a within-subjects study. The data reveal that task efficiency increased by 15% using the prototype, with maintained accuracy. By analyzing behavioral strategies, it was possible to conclude that efficiency gain was caused by a reduction of the panning needed to perform systematic search of the images. The prototype system was well received by the pathologists who did not detect any risks that would hinder use in clinical routine

    Locally Non-rigid Registration for Mobile HDR Photography

    Full text link
    Image registration for stack-based HDR photography is challenging. If not properly accounted for, camera motion and scene changes result in artifacts in the composite image. Unfortunately, existing methods to address this problem are either accurate, but too slow for mobile devices, or fast, but prone to failing. We propose a method that fills this void: our approach is extremely fast---under 700ms on a commercial tablet for a pair of 5MP images---and prevents the artifacts that arise from insufficient registration quality

    StairNet: Top-Down Semantic Aggregation for Accurate One Shot Detection

    Full text link
    One-stage object detectors such as SSD or YOLO already have shown promising accuracy with small memory footprint and fast speed. However, it is widely recognized that one-stage detectors have difficulty in detecting small objects while they are competitive with two-stage methods on large objects. In this paper, we investigate how to alleviate this problem starting from the SSD framework. Due to their pyramidal design, the lower layer that is responsible for small objects lacks strong semantics(e.g contextual information). We address this problem by introducing a feature combining module that spreads out the strong semantics in a top-down manner. Our final model StairNet detector unifies the multi-scale representations and semantic distribution effectively. Experiments on PASCAL VOC 2007 and PASCAL VOC 2012 datasets demonstrate that StairNet significantly improves the weakness of SSD and outperforms the other state-of-the-art one-stage detectors

    Multiscale approaches to music audio feature learning

    Get PDF
    Content-based music information retrieval tasks are typically solved with a two-stage approach: features are extracted from music audio signals, and are then used as input to a regressor or classifier. These features can be engineered or learned from data. Although the former approach was dominant in the past, feature learning has started to receive more attention from the MIR community in recent years. Recent results in feature learning indicate that simple algorithms such as K-means can be very effective, sometimes surpassing more complicated approaches based on restricted Boltzmann machines, autoencoders or sparse coding. Furthermore, there has been increased interest in multiscale representations of music audio recently. Such representations are more versatile because music audio exhibits structure on multiple timescales, which are relevant for different MIR tasks to varying degrees. We develop and compare three approaches to multiscale audio feature learning using the spherical K-means algorithm. We evaluate them in an automatic tagging task and a similarity metric learning task on the Magnatagatune dataset
    • …
    corecore