11,527 research outputs found

    Contextual Object Detection with a Few Relevant Neighbors

    Full text link
    A natural way to improve the detection of objects is to consider the contextual constraints imposed by the detection of additional objects in a given scene. In this work, we exploit the spatial relations between objects in order to improve detection capacity, as well as analyze various properties of the contextual object detection problem. To precisely calculate context-based probabilities of objects, we developed a model that examines the interactions between objects in an exact probabilistic setting, in contrast to previous methods that typically utilize approximations based on pairwise interactions. Such a scheme is facilitated by the realistic assumption that the existence of an object in any given location is influenced by only few informative locations in space. Based on this assumption, we suggest a method for identifying these relevant locations and integrating them into a mostly exact calculation of probability based on their raw detector responses. This scheme is shown to improve detection results and provides unique insights about the process of contextual inference for object detection. We show that it is generally difficult to learn that a particular object reduces the probability of another, and that in cases when the context and detector strongly disagree this learning becomes virtually impossible for the purposes of improving the results of an object detector. Finally, we demonstrate improved detection results through use of our approach as applied to the PASCAL VOC and COCO datasets

    Unsupervised Understanding of Location and Illumination Changes in Egocentric Videos

    Full text link
    Wearable cameras stand out as one of the most promising devices for the upcoming years, and as a consequence, the demand of computer algorithms to automatically understand the videos recorded with them is increasing quickly. An automatic understanding of these videos is not an easy task, and its mobile nature implies important challenges to be faced, such as the changing light conditions and the unrestricted locations recorded. This paper proposes an unsupervised strategy based on global features and manifold learning to endow wearable cameras with contextual information regarding the light conditions and the location captured. Results show that non-linear manifold methods can capture contextual patterns from global features without compromising large computational resources. The proposed strategy is used, as an application case, as a switching mechanism to improve the hand-detection problem in egocentric videos.Comment: Submitted for publicatio

    The More You Know: Using Knowledge Graphs for Image Classification

    Full text link
    One characteristic that sets humans apart from modern learning-based computer vision algorithms is the ability to acquire knowledge about the world and use that knowledge to reason about the visual world. Humans can learn about the characteristics of objects and the relationships that occur between them to learn a large variety of visual concepts, often with few examples. This paper investigates the use of structured prior knowledge in the form of knowledge graphs and shows that using this knowledge improves performance on image classification. We build on recent work on end-to-end learning on graphs, introducing the Graph Search Neural Network as a way of efficiently incorporating large knowledge graphs into a vision classification pipeline. We show in a number of experiments that our method outperforms standard neural network baselines for multi-label classification.Comment: CVPR 201

    Finding Streams in Knowledge Graphs to Support Fact Checking

    Full text link
    The volume and velocity of information that gets generated online limits current journalistic practices to fact-check claims at the same rate. Computational approaches for fact checking may be the key to help mitigate the risks of massive misinformation spread. Such approaches can be designed to not only be scalable and effective at assessing veracity of dubious claims, but also to boost a human fact checker's productivity by surfacing relevant facts and patterns to aid their analysis. To this end, we present a novel, unsupervised network-flow based approach to determine the truthfulness of a statement of fact expressed in the form of a (subject, predicate, object) triple. We view a knowledge graph of background information about real-world entities as a flow network, and knowledge as a fluid, abstract commodity. We show that computational fact checking of such a triple then amounts to finding a "knowledge stream" that emanates from the subject node and flows toward the object node through paths connecting them. Evaluation on a range of real-world and hand-crafted datasets of facts related to entertainment, business, sports, geography and more reveals that this network-flow model can be very effective in discerning true statements from false ones, outperforming existing algorithms on many test cases. Moreover, the model is expressive in its ability to automatically discover several useful path patterns and surface relevant facts that may help a human fact checker corroborate or refute a claim.Comment: Extended version of the paper in proceedings of ICDM 201

    Adaptive Nonparametric Image Parsing

    Get PDF
    In this paper, we present an adaptive nonparametric solution to the image parsing task, namely annotating each image pixel with its corresponding category label. For a given test image, first, a locality-aware retrieval set is extracted from the training data based on super-pixel matching similarities, which are augmented with feature extraction for better differentiation of local super-pixels. Then, the category of each super-pixel is initialized by the majority vote of the kk-nearest-neighbor super-pixels in the retrieval set. Instead of fixing kk as in traditional non-parametric approaches, here we propose a novel adaptive nonparametric approach which determines the sample-specific k for each test image. In particular, kk is adaptively set to be the number of the fewest nearest super-pixels which the images in the retrieval set can use to get the best category prediction. Finally, the initial super-pixel labels are further refined by contextual smoothing. Extensive experiments on challenging datasets demonstrate the superiority of the new solution over other state-of-the-art nonparametric solutions.Comment: 11 page
    • …
    corecore