38,517 research outputs found

    Towards a quantitative measure of rareness

    Get PDF
    Within the context of detection of incongruent events, an often overlooked aspect is how a system should react to the detection. The set of all the possible actions is certainly conditioned by the task at hand, and by the embodiment of the artificial cognitive system under consideration. Still, we argue that a desirable action that does not depend from these factors is to update the internal model and learn the new detected event. This paper proposes a recent transfer learning algorithm as the way to address this issue. A notable feature of the proposed model is its capability to learn from small samples, even a single one. This is very desirable in this context, as we cannot expect to have too many samples to learn from, given the very nature of incongruent events. We also show that one of the internal parameters of the algorithm makes it possible to quantitatively measure incongruence of detected events. Experiments on two different datasets support our claim

    K-Space at TRECVid 2007

    Get PDF
    In this paper we describe K-Space participation in TRECVid 2007. K-Space participated in two tasks, high-level feature extraction and interactive search. We present our approaches for each of these activities and provide a brief analysis of our results. Our high-level feature submission utilized multi-modal low-level features which included visual, audio and temporal elements. Specific concept detectors (such as Face detectors) developed by K-Space partners were also used. We experimented with different machine learning approaches including logistic regression and support vector machines (SVM). Finally we also experimented with both early and late fusion for feature combination. This year we also participated in interactive search, submitting 6 runs. We developed two interfaces which both utilized the same retrieval functionality. Our objective was to measure the effect of context, which was supported to different degrees in each interface, on user performance. The first of the two systems was a ā€˜shotā€™ based interface, where the results from a query were presented as a ranked list of shots. The second interface was ā€˜broadcastā€™ based, where results were presented as a ranked list of broadcasts. Both systems made use of the outputs of our high-level feature submission as well as low-level visual features

    High-level feature detection from video in TRECVid: a 5-year retrospective of achievements

    Get PDF
    Successful and effective content-based access to digital video requires fast, accurate and scalable methods to determine the video content automatically. A variety of contemporary approaches to this rely on text taken from speech within the video, or on matching one video frame against others using low-level characteristics like colour, texture, or shapes, or on determining and matching objects appearing within the video. Possibly the most important technique, however, is one which determines the presence or absence of a high-level or semantic feature, within a video clip or shot. By utilizing dozens, hundreds or even thousands of such semantic features we can support many kinds of content-based video navigation. Critically however, this depends on being able to determine whether each feature is or is not present in a video clip. The last 5 years have seen much progress in the development of techniques to determine the presence of semantic features within video. This progress can be tracked in the annual TRECVid benchmarking activity where dozens of research groups measure the effectiveness of their techniques on common data and using an open, metrics-based approach. In this chapter we summarise the work done on the TRECVid high-level feature task, showing the progress made year-on-year. This provides a fairly comprehensive statement on where the state-of-the-art is regarding this important task, not just for one research group or for one approach, but across the spectrum. We then use this past and on-going work as a basis for highlighting the trends that are emerging in this area, and the questions which remain to be addressed before we can achieve large-scale, fast and reliable high-level feature detection on video

    Is Deep Learning Safe for Robot Vision? Adversarial Examples against the iCub Humanoid

    Full text link
    Deep neural networks have been widely adopted in recent years, exhibiting impressive performances in several application domains. It has however been shown that they can be fooled by adversarial examples, i.e., images altered by a barely-perceivable adversarial noise, carefully crafted to mislead classification. In this work, we aim to evaluate the extent to which robot-vision systems embodying deep-learning algorithms are vulnerable to adversarial examples, and propose a computationally efficient countermeasure to mitigate this threat, based on rejecting classification of anomalous inputs. We then provide a clearer understanding of the safety properties of deep networks through an intuitive empirical analysis, showing that the mapping learned by such networks essentially violates the smoothness assumption of learning algorithms. We finally discuss the main limitations of this work, including the creation of real-world adversarial examples, and sketch promising research directions.Comment: Accepted for publication at the ICCV 2017 Workshop on Vision in Practice on Autonomous Robots (ViPAR

    A brief network analysis of Artificial Intelligence publication

    Full text link
    In this paper, we present an illustration to the history of Artificial Intelligence(AI) with a statistical analysis of publish since 1940. We collected and mined through the IEEE publish data base to analysis the geological and chronological variance of the activeness of research in AI. The connections between different institutes are showed. The result shows that the leading community of AI research are mainly in the USA, China, the Europe and Japan. The key institutes, authors and the research hotspots are revealed. It is found that the research institutes in the fields like Data Mining, Computer Vision, Pattern Recognition and some other fields of Machine Learning are quite consistent, implying a strong interaction between the community of each field. It is also showed that the research of Electronic Engineering and Industrial or Commercial applications are very active in California. Japan is also publishing a lot of papers in robotics. Due to the limitation of data source, the result might be overly influenced by the number of published articles, which is to our best improved by applying network keynode analysis on the research community instead of merely count the number of publish.Comment: 18 pages, 7 figure

    Compensating inaccurate annotations to train 3D facial landmark localisation models

    Get PDF
    In this paper we investigate the impact of inconsistency in manual annotations when they are used to train automatic models for 3D facial landmark localization. We start by showing that it is possible to objectively measure the consistency of annotations in a database, provided that it contains replicates (i.e. repeated scans from the same person). Applying such measure to the widely used FRGC database we ļ¬nd that manual annotations currently available are suboptimal and can strongly impair the accuracy of automatic models learnt therefrom. To address this issue, we present a simple algorithm to automatically correct a set of annotations and show that it can help to signiļ¬cantly improve the accuracy of the models in terms of landmark localization errors. This improvement is observed even when errors are measured with respect to the original (not corrected) annotations. However, we also show that if errors are computed against an alternative set of manual annotations with higher consistency, the accuracy of the models constructed using the corrections from the presented algorithm tends to converge to the one achieved by building the models on the alternative,more consistent set
    • ā€¦
    corecore