
    Weakly Supervised Localization using Deep Feature Maps

    Object localization is an important computer vision problem with a variety of applications. The lack of large-scale object-level annotations and the relative abundance of image-level labels make a compelling case for weak supervision in the object localization task. Deep Convolutional Neural Networks are a class of state-of-the-art methods for the related problem of object recognition. In this paper, we describe a novel object localization algorithm which uses classification networks trained on only image labels. This weakly supervised method leverages local spatial and semantic patterns captured in the convolutional layers of classification networks. We propose an efficient beam-search-based approach to detect and localize multiple objects in images. The proposed method significantly outperforms the state-of-the-art on standard object localization datasets, with an 8 point increase in mAP scores.
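    The beam-search idea in this abstract can be illustrated on a 2D activation map: seed a box at the strongest activation, then grow candidate boxes one row or column at a time, keeping the best few at each step. The following is a minimal sketch under assumed scoring (activation covered minus an area penalty) and growth moves; it is not the paper's actual algorithm, and `box_score`, `beam_localize`, and their parameters are illustrative names.

```python
import numpy as np

def box_score(act, box, area_penalty=0.5):
    # Reward activation covered by the box, penalize its area so the
    # search does not simply expand to cover the whole map.
    r0, c0, r1, c1 = box
    covered = act[r0:r1, c0:c1].sum()
    area = (r1 - r0) * (c1 - c0)
    return covered - area_penalty * area * act.sum() / act.size

def beam_localize(act, beam_width=3, steps=20):
    # Seed a 1x1 box at the strongest activation, then grow boxes one
    # row/column at a time, keeping the beam_width best at each step.
    h, w = act.shape
    r, c = np.unravel_index(np.argmax(act), act.shape)
    beam = [(int(r), int(c), int(r) + 1, int(c) + 1)]
    for _ in range(steps):
        candidates = set(beam)
        for r0, c0, r1, c1 in beam:
            candidates.update([
                (max(r0 - 1, 0), c0, r1, c1),
                (r0, max(c0 - 1, 0), r1, c1),
                (r0, c0, min(r1 + 1, h), c1),
                (r0, c0, r1, min(c1 + 1, w)),
            ])
        beam = sorted(candidates, key=lambda b: box_score(act, b),
                      reverse=True)[:beam_width]
    return beam[0]

act = np.zeros((8, 8))
act[2:5, 3:6] = 1.0  # a bright 3x3 region stands in for an object's activations
print(beam_localize(act))  # → (2, 3, 5, 6)
```

    In the paper's setting, `act` would come from a convolutional layer of a classification network rather than being constructed by hand.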

    The CoNLL 2007 shared task on dependency parsing

    The Conference on Computational Natural Language Learning features a shared task, in which participants train and test their learning systems on the same data sets. In 2007, as in 2006, the shared task was devoted to dependency parsing, this year with both a multilingual track and a domain adaptation track. In this paper, we define the tasks of the different tracks and describe how the data sets were created from existing treebanks for ten languages. In addition, we characterize the different approaches of the participating systems, report the test results, and provide a first analysis of these results.

    Seeing What You're Told: Sentence-Guided Activity Recognition In Video

    We present a system that demonstrates how the compositional structure of events, in concert with the compositional structure of language, can interact with the underlying focusing mechanisms in video action recognition, thereby providing a medium not only for top-down and bottom-up integration, but also for multi-modal integration between vision and language. We show how the roles played by participants (nouns), their characteristics (adjectives), the actions performed (verbs), the manner of such actions (adverbs), and changing spatial relations between participants (prepositions), in the form of whole sentential descriptions mediated by a grammar, guide the activity-recognition process. Further, the utility and expressiveness of our framework are demonstrated by performing three separate tasks in the domain of multi-activity videos: sentence-guided focus of attention, generation of sentential descriptions of video, and query-based video search, simply by leveraging the framework in different manners. Comment: To appear in CVPR 201

    Hazard Perception Training for Adolescents with Autism Spectrum Disorder on the Interactive Driving Simulator: Using Eye Tracking Technology to Determine Effectiveness

    Rationale: Driving is an important developmental milestone for all adolescents, as it increases their independence and ability to participate in vehicle-dependent activities. However, adolescents with high-functioning autism spectrum disorder (HFASD) are less likely to obtain licenses and drive independently due to characteristics related to their diagnosis. Although current research explores the efficacy of driving simulator training and eye tracking for adolescent drivers with HFASD, there is a gap in the literature on simulator training and its effects on overall driving performance and on hazard perception and response in this population.
    Purpose: This pilot study used a simulator training protocol that included hazard perception to determine its effect on overall driving performance. Eye tracking technology was used to determine whether hazard perception and response to non-social and social hazards changed after training.
    Design: This study used a one-group, pretest-posttest intervention design.
    Methods: There were 17 participants between the ages of 15 and 22 with a self-reported diagnosis of ASD and a desire to learn to drive independently. Each participant completed a pre-test and a post-test on the driving simulator while wearing eye tracking technology, and completed a protocol of 30 learning modules with scenarios related to driving skills and hazard detection and response in one-to-one training.
    Analysis: Driving performance was measured by a quantitative score from a standardized observational tool for driving. Eye tracking measures, including fixation duration, fixation count, and time to first fixation, were analyzed using a Wilcoxon Signed Rank Test.
    Results: Participants significantly increased their overall driving performance scores from pre-test to post-test. Hazard perception results from eye tracking tended towards improvement overall, but specific hazard results were inconsistent and varied for both non-social and social hazards in terms of fixation duration, fixation count, and time to first fixation.
    Discussion: Findings indicate that driving simulator training related to hazard perception was effective in improving overall driving simulator performance in adolescents with HFASD. Findings also indicate that hazard perception and response differ for this population after hazard perception training, but specific eye tracking measures may increase or decrease, and results may not be specific to non-social or social hazards.
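    The Wilcoxon Signed Rank Test used for the eye tracking measures is a non-parametric paired test, appropriate when pre/post differences cannot be assumed normal. A minimal sketch with SciPy follows; the fixation-duration values below are invented for illustration and do not come from the study.

```python
# Hypothetical pre/post fixation durations (seconds) for 8 participants;
# the numbers are invented for illustration only.
import numpy as np
from scipy.stats import wilcoxon

pre  = np.array([2.10, 1.80, 2.50, 3.00, 2.20, 1.90, 2.80, 2.40])
post = np.array([1.60, 1.70, 2.05, 2.40, 2.05, 1.50, 2.25, 2.05])

# Two-sided paired test on the per-participant differences; with a
# small sample and no tied differences, SciPy computes an exact p-value.
stat, p = wilcoxon(pre, post)
print(f"W = {stat}, p = {p:.4f}")
```

    A significant p-value here would indicate a systematic pre-to-post shift in fixation duration, without assuming the differences are normally distributed.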

    Multiple Retrieval Models and Regression Models for Prior Art Search

    This paper presents the system called PATATRAS (PATent and Article Tracking, Retrieval and AnalysiS) realized for the IP track of CLEF 2009. Our approach has three main characteristics: 1. the use of multiple retrieval models (KL, Okapi) and term index definitions (lemma, phrase, concept) for the three languages considered in the track (English, French, German), producing ten different sets of ranked results; 2. the merging of the different results based on multiple regression models, using an additional validation set created from the patent collection; 3. the exploitation of patent metadata and of the citation structures for creating restricted initial working sets of patents and for producing a final re-ranking regression model. Because we exploit patent-specific metadata and citation relations only when creating the initial working sets and during the final post-ranking step, our architecture remains generic and easy to extend.
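    The regression-based merging in step 2 can be sketched as fitting weights, on a validation set, that map each retrieval model's score to relevance, then ranking candidates by the fused score. This is an illustrative sketch only: the scores, labels, least-squares fit, and the `fused_score` helper are assumptions, not PATATRAS's actual regression models.

```python
import numpy as np

# Hypothetical scores from three retrieval models for six validation
# documents, plus binary relevance labels (data invented for illustration).
scores = np.array([
    [0.9, 0.7, 0.8],
    [0.2, 0.3, 0.1],
    [0.8, 0.9, 0.6],
    [0.1, 0.2, 0.2],
    [0.7, 0.6, 0.9],
    [0.3, 0.1, 0.3],
])
relevance = np.array([1, 0, 1, 0, 1, 0])

# Fit least-squares weights (with an intercept column) mapping the
# per-model scores to relevance on the validation set.
X = np.hstack([scores, np.ones((len(scores), 1))])
w, *_ = np.linalg.lstsq(X, relevance, rcond=None)

def fused_score(s):
    # Combine one document's per-model scores into a single value.
    return float(np.dot(np.append(s, 1.0), w))

# At query time, rank new candidates by the fused score.
candidates = np.array([[0.6, 0.8, 0.7], [0.2, 0.2, 0.4]])
ranking = sorted(range(len(candidates)),
                 key=lambda i: fused_score(candidates[i]), reverse=True)
print(ranking)  # → [0, 1]
```

    The appeal of this design, as the abstract notes, is that the fusion layer only sees ranked scores, so adding another retrieval model or index definition just adds a column to `scores`.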