18,118 research outputs found

    Strengthening the Effectiveness of Pedestrian Detection with Spatially Pooled Features

    Full text link
    We propose a simple yet effective approach to the problem of pedestrian detection which outperforms the current state-of-the-art. Our new features are built on the basis of low-level visual features and spatial pooling. Incorporating spatial pooling improves the translational invariance and thus the robustness of the detection process. We then directly optimise the partial area under the ROC curve (\pAUC) measure, which concentrates detection performance in the range of most practical importance. The combination of these factors leads to a pedestrian detector which outperforms all competitors on all of the standard benchmark datasets. We advance state-of-the-art results by lowering the average miss rate from 13%13\% to 11%11\% on the INRIA benchmark, 41%41\% to 37%37\% on the ETH benchmark, 51%51\% to 42%42\% on the TUD-Brussels benchmark and 36%36\% to 29%29\% on the Caltech-USA benchmark.Comment: 16 pages. Appearing in Proc. European Conf. Computer Vision (ECCV) 201

    A First Step Towards Nuance-Oriented Interfaces for Virtual Environments

    Get PDF
    Designing usable interfaces for virtual environments (VEs) is not a trivial task. Much of the difficulty stems from the complexity and volume of the input data. Many VEs, in the creation of their interfaces, ignore much of the input data as a result of this. Using machine learning (ML), we introduce the notion of a nuance that can be used to increase the precision and power of a VE interface. An experiment verifying the existence of nuances using a neural network (NN) is discussed and a listing of guidelines to follow is given. We also review reasons why traditional ML techniques are difficult to apply to this problem

    Using micro-analysis in interviewer training: 'continuers' and interviewer positioning

    Get PDF
    Despite the recent growth of interest in the interactional construction of research interviews and advances made in our understanding of the nature of such encounters, relatively little attention has been paid to the implications of this for interviewer training, with the result that advice on interviewing techniques tends to be very general. Drawing on analyses of a feature of research interviews that is usually treated as analytically insignificant, this article makes a case for more interactionally sensitive approaches to interviewer training. It focuses on interviewer recipiency in a database of over 40 research interviews conducted by academics and research students to show how apparently insignificant shifts in receipt tokens can have important implications in terms of the developing talk. The implications of this for researcher training are discussed and the article makes recommendations for ways in which attention can be drawn to the discoursal dimension in interviewing practice

    Tangible user interfaces : past, present and future directions

    Get PDF
    In the last two decades, Tangible User Interfaces (TUIs) have emerged as a new interface type that interlinks the digital and physical worlds. Drawing upon users' knowledge and skills of interaction with the real non-digital world, TUIs show a potential to enhance the way in which people interact with and leverage digital information. However, TUI research is still in its infancy and extensive research is required in or- der to fully understand the implications of tangible user interfaces, to develop technologies that further bridge the digital and the physical, and to guide TUI design with empirical knowledge. This paper examines the existing body of work on Tangible User In- terfaces. We start by sketching the history of tangible user interfaces, examining the intellectual origins of this eld. We then present TUIs in a broader context, survey application domains, and review frame- works and taxonomies. We also discuss conceptual foundations of TUIs including perspectives from cognitive sciences, phycology, and philoso- phy. Methods and technologies for designing, building, and evaluating TUIs are also addressed. Finally, we discuss the strengths and limita- tions of TUIs and chart directions for future research

    Speaker Normalization Using Cortical Strip Maps: A Neural Model for Steady State vowel Categorization

    Full text link
    Auditory signals of speech are speaker-dependent, but representations of language meaning are speaker-independent. The transformation from speaker-dependent to speaker-independent language representations enables speech to be learned and understood from different speakers. A neural model is presented that performs speaker normalization to generate a pitch-independent representation of speech sounds, while also preserving information about speaker identity. This speaker-invariant representation is categorized into unitized speech items, which input to sequential working memories whose distributed patterns can be categorized, or chunked, into syllable and word representations. The proposed model fits into an emerging model of auditory streaming and speech categorization. The auditory streaming and speaker normalization parts of the model both use multiple strip representations and asymmetric competitive circuits, thereby suggesting that these two circuits arose from similar neural designs. The normalized speech items are rapidly categorized and stably remembered by Adaptive Resonance Theory circuits. Simulations use synthesized steady-state vowels from the Peterson and Barney [J. Acoust. Soc. Am. 24, 175-184 (1952)] vowel database and achieve accuracy rates similar to those achieved by human listeners. These results are compared to behavioral data and other speaker normalization models.National Science Foundation (SBE-0354378); Office of Naval Research (N00014-01-1-0624

    Transcribing Content from Structural Images with Spotlight Mechanism

    Full text link
    Transcribing content from structural images, e.g., writing notes from music scores, is a challenging task as not only the content objects should be recognized, but the internal structure should also be preserved. Existing image recognition methods mainly work on images with simple content (e.g., text lines with characters), but are not capable to identify ones with more complex content (e.g., structured symbols), which often follow a fine-grained grammar. To this end, in this paper, we propose a hierarchical Spotlight Transcribing Network (STN) framework followed by a two-stage "where-to-what" solution. Specifically, we first decide "where-to-look" through a novel spotlight mechanism to focus on different areas of the original image following its structure. Then, we decide "what-to-write" by developing a GRU based network with the spotlight areas for transcribing the content accordingly. Moreover, we propose two implementations on the basis of STN, i.e., STNM and STNR, where the spotlight movement follows the Markov property and Recurrent modeling, respectively. We also design a reinforcement method to refine the framework by self-improving the spotlight mechanism. We conduct extensive experiments on many structural image datasets, where the results clearly demonstrate the effectiveness of STN framework.Comment: Accepted by KDD2018 Research Track. In proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'18
    • …
    corecore