
    A practical guide on using SPOT-GPR, a freeware tool implementing a SAP-DoA technique

    This is a software paper whose main objective is to provide practical information on how to use SPOT-GPR release 1.0, a MATLAB®-based software for the analysis of ground penetrating radar (GPR) profiles. The software detects targets and estimates their positions in a two-dimensional scenario; it has a graphical user interface and implements an innovative sub-array processing method. SPOT-GPR was developed in the framework of the COST Action TU1208 “Civil Engineering Applications of Ground Penetrating Radar” and is available for free download on the website of the Action (www.GPRadar.eu).

    SPOT-GPR: a freeware tool for target detection and localization in GPR data developed within the COST action TU1208

    SPOT-GPR (release 1.0) is a new freeware tool implementing an innovative Sub-Array Processing method for the analysis of Ground-Penetrating Radar (GPR) data, with the main purposes of detecting and localizing targets. The software is implemented in MATLAB; it has a graphical user interface and comes with a short manual. This work is the outcome of a series of three Short-Term Scientific Missions (STSMs) funded by European COoperation in Science and Technology (COST) and carried out in the framework of the COST Action TU1208 “Civil Engineering Applications of Ground Penetrating Radar” (www.GPRadar.eu). The input of the software is a GPR radargram (B-scan). The radargram is partitioned into sub-radargrams, each composed of a few traces (A-scans). The multi-frequency information enclosed in each trace is exploited, and a set of dominant Directions of Arrival (DoA) of the electromagnetic field is calculated for each sub-radargram. The estimated angles are triangulated, yielding a pattern of crossings that condense around target locations. This pattern is filtered to remove a noisy background of unwanted crossings and is then processed with a statistical procedure. Finally, the targets are detected and their positions are predicted. For DoA estimation, the MUltiple SIgnal Classification (MUSIC) algorithm is employed in combination with the matched filter technique. To the best of our knowledge, this is the first time the matched filter technique has been used for the processing of GPR data. The software has been tested on GPR synthetic radargrams computed with the finite-difference time-domain simulator gprMax, with very good results.
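    As a rough illustration of the sub-array DoA step described above, the sketch below runs a generic MUSIC estimator on one sub-array of traces. It is written in Python/NumPy rather than MATLAB, the array geometry and the number of targets are placeholder assumptions, and the matched-filter preprocessing used by SPOT-GPR is omitted; it is not the tool's actual implementation.

```python
# Minimal sketch of MUSIC-based DoA estimation on one sub-array of GPR traces,
# assuming a uniform linear array model. Spacing, scan range, and the number of
# targets are illustrative placeholders, not SPOT-GPR's actual parameters.
import numpy as np

def music_doa(snapshots, n_sources, d_over_lambda=0.5, angles_deg=np.arange(-60, 61)):
    """snapshots: (n_antennas, n_snapshots) complex array built from a few A-scans."""
    n_antennas = snapshots.shape[0]
    # Sample covariance matrix of the sub-array data
    R = snapshots @ snapshots.conj().T / snapshots.shape[1]
    # Eigendecomposition (ascending eigenvalues); the smallest ones span the noise subspace
    eigvals, eigvecs = np.linalg.eigh(R)
    noise_subspace = eigvecs[:, : n_antennas - n_sources]
    spectrum = []
    for theta in np.deg2rad(angles_deg):
        # Steering vector for a uniform linear array
        a = np.exp(-2j * np.pi * d_over_lambda * np.arange(n_antennas) * np.sin(theta))
        # MUSIC pseudo-spectrum: large where a(theta) is orthogonal to the noise subspace
        denom = np.linalg.norm(noise_subspace.conj().T @ a) ** 2
        spectrum.append(1.0 / denom)
    spectrum = np.asarray(spectrum)
    # Return the angles with the largest pseudo-spectrum values as the dominant DoAs
    peak_idx = np.argsort(spectrum)[-n_sources:]
    return angles_deg[np.sort(peak_idx)], spectrum
```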

    Action Search: Spotting Actions in Videos and Its Application to Temporal Action Localization

    State-of-the-art temporal action detectors inefficiently search the entire video for specific actions. Despite the encouraging progress these methods achieve, it is crucial to design automated approaches that explore only the parts of the video most relevant to the actions being searched for. To address this need, we propose the new problem of action spotting in video, which we define as finding a specific action in a video while observing only a small portion of that video. Inspired by the observation that humans are extremely efficient and accurate at spotting and finding action instances in video, we propose Action Search, a novel Recurrent Neural Network approach that mimics the way humans spot actions. Moreover, to address the absence of data recording the behavior of human annotators, we put forward the Human Searches dataset, which compiles the search sequences employed by human annotators spotting actions in the AVA and THUMOS14 datasets. We consider temporal action localization as an application of the action spotting problem. Experiments on the THUMOS14 dataset reveal that our model not only explores the video efficiently (observing on average 17.3% of the video) but also accurately finds human activities, with 30.8% mAP. Comment: Accepted to ECCV 2018.
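    A hedged sketch of the core idea, not the authors' architecture: a recurrent cell consumes the feature of the currently observed frame and regresses where in the video to look next, stopping after a fixed observation budget. The feature dimension, budget, starting position, and stopping rule are illustrative assumptions.

```python
# Toy "predict where to look next" loop in the spirit of action spotting.
import torch
import torch.nn as nn

class ActionSearcher(nn.Module):
    def __init__(self, feat_dim=512, hidden_dim=256):
        super().__init__()
        self.hidden_dim = hidden_dim
        # Input at each step: the observed frame's feature plus the current position
        self.lstm = nn.LSTMCell(feat_dim + 1, hidden_dim)
        self.next_pos = nn.Linear(hidden_dim, 1)  # regress the next position in [0, 1]

    def forward(self, video_feats, budget=10):
        """video_feats: (T, feat_dim) per-frame features; returns visited frame indices."""
        T = video_feats.shape[0]
        h = video_feats.new_zeros(1, self.hidden_dim)
        c = video_feats.new_zeros(1, self.hidden_dim)
        pos = video_feats.new_full((1, 1), 0.5)   # start searching from the middle
        visited = []
        for _ in range(budget):
            idx = int(pos.item() * (T - 1))
            visited.append(idx)
            x = torch.cat([video_feats[idx].unsqueeze(0), pos], dim=1)
            h, c = self.lstm(x, (h, c))
            pos = torch.sigmoid(self.next_pos(h))  # predict the next place to look
        return visited

feats = torch.randn(300, 512)                # e.g., precomputed features of a 300-frame video
print(ActionSearcher()(feats, budget=5))     # indices of the frames the model chose to observe
```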

    What's Cookin'? Interpreting Cooking Videos using Text, Speech and Vision

    We present a novel method for aligning a sequence of instructions to a video of someone carrying out a task. In particular, we focus on the cooking domain, where the instructions correspond to the recipe. Our technique relies on an HMM to align the recipe steps to the (automatically generated) speech transcript. We then refine this alignment using a state-of-the-art visual food detector based on a deep convolutional neural network. We show that our technique outperforms simpler techniques based on keyword spotting. It also enables interesting applications, such as automatically illustrating recipes with keyframes and searching within a video for events of interest. Comment: To appear in NAACL 2015.
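    The alignment idea can be illustrated with a toy, Viterbi-style dynamic program: recipe steps play the role of hidden states, transcript sentences are the observations, word overlap stands in for the learned emission model, and transitions may only stay on a step or advance to the next one. This is a simplified stand-in for the paper's HMM (and ignores the visual refinement), not its implementation.

```python
# Toy monotonic alignment of transcript sentences to recipe steps.
import numpy as np

def align(recipe_steps, transcript_sents):
    def overlap(a, b):
        # Crude emission score: Jaccard overlap of the word sets
        wa, wb = set(a.lower().split()), set(b.lower().split())
        return len(wa & wb) / max(len(wa | wb), 1)

    S, T = len(recipe_steps), len(transcript_sents)
    emit = np.array([[overlap(s, t) for t in transcript_sents] for s in recipe_steps])
    score = np.full((S, T), -np.inf)
    back = np.zeros((S, T), dtype=int)
    score[0, 0] = emit[0, 0]
    for t in range(1, T):
        for s in range(S):
            stay = score[s, t - 1]                              # keep narrating the same step
            adv = score[s - 1, t - 1] if s > 0 else -np.inf     # or move on to the next step
            best = max(stay, adv)
            score[s, t] = best + emit[s, t]
            back[s, t] = s if stay >= adv else s - 1
    # Backtrace the best monotonic assignment of sentences to steps
    path = [int(np.argmax(score[:, T - 1]))]
    for t in range(T - 1, 0, -1):
        path.append(back[path[-1], t])
    return list(reversed(path))   # path[t] = recipe step aligned to sentence t

steps = ["chop the onions", "fry the onions in butter"]
sents = ["first chop two onions", "now melt some butter", "fry the onions until golden"]
print(align(steps, sents))        # [0, 1, 1]
```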

    Spoken content retrieval: A survey of techniques and technologies

    Speech media, that is, digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings. This growth has motivated extensive research on techniques and technologies that facilitate reliable indexing and retrieval. Spoken content retrieval (SCR) requires the combination of audio and speech processing technologies with methods from information retrieval (IR). SCR research initially investigated planned speech structured in document-like units, but has subsequently shifted focus to more informal spoken content produced spontaneously, outside of the studio and in conversational settings. This survey provides an overview of the field of SCR, encompassing component technologies, the relationship of SCR to text IR and automatic speech recognition, and user interaction issues. It is aimed at researchers with backgrounds in speech technology or IR who are seeking deeper insight into how these fields are integrated to support research and development, thus addressing the core challenges of SCR.
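    In its simplest form, the pipeline the survey covers amounts to running ASR over the speech media, indexing the time-stamped transcripts, and answering text queries against that index. The toy sketch below mocks the ASR output and ignores recognition errors, which are themselves a central SCR challenge; the data and identifiers are illustrative.

```python
# Minimal "ASR output -> inverted index -> text query" illustration.
from collections import defaultdict

def build_index(transcripts):
    """transcripts: {doc_id: [(start_time, word), ...]} as produced by an ASR system."""
    index = defaultdict(list)
    for doc_id, words in transcripts.items():
        for start, word in words:
            index[word.lower()].append((doc_id, start))   # posting: where the word was spoken
    return index

def search(index, query):
    # Return (doc_id, time) postings for every query term that was recognized
    return {term: index.get(term.lower(), []) for term in query.split()}

asr_output = {"lecture_01": [(12.4, "spoken"), (12.9, "content"), (13.5, "retrieval")]}
print(search(build_index(asr_output), "content retrieval"))
```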

    Objective Classes for Micro-Facial Expression Recognition

    Micro-expressions are brief spontaneous facial expressions that appear on a face when a person conceals an emotion, making them different from normal facial expressions in subtlety and duration. Currently, emotion classes within the CASME II dataset are based on Action Units and self-reports, creating conflicts during machine learning training. We show that classifying expressions using Action Units, instead of predicted emotion, removes the potential bias of human reporting. The proposed classes are tested using LBP-TOP, HOOF and HOG 3D feature descriptors. The experiments are evaluated on two benchmark FACS-coded datasets: CASME II and SAMM. The best result achieves 86.35% accuracy when classifying the proposed 5 classes on CASME II using HOG 3D, outperforming the result of the state-of-the-art 5-class emotion-based classification on CASME II. Results indicate that classification based on Action Units provides an objective method to improve micro-expression recognition. Comment: 11 pages, 4 figures and 5 tables. This paper will be submitted for journal review.
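    The evaluation recipe can be sketched as follows: group samples by Action-Unit-derived class labels rather than self-reported emotion, then cross-validate a standard classifier on precomputed spatio-temporal descriptors. The class mapping, classifier choice, cross-validation scheme, and data below are placeholders; the paper defines its own five AU-based classes and uses LBP-TOP, HOOF and HOG 3D features.

```python
# Sketch: classify micro-expression samples into AU-based classes from descriptor vectors.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC

def evaluate(descriptors, au_class_labels, folds=5):
    """descriptors: (n_samples, dim) HOG 3D / LBP-TOP / HOOF feature vectors,
    au_class_labels: (n_samples,) integer labels derived from Action Units."""
    clf = LinearSVC(C=1.0)
    scores = cross_val_score(clf, descriptors, au_class_labels, cv=folds)
    return scores.mean()

# Toy usage with random features, purely to show the call pattern
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 64))
y = rng.integers(0, 5, size=100)
print(f"mean accuracy: {evaluate(X, y):.2f}")
```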

    How to improve TTS systems for emotional expressivity

    Several experiments have been carried out that revealed weaknesses of current Text-To-Speech (TTS) systems in their emotional expressivity. Although some TTS systems allow XML-based representations of prosodic and/or phonetic variables, few publications have considered, as a pre-processing stage, the use of intelligent text processing to detect affective information that can be used to tailor the parameters needed for emotional expressivity. This paper describes a technique for automatic prosodic parameterization based on affective clues. The technique recognizes the affective information conveyed in a text and, according to its emotional connotation, assigns appropriate pitch accents and other prosodic parameters by XML tagging. This pre-processing helps the TTS system generate synthesized speech that contains emotional clues. The experimental results are encouraging and suggest the possibility of suitable emotional expressivity in speech synthesis.
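    A toy version of this pre-processing stage might look like the following: detect an affective connotation in the input text (here with a tiny keyword lexicon standing in for the paper's affect recognizer) and wrap the sentence in XML prosody tags that a TTS engine can consume. The lexicon, tag names, and pitch/rate values are illustrative assumptions, not the paper's parameters.

```python
# Toy affect detection followed by XML/SSML-style prosody tagging for a TTS front end.
AFFECT_LEXICON = {
    "wonderful": "joy", "delighted": "joy",
    "terrible": "sadness", "miserable": "sadness",
}
PROSODY = {
    "joy": {"pitch": "+15%", "rate": "fast"},
    "sadness": {"pitch": "-10%", "rate": "slow"},
}

def tag_sentence(sentence):
    # Pick the first affective keyword found; a real system would use a trained recognizer
    affect = next((AFFECT_LEXICON[w] for w in sentence.lower().split()
                   if w in AFFECT_LEXICON), None)
    if affect is None:
        return sentence                               # neutral: leave the text untagged
    p = PROSODY[affect]
    return f'<prosody pitch="{p["pitch"]}" rate="{p["rate"]}">{sentence}</prosody>'

print(tag_sentence("What a wonderful surprise"))
```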