207,353 research outputs found

    Deep Long Short-Term Memory Adaptive Beamforming Networks For Multichannel Robust Speech Recognition

    Full text link
    Far-field speech recognition in noisy and reverberant conditions remains a challenging problem despite recent deep learning breakthroughs. This problem is commonly addressed by acquiring a speech signal from multiple microphones and performing beamforming over them. In this paper, we propose to use a recurrent neural network with long short-term memory (LSTM) architecture to adaptively estimate real-time beamforming filter coefficients to cope with non-stationary environmental noise and dynamic nature of source and microphones positions which results in a set of timevarying room impulse responses. The LSTM adaptive beamformer is jointly trained with a deep LSTM acoustic model to predict senone labels. Further, we use hidden units in the deep LSTM acoustic model to assist in predicting the beamforming filter coefficients. The proposed system achieves 7.97% absolute gain over baseline systems with no beamforming on CHiME-3 real evaluation set.Comment: in 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP

    Integrating model checking with HiP-HOPS in model-based safety analysis

    Get PDF
    The ability to perform an effective and robust safety analysis on the design of modern safety–critical systems is crucial. Model-based safety analysis (MBSA) has been introduced in recent years to support the assessment of complex system design by focusing on the system model as the central artefact, and by automating the synthesis and analysis of failure-extended models. Model checking and failure logic synthesis and analysis (FLSA) are two prominent MBSA paradigms. Extensive research has placed emphasis on the development of these techniques, but discussion on their integration remains limited. In this paper, we propose a technique in which model checking and Hierarchically Performed Hazard Origin and Propagation Studies (HiP-HOPS) – an advanced FLSA technique – can be applied synergistically with benefit for the MBSA process. The application of the technique is illustrated through an example of a brake-by-wire system

    Social Scene Understanding: End-to-End Multi-Person Action Localization and Collective Activity Recognition

    Get PDF
    We present a unified framework for understanding human social behaviors in raw image sequences. Our model jointly detects multiple individuals, infers their social actions, and estimates the collective actions with a single feed-forward pass through a neural network. We propose a single architecture that does not rely on external detection algorithms but rather is trained end-to-end to generate dense proposal maps that are refined via a novel inference scheme. The temporal consistency is handled via a person-level matching Recurrent Neural Network. The complete model takes as input a sequence of frames and outputs detections along with the estimates of individual actions and collective activities. We demonstrate state-of-the-art performance of our algorithm on multiple publicly available benchmarks

    Quantitative shadowgraphy and proton radiography for large intensity modulations

    Get PDF
    Shadowgraphy is a technique widely used to diagnose objects or systems in various fields in physics and engineering. In shadowgraphy, an optical beam is deflected by the object and then the intensity modulation is captured on a screen placed some distance away. However, retrieving quantitative information from the shadowgrams themselves is a challenging task because of the non-linear nature of the process. Here, a novel method to retrieve quantitative information from shadowgrams, based on computational geometry, is presented for the first time. This process can be applied to proton radiography for electric and magnetic field diagnosis in high-energy-density plasmas and has been benchmarked using a toroidal magnetic field as the object, among others. It is shown that the method can accurately retrieve quantitative parameters with error bars less than 10%, even when caustics are present. The method is also shown to be robust enough to process real experimental results with simple pre- and post-processing techniques. This adds a powerful new tool for research in various fields in engineering and physics for both techniques

    An integrated neuro-mechanical model of C. elegans forward locomotion

    Get PDF
    One of the most tractable organisms for the study of nervous systems is the nematode Caenorhabditis elegans, whose locomotion in particular has been the subject of a number of models. In this paper we present a first integrated neuro-mechanical model of forward locomotion. We find that a previous neural model is robust to the addition of a body with mechanical properties, and that the integrated model produces oscillations with a more realistic frequency and waveform than the neural model alone. We conclude that the body and environment are likely to be important components of the worm’s locomotion subsystem

    Robust Spoken Language Understanding for House Service Robots

    Get PDF
    Service robotics has been growing significantly in thelast years, leading to several research results and to a numberof consumer products. One of the essential features of theserobotic platforms is represented by the ability of interactingwith users through natural language. Spoken commands canbe processed by a Spoken Language Understanding chain, inorder to obtain the desired behavior of the robot. The entrypoint of such a process is represented by an Automatic SpeechRecognition (ASR) module, that provides a list of transcriptionsfor a given spoken utterance. Although several well-performingASR engines are available off-the-shelf, they operate in a generalpurpose setting. Hence, they may be not well suited in therecognition of utterances given to robots in specific domains. Inthis work, we propose a practical yet robust strategy to re-ranklists of transcriptions. This approach improves the quality of ASRsystems in situated scenarios, i.e., the transcription of roboticcommands. The proposed method relies upon evidences derivedby a semantic grammar with semantic actions, designed tomodel typical commands expressed in scenarios that are specificto human service robotics. The outcomes obtained throughan experimental evaluation show that the approach is able toeffectively outperform the ASR baseline, obtained by selectingthe first transcription suggested by the AS
    • …
    corecore