
    DENT-DDSP: Data-efficient noisy speech generator using differentiable digital signal processors for explicit distortion modelling and noise-robust speech recognition

    The performance of automatic speech recognition (ASR) systems degrades drastically under noisy conditions. Explicit distortion modelling (EDM), as a feature compensation step, is able to enhance ASR systems under such conditions by simulating in-domain noisy speech from its clean counterpart. Yet existing distortion models are either non-trainable or unexplainable, and often lack controllability and generalization ability. In this paper, we propose a fully explainable and controllable model, DENT-DDSP, to achieve EDM. DENT-DDSP utilizes novel differentiable digital signal processing (DDSP) components and requires only 10 seconds of training data to achieve high fidelity. The experiment shows that the simulated noisy data from DENT-DDSP achieves the highest simulation fidelity among the baseline models in terms of multi-scale spectral loss (MSSL). Moreover, to validate whether the data simulated by DENT-DDSP can replace the scarce in-domain noisy data in noise-robust ASR tasks, several downstream ASR models with the same architecture are trained using the simulated data and the real data. The experiment shows that the model trained with the simulated noisy data from DENT-DDSP achieves similar performance to the benchmark, with a 2.7% difference in terms of word error rate (WER). The code of the model is released online.

    Technical Workshop: Advanced Helicopter Cockpit Design

    Information processing demands on both civilian and military aircrews have increased enormously as rotorcraft have come to be used for adverse weather, day/night, and remote area missions. Applied psychology, engineering, or operational research for future helicopter cockpit design criteria were identified. Three areas were addressed: (1) operational requirements, (2) advanced avionics, and (3) man-system integration.

    Identification of aircrew tasks for using direct voice input (DVI) to reduce pilot workload in the AH-64D Apache Longbow

    Advances in helicopter design continue to saturate the pilot's visual channel and produce remarkable increases in cognitive workload for the pilot. This study investigates the potential implementation of Direct Voice Input (DVI) as an alternative control for interacting with onboard systems of the AH-64D Apache, in an attempt to reduce pilot workload during a hands-on-the-controls, eyes-out condition. The intent is to identify AH-64D cockpit tasks performed through Multi Purpose Displays (MPDs) that, when converted to DVI, will provide the greatest reduction in task execution time and workload. A brief description of applicable AH-64D audio and visual displays is provided. A review of current trends in state-of-the-art voice recognition technology is presented, as well as previous and current voice input cockpit identification studies. To identify tasks in the AH-64D, a methodology was developed consisting of a detailed analysis of the aircraft's mission and on-board systems. A pilot questionnaire was developed and administered to operational AH-64D pilots to assess their input on DVI implementation. Findings indicate DVI would be most useful for displaying selected MPD pages and performing tasks pertaining to the Tactical Situation Display (TSD), weapons, and communications. Six of the candidate DVI tasks were performed in the AH-64D simulator using the manual input method and a simulated voice input method. Two different pilots made objective and subjective evaluations. Task execution times and workload ratings were lower using a simulated means of voice input. Overall, DVI shows limited potential for workload reduction and warrants further simulator testing before proceeding to the flight environment.

    Increasing Air Traffic Control simulations realism through voice transformation

    Improving realism in simulations is a critical issue. In some air traffic control (ATC) simulations we use a pseudo-pilot who pilots up to fifteen aircraft. Having the same voice for different aircraft in the pseudo-pilot case decreases the realism of the simulation and may be confusing for the controllers, especially in a study context. In a research context, a virtual aircraft piloted in a flight simulator is sometimes needed in addition to the pseudo-pilot. For simulation needs, the flight simulator aircraft must be merged with the pseudo-pilot's aircraft. This is not possible without voice modification, since the controller can distinguish the pilot's voice. In this paper we propose a method for transforming the voices of the pilot and the pseudo-pilot in order to have one particular voice and cabin noise for each aircraft. The two experiments that have been conducted show that, through our voice modification algorithm, the realism of the simulation is enhanced and the voice biases disappear.

    UAS Concept of Operations and Vehicle Technologies Demonstration

    In 2017 and 2018, under National Aeronautics and Space Administration (NASA) sponsorship, the New York Unmanned Aircraft Systems (UAS) Test Site and Northeast UAS Airspace Integration Research (NUAIR) Alliance conducted a year-long research project that culminated in a UAS technology flight demonstration. The research project included the creation of a concept of operations, and development and demonstration of UAS technologies. The concept of operations was focused on an unmanned aircraft transiting from cruise through Class E airspace into a high-density urban terminal environment. The terminal environment in which the test was conducted was Griffiss International Airport, under Syracuse Air Traffic Control (ATC) approach control and Griffiss control tower. Employing an Aurora Centaur optionally piloted aircraft (OPA), this project explored six scenarios aimed at advancing UAS integration into the National Airspace System (NAS) under both nominal and off-nominal conditions. Off-nominal conditions were defined to include complete loss of the communications link between the remote pilot's control station on the ground and the aircraft. The off-nominal scenarios that were investigated included lost-link conditions with and without link recovery, an automated ATC-initiated go-around, autonomous rerouting around a dynamic airspace obstruction (in this case simulated weather), and autonomous taxi operations to clear the runway.

    Future Directions in Aerospace Technologies

    From an assessment of the estimated needs for civil and military aircraft which arise from the predicted future growth of civil aviation and the known needs of the world's defense forces, it is shown that there are compelling reasons for international and inter-company collaboration to meet the demands. From that evidence and two British reports concerning the future of the UK and European aeronautics industry, it has been possible to indicate the technology acquisition strategies proposed for future success. Underpinning these strategies are three categories of technology, namely Foundation, Enhancing, and Supporting. These categories are discussed before presenting a consideration of appropriate aerospace technologies for the future, including the propulsion cycle, the use of double fuselage aircraft, the blended wing body concept, the design of long haul subsonic aircraft, new cockpit technology, and air traffic management systems.

    Advanced flight deck/crew station simulator functional requirements

    This report documents a study of flight deck/crew system research facility requirements for investigating issues involved with developing systems and procedures for interfacing transport aircraft with air traffic control systems planned for 1985 to 2000. Crew system needs of NASA, the U.S. Air Force, and industry were investigated and reported. A matrix of these is included, as are recommended functional requirements and design criteria for simulation facilities in which to conduct this research. Methods of exploiting the commonality and similarity in facilities are identified, and plans for exploiting this in order to reduce implementation costs and allow efficient transfer of experiments from one facility to another are presented.

    ATCO2 corpus: A Large-Scale Dataset for Research on Automatic Speech Recognition and Natural Language Understanding of Air Traffic Control Communications

    Personal assistants, automatic speech recognizers and dialogue understanding systems are becoming more critical in our interconnected digital world. A clear example is air traffic control (ATC) communications. ATC aims at guiding aircraft and controlling the airspace in a safe and optimal manner. These voice-based dialogues are carried between an air traffic controller (ATCO) and pilots via very-high frequency radio channels. In order to incorporate these novel technologies into ATC (a low-resource domain), large-scale annotated datasets are required to develop the data-driven AI systems. Two examples are automatic speech recognition (ASR) and natural language understanding (NLU). In this paper, we introduce the ATCO2 corpus, a dataset that aims at fostering research on the challenging ATC field, which has lagged behind due to lack of annotated data. The ATCO2 corpus covers 1) data collection and pre-processing, 2) pseudo-annotations of speech data, and 3) extraction of ATC-related named entities. The ATCO2 corpus is split into three subsets. 1) The ATCO2-test-set corpus contains 4 hours of ATC speech with manual transcripts and a subset with gold annotations for named-entity recognition (callsign, command, value). 2) The ATCO2-PL-set corpus consists of 5281 hours of unlabeled ATC data enriched with automatic transcripts from an in-domain speech recognizer, contextual information, speaker turn information, a signal-to-noise ratio estimate and an English language detection score per sample. Both are available for purchase through ELDA at http://catalog.elra.info/en-us/repository/browse/ELRA-S0484. 3) The ATCO2-test-set-1h corpus is a one-hour subset of the original test set corpus that we are offering for free at https://www.atco2.org/data. We expect the ATCO2 corpus will foster research on robust ASR and NLU not only in the field of ATC communications but also in the general research community.
    Comment: Manuscript under review; the code will be available at https://github.com/idiap/atco2-corpu

    A STUDY OF THE USE OF MIXED REALITY FOR CAPTURING HUMAN OBSERVATION AND INFERENCES IN PRODUCTION ENVIRONMENTS

    Augmented and mixed reality is already considered a necessary technology for modern production systems. It is primarily employed to virtualize proper digital content, mainly related to 3D objects, into the human visual field, allowing people to visualize and understand complex spatial shapes, their mutual relations, and positioning. Yet the huge potential of the technology is waiting to be revealed in its usage for collecting and recording human observations and inferences about the context of the production environment. Its bi-directional interface makes it the most direct and the most efficient knowledge capturing means to date. The paper presents the challenges and benefits that come from the usage of a conceptual interface of a mixed reality application that is designed to collect data, semantics and knowledge about the production context directly from the man-in-process. As a production environment for the development, implementation, and testing of mixed reality applications for this purpose, various processes for the assembly and maintenance of medium-voltage equipment were used.

    Air Force Institute of Technology Contributions to Air Force Research and Development, Calendar Year 1979

    This report provides a listing of the Master of Science theses, doctoral dissertations, faculty consultations, and selected faculty publications completed during the 1979 calendar year at the Air Force Institute of Technology, at Wright-Patterson Air Force Base, Ohio.