DENT-DDSP: Data-efficient noisy speech generator using differentiable digital signal processors for explicit distortion modelling and noise-robust speech recognition
The performance of automatic speech recognition (ASR) systems degrades
drastically under noisy conditions. Explicit distortion modelling (EDM), as a
feature compensation step, can enhance ASR systems under such conditions
by simulating in-domain noisy speech from its clean counterpart. Yet,
existing distortion models are either non-trainable or unexplainable and often
lack controllability and generalization ability. In this paper, we propose a
fully explainable and controllable model: DENT-DDSP to achieve EDM. DENT-DDSP
utilizes novel differentiable digital signal processing (DDSP) components and
requires only 10 seconds of training data to achieve high fidelity. The
experiment shows that the simulated noisy data from DENT-DDSP achieves the
highest simulation fidelity compared to other baseline models in terms of
multi-scale spectral loss (MSSL). Moreover, to validate whether the data
simulated by DENT-DDSP are able to replace the scarce in-domain noisy data in
the noise-robust ASR tasks, several downstream ASR models with the same
architecture are trained using the simulated data and the real data. The
experiment shows that the model trained with the simulated noisy data from
DENT-DDSP achieves performance similar to the benchmark, with a 2.7%
difference in word error rate (WER). The code of the model is released
online.
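The multi-scale spectral loss used above compares magnitude spectrograms of the simulated and reference signals at several FFT resolutions. Formulations vary (FFT sizes, linear vs. log terms, L1 vs. Frobenius norms); the sketch below is one common variant and not necessarily the exact loss used in the paper:

```python
import numpy as np

def stft_mag(x, n_fft, hop):
    # Frame the signal, apply a Hann window, and take the magnitude of the real FFT.
    win = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop : i * hop + n_fft] * win for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=-1))

def multi_scale_spectral_loss(x, y, fft_sizes=(2048, 1024, 512), eps=1e-7):
    # Sum of L1 distances between linear- and log-magnitude spectrograms
    # at several FFT resolutions (one common MSSL formulation).
    loss = 0.0
    for n_fft in fft_sizes:
        hop = n_fft // 4
        sx, sy = stft_mag(x, n_fft, hop), stft_mag(y, n_fft, hop)
        loss += np.mean(np.abs(sx - sy))
        loss += np.mean(np.abs(np.log(sx + eps) - np.log(sy + eps)))
    return loss
```

Because the loss is evaluated at several window lengths, it penalizes both fine spectral detail and coarse temporal envelope mismatches between simulated and real noisy speech.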
Technical Workshop: Advanced Helicopter Cockpit Design
Information processing demands on both civilian and military aircrews have increased enormously as rotorcraft have come to be used for adverse weather, day/night, and remote-area missions. Applied psychology, engineering, and operational research needs bearing on future helicopter cockpit design criteria were identified. Three areas were addressed: (1) operational requirements, (2) advanced avionics, and (3) man-system integration.
Identification of aircrew tasks for using direct voice input (DVI) to reduce pilot workload in the AH-64D Apache Longbow
Advances in helicopter design continue to saturate the pilot's visual channel and produce remarkable increases in cognitive workload for the pilot. This study investigates the potential implementation of Direct Voice Input (DVI) as an alternative control for interacting with onboard systems of the AH-64D Apache, in an attempt to reduce pilot workload during a hands-on-the-controls, eyes-out condition. The intent is to identify AH-64D cockpit tasks performed through Multi-Purpose Displays (MPDs) that, when converted to DVI, will provide the greatest reduction in task execution time and workload. A brief description of applicable AH-64D audio and visual displays is provided. A review of current trends in state-of-the-art voice recognition technology is presented, as well as previous and current voice-input cockpit identification studies. To identify tasks in the AH-64D, a methodology was developed consisting of a detailed analysis of the aircraft's mission and onboard systems. A pilot questionnaire was developed and administered to operational AH-64D pilots to assess their input on DVI implementation. Findings indicate DVI would be most useful for displaying selected MPD pages and performing tasks pertaining to the Tactical Situation Display (TSD), weapons, and communications. Six of the candidate DVI tasks were performed in the AH-64D simulator using the manual input method and a simulated voice input method. Two different pilots made objective and subjective evaluations. Task execution times and workload ratings were lower using a simulated means of voice input. Overall, DVI shows limited potential for workload reduction and warrants further simulator testing before proceeding to the flight environment.
Increasing Air Traffic Control simulations realism through voice transformation
Improving realism in simulations is a critical issue. In some air traffic control (ATC) simulations, a pseudo-pilot controls up to fifteen aircraft. Having the same voice for different aircraft handled by the pseudo-pilot therefore decreases the realism of the simulation and may confuse the controllers, especially in a study context. In research contexts, a virtual aircraft piloted in a flight simulator is sometimes needed in addition to the pseudo-pilot. For simulation purposes, the flight-simulator aircraft must be merged with the pseudo-pilot's aircraft. This is not possible without voice modification, since the controller can otherwise distinguish the pilot's voice. In this paper, we propose a method for transforming the voices of the pilot and the pseudo-pilot so that each aircraft has its own particular voice and cabin noise. The two experiments that have been conducted show that, through our voice modification algorithm, the realism of the simulation is enhanced and the voice biases disappear.
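The paper's actual transformation algorithm is not reproduced in the abstract. As a toy illustration of the idea (giving each simulated aircraft a distinct pitch offset plus cabin noise), a naive numpy-only sketch might look like the following; the resampling-based "pitch shift" is a deliberate simplification that also changes the signal's duration, unlike a proper phase-vocoder transform:

```python
import numpy as np

def shift_and_add_noise(x, sr, semitones, noise_db=-30.0, rng=None):
    # Naive pitch shift by resampling (note: this also changes duration),
    # then mix in white "cabin" noise at a given level below the speech power.
    rng = np.random.default_rng(0) if rng is None else rng
    ratio = 2.0 ** (semitones / 12.0)          # playback-rate ratio per semitone
    idx = np.arange(0, len(x) - 1, ratio)      # fractional read positions
    shifted = np.interp(idx, np.arange(len(x)), x)
    speech_rms = np.sqrt(np.mean(shifted ** 2) + 1e-12)
    noise = rng.standard_normal(len(shifted)) * speech_rms * 10 ** (noise_db / 20)
    return shifted + noise
```

Applying a different `semitones` and `noise_db` pair per aircraft yields distinguishable voices from a single speaker, which is the effect the paper evaluates.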
UAS Concept of Operations and Vehicle Technologies Demonstration
In 2017 and 2018, under National Aeronautics and Space Administration (NASA) sponsorship, the New York Unmanned Aircraft Systems (UAS) Test Site and Northeast UAS Airspace Integration Research (NUAIR) Alliance conducted a year-long research project that culminated in a UAS technology flight demonstration. The research project included the creation of a concept of operations, and development and demonstration of UAS technologies. The concept of operations was focused on an unmanned aircraft transiting from cruise through Class E airspace into a high-density urban terminal environment. The terminal environment in which the test was conducted was Griffiss International Airport, under Syracuse Air Traffic Control (ATC) approach control and Griffiss control tower. Employing an Aurora Centaur optionally piloted aircraft (OPA), this project explored six scenarios aimed at advancing UAS integration into the National Airspace System (NAS) under both nominal and off-nominal conditions. Off-nominal conditions were defined to include complete loss of the communications link between the remote pilot's control station on the ground and the aircraft. The off-nominal scenarios that were investigated included lost-link conditions with and without link recovery, an automated ATC-initiated go-around, autonomous rerouting around a dynamic airspace obstruction (in this case simulated weather), and autonomous taxi operations to clear the runway.
Future Directions in Aerospace Technologies
From an assessment of the estimated needs for civil and military aircraft which arise from
the predicted future growth of civil aviation and the known needs of the world's defense
forces, it is shown that there are compelling reasons for international and inter-company
collaboration to meet the demands. From that evidence and two British reports concerning
the future of the UK and European aeronautics industry it has been possible to indicate
the technology acquisition strategies proposed for future success. Underpinning these
strategies are three categories of technology, namely Foundation Enhancing and
Supporting. These categories are discussed before presenting a consideration of
appropriate aerospace technologies for the future including the propulsion cycle, the use
of double fuselage aircraft, the blended wing body concept, the design of long haul subsonic
aircraft, new cockpit technology and air traffic management systems
Advanced flight deck/crew station simulator functional requirements
This report documents a study of flight deck/crew system research facility requirements for investigating issues involved with developing systems and procedures for interfacing transport aircraft with air traffic control systems planned for 1985 to 2000. Crew system needs of NASA, the U.S. Air Force, and industry were investigated and reported. A matrix of these needs is included, as are recommended functional requirements and design criteria for simulation facilities in which to conduct this research. Methods of exploiting the commonality and similarity in facilities are identified, and plans for doing so in order to reduce implementation costs and allow efficient transfer of experiments from one facility to another are presented.
ATCO2 corpus: A Large-Scale Dataset for Research on Automatic Speech Recognition and Natural Language Understanding of Air Traffic Control Communications
Personal assistants, automatic speech recognizers and dialogue understanding
systems are becoming more critical in our interconnected digital world. A clear
example is air traffic control (ATC) communications. ATC aims at guiding
aircraft and controlling the airspace in a safe and optimal manner. These
voice-based dialogues are carried out between an air traffic controller (ATCO) and
pilots via very-high-frequency radio channels. In order to incorporate these
novel technologies into ATC (low-resource domain), large-scale annotated
datasets are required to develop the data-driven AI systems. Two examples are
automatic speech recognition (ASR) and natural language understanding (NLU). In
this paper, we introduce the ATCO2 corpus, a dataset that aims at fostering
research on the challenging ATC field, which has lagged behind due to lack of
annotated data. The ATCO2 corpus covers 1) data collection and pre-processing,
2) pseudo-annotations of speech data, and 3) extraction of ATC-related named
entities. The ATCO2 corpus is split into three subsets. 1) ATCO2-test-set
corpus contains 4 hours of ATC speech with manual transcripts and a subset with
gold annotations for named-entity recognition (callsign, command, value). 2)
The ATCO2-PL-set corpus consists of 5281 hours of unlabeled ATC data enriched
with automatic transcripts from an in-domain speech recognizer, contextual
information, speaker turn information, signal-to-noise ratio estimate and
English language detection score per sample. Both are available for purchase
through ELDA at http://catalog.elra.info/en-us/repository/browse/ELRA-S0484. 3)
The ATCO2-test-set-1h corpus is a one-hour subset of the original test set
corpus that we offer for free at https://www.atco2.org/data. We expect
the ATCO2 corpus will foster research on robust ASR and NLU not only in the
field of ATC communications but also in the general research community.
Comment: manuscript under review; the code will be available at
https://github.com/idiap/atco2-corpu
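ASR systems trained on corpora such as ATCO2 are typically scored with word error rate, the metric also cited in the DENT-DDSP entry above. As a reference point, a minimal sketch of WER via word-level edit distance (the example transcripts are invented, not drawn from the corpus):

```python
def word_error_rate(ref: str, hyp: str) -> float:
    # Levenshtein distance over words (substitutions, insertions, deletions),
    # normalized by the number of reference words.
    r, h = ref.split(), hyp.split()
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i                      # deleting all reference words
    for j in range(len(h) + 1):
        d[0][j] = j                      # inserting all hypothesis words
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution / match
    return d[len(r)][len(h)] / len(r)
```

For instance, a single misrecognized word in a six-word clearance ("two" heard as "to") gives a WER of 1/6.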
A STUDY OF THE USE OF MIXED REALITY FOR CAPTURING HUMAN OBSERVATION AND INFERENCES IN PRODUCTION ENVIRONMENTS
Augmented and mixed reality are already considered essential technologies of modern production systems. They are primarily employed to overlay appropriate digital content, mainly 3D objects, onto the human visual field, allowing people to visualize and understand complex spatial shapes, their mutual relations, and their positioning. Yet, the huge potential of the technology has yet to be revealed in its use for collecting and recording human observations and inferences about the context of the production environment. Its bi-directional interface makes it the most direct and most efficient knowledge-capturing means to date. The paper presents the challenges and benefits that come from the use of a conceptual interface of a mixed reality application designed to collect data, semantics, and knowledge about the production context directly from the man-in-process. As a production environment for the development, implementation, and testing of mixed reality applications for this purpose, various processes for the assembly and maintenance of medium-voltage equipment were used.
Air Force Institute of Technology Contributions to Air Force Research and Development, Calendar Year 1979
This report provides a listing of the Master of Science theses, doctoral dissertations, faculty consultations, and selected faculty publications completed during the 1979 calendar year at the Air Force Institute of Technology, Wright-Patterson Air Force Base, Ohio.