
11 research outputs found

Lessons Learned in ATCO2: 5000 hours of Air Traffic Control Communications for Robust Automatic Speech Recognition and Understanding

Author: Choukri Khalid
Khalil Driss
Lenders Vincent
Madikeri Srikanth
Motlicek Petr
Nigmatulina Iuliia
Prasad Amrutha
Rigault Mickael
Szoke Igor
Tart Allan
Zuluaga-Gomez Juan
Publication date: 01/05/2023

Voice communication between air traffic controllers (ATCos) and pilots is critical for ensuring safe and efficient air traffic control (ATC). This task requires high levels of awareness from ATCos and can be tedious and error-prone. Recent attempts have been made to integrate artificial intelligence (AI) into ATC in order to reduce the workload of ATCos. However, the development of data-driven AI systems for ATC demands large-scale annotated datasets, which are currently lacking in the field. This paper explores the lessons learned from the ATCO2 project, which aimed to develop a unique platform to collect and preprocess large amounts of ATC data from airspace in real time. Audio and surveillance data were collected from publicly accessible radio frequency channels with VHF receivers owned by a community of volunteers and later uploaded to OpenSky Network servers, which can be considered an "unlimited source" of data. In addition, this paper reviews previous work from ATCO2 partners, including (i) robust automatic speech recognition, (ii) natural language processing, (iii) English language identification of ATC communications, and (iv) the integration of surveillance data such as ADS-B. We believe that the pipeline developed during the ATCO2 project, along with the open-sourcing of its data, will encourage research in the ATC field. A sample of the ATCO2 corpus is available at https://www.atco2.org/data, while the full corpus can be purchased through ELDA at http://catalog.elra.info/en-us/repository/browse/ELRA-S0484. We demonstrated that ATCO2 is an appropriate dataset for developing ASR engines when little or no in-domain ATC data is available. For instance, with the CNN-TDNNf Kaldi model, we reached word error rates (WER) as low as 17.9% and 24.9% on public ATC datasets, which is 6.6% and 7.6% better, respectively, than an "out-of-domain" but supervised CNN-TDNNf model.
Comment: Manuscript under review
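The WER figures in this abstract follow the usual definition: the word-level edit distance between the reference transcript and the ASR hypothesis, divided by the number of reference words. The following is a minimal sketch of that metric, not code from the ATCO2 project; the function name `wer` and the example utterances are illustrative assumptions.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: (substitutions + deletions + insertions) / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix seen so far
    # and the first j hypothesis words (classic Levenshtein row update).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cur.append(min(
                prev[j] + 1,                           # deletion
                cur[j - 1] + 1,                        # insertion
                prev[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        prev = cur
    return prev[-1] / len(ref)


# Illustrative utterances (invented for this sketch, not taken from the corpus).
reference = "lufthansa one two three descend flight level eight zero"
hypothesis = "lufthansa one two tree descend level eight zero"
print(f"WER = {wer(reference, hypothesis):.3f}")
```

In this example the hypothesis has one substituted word and one deleted word against a nine-word reference.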

arXiv.org e-Print Archive

ATCO2 corpus: A Large-Scale Dataset for Research on Automatic Speech Recognition and Natural Language Understanding of Air Traffic Control Communications

Author: Cevenini Claudia
Choukri Khalid
Kocour Martin
Kolčárek Pavel
Motlicek Petr
Nigmatulina Iuliia
Prasad Amrutha
Rigault Mickael
Sarfjoo Seyyed Saeed
Szöke Igor
Tart Allan
Veselý Karel
Zuluaga-Gomez Juan
Černocký Jan
Publication date: 08/11/2022

Personal assistants, automatic speech recognizers and dialogue understanding systems are becoming more critical in our interconnected digital world. A clear example is air traffic control (ATC) communications. ATC aims at guiding aircraft and controlling the airspace in a safe and optimal manner. These voice-based dialogues are carried out between an air traffic controller (ATCO) and pilots via very-high frequency radio channels. In order to incorporate these novel technologies into ATC (a low-resource domain), large-scale annotated datasets are required to develop the data-driven AI systems. Two examples are automatic speech recognition (ASR) and natural language understanding (NLU). In this paper, we introduce the ATCO2 corpus, a dataset that aims at fostering research on the challenging ATC field, which has lagged behind due to lack of annotated data. The ATCO2 corpus covers 1) data collection and pre-processing, 2) pseudo-annotations of speech data, and 3) extraction of ATC-related named entities. The ATCO2 corpus is split into three subsets. 1) The ATCO2-test-set corpus contains 4 hours of ATC speech with manual transcripts and a subset with gold annotations for named-entity recognition (callsign, command, value). 2) The ATCO2-PL-set corpus consists of 5281 hours of unlabeled ATC data enriched with automatic transcripts from an in-domain speech recognizer, contextual information, speaker turn information, a signal-to-noise ratio estimate and an English language detection score per sample. Both are available for purchase through ELDA at http://catalog.elra.info/en-us/repository/browse/ELRA-S0484. 3) The ATCO2-test-set-1h corpus is a one-hour subset of the original test set corpus, which we offer for free at https://www.atco2.org/data. We expect the ATCO2 corpus will foster research on robust ASR and NLU not only in the field of ATC communications but also in the general research community.
Comment: Manuscript under review; The code will be available at https://github.com/idiap/atco2-corpu

arXiv.org e-Print Archive

Some Novel Algorithms for Processing Sensor Array Output Signals. Uudsed algoritmid antennivõre väljundsignaali töötlemiseks

Author: Tart Allan
Publication venue: TalTech Press. TalTech Kirjastus
Publication date: 11/06/2021

Tallinna Tehnikaülikooli Raamatukogu Digikogu (Digital Collection of Tallinn University of Technology Library)

Reference Trajectories: The Dataset Enabling Gate-to-Gate Flight Analysis

Author: Allan Tart
Enrico Spinielli
John Fitzgerald
Rainer Koelle
Publication venue: MDPI AG
Publication date: 28/01/2022

Publicly verifiable data are required to ensure a strong, transparent and independent air traffic management performance review system. Community-sourced data (such as ADS-B/Mode S data provided by the OpenSky Network and similar providers) have been used to analyse different aspects of air traffic management. The main drawback of such ADS-B data is the lack of crucial pieces of information, which need to be inferred. On the other hand, Eurocontrol has used correlated position reports (CPRs) gathered from European Air Navigation Service Providers (ANSPs) to conduct some of its actual/flown trajectory-oriented performance analysis. The availability and the granularity of the CPRs vary between Eurocontrol Member States, making it difficult to perform accurate wide-scale studies. Combining the strengths of both data sources would result in great benefits. This paper describes the first step in creating a pan-European Flight Table (FT) and its supporting reference trajectories (RT). It is expected that the resulting dataset will be made available to the general public and that the work will continue to improve in scope and accuracy.

Multidisciplinary Digital Publishing Institute

Automatic Call Sign Detection: Matching Air Surveillance Data with Air Traffic Spoken Communications

Author: Alexander Blatt
Allan Tart
Amrutha Prasad
Claudia Cevenini
Dietrich Klakow
Fabian Landis
Honza Černocký
Igor Szöke
Juan Zuluaga-Gomez
Karel Veselý
Khalid Choukri
Martin Kocour
Mickael Rigault
Pavel Kolčárek
Petr Motlicek
Saeed Sarfjoo
Publication venue: MDPI AG
Publication date: 03/12/2020

Voice communication is the main channel to exchange information between pilots and Air-Traffic Controllers (ATCos). Recently, several projects have explored the use of speech recognition technology to automatically extract spoken key information such as call signs, commands, and values, which can be used to reduce ATCos’ workload and increase performance and safety in Air-Traffic Control (ATC)-related activities. Nevertheless, the collection of ATC speech data is very demanding, expensive, and limited by the intrinsic characteristics of the speakers. As a solution, this paper presents ATCO2, a project that aims to develop a unique platform to collect, organize, and pre-process ATC data collected from airspace. Initially, the data are gathered directly through publicly accessible radio frequency channels with VHF receivers and LiveATC, which can be considered an “unlimited source” of low-quality data. The ATCO2 project explores employing context information such as radar and air surveillance data (collected with ADS-B and Mode S) from the OpenSky Network (OSN) to correlate call signs automatically extracted from voice communication with those available from ADS-B channels, to eventually increase the overall call sign detection rates. More specifically, the timestamp and location of the spoken command (issued by the ATCo by voice) are extracted, and a query is sent to the OSN server to retrieve the call sign tags in ICAO format for the airplanes corresponding to the given area. Then, a word sequence provided by an automatic speech recognition system is fed into a Natural Language Processing (NLP) based module together with the set of call signs available from the ADS-B channels. The NLP module extracts the call sign, command, and command arguments from the spoken utterance.
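The matching step described in this abstract, correlating an ASR word sequence with the ICAO call signs retrieved from surveillance data, can be illustrated with a small sketch. This is not the ATCO2 NLP module: the helper names (`expand_callsign`, `best_callsign`), the three-entry airline table, the fuzzy-matching approach based on Python's `difflib`, and the 0.6 threshold are all assumptions chosen for illustration. The candidate list stands in for the result of the OSN query.

```python
from difflib import SequenceMatcher

# Tiny illustrative subset of ICAO airline designators and their telephony names.
TELEPHONY = {"DLH": "lufthansa", "BAW": "speedbird", "RYR": "ryanair"}
DIGITS = dict(zip("0123456789",
                  "zero one two three four five six seven eight nine".split()))
LETTERS = dict(zip("ABCDEFGHIJKLMNOPQRSTUVWXYZ",
                   ("alfa bravo charlie delta echo foxtrot golf hotel india "
                    "juliett kilo lima mike november oscar papa quebec romeo "
                    "sierra tango uniform victor whiskey xray yankee zulu").split()))


def spell(chars: str) -> list[str]:
    """Spell out digits and letters the way they are spoken on the radio."""
    return [DIGITS.get(c) or LETTERS[c] for c in chars.upper()]


def expand_callsign(icao: str) -> list[str]:
    """Turn an ICAO call sign such as 'DLH123' into its expected spoken words."""
    prefix, rest = icao[:3].upper(), icao[3:]
    if prefix in TELEPHONY:
        return [TELEPHONY[prefix]] + spell(rest)
    return spell(icao)


def best_callsign(transcript: str, candidates: list[str], threshold: float = 0.6):
    """Return (call sign, score) for the candidate that best matches the transcript.

    Each candidate is expanded to words and compared with every window of the
    same length in the transcript; the call sign is None below the threshold.
    """
    words = transcript.lower().split()
    best, best_score = None, 0.0
    for icao in candidates:
        target = expand_callsign(icao)
        for start in range(max(1, len(words) - len(target) + 1)):
            window = words[start:start + len(target)]
            score = SequenceMatcher(None, window, target).ratio()
            if score > best_score:
                best, best_score = icao, score
    return (best, best_score) if best_score >= threshold else (None, best_score)


# Candidates would come from the surveillance query for the given time and area.
nearby = ["BAW123", "DLH123", "RYR4AB"]
print(best_callsign("lufthansa one two three descend flight level eight zero", nearby))
```

Restricting the search to call signs that are actually in the area at that time is what lets a simple similarity score disambiguate between near-identical candidates; a real system would also need to handle shortened call signs and ASR errors in the airline name.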

Multidisciplinary Digital Publishing Institute

Automatic Call Sign Detection: Matching Air Surveillance Data with Air Traffic Spoken Communications

Author: Alexander Blatt
Cernocky Honza
Cevenini Claudia
Choukri Khalid
Juan Zuluaga-Gomez
Klakow Dietrich
Kocour Martin
Kolcarek Pavel
Landis Fabian
Motlicek Petr
Prasad Amrutha
Rigault Mickael
Sarfjoo Seyyed Saeed
Szoke Igor
Tart Allan
Vesely Karel
Publication venue: MDPI AG
Publication date: 13/04/2021

Infoscience - École polytechnique fédérale de Lausanne

Automatic Processing Pipeline for Collecting and Annotating Air-Traffic Voice Communication Data

Author: Alexander Blatt
Allan Tart
Amrutha Prasad
Chloe Salamin
Claudia Cevenini
Dietrich Klakow
Fabian Landis
Hicham Atassi
Igor Szöke
Iuliia Nigmatulina
Jan Černocký
Juan Zuluaga-Gomez
Karel Veselý
Khalid Choukri
Martin Kocour
Mickael Rigault
Pavel Kolčárek
Petr Motlíček
Saeed Sarfjoo
Santosh Kesiraju
Publication venue: MDPI AG
Publication date: 31/12/2021

This document describes the pipeline for the automatic processing of ATCO-pilot audio communication that we developed as part of the ATCO2 project. So far, we have collected two thousand hours of audio recordings that we either preprocessed for the transcribers or used for semi-supervised training. Both methods of using the collected data can further improve our pipeline by retraining our models. The proposed automatic processing pipeline is a cascade of many standalone components: (a) segmentation, (b) volume control, (c) signal-to-noise ratio filtering, (d) diarization, (e) ‘speech-to-text’ (ASR) module, (f) English language detection, (g) call-sign code recognition, (h) ATCO-pilot classification and (i) highlighting commands and values. The key component of the pipeline is a speech-to-text transcription system that has to be trained with real-world ATC data; otherwise, the performance is poor. In order to further improve speech-to-text performance, we apply both semi-supervised training with our recordings and contextual adaptation that uses a list of plausible callsigns from surveillance data as auxiliary information. Downstream NLP/NLU tasks are important from an application point of view. These application tasks need accurate models operating on top of the real speech-to-text output; thus, there is a need for more data too. Creating ATC data is the main aspiration of the ATCO2 project. At the end of the project, the data will be packaged and distributed by ELDA.
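The idea of a cascade of standalone components can be sketched as a list of independent stages, where each stage either passes a segment on or drops it. The sketch below covers only two of the listed components (signal-to-noise ratio filtering and English language detection) and is not the project's implementation: the `Segment` fields, the stage factories, and the threshold values are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Segment:
    """One audio segment with the per-sample scores the filters rely on."""
    audio_id: str
    snr_db: float          # estimated signal-to-noise ratio, in dB
    english_score: float   # English language detection score in [0, 1]


# A stage returns the segment to keep it, or None to drop it from the cascade.
Stage = Callable[[Segment], Optional[Segment]]


def snr_filter(min_snr_db: float) -> Stage:
    """Build a stage that drops segments whose estimated SNR is too low."""
    def stage(seg: Segment) -> Optional[Segment]:
        return seg if seg.snr_db >= min_snr_db else None
    return stage


def english_filter(min_score: float) -> Stage:
    """Build a stage that drops segments unlikely to be English speech."""
    def stage(seg: Segment) -> Optional[Segment]:
        return seg if seg.english_score >= min_score else None
    return stage


def run_cascade(segments: list[Segment], stages: list[Stage]) -> list[Segment]:
    """Apply the stages in order; a segment survives only if every stage keeps it."""
    kept = []
    for seg in segments:
        current: Optional[Segment] = seg
        for stage in stages:
            current = stage(current)
            if current is None:
                break
        if current is not None:
            kept.append(current)
    return kept


# Hypothetical thresholds and data, for illustration only.
segments = [
    Segment("a", snr_db=12.0, english_score=0.90),
    Segment("b", snr_db=3.0, english_score=0.95),   # too noisy
    Segment("c", snr_db=15.0, english_score=0.20),  # probably not English
]
kept = run_cascade(segments, [snr_filter(5.0), english_filter(0.5)])
print([seg.audio_id for seg in kept])
```

Keeping each component behind the same small interface means stages can be retrained, reordered, or replaced without touching the rest of the cascade.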

Multidisciplinary Digital Publishing Institute

Subjective time under altered states of consciousness in ayahuasca users in shamanistic rituals involving music

Author: Allan LG
Berlyne DE
Block R
Block RA
Bueno JLO
Campagnoli APS
Correa-Netto NF
Don NS
Firmino EA
Fraisse P
Hamill J
Hicks RE
Hoffmann E
Jones MR
Kagerer FA
Labate BC
Mabit J
Mates J
Mitrani L
Moreira P
Nichols DE
Ornstein RE
Palhano-Fontes F
Paula E
Pires APS
Ramos BM
Riba J
Sanches RF
Schenberg EE
Schultes RE
Shanon B
Shulgin A
Tart CT
Valle M
Wackermann J
Wittmann M
Zakay D
Publication venue: FapUNIFESP (SciELO)

Crossref

Books Received

Crossref

My Life in Chaos

Crossref