
Deep learning in remote sensing: a review

Authors: Fraundorfer Friedrich; Mou Lichao; Tuia Devis; Xia Gui-Song; Xu Feng; Zhang Liangpei; Zhu Xiao Xiang
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2017

Standing at the paradigm shift towards data-intensive science, machine learning techniques are becoming increasingly important. In particular, as a major breakthrough in the field, deep learning has proven to be an extremely powerful tool in many fields. Shall we embrace deep learning as the key to all? Or, should we resist a 'black-box' solution? There are controversial opinions in the remote sensing community. In this article, we analyze the challenges of using deep learning for remote sensing data analysis, review the recent advances, and provide resources to make deep learning in remote sensing ridiculously simple to start with. More importantly, we encourage remote sensing scientists to bring their expertise into deep learning, and use it as an implicit general model to tackle unprecedented large-scale influential challenges, such as climate change and urbanization.
Comment: Accepted for publication in IEEE Geoscience and Remote Sensing Magazine

arXiv.org e-Print Archive

Carolina Digital Repository

The Conversation: Deep Audio-Visual Speech Enhancement

Authors: Afouras Triantafyllos; Chung Joon Son; Zisserman Andrew
Publication date: 01/01/2018

Our goal is to isolate individual speakers from multi-talker simultaneous speech in videos. Existing works in this area have focussed on trying to separate utterances from known speakers in controlled environments. In this paper, we propose a deep audio-visual speech enhancement network that is able to separate a speaker's voice given lip regions in the corresponding video, by predicting both the magnitude and the phase of the target signal. The method is applicable to speakers unheard and unseen during training, and for unconstrained environments. We demonstrate strong quantitative and qualitative results, isolating extremely challenging real-world examples.
Comment: To appear in Interspeech 2018. We provide supplementary material with interactive demonstrations on http://www.robots.ox.ac.uk/~vgg/demo/theconversatio

arXiv.org e-Print Archive

Oxford University Research Archive

SLNSpeech: solving extended speech separation problem by the help of sign language

Authors: Kong Youyong; Li Taotao; Senhadji Lotfi; Shu Huazhong; Wu Jiasong; Yang Guanyu
Publication date: 21/07/2020

A speech separation task can be roughly divided into audio-only separation and audio-visual separation. To make speech separation technology applicable to real-world scenarios for people with disabilities, this paper presents an extended speech separation problem, namely sign language assisted speech separation. However, most existing datasets for speech separation consist of audio and video recordings that contain only the audio and/or visual modalities. To address the extended speech separation problem, we introduce a large-scale dataset named Sign Language News Speech (SLNSpeech), in which the three modalities of audio, visual, and sign language coexist. Then, we design a general deep learning network for the self-supervised learning of the three modalities, in particular using sign language embeddings together with audio or audio-visual information to better solve the speech separation task. Specifically, we use a 3D residual convolutional network to extract sign language features and a pretrained VGGNet model to extract visual features. After that, an improved U-Net with skip connections in the feature extraction stage is applied to learn the embeddings among the mixed spectrogram transformed from the source audio, the sign language features, and the visual features. Experimental results show that, besides the visual modality, the sign language modality can also be used alone to supervise the speech separation task. Moreover, we also show the effectiveness of sign language assisted speech separation when the visual modality is disturbed. Source code will be released at http://cheertt.top/homepage/
Comment: 33 pages, 8 figures, 5 tables

arXiv.org e-Print Archive

LW-CMDANet: a novel attention network for SAR automatic target recognition

Authors: Dong Jian; Feng Cheng; Fu Xiongjun; Lang Ping; Martorella Marco; Qin Rui
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 22/08/2022

Automated High-resolution Earth Observation Image Interpretation: Outcome of the 2020 Gaofen Challenge

Authors: Dang B.; Diao W.; Fu K.; Guo J.; Hansch R.; Lu X.; Sun X.; Wang C.; Wang P.; Wei W.; Weinmann M.; Xiang D.; Xu F.; Yan C.; Yan Z.; Yang Z.; Yokoya N.; Zhang Y.
Publication venue: Institute of Electrical and Electronics Engineers
Publication date: 01/01/2021

In this article, we introduce the 2020 Gaofen Challenge and relevant scientific outcomes. The 2020 Gaofen Challenge is an international competition organized by the China High-Resolution Earth Observation Conference Committee and the Aerospace Information Research Institute, Chinese Academy of Sciences, and technically cosponsored by the IEEE Geoscience and Remote Sensing Society and the International Society for Photogrammetry and Remote Sensing. It aims to promote the academic development of automated high-resolution earth observation image interpretation. Six independent tracks have been organized in this challenge, covering challenging problems in object detection and semantic segmentation. With the development of convolutional neural networks, deep-learning-based methods have achieved good performance on image interpretation. In this article, we report the details and the best-performing methods presented so far in the scope of this challenge.

Directory of Open Access Journals