567 research outputs found
Deep learning in remote sensing: a review
Standing at the paradigm shift towards data-intensive science, machine
learning techniques are becoming increasingly important. In particular, as a
major breakthrough in the field, deep learning has proven to be an extremely
powerful tool in many fields. Shall we embrace deep learning as the key to all?
Or, should we resist a 'black-box' solution? There are controversial opinions
in the remote sensing community. In this article, we analyze the challenges of
using deep learning for remote sensing data analysis, review the recent
advances, and provide resources to make deep learning in remote sensing
ridiculously simple to start with. More importantly, we encourage remote sensing
scientists to bring their expertise into deep learning, and use it as an
implicit general model to tackle unprecedented large-scale influential
challenges, such as climate change and urbanization.
Comment: Accepted for publication IEEE Geoscience and Remote Sensing Magazin
The Conversation: Deep Audio-Visual Speech Enhancement
Our goal is to isolate individual speakers from multi-talker simultaneous
speech in videos. Existing works in this area have focussed on trying to
separate utterances from known speakers in controlled environments. In this
paper, we propose a deep audio-visual speech enhancement network that is able
to separate a speaker's voice given lip regions in the corresponding video, by
predicting both the magnitude and the phase of the target signal. The method is
applicable to speakers unheard and unseen during training, and for
unconstrained environments. We demonstrate strong quantitative and qualitative
results, isolating extremely challenging real-world examples.
Comment: To appear in Interspeech 2018. We provide supplementary material with
interactive demonstrations on
http://www.robots.ox.ac.uk/~vgg/demo/theconversatio
SLNSpeech: solving extended speech separation problem by the help of sign language
A speech separation task can be roughly divided into audio-only separation
and audio-visual separation. To make speech separation technology applicable
to real-world scenarios for people with disabilities, this paper presents an
extended speech separation problem, namely sign language assisted speech
separation. However, most existing datasets for speech separation contain
only audio and/or visual modalities. To address the extended speech
separation problem, we introduce a large-scale dataset named Sign Language
News Speech (SLNSpeech), in which three modalities (audio, visual, and sign
language) coexist. Then, we design a general deep learning network for
self-supervised learning across the three modalities, in particular using
sign language embeddings together with audio or audio-visual information to
better solve the speech separation task.
Specifically, we use a 3D residual convolutional network to extract sign language
features and a pretrained VGGNet model to extract visual features. After that,
an improved U-Net with skip connections in the feature extraction stage is applied
to learn the embeddings among the mixed spectrogram transformed from the source
audio, the sign language features, and the visual features. Experimental results
show that, besides the visual modality, the sign language modality can also be used
alone to supervise the speech separation task. Moreover, we show the
effectiveness of sign language assisted speech separation when the visual
modality is disturbed. Source code will be released at
http://cheertt.top/homepage/
Comment: 33 pages, 8 figures, 5 table
Automated High-resolution Earth Observation Image Interpretation: Outcome of the 2020 Gaofen Challenge
In this article, we introduce the 2020 Gaofen Challenge and relevant scientific outcomes. The 2020 Gaofen Challenge is an international competition, which is organized by the China High-Resolution Earth Observation Conference Committee and the Aerospace Information Research Institute, Chinese Academy of Sciences and technically cosponsored by the IEEE Geoscience and Remote Sensing Society and the International Society for Photogrammetry and Remote Sensing. It aims at promoting the academic development of automated high-resolution earth observation image interpretation. Six independent tracks have been organized in this challenge, which cover the challenging problems in the field of object detection and semantic segmentation. With the development of convolutional neural networks, deep-learning-based methods have achieved good performance on image interpretation. In this article, we report the details and the best-performing methods presented so far in the scope of this challenge