GMLD: A TOOL TO INVESTIGATE AND DEMONSTRATE THE USE OF ML IN VARIOUS AREAS OF GNSS DOMAIN
This paper presents relevant results achieved during the NAVISP-
EL1-035.02 project funded by the European Space Agency, which
aimed to investigate the possible uses of Machine Learning (ML)
based techniques for the processing of data in the field of Global
Navigation Satellite Systems (GNSSs). For this purpose, we
explored different kinds of data present in the entire chain of the
positioning process and different kinds of ML approaches. In
particular, this paper presents the system architecture and
technologies adopted for developing the GNSS ML Demonstrator
(GMLD), as well as the approaches and the results obtained for one
of the most promising GNSS applications implemented, namely the
prediction of daily maps of the ionosphere. Results show how, based
on historical data and the time correlation of the values, ML
methods outperformed benchmark methods for the majority of the
applications approached, improving positioning performance at the
GNSS user level. Since the GMLD was designed and implemented with
general data management and ML capabilities as part of the
framework, it can easily be reused to carry out further
investigations and implement new applications.
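The ionosphere-map application described above amounts to spatio-temporal time-series prediction: each grid cell of the daily map is forecast from its own history. As a hedged illustration of that idea only (the abstract does not publish the model; the per-pixel autoregressive least-squares fit below, including the `predict_next_map` name and the `order` parameter, is a hypothetical sketch):

```python
import numpy as np

def predict_next_map(history, order=3):
    """Predict the next daily ionosphere (TEC) map from the last
    `order` maps by fitting a single autoregressive model over the
    per-pixel time series with ordinary least squares.
    `history` has shape (days, lat, lon).  Hypothetical sketch --
    not the model actually used in the GMLD."""
    days, h, w = history.shape
    # Build (samples, order) design matrix from lagged maps.
    X = np.stack([history[i:days - order + i].reshape(-1)
                  for i in range(order)], axis=1)
    y = history[order:].reshape(-1)                  # next-day targets
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)     # AR coefficients
    # Apply the fitted coefficients to the most recent `order` maps.
    last = np.stack([history[days - order + i] for i in range(order)],
                    axis=-1)                         # (lat, lon, order)
    return last @ coef                               # (lat, lon)
```

Such a purely autoregressive baseline exploits exactly the two signals the abstract mentions: the historical data and the time correlation of the values.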
Hierarchical Cross-Modal Talking Face Generation with Dynamic Pixel-Wise Loss
We devise a cascade GAN approach to generate talking face video, which is
robust to different face shapes, view angles, facial characteristics, and noisy
audio conditions. Instead of learning a direct mapping from audio to video
frames, we propose first to transfer audio to high-level structure, i.e., the
facial landmarks, and then to generate video frames conditioned on the
landmarks. Compared to a direct audio-to-image approach, our cascade approach
avoids fitting spurious correlations between audiovisual signals that are
irrelevant to the speech content. Humans are sensitive to temporal
discontinuities and subtle artifacts in video. To avoid such pixel-jittering
problems and to encourage the network to focus on audiovisual-correlated regions,
we propose a novel dynamically adjustable pixel-wise loss with an attention
mechanism. Furthermore, to generate a sharper image with well-synchronized
facial movements, we propose a novel regression-based discriminator structure,
which considers sequence-level information along with frame-level information.
Thoughtful experiments on several datasets and real-world samples demonstrate
significantly better results obtained by our method than the state-of-the-art
methods in both quantitative and qualitative comparisons.
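The dynamically adjustable pixel-wise loss described above weights each pixel's reconstruction error by an attention map, so that audiovisual-correlated regions (e.g., the mouth) dominate the gradient. A minimal sketch of such an attention-weighted loss, assuming a per-pixel attention map in [0, 1] and a floor weight `base` (the function name and the exact weighting scheme are illustrative, not the paper's formulation):

```python
import numpy as np

def attention_pixel_loss(pred, target, attn, base=0.5):
    """Attention-weighted pixel-wise L1 loss: pixels with high
    attention contribute with full weight 1, pixels with zero
    attention still contribute with floor weight `base`, so that
    static background regions are not ignored entirely.
    Illustrative sketch only."""
    w = base + (1.0 - base) * attn          # per-pixel weight in [base, 1]
    return float(np.mean(w * np.abs(pred - target)))
```

Because the attention map is itself produced by the network, the effective loss landscape shifts during training toward whichever regions currently correlate with the audio.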
Deep Constrained Dominant Sets for Person Re-Identification
In this work, we propose an end-to-end constrained clustering scheme to tackle the person re-identification (re-id) problem. Deep neural networks (DNN) have recently proven to be effective on the person re-identification task. In particular, rather than leveraging solely a probe-gallery similarity, diffusing the similarities among the gallery images in an end-to-end manner has proven to be effective in yielding a robust probe-gallery affinity. However, existing methods do not apply the probe image as a constraint, and are prone to noise propagation during the similarity diffusion process. To overcome this, we propose an intriguing scheme which treats the person-image retrieval problem as a constrained clustering optimization problem, called deep constrained dominant sets (DCDS). Given a probe and gallery images, we re-formulate the person re-id problem as finding a constrained cluster, where the probe image is taken as a constraint (seed) and each cluster corresponds to a set of images of the same person. By optimizing the constrained clustering in an end-to-end manner, we naturally leverage the contextual knowledge of the set of images corresponding to the given person. We further enhance the performance by integrating an auxiliary net alongside DCDS, which employs a multi-scale ResNet. To validate the effectiveness of our method, we present experiments on several benchmark datasets and show that the proposed method can outperform state-of-the-art methods.
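The constrained clustering at the heart of DCDS can be illustrated with the classical constrained dominant-sets machinery: a diagonal penalty is subtracted from every non-seed vertex of the affinity matrix, and replicator dynamics then extracts a cluster whose support must contain the probe (seed). The sketch below covers only this clustering step, not the end-to-end deep model; the function name, the eigenvalue-based choice of `alpha`, and the stopping rule are assumptions:

```python
import numpy as np

def constrained_dominant_set(A, seed, alpha=None, iters=1000, tol=1e-8):
    """Extract a constrained dominant set from a symmetric nonnegative
    affinity matrix `A`, with the probe image at index `seed` as the
    constraint.  Subtracting `alpha` (larger than the top eigenvalue
    of A) from the diagonal of all non-seed vertices forces any stable
    cluster to include the seed; discrete replicator dynamics then
    finds the cluster.  The support of the returned vector (its
    nonzero entries) identifies the gallery images grouped with the
    probe.  Illustrative sketch of the clustering step only."""
    n = A.shape[0]
    if alpha is None:
        alpha = np.max(np.linalg.eigvalsh(A)) + 1e-3
    B = A.astype(float).copy()
    mask = np.ones(n, dtype=bool)
    mask[seed] = False
    B[mask, mask] -= alpha              # penalize clusters without the seed
    B = B - B.min()                     # nonnegative payoffs; equilibria unchanged
    x = np.full(n, 1.0 / n)             # start at the simplex barycenter
    for _ in range(iters):
        y = x * (B @ x)                 # discrete replicator step
        s = y.sum()
        if s <= 0:
            break
        y /= s
        if np.abs(y - x).sum() < tol:
            x = y
            break
        x = y
    return x
```

In DCDS this optimization is unrolled inside the network, so the affinities feeding the dynamics are learned end-to-end rather than fixed as in this standalone sketch.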