Multimodal Differential Emission Measure in the Solar Corona
The Atmospheric Imaging Assembly (AIA) telescope on board the Solar Dynamics
Observatory (SDO) provides coronal EUV imaging over a broader temperature
sensitivity range than the previous generations of instruments (EUVI, EIT, and
TRACE). Differential emission measure tomography (DEMT) of the solar corona
based on AIA data is presented here for the first time. The main product of
DEMT is the three-dimensional (3D) distribution of the local differential
emission measure (LDEM). While in previous studies, based on EIT or EUVI data,
there were 3 available EUV bands, with a sensitivity range
MK, the present study is based on the 4 cooler AIA bands (aimed at studying the
quiet sun), sensitive to the range MK. The AIA filters allow
exploration of new parametric LDEM models. Since DEMT is better suited for
lower activity periods, we use data from Carrington Rotation 2099, when the Sun
was in its most quiescent state during the AIA mission. Also, we validate the
parametric LDEM inversion technique by applying it to standard bi-dimensional
(2D) differential emission measure (DEM) analysis on sets of simultaneous AIA
images, and comparing the results with DEM curves obtained using other methods.
Our study reveals a ubiquitous bimodal LDEM distribution in the quiet diffuse
corona, which is stronger for denser regions. We argue that the nanoflare
heating scenario is less likely to explain these results, and that alternative
mechanisms, such as wave dissipation, appear better supported by our results.
Comment: 52 pages, 18 figures
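As a rough illustration of what a parametric, bimodal LDEM model can look like, the sketch below uses a sum of two Gaussians in temperature and integrates it against band response functions to obtain synthetic band signals. This is only a minimal sketch under stated assumptions: the Gaussian parametrization, the function names (`bimodal_ldem`, `band_signal`, `count_modes`), the 0.5 to 4.0 MK integration grid, and every numerical value are hypothetical and are not taken from the study.

```python
import math

def gaussian(t, center, width):
    """Unnormalized Gaussian in temperature t (MK)."""
    return math.exp(-0.5 * ((t - center) / width) ** 2)

def bimodal_ldem(t, a1, t1, s1, a2, t2, s2):
    """Hypothetical bimodal LDEM: a sum of two Gaussian components."""
    return a1 * gaussian(t, t1, s1) + a2 * gaussian(t, t2, s2)

def band_signal(response, ldem, t_min=0.5, t_max=4.0, steps=2000):
    """Trapezoid-rule integral of response(T) * ldem(T) over [t_min, t_max]."""
    dt = (t_max - t_min) / steps
    total = 0.0
    for i in range(steps + 1):
        t = t_min + i * dt
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * response(t) * ldem(t)
    return total * dt

def count_modes(ldem, t_min=0.5, t_max=4.0, steps=350):
    """Count strict local maxima of ldem on a uniform temperature grid."""
    values = [ldem(t_min + i * (t_max - t_min) / steps) for i in range(steps + 1)]
    return sum(
        1 for i in range(1, steps)
        if values[i] > values[i - 1] and values[i] > values[i + 1]
    )

# Made-up parameters for illustration only (not values from the study).
def example_ldem(t):
    return bimodal_ldem(t, 1.0, 1.0, 0.15, 0.5, 2.2, 0.3)

# Four made-up band responses, each peaked at a different temperature (MK).
band_peaks = [0.8, 1.2, 1.8, 2.5]
signals = [
    band_signal(lambda t, p=p: gaussian(t, p, 0.4), example_ldem)
    for p in band_peaks
]
```

In an actual inversion the model parameters would be fitted so that the synthetic band signals match the measured ones; with four bands, a six-parameter model like this one would need additional constraints, which is one reason the choice of parametrization matters.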
Multimodal Attention Networks for Low-Level Vision-and-Language Navigation
Vision-and-Language Navigation (VLN) is a challenging task in which an agent needs to follow a language-specified path to reach a target destination. The goal gets even harder as the actions available to the agent get simpler and move towards low-level, atomic interactions with the environment. This setting is called low-level VLN. In this paper, we aim to create an agent that can tackle three key issues: multi-modality, long-term dependencies, and adaptability towards different locomotive settings. To that end, we devise "Perceive, Transform, and Act" (PTA): a fully-attentive VLN architecture that leaves the recurrent approach behind and is the first Transformer-like architecture incorporating three different modalities: natural language, images, and low-level actions for agent control. In particular, we adopt an early fusion strategy to merge lingual and visual information efficiently in our encoder. We then propose to refine the decoding phase with a late fusion extension between the agent's history of actions and the perceptual modalities. We experimentally validate our model on two datasets: PTA achieves promising results in low-level VLN on R2R and good performance on the recently proposed R4R benchmark. Our code is publicly available at https://github.com/aimagelab/perceive-transform-and-act
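The early and late fusion steps mentioned in the abstract above can be pictured with plain scaled dot-product attention. The following is a minimal sketch, not the PTA implementation: it assumes early fusion means self-attention over the concatenated language and visual tokens, and late fusion means action-history tokens attending to that fused sequence. The function names and toy token values are hypothetical, and learned projections, multiple heads, and positional encodings are all omitted.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    dim = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs

def early_fusion(lang_tokens, vis_tokens):
    """Assumed early fusion: self-attention over the concatenated sequence."""
    fused = lang_tokens + vis_tokens
    return attend(fused, fused, fused)

def late_fusion(action_history, fused_percepts):
    """Assumed late fusion: action tokens query the fused percepts."""
    return attend(action_history, fused_percepts, fused_percepts)

# Toy 2-dimensional tokens, for illustration only.
lang = [[1.0, 0.0], [0.0, 1.0]]
vis = [[1.0, 1.0]]
percepts = early_fusion(lang, vis)
decoded = late_fusion([[0.5, 0.5]], percepts)
```

Each output vector is a weighted average of the value vectors, so `decoded` holds one vector per action-history token, mixing information from both perceptual modalities.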
One Week of Motor Adaptation Induces Structural Changes in Primary Motor Cortex That Predict Long-Term Memory One Year Later
The neural bases of motor adaptation have been extensively explored in human and non-human primates. A network including the cerebellum, the primary motor cortex, and the posterior parietal cortex appears to be crucial for this type of learning. Yet, to date, it is unclear whether these regions contribute directly or indirectly to the formation of motor memories. Here we trained subjects on a complex visuomotor rotation associated with long-term memory (on the order of months) to identify potential sites of structural plasticity induced by adaptation. One week of training led to i) an increase in local gray-matter concentration over the hand area of the contralateral primary motor cortex and ii) an increase in fractional anisotropy in an area underneath this region that correlated with the speed of learning. Moreover, the change in gray-matter concentration measured immediately after training predicted improvements in the speed of learning during re-adaptation one year later. Our study suggests that motor adaptation induces structural plasticity in primary motor circuits. In addition, it provides the first piece of evidence indicating that early structural changes induced by motor learning may affect behavior up to one year after training.
Affiliation: Landi, Sofía Mariana. Universidad de Buenos Aires. Facultad de Medicina. Departamento de Ciencias Fisiológicas. Cátedra de Fisiologia; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Houssay; Argentina
Affiliation: Baguear, Miguel Federico. Universidad de Buenos Aires. Facultad de Medicina. Departamento de Ciencias Fisiológicas. Cátedra de Fisiologia; Argentina
Affiliation: Della Maggiore, Valeria Monica. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Houssay; Argentina. Universidad de Buenos Aires. Facultad de Medicina. Departamento de Ciencias Fisiológicas. Cátedra de Fisiologia; Argentina
Spot the Difference: A Novel Task for Embodied Agents in Changing Environments
Embodied AI is a recent research area that aims at creating intelligent
agents that can move and operate inside an environment. Existing approaches in
this field demand the agents to act in completely new and unexplored scenes.
However, this setting is far from realistic use cases that instead require
executing multiple tasks in the same environment. Even if the environment
changes over time, the agent could still count on its global knowledge about
the scene while trying to adapt its internal representation to the current
state of the environment. To make a step towards this setting, we propose Spot
the Difference: a novel task for Embodied AI where the agent has access to an
outdated map of the environment and needs to recover the correct layout in a
fixed time budget. To this end, we collect a new dataset of occupancy maps
starting from existing datasets of 3D spaces and generating a number of
possible layouts for a single environment. This dataset can be employed in the
popular Habitat simulator and is fully compliant with existing methods that
employ reconstructed occupancy maps during navigation. Furthermore, we propose
an exploration policy that can take advantage of previous knowledge of the
environment and identify changes in the scene faster and more effectively than
existing agents. Experimental results show that the proposed architecture
outperforms existing state-of-the-art models for exploration in this new
setting.
Comment: Accepted by the 26th International Conference on Pattern Recognition
(ICPR 2022)
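To make the task setup above more concrete, the sketch below compares an outdated occupancy map with the current layout, lists the cells that changed, and scores a map against the true layout. It is a toy sketch under stated assumptions: the binary grid encoding, the helper names, and the use of intersection over union as a score are illustrative choices, not the dataset format or the evaluation protocol of the paper.

```python
def changed_cells(outdated_map, current_map):
    """Return (row, col) positions whose occupancy differs between the maps."""
    return [
        (r, c)
        for r, (old_row, new_row) in enumerate(zip(outdated_map, current_map))
        for c, (old, new) in enumerate(zip(old_row, new_row))
        if old != new
    ]

def occupancy_iou(predicted_map, true_map):
    """Intersection over union of occupied cells (1 = occupied, 0 = free)."""
    intersection = union = 0
    for pred_row, true_row in zip(predicted_map, true_map):
        for pred, true in zip(pred_row, true_row):
            intersection += 1 if pred and true else 0
            union += 1 if pred or true else 0
    return intersection / union if union else 1.0

def apply_observations(outdated_map, observations):
    """Copy the outdated map and overwrite the cells the agent has observed."""
    updated = [row[:] for row in outdated_map]
    for (r, c), value in observations.items():
        updated[r][c] = value
    return updated

# Toy 3x3 layouts: one obstacle was removed and one was added.
outdated = [[1, 1, 0],
            [0, 0, 0],
            [0, 1, 0]]
current = [[1, 0, 0],
           [0, 0, 1],
           [0, 1, 0]]

diff = changed_cells(outdated, current)
observed = {cell: current[cell[0]][cell[1]] for cell in diff}
recovered = apply_observations(outdated, observed)
```

Here the outdated map scores an IoU of 0.5 against the current layout, and the map recovered after observing both changed cells scores 1.0. An exploration policy for this task would aim to visit such changed cells within the time budget.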
Out of the Box: Embodied Navigation in the Real World
The research field of Embodied AI has witnessed substantial progress in visual navigation and exploration thanks to powerful simulation platforms and the availability of 3D data of indoor and photorealistic environments. These two factors have opened the doors to a new generation of intelligent agents capable of achieving nearly perfect PointGoal Navigation. However, such architectures are commonly trained with millions, if not billions, of frames and tested in simulation. Together with great enthusiasm, these results raise a question: how many researchers will effectively benefit from these advances?
In this work, we detail how to transfer the knowledge acquired in simulation into the real world. To that end, we describe the architectural discrepancies that damage the Sim2Real adaptation ability of models trained on the Habitat simulator and propose a novel solution tailored towards deployment in real-world scenarios. We then deploy our models on a LoCoBot, a Low-Cost Robot equipped with a single Intel RealSense camera. Unlike in previous work, our testing scene is unavailable to the agent in simulation. The environment is also inaccessible to the agent beforehand, so it cannot count on scene-specific semantic priors. In this way, we reproduce a setting in which a research group (potentially from other fields) needs to employ the agent's visual navigation capabilities as a service. Our experiments indicate that it is possible to achieve satisfying results when deploying the obtained model in the real world.
Dress Code: High-Resolution Multi-Category Virtual Try-On
Image-based virtual try-on strives to transfer the appearance of a clothing
item onto the image of a target person. Prior work focuses mainly on upper-body
clothes (e.g. t-shirts, shirts, and tops) and neglects full-body or lower-body
items. This shortcoming arises from one main factor: current publicly available
datasets for image-based virtual try-on do not account for this variety, thus
limiting progress in the field. To address this deficiency, we introduce Dress
Code, which contains images of multi-category clothes. Dress Code is more than
3x larger than publicly available datasets for image-based virtual try-on and
features high-resolution paired images (1024 x 768) with front-view, full-body
reference models. To generate HD try-on images with high visual quality and
rich detail, we propose to learn fine-grained discriminating features.
Specifically, we leverage a semantic-aware discriminator that makes predictions
at pixel-level instead of image- or patch-level. Extensive experimental
evaluation demonstrates that the proposed approach surpasses the baselines and
state-of-the-art competitors in terms of visual quality and quantitative
results. The Dress Code dataset is publicly available at
https://github.com/aimagelab/dress-code.
Comment: Dress Code - Video Demo: https://www.youtube.com/watch?v=qr6TW3uTHG