TrTr: A Versatile Pre-Trained Large Traffic Model based on Transformer for Capturing Trajectory Diversity in Vehicle Population
Understanding trajectory diversity is a fundamental aspect of addressing
practical traffic tasks. However, capturing the diversity of trajectories
presents challenges, particularly with traditional machine learning and
recurrent neural networks due to the requirement of large-scale parameters. The
emerging Transformer technology, renowned for its parallel computation
capabilities enabling the utilization of models with hundreds of millions of
parameters, offers a promising solution. In this study, we apply the
Transformer architecture to traffic tasks, aiming to learn the diversity of
trajectories within vehicle populations. We analyze the Transformer's attention
mechanism and its adaptability to the goals of traffic tasks, and subsequently,
design specific pre-training tasks. To achieve this, we create a data structure
tailored to the attention mechanism and introduce a set of noises that
correspond to spatio-temporal demands, which are incorporated into the
structured data during the pre-training process. The designed pre-training
model demonstrates excellent performance in capturing the spatial distribution
of the vehicle population, with no instances of vehicle overlap and an RMSE of
0.6059 when compared to the ground truth values. In the context of time series
prediction, approximately 95% of the predicted trajectories' speeds closely
align with the true speeds, within a deviation of 7.5144 m/s. Furthermore, in
the stability test, the model exhibits robustness by continuously predicting a
time series ten times longer than the input sequence, delivering smooth
trajectories and showcasing diverse driving behaviors. The pre-trained model
also provides a good basis for downstream fine-tuning tasks. The number of
parameters of our model is over 50 million.
Comment: 16 pages, 6 figures, under review by Transportation Research Board
Annual Meeting, work in updat
Masked Lip-Sync Prediction by Audio-Visual Contextual Exploitation in Transformers
Previous studies have explored generating accurately lip-synced talking faces
for arbitrary targets given audio conditions. However, most of them deform or
generate the whole facial area, leading to non-realistic results. In this work,
we delve into the formulation of altering only the mouth shapes of the target
person. This requires masking a large percentage of the original image and
seamlessly inpainting it with the aid of audio and reference frames. To this
end, we propose the Audio-Visual Context-Aware Transformer (AV-CAT) framework,
which produces accurate lip-sync with photo-realistic quality by predicting the
masked mouth shapes. Our key insight is to exploit desired contextual
information provided in audio and visual modalities thoroughly with delicately
designed Transformers. Specifically, we propose a convolution-Transformer
hybrid backbone and design an attention-based fusion strategy for filling the
masked parts. It uniformly attends to the textural information on the unmasked
regions and the reference frame. Then the semantic audio information is
involved in enhancing the self-attention computation. Additionally, a
refinement network with audio injection improves both image and lip-sync
quality. Extensive experiments validate that our model can generate
high-fidelity lip-synced results for arbitrary subjects.
Comment: Accepted to SIGGRAPH Asia 2022 (Conference Proceedings). Project
page: https://hangz-nju-cuhk.github.io/projects/AV-CA
Species‐specific plant‐mediated effects between herbivores converge at high damage intensity
Plants are often exposed to multiple herbivores, and densities of these attackers (or corresponding damage intensities) often fluctuate greatly in the field. Plant-mediated interactions vary among herbivore species and with changing feeding intensity, but little is known about how herbivore identity and density interact to determine plant responses and herbivore fitness. Here, we investigated this question using Triadica sebifera (tallow) and two common and abundant specialist insect herbivores, Bikasha collaris (flea beetle) and Heterapoderopsis bicallosicollis (weevil). By manipulating densities of leaf-feeding adults of these two herbivore species, we tested how variations in the intensity of leaf damage caused by flea beetle or weevil adults affected the performance of root-feeding flea beetle larvae and evaluated the potential of induced tallow root traits to predict flea beetle larval performance. We found that weevil adults consistently decreased the survival of flea beetle larvae with increasing leaf damage intensities. In contrast, conspecific flea beetle adults increased larval survival at low damage but decreased it at high damage, resulting in a unimodal pattern. Chemical analyses showed that increasing leaf damage from weevil adults linearly decreased root carbohydrates and increased root tannin, whereas flea beetle adults had effects opposite to those of weevil adults at low damage and similar effects at high damage. Furthermore, across all feeding treatments, flea beetle larval survival correlated positively with concentrations of carbohydrates and negatively with concentrations of tannin, suggesting that root primary and secondary metabolism might underlie the observed effects on flea beetle larvae. Our study demonstrates that herbivore identity and density interact to determine systemic plant responses and plant-mediated effects on herbivores.
In particular, effects are species-specific at low densities, but converge at high densities. These findings emphasize the importance of considering herbivore identity and density simultaneously when investigating factors driving plant-mediated interactions between herbivores, which advances our understanding of the structure and composition of herbivore communities and terrestrial food webs.