183 research outputs found
A hybrid Decoder-DeepONet operator regression framework for unaligned observation data
Deep neural operators (DNOs) have been utilized to approximate nonlinear
mappings between function spaces. However, DNOs face the challenge of increased
dimensionality and computational cost associated with unaligned observation
data. In this study, we propose a hybrid Decoder-DeepONet operator regression
framework to handle unaligned data effectively. Additionally, we introduce a
Multi-Decoder-DeepONet, which utilizes an average field of training data as
input augmentation. The consistency of both frameworks with operator
approximation theory is established on the basis of the universal approximation
theorem. Two numerical experiments, a Darcy problem and the flow field around an
airfoil, are conducted to validate the efficiency and accuracy of the proposed
methods. Results illustrate the advantages of Decoder-DeepONet and
Multi-Decoder-DeepONet in handling unaligned observation data and showcase
their potential for improving prediction accuracy.
Comment: 35 pages, 10 figures, 11 tables
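For orientation, the standard DeepONet on which such frameworks build approximates an operator G as an inner product of a branch network (fed the input function u sampled at m sensor points) and a trunk network (fed a query location y). The formula below is the usual DeepONet form, not the Decoder-DeepONet proposed in the paper; the symbols b_k, t_k, b_0, p, and m are notation assumed here.

```latex
G(u)(y) \;\approx\; \sum_{k=1}^{p} b_k\bigl(u(x_1), \dots, u(x_m)\bigr)\, t_k(y) + b_0
```

With unaligned observations, each training sample supplies its own set of query points y, so training pairs are formed per (u, y) combination rather than on a shared output grid.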
Unified Language Representation for Question Answering over Text, Tables, and Images
When trying to answer complex questions, people often rely on multiple
sources of information, such as visual, textual, and tabular data. Previous
approaches to this problem have focused on designing input features or model
structure in the multi-modal space, which is inflexible for cross-modal
reasoning or data-efficient training. In this paper, we call for an alternative
paradigm, which transforms the images and tables into unified language
representations, so that the task reduces to a simpler textual QA
problem that can be solved in three steps: retrieval, ranking, and
generation, all within a language space. This idea takes advantage of the power
of pre-trained language models and is implemented in a framework called Solar.
Our experimental results show that Solar outperforms all existing methods by
10.6-32.3 pts on two datasets, MultimodalQA and MMCoQA, across ten different
metrics. Additionally, Solar achieves the best performance on the WebQA
leaderboard.
Comment: Findings of ACL 202
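One way to picture turning a table into language is to flatten each row into a sentence that a text-only retriever or reader can consume. The sketch below is illustrative only and is not Solar's actual procedure; the function name `linearize_table`, its sentence template, and the toy table are all hypothetical.

```python
def linearize_table(title, header, rows):
    """Flatten a table into one plain-text sentence per row.

    The "In <title>, <column> is <value>, ..." template is an assumption
    made for illustration, not a format taken from the paper.
    """
    sentences = []
    for row in rows:
        # Pair each cell with its column name so the sentence is self-describing.
        cells = [f"{col} is {val}" for col, val in zip(header, row)]
        sentences.append(f"In {title}, " + ", ".join(cells) + ".")
    return sentences


# Toy table, invented for this example.
example = linearize_table(
    "toy scores",
    ["team", "points"],
    [["A", 3], ["B", 5]],
)
```

Once tables (and, via captioning, images) are expressed this way, retrieval, ranking, and generation can all operate on ordinary text.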
DeNoising-MOT: Towards Multiple Object Tracking with Severe Occlusions
Multiple object tracking (MOT) tends to become more challenging when severe
occlusions occur. In this paper, we analyze the limitations of traditional
Convolutional Neural Network-based methods and Transformer-based methods in
handling occlusions and propose DNMOT, an end-to-end trainable DeNoising
Transformer for MOT. To address the challenge of occlusions, we explicitly
simulate scenarios in which occlusions occur. Specifically, we augment the
trajectories with noise during training and make our model learn the denoising
process in an encoder-decoder architecture, so that it exhibits
strong robustness and performs well in crowded scenes. Additionally, we
propose a Cascaded Mask strategy to better coordinate the interaction between
different types of queries in the decoder and prevent mutual suppression
between neighboring trajectories in crowded scenes. Notably, the proposed
method requires no additional modules such as a matching strategy or motion state
estimation at inference. We conduct extensive experiments on the MOT17, MOT20,
and DanceTrack datasets, and the experimental results show that our method
outperforms previous state-of-the-art methods by a clear margin.
Comment: ACM Multimedia 202
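The idea of augmenting trajectories with noise during training can be sketched as jittering each bounding box before it is fed to the model, which is then trained to recover the clean box. This is a minimal sketch under stated assumptions: the (cx, cy, w, h) box format, the uniform noise, and the `scale` parameter are hypothetical choices, not details taken from the paper.

```python
import random


def add_box_noise(box, scale=0.1, rng=None):
    """Jitter one (cx, cy, w, h) box; noise is proportional to the box size."""
    rng = rng or random.Random()
    cx, cy, w, h = box
    # Shift the center by at most scale * size along each axis.
    cx += rng.uniform(-scale, scale) * w
    cy += rng.uniform(-scale, scale) * h
    # Rescale width and height by a factor in [1 - scale, 1 + scale].
    w *= 1.0 + rng.uniform(-scale, scale)
    h *= 1.0 + rng.uniform(-scale, scale)
    return (cx, cy, w, h)


def noisy_trajectory(boxes, scale=0.1, seed=0):
    """Apply independent noise to every box of a trajectory (seeded for repeatability)."""
    rng = random.Random(seed)
    return [add_box_noise(b, scale, rng) for b in boxes]
```

During training, the noised boxes would serve as inputs and the original boxes as denoising targets; at inference no noise is added.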
Data-Driven Modeling of Landau Damping by Physics-Informed Neural Networks
Kinetic approaches are generally accurate in dealing with microscale plasma
physics problems but are computationally expensive for large-scale or
multiscale systems. One of the long-standing problems in plasma physics is the
integration of kinetic physics into fluid models, which is often achieved
through sophisticated analytical closure terms. In this study, we successfully
construct a multi-moment fluid model with an implicit fluid closure included in
the neural network using machine learning. The multi-moment fluid model is
trained with a small fraction of sparsely sampled data from kinetic simulations
of Landau damping, using the physics-informed neural network (PINN) and the
gradient-enhanced physics-informed neural network (gPINN). The multi-moment
fluid model constructed using either PINN or gPINN reproduces the time
evolution of the electric field energy, including its damping rate, and the
plasma dynamics from the kinetic simulations. For the first time, we introduce
a new variant of the gPINN architecture, namely, gPINN to capture the Landau
damping process. Instead of including the gradients of all the equation
residuals, gPINN only adds the gradient of the pressure equation residual as
one additional constraint. Among the three approaches, the gPINN-constructed
multi-moment fluid model offers the most accurate results. This work sheds new
light on the accurate and efficient modeling of large-scale systems, which can
be extended to complex multiscale laboratory, space, and astrophysical plasma
physics problems.
Comment: 11 pages, 7 figures
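The difference between the loss functions described above can be written schematically. This is a sketch in assumed notation, not the paper's exact formulation: r_i denotes the residual of the i-th fluid moment equation, r_p the residual of the pressure equation, and the weights \lambda_i and \lambda_g are hypothetical.

```latex
% PINN: data misfit plus the equation residuals
\mathcal{L}_{\text{PINN}} = \mathcal{L}_{\text{data}} + \sum_{i} \lambda_i \,\lVert r_i \rVert^2

% gPINN: additionally penalizes the gradients of all residuals
\mathcal{L}_{\text{gPINN}} = \mathcal{L}_{\text{PINN}} + \lambda_g \sum_{i} \lVert \nabla_{(x,t)}\, r_i \rVert^2

% Variant described above: only the pressure-equation residual gradient is added
\mathcal{L}_{\text{variant}} = \mathcal{L}_{\text{PINN}} + \lambda_g \,\lVert \nabla_{(x,t)}\, r_p \rVert^2
```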