Visualizing Morphogenesis through Instability Formation in 4-D Printing.
Heterogeneous growth in a myriad of biological systems can lead to the formation of distinct morphologies during the maturation processes of different species. We demonstrate that the distinct circumferential buckling observed in pumpkins can be reproduced by a core-shell barrel structure using four-dimensional (4D) printing, taking advantage of digital light processing (DLP)-based three-dimensional (3D) printing and stimulus-responsive hydrogels. The mechanical mismatch between the stiff core and compliant shell results in buckling instability on the surface. The initiation and development of the buckling are governed by the ratio of core/shell radius, the ratio of core/shell swelling ratios, and the mismatch between the core and shell in stiffness. Furthermore, the rigid core not only acts as a source of circumferential confinement but also sets a boundary at the poles of the entire structure. The heterogeneous structures with controllable buckling behave, geometrically and structurally, much like the fruits of plants. This replicates the biological morphological change and elucidates the general mechanism and dynamics of the complex instability formation of heterogeneous 3D objects.
Optimal map-making with singularities
In this work, we investigate the optimal map-making technique for the linear
system $\bm{d} = \bm{A}\bm{x} + \bm{n}$ while carefully taking into account
singularities that may come from either the covariance matrix
$\bm{C} = \langle \bm{n}\bm{n}^t \rangle$ or the main matrix $\bm{A}$. We first describe the general
optimal solution, which is quite complex, and then use the modified pseudo
inverse to create a near-optimal solution, which is simple, robust, and can
significantly alleviate the unwanted noise amplification during map-making. The
effectiveness of the nearly optimal solution is then compared to that of the
naive co-adding solution and the standard pseudo inverse solution, showing
noticeable improvements. Interestingly, all one needs to get the near-optimal
solution with singularity is just a tiny change to the traditional optimal
solution that is designed for the case without singularity.
Comment: 24 pages, 7 figures, and 2 appendices
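As a rough illustration of the estimators being compared (not the paper's own derivation), write the data model as $\bm{d} = \bm{A}\bm{x} + \bm{n}$ with noise covariance $\bm{C}$; these symbols follow the usual map-making convention and are assumed here. When nothing is singular, the traditional optimal (generalized least-squares) map is the first line below. The second line is one common way to write a plain pseudo-inverse variant, with $^{+}$ denoting the Moore-Penrose pseudo inverse; the paper's modified pseudo inverse is not reproduced here.

```latex
\begin{align}
\hat{\bm{x}}_{\mathrm{opt}}  &= \left(\bm{A}^{t}\bm{C}^{-1}\bm{A}\right)^{-1}\bm{A}^{t}\bm{C}^{-1}\bm{d}, \\
\hat{\bm{x}}_{\mathrm{pinv}} &= \left(\bm{A}^{t}\bm{C}^{+}\bm{A}\right)^{+}\bm{A}^{t}\bm{C}^{+}\bm{d}.
\end{align}
```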
Bootstrap Generalization Ability from Loss Landscape Perspective
Domain generalization aims to learn a model that can generalize well on the
unseen test dataset, i.e., out-of-distribution data, which has different
distribution from the training dataset. To address domain generalization in
computer vision, we introduce the loss landscape theory into this field.
Specifically, we bootstrap the generalization ability of the deep learning
model from the loss landscape perspective in four aspects, including backbone,
regularization, training paradigm, and learning rate. We verify the proposed
theory on the NICO++, PACS, and VLCS datasets by doing extensive ablation
studies as well as visualizations. In addition, we apply this theory in the
ECCV 2022 NICO Challenge and achieve 3rd place without using any domain
invariant methods.
Comment: 18 pages, 4 figures
Graph Reinforcement Learning Application to Co-operative Decision-Making in Mixed Autonomy Traffic: Framework, Survey, and Challenges
Proper functioning of connected and automated vehicles (CAVs) is crucial for
the safety and efficiency of future intelligent transport systems. Meanwhile,
transitioning to fully autonomous driving requires a long period of mixed
autonomy traffic, including both CAVs and human-driven vehicles. Thus,
collaboration decision-making for CAVs is essential to generate appropriate
driving behaviors to enhance the safety and efficiency of mixed autonomy
traffic. In recent years, deep reinforcement learning (DRL) has been widely
used in solving decision-making problems. However, the existing DRL-based
methods have been mainly focused on solving the decision-making of a single
CAV. Using the existing DRL-based methods in mixed autonomy traffic cannot
accurately represent the mutual effects of vehicles and model dynamic traffic
environments. To address these shortcomings, this article proposes a graph
reinforcement learning (GRL) approach for multi-agent decision-making of CAVs
in mixed autonomy traffic. First, a generic and modular GRL framework is
designed. Then, a systematic review of DRL and GRL methods is presented,
focusing on the problems addressed in recent research. Moreover, a comparative
study on different GRL methods is further proposed based on the designed
framework to verify the effectiveness of GRL methods. Results show that the GRL
methods can well optimize the performance of multi-agent decision-making for
CAVs in mixed autonomy traffic compared to the DRL methods. Finally, challenges
and future research directions are summarized. This study can provide a
valuable research reference for solving the multi-agent decision-making
problems of CAVs in mixed autonomy traffic and can promote the implementation
of GRL-based methods into intelligent transportation systems. The source code
of our work can be found at https://github.com/Jacklinkk/Graph_CAVs.
Comment: 22 pages, 7 figures, 10 tables. Currently under review at IEEE
Transactions on Intelligent Transportation Systems
Lawyer LLaMA Technical Report
Large Language Models (LLMs), like LLaMA, have exhibited remarkable
performances across various tasks. Nevertheless, when deployed to specific
domains such as law or medicine, the models still confront the challenge of a
deficiency in domain-specific knowledge and an inadequate capability to
leverage that knowledge to resolve domain-related problems. In this paper, we
focus on the legal domain and explore how to inject domain knowledge during the
continual training stage and how to design proper supervised finetune tasks to
help the model tackle practical issues. Moreover, to alleviate the
hallucination problem during the model's generation, we add a retrieval module and
extract relevant articles before the model answers any queries. Augmenting with
the extracted evidence, our model could generate more reliable responses. We
release our data and model at https://github.com/AndrewZhe/lawyer-llama.
Comment: Work in progress
Video-Text Modeling with Zero-Shot Transfer from Contrastive Captioners
This work explores an efficient approach to establish a foundational
video-text model for tasks including open-vocabulary video classification,
text-to-video retrieval, video captioning and video question-answering. We
present VideoCoCa that reuses a pretrained image-text contrastive captioner
(CoCa) model and adapts it to video-text tasks with minimal extra training.
While previous works adapt image-text models with various cross-frame fusion
modules (for example, cross-frame attention layer or perceiver resampler) and
finetune the modified architecture on video-text data, we surprisingly find
that the generative attentional pooling and contrastive attentional pooling
layers in the image-text CoCa design are instantly adaptable to ``flattened
frame embeddings'', yielding a strong zero-shot transfer baseline for many
video-text tasks. Specifically, the frozen image encoder of a pretrained
image-text CoCa takes each video frame as input and generates $N$ token
embeddings per frame for a total of $T$ video frames. We flatten the $N \times T$
token embeddings as a long sequence of frozen video representation and apply
CoCa's generative attentional pooling and contrastive attentional pooling on
top. All model weights including pooling layers are directly loaded from an
image-text CoCa pretrained model. Without any video or video-text data,
VideoCoCa's zero-shot transfer baseline already achieves state-of-the-art
results on zero-shot video classification on Kinetics 400/600/700, UCF101,
HMDB51, and Charades, as well as zero-shot text-to-video retrieval on MSR-VTT
and ActivityNet Captions. We also explore lightweight finetuning on top of
VideoCoCa, and achieve strong results on video question-answering (iVQA,
MSRVTT-QA, MSVD-QA) and video captioning (MSR-VTT, ActivityNet, Youcook2). Our
approach establishes a simple and effective video-text baseline for future
research.
Comment: Technical report
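The flattening step described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not VideoCoCa's implementation: `flatten_frames` and `attention_pool` are hypothetical names, the symbols T (frames) and N (tokens per frame) are chosen here for illustration, and the pooling is a simplified single-query, single-head dot-product attention without the learned projections that a real attentional pooling layer would have.

```python
import math


def flatten_frames(frame_tokens):
    """Flatten T frames of N token vectors each into one sequence of N*T tokens."""
    return [token for frame in frame_tokens for token in frame]


def attention_pool(tokens, query):
    """Pool a token sequence with one query via softmax dot-product attention.

    Simplified stand-in (an assumption): no key/value projections, one head.
    """
    dim = len(query)
    scores = [
        sum(q * t for q, t in zip(query, token)) / math.sqrt(dim)
        for token in tokens
    ]
    # Subtract the max score before exponentiating for numerical stability.
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    # Weighted sum of token vectors, one output value per dimension.
    return [
        sum(w * token[i] for w, token in zip(weights, tokens))
        for i in range(dim)
    ]


# Toy input: T = 2 frames, N = 3 tokens per frame, 2-dimensional embeddings.
frames = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[2.0, 0.0], [0.0, 2.0], [2.0, 2.0]],
]
video_tokens = flatten_frames(frames)            # sequence of N * T tokens
pooled = attention_pool(video_tokens, [0.0, 0.0])  # zero query: uniform weights
```

Because the pooling operates on a sequence of any length, the same pooling function accepts tokens from one image or from many flattened frames, which is the property the abstract relies on.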
LATITUDE: Robotic Global Localization with Truncated Dynamic Low-pass Filter in City-scale NeRF
Neural Radiance Fields (NeRFs) have achieved great success in representing
complex 3D scenes with high-resolution details and efficient memory.
Nevertheless, current NeRF-based pose estimators have no initial pose
prediction and are prone to local optima during optimization. In this paper, we
present LATITUDE: Global Localization with Truncated Dynamic Low-pass Filter,
which introduces a two-stage localization mechanism in city-scale NeRF. In the
place recognition stage, we train a regressor through images generated from
trained NeRFs, which provides an initial value for global localization. In the pose
optimization stage, we minimize the residual between the observed image and
rendered image by directly optimizing the pose on the tangent plane. To avoid
convergence to a local optimum, we introduce a Truncated Dynamic Low-pass Filter
(TDLF) for coarse-to-fine pose registration. We evaluate our method on both
synthetic and real-world data and show its potential applications for
high-precision navigation in large-scale city scenes. Codes and data will be
publicly available at https://github.com/jike5/LATITUDE.
Comment: 7 pages, 6 figures, submitted to ICRA 202