40 research outputs found
Discuss Before Moving: Visual Language Navigation via Multi-expert Discussions
Visual language navigation (VLN) is an embodied task demanding a wide range
of skills encompassing understanding, perception, and planning. For such a
multifaceted challenge, previous VLN methods rely entirely on a single model's own
reasoning to make predictions within one round. However, existing models, even
the most advanced large language model GPT4, still struggle to handle
multiple tasks by single-round self-thinking. In this work, drawing inspiration
from expert consultation meetings, we introduce a novel zero-shot VLN
framework. Within this framework, large models possessing distinct abilities
serve as domain experts. Our proposed navigation agent, namely DiscussNav,
can actively discuss with these experts to collect essential information before
moving at every step. These discussions cover critical navigation subtasks like
instruction understanding, environment perception, and completion estimation.
Through comprehensive experiments, we demonstrate that discussions with domain
experts can effectively facilitate navigation by perceiving
instruction-relevant information, correcting inadvertent errors, and sifting
through inconsistent movement decisions. Results on the
representative VLN task R2R show that our method surpasses the leading
zero-shot VLN model by a large margin on all metrics. Additionally, real-robot
experiments display the obvious advantages of our method over single-round
self-thinking.
Comment: Submitted to ICRA 202
DGMem: Learning Visual Navigation Policy without Any Labels by Dynamic Graph Memory
In recent years, learning-based approaches have demonstrated significant
promise in addressing intricate navigation tasks. Traditional methods for
training deep neural network navigation policies rely on meticulously designed
reward functions or extensive teleoperation datasets as navigation
demonstrations. However, the former is often confined to simulated
environments, and the latter demands substantial human labor, making it a
time-consuming process. Our vision is for robots to autonomously learn
navigation skills and adapt their behaviors to environmental changes without
any human intervention. In this work, we discuss the self-supervised navigation
problem and present Dynamic Graph Memory (DGMem), which facilitates training
only with on-board observations. With the help of DGMem, agents can actively
explore their surroundings, autonomously acquiring a comprehensive navigation
policy in a data-efficient manner without external feedback. Our method is
evaluated in photorealistic 3D indoor scenes, and empirical studies demonstrate
the effectiveness of DGMem.
Comment: 8 pages, 6 figures
Robust Navigation with Cross-Modal Fusion and Knowledge Transfer
Recently, learning-based approaches have shown promising results in navigation
tasks. However, poor generalization capability and the simulation-reality
gap prevent their wide application. We consider the problem of improving
the generalization of mobile robots and achieving sim-to-real transfer for
navigation skills. To that end, we propose a cross-modal fusion method and a
knowledge transfer framework for better generalization. This is realized by a
teacher-student distillation architecture. The teacher learns a discriminative
representation and a near-perfect policy in an ideal environment. By
imitating the behavior and representation of the teacher, the student is able
to align the features from noisy multi-modal input and reduce the influence of
variations on navigation policy. We evaluate our method in simulated and
real-world environments. Experiments show that our method outperforms the
baselines by a large margin and achieves robust navigation performance with
varying working conditions.
Comment: Accepted by ICRA 202
Seeing What You Miss: Vision-Language Pre-training with Semantic Completion Learning
Cross-modal alignment is essential for vision-language pre-training (VLP)
models to learn the correct corresponding information across different
modalities. For this purpose, inspired by the success of masked language
modeling (MLM) tasks in the NLP pre-training area, numerous masked modeling
tasks have been proposed for VLP to further promote cross-modal interactions.
The core idea of previous masked modeling tasks is to focus on reconstructing
the masked tokens based on visible context for learning local-to-local
alignment. However, most of them pay little attention to the global semantic
features generated for the masked data, which limits the cross-modal
alignment ability of global representations. Therefore, in this paper, we
propose a novel Semantic Completion Learning (SCL) task, complementary to
existing masked modeling tasks, to facilitate global-to-local alignment.
Specifically, the SCL task complements the missing semantics of masked data by
capturing the corresponding information from the other modality, promoting
learning more representative global features which have a great impact on the
performance of downstream tasks. Moreover, we present a flexible vision
encoder, which enables our model to perform image-text and video-text
multimodal tasks simultaneously. Experimental results show that our proposed
method obtains state-of-the-art performance on various vision-language
benchmarks, such as visual question answering, image-text retrieval, and
video-text retrieval.
Neural Chinese Word Segmentation with Lexicon and Unlabeled Data via Posterior Regularization
Existing methods for Chinese word segmentation (CWS) usually rely on a large
number of labeled sentences to train word segmentation models, which are
expensive and time-consuming to annotate. Fortunately, unlabeled data is usually
easy to collect and many high-quality Chinese lexicons are available
off-the-shelf, both of which can provide useful information for CWS. In this
paper, we propose a neural approach for CWS which can exploit both lexicon and
unlabeled data. Our approach is based on a variant of the posterior
regularization algorithm, and
the unlabeled data and lexicon are incorporated into model training as indirect
supervision by regularizing the prediction space of CWS models. Extensive
experiments on multiple benchmark datasets in both in-domain and cross-domain
scenarios validate the effectiveness of our approach.
Comment: 7 pages, 11 figures, accepted by the 2019 World Wide Web Conference
(WWW '19)
Methylprednisolone as Adjunct to Endovascular Thrombectomy for Large-Vessel Occlusion Stroke
Importance
It is uncertain whether intravenous methylprednisolone improves outcomes for patients with acute ischemic stroke due to large-vessel occlusion (LVO) undergoing endovascular thrombectomy.
Objective
To assess the efficacy and adverse events of adjunctive intravenous low-dose methylprednisolone to endovascular thrombectomy for acute ischemic stroke secondary to LVO.
Design, Setting, and Participants
This investigator-initiated, randomized, double-blind, placebo-controlled trial was implemented at 82 hospitals in China, enrolling 1680 patients with stroke and proximal intracranial LVO presenting within 24 hours of time last known to be well. Recruitment took place between February 9, 2022, and June 30, 2023, with a final follow-up on September 30, 2023.
Interventions
Eligible patients were randomly assigned to intravenous methylprednisolone (n = 839) at 2 mg/kg/d or placebo (n = 841) for 3 days adjunctive to endovascular thrombectomy.
Main Outcomes and Measures
The primary efficacy outcome was disability level at 90 days as measured by the overall distribution of the modified Rankin Scale scores (range, 0 [no symptoms] to 6 [death]). The primary safety outcomes included mortality at 90 days and the incidence of symptomatic intracranial hemorrhage within 48 hours.
Results
Among 1680 patients randomized (median age, 69 years; 727 female [43.3%]), 1673 (99.6%) completed the trial. The median 90-day modified Rankin Scale score was 3 (IQR, 1-5) in the methylprednisolone group vs 3 (IQR, 1-6) in the placebo group (adjusted generalized odds ratio for a lower level of disability, 1.10 [95% CI, 0.96-1.25]; P = .17). In the methylprednisolone group, there was a lower mortality rate (23.2% vs 28.5%; adjusted risk ratio, 0.84 [95% CI, 0.71-0.98]; P = .03) and a lower rate of symptomatic intracranial hemorrhage (8.6% vs 11.7%; adjusted risk ratio, 0.74 [95% CI, 0.55-0.99]; P = .04) compared with placebo.
Conclusions and Relevance
Among patients with acute ischemic stroke due to LVO undergoing endovascular thrombectomy, adjunctive methylprednisolone added to endovascular thrombectomy did not significantly improve the degree of overall disability.
Trial Registration
ChiCTR.org.cn Identifier: ChiCTR210005172
Axial Cyclic Testing of Concrete-Filled Steel Tube Columns in Diagrid Structures
Inclined concrete-filled steel tube (CFST) columns in a diagrid structure system can efficiently carry large vertical loads and horizontal forces. This paper presents an experimental study of the stress characteristics of engineered inclined CFST columns under axial cyclic loading. Ten specimens were tested, including two hollow steel tube (HST) columns and eight CFST columns, and the influences of loading scheme, aspect ratio, concrete strength, and steel ratio were examined. The seismic behaviours were investigated, including mechanical behaviour, failure modes and hysteretic curves, and ductility, and the interaction between the steel tube and concrete was examined as well. Better ductility and energy dissipation capacity are achieved in the tension direction, whereas higher bearing capacity and stiffness are achieved in the compression direction. Compared with hollow steel tube columns, the supporting effect of concrete on the steel tube for CFST columns in tension and the restraining effect of the steel tube on concrete for CFST columns in compression ensure higher capacity, deformability, and energy dissipation capacity.
Insights into friction properties and mechanism of self-lubricating MoVN-Ag films at high temperature
MoVN is a promising high-temperature lubricating material. In this study, MoVN-Ag films with different Ag contents are synthesized by pulsed DC reactive magnetron sputtering for lubrication over a wide temperature range. The effect of Ag content on the mechanical properties and tribological behavior of the films at 25 °C, 300 °C, 500 °C and 700 °C is investigated. The results reveal that although silver doping is detrimental to the mechanical properties of MoVN films, it improves their tribological properties. The optimized MoVN-Ag film with an Ag content of 45.6 at.% shows promising self-lubricating performance and a low wear rate at the different test temperatures. The average friction coefficients are as low as about 0.19 and 0.28 at 500 °C and 700 °C, respectively. Different friction mechanisms operate at the test temperatures: an Ag-diffused self-lubricating film at 500 °C, and Magnéli and double-oxide easy-shear phases formed at 700 °C, dominate the low-friction behavior of the MoVN-Ag films.