172 research outputs found
Advancing Vision Transformers with Group-Mix Attention
Vision Transformers (ViTs) have been shown to enhance visual recognition
through modeling long-range dependencies with multi-head self-attention (MHSA),
which is typically formulated as Query-Key-Value computation. However, the
attention map generated from the Query and Key captures only token-to-token
correlations at one single granularity. In this paper, we argue that
self-attention should have a more comprehensive mechanism to capture
correlations among tokens and groups (i.e., multiple adjacent tokens) for
higher representational capacity. Therefore, we propose Group-Mix Attention (GMA)
as an advanced replacement for traditional self-attention, which can
simultaneously capture token-to-token, token-to-group, and group-to-group
correlations with various group sizes. To this end, GMA splits the Query, Key,
and Value into segments uniformly and performs different group aggregations to
generate group proxies. The attention map is computed based on the mixtures of
tokens and group proxies and used to re-combine the tokens and groups in Value.
Based on GMA, we introduce a powerful backbone, namely GroupMixFormer, which
achieves state-of-the-art performance in image classification, object
detection, and semantic segmentation with fewer parameters than existing
models. For instance, GroupMixFormer-L (with 70.3M parameters and 384^2 input)
attains 86.2% Top-1 accuracy on ImageNet-1K without external data, while
GroupMixFormer-B (with 45.8M parameters) attains 51.2% mIoU on ADE20K.
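The mechanism described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a 1D token sequence, a single head, and average pooling over adjacent tokens as a stand-in for the paper's group aggregators. The function names and the window sizes (1, 3, 5) are hypothetical.

```python
import math
import random

def avg_pool_tokens(x, window):
    """Average each token with its neighbours inside a centred window.

    x is a list of tokens (lists of floats). window=1 leaves tokens unchanged;
    larger windows yield one "group proxy" per position. Average pooling is an
    assumption, not the aggregator used in the paper.
    """
    n, half = len(x), window // 2
    out = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        out.append([sum(x[j][d] for j in range(lo, hi)) / (hi - lo)
                    for d in range(len(x[i]))])
    return out

def group_mix(x, windows):
    """Split channels uniformly into one segment per window size, aggregate
    each segment with its own window, then concatenate the segments again."""
    dim = len(x[0])
    seg = dim // len(windows)
    assert seg * len(windows) == dim, "channels must split uniformly"
    mixed = [[] for _ in x]
    for s, w in enumerate(windows):
        segment = [tok[s * seg:(s + 1) * seg] for tok in x]
        for row, pooled in zip(mixed, avg_pool_tokens(segment, w)):
            row.extend(pooled)
    return mixed

def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def group_mix_attention(q, k, v, windows=(1, 3, 5)):
    """Scaled dot-product attention over mixtures of tokens and group proxies."""
    qm, km, vm = (group_mix(t, windows) for t in (q, k, v))
    scale = 1.0 / math.sqrt(len(qm[0]))
    attn = [softmax([scale * sum(a * b for a, b in zip(qi, kj)) for kj in km])
            for qi in qm]
    out = [[sum(w * vj[d] for w, vj in zip(row, vm)) for d in range(len(vm[0]))]
           for row in attn]
    return out, attn

# Toy usage: 6 tokens with 6 channels, so each of the 3 segments has 2 channels.
random.seed(0)
q, k, v = ([[random.gauss(0, 1) for _ in range(6)] for _ in range(6)]
           for _ in range(3))
out, attn = group_mix_attention(q, k, v)
```

Because the segment with window 1 keeps individual tokens while the other segments hold pooled neighbourhoods, a single attention map in this sketch mixes token-to-token, token-to-group, and group-to-group terms.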
MetaBEV: Solving Sensor Failures for BEV Detection and Map Segmentation
Perception systems in modern autonomous driving vehicles typically take
inputs from complementary multi-modal sensors, e.g., LiDAR and cameras.
However, in real-world applications, sensor corruptions and failures lead to
inferior performances, thus compromising autonomous safety. In this paper, we
propose a robust framework, called MetaBEV, to address extreme real-world
environments involving a total of six sensor corruptions and two extreme
sensor-missing situations. In MetaBEV, signals from multiple sensors are first
processed by modal-specific encoders. Subsequently, a set of dense BEV queries,
termed meta-BEV, is initialized. These queries are then processed iteratively
by a BEV-Evolving decoder, which selectively aggregates deep features from
either LiDAR, cameras, or both modalities. The updated BEV representations are
further leveraged for multiple 3D prediction tasks. Additionally, we introduce
a new M2oE structure to alleviate the performance drop on distinct tasks in
multi-task joint learning. Finally, MetaBEV is evaluated on the nuScenes
dataset with 3D object detection and BEV map segmentation tasks. Experiments
show that MetaBEV outperforms prior art by a large margin on both full and
corrupted modalities. For instance, when the LiDAR signal is missing, MetaBEV
improves detection NDS by 35.5% and segmentation mIoU by 17.7% over the vanilla
BEVFusion model; and when the camera signal is absent, MetaBEV still achieves
69.2% NDS and 53.7% mIoU, which is even higher than previous works that operate
on full modalities. Moreover, MetaBEV performs competitively against previous methods
in both canonical perception and multi-task learning settings, setting a new
state of the art in nuScenes BEV map segmentation with 70.4% mIoU.
Comment: Project page: https://chongjiange.github.io/metabev.htm
DeepAccident: A Motion and Accident Prediction Benchmark for V2X Autonomous Driving
Safety is the primary priority of autonomous driving. Nevertheless, no
published dataset currently supports the direct and explainable safety
evaluation for autonomous driving. In this work, we propose DeepAccident, a
large-scale dataset generated via a realistic simulator containing diverse
accident scenarios that frequently occur in real-world driving. The proposed
DeepAccident dataset contains 57K annotated frames and 285K annotated samples,
approximately 7 times more than the large-scale nuScenes dataset with 40K
annotated samples. In addition, we propose a new task, end-to-end motion and
accident prediction, based on the proposed dataset, which can be used to
directly evaluate the accident prediction ability for different autonomous
driving algorithms. Furthermore, for each scenario, we set four vehicles along
with one infrastructure to record data, thus providing diverse viewpoints for
accident scenarios and enabling V2X (vehicle-to-everything) research on
perception and prediction tasks. Finally, we present a baseline V2X model named
V2XFormer that demonstrates superior performance for motion and accident
prediction and 3D object detection compared to the single-vehicle model.
Speed Co-Augmentation for Unsupervised Audio-Visual Pre-training
This work aims to improve unsupervised audio-visual pre-training. Inspired by
the efficacy of data augmentation in visual contrastive learning, we propose a
novel speed co-augmentation method that randomly changes the playback speeds of
both audio and video data. Despite its simplicity, the speed co-augmentation
method possesses two compelling attributes: (1) it increases the diversity of
audio-visual pairs and doubles the size of negative pairs, resulting in a
significant enhancement in the learned representations, and (2) it changes the
strict correlation between audio-visual pairs but introduces a partial
relationship between the augmented pairs, which is modeled by our proposed
SoftInfoNCE loss to further boost the performance. Experimental results show
that the proposed method significantly improves the learned representations
when compared to vanilla audio-visual contrastive learning.
Comment: Published at the CVPR 2023 Sight and Sound workshop
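As a rough illustration of the augmentation described in this abstract, the sketch below changes the playback speed of a video and its audio track by nearest-index resampling. The candidate speeds, the independent draw per modality, and the function names are assumptions made for illustration; the abstract specifies neither the resampling method nor the form of the SoftInfoNCE loss, which is not sketched here.

```python
import random

def change_speed(sequence, speed):
    """Resample a sequence by nearest-index lookup so that it plays `speed`
    times faster (speed > 1 shortens it, speed < 1 lengthens it)."""
    length = max(1, int(len(sequence) / speed))
    last = len(sequence) - 1
    return [sequence[min(last, int(i * speed))] for i in range(length)]

def speed_co_augment(video_frames, audio_samples, speeds=(1.0, 2.0), rng=random):
    """Draw one playback speed per modality and resample both streams.

    Drawing the two speeds independently is an assumption: it yields pairs
    whose audio and video may run at different speeds, i.e. pairs that are
    only partially related.
    """
    video_speed = rng.choice(speeds)
    audio_speed = rng.choice(speeds)
    return (change_speed(video_frames, video_speed),
            change_speed(audio_samples, audio_speed),
            video_speed, audio_speed)

# Toy usage: 8 "frames" and 16 "audio samples" represented by their indices.
random.seed(0)
frames, samples = list(range(8)), list(range(16))
aug_frames, aug_samples, v_speed, a_speed = speed_co_augment(frames, samples)
```

The returned speeds could then serve as a hint about how strongly an augmented pair is related, although how the paper actually uses them is not stated in the abstract.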
Comparison of Analysis and Spectral Nudging Techniques for Dynamical Downscaling with the WRF Model over China
To overcome the problem that the horizontal resolution of global climate models may be too low to resolve features that are important at regional or local scales, dynamical downscaling has been used extensively. However, dynamical downscaling results generally drift away from the large-scale driving fields. The nudging technique can be used to balance the performance of dynamical downscaling at large and small scales, but the relative performance of the two nudging techniques (analysis nudging and spectral nudging) is debated. Moreover, dynamical downscaling is now performed at the convection-permitting scale to reduce parameterization uncertainty and obtain finer resolution. To compare the performance of the two nudging techniques, this study conducts three sensitivity experiments (with no nudging, analysis nudging, and spectral nudging) covering a period of two months with a grid spacing of 6 km over continental China, downscaling the 1-degree National Centers for Environmental Prediction (NCEP) dataset with the Weather Research and Forecasting (WRF) model. Compared with observations, the results show that both nudging experiments decrease the bias of conventional meteorological elements near the surface and at different heights during dynamical downscaling. However, spectral nudging outperforms analysis nudging for predicting precipitation, and analysis nudging outperforms spectral nudging for the simulation of air humidity and wind speed.
Relationships of beans intake with chronic kidney disease in rural adults: A large-scale cross-sectional study
Background and aims: Dietary factors play an important role in the development of chronic kidney disease (CKD). However, evidence on the relationship of beans consumption with CKD remains limited and inconclusive, especially in middle- and low-income populations. The current study aimed to investigate the relationships of beans intake with indicators of kidney injury and CKD prevalence in rural adults.
Methods: A total of 20,733 rural adults from the Henan Rural Cohort Study in 2018–2022 were included. Total beans intake was collected using a validated food frequency questionnaire. Indicators of kidney injury and CKD were determined by the estimated glomerular filtration rate and the urinary albumin to creatinine ratio. Generalized linear regression and logistic regression models were applied to estimate the relationship of beans intake with continuous and dichotomized indicators of renal function, respectively.
Results: Of the 20,733 participants, 2,676 (12.91%) were identified as CKD patients. After adjusting for potential confounders, participants in the higher quartiles of beans intake had a lower prevalence of CKD (odds ratio and 95% confidence interval, OR (95% CI); Q2: 0.968 (0.866–1.082); Q3: 0.836 (0.744–0.939); Q4: 0.854 (0.751–0.970)) and albuminuria (Q2: 0.982 (0.875–1.102); Q3: 0.846 (0.750–0.954); Q4: 0.852 (0.746–0.973)), compared with Q1. Each 50 g/day increment in beans intake was significantly associated with a 5% and 4% decreased prevalence of albuminuria and CKD, respectively. These inverse relationships were also significant in the subgroups of men, elderly, and high-income participants (p < 0.05).
Conclusion: Dietary beans intake was inversely associated with the prevalence of albuminuria and CKD in rural adults, suggesting that promoting soy food intake might help reduce the occurrence of CKD in rural adults.
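The odds ratios in this abstract come from logistic regression, where an odds ratio is the exponential of a model coefficient. The sketch below assumes the reported intervals are Wald intervals of the form exp(beta ± z·SE), which the abstract does not state, and recovers the coefficient and its standard error from a reported interval. The function name is hypothetical.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution

def log_odds_from_interval(lower, upper):
    """Recover a log-odds coefficient and its standard error from a 95% CI,
    assuming a Wald interval exp(beta - z*se) to exp(beta + z*se)."""
    beta = (math.log(lower) + math.log(upper)) / 2
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    return beta, se

# Reported CKD odds ratio for Q3 versus Q1 of beans intake: 0.836 (0.744 to 0.939).
beta, se = log_odds_from_interval(0.744, 0.939)
implied_or = math.exp(beta)  # geometric mean of the two bounds

# Reported CKD prevalence: 2,676 of 20,733 participants.
prevalence_percent = 100 * 2676 / 20733
```

Under that assumption the midpoint of the interval on the log scale should land close to the reported point estimate, which gives a quick consistency check on a reported odds ratio and its interval.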
Simultaneous radical cystectomy and nephroureterectomy in the treatment of panurothelial carcinoma: a systematic review and single-arm meta-analysis
Background: Panurothelial carcinoma is a rare and aggressive malignancy that requires effective treatment strategies to enhance patient outcomes.
Methods: We conducted a systematic search of English publications in databases including PubMed, Embase, Cochrane Library, and Web of Science up to May 2023. The quality of the literature was assessed using the Newcastle-Ottawa Scale (NOS) and the Methodological Quality and Synthesis of Case Series and Case Reports tool. Data statistics and analysis were performed using Stata 15.1 software (StataSE, USA).
Results: Six studies involving 339 patients were included in the analysis. Meta-analysis revealed that simultaneous radical cystectomy and nephroureterectomy had 2-year and 5-year overall survival rates of 68% (95% CI 60%-76%, I² = 12.4%, P < 0.001) and 44% (95% CI 36%-53%, I² = 0, P < 0.001), respectively. The 2-year and 5-year progression-free survival rates were 91% (95% CI 86%-95%, I² = 95%, P < 0.001) and 65% (95% CI 58%-73%, I² = 91.5%, P < 0.001), respectively. The 2-year and 5-year cancer-specific survival rates were 73% (95% CI 66%-81%, I² = 16.7%, P < 0.001) and 57% (95% CI 49%-66%, I² = 0, P < 0.001), respectively. Additionally, the incidence of minor complications was 19% (95% CI 15%-23%, P < 0.01), that of major complications was 49% (95% CI 34%-63%, P < 0.01), and the intraoperative blood transfusion rate was 53% (95% CI 44%-61%, P < 0.01).
Conclusions: Simultaneous radical cystectomy and nephroureterectomy is a feasible approach for the treatment of panurothelial carcinoma. Nonetheless, a comprehensive assessment of the surgical risks and benefits is imperative, and larger-scale prospective cohort studies are required to validate therapeutic efficacy.
Systematic review registration: https://www.crd.york.ac.uk/PROSPERO, identifier CRD42023426401
Is first pregnancy age associated with hypertension in the Chinese rural women population?
Introduction: The purpose of this study was to investigate the relationship between first pregnancy age and hypertension later in life in women from Chinese rural areas.
Methods: In total, 13,493 women were enrolled in the Henan Rural Cohort study. Logistic regression and linear regression were used to evaluate the association between first pregnancy age and hypertension and blood pressure indicators [including systolic blood pressure (SBP), diastolic blood pressure (DBP), and mean arterial pressure (MAP)]. A restricted cubic spline was used to examine the dose–response relationship between first pregnancy age and hypertension or blood pressure indicators.
Results: After adjusting for potential confounders, each 1-year increase in first pregnancy age was associated with a 0.221 mmHg increase in SBP values, a 0.153 mmHg increase in DBP values, and a 0.176 mmHg decrease in MAP values (all P < 0.05). The β of SBP, DBP, and MAP first increased and then decreased with increasing first pregnancy age, and the associations with SBP, DBP, and MAP were not statistically significant for first pregnancy ages beyond 33 years. A 1-year increment in first pregnancy age was associated with 2.9% [OR (95% CI): 1.029 (1.010, 1.048)] higher odds of prevalent hypertension. After adjusting for potential confounders, the odds of hypertension increased sharply and then leveled off as first pregnancy age increased.
Conclusion: First pregnancy age might increase the risk of hypertension later in life and might be an independent risk factor for hypertension in women.