111 research outputs found
Association Between Sars-Cov-2 Reinfections And Anti-S Antibody Levels At The First Omicron Wave In Salvador, Brazil
Background: SARS-CoV-2, with its high transmissibility and rapid dissemination, has caused a global public health emergency. The emergence of new variants and mutations of SARS-CoV-2 spike protein antigens has raised concerns about immune escape and the potential for reinfection, even in individuals who have been previously infected or vaccinated. Brazil has been severely affected by the pandemic, especially in its densely populated slum areas. Our study aimed to evaluate the association between anti-S IgG antibody levels and subsequent SARS-CoV-2 infection during the Omicron wave in a susceptible community in Salvador, Brazil, to provide insight into the antibody level necessary for effective protection against infection with heterologous variants in similar settings. Methods and findings: We conducted this study in a cohort of 1827 residents of Pau da Lima, Salvador, Brazil. We measured serum levels of IgG against the SARS-CoV-2 spike protein between July and November 2021. From November 2021 to February 2022, during the first Omicron wave, we performed symptom-based screening and PCR testing to identify new infections. We used logistic regression to estimate the association between antibody levels and subsequent PCR-confirmed infection. Among 210 individuals in the cohort who underwent PCR testing, we did not identify any association between antibody levels and PCR-confirmed infection. Among a subset of 84 individuals who did not receive vaccination between the time of antibody measurement and the time of PCR testing, higher antibody levels were associated with increased odds of PCR-confirmed infection. Conclusion: We did not identify a protective effect of serum anti-S IgG levels on subsequent risk of infection during the Omicron wave. Further studies could address the limitations of our study (sample size, confounding) and evaluate the effect of variant-specific antibodies.
Accelerating Toeplitz Neural Network with Constant-time Inference Complexity
Toeplitz Neural Networks (TNNs) have exhibited outstanding performance in
various sequence modeling tasks. They outperform commonly used
Transformer-based models while benefiting from log-linear space-time
complexities. On the other hand, State Space Models (SSMs) achieve lower
performance than TNNs in language modeling but offer the advantage of constant
inference complexity. In this paper, we aim to combine the strengths of TNNs
and SSMs by converting TNNs to SSMs during inference, thereby enabling TNNs to
achieve the same constant inference complexities as SSMs. To accomplish this,
we formulate the conversion process as an optimization problem and provide a
closed-form solution. We demonstrate how to transform the target equation into
a Vandermonde linear system problem, which can be efficiently solved using the
Discrete Fourier Transform (DFT). Notably, our method requires no training and
maintains numerical stability. It can also be applied to any LongConv-based
model. To assess its effectiveness, we conduct extensive experiments on
language modeling tasks across various settings. Additionally, we compare our
method to other gradient-descent solutions, highlighting the superior numerical
stability of our approach. The source code is available at
https://github.com/OpenNLPLab/ETSC-Exact-Toeplitz-to-SSM-Conversion.
Comment: Accepted to EMNLP 2023. Yiran Zhong is the corresponding author. The source code is available at https://github.com/OpenNLPLab/ETSC-Exact-Toeplitz-to-SSM-Conversion.
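The Vandermonde-via-DFT step can be illustrated with a toy numpy sketch (my own illustration, not the authors' code): when the Vandermonde nodes are the N-th roots of unity, the Vandermonde matrix coincides with the DFT matrix, so the linear system V x = b is solved in O(N log N) by an inverse FFT instead of a general O(N^3) solve.

```python
import numpy as np

N = 8
omega = np.exp(-2j * np.pi / N)
nodes = omega ** np.arange(N)                 # the N-th roots of unity
V = np.vander(nodes, N, increasing=True)      # V[j, k] = nodes[j]**k, i.e. the DFT matrix
b = np.random.default_rng(0).standard_normal(N)

x_fft = np.fft.ifft(b)                        # O(N log N) solve via inverse DFT
x_direct = np.linalg.solve(V, b)              # O(N^3) reference solution
assert np.allclose(x_fft, x_direct)
```

The same identity underlies why such Vandermonde systems are both fast to solve and numerically stable: the DFT matrix is unitary up to scaling.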
Toward Accurate Camera-based 3D Object Detection via Cascade Depth Estimation and Calibration
Recent camera-based 3D object detection is limited by the precision of
transforming from image to 3D feature spaces, as well as the accuracy of object
localization within the 3D space. This paper aims to address such a fundamental
problem of camera-based 3D object detection: How to effectively learn depth
information for accurate feature lifting and object localization. Different
from previous methods which directly predict depth distributions by using a
supervised estimation model, we propose a cascade framework consisting of two
depth-aware learning paradigms. First, a depth estimation (DE) scheme leverages
relative depth information to realize the effective feature lifting from 2D to
3D spaces. Furthermore, a depth calibration (DC) scheme introduces depth
reconstruction to further adjust the 3D object localization perturbation along
the depth axis. In practice, the DE is explicitly realized by using both the
absolute and relative depth optimization loss to promote the precision of depth
prediction, while the capability of DC is implicitly embedded into the
detection Transformer through a depth denoising mechanism in the training
phase. The entire model is trained in an end-to-end manner. We propose a baseline detector and evaluate the effectiveness of our proposal with +2.2%/+2.7% NDS/mAP improvements on the NuScenes benchmark, achieving a competitive 55.9%/45.7% NDS/mAP. Furthermore, we conduct extensive experiments to demonstrate its generality across various detectors, with about +2% NDS improvements.
Comment: Accepted to ICRA202
Bibliometric and visualization analysis of research trend in mental health problems of children and adolescents during the COVID-19 pandemic
Objectives: To analyze the evolution of research on children's and adolescents' mental health during the COVID-19 pandemic and discuss research hotspots and cutting-edge developments. Methods: The literature obtained from the Web of Science Core Collection as of June 28, 2022, was analyzed using the CiteSpace and VOSviewer bibliometric visualization mapping software. Results: A total of 6,039 relevant papers were found, of which 5,594 were included in the study. The number of publications has been growing since 2020; publications were analyzed by country, institution, and journal. The co-citation analysis shows that research articles predominate among the highly cited articles and that systematic reviews applying critical appraisal are lacking. In the cluster analysis, mental health and life change were the most representative clusters. The timeline view of the keywords shows that Online learning (#0), Public health (#1), and Mental health (#2) are the three largest clusters and shows their change over time. Conclusion: This study analyzed the mental health of children and adolescents during the COVID-19 pandemic and identified hot trends and shortcomings, providing important references for the theoretical basis of future research and decision making, as well as technical guidance for systematic reviews.
SupFusion: Supervised LiDAR-Camera Fusion for 3D Object Detection
In this paper, we propose a novel training strategy called SupFusion, which
provides an auxiliary feature level supervision for effective LiDAR-Camera
fusion and significantly boosts detection performance. Our strategy involves a
data enhancement method named Polar Sampling, which densifies sparse objects
and trains an assistant model to generate high-quality features as the
supervision. These features are then used to train the LiDAR-Camera fusion
model, where the fusion feature is optimized to simulate the generated
high-quality features. Furthermore, we propose a simple yet effective deep fusion module, which consistently achieves superior performance compared with previous fusion methods under the SupFusion strategy. Our proposal thus has the following advantages. First, SupFusion introduces auxiliary feature-level supervision that can boost LiDAR-Camera detection performance without introducing extra inference costs. Second, the proposed deep fusion can continuously improve the detector's abilities. Both our SupFusion strategy and the deep fusion module are plug-and-play, and we conduct extensive experiments to demonstrate their effectiveness. Specifically, we gain around 2% 3D mAP improvements on the KITTI benchmark with multiple LiDAR-Camera 3D detectors.
Comment: Accepted to ICCV202
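The feature-level supervision can be sketched in a few lines (a hedged illustration with hypothetical shapes, not the authors' code): the fused student feature is pushed toward a frozen assistant ("teacher") feature with a simple mimic loss added to the detection loss, and the teacher is dropped at inference, so no extra cost is incurred.

```python
import numpy as np

rng = np.random.default_rng(0)
teacher_feat = rng.standard_normal((4, 64))   # high-quality assistant-model features (frozen)
fusion_feat = rng.standard_normal((4, 64))    # LiDAR-Camera fusion features (trainable)

# L2 mimic loss: optimize the fusion feature toward the teacher feature
mimic_loss = np.mean((fusion_feat - teacher_feat) ** 2)
assert mimic_loss > 0
```

Any distance (L2, cosine) would serve the same role; the key design point is that supervision happens at the feature level rather than only at the detection outputs.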
Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models
Linear attention is an efficient attention mechanism that has recently
emerged as a promising alternative to conventional softmax attention. With its
ability to process tokens in linear computational complexities, linear
attention, in theory, can handle sequences of unlimited length without
sacrificing speed, i.e., maintaining a constant training speed for various
sequence lengths with a fixed memory consumption. However, due to the issue
with cumulative summation (cumsum), current linear attention algorithms cannot
demonstrate their theoretical advantage in a causal setting. In this paper, we
present Lightning Attention-2, the first linear attention implementation that
enables linear attention to realize its theoretical computational benefits. To
achieve this, we leverage the idea of tiling, handling the intra-block and
inter-block components of the linear attention calculation separately.
Specifically, we utilize the conventional attention computation mechanism for
the intra-blocks and apply linear attention kernel tricks for the inter-blocks.
A tiling technique is adopted through both forward and backward procedures to
take full advantage of the GPU hardware. We implement our algorithm in Triton
to make it IO-aware and hardware-friendly. Various experiments are conducted on
different model sizes and sequence lengths. Lightning Attention-2 retains
consistent training and inference speed regardless of input sequence length and
is significantly faster than other attention mechanisms. The source code is
available at https://github.com/OpenNLPLab/lightning-attention.
Comment: Technical Report. Yiran Zhong is the corresponding author. The source code is available at https://github.com/OpenNLPLab/lightning-attention.
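The intra-/inter-block split can be sketched in plain numpy (a simplified illustration without feature maps or normalization, not the Triton kernel): within a block, masked attention scores are computed directly; across blocks, a running KV state lets earlier blocks contribute through the linear-attention trick, avoiding the sequential cumsum over individual tokens.

```python
import numpy as np

def tiled_linear_attention(q, k, v, block=4):
    """Blockwise causal linear attention: intra-block direct, inter-block via KV state."""
    n, d = q.shape
    out = np.zeros_like(v)
    kv = np.zeros((d, v.shape[1]))             # running sum of outer(k_i, v_i)
    for s in range(0, n, block):
        e = min(s + block, n)
        qb, kb, vb = q[s:e], k[s:e], v[s:e]
        out[s:e] = qb @ kv                     # inter-block: all previous blocks at once
        scores = np.tril(qb @ kb.T)            # intra-block: causally masked scores
        out[s:e] += scores @ vb
        kv += kb.T @ vb                        # fold this block into the state
    return out

def naive_linear_attention(q, k, v):
    return np.tril(q @ k.T) @ v                # O(n^2) causal reference (no softmax)

rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((8, 3)) for _ in range(3))
assert np.allclose(tiled_linear_attention(q, k, v), naive_linear_attention(q, k, v))
```

The tiled version touches the full score matrix only inside each small block, which is what makes the GPU-friendly IO pattern possible.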
All-pairs Consistency Learning for Weakly Supervised Semantic Segmentation
In this work, we propose a new transformer-based regularization to better
localize objects for weakly supervised semantic segmentation (WSSS). In
image-level WSSS, Class Activation Map (CAM) is adopted to generate object
localization as pseudo segmentation labels. To address the partial activation
issue of the CAMs, consistency regularization is employed to maintain
activation intensity invariance across various image augmentations. However,
such methods ignore pair-wise relations among regions within each CAM, which
capture context and should also be invariant across image views. To this end,
we propose a new all-pairs consistency regularization (ACR). Given a pair of
augmented views, our approach regularizes the activation intensities between
the two views, while also ensuring that the affinity across regions within
each view remains consistent. We adopt vision transformers, whose
self-attention mechanism naturally embeds pair-wise affinity. This enables us
to simply regularize the distance between the attention matrices of augmented
image pairs. Additionally, we introduce a novel class-wise localization method
that leverages the gradients of the class token. Our method can be seamlessly
integrated into existing WSSS methods using transformers without modifying the
architectures. We evaluate our method on PASCAL VOC and MS COCO datasets. Our
method produces noticeably better class localization maps (67.3% mIoU on PASCAL
VOC train), resulting in superior WSSS performance.
Comment: ICCV 2023 workshop
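The all-pairs consistency term can be sketched as follows (a toy illustration with made-up shapes and a single-head affinity in place of a real ViT, not the paper's code): each view's self-attention matrix encodes pair-wise region affinity, and the regularizer penalizes the distance between the matrices of the two augmented views.

```python
import numpy as np

def attention_matrix(x):
    """Toy single-head self-attention affinity over token features x (n, d)."""
    scores = x @ x.T / np.sqrt(x.shape[1])
    scores = np.exp(scores - scores.max(axis=1, keepdims=True))
    return scores / scores.sum(axis=1, keepdims=True)   # row-wise softmax

rng = np.random.default_rng(0)
view1 = rng.standard_normal((6, 16))                    # tokens of augmented view 1
view2 = view1 + 0.01 * rng.standard_normal((6, 16))     # mildly augmented view 2

# ACR-style loss: distance between the two views' attention matrices
acr_loss = np.mean((attention_matrix(view1) - attention_matrix(view2)) ** 2)
assert acr_loss >= 0.0
```

Because the affinity matrix captures relations between every pair of regions, matching it across views constrains far more structure than matching activation intensities alone.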
Linearized Relative Positional Encoding
Relative positional encoding is widely used in vanilla and linear
transformers to represent positional information. However, existing encoding
methods of a vanilla transformer are not always directly applicable to a linear
transformer, because the latter requires a decomposition of the query and key
representations into separate kernel functions. Nevertheless, principles for
designing encoding methods suitable for linear transformers remain
understudied. In this work, we put together a variety of existing linear
relative positional encoding approaches under a canonical form and further
propose a family of linear relative positional encoding algorithms via unitary
transformation. Our formulation leads to a principled framework that can be
used to develop new relative positional encoding methods that preserve linear
space-time complexity. Equipped with different models, the proposed linearized
relative positional encoding (LRPE) family derives effective encoding for
various applications. Experiments show that compared with existing methods,
LRPE achieves state-of-the-art performance in language modeling, text
classification, and image classification. Meanwhile, it establishes a general paradigm for designing a broader class of relative positional encoding methods that are applicable to linear transformers. The code is available at https://github.com/OpenNLPLab/Lrpe.
Comment: Reviewed by TMLR, decision pending. Yiran Zhong is the corresponding author. Code is available at https://github.com/OpenNLPLab/Lrpe.
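One member of the unitary family the paper generalizes is rotary-style encoding, which can be sketched in a few lines (my own toy example, not the LRPE code): multiplying queries and keys by a position-dependent unitary phase leaves norms intact and makes their inner product depend only on the relative offset, while keeping the query/key decomposition that linear attention requires.

```python
import numpy as np

theta = 0.3  # hypothetical rotation frequency

def encode(x, pos):
    """Apply a position-dependent unitary phase (norm-preserving)."""
    return x * np.exp(1j * theta * pos)

rng = np.random.default_rng(0)
q = rng.standard_normal(4) + 1j * rng.standard_normal(4)
k = rng.standard_normal(4) + 1j * rng.standard_normal(4)

# The score depends only on the relative offset m - n:
s1 = np.vdot(encode(k, 2), encode(q, 5))     # offset 3
s2 = np.vdot(encode(k, 7), encode(q, 10))    # offset 3
assert np.allclose(s1, s2)
```

The general framework replaces the scalar phase with an arbitrary unitary transformation per position, which is where the family of new encodings comes from.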
TransNormerLLM: A Faster and Better Large Language Model with Improved TransNormer
We present TransNormerLLM, the first linear attention-based Large Language
Model (LLM) that outperforms conventional softmax attention-based models in
terms of both accuracy and efficiency. TransNormerLLM evolves from the previous
linear attention architecture TransNormer by making advanced modifications that
include positional embedding, linear attention acceleration, gating mechanisms,
tensor normalization, and inference acceleration and stabilization.
Specifically, we use LRPE together with an exponential decay to avoid attention
dilution issues while allowing the model to retain global interactions between
tokens. Additionally, we propose Lightning Attention, a cutting-edge technique that more than doubles the runtime speed of linear attention and reduces its memory usage fourfold. To further enhance the performance of TransNormer, we leverage a gating mechanism for smooth training and a new tensor normalization scheme that accelerates the model further. Furthermore, we develop a robust inference
algorithm that ensures numerical stability and consistent inference speed,
regardless of the sequence length, showcasing superior efficiency during both
training and inference stages. We also implement an efficient model parallel
schema for TransNormerLLM, enabling seamless deployment on large-scale clusters
and facilitating expansion to even more extensive models, i.e., LLMs with 175B
parameters. We validate our model design through a series of ablations and
train models with sizes of 385M, 1B, and 7B on our self-collected corpus.
Benchmark results demonstrate that our models not only match the performance of
state-of-the-art LLMs with Transformer but are also significantly faster. Code
is released at: https://github.com/OpenNLPLab/TransnormerLLM.
Comment: Technical Report. Yiran Zhong is the corresponding author. Zhen Qin, Dong Li, Weigao Sun, Weixuan Sun, Xuyang Shen contributed equally to this paper. Code is released at: https://github.com/OpenNLPLab/TransnormerLLM.
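The exponential-decay mechanism used to avoid attention dilution can be sketched as a toy recurrence (a hedged illustration, not the paper's kernel): the running KV state is damped by a factor lambda < 1 at each step, so distant tokens contribute progressively less while constant-state recurrent inference is preserved.

```python
import numpy as np

def decayed_linear_attention(q, k, v, lam=0.9):
    """Causal linear attention with exponential decay on the recurrent KV state."""
    d, dv = q.shape[1], v.shape[1]
    kv = np.zeros((d, dv))
    out = np.zeros_like(v)
    for t in range(q.shape[0]):
        kv = lam * kv + np.outer(k[t], v[t])   # damp old state, add current token
        out[t] = q[t] @ kv
    return out

rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((5, 3)) for _ in range(3))
out = decayed_linear_attention(q, k, v)
assert out.shape == (5, 3)
```

Each output equals the decayed sum over the past, out[t] = sum over j <= t of lam^(t-j) (q_t . k_j) v_j, which bounds the effective context and keeps per-token inference cost independent of sequence length.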