40 research outputs found
I&S-ViT: An Inclusive & Stable Method for Pushing the Limit of Post-Training ViTs Quantization
Despite the scalable performance of vision transformers (ViTs), their heavy
computational costs (training & inference) undermine their position in
industrial applications. Post-training quantization (PTQ), which tunes ViTs with a
tiny dataset and runs them in a low-bit format, addresses the cost issue well but
unfortunately suffers larger performance drops in lower-bit cases. In this paper, we
introduce I&S-ViT, a novel method that regulates the PTQ of ViTs in an
inclusive and stable fashion. I&S-ViT first identifies two issues in the PTQ of
ViTs: (1) Quantization inefficiency in the prevalent log2 quantizer for
post-Softmax activations; (2) Rugged and magnified loss landscape in
coarse-grained quantization granularity for post-LayerNorm activations. Then,
I&S-ViT addresses these issues by introducing: (1) A novel shift-uniform-log2
quantizer (SULQ) that incorporates a shift mechanism followed by uniform
quantization to achieve both an inclusive domain representation and accurate
distribution approximation; (2) A three-stage smooth optimization strategy
(SOS) that amalgamates the strengths of channel-wise and layer-wise
quantization to enable stable learning. Comprehensive evaluations across
diverse vision tasks validate I&S-ViT's superiority over existing PTQ methods
for ViTs, particularly in low-bit scenarios. For instance, I&S-ViT elevates the
performance of 3-bit ViT-B by an impressive 50.68%.
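To make the first issue concrete, the following is a minimal sketch of a plain log2 quantizer applied to post-Softmax values. It is not SULQ and not the authors' code; the function names `log2_quantize` and `log2_dequantize`, the rounding rule, and the clipping convention are assumptions made for illustration.

```python
import math


def log2_quantize(x: float, bits: int) -> int:
    """Map a softmax probability x in [0, 1] to an integer code q ~ -log2(x).

    Assumed convention: codes are clipped to [0, 2**bits - 1], and an
    underflowed probability of 0 is sent to the largest code.
    """
    if x < 0.0 or x > 1.0:
        raise ValueError("expected a probability in [0, 1]")
    q_max = 2 ** bits - 1
    if x == 0.0:
        return q_max
    q = round(-math.log2(x))
    return min(max(q, 0), q_max)


def log2_dequantize(q: int) -> float:
    """Reconstruct the value represented by code q, i.e. 2**(-q)."""
    return 2.0 ** (-q)


if __name__ == "__main__":
    # Print how a few probabilities are coded and reconstructed at 3 bits.
    for p in (0.9, 0.7, 0.3, 0.01, 1e-9):
        code = log2_quantize(p, bits=3)
        print(p, code, log2_dequantize(code))
```

Under this sketch the representable values are only the powers of two 1, 1/2, 1/4, and so on, so every probability above about 0.71 reconstructs to 1.0 and very small probabilities all share the last code, which is one way to see the inefficiency the abstract refers to.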
Spatial Re-parameterization for N:M Sparsity
This paper presents a Spatial Re-parameterization (SpRe) method for the N:M
sparsity in CNNs. SpRe stems from an observation regarding the restricted
variety in spatial sparsity present in N:M sparsity compared with unstructured
sparsity. Particularly, N:M sparsity exhibits a fixed sparsity rate within the
spatial domains due to its distinctive pattern that mandates N non-zero
components among M successive weights in the input channel dimension of
convolution filters. In contrast, we observe that unstructured sparsity
displays a substantial divergence in sparsity across the spatial domains, which
we experimentally verify to be crucial for its robust performance
retention compared with N:M sparsity. Therefore, SpRe employs the
spatial-sparsity distribution of unstructured sparsity to assign an extra
branch in conjunction with the original N:M branch at training time, which
allows the N:M sparse network to sustain a similar distribution of spatial
sparsity with unstructured sparsity. During inference, the extra branch can be
further re-parameterized into the main N:M branch, without exerting any
distortion on the sparse pattern or additional computation costs. SpRe brings
the performance of N:M sparsity methods on par
with state-of-the-art unstructured sparsity methods across various benchmarks.
Code and models are anonymously available at
https://github.com/zyxxmu/SpRe.
Comment: 11 pages, 4 figures
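The N:M pattern described in this abstract can be illustrated with a small sketch. It keeps N weights in every group of M successive weights along one dimension and zeroes the rest. Choosing the kept weights by largest magnitude is an assumption (a common choice, not stated in the abstract), and the name `nm_sparsify` is hypothetical; this is not the SpRe method itself.

```python
def nm_sparsify(weights, n, m):
    """Keep n weights per group of m successive weights, zeroing the others.

    Assumption: the kept weights are the n with the largest magnitude.
    `weights` stands for a filter flattened along the input channel dimension.
    """
    if not 0 < n <= m:
        raise ValueError("need 0 < n <= m")
    if len(weights) % m != 0:
        raise ValueError("length must be a multiple of m")
    pruned = []
    for start in range(0, len(weights), m):
        group = weights[start:start + m]
        # Indices of the n largest-magnitude entries in this group.
        order = sorted(range(m), key=lambda i: abs(group[i]), reverse=True)
        keep = set(order[:n])
        pruned.extend(w if i in keep else 0.0 for i, w in enumerate(group))
    return pruned


if __name__ == "__main__":
    row = [0.1, -0.9, 0.4, 0.05, 0.3, 0.2, -0.7, 0.6]
    print(nm_sparsify(row, n=2, m=4))
```

Because every group of M keeps exactly N entries, the sparsity rate is the same everywhere, which is the fixed spatial sparsity the abstract contrasts with unstructured pruning.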
Shadow Removal by High-Quality Shadow Synthesis
Most shadow removal methods rely on training images
that require laborious and costly shadow region annotations, leading to the
increasing popularity of shadow image synthesis. However, these synthesized
images also cause poor performance, since they are often
shadow-inauthentic and details-impaired. In this paper, we present a novel
generation framework, referred to as HQSS, for high-quality pseudo shadow image
synthesis. The given image is first decoupled into a shadow region identity and
a non-shadow region identity. HQSS employs a shadow feature encoder and a
generator to synthesize pseudo images. Specifically, the encoder extracts the
shadow feature of a region identity which is then paired with another region
identity to serve as the generator input to synthesize a pseudo image. The
pseudo image is expected to carry the same shadow feature as its input shadow feature
as well as image details as realistic as those of its input region identity. To
fulfill this goal, we design three learning objectives. When the shadow feature
and input region identity are from the same region identity, we propose a
self-reconstruction loss that guides the generator to reconstruct an identical
pseudo image as its input. When the shadow feature and input region identity
are from different identities, we introduce an inter-reconstruction loss and a
cycle-reconstruction loss to make sure that shadow characteristics and detail
information can be well retained in the synthesized images. Our HQSS is
observed to outperform the state-of-the-art methods on the ISTD dataset, the Video
Shadow Removal dataset, and the SRD dataset. The code is available at
https://github.com/zysxmu/HQSS.
MultiQuant: A Novel Multi-Branch Topology Method for Arbitrary Bit-width Network Quantization
Arbitrary bit-width network quantization has received significant attention
due to its high adaptability to various bit-width requirements during runtime.
However, in this paper, we investigate existing methods and observe a
significant accumulation of quantization errors caused by frequent bit-width
switching of weights and activations, leading to limited performance. To
address this issue, we propose MultiQuant, a novel method that utilizes a
multi-branch topology for arbitrary bit-width quantization. MultiQuant
duplicates the network body into multiple independent branches and quantizes
the weights of each branch to a fixed 2-bit while retaining the input
activations in the expected bit-width. This approach keeps the
computational cost unchanged while avoiding the switching of weight
bit-widths, thereby substantially reducing errors in weight quantization.
Additionally, we introduce an amortization branch selection strategy to
distribute quantization errors caused by activation bit-width switching among
branches to enhance performance. Finally, we design an in-place distillation
strategy that facilitates guidance between branches to further enhance
MultiQuant's performance. Extensive experiments demonstrate that MultiQuant
achieves significant performance gains compared to existing arbitrary bit-width
quantization methods. Code is at https://github.com/zysxmu/MultiQuant.
Fine-grained Data Distribution Alignment for Post-Training Quantization
While post-training quantization is popular mostly because it
avoids accessing the original complete training dataset, its poor
performance also stems from the scarcity of images. To alleviate this limitation, in
this paper, we leverage the synthetic data introduced by zero-shot quantization
together with the calibration dataset and propose a fine-grained data distribution alignment
(FDDA) method to boost the performance of post-training quantization. The
method is based on two important properties of batch normalization statistics
(BNS) we observed in deep layers of the trained network, i.e., inter-class
separation and intra-class incohesion. To preserve this fine-grained
distribution information: 1) We calculate the per-class BNS of the calibration
dataset as the BNS centers of each class and propose a BNS-centralized loss to
force the synthetic data distributions of different classes to be close to
their own centers. 2) We add Gaussian noise into the centers to imitate the
incohesion and propose a BNS-distorted loss to force the synthetic data
distribution of the same class to be close to the distorted centers. By
utilizing these two fine-grained losses, our method achieves
state-of-the-art performance on ImageNet, especially when both the first and
last layers are quantized to low bit-widths. Code is at
https://github.com/zysxmu/FDDA.
Comment: ECCV202
Local and systemic therapy may be safely de-escalated in elderly breast cancer patients in China: A retrospective cohort study
Background: For elderly patients with breast cancer, the treatment strategy is still controversial. In China, preoperative axillary lymph node needle biopsy is not widely used, resulting in many patients receiving axillary lymph node dissection (ALND) directly. Our study aims to determine whether local and systemic therapy can be safely de-escalated in elderly breast cancer patients.
Methods: Patients aged ≥70 years were retrospectively enrolled from our institution’s medical records between May 2013 and July 2021. Groups were assigned according to local and systemic treatment regimens, and stratified analysis was performed by molecular subtypes. Univariate and multivariate survival analyses were used to compare the effects of different regimens on relapse-free survival (RFS).
Results: A total of 653 patients were enrolled for preliminary data analysis, and 563 patients were screened for survival analysis. The mean follow-up was 19 months (range, 1–82 months). Axillary lymph node metastases were pathologically confirmed in only 2.1% of cN0 cases and up to 97.1% of cN+ cases. Regarding breast surgery, RFS showed no significant difference between the mastectomy and BCS groups (p = 0.3078). As for axillary surgery, patients in the ALND group showed significantly better RFS than those in the sentinel lymph node biopsy (SLNB) group among pN0 patients (p = 0.0128). Among these cases, the proportion of cN+ in ALND was significantly higher than that in SLNB (6.4% vs. 0.4%, p = 0.002), which meant axillary lymph nodes (ALNs) of ALND patients were larger on imaging and more likely to be misdiagnosed as metastatic. With regard to adjuvant therapy, univariate and multivariate analyses showed that RFS was similar across the different comprehensive adjuvant regimens, especially in the hormone receptor (HR)+/human epidermal growth factor receptor 2 (HER2)− subgroup, where patients who did not receive any adjuvant therapy accounted for 15.7% (p > 0.05).
Conclusions: It is feasible to reduce some unnecessary local or systemic treatments for elderly breast cancer patients, especially in the HR+/HER2− subtype. Multiple patient-related factors should be considered when making treatment plans.
Genetic and metabolic links between the murine microbiome and memory.
Background: Recent evidence has linked the gut microbiome to host behavior via the gut-brain axis [1-3]; however, the underlying mechanisms remain unexplored. Here, we determined the links between host genetics, the gut microbiome and memory using the genetically defined Collaborative Cross (CC) mouse cohort, complemented with microbiome and metabolomic analyses in conventional and germ-free (GF) mice.
Results: A genome-wide association analysis (GWAS) identified 715 of 76,080 single-nucleotide polymorphisms (SNPs) that were significantly associated with short-term memory using the passive avoidance model. The identified SNPs were enriched in genes known to be involved in learning and memory functions. By 16S rRNA gene sequencing of the gut microbial community in the same CC cohort, we identified specific microorganisms that were significantly correlated with longer latencies in our retention test, including a positive correlation with Lactobacillus. Inoculation of GF mice with individual species of Lactobacillus (L. reuteri F275, L. plantarum BDGP2 or L. brevis BDGP6) resulted in significantly improved memory compared to uninoculated or E. coli DH10B inoculated controls. Untargeted metabolomics analysis revealed significantly higher levels of several metabolites, including lactate, in the stools of Lactobacillus-colonized mice, when compared to GF control mice. Moreover, we demonstrate that dietary lactate treatment alone boosted memory in conventional mice. Mechanistically, we show that both inoculation with Lactobacillus and lactate treatment significantly increased the levels of the neurotransmitter gamma-aminobutyric acid (GABA) in the hippocampus of the mice.
Conclusion: Together, this study provides new evidence for a link between Lactobacillus and memory, and our results open possible new avenues for treating memory impairment disorders using specific gut microbial inoculants and/or metabolites.
A Search for Light Super Symmetric Baryons
We have searched for the production and decay of light super-symmetric
baryons produced in 800 GeV/c proton-copper interactions in a charged hyperon
beam experiment. We observe no evidence for the decays
$R^+(uud\tilde{g}) \to S(uds\tilde{g})\,\pi^+$ and $X^-(ssd\tilde{g}) \to S(uds\tilde{g})\,\pi^-$
in the predicted parent mass and
lifetime ranges of 1700-2500 MeV/$c^2$ and 50-500 ps. Production upper limits for
$R^+$ at $x_F=0.47$, $P_t=1.4$ GeV/$c^2$ and $X^-$ at $x_F=0.48$, $P_t=0.65$ GeV/$c^2$ of less than
$10^{-3}$ of all charged secondary particles produced are obtained for all but the
highest masses and shortest lifetimes predicted.
Comment: 9 pages, uuencoded postscript 4 figures uuencoded, tar-compressed
file (submitted to PRL)