16 research outputs found

    Revisiting Vision Transformer from the View of Path Ensemble

    Vision Transformers (ViTs) are normally regarded as a stack of transformer layers. In this work, we propose a novel view of ViTs showing that they can be seen as ensemble networks containing multiple parallel paths with different lengths. Specifically, we equivalently transform the traditional cascade of multi-head self-attention (MSA) and feed-forward network (FFN) into three parallel paths in each transformer layer. Then, we utilize the identity connection in our new transformer form and further transform the ViT into an explicit multi-path ensemble network. From the new perspective, these paths perform two functions: the first is to provide the feature for the classifier directly, and the second is to provide the lower-level feature representation for subsequent longer paths. We investigate the influence of each path on the final prediction and discover that some paths even degrade performance. Therefore, we propose path pruning and EnsembleScale for improvement, which cut out the underperforming paths and re-weight the ensemble components, respectively, to optimize the path combination and make the short paths focus on providing high-quality representations for subsequent paths. We also demonstrate that our path combination strategies can help ViTs go deeper and act as high-pass filters to filter out partial low-frequency signals. To further enhance the representation of paths that serve subsequent paths, self-distillation is applied to transfer knowledge from the long paths to the short paths. This work calls for more future research to explain and design ViTs from new perspectives.
    Comment: Accepted by ICCV 2023, oral presentation

    Mix-of-Show: Decentralized Low-Rank Adaptation for Multi-Concept Customization of Diffusion Models

    Public large-scale text-to-image diffusion models, such as Stable Diffusion, have gained significant attention from the community. These models can be easily customized for new concepts using low-rank adaptations (LoRAs). However, using multiple concept LoRAs to jointly support multiple customized concepts presents a challenge. We refer to this scenario as decentralized multi-concept customization, which involves single-client concept tuning and center-node concept fusion. In this paper, we propose a new framework called Mix-of-Show that addresses the challenges of decentralized multi-concept customization, including concept conflicts resulting from existing single-client LoRA tuning and identity loss during model fusion. Mix-of-Show adopts an embedding-decomposed LoRA (ED-LoRA) for single-client tuning and gradient fusion for the center node to preserve the in-domain essence of single concepts and support theoretically limitless concept fusion. Additionally, we introduce regionally controllable sampling, which extends spatially controllable sampling (e.g., ControlNet and T2I-Adaptor) to address attribute binding and missing object problems in multi-concept sampling. Extensive experiments demonstrate that Mix-of-Show is capable of composing multiple customized concepts with high fidelity, including characters, objects, and scenes.
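
    As background for the abstract above, a plain low-rank adaptation can be sketched as adding the product of two small matrices to a frozen weight matrix. This is only a minimal illustration of a generic LoRA update; it does not implement ED-LoRA or gradient fusion, and the names `matmul`, `apply_lora`, `lora_b`, `lora_a`, and `scale` are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def apply_lora(weight, lora_b, lora_a, scale=1.0):
    """Return weight + scale * (lora_b @ lora_a).

    weight: frozen (out x in) matrix
    lora_b: (out x r) matrix, lora_a: (r x in) matrix, with small rank r
    All names here are illustrative assumptions, not the paper's API.
    """
    delta = matmul(lora_b, lora_a)  # low-rank update of shape (out x in)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]
```

    For example, `apply_lora([[1.0, 0.0], [0.0, 1.0]], [[1.0], [2.0]], [[3.0, 4.0]])` adds a rank-1 update to a 2x2 identity matrix.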

    SRSF5‐Mediated Alternative Splicing of M Gene is Essential for Influenza A Virus Replication: A Host‐Directed Target Against Influenza Virus

    Splicing of influenza A virus (IAV) RNA is an essential process in the viral life cycle that involves the co‐opting of host factors. Here, it is demonstrated that induction of host serine and arginine‐rich splicing factor 5 (SRSF5) by IAV facilitates viral replication by enhancing viral M mRNA splicing. Mechanistically, SRSF5 directly binds M mRNA at conserved sites (M mRNA positions 163, 709, and 712) through its RRM2 domain, and interacts with U1 small nuclear ribonucleoprotein (snRNP) to promote M mRNA splicing and M2 production. Mutations introduced at the three binding sites, without changing the encoded amino acids, significantly attenuate virus replication and pathogenesis in vivo. Likewise, SRSF5 conditional knockout in the lung protects mice against lethal IAV challenge. Furthermore, anidulafungin, an approved antifungal drug, is identified as an inhibitor of SRSF5 that effectively blocks IAV replication in vitro and in vivo. In conclusion, SRSF5, as an activator of M mRNA splicing, promotes IAV replication and is a host‐derived antiviral target.

    Nonlinear Deblurring for Low-Light Saturated Image

    Single image deblurring has achieved significant progress for natural daytime images. Saturation is a common phenomenon in blurry images, due to low-light conditions and long exposure times. Conventional linear deblurring methods usually handle natural blurry images well but produce severe ringing artifacts when recovering low-light saturated blurry images. To solve this problem, we formulate the saturation deblurring problem as a nonlinear model, in which all the saturated and unsaturated pixels are modeled adaptively. Specifically, we additionally introduce a nonlinear function to the convolution operator to accommodate the saturation process in the presence of blurring. The proposed method has two advantages over previous methods. On the one hand, the proposed method achieves the same high quality of restoring the natural image as seen in conventional deblurring methods, while also reducing the estimation errors in saturated areas and suppressing ringing artifacts. On the other hand, compared with recent saturation-based deblurring methods, the proposed method captures the formation of unsaturated and saturated degradations straightforwardly rather than with cumbersome and error-prone detection steps. Note that this nonlinear degradation model can be naturally formulated in a maximum a posteriori framework, and can be efficiently decoupled into several solvable sub-problems via the alternating direction method of multipliers (ADMM). Experimental results on both synthetic and real-world images demonstrate that the proposed deblurring algorithm outperforms the state-of-the-art low-light saturation-based deblurring methods.
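
    The idea of applying a nonlinear function to the output of the convolution operator can be illustrated with a toy 1-D forward model. The abstract does not specify the nonlinear function, so the hard clipping to a sensor range used here is an assumption, and `saturated_blur` and `max_value` are hypothetical names; this is a sketch of the degradation only, not of the deblurring algorithm.

```python
def saturated_blur(signal, kernel, max_value=1.0):
    """Convolve a 1-D signal with a blur kernel, then clip to [0, max_value].

    Zero padding is used at the borders. The clipping step stands in for
    sensor saturation and is an illustrative assumption.
    """
    half = len(kernel) // 2
    blurred = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i - j + half  # true convolution: the kernel is flipped
            if 0 <= idx < len(signal):
                acc += w * signal[idx]
        blurred.append(acc)
    # Nonlinear step: pixels brighter than the sensor limit are saturated.
    return [min(max(v, 0.0), max_value) for v in blurred]
```

    With a bright point source such as `[0.0, 0.0, 4.0, 0.0, 0.0]` and kernel `[0.25, 0.5, 0.25]`, the linear blur would peak at 2.0, but the clipped observation flattens to 1.0, which is the kind of information loss a purely linear model cannot explain.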

    KVT: k-NN Attention for Boosting Vision Transformers

    Convolutional Neural Networks (CNNs) have dominated computer vision for years, due to their ability to capture locality and translation invariance. Recently, many vision transformer architectures have been proposed and they show promising performance. A key component in vision transformers is the fully-connected self-attention, which is more powerful than CNNs in modelling long range dependencies. However, since the current dense self-attention uses all image patches (tokens) to compute the attention matrix, it may neglect the locality of image patches and involve noisy tokens (e.g., cluttered background and occlusion), leading to a slow training process and potential degradation of performance. To address these problems, we propose the k-NN attention for boosting vision transformers. Specifically, instead of involving all the tokens for attention matrix calculation, we only select the top-k similar tokens from the keys for each query to compute the attention map. The proposed k-NN attention naturally inherits the local bias of CNNs without introducing convolutional operations, as nearby tokens tend to be more similar than others. In addition, the k-NN attention allows for the exploration of long range correlation and at the same time filters out irrelevant tokens by choosing the most similar tokens from the entire image. Despite its simplicity, we verify, both theoretically and empirically, that k-NN attention is powerful in speeding up training and distilling noise from input tokens. Extensive experiments are conducted by using 11 different vision transformer architectures to verify that the proposed k-NN attention can work with any existing transformer architecture to improve its prediction performance. The code is available at https://github.com/damo-cv/KVT.
    Comment: Accepted by ECCV 202
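
    The top-k selection described in this abstract can be sketched for a single attention head as follows. This is a minimal pure-Python illustration, not the code from the linked repository; the function name `knn_attention` and the use of scaled dot-product similarity are assumptions.

```python
import math


def knn_attention(queries, keys, values, k):
    """Single-head attention where each query attends only to its top-k keys.

    queries, keys: lists of vectors with equal dimension
    values: list of vectors, one per key
    k: number of most similar keys kept per query (1 <= k <= len(keys))
    """
    scale = 1.0 / math.sqrt(len(queries[0]))
    outputs = []
    for q in queries:
        # Scaled dot-product similarity between this query and every key.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, key)) for key in keys]
        # Keep the indices of the k highest scores; the rest are ignored.
        top = sorted(range(len(keys)), key=lambda j: scores[j], reverse=True)[:k]
        # Softmax over the selected scores only (max subtracted for stability).
        peak = max(scores[j] for j in top)
        weights = {j: math.exp(scores[j] - peak) for j in top}
        total = sum(weights.values())
        # Weighted sum of the selected value vectors.
        outputs.append([
            sum(weights[j] / total * values[j][d] for j in top)
            for d in range(len(values[0]))
        ])
    return outputs
```

    Setting `k` to the number of keys recovers ordinary dense attention, while `k=1` makes each query copy the value of its single most similar key.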

    A Simple and Unified Tagging Model with Priming for Relational Structure Predictions

    Relational structure extraction covers a wide range of tasks and plays an important role in natural language processing. Recently, many approaches have designed sophisticated graphical models to capture the complex relations between objects that are described in a sentence. In this work, we demonstrate that simple tagging models can surprisingly achieve competitive performance with a small trick -- priming. Tagging models with priming append information about the operated objects to the input sequence of a pretrained language model. Making use of the contextualized nature of pretrained language models, the priming approach helps the contextualized representation of the sentence better embed the information about the operated objects, and hence become more suitable for addressing relational structure extraction. We conduct extensive experiments on three different tasks that span ten datasets across five different languages, and show that our model is a general and effective model, despite its simplicity. We further carry out comprehensive analysis to understand our model and propose an efficient approximation to our method, which achieves almost the same performance but with faster inference speed.

    Dual-mode imaging and therapeutic effects of drug-loaded phase-transition nanoparticles combined with near-infrared laser and low-intensity ultrasound on ovarian cancer

    Chemotherapy and photo-sonodynamic therapy (PSDT) can be combined through drug delivery nano-platforms to enhance anti-tumor efficacy; however, this efficacy is limited by hypoxia in the tumor, which causes chemotherapy resistance. Perfluoropentane (PFP) has the ability to carry oxygen and to enhance ultrasound or photoacoustic (PA) imaging after vaporization. Herein, we constructed PTX/ICG- and oxygen-loaded PLGA nanoparticles (PIO_NPs), which have an oxygen-carrying PFP core and a PLGA shell loaded with indocyanine green (ICG) and paclitaxel (PTX). PIO_NPs showed good optical stability and phase-transition ability. Moreover, they could rapidly release PTX and generate ROS under near-infrared laser and low-intensity ultrasound. The PIO_NPs enhanced the contrast of ultrasound and PA imaging. In particular, PIO_NPs may be used to monitor and guide treatment, because their accumulation at the tumor site can be observed by PA imaging. Compared with PTX or other nanoparticles, PIO_NPs combined with laser and ultrasound (L.U) significantly induced apoptosis of SKOV3 cells and inhibited SKOV3 tumor growth. Therefore, PIO_NPs have great potential in cancer imaging and therapy.

    A DAAM1 3′-UTR SNP mutation regulates breast cancer metastasis through affecting miR-208a-5p-DAAM1-RhoA axis

    Background: Dishevelled-associated activator of morphogenesis 1 (DAAM1) is a member of the microfilament-related formins and mediates cell motility in breast cancer (BrCa). However, the genetic mutation status of DAAM1 mRNA and its correlation with pathological characteristics are still unclear. Methods: A patient cohort and BrCa cells were used to demonstrate the role of a functional SNP in the microRNA-208a-5p binding site of the DAAM1 3′-UTR and the underlying mechanism in BrCa metastasis. Results: The expression and activation of DAAM1 increased markedly in lymph node metastatic tissues. A genetic variant (rs79036859 A/G) was validated in the miR-208a-5p binding site of the DAAM1 3′-UTR. The G genotype (AG/GG) was a risk genotype for the metastasis of BrCa, acting by reducing the binding affinity of miR-208a-5p for the DAAM1 3′-UTR. Furthermore, the miR-208a-5p expression level was significantly suppressed in lymph node metastatic tissues compared with that in non-lymph node metastatic tissues. Overexpression of miR-208a-5p inhibited the DAAM1/RhoA signaling pathway, thereby decreasing migratory ability. Conclusion: Overall, the rs79036859 G variant of the DAAM1 3′-UTR was identified as playing a relevant role in BrCa metastasis by altering miR-208a-5p binding affinity.