
82 research outputs found

Avatar Knowledge Distillation: Self-ensemble Teacher Paradigm with Uncertainty

Author: Cao Jian
Chen Weihua
Huang Tao
Lu Yichen
Sun Xiuyu
Zhang Yuan
Publication date: 31/07/2023

Knowledge distillation is an effective paradigm for boosting the performance of pocket-size models; when multiple teacher models are available, the student can break its upper limit again. However, it is not economical to train diverse teacher models for a one-off distillation. In this paper, we introduce a new concept dubbed Avatars for distillation, which are inference ensemble models derived from the teacher. Concretely, (1) for each iteration of distillation training, various Avatars are generated by a perturbation transformation. We validate that Avatars have a higher upper limit of working capacity and teaching ability, aiding the student model in learning diverse and receptive knowledge perspectives from the teacher model. (2) During distillation, we propose an uncertainty-aware factor, derived from the variance of statistical differences between the vanilla teacher and the Avatars, to adaptively adjust the Avatars' contribution to knowledge transfer. Avatar Knowledge Distillation (AKD) is fundamentally different from existing methods and refines with the innovative view of unequal training. Comprehensive experiments demonstrate the effectiveness of our Avatars mechanism, which improves state-of-the-art distillation methods for dense prediction without extra computational cost. AKD brings gains of at most 0.7 AP on COCO 2017 for object detection and 1.83 mIoU on Cityscapes for semantic segmentation. Comment: Accepted by ACM MM 202

arXiv.org e-Print Archive

DAMO-YOLO : A Report on Real-Time Object Detection Design

Author: Chen Weihua
Huang Yilun
Jiang Yiqi
Sun Xiuyu
Xu Xianzhe
Zhang Yuan
Publication date: 23/04/2023

In this report, we present a fast and accurate object detection method dubbed DAMO-YOLO, which achieves higher performance than the state-of-the-art YOLO series. DAMO-YOLO is extended from YOLO with some new technologies, including Neural Architecture Search (NAS), efficient Reparameterized Generalized-FPN (RepGFPN), a lightweight head with AlignedOTA label assignment, and distillation enhancement. In particular, we use MAE-NAS, a method guided by the principle of maximum entropy, to search our detection backbone under the constraints of low latency and high performance, producing ResNet/CSP-like structures with spatial pyramid pooling and focus modules. In the design of necks and heads, we follow the rule of "large neck, small head". We import Generalized-FPN with accelerated queen-fusion to build the detector neck and upgrade its CSPNet with efficient layer aggregation networks (ELAN) and reparameterization. Then we investigate how detector head size affects detection performance and find that a heavy neck with only one task projection layer yields better results. In addition, AlignedOTA is proposed to solve the misalignment problem in label assignment, and a distillation schema is introduced to improve performance further. Based on these new techniques, we build a suite of models at various scales to meet the needs of different scenarios. For general industry requirements, we propose DAMO-YOLO-T/S/M/L, which achieve 43.6/47.7/50.2/51.9 mAP on COCO with latencies of 2.78/3.83/5.62/7.95 ms on T4 GPUs, respectively. Additionally, for edge devices with limited computing power, we propose the lightweight DAMO-YOLO-Ns/Nm/Nl models, which achieve 32.3/38.2/40.5 mAP on COCO with latencies of 4.08/5.05/6.69 ms on X86-CPU. Our proposed general and lightweight models outperform other YOLO series models in their respective application scenarios. Comment: Project Website: https://github.com/tinyvision/damo-yol

arXiv.org e-Print Archive

Characterization of severe fever with thrombocytopenia syndrome in rural regions of Zhejiang, China.

Author: Feng Cen
Li Shibo
Lou Xiuyu
Ojcius David M
Sun Yi
Wang Chengwei
Wang Zhongfa
Ye Ling
Zhang Lei
Zhang Yanjun
Publication venue: eScholarship, University of California
Publication date: 01/01/2014

Severe fever with thrombocytopenia syndrome virus (SFTSV) infections have recently been found in rural regions of Zhejiang. A severe fever with thrombocytopenia syndrome (SFTS) surveillance and sero-epidemiological investigation was conducted in the districts with outbreaks. During the study period of 2011-2014, a total of 51 SFTSV infection cases were identified and the case fatality rate was 12% (6/51). Ninety-two percent of the patients (47/51) were over 50 years of age, and 63% (32/51) of laboratory-confirmed cases occurred from May to July. Nine percent (11/120) of the serum samples from local healthy people without symptoms were found to be positive for antibodies to the SFTS virus. SFTSV strains were isolated by culture using Vero cells, and the whole genomic sequences of two SFTSV strains (01 and Zhao) were sequenced and submitted to GenBank. Homology analysis showed that the similarity of the target nucleocapsid gene from the SFTSV strains from different geographic areas was 94.2-100%. From the constructed phylogenetic tree, it was found that all the SFTSV strains diverged into two main clusters. Only the SFTSV strains from the Zhejiang (Daishan) region of China and the Yamaguchi and Miyazaki regions of Japan were clustered into lineage II, consistent with both of these regions being isolated areas with similar geographic features. Two out of eight predicted linear B cell epitopes from the nucleocapsid protein showed mutations between the SFTSV strains of different clusters, but these did not contribute to the binding ability of the specific SFTSV antibodies. This study confirmed that SFTSV has been circulating naturally and can cause a seasonal prevalence in Daishan, China. The results also suggest that the molecular characteristics of SFTSV are associated with the geographic region and that all SFTSV strains can be divided into two genotypes.
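The percentages quoted in this abstract can be checked directly against the counts given in parentheses. The short sketch below does only that arithmetic; the dictionary keys and the helper name `percent` are illustrative labels, not terms from the paper.

```python
# Counts reported in the abstract, as (numerator, denominator) pairs.
# The labels are illustrative names chosen for this check.
reported = {
    "case fatality rate": (6, 51),
    "patients over 50 years of age": (47, 51),
    "cases occurring May to July": (32, 51),
    "seropositive healthy residents": (11, 120),
}


def percent(numerator: int, denominator: int) -> float:
    """Return numerator/denominator expressed as a percentage."""
    return 100.0 * numerator / denominator


# Round to whole percentages, the precision used in the abstract.
rounded = {label: round(percent(n, d)) for label, (n, d) in reported.items()}

for label, (n, d) in reported.items():
    print(f"{label}: {n}/{d} = {percent(n, d):.1f}% (rounded: {rounded[label]}%)")
```

Each rounded value agrees with the figure stated in the abstract (12%, 92%, 63%, and 9%).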

Crossref

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Pacific McGeorge School of Law

Scholarly Commons

The Francis Crick Institute

Deconfounding Causal Inference for Zero-shot Action Recognition

Author: Jiang Yiqi
Long Yang
Pagnucco Maurice
Song Yang
Sun Xiuyu
Wang Junyan
Publication venue: Institute of Electrical and Electronics Engineers
Publication date: 22/09/2023

Zero-shot action recognition (ZSAR) aims to recognize unseen action categories in the test set without corresponding training examples. Most existing zero-shot methods follow the feature generation framework to transfer knowledge from seen action categories to model the feature distribution of unseen categories. However, due to the complexity and diversity of actions, it remains challenging to generate unseen feature distribution, especially for the cross-dataset scenario when there is potentially larger domain shift. This paper proposes a Deconfounding Causal GAN (DeCalGAN) for generating unseen action video features with the following technical contributions: 1) Our model unifies compositional ZSAR with traditional visual-semantic models to incorporate local object information with global semantic information for feature generation. 2) A GAN-based architecture is proposed for causal inference and unseen distribution discovery. 3) A deconfounding module is proposed to refine representations of local object and global semantic information confounder in the training data. Action descriptions and random object feature after causal inference are then used to discover unseen distributions of novel actions in different datasets. Our extensive experiments on Cross-Dataset Zero-Shot Action Recognition (CD-ZSAR) demonstrate substantial improvement over the UCF101 and HMDB51 standard benchmarks for this problem.

Durham Research Online

Learning Accurate Entropy Model with Global Reference for Image Compression

Author: Jin Rong
Li Dongyang
Li Hao
Lin Ming
Qian Yichen
Sun Xiuyu
Sun Zhenhong
Tan Zhiyu
Publication date: 29/10/2020

In recent deep image compression neural networks, the entropy model plays a critical role in estimating the prior distribution of deep image encodings. Existing methods combine hyperprior with local context in the entropy estimation function. This greatly limits their performance due to the absence of a global vision. In this work, we propose a novel Global Reference Model for image compression to effectively leverage both the local and the global context information, leading to an enhanced compression rate. The proposed method scans decoded latents and then finds the most relevant latent to assist the distribution estimation of the current latent. A by-product of this work is the innovation of a mean-shifting GDN module that further improves the performance. Experimental results demonstrate that the proposed model outperforms the rate-distortion performance of most of the state-of-the-art methods in the industry.

arXiv.org e-Print Archive

Ada-NETS: Face Clustering via Adaptive Neighbour Discovery in the Structure Space

Author: Lin Ming
Sun Xiuyu
Wang Senzhang
Wang Yaohua
Zhang Fangyi
Zhang Yaobin
Zhang YuQi
Publication date: 01/01/2022

Face clustering has attracted rising research interest recently to take advantage of massive amounts of face images on the web. State-of-the-art performance has been achieved by Graph Convolutional Networks (GCN) due to their powerful representation capacity. However, existing GCN-based methods build face graphs mainly according to kNN relations in the feature space, which may lead to a lot of noise edges connecting two faces of different classes. The face features will be polluted when messages pass along these noise edges, thus degrading the performance of GCNs. In this paper, a novel algorithm named Ada-NETS is proposed to cluster faces by constructing clean graphs for GCNs. In Ada-NETS, each face is transformed to a new structure space, obtaining robust features by considering face features of the neighbour images. Then, an adaptive neighbour discovery strategy is proposed to determine a proper number of edges connecting to each face image. It significantly reduces the noise edges while maintaining the good ones to build a graph with clean yet rich edges for GCNs to cluster faces. Experiments on multiple public clustering datasets show that Ada-NETS significantly outperforms current state-of-the-art methods, proving its superiority and generalization. Code is available at https://github.com/damo-cv/Ada-NETS
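The problem this abstract describes, a fixed-k kNN graph producing "noise edges" between faces of different classes, can be illustrated on toy data. The sketch below is not Ada-NETS and does not implement its structure space or its adaptive neighbour discovery; it only builds a plain cosine-similarity kNN graph and measures the fraction of cross-class edges. The function names and the six 2-D "features" are hypothetical.

```python
import numpy as np


def knn_edges(features: np.ndarray, k: int) -> list[tuple[int, int]]:
    """Directed kNN edges (i -> j) by cosine similarity, excluding self-loops."""
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    similarity = normed @ normed.T
    np.fill_diagonal(similarity, -np.inf)  # a node is never its own neighbour
    neighbours = np.argsort(-similarity, axis=1)[:, :k]
    return [(i, int(j)) for i in range(len(features)) for j in neighbours[i]]


def noise_edge_fraction(edges: list[tuple[int, int]], labels: np.ndarray) -> float:
    """Fraction of edges whose endpoints carry different class labels."""
    noisy = sum(int(labels[i] != labels[j]) for i, j in edges)
    return noisy / len(edges)


# Toy data (an assumption for illustration): identity 0 has two faces,
# identity 1 has four, so the classes have different natural neighbourhood sizes.
features = np.array([
    [1.0, 0.0], [0.9, 0.1],                            # identity 0
    [0.0, 1.0], [0.1, 0.9], [0.05, 1.0], [0.15, 1.0],  # identity 1
])
labels = np.array([0, 0, 1, 1, 1, 1])

fraction_k1 = noise_edge_fraction(knn_edges(features, 1), labels)
fraction_k3 = noise_edge_fraction(knn_edges(features, 3), labels)
print(f"k=1 noise fraction: {fraction_k1:.3f}")
print(f"k=3 noise fraction: {fraction_k3:.3f}")
```

With k=3, each face of the two-image identity is forced to link to two faces of the other identity, while the four-image identity stays clean. A single global k cannot suit both classes, which is the motivation the abstract gives for choosing the number of edges per face adaptively.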

arXiv.org e-Print Archive

Queensland University of Technology ePrints Archive

Text-to-SQL Empowered by Large Language Models: A Benchmark Evaluation

Author: Ding Bolin
Gao Dawei
Li Yaliang
Qian Yichen
Sun Xiuyu
Wang Haibin
Zhou Jingren
Publication date: 20/09/2023

Large language models (LLMs) have emerged as a new paradigm for the Text-to-SQL task. However, the absence of a systematic benchmark inhibits the design of effective, efficient and economic LLM-based Text-to-SQL solutions. To address this challenge, we first conduct a systematic and extensive comparison of existing prompt engineering methods, including question representation, example selection and example organization, and use the experimental results to elaborate their pros and cons. Based on these findings, we propose a new integrated solution, named DAIL-SQL, which refreshes the Spider leaderboard with 86.6% execution accuracy and sets a new bar. To explore the potential of open-source LLMs, we investigate them in various scenarios, and further enhance their performance with supervised fine-tuning. Our explorations highlight open-source LLMs' potential in Text-to-SQL, as well as the advantages and disadvantages of supervised fine-tuning. Additionally, towards an efficient and economic LLM-based Text-to-SQL solution, we emphasize token efficiency in prompt engineering and compare prior studies under this metric. We hope that our work provides a deeper understanding of Text-to-SQL with LLMs, and inspires further investigations and broad applications. Comment: We have released code on https://github.com/BeachWang/DAIL-SQ

arXiv.org e-Print Archive

DeepMAD: Mathematical Architecture Design for Deep Convolutional Neural Network

Author: Huang Yilun
Lin Ming
Shen Xuan
Sun Xiuyu
Tang Hao
Wang Yanzhi
Wang Yaohua
Publication date: 26/04/2023

The rapid advances in Vision Transformer (ViT) refresh the state-of-the-art performances in various vision tasks, overshadowing the conventional CNN-based models. This has ignited recent strike-back research in the CNN world showing that pure CNN models can achieve as good performance as ViT models when carefully tuned. While encouraging, designing such high-performance CNN models is challenging, requiring non-trivial prior knowledge of network design. To this end, a novel framework termed Mathematical Architecture Design for Deep CNN (DeepMAD) is proposed to design high-performance CNN models in a principled way. In DeepMAD, a CNN network is modeled as an information processing system whose expressiveness and effectiveness can be analytically formulated by its structural parameters. Then a constrained mathematical programming (MP) problem is proposed to optimize these structural parameters. The MP problem can be easily solved by off-the-shelf MP solvers on CPUs with a small memory footprint. In addition, DeepMAD is a pure mathematical framework: no GPU or training data is required during network design. The superiority of DeepMAD is validated on multiple large-scale computer vision benchmark datasets. Notably on ImageNet-1k, only using conventional convolutional layers, DeepMAD achieves 0.7% and 1.5% higher top-1 accuracy than ConvNeXt and Swin on Tiny level, and 0.8% and 0.9% higher on Small level. Comment: Accepted by CVPR 202

arXiv.org e-Print Archive

PTGES2 and RNASET2 identified as novel potential biomarkers and therapeutic targets for basal cell carcinoma: insights from proteome-wide mendelian randomization, colocalization, and MR-PheWAS analyses

Author: Jing Sun
Qiang-Zhe Zhang
Qiu-Ju Han
Xin-Yu Ding
Xiuyu Wang
Yi-Pan Zhu
Publication venue: Frontiers Media S.A.
Publication date: 01/07/2024

Introduction: Basal cell carcinoma (BCC) is the most common skin cancer, lacking reliable biomarkers or therapeutic targets for effective treatment. Genome-wide association studies (GWAS) can aid in identifying drug targets, repurposing existing drugs, predicting clinical trial side effects, and reclassifying patients in clinical utility. Hence, the present study investigates the association between plasma proteins and skin cancer to identify effective biomarkers and therapeutic targets for BCC. Methods: Proteome-wide mendelian randomization was performed using inverse-variance-weighted and Wald ratio methods, leveraging 1 Mb cis protein quantitative trait loci (cis-pQTLs) in the UK Biobank Pharma Proteomics Project (UKB-PPP) and the deCODE Health Study, to determine the causal relationship between plasma proteins and skin cancer and its subtypes in the FinnGen R10 study and the SAIGE database of Lee lab. Significant association with skin cancer and its subtypes was defined as a false discovery rate (FDR) < 0.05. pQTL to GWAS colocalization analysis was executed using a Bayesian model to evaluate five exclusive hypotheses. Strong colocalization evidence was defined as a posterior probability for shared causal variants (PP.H4) of ≥0.85. Mendelian randomization-Phenome-wide association studies (MR-PheWAS) were used to evaluate potential biomarkers and therapeutic targets for skin cancer and its subtypes within a phenome-wide human disease category. Results: PTGES2, RNASET2, SF3B4, STX8, ENO2, and HS3ST3B1 (besides RNASET2, five other plasma proteins were previously unknown in expression quantitative trait loci (eQTL) and methylation quantitative trait loci (mQTL)) were significantly associated with BCC after FDR correction in the UKB-PPP and deCODE studies. Reverse MR showed no association between BCC and these proteins. PTGES2 and RNASET2 exhibited strong evidence of colocalization with BCC based on a posterior probability PP.H4 >0.92. Furthermore, MR-PheWAS analysis showed that BCC was the most significant phenotype associated with PTGES2 and RNASET2 among 2,408 phenotypes in the FinnGen R10 study. Therefore, PTGES2 and RNASET2 are highlighted as effective biomarkers and therapeutic targets for BCC within the phenome-wide human disease category. Conclusion: The study identifies PTGES2 and RNASET2 plasma proteins as novel, reliable biomarkers and therapeutic targets for BCC, suggesting more effective clinical application strategies for patients.
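The two estimators named in the Methods, the Wald ratio and the inverse-variance-weighted (IVW) combination, have standard textbook forms, sketched below. This is a minimal illustration, not the study's pipeline: it assumes the first-order approximation for the Wald ratio standard error and a fixed-effect IVW, and the instrument effect sizes are made-up numbers, not data from the paper.

```python
import math


def wald_ratio(beta_exposure: float, beta_outcome: float,
               se_outcome: float) -> tuple[float, float]:
    """Causal estimate from one instrument: outcome effect / exposure effect.

    The standard error uses the common first-order approximation that
    ignores uncertainty in the exposure effect.
    """
    estimate = beta_outcome / beta_exposure
    standard_error = se_outcome / abs(beta_exposure)
    return estimate, standard_error


def ivw(ratios: list[tuple[float, float]]) -> tuple[float, float]:
    """Fixed-effect inverse-variance-weighted mean of per-instrument estimates."""
    weights = [1.0 / se ** 2 for _, se in ratios]
    total_weight = sum(weights)
    estimate = sum(w * est for w, (est, _) in zip(weights, ratios)) / total_weight
    return estimate, math.sqrt(1.0 / total_weight)


# Hypothetical instruments: (beta_exposure, beta_outcome, se_outcome).
instruments = [(0.5, 0.10, 0.05), (0.4, 0.10, 0.04)]

ratios = [wald_ratio(bx, by, se) for bx, by, se in instruments]
ivw_estimate, ivw_se = ivw(ratios)
print(f"Wald ratios: {ratios}")
print(f"IVW estimate: {ivw_estimate:.3f} (SE {ivw_se:.3f})")
```

When a protein has a single cis-pQTL instrument, only the Wald ratio applies; with several instruments, the IVW estimate pools their ratios, giving more weight to the more precisely estimated ones.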

Directory of Open Access Journals