40 research outputs found

    Exploring Model Transferability through the Lens of Potential Energy

    Transfer learning has become crucial in computer vision due to the vast availability of pre-trained deep learning models. However, selecting the optimal pre-trained model from a diverse pool for a specific downstream task remains a challenge. Existing methods for measuring the transferability of pre-trained models rely on statistical correlations between encoded static features and task labels, but they overlook the impact of the underlying representation dynamics during fine-tuning, leading to unreliable results, especially for self-supervised models. In this paper, we present an insightful physics-inspired approach named PED to address these challenges. We reframe the challenge of model selection through the lens of potential energy and directly model the interaction forces that influence fine-tuning dynamics. By capturing the motion of dynamic representations as they reduce the potential energy within a force-driven physical model, we obtain an enhanced and more stable observation for estimating transferability. Experimental results on 10 downstream tasks and 12 self-supervised models demonstrate that our approach can seamlessly integrate into existing ranking techniques and enhance their performance, revealing its effectiveness for the model selection task and its potential for understanding the mechanisms of transfer learning. Code will be available at https://github.com/lixiaotong97/PED. Comment: Accepted by ICCV 202
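    The potential-energy intuition can be illustrated with a toy sketch (a hedged illustration, not the paper's actual PED algorithm; the quadratic centroid-attraction energy and all names below are assumptions for illustration only): features take a small step along a force pulling them toward their class centroids, which provably lowers the potential energy.

```python
import numpy as np

def potential_energy(feats, labels):
    """Quadratic 'potential': summed squared distance of each feature to
    its class centroid (a toy stand-in for interaction forces)."""
    energy = 0.0
    for c in np.unique(labels):
        fc = feats[labels == c]
        energy += ((fc - fc.mean(axis=0)) ** 2).sum()
    return energy

def force_step(feats, labels, lr=0.1):
    """Move each feature a small step along the force -grad(energy),
    i.e. toward its class centroid."""
    out = feats.copy()
    for c in np.unique(labels):
        mask = labels == c
        centroid = feats[mask].mean(axis=0)
        out[mask] += lr * (centroid - feats[mask])
    return out

rng = np.random.default_rng(0)
feats = rng.normal(size=(40, 8))
labels = rng.integers(0, 4, size=40)
moved = force_step(feats, labels)
# The potential energy strictly decreases after the force-driven update.
```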

    mc-BEiT: Multi-choice Discretization for Image BERT Pre-training

    Image BERT pre-training with masked image modeling (MIM) has become a popular approach to self-supervised representation learning. A seminal work, BEiT, casts MIM as a classification task over a visual vocabulary, tokenizing the continuous visual signals into discrete vision tokens using a pre-learned dVAE. Although feasible, this improper discretization hinders further improvements in image pre-training. Since image discretization has no ground-truth answer, we argue that a masked patch should not be assigned a unique token id even if a better tokenizer were available. In this work, we introduce an improved BERT-style image pre-training method, mc-BEiT, which performs MIM proxy tasks with eased and refined multi-choice training objectives. Specifically, the multi-choice supervision for the masked image patches is formed by the soft probability vectors over discrete token ids, which are predicted by an off-the-shelf image tokenizer and further refined by high-level inter-patch perceptions, based on the observation that similar patches should share their choices. Extensive experiments on classification, segmentation, and detection tasks demonstrate the superiority of our method: the pre-trained ViT-B achieves 84.1% top-1 fine-tuning accuracy on ImageNet-1K classification, 50.8% mIoU on ADE20K semantic segmentation, and 51.2% AP^b and 44.3% AP^m for object detection and instance segmentation on COCO, outperforming competitive counterparts.
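    The multi-choice objective can be sketched as a cross-entropy against a soft probability vector over token ids rather than a single hard id. A minimal sketch, assuming a toy vocabulary and hand-made soft targets (the paper's tokenizer and inter-patch refinement are omitted):

```python
import numpy as np

def log_softmax(z):
    """Numerically stable log-softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def mim_loss(pred_logits, targets):
    """Cross-entropy of predicted token distributions against targets,
    which may be one-hot (BEiT-style) or soft (multi-choice-style)."""
    return -(targets * log_softmax(pred_logits)).sum(axis=-1).mean()

vocab_size = 5
pred = np.array([[2.0, 1.0, 0.0, -1.0, -1.0]])
hard = np.eye(vocab_size)[[0]]                  # single hard token id
soft = np.array([[0.6, 0.3, 0.1, 0.0, 0.0]])    # several plausible ids
loss_hard = mim_loss(pred, hard)
loss_soft = mim_loss(pred, soft)
```

    The same loss function handles both cases; only the target distribution changes.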

    Cross Entropy versus Label Smoothing: A Neural Collapse Perspective

    Label smoothing is a widely adopted technique for mitigating overfitting in deep neural networks. This paper studies label smoothing from the perspective of Neural Collapse (NC), a powerful empirical and theoretical framework that characterizes model behavior during the terminal phase of training. We first show empirically that models trained with label smoothing converge faster to neural collapse solutions and attain a stronger level of neural collapse. Additionally, we show that at the same level of NC1, models trained with label smoothing exhibit intensified NC2. These findings provide valuable insight into the performance benefits and enhanced model calibration under label smoothing. We then leverage the unconstrained feature model to derive closed-form solutions for the global minimizers of both loss functions, and further demonstrate that models under label smoothing have a lower condition number and therefore theoretically converge faster. Our study, combining empirical evidence and theoretical results, not only provides nuanced insight into the differences between label smoothing and cross-entropy losses, but also serves as an example of how the powerful neural collapse framework can be used to improve our understanding of DNNs.
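    For reference, label smoothing replaces the one-hot target with a mixture of the one-hot vector and the uniform distribution. A minimal numpy sketch with toy logits (not the paper's experimental setup):

```python
import numpy as np

def smoothed_targets(labels, num_classes, eps=0.1):
    """One-hot targets softened by label smoothing: (1 - eps) on the
    true class, eps spread uniformly over all classes."""
    onehot = np.eye(num_classes)[labels]
    return (1.0 - eps) * onehot + eps / num_classes

def cross_entropy(logits, targets):
    """Mean cross-entropy between soft targets and softmax(logits)."""
    z = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -(targets * log_probs).sum(axis=1).mean()

logits = np.array([[4.0, 1.0, 0.5], [0.2, 3.0, 0.1]])
labels = np.array([0, 1])
hard = cross_entropy(logits, np.eye(3)[labels])            # standard CE
soft = cross_entropy(logits, smoothed_targets(labels, 3))  # label smoothing
```

    On confident, correct predictions the smoothed loss stays bounded away from zero, which is one mechanism behind the calibration effects the paper analyzes.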

    SpikeBERT: A Language Spikformer Trained with Two-Stage Knowledge Distillation from BERT

    Spiking neural networks (SNNs) offer a promising avenue for implementing deep neural networks in a more energy-efficient way. However, the architectures of existing SNNs for language tasks are too simplistic, and deep architectures have not been fully explored, resulting in a significant performance gap compared with mainstream transformer-based networks such as BERT. To this end, we improve a recently proposed spiking transformer (i.e., Spikformer) so that it can process language tasks, and propose a two-stage knowledge distillation method for training it: pre-training by distilling knowledge from BERT on a large collection of unlabelled text, followed by fine-tuning on task-specific instances via a second round of knowledge distillation from a BERT fine-tuned on the same training examples. Through extensive experimentation, we show that models trained with our method, named SpikeBERT, outperform state-of-the-art SNNs and even achieve results comparable to BERT on text classification tasks in both English and Chinese, with much lower energy consumption.
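    Both distillation stages rest on a standard knowledge-distillation loss: a temperature-softened term matching the teacher's distribution plus a hard cross-entropy term on gold labels. A generic sketch assuming the common Hinton-style blend (the paper's exact stage-specific losses may differ):

```python
import numpy as np

def softmax(x, T=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = x / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kd_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend of a soft term (match the teacher at temperature T, scaled
    by T^2) and hard cross-entropy on the ground-truth labels."""
    p_teacher = softmax(teacher_logits, T)
    log_p_student = np.log(softmax(student_logits, T))
    soft = -(p_teacher * log_p_student).sum(axis=-1).mean() * T ** 2
    gold = softmax(student_logits)[np.arange(len(labels)), labels]
    hard = -np.log(gold).mean()
    return alpha * soft + (1 - alpha) * hard

teacher = np.array([[3.0, 0.0], [0.0, 3.0]])
labels = np.array([0, 1])
aligned = kd_loss(teacher.copy(), teacher, labels)  # student matches teacher
diverged = kd_loss(-teacher, teacher, labels)       # student disagrees
```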

    Tailoring Personality Traits in Large Language Models via Unsupervisedly-Built Personalized Lexicons

    Personality plays a pivotal role in shaping human expression patterns, so regulating the personality of large language models (LLMs) holds significant potential for enhancing their user experience. Previous methods either relied on fine-tuning LLMs on specific corpora or required manually crafted prompts to elicit specific personalities. However, the former approach is inefficient and costly, while the latter cannot manipulate personality traits at a fine-grained level. To address these challenges, we employ novel Unsupervisedly-Built Personalized Lexicons (UBPL) in a pluggable manner during the decoding phase of LLMs to manipulate their personality traits. UBPL is a lexicon built through an unsupervised approach from a situational judgment test dataset (SJTs4LLM). Users can utilize UBPL to adjust the probability vectors of predicted words in the decoding phase of LLMs, thereby influencing the personality expressed by the LLM. Extensive experimentation demonstrates the remarkable effectiveness and pluggability of our method for fine-grained manipulation of LLM personality. Comment: Work in progress
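    The decoding-phase adjustment can be sketched as adding lexicon-derived biases to next-token logits before the softmax. A toy sketch with a hypothetical five-word vocabulary and made-up lexicon weights (the real UBPL lexicon is built unsupervisedly from SJTs4LLM and is not reproduced here):

```python
import numpy as np

# Hypothetical vocabulary and lexicon weights, for illustration only.
vocab = ["hello", "great", "terrible", "maybe", "awesome"]
lexicon_bias = {"great": 1.5, "awesome": 1.5, "terrible": -2.0}

def adjust_logits(logits, vocab, lexicon_bias, alpha=1.0):
    """Shift next-token logits by lexicon weights before the softmax,
    steering generation toward (or away from) personality-bearing words."""
    adjusted = logits.copy()
    for i, tok in enumerate(vocab):
        adjusted[i] += alpha * lexicon_bias.get(tok, 0.0)
    return adjusted

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

logits = np.ones(5)  # the model is initially indifferent
probs = softmax(adjust_logits(logits, vocab, lexicon_bias))
```

    Because the bias is applied only at decoding, the approach is pluggable: the underlying model weights are untouched.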

    Elucidating the multifaceted roles of GPR146 in non-specific orbital inflammation: a concerted analytical approach through the prisms of bioinformatics and machine learning

    Background: Non-specific Orbital Inflammation (NSOI) is a chronic idiopathic condition marked by extensive polymorphic lymphoid infiltration in the orbital area. The integration of metabolic and immune pathways suggests potential therapeutic roles for C-peptide and G protein-coupled receptor 146 (GPR146) in diabetes and its sequelae. However, the specific mechanisms through which GPR146 modulates immune responses remain poorly understood, and the utility of GPR146 as a diagnostic or prognostic marker for NSOI has not been conclusively demonstrated.
    Methods: We adopted a comprehensive analytical strategy, merging differentially expressed genes (DEGs) from the Gene Expression Omnibus (GEO) datasets GSE58331 and GSE105149 with immune-related genes from the ImmPort database. Our methodology combined LASSO regression and support vector machine-recursive feature elimination (SVM-RFE) for feature selection, followed by Gene Set Enrichment Analysis (GSEA) and Gene Set Variation Analysis (GSVA) to explore gene sets co-expressed with GPR146, identifying a significant enrichment in immune-related pathways. The immune composition of the tumor microenvironment was quantified using the CIBERSORT algorithm and the ESTIMATE method, which confirmed a positive correlation between GPR146 expression and immune cell infiltration. GPR146 expression was validated using the GSE58331 dataset.
    Results: Analysis identified 113 DEGs associated with GPR146, a significant subset of which showed distinct expression patterns. Using LASSO and SVM-RFE, we pinpointed 15 key hub genes. Functionally, these genes and GPR146 were predominantly linked to receptor ligand activity, immune receptor activity, and cytokine-mediated signaling. Specific immune cells, such as memory B cells, M2 macrophages, resting mast cells, monocytes, activated NK cells, plasma cells, and CD8+ T cells, were positively associated with GPR146 expression. In contrast, M0 macrophages, naive B cells, M1 macrophages, activated mast cells, activated memory CD4+ T cells, naive CD4+ T cells, and gamma delta T cells showed inverse correlations. Notably, our findings underscore the potential diagnostic relevance of GPR146 in distinguishing NSOI.
    Conclusion: Our study elucidates the immunological signatures associated with GPR146 in the context of NSOI, highlighting its prognostic and diagnostic potential. These insights pave the way for GPR146 to serve as a novel biomarker for monitoring the progression of NSOI, providing a foundation for future therapeutic strategies targeting immune-metabolic pathways.

    Experimental quantum adversarial learning with programmable superconducting qubits

    Quantum computing promises to enhance machine learning and artificial intelligence. Different quantum algorithms have been proposed to improve a wide spectrum of machine learning tasks. Yet, recent theoretical works show that, similar to traditional classifiers based on deep classical neural networks, quantum classifiers suffer from a vulnerability problem: adding tiny, carefully crafted perturbations to legitimate data samples can induce incorrect predictions at a notably high confidence level. This poses serious problems for future quantum machine learning applications in safety- and security-critical scenarios. Here, we report the first experimental demonstration of quantum adversarial learning with programmable superconducting qubits. We train quantum classifiers, built upon variational quantum circuits consisting of ten transmon qubits featuring average lifetimes of 150 μs and average fidelities of simultaneous single- and two-qubit gates above 99.94% and 99.4% respectively, with both real-life images (e.g., medical magnetic resonance imaging scans) and quantum data. We demonstrate that these well-trained classifiers (with testing accuracy up to 99%) can be practically deceived by small adversarial perturbations, whereas an adversarial training process significantly enhances their robustness to such perturbations. Our results experimentally reveal a crucial vulnerability of quantum learning systems under adversarial scenarios and demonstrate an effective defense strategy against adversarial attacks, providing a valuable guide for quantum artificial intelligence applications on both near-term and future quantum devices. Comment: 26 pages, 17 figures, 8 algorithms
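    The kind of perturbation involved can be illustrated classically with an FGSM-style step on a toy linear classifier (the paper's classifiers are variational quantum circuits; this classical analogue is only for intuition, and all numbers are made up):

```python
import numpy as np

# Toy linear classifier: predict sign(w @ x + b).
w = np.array([1.0, -2.0])
b = 0.0

def margin(x, y):
    """Signed margin y * (w @ x + b); positive means correctly classified."""
    return y * (w @ x + b)

def fgsm(x, y, eps=0.3):
    """Perturb x by eps per coordinate in the direction that most
    decreases the margin (fast gradient sign method)."""
    grad = y * w                      # gradient of the margin w.r.t. x
    return x + eps * np.sign(-grad)   # step against the margin

x, y = np.array([1.0, -1.0]), 1       # correctly classified input
x_adv = fgsm(x, y, eps=0.3)           # small, targeted perturbation
```

    Each coordinate of the perturbation is tiny, yet the margin shrinks by eps times the L1 norm of w, which is why small adversarial steps can flip confident predictions.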

    CCL21/CCR7 Prevents Apoptosis via the ERK Pathway in Human Non-Small Cell Lung Cancer Cells

    Previously, we confirmed that C-C chemokine receptor 7 (CCR7) promotes cell proliferation via the extracellular signal-regulated kinase (ERK) pathway, but its role in apoptosis of non-small cell lung cancer (NSCLC) cell lines remained unknown. The NSCLC cell lines A549 and H460 were used to examine the effect of CCL21/CCR7 on apoptosis by flow cytometry. The results showed that activation of CCR7 by its specific ligand, exogenous chemokine ligand 21 (CCL21), was associated with a significant decline in the percentage of apoptotic cells. Western blot and real-time PCR assays indicated that activation of CCR7 significantly upregulated anti-apoptotic bcl-2 and downregulated pro-apoptotic bax and caspase-3, but not p53, at both the protein and mRNA levels. CCR7 small interfering RNA significantly attenuated these effects of exogenous CCL21. In addition, PD98059, a selective inhibitor of MEK that disrupts the activation of downstream ERK, significantly abolished these effects of CCL21/CCR7. Coimmunoprecipitation further confirmed an interaction between p-ERK and bcl-2, bax, or caspase-3, particularly in the presence of CCL21. These results strongly suggest that CCL21/CCR7 prevents apoptosis in A549 and H460 NSCLC cells by upregulating the expression of bcl-2 and downregulating the expression of bax and caspase-3, potentially via the ERK pathway.

    Impact of opioid-free analgesia on pain severity and patient satisfaction after discharge from surgery: multispecialty, prospective cohort study in 25 countries

    Background: Balancing opioid stewardship and the need for adequate analgesia following discharge after surgery is challenging. This study aimed to compare outcomes for patients discharged with opioid versus opioid-free analgesia after common surgical procedures.
    Methods: This international, multicentre, prospective cohort study collected data from patients undergoing common acute and elective general surgical, urological, gynaecological, and orthopaedic procedures. The primary outcomes were patient-reported time in severe pain, measured on a numerical analogue scale from 0 to 100%, and patient-reported satisfaction with pain relief during the first week following discharge. Data were collected by in-hospital chart review and patient telephone interview 1 week after discharge.
    Results: The study recruited 4273 patients from 144 centres in 25 countries; 1311 patients (30.7%) were prescribed opioid analgesia at discharge. Patients reported being in severe pain for 10 (i.q.r. 1-30)% of the first week after discharge and rated satisfaction with analgesia as 90 (i.q.r. 80-100) out of 100. After adjustment for confounders, opioid analgesia on discharge was independently associated with increased pain severity (risk ratio 1.52, 95% c.i. 1.31 to 1.76; P < 0.001) and re-presentation to healthcare providers owing to side-effects of medication (OR 2.38, 95% c.i. 1.36 to 4.17; P = 0.004), but not with satisfaction with analgesia (beta coefficient 0.92, 95% c.i. -1.52 to 3.36; P = 0.468), compared with opioid-free analgesia. Although opioid prescribing varied greatly between high-income and low- and middle-income countries, patient-reported outcomes did not.
    Conclusion: Opioid analgesia prescription on surgical discharge is associated with a higher risk of re-presentation owing to medication side-effects and with increased patient-reported pain, but not with changes in patient-reported satisfaction. Opioid-free discharge analgesia should be adopted routinely.
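    For reference, a risk ratio of the kind reported above can be computed from a 2x2 table with a log-normal approximation for the 95% confidence interval. A sketch with purely illustrative counts (not the study's data):

```python
import math

def risk_ratio_ci(a, b, c, d):
    """Risk ratio and 95% CI from a 2x2 table.
    a/b: events/non-events in the exposed group;
    c/d: events/non-events in the unexposed group."""
    rr = (a / (a + b)) / (c / (c + d))
    # Standard error of log(RR), log-normal approximation.
    se = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lo = math.exp(math.log(rr) - 1.96 * se)
    hi = math.exp(math.log(rr) + 1.96 * se)
    return rr, lo, hi

# 30/100 exposed vs 20/100 unexposed with the event: RR = 1.5.
rr, lo, hi = risk_ratio_ci(30, 70, 20, 80)
```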