
    Preserving Silent Features for Domain Generalization

    Domain generalization (DG) aims to improve the generalization ability of a model trained on several known training domains over unseen test domains. Previous work has shown that self-supervised contrastive pre-training improves the robustness of the model on downstream tasks. However, in this paper, we find that self-supervised models do not exhibit better generalization performance than supervised models pre-trained on the same dataset in the DG setting. We argue that this is because the richer intra-class discriminative features extracted by self-supervised contrastive learning, which we term silent features, are suppressed during supervised fine-tuning. These silent features are likely to contain features that are more generalizable on the test domain. In this work, we model and analyze this feature suppression phenomenon and theoretically prove that preserving silent features can achieve lower expected test domain risk under certain conditions. In light of this, we propose a simple yet effective method termed STEP (Silent Feature Preservation) to improve the generalization performance of the self-supervised contrastive learning pre-trained model by alleviating the suppression of silent features during the supervised fine-tuning process. Experimental results show that STEP exhibits state-of-the-art performance on standard DG benchmarks with significant distribution shifts.

    Click: Controllable Text Generation with Sequence Likelihood Contrastive Learning

    It has always been an important yet challenging problem to control language models to avoid generating texts with undesirable attributes, such as toxic language and unnatural repetition. We introduce Click for controllable text generation, which needs no modification to the model architecture and facilitates out-of-the-box use of trained models. It employs a contrastive loss on sequence likelihood, which fundamentally decreases the generation probability of negative samples (i.e., generations with undesirable attributes). It also adopts a novel likelihood ranking-based strategy to construct contrastive samples from model generations. On the tasks of language detoxification, sentiment steering, and repetition reduction, we show that Click outperforms strong baselines of controllable text generation and demonstrate the superiority of Click's sample construction strategy. Comment: Findings of ACL 202
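    The idea of a contrastive loss on sequence likelihood can be illustrated with a margin-based hinge over a positive and a negative generation. This is only a minimal sketch: the abstract does not give the loss formula, so the hinge form, the `margin` parameter, and the function names below are assumptions made for illustration, not the authors' implementation.

```python
import math

def sequence_log_likelihood(token_probs):
    """Sum of per-token log-probabilities for one generated sequence."""
    return sum(math.log(p) for p in token_probs)

def margin_contrastive_loss(pos_loglik, neg_loglik, margin=1.0):
    """Hinge loss (assumed form): zero once the positive sequence is at
    least `margin` more likely, in log space, than the negative one."""
    return max(0.0, margin + neg_loglik - pos_loglik)

# Hypothetical per-token probabilities for two generations of one prompt.
pos = sequence_log_likelihood([0.5, 0.5])    # desirable generation
neg = sequence_log_likelihood([0.5, 0.25])   # undesirable generation
loss = margin_contrastive_loss(pos, neg, margin=1.0)
```

    Minimizing such a loss pushes the likelihood of the negative sample down relative to the positive one until the margin is satisfied, after which the pair contributes nothing further.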

    The early variation of left ventricular twisting function in patients with lymphoma received anthracycline therapy assessed by three-dimensional speckle tracking echocardiography

    Background: Anthracycline-induced cardiotoxicity remains a significant and unresolved issue in patients receiving chemotherapy. The aim of this study was to evaluate left ventricular (LV) twisting function by three-dimensional speckle tracking echocardiography (3D-STE) in patients with lymphoma after anthracycline therapy. Methods: One hundred and one patients with newly diagnosed diffuse large B-cell lymphoma who had planned to receive anthracycline chemotherapy were enrolled. LV apical rotation, basal rotation, twist, torsion, time to peak apical rotation and time to peak basal rotation were measured by 3D-STE at baseline, after the completion of two cycles and four cycles of the regimen, respectively. Apical–basal rotation delay was calculated as the difference between time to peak basal rotation and time to peak apical rotation. Results: LV apical rotation, basal rotation, twist and torsion declined progressively during the whole procedure (baseline vs. two and four cycles of the regimen, apical rotation: 12.5 ± 4.5° vs. 8.8 ± 3.6° vs. 6.0 ± 3.2°; basal rotation: –7.7 ± 3.0° vs. –5.9 ± 2.6° vs. –4.4 ± 2.5°; twist: 20.0 ± 6.4° vs. 14.5 ± 5.1° vs. 9.8 ± 4.5°; torsion: 2.9 ± 0.9°/cm vs. 2.1 ± 0.9°/cm vs. 1.4 ± 0.7°/cm; all p < 0.01). Furthermore, apical–basal rotation delay increased significantly after two cycles as well as after four cycles of the regimen (38.3 ± 67.9 ms vs. 66.7 ± 73.9 ms vs. 92.6 ± 96.9 ms; p < 0.01). Conclusions: LV twisting function deteriorated in the early stage of anthracycline therapy in patients with lymphoma, which could be sensitively detected by 3D-STE.
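    The derived quantities in this abstract are simple differences, sketched below. The sign convention for the delay (basal minus apical) is an assumption read from the wording of the Methods, twist is taken in its usual sense of apical rotation minus basal rotation, and the input values are hypothetical rather than taken from the study.

```python
def rotation_delay_ms(time_to_peak_basal_ms, time_to_peak_apical_ms):
    """Apical-basal rotation delay; assumed sign: basal minus apical."""
    return time_to_peak_basal_ms - time_to_peak_apical_ms

def twist_deg(apical_rotation_deg, basal_rotation_deg):
    """LV twist as the net difference between apical and basal rotation.
    Basal rotation is negative (opposite direction), so the magnitudes add."""
    return apical_rotation_deg - basal_rotation_deg

# Hypothetical single-patient measurements, for illustration only.
delay = rotation_delay_ms(400.0, 350.0)
twist = twist_deg(12.0, -8.0)
```

    Group means need not satisfy these identities exactly, since peak rotations can occur at different times in each patient.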

    LLM-as-a-Coauthor: Can Mixed Human-Written and Machine-Generated Text Be Detected?

    With the rapid development and widespread application of Large Language Models (LLMs), the use of Machine-Generated Text (MGT) has become increasingly common, bringing with it potential risks, especially in terms of quality and integrity in fields like news, education, and science. Current research mainly focuses on detecting pure MGT without adequately addressing mixed scenarios, including AI-revised Human-Written Text (HWT) or human-revised MGT. To tackle this challenge, we define mixtext, a form of mixed text involving both AI and human-generated content. Then, we introduce MixSet, the first dataset dedicated to studying these mixtext scenarios. Leveraging MixSet, we executed comprehensive experiments to assess the efficacy of prevalent MGT detectors in handling mixtext situations, evaluating their performance in terms of effectiveness, robustness, and generalization. Our findings reveal that existing detectors struggle to identify mixtext, particularly in dealing with subtle modifications and style adaptability. This research underscores the urgent need for more fine-grained detectors tailored for mixtext, offering valuable insights for future research. Code and models are available at https://github.com/Dongping-Chen/MixSet. Comment: Accepted by NAACL 202

    Specific gene module pair-based target identification and drug discovery

    Identification of the biological targets of a compound is of paramount importance for the exploration of the mechanism of action of drugs and for the development of novel drugs. The concept of the Connectivity Map (CMap) was previously proposed to connect genes, drugs, and disease states based on common gene-expression signatures. For a new query compound, the CMap-based method can infer its potential targets by searching for similar drugs with known targets (reference drugs) and measuring the similarity between the specific transcriptional responses of the query compound and those of the reference drugs. However, the available methods are often inefficient because they require the reference drugs as a medium to link the query agent and targets. Here, we developed a general procedure to extract target-induced consensus gene modules from the transcriptional profiles induced by the treatment of perturbagens of a target. A specific transcriptional gene module pair (GMP) was automatically identified for each target and could be used as a direct target signature. Based on the GMPs, we built the target network and identified some target gene clusters with similar biological mechanisms. Moreover, a gene module pair-based target identification (GMPTI) approach was proposed to predict novel compound–target interactions. Using this method, we have discovered novel inhibitors for three PI3K pathway proteins PI3Kα/β/δ, including PU-H71, alvespimycin, reversine, astemizole, raloxifene HCl, and tamoxifen.

    Adding power of artificial intelligence to situational awareness of large interconnections dominated by inverter‐based resources

    Large-scale power systems exhibit more complex dynamics due to the increasing integration of inverter-based resources (IBRs). Therefore, there is an urgent need to enhance the situational awareness capability for better monitoring and control of power grids dominated by IBRs. As a pioneering Wide-Area Measurement System, FNET/GridEye has developed and implemented various advanced applications based on the collected synchrophasor measurements to enhance the situational awareness capability of large-scale power grids. This study provides an overview of the latest progress of FNET/GridEye. The sensors, communication, and data servers are upgraded to handle ultra-high density synchrophasor and point-on-wave data to monitor system dynamics in more detail. More importantly, several artificial intelligence (AI)-based advanced applications are introduced, including AI-based inertia estimation, AI-based disturbance size and location estimation, AI-based system stability assessment, and AI-based data authentication.