Preserving Silent Features for Domain Generalization
Domain generalization (DG) aims to improve the generalization ability of the
model trained on several known training domains over unseen test domains.
Previous work has shown that self-supervised contrastive pre-training improves
the robustness of the model on downstream tasks. However, in this paper, we
find that self-supervised models do not exhibit better generalization
performance than supervised models pre-trained on the same dataset in the DG
setting. We argue that this is owing to the fact that the richer intra-class
discriminative features extracted by self-supervised contrastive learning,
which we term silent features, are suppressed during supervised fine-tuning.
These silent features are likely to contain features that are more
generalizable on the test domain. In this work, we model and analyze this
feature suppression phenomenon and theoretically prove that preserving silent
features can achieve lower expected test domain risk under certain conditions.
In light of this, we propose a simple yet effective method termed STEP (Silent
Feature Preservation) to improve the generalization performance of the
self-supervised contrastive learning pre-trained model by alleviating the
suppression of silent features during the supervised fine-tuning process.
Experimental results show that STEP exhibits state-of-the-art performance on
standard DG benchmarks with significant distribution shifts.
Click: Controllable Text Generation with Sequence Likelihood Contrastive Learning
It has always been an important yet challenging problem to control language
models to avoid generating texts with undesirable attributes, such as toxic
language and unnatural repetition. We introduce Click for controllable text
generation, which needs no modification to the model architecture and
facilitates out-of-the-box use of trained models. It employs a contrastive loss
on sequence likelihood, which fundamentally decreases the generation
probability of negative samples (i.e., generations with undesirable
attributes). It also adopts a novel likelihood ranking-based strategy to
construct contrastive samples from model generations. On the tasks of language
detoxification, sentiment steering, and repetition reduction, we show that
Click outperforms strong baselines of controllable text generation and
demonstrate the superiority of Click's sample construction strategy.
Comment: Findings of ACL 202
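To make the idea of a contrastive loss on sequence likelihood concrete, here is a minimal sketch. It assumes a max-margin (hinge) form over a pair of sequence log-likelihoods. The abstract does not state the exact formula, so the function names, the `margin` parameter, and the hinge form are illustrative assumptions, not the authors' implementation.

```python
import math


def sequence_log_likelihood(token_log_probs):
    """Sum per-token log-probabilities to get the log-likelihood of a sequence."""
    return sum(token_log_probs)


def margin_contrastive_loss(pos_token_log_probs, neg_token_log_probs, margin=1.0):
    """Hypothetical hinge loss on sequence likelihood (an assumed form).

    The loss is zero once the positive sequence is more likely than the
    negative one by at least `margin` (in log space). Otherwise it is
    positive, so minimizing it lowers the likelihood of the negative
    sample relative to the positive one.
    """
    pos = sequence_log_likelihood(pos_token_log_probs)
    neg = sequence_log_likelihood(neg_token_log_probs)
    return max(0.0, margin + neg - pos)


# Toy per-token probabilities; in practice these would come from a language model.
likely = [math.log(0.5), math.log(0.5)]
unlikely = [math.log(0.1), math.log(0.1)]

loss_satisfied = margin_contrastive_loss(likely, unlikely)  # margin already met
loss_violated = margin_contrastive_loss(unlikely, likely)   # negative is too likely
```

In this toy case the first call gives zero loss because the desirable sequence already dominates, while the second gives a positive loss that would push the model to reduce the probability of the undesirable sequence.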
The early variation of left ventricular twisting function in patients with lymphoma received anthracycline therapy assessed by three-dimensional speckle tracking echocardiography
Background: Anthracycline-induced cardiotoxicity remains a significant and unresolved issue in patients receiving chemotherapy. The aim of this study was to evaluate left ventricular (LV) twisting function by three-dimensional speckle tracking echocardiography (3D-STE) in patients with lymphoma after anthracycline therapy.
Methods: One hundred and one patients with newly diagnosed diffuse large B-cell lymphoma who were scheduled to receive anthracycline chemotherapy were enrolled. LV apical rotation, basal rotation, twist, torsion, time to peak apical rotation and time to peak basal rotation were measured by 3D-STE at baseline and after the completion of two and four cycles of the regimen. Apical–basal rotation delay was calculated as the difference between time to peak basal rotation and time to peak apical rotation.
Results: The results showed that LV apical rotation, basal rotation, twist and torsion declined progressively during the whole procedure (baseline vs. two and four cycles of the regimen, apical rotation: 12.5 ± 4.5° vs. 8.8 ± 3.6° vs. 6.0 ± 3.2°; basal rotation: –7.7 ± 3.0° vs. –5.9 ± 2.6° vs. –4.4 ± 2.5°; twist: 20.0 ± 6.4° vs. 14.5 ± 5.1° vs. 9.8 ± 4.5°; torsion: 2.9 ± 0.9°/cm vs. 2.1 ± 0.9°/cm vs. 1.4 ± 0.7°/cm; all p < 0.01). Furthermore, apical–basal rotation delay increased significantly after two cycles as well as after four cycles of the regimen (38.3 ± 67.9 ms vs. 66.7 ± 73.9 ms vs. 92.6 ± 96.9 ms; p < 0.01).
Conclusions: LV twisting function deteriorated in the early stage of anthracycline therapy in patients with lymphoma, and this deterioration could be sensitively detected by 3D-STE.
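The derived quantities in this abstract can be illustrated with a short sketch. It assumes the conventional definitions: twist is the net difference between apical and basal rotation, and torsion is twist normalized by LV long-axis length. The rotation delay follows the Methods (time to peak basal rotation minus time to peak apical rotation). The LV length and the peak times below are hypothetical values, and the helper names are illustrative. In practice twist is taken as the peak of the instantaneous difference curve, so it need not equal the difference of the two peak rotations exactly.

```python
def lv_twist_deg(apical_rotation_deg, basal_rotation_deg):
    """Net twist: apical rotation (positive) minus basal rotation (negative)."""
    return apical_rotation_deg - basal_rotation_deg


def lv_torsion_deg_per_cm(twist_deg, lv_length_cm):
    """Torsion: twist normalized by LV long-axis length."""
    return twist_deg / lv_length_cm


def apical_basal_rotation_delay_ms(time_to_peak_basal_ms, time_to_peak_apical_ms):
    """Delay: time to peak basal rotation minus time to peak apical rotation."""
    return time_to_peak_basal_ms - time_to_peak_apical_ms


# Baseline mean rotations from the Results; LV length and peak times are hypothetical.
twist = lv_twist_deg(12.5, -7.7)
torsion = lv_torsion_deg_per_cm(twist, lv_length_cm=7.0)
delay = apical_basal_rotation_delay_ms(400.0, 350.0)
```

With the baseline means, the simple difference gives a twist close to the reported value of about 20°, which is consistent with the definitions assumed here.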
LLM-as-a-Coauthor: Can Mixed Human-Written and Machine-Generated Text Be Detected?
With the rapid development and widespread application of Large Language
Models (LLMs), the use of Machine-Generated Text (MGT) has become increasingly
common, bringing with it potential risks, especially in terms of quality and
integrity in fields like news, education, and science. Current research mainly
focuses on purely MGT detection without adequately addressing mixed scenarios,
including AI-revised Human-Written Text (HWT) or human-revised MGT. To tackle
this challenge, we define mixtext, a form of mixed text involving both AI and
human-generated content. Then, we introduce MixSet, the first dataset dedicated
to studying these mixtext scenarios. Leveraging MixSet, we executed
comprehensive experiments to assess the efficacy of prevalent MGT detectors in
handling mixtext situations, evaluating their performance in terms of
effectiveness, robustness, and generalization. Our findings reveal that
existing detectors struggle to identify mixtext, particularly in dealing with
subtle modifications and style adaptability. This research underscores the
urgent need for more fine-grained detectors tailored for mixtext, offering
valuable insights for future research. Code and Models are available at
https://github.com/Dongping-Chen/MixSet.
Comment: Accepted by NAACL 202
Specific gene module pair-based target identification and drug discovery
Identification of the biological targets of a compound is of paramount importance for exploring the mechanism of action of drugs and for developing novel drugs. The concept of the Connectivity Map (CMap) was previously proposed to connect genes, drugs, and disease states based on common gene-expression signatures. For a new query compound, the CMap-based method can infer its potential targets by searching for similar drugs with known targets (reference drugs) and measuring the similarity between the transcriptional responses of the query compound and those of the reference drugs. However, the available methods are often inefficient because they require reference drugs as a medium to link the query agent and its targets. Here, we developed a general procedure to extract target-induced consensus gene modules from the transcriptional profiles induced by treatment with perturbagens of a target. A specific transcriptional gene module pair (GMP) was automatically identified for each target and could be used as a direct target signature. Based on the GMPs, we built the target network and identified some target gene clusters with similar biological mechanisms. Moreover, a gene module pair-based target identification (GMPTI) approach was proposed to predict novel compound–target interactions. Using this method, we have discovered novel inhibitors for three PI3K pathway proteins, PI3Kα/β/δ, including PU-H71, alvespimycin, reversine, astemizole, raloxifene HCl, and tamoxifen.
Adding power of artificial intelligence to situational awareness of large interconnections dominated by inverter‐based resources
Large-scale power systems exhibit more complex dynamics due to the increasing integration of inverter-based resources (IBRs). Therefore, there is an urgent need to enhance situational awareness for better monitoring and control of power grids dominated by IBRs. As a pioneering Wide-Area Measurement System, FNET/GridEye has developed and implemented various advanced applications based on the collected synchrophasor measurements to enhance the situational awareness of large-scale power grids. This study provides an overview of the latest progress of FNET/GridEye. The sensors, communication, and data servers have been upgraded to handle ultra-high-density synchrophasor and point-on-wave data so that system dynamics can be monitored in more detail. More importantly, several artificial intelligence (AI)-based advanced applications are introduced, including AI-based inertia estimation, AI-based disturbance size and location estimation, AI-based system stability assessment, and AI-based data authentication.