120 research outputs found

    A Semantic Invariant Robust Watermark for Large Language Models

    Watermark algorithms for large language models (LLMs) have achieved extremely high accuracy in detecting text generated by LLMs. Such algorithms typically involve adding extra watermark logits to the LLM's logits at each generation step. However, prior algorithms face a trade-off between attack robustness and security robustness. This is because the watermark logits for a token are determined by a certain number of preceding tokens; a small number leads to low security robustness, while a large number results in insufficient attack robustness. In this work, we propose a semantic invariant watermarking method for LLMs that provides both attack robustness and security robustness. The watermark logits in our work are determined by the semantics of all preceding tokens. Specifically, we utilize another embedding LLM to generate semantic embeddings for all preceding tokens, and then these semantic embeddings are transformed into the watermark logits through our trained watermark model. Subsequent analyses and experiments demonstrate the attack robustness of our method in semantically invariant settings: synonym substitution and text paraphrasing. Finally, we also show that our watermark possesses adequate security robustness. Our code and data are available at https://github.com/THU-BPM/Robust_Watermark. Comment: 16 pages, 9 figures, 2 tables
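The abstract above describes adding watermark logits to the LLM's logits at each generation step, with the watermark logits derived from a semantic embedding of the preceding text. A minimal sketch of that additive step follows. The function `toy_watermark_logits` is a hypothetical stand-in for the trained watermark model, and the strength parameter `delta` is an assumption; neither is taken from the paper or its repository.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_watermark_logits(embedding, vocab_size):
    """Hypothetical stand-in for a trained watermark model.

    Maps a semantic embedding to one value in {-1.0, +1.0} per vocabulary
    token using a fixed, deterministic projection. The real model is learned.
    """
    logits = []
    for i in range(vocab_size):
        score = sum(e * math.sin((i + 1) * (j + 1)) for j, e in enumerate(embedding))
        logits.append(1.0 if score >= 0 else -1.0)
    return logits


def add_watermark(llm_logits, watermark_logits, delta=1.0):
    """Add delta-scaled watermark logits to the model's logits."""
    if len(llm_logits) != len(watermark_logits):
        raise ValueError("logit vectors must have the same length")
    return [l + delta * w for l, w in zip(llm_logits, watermark_logits)]


# Toy usage: a 3-dimensional "embedding" of the preceding text, vocabulary of 5.
embedding = [0.2, -0.1, 0.4]
llm_logits = [1.0, 2.0, 3.0, 4.0, 5.0]
wm_logits = toy_watermark_logits(embedding, vocab_size=len(llm_logits))
probs = softmax(add_watermark(llm_logits, wm_logits, delta=2.0))
```

Because the watermark logits depend only on the embedding, two texts with similar embeddings would receive similar biases in this sketch, which is the intuition behind robustness to paraphrasing.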

    Prompt Me Up: Unleashing the Power of Alignments for Multimodal Entity and Relation Extraction

    How can we better extract entities and relations from text? Using multimodal extraction with images and text obtains more signals for entities and relations, and aligns them through graphs or hierarchical fusion, aiding in extraction. Despite attempts at various fusions, previous works have overlooked many unlabeled image-caption pairs, such as NewsCLIPing. This paper proposes innovative pre-training objectives for entity-object and relation-image alignment, extracting objects from images and aligning them with entity and relation prompts for soft pseudo-labels. These labels are used as self-supervised signals for pre-training, enhancing the ability to extract entities and relations. Experiments on three datasets show an average 3.41% F1 improvement over prior SOTA. Additionally, our method is orthogonal to previous multimodal fusions, and using it on prior SOTA fusions yields a further 5.47% F1 improvement. Comment: Accepted to ACM Multimedia 202

    RAPL: A Relation-Aware Prototype Learning Approach for Few-Shot Document-Level Relation Extraction

    How to identify semantic relations among entities in a document when only a few labeled documents are available? Few-shot document-level relation extraction (FSDLRE) is crucial for addressing the pervasive data scarcity problem in real-world scenarios. Metric-based meta-learning is an effective framework widely adopted for FSDLRE, which constructs class prototypes for classification. However, existing works often struggle to obtain class prototypes with accurate relational semantics: 1) To build a prototype for a target relation type, they aggregate the representations of all entity pairs holding that relation, while these entity pairs may also hold other relations, thus disturbing the prototype. 2) They use a set of generic NOTA (none-of-the-above) prototypes across all tasks, neglecting that the NOTA semantics differ in tasks with different target relation types. In this paper, we propose a relation-aware prototype learning method for FSDLRE to strengthen the relational semantics of prototype representations. By judiciously leveraging the relation descriptions and realistic NOTA instances as guidance, our method effectively refines the relation prototypes and generates task-specific NOTA prototypes. Extensive experiments demonstrate that our method outperforms state-of-the-art approaches by an average of 2.61% F1 across various settings of two FSDLRE benchmarks. Comment: Accepted to EMNLP 202
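The metric-based framework mentioned above builds one prototype per class and classifies a query by its nearest prototype. The sketch below shows only that generic baseline (mean-of-support prototypes with squared Euclidean distance), not the relation-aware refinement the paper proposes. The relation label `founded_by`, the two-dimensional vectors, and the choice of distance are illustrative assumptions.

```python
def mean_vector(vectors):
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


def build_prototypes(support):
    """Map each class label to the mean of its support embeddings."""
    return {label: mean_vector(vecs) for label, vecs in support.items()}


def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(query, prototypes):
    """Return the label of the prototype closest to the query embedding."""
    return min(prototypes, key=lambda label: squared_distance(query, prototypes[label]))


# Hypothetical entity-pair embeddings for one relation and the NOTA class.
support = {
    "founded_by": [[1.0, 0.0], [0.8, 0.2]],
    "NOTA": [[0.0, 1.0], [0.2, 0.8]],
}
prototypes = build_prototypes(support)
predicted = classify([0.7, 0.3], prototypes)
```

In this baseline an entity pair that also holds other relations still contributes its full embedding to the mean, which is the first weakness the abstract points out.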

    SinicView: A visualization environment for comparisons of multiple nucleotide sequence alignment tools

    BACKGROUND: Deluged by the rate and complexity of completed genomic sequences, the need to align longer sequences becomes more urgent, and many more tools have thus been developed. In the initial stage of genomic sequence analysis, a biologist is usually faced with the questions of how to choose the best tool to align sequences of interest and how to analyze and visualize the alignment results, and then with the question of whether poorly aligned regions produced by the tool are indeed not homologous or are just results due to inappropriate alignment tools or scoring systems used. Although several systematic evaluations of multiple sequence alignment (MSA) programs have been proposed, they may not provide a standard-bearer for most biologists because those poorly aligned regions in these evaluations are never discussed. Thus, a tool that allows cross comparison of the alignment results obtained by different tools simultaneously could help a biologist evaluate their correctness and accuracy. RESULTS: In this paper, we present a versatile alignment visualization system called SinicView (for Sequence-aligning INnovative and Interactive Comparison VIEWer), which allows the user to efficiently compare and evaluate assorted nucleotide alignment results obtained by different tools. SinicView calculates the similarity of the alignment outputs under a fixed window using the sum-of-pairs method and provides scoring profiles of each set of aligned sequences. The user can visually compare alignment results either in graphic scoring profiles or in plain text format of the aligned nucleotides along with the annotation information. We illustrate the capabilities of our visualization system by comparing alignment results obtained by MLAGAN, MAVID, and MULTIZ, respectively.
CONCLUSION: With SinicView, users can use their own data sequences to compare various alignment tools or scoring systems and select the most suitable one to perform alignment in the initial stage of sequence analysis
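The abstract above says similarity is computed under a fixed window using the sum-of-pairs method. A minimal sketch of a windowed sum-of-pairs profile follows. The scoring values (match 1, mismatch 0, gap 0) and the averaging over each window are assumptions for illustration and are not SinicView's actual parameters.

```python
from itertools import combinations


def column_sum_of_pairs(column, match=1, mismatch=0, gap=0):
    """Score one alignment column by summing over all pairs of sequences."""
    score = 0
    for a, b in combinations(column, 2):
        if a == "-" or b == "-":
            score += gap
        elif a == b:
            score += match
        else:
            score += mismatch
    return score


def windowed_sp_profile(alignment, window):
    """Average sum-of-pairs score for every fixed-size window of columns."""
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    if not 1 <= window <= length:
        raise ValueError("window must be between 1 and the alignment length")
    column_scores = [column_sum_of_pairs(col) for col in zip(*alignment)]
    return [
        sum(column_scores[i:i + window]) / window
        for i in range(length - window + 1)
    ]


# Toy alignment of three sequences; '-' marks a gap.
alignment = ["ACGT", "ACGA", "AC-T"]
profile = windowed_sp_profile(alignment, window=2)
# profile == [3.0, 2.0, 1.0]
```

Plotting such a profile for the output of each alignment tool over the same sequences is one way the scoring profiles described above could be compared side by side.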

    Transcriptomes of Mouse Olfactory Epithelium Reveal Sexual Differences in Odorant Detection

    To sense numerous odorants and chemicals, animals have evolved a large number of olfactory receptor genes (Olfrs) in their genome. In particular, the house mouse has ∼1,100 genes in the Olfr gene family. This makes the mouse a good model organism to study Olfr genes and olfaction-related genes. To date, whether male and female mice possess the same ability to detect environmental odorants is still unknown. Using next-generation sequencing technology (paired-end mRNA-seq), we detected 1,088 expressed Olfr genes in both male and female olfactory epithelium. We found that not only Olfr genes but also odorant-binding protein (Obp) genes have evolved rapidly in the mouse lineage. Interestingly, Olfr genes tend to be expressed at a higher level in males than in females, whereas the Obp genes clustered on the X chromosome show the opposite trend. These observations may imply a more efficient odorant-transporting system in females and a more active Olfr gene expression system in males. In addition, we detected the expression of two genes encoding major urinary proteins, which have been proposed to bind and transport pheromones or act as pheromones in mouse urine. This observation suggests a role of the main olfactory system (MOS) in pheromone detection, contrary to the view that only the accessory olfactory system (AOS) is involved in pheromone detection. This study suggests sexual differences in detecting environmental odorants in the MOS and demonstrates that mRNA-seq provides a powerful tool for detecting genes with low expression levels and with high sequence similarities

    Artificial intelligence-assisted remote detection of ST-elevation myocardial infarction using a mini-12-lead electrocardiogram device in prehospital ambulance care

    Objective: To implement an all-day online artificial intelligence (AI)-assisted detection of ST-elevation myocardial infarction (STEMI) by prehospital 12-lead electrocardiograms (ECGs) to facilitate patient triage for timely reperfusion therapy. Methods: The proposed AI model combines a convolutional neural network and long short-term memory (CNN-LSTM) to predict STEMI on prehospital 12-lead ECGs obtained from mini-12-lead ECG devices equipped in ambulance vehicles in Central Taiwan. Emergency medical technicians (EMTs) from the 14 AI-implemented fire stations performed the on-site 12-lead ECG examinations using the mini portable device. The 12-lead ECG signals were transmitted to the AI center of China Medical University Hospital to classify the recordings as “STEMI” or “Not STEMI”. In 11 non-AI fire stations, the ECG data were transmitted to a secure network and read by available on-line emergency physicians. The response time was defined as the time interval between the ECG transmission and ECG interpretation feedback. Results: Between July 17, 2021, and March 26, 2022, the AI model classified 362 prehospital 12-lead ECGs obtained from 275 consecutive patients who had called the 119 dispatch centers of fire stations in Central Taiwan for symptoms of chest pain or shortness of breath. The AI's response time to the EMTs in ambulance vehicles was 37.2 ± 11.3 s, which was shorter than the online physicians' response time from 11 other fire stations with no AI implementation (113.2 ± 369.4 s, P < 0.001) after analyzing another set of 335 prehospital 12-lead ECGs. The evaluation metrics including accuracy, precision, specificity, recall, area under the receiver operating characteristic curve, and F1 score to assess the overall AI performance in the remote detection of STEMI were 0.992, 0.889, 0.994, 0.941, 0.997, and 0.914, respectively.
During the study period, the AI model promptly identified 10 STEMI patients who underwent primary percutaneous coronary intervention (PPCI) with a median contact-to-door time of 18.5 (IQR: 16–20.8) minutes. Conclusion: Implementation of an all-day real-time AI-assisted remote detection of STEMI on prehospital 12-lead ECGs in the field is feasible with a high diagnostic accuracy rate. This approach may help minimize preventable delays in contact-to-treatment times for STEMI patients who require PPCI
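The evaluation metrics reported above are related by standard formulas, so the F1 score can be checked from the stated precision and recall. The sketch below does that and also computes all threshold-based metrics from a confusion matrix. The counts (16 true positives, 2 false positives, 343 true negatives, 1 false negative) are an illustrative assumption that sums to the 362 ECGs and reproduces the rounded figures; the abstract does not report a confusion matrix.

```python
def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "specificity": tn / (tn + fp),
        "recall": recall,
        "f1": f1_from(precision, recall),
    }


# Consistency check using only figures stated in the abstract.
reported_f1 = f1_from(0.889, 0.941)
# round(reported_f1, 3) == 0.914

# Assumed counts, NOT from the study: one confusion matrix over 362 ECGs
# whose rounded metrics match the reported values.
illustrative = classification_metrics(tp=16, fp=2, tn=343, fn=1)
```

The area under the receiver operating characteristic curve cannot be recovered this way, since it depends on the model's scores across all thresholds and not on a single confusion matrix.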

    The Tumor-Immune Microenvironment and Response to Radiation Therapy

    Chemotherapy and radiation therapy (RT) are standard therapeutic modalities for patients with cancer, including breast cancer. Historic studies examining tissue and cellular responses to RT have predominantly focused on damage caused to proliferating malignant cells leading to their death. However, there is increasing evidence that RT also leads to significant alterations in the tumor microenvironment, particularly with respect to effects on immune cells infiltrating tumors. This review focuses on tumor-associated immune cell responses following RT and discusses how immune responses may be modified to enhance durability and efficacy of RT

    Circulating Exosomal miRNAs as Biomarkers in Epithelial Ovarian Cancer

    Failure to detect early-stage epithelial ovarian cancer (EOC) is a major contributing factor to its low survival rate. Increasing evidence suggests that different subtypes of EOC may behave as distinct diseases due to their different cells of origin, histology and treatment responses. Therefore, the identification of EOC subtype-specific biomarkers that can detect the disease early should be clinically beneficial. Exosomes are extracellular vesicles secreted by different types of cells that carry biological molecules, which play important roles in cell-cell communication and regulation of various biological processes. Multiple studies have proposed that exosomal miRNAs present in the circulation are good biomarkers for non-invasive early detection of cancer. In this review, the potential use of exosomal miRNAs as early detection biomarkers for EOCs and their accuracy are discussed. We also review the differential expression of circulating exosomal miRNAs and cell-free miRNAs between different biofluid sources, i.e., plasma and serum, and touch on the issue of endogenous reference miRNA selection. Additionally, the current clinical trials using miRNAs for detecting EOCs are summarized. In conclusion, circulating exosomal miRNAs as non-invasive biomarkers have a high potential for early detection of EOC and its subtypes, and are likely to be clinically important in the future