215 research outputs found

    SDA: Simple Discrete Augmentation for Contrastive Sentence Representation Learning

    Contrastive learning methods achieve state-of-the-art results in unsupervised sentence representation learning. Although they play essential roles in contrastive learning, data augmentation methods applied to sentences have not been fully explored. The current SOTA method, SimCSE, uses a simple dropout mechanism as continuous augmentation, which outperforms discrete augmentations such as cropping, word deletion and synonym replacement. To understand the underlying rationales, we revisit existing approaches and attempt to hypothesize the desiderata of reasonable data augmentation methods: a balance of semantic consistency and expression diversity. Based on the hypothesis, we propose three simple yet effective discrete sentence augmentation methods, i.e., punctuation insertion, affirmative auxiliary and double negation. The punctuation marks, auxiliaries and negative words act as minimal noise at the lexical level to produce diverse sentence expressions. Unlike traditional augmentation methods, which randomly modify the sentence, our augmentation rules are well designed for generating semantically consistent and grammatically correct sentences. We conduct extensive experiments on both English and Chinese semantic textual similarity datasets. The results show the robustness and effectiveness of the proposed methods.

    Transcriptional Regulation of opaR, qrr2–4 and aphA by the Master Quorum-Sensing Regulator OpaR in Vibrio parahaemolyticus

    Background: Vibrio parahaemolyticus is a leading cause of infectious diarrhea and enterogastritis via the fecal-oral route. V. harveyi is a pathogen of fishes and invertebrates, and has been used as a model for quorum sensing (QS) studies. LuxR is the master QS regulator (MQSR) of V. harveyi, and LuxR-dependent expression of its own gene, qrr2–4 and aphA has been established in V. harveyi. Molecular regulation of target genes by the V. parahaemolyticus MQSR OpaR is still poorly understood. Methodology/Principal Findings: The bioinformatics analysis indicated that V. parahaemolyticus OpaR, V. harveyi LuxR, V. vulnificus SmcR, and V. alginolyticus ValR were extremely conserved, and that these four MQSRs appeared to recognize the same conserved cis-acting signals, which were represented by consensus constructs in the form of a position frequency matrix and a 20 bp box, within their target promoters. The MQSR box-like sequences were found within the upstream DNA regions of opaR, qrr2–4 and aphA in V. parahaemolyticus, and the direct transcriptional regulation of these target genes by OpaR was further confirmed by multiple biochemical experiments including primer extension assay, gel mobility shift assay, and DNase I footprinting analysis. Translation and transcription starts, core promoter elements for sigma factor recognition, Shine-Dalgarno sequences for ribosome recognition, and OpaR-binding sites were determined for the five target genes of OpaR, which gave a structural map of the OpaR-dependent promoters. Further computational promote

    HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face

    Solving complicated AI tasks with different domains and modalities is a key step toward artificial general intelligence. While there are numerous AI models available for various domains and modalities, they cannot handle complicated AI tasks autonomously. Considering that large language models (LLMs) have exhibited exceptional abilities in language understanding, generation, interaction, and reasoning, we advocate that LLMs could act as a controller to manage existing AI models to solve complicated AI tasks, with language serving as a generic interface to empower this. Based on this philosophy, we present HuggingGPT, an LLM-powered agent that leverages LLMs (e.g., ChatGPT) to connect various AI models in machine learning communities (e.g., Hugging Face) to solve AI tasks. Specifically, we use ChatGPT to conduct task planning when receiving a user request, select models according to their function descriptions available in Hugging Face, execute each subtask with the selected AI model, and summarize the response according to the execution results. By leveraging the strong language capability of ChatGPT and the abundant AI models in Hugging Face, HuggingGPT can tackle a wide range of sophisticated AI tasks spanning different modalities and domains and achieve impressive results in language, vision, speech, and other challenging tasks, which paves a new way towards the realization of artificial general intelligence.

    Cryo-EM Structure of a Novel Calicivirus, Tulane Virus.

    Tulane virus (TV) is a newly isolated cultivatable calicivirus that infects juvenile rhesus macaques. Here we report a 6.3 Å resolution cryo-electron microscopy structure of the TV virion. The TV virion is about 400 Å in diameter and consists of a T = 3 icosahedral protein capsid enclosing the RNA genome. 180 copies of the major capsid protein VP1 (~57 kDa) are organized into two types of dimers, A/B and C/C, and form a thin, smooth shell studded with 90 dimeric protrusions. The overall capsid organization and the capsid protein fold of TV closely resemble those of other caliciviruses, especially human Norwalk virus, the prototype human norovirus. These close structural similarities support TV as an attractive surrogate for the non-cultivatable human noroviruses. The most distinctive feature of TV is that its C/C dimers are in a highly flexible conformation with significantly reduced interactions between the shell (S) domain and the protruding (P) domain of VP1. A comparative structural analysis indicated that the P domains of TV C/C dimers were much more flexible than those of other caliciviruses. These observations, combined with previous studies on other caliciviruses, led us to hypothesize that the enhanced flexibility of C/C dimer P domains is likely required for efficient calicivirus-host cell interactions and the consequent uncoating and genome release. Residues in the S-P1 hinge between the S and P domains may play a critical role in the flexibility of P domains of C/C dimers.

    What Makes Pre-trained Language Models Better Zero/Few-shot Learners?

    In this paper, we propose a theoretical framework to explain the efficacy of prompt learning in zero/few-shot scenarios. First, we prove that the conventional pre-training and fine-tuning paradigm fails in few-shot scenarios due to overfitting the unrepresentative labelled data. We then detail the assumption that prompt learning is more effective because it empowers the pre-trained language model, which is built upon massive text corpora, as well as domain-related human knowledge, to participate more in prediction, and thereby reduces the impact of the limited label information provided by the small training set. We further hypothesize that language discrepancy can measure the quality of prompting. Comprehensive experiments are performed to verify our assumptions. More remarkably, inspired by the theoretical framework, we propose an annotation-agnostic template selection method based on perplexity, which enables us to "forecast" the prompting performance in advance. This approach is especially encouraging because existing work still relies on a development set to evaluate templates post hoc. Experiments show that this method leads to significant prediction benefits compared to state-of-the-art zero-shot methods.

    Isolation and Characterization of 89K Pathogenicity Island-Positive ST-7 Strains of Streptococcus suis Serotype 2 from Healthy Pigs, Northeast China

    Streptococcus suis is a swine pathogen which can also cause severe infection, such as meningitis and streptococcal-like toxic shock syndrome (STSS), in humans. In China, most of the S. suis infections in humans were reported in the southern areas with warm and humid climates, but little attention had been paid to the northern areas. Data presented here showed that the virulent serotypes 1, 2, 7, and 9 of S. suis could be steadily isolated from healthy pigs in pig farms in all three provinces of Northeast China. Notably, a majority of the serotype 2 isolates belonged to the 89K pathogenicity island-positive ST-7 clone that had historically caused the human STSS outbreaks in the Sichuan and Jiangsu provinces of China, although no human STSS case caused by S. suis had ever been reported in northern areas of China. Data presented here indicated that the survey of S. suis should be expanded to or reinforced in the northern areas of China.

    Cyclic Motion Planning of Redundant Robot Arms: Simple Extension of Performance Index May Not Work

    Abstract In this paper, multiple types of performance indices (termed, an origina

    Phenotypic and transcriptional analysis of the osmotic regulator OmpR in Yersinia pestis

    Background: The osmotic regulator OmpR in Escherichia coli differentially regulates the expression of the major porin proteins OmpF and OmpC. In Yersinia enterocolitica and Y. pseudotuberculosis, OmpR is required for both virulence and survival within macrophages. However, the phenotypic and regulatory roles of OmpR in Y. pestis are not yet fully understood. Results: Y. pestis OmpR is involved in building resistance against phagocytosis and controls the adaptation to various stressful conditions met in macrophages. The ompR mutation likely did not affect the virulence of Y. pestis strain 201, a human-avirulent enzootic strain. The microarray-based comparative transcriptome analysis disclosed a set of 224 genes whose expression was affected by the ompR mutation, indicating the global regulatory role of OmpR in Y. pestis. Real-time RT-PCR or lacZ fusion reporter assay further validated 16 OmpR-dependent genes, for which OmpR consensus-like sequences were found within their upstream DNA regions. ompC, F, X, and R were up-regulated dramatically with the increase of medium osmolarity, which was mediated by OmpR occupying the target promoter regions in a tandem manner. Conclusion: OmpR contributes to the resistance against phagocytosis or survival within macrophages, which is conserved in the pathogenic yersiniae. Y. pestis OmpR regulates ompC, F, X, and R directly through OmpR-promoter DNA association. There is inducible expression of the pore-forming proteins OmpF, C, and X at high osmolarity in Y. pestis, in contrast to their reciprocal regulation in E. coli. The main difference is that ompF expression is not repressed at high osmolarity in Y. pestis, which is likely due to the absence of a promoter-distal OmpR-binding site for ompF.

    Regulatory effects of cAMP receptor protein (CRP) on porin genes and its own gene in Yersinia pestis

    Background: The cAMP receptor protein (CRP) is a global bacterial regulator that controls many target genes. The CRP-cAMP complex regulates the ompR-envZ operon in E. coli directly, involving both positive and negative regulation of multiple target promoters; further, it controls the production of porins indirectly through its direct action on ompR-envZ. Auto-regulation of CRP has also been established in E. coli. However, the regulation of porin genes and its own gene by CRP remains unclear in Y. pestis. Results: Y. pestis employs a distinct mechanism in which CRP has no regulatory effect on the ompR-envZ operon; however, it stimulates ompC and ompF directly, while repressing ompX. No transcriptional regulatory association between CRP and its own gene can be detected in Y. pestis, which is also in contrast to the fact that CRP acts as both repressor and activator for its own gene in E. coli. It is likely that Y. pestis OmpR and CRP respectively sense different signals (medium osmolarity and cellular cAMP levels) to regulate porin genes independently. Conclusion: Although the CRP of Y. pestis shows a very high homology to that of E. coli, and the consensus DNA sequence recognized by CRP is shared by the two bacteria, the Y. pestis CRP can recognize the promoters of ompC, F, and X directly rather than that of its own gene, which is different from the relevant regulatory circuit of E. coli. Data presented here indicate a remarkable remodeling of the CRP-mediated regulation of porin genes and of its own gene between these two bacteria.