345 research outputs found

    Contrastive Learning with Prompt-derived Virtual Semantic Prototypes for Unsupervised Sentence Embedding

    Contrastive learning has become a new paradigm for unsupervised sentence embeddings. Previous studies focus on instance-wise contrastive learning, attempting to construct positive pairs with textual data augmentation. In this paper, we propose a novel Contrastive learning method with Prompt-derived Virtual semantic Prototypes (ConPVP). Specifically, with the help of prompts, we construct virtual semantic prototypes for each instance, and derive negative prototypes by using the negative form of the prompts. Using a prototypical contrastive loss, we enforce the anchor sentence embedding to be close to its corresponding semantic prototypes, and far apart from the negative prototypes as well as the prototypes of other sentences. Extensive experimental results on semantic textual similarity, transfer, and clustering tasks demonstrate the effectiveness of our proposed model compared to strong baselines. Code is available at https://github.com/lemon0830/promptCSE.
    Comment: Findings of EMNLP 202

    Project Overview of the Beijing-Arizona Sky Survey

    The Beijing-Arizona Sky Survey (BASS) is a wide-field two-band photometric survey of the Northern Galactic Cap using the 90Prime imager on the 2.3 m Bok telescope at Kitt Peak. It is a four-year collaboration between the National Astronomical Observatory of China and Steward Observatory, the University of Arizona, serving as one of the three imaging surveys to provide photometric input catalogs for target selection of the Dark Energy Spectroscopic Instrument (DESI) project. BASS will take up to 240 dark/grey nights to cover an area of about 5400 deg² in the g and r bands. The 5σ limiting AB magnitudes for point sources in the two bands, corrected for the Galactic extinction, are 24.0 and 23.4 mag, respectively. BASS, together with other DESI imaging surveys, will provide unique science opportunities that cover a wide range of topics in both Galactic and extragalactic astronomy.
    Comment: 10 pages, submitted to PAS

    Discrimination and classification of tobacco wastes by identification and quantification of polyphenols with LC–MS/MS

    The chemical composition of polyphenols in tobacco waste was identified by HPLC-PDA–ESI/MS/MS and the contents of chlorogenic acids and rutin in 10 varieties of tobacco wastes were determined by HPLC–UV. The relationships between the contents of active polyphenols and the varieties of tobacco wastes were interpreted by hierarchical cluster analysis (HCA) and principal component analysis (PCA). The results showed that 15 polyphenols were identified in a methanolic extract of dried tobacco waste. The tobacco wastes were characterized by high levels of chlorogenic acids (3-CQA, 5-CQA, and 4-CQA) and rutin; their ranges in the 10 tobacco varieties were 0.116–0.196, 0.686–1.781, 0.094–0.192, and 0.413–0.998 %, respectively. According to multivariate statistics models, two active compound variables can be considered important for the discrimination of the varieties of tobacco wastes: chlorogenic acids and rutin. Consequently, samples of 10 tobacco varieties were classified into three groups by HCA based on the PCA pattern. In conclusion, tobacco waste could be used as a new pharmaceutical material for the production of natural chlorogenic acids and rutin in the ethnopharmacological industry.

    Soft Language Clustering for Multilingual Model Pre-training

    Multilingual pre-trained language models have demonstrated impressive (zero-shot) cross-lingual transfer abilities; however, their performance is hindered when the target language has distant typology from source languages or when pre-training data is limited in size. In this paper, we propose XLM-P, which contextually retrieves prompts as flexible guidance for encoding instances conditionally. Our XLM-P enables (1) lightweight modeling of language-invariant and language-specific knowledge across languages, and (2) easy integration with other multilingual pre-training methods. On the tasks of XTREME including text classification, sequence labeling, question answering, and sentence retrieval, both base- and large-size language models pre-trained with our proposed method exhibit consistent performance improvement. Furthermore, it provides substantial advantages for low-resource languages in unsupervised sentence retrieval and for target languages that differ greatly from the source language in cross-lingual transfer.

    Time dependence of the orthotropic compression Young's moduli and Poisson's ratios of Chinese fir wood

    The time dependency of the orthotropic compliance for Chinese fir wood [Cunninghamia lanceolata (Lamb.) Hook] has been investigated by performing compressive creep experiments in all orthotropic directions. Time evolution of the creep strain in the axial and lateral directions was recorded by means of the digital image correlation (DIC) technique, to determine the diagonal and nondiagonal elements of the viscoelastic compliance matrix. The results reveal the significant influence of time on the mechanical behavior. The orthotropic nature of the viscoelastic compliance is highlighted by the different time dependency of the Young's moduli and the Poisson's ratios obtained for the individual directions. Differences among the time-dependent stress-strain relationships determined at the 25, 50, and 75% stress levels indicate that the viscoelastic behavior of wood is also load-dependent. The Poisson's ratio values, which increase with time for νLR, νLT, νRT, and νTR and decrease for νRL and νTL, demonstrate that the creep strain is influenced by loading direction. The substantially different time dependency of the nondiagonal elements of the compliance matrix further reveals the orthotropic compliance asymmetry and emphasizes the complexity of the viscoelastic character of wood.