
213 research outputs found

How Good Are Large Language Models at Out-of-Distribution Detection?

Author: Feng Yujie
Liu Bo
Lu Zexin
Wu Xiao-Ming
Xue Lei
Zhan Liming
Publication date: 20/08/2023

Out-of-distribution (OOD) detection plays a vital role in enhancing the reliability of machine learning (ML) models. The emergence of large language models (LLMs) has catalyzed a paradigm shift within the ML community, showcasing their exceptional capabilities across diverse natural language processing tasks. While existing research has probed OOD detection with smaller encoder-based Transformers like BERT and RoBERTa, the stark differences in scales, pre-training objectives, and inference paradigms call into question the applicability of these findings to LLMs. This paper embarks on a pioneering empirical investigation of OOD detection in the domain of LLMs, focusing on LLaMA series ranging from 7B to 65B in size. We thoroughly evaluate commonly-used OOD detectors, scrutinizing their performance in both zero-grad and fine-tuning scenarios. Notably, we alter previous discriminative in-distribution fine-tuning into generative fine-tuning, aligning the pre-training objective of LLMs with downstream tasks. Our findings unveil that a simple cosine distance OOD detector demonstrates superior efficacy, outperforming other OOD detectors. We provide an intriguing explanation for this phenomenon by highlighting the isotropic nature of the embedding spaces of LLMs, which distinctly contrasts with the anisotropic property observed in smaller BERT family models. The new insight enhances our understanding of how LLMs detect OOD data, thereby enhancing their adaptability and reliability in dynamic environments. Comment: Work in progress
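The cosine distance detector mentioned in the abstract above can be illustrated with a minimal sketch. It assumes one common formulation (score = 1 minus the maximum cosine similarity to any in-distribution embedding); the abstract does not spell out the paper's exact scoring rule, and the function names and toy 2-D vectors here are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_ood_score(query, in_dist_embeddings):
    """Assumed scoring rule: 1 - max cosine similarity to the
    in-distribution embeddings. Higher means more likely OOD."""
    return 1.0 - max(cosine_similarity(query, e) for e in in_dist_embeddings)


# Toy 2-D vectors standing in for sentence embeddings (hypothetical data).
in_dist = [[1.0, 0.0], [0.9, 0.1]]
near = cosine_ood_score([1.0, 0.05], in_dist)  # points the same way as the ID data
far = cosine_ood_score([0.0, 1.0], in_dist)    # nearly orthogonal to the ID data
```

In this toy setup `far` is much larger than `near`, so a threshold on the score would flag the second query as out-of-distribution. In practice the embeddings would come from the language model's hidden states.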

arXiv.org e-Print Archive

Effects of Ma-Xin-Di-Tan decoction on ovalbumin-induced allergic asthma in mice

Author: Bai Li
Liu Yazun
Ming Xi
Xu Wanchao
Xue Zheng
Yu Jianer
Zhan Xinguang
Publication venue: 'African Journals Online (AJOL)'
Publication date: 31/05/2018

Purpose: To investigate the effect of the Ma-Xin-Di-Tan (MXDT) decoction on ovalbumin-induced allergic asthma (AA) in mice. Methods: Asthma was induced in mice by ovalbumin (OVA) injection, and different doses of MXDT (150, 300, and 600 mg/kg/day) were administered orally for 28 days. Pathological changes in lung tissues were examined, while levels of cytokines, including interleukin (IL)-4, IL-6, IL-17, interferon (IFN)-γ, and transforming growth factor (TGF)-β, were determined using enzyme-linked immunosorbent assays (ELISAs) of the bronchoalveolar lavage fluid. Toll-like receptor (TLR)-4, GATA-binding protein (GATA)-3, Ox40 ligand (OX40L), indoleamine 2,3-dioxygenase (IDO), forkhead box P3 (Foxp3), and T box expressed in T cells (T-bet) levels were determined in lung tissues by western blot analysis. Results: MXDT inhibited the inflammatory reaction of lung tissues in OVA-challenged mice. After treatment with MXDT, levels of IL-4, IL-6, IL-17, and TGF-β were downregulated, whereas IFN-γ levels were upregulated. In addition, MXDT decreased TLR-4, GATA-3, and OX40L levels in lung tissues but increased the expression of Foxp3, T-bet, and IDO. Conclusion: MXDT has antiallergic effects on OVA-induced AA in mice; the possible molecular mechanisms might involve the inhibition of inflammatory reactions and modulation of Th1/Th2 cytokine balance. Keywords: Ma-Xin-Di-Tan decoction, Allergic asthma, Inflammatory reactions, Th1/Th

AJOL - African Journals Online

The first symbiotic stars from the LAMOST survey

Author: Chen Xue-Fei
Han Zhan-Wen
Hou Yonghui
Li Jiao
Luo A-Li
Mikołajewska Joanna
Rebassa-Mansergas Alberto
Wang Yuefei
Wu Yue
Yang Ming
Zhang Yong
Publication date: 25/05/2015

Symbiotic stars are interacting binary systems with the longest orbital periods. They are typically formed by a white dwarf, a red giant and a nebula. These objects are natural astrophysical laboratories for studying the evolution of binaries. Current estimates of the population of Milky Way symbiotic stars vary from 3000 up to 400000. However, the current census is less than 300. The Large sky Area Multi-Object fiber Spectroscopic Telescope (LAMOST) survey can obtain hundreds of thousands of stellar spectra per year, providing a good opportunity to search for new symbiotic stars. In this work we detect four such binaries among 4,147,802 spectra released by LAMOST, of which two are new identifications. The first is LAMOST J12280490-014825.7, considered to be an S-type halo symbiotic star. The second is LAMOST J202629.80+423652.0, a D-type symbiotic star.

arXiv.org e-Print Archive

Examining the impact of carbon constraints on the capital structure of Chinese power enterprises

Author: Ming Xue Han
Tang Zhan Long
Yi Jing Dang
Zi Xin Guo
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2023

Under the “double carbon” goal, China’s power system will face tighter constraints from carbon emission reduction policy, so it is particularly important to study the impact of carbon constraints on the capital structure of power enterprises. Taking both static and dynamic viewpoints, this research treats the implementation of China’s carbon pilot policy as a quasi-natural experiment and uses the DID method, a sys-GMM model, and several robustness tests to examine how the carbon constraint affected the capital structure of power companies from 2008 to 2020. The empirical results show that financial leverage fell significantly after the implementation of China’s carbon pilot policy. Moreover, the mandatory carbon emission trading mechanism puts heavy-emission enterprises such as power enterprises under greater pressure to reduce emissions, which raises the risk of financial distress, reduces their debt and equity financing, and prompts them to decrease financial leverage. In addition, the article tests an alternative possibility: that tighter carbon constraints reduce carbon-intensive investment, rather than increase financial distress risk, and thereby lower the asset-liability ratio. However, the coefficient of the interaction term is not significant. Further analysis indicates that the decline in financial leverage is unlikely to be caused by changes in investment.
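The DID (difference-in-differences) method named in the abstract above can be illustrated in its simplest two-group, two-period form. This is only a sketch of the textbook estimator, not the paper's regression specification (which also involves a sys-GMM model and controls); the function name and the leverage figures are hypothetical.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Two-by-two difference-in-differences: the change in the treated
    group minus the change in the control group."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical leverage ratios (not data from the paper).
treated_pre = [0.62, 0.60, 0.64]   # pilot-region firms before the policy
treated_post = [0.55, 0.57, 0.56]  # pilot-region firms after the policy
control_pre = [0.58, 0.60, 0.59]   # other firms before
control_post = [0.58, 0.59, 0.60]  # other firms after

effect = did_estimate(treated_pre, treated_post, control_pre, control_post)
```

With these made-up numbers the estimate is about -0.06: leverage in the treated group fell by six points while the control group did not move. The estimator attributes that gap to the policy only under the parallel-trends assumption.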

Directory of Open Access Journals

Expression profiling of human glial precursors

Author: Campanelli James T
Chesnut Jonathan D
Liang Feng
Liu Ying
Rao Mahendra S
Sandrock Robert W
Wheatley Will
Xue Haipeng
Zhan Ming
Zheng Jianhua
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: We have generated gene expression databases for human glial precursors, neuronal precursors, astrocyte precursors and neural stem cells and focused on comparing the profile of glial precursors with that of other populations. Results: A total of 14 samples were analyzed. Each population, previously distinguished from the others by immunocytochemical analysis of cell surface markers, expressed genes related to its key differentiation pathways. For the glial precursor cell population, we identified 458 genes that were uniquely expressed. Expression of a subset of these individual genes was validated by RT-PCR. We also report genes encoding cell surface markers that may be useful for identification and purification of human glial precursor populations. Conclusion: We provide a gene expression profile for human glial precursors. Our data suggest several signaling pathways that are important for proliferation and differentiation of human glial precursors. Such information may be utilized to further purify glial precursor populations, optimize media formulation, or study the effects of glial differentiation.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Towards Better Query Classification with Multi-Expert Knowledge Condensation in JD Ads Search

Author: Fang Zheng
Hu Jing-He
Jiang Xue
Lin Zhan-Gang
Ning Kun-Peng
Pang Ming
Peng Chang-Ping
Shao Jing-Ping
Zhao Xi-Wei
Publication date: 02/08/2023

Search query classification, as an effective way to understand user intents, is of great importance in real-world online ads systems. To ensure a lower latency, a shallow model (e.g. FastText) is widely used for efficient online inference. However, the representation ability of the FastText model is insufficient, resulting in poor classification performance, especially on some low-frequency queries and tailed categories. Using a deeper and more complex model (e.g. BERT) is an effective solution, but it will cause a higher online inference latency and more expensive computing costs. Thus, how to juggle both inference efficiency and classification performance is obviously of great practical importance. To overcome this challenge, in this paper, we propose knowledge condensation (KC), a simple yet effective knowledge distillation framework to boost the classification performance of the online FastText model under strict low latency constraints. Specifically, we propose to train an offline BERT model to retrieve more potentially relevant data. Benefiting from its powerful semantic representation, more relevant labels not exposed in the historical data will be added into the training set for better FastText model training. Moreover, a novel distribution-diverse multi-expert learning strategy is proposed to further improve the mining ability of relevant data. By training multiple BERT models from different data distributions, it can respectively perform better at high, middle, and low-frequency search queries. The model ensemble from multi-distribution makes its retrieval ability more powerful. We have deployed two versions of this framework in JD search, and both offline experiments and online A/B testing from multiple datasets have validated the effectiveness of the proposed approach
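The label-mining step described in the abstract above (offline expert models supply relevant labels that are missing from the historical data) might look roughly like the sketch below. Everything here is an assumption made for illustration: the experts are stand-in functions rather than BERT models, and the averaging rule, the threshold, and all names are hypothetical rather than taken from the paper.

```python
def condense_labels(query, historical_labels, experts, threshold=0.5):
    """Return the historical labels plus any label whose average score
    across the expert ensemble reaches the threshold (assumed rule)."""
    totals = {}
    for expert in experts:
        for label, score in expert(query).items():
            totals[label] = totals.get(label, 0.0) + score
    mined = {
        label for label, total in totals.items()
        if total / len(experts) >= threshold
    }
    return set(historical_labels) | mined


# Stand-in experts; in the described system these would be BERT models
# trained on different data distributions.
def head_expert(query):
    return {"phones": 0.9, "cases": 0.4}


def tail_expert(query):
    return {"phones": 0.7, "cases": 0.8}


labels = condense_labels("iphone case", {"phones"}, [head_expert, tail_expert])
```

Here the ensemble adds "cases" to the historical label "phones", and the enlarged label set would then be used to retrain the lightweight online classifier.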

arXiv.org e-Print Archive


A Refined Study of FCRL Genes from a Genome-Wide Association Study for Graves’ Disease

Author: Chen Jia-Lun
Chen Ming-Dao
Gao Guan-Qi
Gu Zhao-Hui
Li Chang-Gui
Liang Jun
Liang Liming
Liu Bing-Li
Liu Wei
Pan Chun-Ming
Song Huai-Dong
Song Zhi-Yi
Wang Hai-Ning
Xue Li-Qiong
Yang Shao-Ying
Yuan Guo-Yue
Zhan Ming
Zhang Xiao-Mei
Zhao Shuang-Xia
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 08/05/2013

To pinpoint the exact location of the etiological variant/s present at 1q21.1 harboring FCRL1-5 and CD5L genes, we carried out a refined association study in the entire FCRL region in 1,536 patients with Graves’ disease (GD) and 1,516 sex-matched controls by imputation analysis, logistic regression, and cis-eQTL analysis. Among 516 SNPs with P<0.05 in the initial GWAS scan, the strongest signals associated with GD and correlated to FCRL3 expression were located at a cluster of SNPs including rs7528684 and rs3761959. The allele-specific effects of rs3761959 and rs7528684 on FCRL3 expression level revealed that the risk alleles A of rs3761959 and C of rs7528684 were correlated with elevated expression of FCRL3 in PBMCs and their subsets, especially in CD19$^+$ B cells and CD8$^+$ T subsets. Next, the combined analysis with 5,300 GD cases and 4,916 control individuals confirmed that FCRL3 was a susceptibility gene of GD in Chinese Han populations, and rs3761959 and rs7528684 met the genome-wide association significance level ($P_{combined} = 2.27 \times 10^{-12}$ and $7.11 \times 10^{-13}$, respectively). Moreover, the haplotypes with the risk allele A of rs3761959 and risk allele C of rs7528684 were associated with GD risk. Finally, our epigenetic analysis suggested that the disease-associated C allele of rs7528684 increased affinity for the NF-κB transcription factor. These data indicate that the FCRL3 gene and its proxy SNP rs7528684 may be involved in the pathogenesis of GD by excessively inhibiting B cell receptor signaling and impairing the suppressive function of Tregs

Harvard University - DASH

FigShare

Genome wide profiling of human embryonic stem cells (hESCs), their derivatives and embryonal carcinoma cells to develop base profiles of U.S. Federal government approved hESC lines

Author: Baker Shawn C
Barker David L
Chudin Eugene
Gonzalez Rodolfo
Li Huai
Liu Ying
Loring Jeanne F
Mattson Mark P
McDaniel Timothy K
Mueller Franz-Josef
Oeser Steffen
Rao Mahendra S
Schwartz Catherine M
Shin Soojung
Xue Haipeng
Zeng Xianmin
Zhan Ming
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: In order to compare the gene expression profiles of human embryonic stem cell (hESC) lines and their differentiated progeny and to monitor feeder contaminations, we have examined gene expression in seven hESC lines and human fibroblast feeder cells using Illumina® bead arrays that contain probes for 24,131 transcripts. RESULTS: A total of 48 different samples (including duplicates) grown in multiple laboratories under different conditions were analyzed and pairwise comparisons were performed in all groups. Hierarchical clustering showed that blinded duplicates were correctly identified as the closest related samples. hESC lines clustered together irrespective of the laboratory in which they were maintained. hESCs could be readily distinguished from embryoid bodies (EB) differentiated from them and the karyotypically abnormal hESC line BG01V. The embryonal carcinoma (EC) line NTera2 is a useful model for evaluating characteristics of hESCs. Expression of subsets of individual genes was validated by comparing with published databases, MPSS (Massively Parallel Signature Sequencing) libraries, and parallel analysis by microarray and RT-PCR. CONCLUSION: We show that Illumina's bead array platform is a reliable, reproducible and robust method for developing base global profiles of cells and identifying similarities and differences in large numbers of samples.

Springer - Publisher Connector

PubMed Central

LAMOST medium-resolution spectroscopic survey of binarity and exotic star (LAMOST-MRS-B): Observation strategy and target selection

Author: Cai Jing-Hao
Chen Xue-Fei
Cui Wen-Yuan
Ge Hong-Wei
Guo Yan-Jun
Han Zhan-Wen
Jiang Deng-Kai
Li Jiang-Dan
Li Jiao
Li Li-Fang
Liu Chao
Liu Jia-Ming
Ren Juan-Juan
Shi Jian-Rong
Tian Hao
Yuan Hai-Long
Zhang Bo
Zhang Hao-Tong
Publication date: 27/12/2022

LAMOST-MRS-B is one of the sub-surveys of the LAMOST medium-resolution (R~7500) spectroscopic survey. It aims at studying the statistical properties (e.g., binary fraction, orbital period distribution, mass ratio distribution) of binary stars and exotic stars. We intend to observe about 30000 stars (10 mag <= G <= 14.5 mag) with at least 10 visits in five years. We first planned to observe 25 plates around the galactic plane in 2018. The number of plates was then reduced to 12 in 2019 because of limited observing time. At the same time, two new plates located at high galactic latitude were added to explore how binary properties are influenced by different environments. In this survey project, we gave the identified exotic and low-metallicity stars the highest observation priority. For the rest of the selected stars, we gave higher priority to the relatively brighter stars in order to obtain as many high-quality spectra as possible. Spectra of 49129 stars have been obtained in the LAMOST-MRS-B field and released in DR8, of which 28828 and 3375 stars have been visited more than twice and ten times with SNR >= 10, respectively. Most of the sources are B-, A-, and F-type stars with -0.6 < [Fe/H] < 0.4 dex. We also obtain 347 identified variable and exotic stars and about 250 stars with [Fe/H] < -1 dex. We measure radial velocities (RVs) by using 892233 spectra of the stars. The RV uncertainties are about 1 km/s and 10 km/s for 95% of late- and early-type stars, respectively. The datasets presented in this paper are available at http://www.doi.org/10.57760/sciencedb.j00113.00035

arXiv.org e-Print Archive