239 research outputs found

In-context Autoencoder for Context Compression in a Large Language Model

Authors: Chen Si-Qing, Ge Tao, Hu Jing, Wang Xun, Wei Furu
Publication date: 13/07/2023

We propose the In-context Autoencoder (ICAE) for context compression in a large language model (LLM). The ICAE has two modules: a learnable encoder adapted with LoRA from an LLM for compressing a long context into a limited number of memory slots, and a fixed decoder which is the target LLM that can condition on the memory slots for various purposes. We first pretrain the ICAE using both autoencoding and language modeling objectives on massive text data, enabling it to generate memory slots that accurately and comprehensively represent the original context. Then, we fine-tune the pretrained ICAE on a small amount of instruct data to enhance its interaction with various prompts for producing desirable responses. Our experimental results demonstrate that the ICAE learned with our proposed pretraining and fine-tuning paradigm can effectively produce memory slots with 4× context compression, which can be well conditioned on by the target LLM to respond to various prompts. The promising results demonstrate significant implications of the ICAE for its novel approach to the long context problem and its potential to reduce computation and memory overheads for LLM inference in practice, suggesting further research effort in context management for an LLM. Our code and data will be released shortly. Comment: Work in progress

arXiv.org e-Print Archive

An Evaluation on Large Language Model Outputs: Discourse and Memorization

Authors: Chen Si-Qing, de Wynter Adrian, Gu Qilong, Sokolov Alex, Wang Xun
Publication date: 17/04/2023

We present an empirical evaluation of various outputs generated by nine of the most widely-available large language models (LLMs). Our analysis is done with off-the-shelf, readily-available tools. We find a correlation between percentage of memorized text, percentage of unique text, and overall output quality, when measured with respect to output pathologies such as counterfactual and logically-flawed statements, and general failures like not staying on topic. Overall, 80.0% of the outputs evaluated contained memorized data, but outputs containing the most memorized content were also more likely to be considered of high quality. We discuss and evaluate mitigation strategies, showing that, in the models evaluated, the rate of memorized text being output is reduced. We conclude with a discussion on potential implications around what it means to learn, to memorize, and to evaluate quality text. Comment: Preprint. Under review

arXiv.org e-Print Archive

Phase selection rule of high-entropy metallic glasses with different short-to-medium-range orders

Authors: Dong Wei-Xia, Ge Jia-Cheng, Hahn Horst, Lan Si, Liu Si-Nan, Provenzano Virgil, Wang Xun-Li, Wu Zhen-Duo, Ying Hui-Qiang
Publication venue: Nonferrous Metals Society of China
Publication date: 24/05/2022

KITopen

Engineering medium-range order and polyamorphism in a nanostructured amorphous alloy

Authors: Almer Jon, Feng Tao, Gleiter Herbert, Guo Chunyu, Hahn Horst, Lan Si, Liu Chain-Tsuan, Pei Chaoqun, Ren Yang, Wang Xun-Li, Zhou Wenzhao
Publication venue: Nature Research
Publication date: 27/04/2020

Like crystalline materials, the properties of amorphous materials can be tailored by tuning the local atomic-to-nanoscale structural configurations. Polyamorphism is evident by the coexistence of kinetically stabilized amorphous structures with tailorable short-to-medium-range orders, providing a viable means to engineer the degree of local order and heterogeneity. Here, we report experimental evidence of the coexistence of liquid-like and solid-like amorphous phases in a Ni₈₂P₁₈ amorphous alloy with enhanced thermal stability and plasticity prepared by pulsed electrodeposition. The two amorphous phases, of comparable volume fraction of ~50% each, have similar short-range order but are distinguished by packing at the medium-range length scale (>6 Å). Upon heating, a structure crossover at ~450 K was observed, where the liquid-like structure transforms to the solid-like structure, as evidenced by the enthalpy release and an anomalous contraction of atomic structure over the medium-range length scale, due to the metastable nature of the liquid-like structure.

KITopen

Enhancement of thermophilic anaerobic sludge digestion by 70°C pre-treatment: energy considerations

Authors: Heng-Hua Zhang, Jin-Cang Zhang, Meng-Si Wang, Min Shao, Ming-Xing Li, Peng Yang, Shi-Xun Cao
Publication date: 01/01/2009

The objective of this work was to investigate the effect of a low-temperature pre-treatment (70°C) on the thermophilic anaerobic digestion of sewage sludge. Experimental results were used to calculate theoretical energy balances of full-scale digesters with and without a pre-treatment step. The 70°C sludge pre-treatment increased sludge solubilization 10-fold and enhanced volatile fatty acid generation. Biogas production increased by up to 30-40%, and the methane content of the biogas rose from 64% to 68-70%. Theoretical calculations showed that additional surplus energy production would be expected from incorporating a 70°C pre-treatment step into a thermophilic reactor.

Diposit Digital de Documents de la UAB

FigShare

Development and validation of a risk score model for predicting autism based on pre- and perinatal factors

Authors: Guanglei Xun, Huixi Dong, Jianjun Ou, Jingping Zhao, Kun Xia, Si Dai, Xiaozi Lu, Yanting Hou, Yidong Shen, Ying Wang
Publication venue: Frontiers Media S.A.
Publication date: 01/02/2024

Background: The use of pre- and perinatal risk factors as predictive factors may lower the age limit for reliable autism prediction. The objective of this study was to develop a clinical model based on these risk factors to predict autism. Methods: A stepwise logistic regression analysis was conducted to explore the relationships between 28 candidate risk factors and autism risk among 615 Han Chinese children with autism and 615 unrelated typically developing children. The significant factors were subsequently used to create a clinical risk score model. A chi-square automatic interaction detector (CHAID) decision tree was used to validate the selected predictors included in the model. The predictive performance of the model was evaluated by an independent cohort. Results: Five factors (pregnancy influenza-like illness, pregnancy stressors, maternal allergic/autoimmune disease, cesarean section, and hypoxia) were found to be significantly associated with autism risk. A receiver operating characteristic (ROC) curve indicated that the risk score model had good discrimination ability for autism, with an area under the curve (AUC) of 0.711 (95% CI=0.679-0.744); in the external validation cohort, the model showed slightly worse but overall similar predictive performance. Further subgroup analysis indicated that a higher risk score was associated with more behavioral problems. The risk score also exhibited robustness in a subgroup analysis of patients with mild autism. Conclusion: This risk score model could lower the age limit for autism prediction with good discrimination performance, and it has unique advantages in clinical application.

Directory of Open Access Journals

A weighted cranial diffusion-weighted imaging scale for Wilson’s disease

Authors: Bo Li, Chen-chen Xu, Hao Geng, Nan Cheng, Rui-qi Zhang, Shi-jing Wang, Si-rui Cheng, Tao Wang, Tong Wu, Xun Wang, Yi-ning Sun, Yong-sheng Han, Yong-zhu Han, Yu Wang, Zeng-hui Ding
Publication venue: Frontiers Media S.A.
Publication date: 01/08/2023

Objectives: Cranial magnetic resonance imaging (MRI) could be a crucial tool for the assessment of neurological symptoms in patients with Wilson’s disease (WD). Diffusion-weighted imaging (DWI) hyperintensity reflects acute brain injuries, which mainly occur in specific brain regions. Therefore, this study aimed to develop a weighted cranial DWI scale for patients with WD, with special focus on specific brain regions. Materials and methods: In total, 123 patients with WD were enrolled, 118 of whom underwent 1.5 T-MRI on admission. The imaging score was calculated as described previously and depended on the following sequences: one point was acquired when abnormal intensity occurred in the T1, T2, and fluid-attenuation inversion recovery sequences, and two points were acquired when DWI hyperintensity was found. Consensus weighting was conducted based on the symptoms and response to treatment. Results: Intra-rater agreement was good (r = 0.855 [0.798–0.897], p < 0.0001). DWI hyperintensity in the putamen was a high-risk factor for deterioration during de-copper therapy (OR = 8.656, p < 0.05). The high-risk factors for readmission for intravenous de-copper therapies were DWI hyperintensity in the midbrain (OR = 3.818, p < 0.05) and the corpus callosum (OR = 2.654, p < 0.05). Both scoring systems had positive correlation with UWDRS scale (original semi-quantitative scoring system, r = 0.35, p < 0.001; consensus semi-quantitative scoring system, r = 0.351, p < 0.001). Compared to the original scoring system, the consensus scoring system had higher correlations with the occurrence of deterioration (OR = 1.052, 95%CI [1.003, 1.0103], p < 0.05) and readmission for intravenous de-copper therapy (OR = 1.043, 95%CI [1.001, 1.086], p < 0.05). Conclusion: The predictive performance of the consensus semi-quantitative scoring system for cranial MRI was improved to guide medication, healthcare management, and prognosis prediction in patients with WD. For every point increase in the neuroimaging score, the risk of exacerbations during treatment increased by 5.2%, and the risk of readmission to the hospital within 6 months increased by 4.3%.
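
The per-point percentages in this abstract follow directly from the reported odds ratios (OR = 1.052 and OR = 1.043). The sketch below shows that arithmetic. It assumes a constant per-point odds ratio, as in a logistic model, and the function name `percent_change_in_odds` is hypothetical, not taken from the paper. Strictly, an odds ratio describes a change in odds, which approximates a change in risk only when the outcome is uncommon.

```python
def percent_change_in_odds(odds_ratio: float, points: int = 1) -> float:
    """Percent change in odds after `points` score increments.

    Assumes the same odds ratio applies to every one-point increase,
    so the effect compounds multiplicatively across points.
    """
    return (odds_ratio ** points - 1.0) * 100.0


# Odds ratios as reported in the abstract above.
deterioration_pct = percent_change_in_odds(1.052)
readmission_pct = percent_change_in_odds(1.043)

print(f"deterioration: {deterioration_pct:.1f}% per point")
print(f"readmission: {readmission_pct:.1f}% per point")
```

Under the same assumption, calling `percent_change_in_odds(1.052, points=5)` gives the compounded change for a five-point difference, which is larger than five times the single-point figure.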

Directory of Open Access Journals