110 research outputs found

    Augmenting Large Language Model Translators via Translation Memories

    Full text link
    Using translation memories (TMs) as prompts is a promising approach to in-context learning for machine translation. In this work, we take a step towards prompting large language models (LLMs) with TMs and making them better translators. We find that the ability of LLMs to "understand" prompts is indeed helpful for making better use of TMs. Experiments show that the results of a pre-trained LLM translator can be greatly improved by using high-quality TM-based prompts. These results are even comparable to those of state-of-the-art NMT systems which have access to large-scale in-domain bilingual data and are well tuned on the downstream tasks. Comment: Accepted to Findings of ACL 202
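
The TM-prompting idea above can be sketched in a few lines: retrieve the memory pairs most similar to the input, then format them as few-shot examples ahead of the sentence to translate. The `difflib` similarity measure and the prompt template are illustrative assumptions, not the paper's actual retrieval or prompting setup:

```python
import difflib

def retrieve_tms(source, memory, k=2):
    """Rank (source, target) pairs in the translation memory by surface
    similarity to the input sentence and keep the top-k as examples."""
    ranked = sorted(
        memory,
        key=lambda pair: difflib.SequenceMatcher(None, source, pair[0]).ratio(),
        reverse=True,
    )
    return ranked[:k]

def build_tm_prompt(tms, source, src_lang="German", tgt_lang="English"):
    """Format the retrieved TM pairs as few-shot examples, ending with
    the sentence the LLM should translate."""
    lines = [f"Translate from {src_lang} to {tgt_lang}."]
    for tm_src, tm_tgt in tms:
        lines.append(f"{src_lang}: {tm_src}\n{tgt_lang}: {tm_tgt}")
    lines.append(f"{src_lang}: {source}\n{tgt_lang}:")
    return "\n\n".join(lines)
```

The completed prompt would then be sent to the LLM, which continues after the final `English:` marker.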

    Cross-layer Attention Sharing for Large Language Models

    Full text link
    As large language models (LLMs) evolve, the increase in model depth and parameter count leads to substantial redundancy. To enhance the efficiency of the attention mechanism, previous works primarily compress the KV cache or group attention heads, while largely overlooking redundancy between layers. Our comprehensive analyses across various LLMs show that highly similar attention patterns persist within most layers. It is intuitive to save computation by sharing attention weights across layers. However, further analysis reveals two challenges: (1) directly sharing the weight matrix without carefully rearranging the attention heads proves to be ineffective; (2) shallow layers are vulnerable to small deviations in attention weights. Driven by these insights, we introduce LiSA, a lightweight substitute for self-attention in well-trained LLMs. LiSA employs tiny feed-forward networks to align attention heads between adjacent layers and low-rank matrices to approximate differences in layer-wise attention weights. Evaluations encompassing 13 typical benchmarks demonstrate that LiSA maintains high response quality in terms of accuracy and perplexity while eliminating redundant attention calculations in 53-84% of the total layers. Our implementations of LiSA achieve a 6X compression of Q and K, with maximum throughput improvements of 19.5% for LLaMA3-8B and 32.3% for LLaMA2-7B. Comment: Work in progress
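
The low-rank correction idea can be illustrated with a minimal NumPy sketch: reuse the previous layer's attention weight matrix and add a rank-r term obtained from a truncated SVD of the layer-wise difference. The function name and the plain SVD truncation are assumptions for illustration, not LiSA's actual implementation (which additionally uses tiny FFNs to align heads between layers):

```python
import numpy as np

def share_with_low_rank_delta(w_prev, w_curr, rank):
    """Approximate the current layer's attention weight matrix as the
    previous layer's matrix plus a rank-r correction, so only the small
    factors of the difference need to be stored per layer."""
    u, s, vt = np.linalg.svd(w_curr - w_prev, full_matrices=False)
    return w_prev + (u[:, :rank] * s[:rank]) @ vt[:rank]
```

When the true difference between adjacent layers is close to low-rank, as the paper's similarity analysis suggests, this reconstruction is nearly exact.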

    Large Language Models are Parallel Multilingual Learners

    Full text link
    In this study, we reveal an in-context learning (ICL) capability of multilingual large language models (LLMs): by translating the input into several languages, we provide Parallel Input in Multiple Languages (PiM) to LLMs, which significantly enhances their comprehension abilities. To test this capability, we design extensive experiments encompassing 8 typical datasets, 7 languages and 8 state-of-the-art multilingual LLMs. Experimental results show that (1) incorporating more languages helps PiM surpass conventional ICL further; (2) even combining translations that are inferior to the baseline can also help. Moreover, by examining the activated neurons in LLMs, we discover a counterintuitive but interesting phenomenon. Contrary to the common belief that PiM would activate more neurons than monolingual input to leverage knowledge learned from diverse languages, PiM actually inhibits neurons and promotes more precise neuron activation, especially when more languages are added. This phenomenon aligns with the neuroscience insight on synaptic pruning, which removes less-used neural connections, strengthens the remaining ones, and thereby enhances brain intelligence. Comment: Work in progress
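
Mechanically, PiM amounts to prepending the same passage in several languages before the task instruction. A minimal sketch of such a prompt builder, with a hypothetical bracketed-language template (the paper's exact format may differ):

```python
def pim_prompt(parallel_inputs, instruction):
    """Parallel Input in Multiple Languages: present the same passage
    in several languages, then append the task instruction."""
    blocks = [f"[{lang}] {text}" for lang, text in parallel_inputs]
    blocks.append(instruction)
    return "\n".join(blocks)
```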

    The Applications of Finite Element Analysis in Proximal Humeral Fractures

    Get PDF
    Proximal humeral fractures are common and among the most challenging to treat, owing to the complexity of the glenohumeral joint, especially impacted fractures in the geriatric population; implant development continues because current problems with fixation remain unsolved. Pre-, intra-, and postoperative assessments are crucial in the management of these patients. Finite element analysis, one of the valuable tools, has been implemented as an effective and noninvasive method to analyze proximal humeral fractures, providing solid evidence for the management of troublesome patients. However, no review article on the applications and effects of finite element analysis in assessing proximal humeral fractures has been published yet. This review article summarizes the applications, contributions, and clinical significance of finite element analysis in assessing proximal humeral fractures. Furthermore, the limitations of finite element analysis, the difficulties of more realistic simulation, and the validation and creation of validated FE models are discussed. We conclude that although some advances in research on proximal humeral fractures have been made by using finite element analysis, applying this powerful tool to routine clinical management and adequate simulation requires more state-of-the-art studies to provide evidence and a basis

    Translate-and-Revise: Boosting Large Language Models for Constrained Translation

    Full text link
    Imposing constraints on machine translation systems presents a challenging issue because these systems are not trained to make use of constraints in generating adequate, fluent translations. In this paper, we leverage the capabilities of large language models (LLMs) for constrained translation, given that LLMs can easily adapt to this task by taking translation instructions and constraints as prompts. However, LLMs cannot always guarantee the adequacy of a translation and, in some cases, ignore the given constraints. This is in part because LLMs might be overly confident in their predictions, overriding the influence of the constraints. To overcome this overriding behaviour, we propose to add a revision process that encourages LLMs to correct the outputs by prompting them about the constraints that have not yet been met. We evaluate our approach on four constrained translation tasks, encompassing both lexical and structural constraints in multiple constraint domains. Experiments show a 15% improvement in constraint-based translation accuracy over standard LLMs, and the approach also significantly outperforms state-of-the-art neural machine translation (NMT) methods. Comment: 16 pages
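
The translate-then-revise loop can be sketched as follows. The `llm` callable stands in for any LLM API, the substring check is a simplification of real constraint verification, and the prompt wording is an assumption, not the paper's actual prompts:

```python
def unmet(translation, constraints):
    """Constraint terms not yet present in the output (string match)."""
    return [c for c in constraints if c not in translation]

def translate_and_revise(llm, source, constraints, max_rounds=3):
    """Translate once, then repeatedly prompt for a revision listing the
    constraint terms the current output still misses."""
    output = llm(f"Translate: {source}\nUse the terms: {', '.join(constraints)}")
    for _ in range(max_rounds):
        missing = unmet(output, constraints)
        if not missing:
            break
        output = llm(
            f"Revise this translation of '{source}' so it also uses: "
            f"{', '.join(missing)}\nCurrent translation: {output}"
        )
    return output
```

The key design choice is that each revision prompt names only the still-unmet constraints, focusing the model on exactly what it overrode.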

    Hybrid Alignment Training for Large Language Models

    Full text link
    Alignment training is crucial for enabling large language models (LLMs) to cater to human intentions and preferences. It is typically performed in two stages with different objectives: instruction-following alignment and human-preference alignment. However, aligning LLMs with these objectives in sequence suffers from an inherent problem: the objectives may conflict, and LLMs cannot be guaranteed to align well with both the instructions and human preferences simultaneously. In response, in this work we propose a Hybrid Alignment Training (Hbat) approach based on alternating alignment and modified elastic weight consolidation methods. The basic idea is to alternate between the different objectives during alignment training, so that better collaboration can be achieved between the two alignment tasks. We experiment with Hbat on summarization and dialogue tasks. Experimental results show that the proposed Hbat can significantly outperform all baselines. Notably, Hbat yields consistent performance gains over traditional two-stage alignment training when using both proximal policy optimization and direct preference optimization. Comment: Accepted by ACL (Findings) 202
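
The alternating scheme with an elastic-weight-consolidation penalty can be illustrated on a toy scalar parameter: odd steps take an instruction-following gradient, even steps a preference gradient, and a Fisher-weighted quadratic term pulls the parameter back toward an anchor. This is a minimal sketch of the idea, not Hbat's actual training procedure:

```python
def hybrid_alignment(theta, grad_instr, grad_pref, fisher, anchor,
                     rounds=4, lr=0.1, lam=0.5):
    """Alternate between two alignment objectives; the EWC-style penalty
    lam * fisher * (theta - anchor) discourages drifting away from the
    anchor parameters while either objective is being optimized."""
    for t in range(rounds):
        grad = grad_instr(theta) if t % 2 == 0 else grad_pref(theta)
        grad = grad + lam * fisher * (theta - anchor)
        theta = theta - lr * grad
    return theta
```

With two conflicting quadratic objectives (minima at +1 and -1), the alternation keeps the parameter near a compromise instead of oscillating to either extreme.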

    Values of lymphocyte-related ratios in predicting the clinical outcome of acute ischemic stroke patients receiving intravenous thrombolysis based on different etiologies

    Get PDF
    Background: While the neutrophil-to-lymphocyte ratio (NLR), lymphocyte-to-monocyte ratio (LMR), and platelet-to-lymphocyte ratio (PLR) have been associated with acute ischemic stroke (AIS) outcomes, their differential predictive value across etiological subtypes (TOAST classification) in thrombolysis-treated patients remains underexplored.
    Methods: In this retrospective cohort study, we analyzed 381 AIS patients receiving intravenous thrombolysis. Hematological indices were calculated from pre-thrombolysis blood samples. Using multivariable logistic regression adjusted for age, NIHSS, and comorbidities, we assessed associations between baseline ratios and 90-day unfavorable outcomes (mRS 3–6). Receiver operating characteristic (ROC) analysis was used to determine optimal cutoffs stratified by TOAST subtypes.
    Results: A total of 381 patients were included in the study. NLR showed superior predictive performance: large-artery atherosclerosis, AUC = 0.702 (aOR = 1.35, 95% CI = 1.14–1.61, p = 0.001); small-artery occlusion, AUC = 0.750 (aOR = 1.51, 95% CI = 1.08–2.10, p = 0.015); cardioembolic stroke, AUC = 0.679 (aOR = 1.82, 95% CI = 1.07–3.10, p = 0.028). LMR showed predictive value only in large-artery atherosclerosis (AUC = 0.632, p = 0.004). Optimal NLR cutoffs were 3.19 (large-artery), 3.94 (small-artery), and 3.17 (cardioembolic stroke).
    Conclusion: NLR emerged as a robust, subtype-specific predictor of post-thrombolysis outcomes, particularly in atherosclerotic stroke variants. These findings support NLR's clinical utility for risk stratification in thrombolysis-eligible AIS patients.
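
The two core computations, the ratio itself and the ROC AUC, are standard formulas and can be sketched directly (this is generic code, not the study's analysis scripts). The AUC uses the Mann-Whitney formulation: the probability that a randomly chosen positive case scores higher than a randomly chosen negative one:

```python
def nlr(neutrophils, lymphocytes):
    """Neutrophil-to-lymphocyte ratio from an absolute differential count."""
    return neutrophils / lymphocytes

def auc(scores, labels):
    """ROC AUC via the Mann-Whitney statistic: fraction of
    positive/negative pairs ranked correctly, ties counted as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

For example, a patient with 6.0 x 10^9/L neutrophils and 2.0 x 10^9/L lymphocytes has an NLR of 3.0, just below the reported 3.19 large-artery cutoff.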

    RoVRM: A Robust Visual Reward Model Optimized via Auxiliary Textual Preference Data

    Full text link
    Large vision-language models (LVLMs) often fail to align with human preferences, leading to issues like generating misleading content without proper visual context (also known as hallucination). A promising solution to this problem is using human-preference alignment techniques, such as best-of-n sampling and reinforcement learning. However, these techniques face a difficulty arising from the scarcity of visual preference data, which is required to train a visual reward model (VRM). In this work, we continue this line of research. We present a Robust Visual Reward Model (RoVRM) that improves human-preference alignment for LVLMs. RoVRM leverages auxiliary textual preference data through three-phase progressive training and optimal transport-based preference data selection to effectively mitigate the scarcity of visual preference data. We experiment with RoVRM on commonly used vision-language tasks based on the LLaVA-1.5-7B and -13B models. Experimental results demonstrate that RoVRM consistently outperforms traditional VRMs. Furthermore, our three-phase progressive training and preference data selection approaches can yield consistent performance gains over ranking-based alignment techniques, such as direct preference optimization.
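
Optimal transport-based data selection can be sketched with entropy-regularized Sinkhorn iterations: compute a transport plan between the textual examples and the visual data distribution, then keep the textual examples with the lowest expected transport cost (i.e., those closest in feature space). The feature representation, uniform marginals, and scoring rule here are illustrative assumptions, not RoVRM's exact selection procedure:

```python
import numpy as np

def sinkhorn_plan(cost, reg=1.0, iters=200):
    """Entropy-regularized optimal transport between two uniform
    distributions via Sinkhorn scaling; returns the transport plan."""
    n, m = cost.shape
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)
    K = np.exp(-cost / reg)
    u, v = np.ones(n), np.ones(m)
    for _ in range(iters):
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]

def select_preference_data(text_feats, visual_feats, k):
    """Score each textual preference example by its expected transport
    cost to the visual distribution; keep the k cheapest (closest)."""
    cost = np.linalg.norm(
        text_feats[:, None, :] - visual_feats[None, :, :], axis=-1
    )
    plan = sinkhorn_plan(cost)
    scores = (plan * cost).sum(axis=1)
    return np.argsort(scores)[:k]
```

Textual examples that lie far from the visual distribution incur a high transported cost and are filtered out, which is the intuition behind using auxiliary text data without drifting from the visual domain.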

    Human tumor necrosis factor (TNF)-alpha-induced protein 8-like 2 suppresses hepatocellular carcinoma metastasis through inhibiting Rac1

    Full text link
    Background: Tumor invasion and metastasis are the leading causes of death in patients with hepatocellular carcinoma (HCC). Identifying molecules that can suppress tumor invasion and metastasis will therefore provide novel targets for HCC therapies. Tumor necrosis factor (TNF)-alpha-induced protein 8-like 2 (TIPE2) is a novel negative immune regulator and an inhibitor of oncogenic Ras in mice, but its function in humans is unclear. Our previous research showed that TIPE2 is downregulated in human primary HCC compared with paired adjacent non-tumor tissues.
    Results: In the present study, we provide evidence that TIPE2 effectively inhibits human hepatocellular carcinoma metastasis. Forced expression of TIPE2 in HCC-derived cell lines markedly inhibits tumor cell growth, migration and invasion in vitro and suppresses growth and metastasis of HCC in vivo. Clinical information from a cohort of 112 patients reveals that loss or reduced expression of TIPE2 in primary HCC tissues is significantly associated with tumor metastasis. Mechanistically, TIPE2 inhibits migration and invasion by targeting Rac1, thereby reducing F-actin polymerization and the expression of matrix metallopeptidase 9 (MMP9) and urokinase plasminogen activator (uPA).
    Conclusion: Our results indicate that human TIPE2 is an endogenous inhibitor of Rac1 in HCC, through which it attenuates invasion and metastasis. These data suggest that TIPE2 may be a new target for HCC therapy.