RevOrder: A Novel Method for Enhanced Arithmetic in Language Models
This paper presents RevOrder, a novel technique aimed at improving arithmetic
operations in large language models (LLMs) by reversing the output digits in
addition, subtraction, and n-digit by 1-digit (nD by 1D) multiplication tasks.
Our method significantly reduces the Count of Sequential Intermediate Digits
(CSID), a new metric we introduce to assess equation
complexity. Through comprehensive testing, RevOrder not only achieves perfect
accuracy in basic arithmetic operations but also substantially boosts LLM
performance in division tasks, particularly with large numbers where
traditional models struggle. Implementation of RevOrder is cost-effective for
both training and inference phases. Moreover, applying RevOrder to fine-tune
the LLaMA2-7B model on the GSM8K math task results in a considerable
improvement, reducing equation calculation errors by 46% and increasing overall
scores from 41.6 to 44.4.
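The intuition behind digit reversal can be made concrete. In standard addition the most significant output digit depends on carries that propagate from the least significant end, so emitting digits left-to-right forces a model to resolve the whole carry chain before writing its first token. Below is a minimal illustrative sketch (not the paper's implementation; the function name is hypothetical) of how emitting digits least-significant-first lets each digit be produced from only the current column and carry:

```python
def rev_order_add(a: int, b: int) -> str:
    """Add two non-negative integers, emitting digits least-significant
    first. Each emitted digit depends only on one column and the running
    carry, mirroring why reversed output keeps CSID low."""
    da, db = str(a)[::-1], str(b)[::-1]
    digits = []
    carry = 0
    for i in range(max(len(da), len(db))):
        x = int(da[i]) if i < len(da) else 0
        y = int(db[i]) if i < len(db) else 0
        carry, d = divmod(x + y + carry, 10)
        digits.append(str(d))
    if carry:
        digits.append(str(carry))
    return "".join(digits)  # digit string in reversed order

# 58 + 67 = 125, emitted reversed as "521"
print(rev_order_add(58, 67))
```

Reversing the returned string recovers the conventional answer; in the LLM setting this un-reversal is a trivial post-processing step.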
DeepE: a deep neural network for knowledge graph embedding
Recently, neural network-based methods have shown their power in learning
more expressive features on the task of knowledge graph embedding (KGE).
However, the performance of deep methods often falls behind the shallow ones on
simple graphs. One possible reason is that deep models are difficult to train,
while shallow models might suffice for accurately representing the structure of
the simple KGs.
In this paper, we propose a neural network-based model, named DeepE, to
address the problem, which stacks multiple building blocks to predict the tail
entity based on the head entity and the relation. Each building block is an
addition of a linear and a non-linear function. The stacked building blocks are
equivalent to a group of learning functions with different non-linear depth.
Hence, DeepE allows deep functions to learn deep features, and shallow
functions to learn shallow features. Through extensive experiments, we find
DeepE outperforms other state-of-the-art baseline methods. A major advantage of
DeepE is its robustness: it achieves a Mean Rank (MR) score 6%, 30%, and
65% lower than the best baseline methods on FB15k-237, WN18RR, and YAGO3-10,
respectively. Our
design makes it possible to train much deeper networks on KGE, e.g. 40 layers
on FB15k-237, without sacrificing precision on simple relations.
Comment: 10 pages, 5 figures, 7 tables
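The "linear plus non-linear" building block described above can be sketched in a few lines. This is an illustrative NumPy toy (weight shapes, initialization, and the ReLU choice are assumptions, not the paper's exact architecture): each block adds a linear map and a non-linear map of its input, so a stack of such blocks mixes paths of different non-linear depth.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

class Block:
    """One DeepE-style building block: output = linear(x) + nonlinear(x).
    Because the linear path passes features through unchanged in form,
    a stack of blocks contains sub-functions of every non-linear depth."""
    def __init__(self, dim: int, rng: np.random.Generator):
        self.W_lin = rng.standard_normal((dim, dim)) * 0.1
        self.W_non = rng.standard_normal((dim, dim)) * 0.1

    def __call__(self, x: np.ndarray) -> np.ndarray:
        return x @ self.W_lin + relu(x @ self.W_non)

def deepe_forward(x: np.ndarray, blocks: list) -> np.ndarray:
    # Stacking blocks: shallow features travel mostly through the
    # linear paths, deep features through repeated non-linear paths.
    for blk in blocks:
        x = blk(x)
    return x

rng = np.random.default_rng(0)
blocks = [Block(8, rng) for _ in range(4)]
emb = deepe_forward(rng.standard_normal((1, 8)), blocks)
```

In the actual model the final block output would be scored against candidate tail-entity embeddings; that scoring step is omitted here for brevity.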
Rabies virus pseudotyped with CVS-N2C glycoprotein as a powerful tool for retrograde neuronal network tracing
Abstract
Background: Efficient viral vectors for mapping and manipulating long projection neuronal circuits are crucial in brain structural and functional studies. The glycoprotein gene-deleted SAD strain rabies virus pseudotyped with the N2C glycoprotein (SAD-RV(ΔG)-N2C(G)) shows high neuro-tropism in cell culture, but its in vivo retrograde infection efficiency and neuro-tropism have not been systematically characterized.
Methods: SAD-RV(ΔG)-N2C(G) and two other widely used retrograde tracers, SAD-RV(ΔG)-B19(G) and rAAV2-retro, were injected into the VTA or DG of C57BL/6 mice. Labeled neurons across all brain regions were counted and analyzed to compare the retrograde infection efficiencies and tropisms of these viral tools. The labeled neuronal types were analyzed using fluorescence immunohistochemistry or GAD67-GFP mice.
Results: We found that SAD-RV(ΔG)-N2C(G) enhanced the infection efficiency of long-projecting neurons by ~10 times while showing very similar neuro-tropism, compared with SAD-RV(ΔG)-B19(G). On the other hand, SAD-RV(ΔG)-N2C(G) showed infection efficiency comparable to rAAV2-retro, but had a more restricted diffusion range and broader tropism across different types and regions of long-projecting neuronal populations.
Conclusions: These results demonstrate that SAD-RV(ΔG)-N2C(G) can serve as an effective retrograde vector for studying neuronal circuits.
Key words: Viral vector, N2C glycoprotein, Neuronal circuits, Retrograde tracing