2,390 research outputs found

YaRN: Efficient Context Window Extension of Large Language Models

Author: Fan Honglu
Peng Bowen
Quesnelle Jeffrey
Shippole Enrico
Publication date: 31/08/2023

Rotary Position Embeddings (RoPE) have been shown to effectively encode positional information in transformer-based language models. However, these models fail to generalize past the sequence length they were trained on. We present YaRN (Yet another RoPE extensioN method), a compute-efficient method to extend the context window of such models, requiring 10x fewer tokens and 2.5x fewer training steps than previous methods. Using YaRN, we show that LLaMA models can effectively utilize and extrapolate to context lengths much longer than their original pre-training would allow, while also surpassing the previous state-of-the-art at context window extension. In addition, we demonstrate that YaRN exhibits the capability to extrapolate beyond the limited context of a fine-tuning dataset. We publish the checkpoints of Llama 2 7B/13B fine-tuned using YaRN with 64k and 128k context windows at https://github.com/jquesnelle/yar

arXiv.org e-Print Archive

Large Model Based Referring Camouflaged Object Detection

Author: Cheng Shupeng
Fan Deng-Ping
Ji Ge-Peng
Qin Pengda
Xu Peng
Zhou Bowen
Publication date: 28/11/2023

Referring camouflaged object detection (Ref-COD) is a recently proposed problem aiming to segment out specified camouflaged objects matched with a textual or visual reference. This task involves two major challenges: the COD domain-specific perception and multimodal reference-image alignment. Our motivation is to make full use of the semantic intelligence and intrinsic knowledge of recent Multimodal Large Language Models (MLLMs) to decompose this complex task in a human-like way. As language is highly condensed and inductive, linguistic expression is the main medium of human knowledge learning, and the transmission of knowledge information follows a multi-level progression from simplicity to complexity. In this paper, we propose a large-model-based Multi-Level Knowledge-Guided multimodal method for Ref-COD termed MLKG, where multi-level knowledge descriptions from MLLM are organized to guide the large vision model of segmentation to perceive the camouflage-targets and camouflage-scene progressively and meanwhile deeply align the textual references with camouflaged photos. To our knowledge, our contributions mainly include: (1) This is the first time that the MLLM knowledge is studied for Ref-COD and COD. (2) We, for the first time, propose decomposing Ref-COD into two main perspectives of perceiving the target and scene by integrating MLLM knowledge, and contribute a multi-level knowledge-guided method. (3) Our method achieves the state-of-the-art on the Ref-COD benchmark, outperforming numerous strong competitors. Moreover, thanks to the injected rich knowledge, it demonstrates zero-shot generalization ability on uni-modal COD datasets. We will release our code soon.

arXiv.org e-Print Archive

Squeezing and entanglement delay using slow light

Author: Amy Peng
C. Cohen-Tannoudji
H.-A. Bachor
J. J. Hope
M. O. Scully
Mattias Johnsson
P. K. Lam
W. P. Bowen
Publication venue: American Physical Society (APS)
Publication date: 08/12/2004

We examine the interaction of a weak probe with N atoms in a lambda-level configuration under the conditions of electromagnetically induced transparency (EIT). In contrast to previous works on EIT, we calculate the output state of the resultant slowly propagating light field while taking into account the effects of ground state dephasing and atomic noise for a more realistic model. In particular, we propose two experiments using slow light with a nonclassical probe field and show that two properties of the probe, entanglement and squeezing, characterizing the quantum state of the probe field, can be well-preserved throughout the passage.

Comment: 2 figures; v2: fixed some minor typographical errors in a couple of equations and corrected author spelling in one reference. v3: Added three authors; changed the entanglement definition to conform to a more accepted standard (Duan's entanglement measure); altered the abstract slightly. v4: fixed formatting of figure

arXiv.org e-Print Archive

Crossref

The Australian National University

University of Queensland eSpace

Experimental generation of 6 dB continuous variable entanglement from a nondegenerate optical parametric amplifier

Author: Bencheikh
Bowen
Changde Xie
Cochrane
Drever
Furusawa
Heng Shen
Jia
Jing
Kunchi Peng
Laurat
Li
Menicucci
Ou
Su
Takeno
Tan
Vahlbruch
Villar
Xiaoli Jin
Xiaolong Su
Yonezawa
Yoshikawa
Yu Wang
Yukawa
Zhang
Publication venue: The Optical Society
Publication date: 11/03/2010

We experimentally demonstrated that the quantum correlations of amplitude and phase quadratures between signal and idler beams produced from a non-degenerate optical parametric amplifier (NOPA) can be significantly improved by using a mode cleaner in the pump field and reducing the phase fluctuations in phase locking systems. Based on the two technical improvements, the quantum entanglement measured with a two-mode homodyne detector is enhanced from ~ 4 dB to ~ 6 dB below the quantum noise limit using the same NOPA and nonlinear crystal.

Comment: 7 pages, 5 figures

arXiv.org e-Print Archive

Crossref

ReWOO: Decoupling Reasoning from Observations for Efficient Augmented Language Models

Author: Lei Bowen
Liu Yuchen
Mukherjee Subhabrata
Peng Zhiyuan
Xu Binfeng
Xu Dongkuan
Publication date: 22/05/2023

Augmented Language Models (ALMs) blend the reasoning capabilities of Large Language Models (LLMs) with tools that allow for knowledge retrieval and action execution. Existing ALM systems trigger LLM thought processes while pulling observations from these tools in an interleaved fashion. Specifically, an LLM reasons to call an external tool, gets halted to fetch the tool's response, and then decides the next action based on all preceding response tokens. Such a paradigm, though straightforward and easy to implement, often leads to huge computation complexity from redundant prompts and repeated execution. This study addresses such challenges for the first time, proposing a modular paradigm ReWOO (Reasoning WithOut Observation) that detaches the reasoning process from external observations, thus significantly reducing token consumption. Comprehensive evaluations across six public NLP benchmarks and a curated dataset reveal consistent performance enhancements with our proposed methodology. Notably, ReWOO achieves 5x token efficiency and 4% accuracy improvement on HotpotQA, a multi-step reasoning benchmark. Furthermore, ReWOO demonstrates robustness under tool-failure scenarios. Beyond prompt efficiency, decoupling parametric modules from non-parametric tool calls enables instruction fine-tuning to offload LLMs into smaller language models, thus substantially reducing model parameters. Our illustrative work offloads reasoning ability from 175B GPT3.5 into 7B LLaMA, demonstrating the significant potential for truly efficient and scalable ALM systems

arXiv.org e-Print Archive

Nitrogen substrate–dependent nitrous oxide cycling in salt marsh sediments

Author: Babbin Andrew R.
Bowen Jennifer L.
Ji Qixing
Peng Xuefeng
Ward Bess B.
Publication venue: EliScholar – A Digital Platform for Scholarly Publishing at Yale
Publication date: 01/01/2015

Nitrous oxide (N2O) is important to Earth's climate because it is a strong absorber of radiation and an important ozone depletion agent. Increasing anthropogenic nitrogen input into the marine environment, especially to coastal waters, has led to increasing N2O emissions. Identifying the nitrogen compounds that serve as substrates for N2O production in coastal waters reveals important pathways and helps us understand their control by environmental factors. In this study, sediments were collected from a long-term fertilization site in Great Sippewissett Marsh, Falmouth, Massachusetts. The 15N tracer incubation time course experiments were conducted and analyzed for potential N2O production and consumption rates. The two nitrogen substrates of N2O production, ammonium and nitrate, correspond to the two production pathways, nitrification and denitrification, respectively. When measurable nitrate was present, despite ambient high ammonium concentrations, denitrification was the major N2O production pathway. When nitrate was absent, ammonium became the dominant substrate for N2O production, via nitrification and coupled nitrification-denitrification. Net N2O consumption was enhanced under low oxygen and nitrate conditions. N2O production and consumption rates increased with increasing levels of nitrogen fertilization in long-term experimental plots. These results indicate that increasing anthropogenic nitrogen input to salt marshes can stimulate sedimentary N2O production via both nitrification and denitrification, whereas episodic oxygen depletion results in net N2O consumption.

Yale University