5 research outputs found

    Revisiting Parallel Context Windows: A Frustratingly Simple Alternative and Chain-of-Thought Deterioration

    We identify two crucial limitations in the evaluation of the recent parallel-integrated method Parallel Context Windows (PCW), which extends the maximum context length of language models (e.g., 2048 for LLaMA) by harnessing window-wise attention and positional embedding techniques. We first show that a simple yet strong baseline, the weighted sum ensemble, is missing from the in-context few-shot classification evaluation. Moreover, on more challenging Chain-of-Thought (CoT) reasoning (e.g., HotpotQA), PCW exhibits unexpected deterioration in the form of question miscomprehension and false inference. Based on our findings, we suggest that the existing PCW design may not guarantee sufficient improvement and practicality in handling lengthy documents in real-world applications. The community should devote more effort to enabling long-context understanding in language models.

    AgentBench: Evaluating LLMs as Agents

    Large Language Models (LLMs) are becoming increasingly smart and autonomous, targeting real-world pragmatic missions beyond traditional NLP tasks. As a result, there has been an urgent need to evaluate LLMs as agents on challenging tasks in interactive environments. We present AgentBench, a multi-dimensional evolving benchmark that currently consists of 8 distinct environments to assess LLM-as-Agent's reasoning and decision-making abilities in a multi-turn open-ended generation setting. Our extensive test over 27 API-based and open-sourced (OSS) LLMs shows that, while top commercial LLMs present a strong ability to act as agents in complex environments, there is a significant disparity in performance between them and OSS competitors. We identify the typical reasons for failure across environments and LLMs, showing that poor long-term reasoning, decision-making, and instruction-following abilities are the main obstacles to developing usable LLM agents. Training on code and high-quality multi-turn alignment data could improve agent performance. Datasets, environments, and an integrated evaluation package for AgentBench are released at https://github.com/THUDM/AgentBench. Comment: 55 page

    Covalent Patterning and Rapid Visualization of Latent Fingerprints with Photo-Cross-Linkable Semiconductor Polymer Dots

    Fingerprint imaging and recognition represent the most important approach in personal identification. Here we designed and synthesized oxetane-functionalized semiconductor polymer dots (Ox-Pdots) for covalent patterning and rapid visualization of latent fingerprints. The high fluorescence brightness, large Stokes shift, and excellent surface properties of the Ox-Pdots lead to fingerprint imaging with high sensitivity and resolution. Fingerprint ridge structures with the first, second, and third levels of detail were clearly developed within minutes. The method was facile and robust for visualization of fingerprints on various surfaces including glass, metal, and plastics. Moreover, the oxetane groups in the Ox-Pdots undergo cross-linking reactions induced by brief UV irradiation, yielding a 3-D intermolecular polymer network. The resulting fingerprint patterns exhibit unparalleled stability against rigorous treatment, as compared to those produced by traditional Pdots. Our results demonstrate that the Ox-Pdots hold great promise for latent fingerprint imaging and fluorescence anticounterfeiting applications.

    Enhanced Phototherapy by Nanoparticle-Enzyme via Generation and Photolysis of Hydrogen Peroxide

    Light has been widely used for cancer therapeutics such as photodynamic therapy (PDT) and photothermal therapy. This paper describes a strategy called enzyme-enhanced phototherapy (EEPT) for cancer treatment. We constructed a nanoparticle platform by covalent conjugation of glucose oxidase (GOx) to small polymer dots, which could be persistently immobilized in a tumor. Because malignant tumors have high glucose uptake, the GOx efficiently catalyzes glucose oxidation with simultaneous generation of H₂O₂. Under light irradiation, the in situ generated H₂O₂ was photolyzed to produce hydroxyl radical, the most reactive oxygen species, for killing cancer cells. In vitro assays indicated that the cancer cells were destroyed at a nanoparticle concentration of 0.2 μg/mL and a light dose of ∼120 J/cm², indicating the significantly enhanced efficiency of the EEPT method compared to typical PDT, which requires a photosensitizer concentration of >10 μg/mL for effective cell killing under the same light dose. Furthermore, remarkable inhibition of tumor growth was observed in xenograft-bearing mice, indicating the promise of the EEPT approach for cancer therapeutics.