39 research outputs found

    Revisiting Parallel Context Windows: A Frustratingly Simple Alternative and Chain-of-Thought Deterioration

    We identify two crucial limitations in the evaluation of the recent parallel-integrated method Parallel Context Windows (PCW), which extends the maximum context lengths of language models, e.g., 2048 for LLaMA, by harnessing window-wise attention and positional embedding techniques. We first show that a simple yet strong baseline, weighted sum ensemble, is missing for in-context few-shot classification. Moreover, on more challenging Chain-of-Thought (CoT) reasoning (e.g., HotpotQA), PCW shows unexpected deterioration in the form of question miscomprehension and false inference. Based on our findings, we suggest that the existing PCW design may not guarantee sufficient improvement and practicality in handling lengthy documents in real-world applications. More community effort should be devoted to enabling language models' long-context understanding.

    AgentBench: Evaluating LLMs as Agents

    Large Language Models (LLMs) are becoming increasingly smart and autonomous, targeting real-world pragmatic missions beyond traditional NLP tasks. As a result, there has been an urgent need to evaluate LLMs as agents on challenging tasks in interactive environments. We present AgentBench, a multi-dimensional evolving benchmark that currently consists of 8 distinct environments to assess LLM-as-Agent's reasoning and decision-making abilities in a multi-turn open-ended generation setting. Our extensive test over 27 API-based and open-sourced (OSS) LLMs shows that, while top commercial LLMs present a strong ability of acting as agents in complex environments, there is a significant disparity in performance between them and OSS competitors. We identify the typical reasons for failures in environments and LLMs, showing that poor long-term reasoning, decision-making, and instruction-following abilities are the main obstacles to developing usable LLM agents. Training on code and high-quality multi-turn alignment data could improve agent performance. Datasets, environments, and an integrated evaluation package for AgentBench are released at https://github.com/THUDM/AgentBench.

    Finishing the euchromatic sequence of the human genome

    The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.

    Fast Cut Back Thermal Power Plant Load Rejection and Black Start Field Test Analysis

    Fast and reliable black start plays a key role in improving the ability of the power system to resist the risk of large-scale blackouts. In a black start with high-voltage, long-distance transmission lines, phenomena such as self-excitation and power frequency/operating overvoltage are much more likely to occur, which may lead to black start failure and impact the reliability of the system’s restoration. Meanwhile, the long time needed to crank up the non-black start units will slow the restoration. This paper addresses the advantages of using a thermal power unit with a fast cut back (FCB) function as a black start unit, and studies the transient process of the FCB unit during the restoration. Firstly, key problems in the power system black start process are analyzed and a practical engineering criterion for self-excitation is proposed. Secondly, the dynamic model of the FCB unit is presented. Thirdly, the field test of the FCB unit load rejection and black start is introduced, which is the first successful field test of black start with 500 kV long-distance lines in China Southern Power Grid (CSG). Finally, the transient process of this test is simulated using the PSCAD/EMTDC software, and the simulation results agree well with the field test results, which verifies the correctness of the FCB model and the proposed self-excitation engineering criterion.

    Two-population Asymmetric Evolutionary Game Dynamics-based Decision-making Behavior Analysis for A Supply-side Electric Power Bidding Market

    This paper systematically discusses two-population asymmetric evolutionary games (2PAEGs) from the perspective of decision-making behavior characteristics, and applies these game models to a two-population supply-side electric power bidding market. First, a 2PAEG model is established. Then, the complete evolutionary equilibrium rules of this model during decision-making processes are revealed. The discussion shows that the final evolutionary game equilibria achieved in the 2PAEG model are determined only by certain payoff parameters, which are defined as relative net payoff (RNP) parameters in this paper. Finally, a case study of supply-side bidding simulation for two generator populations is conducted, which verifies the universality and effectiveness of the evolutionary dynamics results obtained in the established general 2PAEG model. Moreover, it shows that reasonable policies made by the government can guide more appropriate power bidding for on-grid electricity.

    Microstructure and mechanical property of high power laser powder bed fusion AlSi10Mg alloy before and after T6 heat treatment

    This paper focuses on the microstructure and mechanical properties of the high power laser powder bed fusion AlSi10Mg alloy before and after T6 heat treatment. The results demonstrate that the as-printed sample presents a columnar grain structure along the build direction and a strong texture. Inside the columnar α-Al grains, there are cellular dendrites decorated with network eutectic Si. Both the α-Al matrix and eutectic Si have high-density dislocations. In addition, nano-twins and stacking faults are observed in eutectic Si. After T6 treatment, although the α-Al matrix still exhibits a columnar solidification feature, the cellular dendrites disappear and the proportion of equiaxed grains increases. The eutectic Si appears as separate plates or nanoscale particles, in which nano-twins and stacking faults are not found. The tensile property anisotropy decreases and the strength-ductility balance improves after T6 treatment. The evolution mechanisms of the microstructure and tensile properties are revealed.

    Polarized Intensity Ratio Constraint Demosaicing for the Division of a Focal-Plane Polarimetric Image

    Polarization is an independent dimension of light wave information that has broad application prospects in machine vision and remote sensing tasks. Polarization imaging using a division-of-focal-plane (DoFP) polarimetric sensor can meet lightweight and real-time application requirements. Similar to Bayer filter-based color imaging, demosaicing is a basic and important processing step in DoFP polarization imaging. Due to the differences between the physical properties of the polarization and the color of light waves, the widely studied color demosaicing methods cannot be directly applied to polarization demosaicing. In this work, we propose a polarized intensity ratio constraint demosaicing model to efficiently account for the characteristics of polarization detection. First, we discuss the special constraint relationship between the polarization channels. It can be simply described as: for a beam of light, the sum of the intensities detected by any two mutually orthogonal ideal analyzers should be equal to the total light intensity. Then, based on this constraint relationship and drawing on the concept of guided filtering, a new polarization demosaicing method is developed. A method that directly uses raw images captured by the DoFP detector as the ground truth for comparison experiments is then constructed to ease the collection of experimental data across a wide range of image scenarios. Results of both qualitative and quantitative experiments illustrate that our method is an effective and practical way to faithfully recover the full polarization information of each pixel from a single mosaic input image.
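
    The channel constraint stated in the abstract can be checked numerically. The following is a minimal sketch, not the paper's method: it assumes the standard ideal-analyzer model for partially linearly polarized light, and the function name `analyzer_intensity` and the sample values are illustrative assumptions.

```python
import math

def analyzer_intensity(s0, dolp, aolp, theta):
    """Intensity behind an ideal linear analyzer at angle theta (radians).

    Assumed model (not taken from the paper): light with total intensity s0,
    degree of linear polarization dolp, and polarization angle aolp.
    """
    return 0.5 * s0 * (1.0 + dolp * math.cos(2.0 * (theta - aolp)))

# Hypothetical sample beam: total intensity 2.0, 60% polarized at 25 degrees.
s0, dolp, aolp = 2.0, 0.6, math.radians(25.0)

# The four analyzer orientations typically found on a DoFP sensor.
i0, i45, i90, i135 = (
    analyzer_intensity(s0, dolp, aolp, math.radians(a)) for a in (0, 45, 90, 135)
)

# Each orthogonal pair sums to the total intensity.
pair_a = i0 + i90
pair_b = i45 + i135
```

    Under this model both `pair_a` and `pair_b` equal `s0` up to floating-point error, because the cosine terms of two analyzers 90 degrees apart cancel.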

    An Exploratory Investigation on Modelling Technologies to Flexible Loads Dispatching in A Smart Grid Environment

    As the proportion of flexible load resources in smart grids continues to rise, grid structures become increasingly complex, grid characteristics change significantly, and the risks to grid operation and control increase. It will therefore be difficult to regulate the grid intelligently by relying solely on traditional resource regulation methods, and the dispatchable space for traditional resources will continue to shrink. To this end, this paper conducts an exploratory investigation of the modeling techniques for flexible load participation in smart grid dispatching. First, the flexible loads involved in grid regulation are classified. Secondly, according to this classification, the modeling techniques for the different classes of flexible loads are reviewed and studied; then, the flexible load dispatching modes for different operating states, different control tasks, and different control methods are discussed in depth. Moreover, the technological economics and feasibility of these different flexible load dispatching modes are compared. Finally, an outlook and conclusions are presented.