    Evaluation Metrics in the Era of GPT-4: Reliably Evaluating Large Language Models on Sequence to Sequence Tasks

    The evaluation of Large Language Models (LLMs) is a patchy and inconsistent landscape, and it is becoming clear that the quality of automatic evaluation metrics is not keeping up with the pace of development of generative models. We aim to improve the understanding of current models' performance by providing a preliminary, hybrid evaluation of a range of open and closed-source generative LLMs on three NLP benchmarks: text summarisation, text simplification and grammatical error correction (GEC), using both automatic and human evaluation. We also explore the potential of the recently released GPT-4 to act as an evaluator. We find that ChatGPT consistently outperforms many other popular models according to human reviewers on the majority of metrics, while scoring much more poorly when using classic automatic evaluation metrics. We also find that human reviewers rate the gold reference as much worse than the best models' outputs, indicating the poor quality of many popular benchmarks. Finally, we find that GPT-4 is capable of ranking models' outputs in a way which aligns reasonably closely to human judgement despite task-specific variations, with a lower alignment in the GEC task. Comment: Accepted at EMNLP 202
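One common way to quantify how closely an automatic evaluator's ranking aligns with human judgement is a rank correlation. The sketch below computes Spearman's rho in plain Python. It is only an illustration: the abstract does not say which alignment measure the authors used, and the scores in the usage lines are hypothetical, not results from the paper.

```python
def ranks(scores):
    """Return 1-based ranks (highest score gets rank 1); ties share the average rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Extend j over the run of items tied with the item at position i.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(a, b):
    """Spearman's rho: the Pearson correlation of the two rank vectors.

    Assumes both inputs have the same length and neither is constant
    (a constant input would give a zero denominator).
    """
    ra, rb = ranks(a), ranks(b)
    n = len(ra)
    mean_a, mean_b = sum(ra) / n, sum(rb) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    var_a = sum((x - mean_a) ** 2 for x in ra)
    var_b = sum((y - mean_b) ** 2 for y in rb)
    return cov / (var_a * var_b) ** 0.5


# Hypothetical per-model scores (NOT from the paper), one entry per model.
human_scores = [4.5, 3.0, 3.8, 2.1]
gpt4_scores = [4.0, 3.5, 3.2, 2.0]
rho = spearman(human_scores, gpt4_scores)
```

A value of rho near 1 means the two evaluators order the models almost identically; a value near 0 means the orderings are unrelated.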

    Quantum simulation of exotic PT-invariant topological nodal loop bands with ultracold atoms in an optical lattice

    Since PT symmetry, where PT denotes the combined operation of space-inversion P and time-reversal T, has fundamental significance and implications in physics, it is important and intriguing to completely classify exotic PT-invariant topological metals and to physically realize them. Here we, for the first time, establish a rigorous classification of topological metals that are protected by the PT symmetry using KO-theory. As a physically realistic example, a PT-invariant nodal loop (NL) model in a 3D Brillouin zone is constructed, whose topological stability is revealed through its PT-symmetry-protected nontrivial Z2 topological charge. Based on these exact results, we propose an experimental scheme to realize and to detect tunable PT-invariant topological NL states with ultracold atoms in an optical lattice, in which atoms with two hyperfine spin states are loaded in a spin-dependent 3D optical lattice and two pairs of Raman lasers are used to create out-of-plane spin-flip hopping with site-dependent phase. Such a realistic cold-atom setup can yield topological NL states, having a tunable ring-shaped band-touching line with two-fold degeneracy in the bulk spectrum and non-trivial surface states. The states are protected by the combined PT symmetry even in the absence of both P and T symmetries, and are characterized by a Z2-type invariant (a quantized Berry phase). Remarkably, we demonstrate with numerical simulations that (i) the characteristic NL can be detected by measuring the atomic transfer fractions in a Bloch-Zener oscillation; (ii) the topological invariant may be measured based on time-of-flight imaging; and (iii) the surface states may be probed through Bragg spectroscopy. The present proposal for realizing topological NL states in cold atom systems may provide a unique experimental platform for exploring exotic PT-invariant topological physics. Comment: 11 pages, 6 figures; accepted for publication in Phys. Rev.

    Intergenic transcription by RNA Polymerase II coordinates Pol IV and Pol V in siRNA-directed transcriptional gene silencing in Arabidopsis

    Intergenic transcription by RNA Polymerase II (Pol II) is widespread in plant and animal genomes, but the functions of intergenic transcription or the resulting noncoding transcripts are poorly understood. Here, we show that Arabidopsis Pol II is indispensable for endogenous siRNA-mediated transcriptional gene silencing (TGS) at intergenic low-copy-number loci, despite the presence of two other polymerases (Pol IV and Pol V) that specialize in TGS through siRNAs. We show that Pol II produces noncoding scaffold transcripts that originate outside of heterochromatic, siRNA-generating loci. Through these transcripts and physical interactions with the siRNA effector protein ARGONAUTE4 (AGO4), Pol II recruits AGO4/siRNAs to homologous loci, resulting in TGS. Meanwhile, Pol II transcription also recruits Pol IV and Pol V to different locations at heterochromatic loci to promote siRNA biogenesis and siRNA-mediated TGS, respectively. This study establishes that intergenic transcription by Pol II is required for siRNA-mediated TGS, and reveals an intricate collaboration and division of labor among the three polymerases in gene silencing.

    Multivariate Time Series Forecasting with Dynamic Graph Neural ODEs

    Multivariate time series forecasting has long received significant attention in real-world applications, such as energy consumption and traffic prediction. While recent methods demonstrate good forecasting abilities, they have three fundamental limitations. (i) Discrete neural architectures: Interlacing individually parameterized spatial and temporal blocks to encode rich underlying patterns leads to discontinuous latent state trajectories and higher numerical forecasting errors. (ii) High complexity: Discrete approaches complicate models with dedicated designs and redundant parameters, leading to higher computational and memory overheads. (iii) Reliance on graph priors: Relying on predefined static graph structures limits their effectiveness and practicability in real-world applications. In this paper, we address all the above limitations by proposing a continuous model to forecast Multivariate Time series with dynamic Graph neural Ordinary Differential Equations (MTGODE). Specifically, we first abstract multivariate time series into dynamic graphs with time-evolving node features and unknown graph structures. Then, we design and solve a neural ODE to complement missing graph topologies and unify both spatial and temporal message passing, allowing deeper graph propagation and fine-grained temporal information aggregation to characterize stable and precise latent spatial-temporal dynamics. Our experiments demonstrate the superiority of MTGODE from various perspectives on five time series benchmark datasets. Comment: 14 pages, 6 figures, 5 tables
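The idea of continuous graph propagation can be illustrated with a toy ODE, dH/dt = (Â − I)H, integrated with fixed-step Euler updates. This is a generic sketch and not the MTGODE implementation: the fixed adjacency matrix, the row normalisation, the Euler solver and all function names are assumptions made for illustration, whereas the abstract describes graph structures that are unknown and learned.

```python
def row_normalise(adj):
    """Add self-loops to the adjacency matrix and scale each row to sum to 1."""
    n = len(adj)
    with_loops = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
                  for i in range(n)]
    return [[value / sum(row) for value in row] for row in with_loops]


def matmul(a, h):
    """Multiply an (n x n) matrix by an (n x d) feature matrix."""
    return [[sum(a[i][k] * h[k][j] for k in range(len(h)))
             for j in range(len(h[0]))]
            for i in range(len(a))]


def propagate(adj, h0, t_end=1.0, steps=20):
    """Euler-integrate dH/dt = (A_hat - I) H from t = 0 to t = t_end.

    Each step moves every node's features a little towards the average of
    its neighbourhood, so the integration time t_end plays the role that
    the number of layers plays in a discrete graph network.
    """
    a_hat = row_normalise(adj)
    dt = t_end / steps
    h = [row[:] for row in h0]
    for _ in range(steps):
        ah = matmul(a_hat, h)
        h = [[h[i][j] + dt * (ah[i][j] - h[i][j]) for j in range(len(h[0]))]
             for i in range(len(h))]
    return h


# Toy 3-node path graph (0 - 1 - 2) with one feature per node.
path_graph = [[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]]
smoothed = propagate(path_graph, [[1.0], [0.0], [-1.0]])
```

In this toy setting the end nodes drift towards the middle node's value as t_end grows, while features that are already identical across nodes stay unchanged.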

    Hierarchical Integration Diffusion Model for Realistic Image Deblurring

    Diffusion models (DMs) have recently been introduced in image deblurring and have exhibited promising performance, particularly in terms of detail reconstruction. However, the diffusion model requires a large number of inference iterations to recover the clean image from pure Gaussian noise, which consumes massive computational resources. Moreover, the distribution synthesized by the diffusion model is often misaligned with the target results, which limits performance on distortion-based metrics. To address the above issues, we propose the Hierarchical Integration Diffusion Model (HI-Diff) for realistic image deblurring. Specifically, we perform the DM in a highly compact latent space to generate the prior feature for the deblurring process. The deblurring process is implemented by a regression-based method to obtain better distortion accuracy. Meanwhile, the highly compact latent space ensures the efficiency of the DM. Furthermore, we design the hierarchical integration module to fuse the prior into the regression-based model from multiple scales, enabling better generalization in complex blurry scenarios. Comprehensive experiments on synthetic and real-world blur datasets demonstrate that our HI-Diff outperforms state-of-the-art methods. Code and trained models are available at https://github.com/zhengchen1999/HI-Diff.

    Gapless quantum spin liquid and global phase diagram of the spin-1/2 $J_1$-$J_2$ square antiferromagnetic Heisenberg model

    The nature of the zero-temperature phase diagram of the spin-$1/2$ $J_1$-$J_2$ Heisenberg model on a square lattice, which may hold the key to understanding high-temperature superconductivity, has been debated for the past three decades. Using a state-of-the-art tensor network method, specifically the finite projected entangled pair state (PEPS) algorithm, to simulate the global phase diagram of the $J_1$-$J_2$ Heisenberg model up to $24\times 24$ sites, we provide solid evidence that the intermediate nonmagnetic phase is a gapless quantum spin liquid (QSL), whose spin-spin and dimer-dimer correlations both decay as power laws. There also exists a valence-bond solid (VBS) phase in a very narrow region $0.56\lesssim J_2/J_1\leq 0.61$ before the system enters the well-known collinear antiferromagnetic phase. The physical nature of the discovered gapless QSL and potential experimental implications are also addressed. We stress that we make the first detailed comparison between the results of PEPS and the well-established density matrix renormalization group (DMRG) method through one-to-one direct benchmarks for small system sizes, and thus provide a very solid PEPS calculation beyond DMRG. Our numerical evidence explicitly demonstrates the power of PEPS for precisely capturing long-range physics in highly frustrated systems, and also demonstrates that the finite PEPS method is a very powerful approach for studying strongly correlated quantum many-body problems.

    The effect credit term structure of monetary policy on firms' "short-term debt for long-term investment" behavior: empirical evidence from China

    This paper examines the effects and mechanism paths of monetary policy on firms' "short-term debt for long-term investment" (SDFLI) behavior using panel data of Chinese A-share listed firms from 2007 to 2019. The findings indicate that loose monetary policy suppresses corporate SDFLI behavior by lengthening the corporate credit maturity structure, that is, through the credit term structure channel. In addition, heterogeneity analysis shows that loose monetary policy significantly inhibits the SDFLI behavior of state-owned enterprises (SOEs), non-high-tech firms, and firms in regions with high bank competition levels through the credit term structure channel, whereas this channel fails for non-state-owned enterprises (non-SOEs), high-tech firms, and firms in regions with low bank competition levels. The results of the heterogeneity analysis support the plausibility that monetary policy affects firms' SDFLI behavior through the credit term structure channel.