Evaluation Metrics in the Era of GPT-4: Reliably Evaluating Large Language Models on Sequence to Sequence Tasks
The evaluation of Large Language Models (LLMs) is a patchy and inconsistent
landscape, and it is becoming clear that the quality of automatic evaluation
metrics is not keeping up with the pace of development of generative models. We
aim to improve the understanding of current models' performance by providing a
preliminary and hybrid evaluation on a range of open and closed-source
generative LLMs on three NLP benchmarks: text summarisation, text
simplification and grammatical error correction (GEC), using both automatic and
human evaluation. We also explore the potential of the recently released GPT-4
to act as an evaluator. We find that ChatGPT consistently outperforms many
other popular models according to human reviewers on the majority of metrics,
while scoring much more poorly when using classic automatic evaluation metrics.
We also find that human reviewers rate the gold reference as much worse than
the best models' outputs, indicating the poor quality of many popular
benchmarks. Finally, we find that GPT-4 is capable of ranking models' outputs
in a way which aligns reasonably closely to human judgement despite
task-specific variations, with a lower alignment in the GEC task.
Comment: Accepted at EMNLP 202
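One common way to quantify how closely an automatic judge's scores track human scores is a rank correlation. The sketch below computes Spearman's rho in plain Python. It is only an illustration: the abstract does not say which agreement measure the authors used, and the names `human_scores` and `judge_scores` and their values are made-up placeholders, not results from the paper.

```python
from statistics import mean


def average_ranks(scores):
    """Rank scores from 1 (lowest) upward; tied scores share their average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of items tied with the item at position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank vectors.

    Assumes both inputs have the same length and neither is constant
    (a constant input would make the denominator zero).
    """
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical per-model scores (placeholders, not data from the paper).
human_scores = [4.5, 3.0, 2.0, 3.5]
judge_scores = [4.0, 3.5, 1.5, 3.0]
rho = spearman(human_scores, judge_scores)
```

A value of rho near 1 means the two rankings largely agree, and a value near -1 means they are reversed.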
Quantum simulation of exotic PT-invariant topological nodal loop bands with ultracold atoms in an optical lattice
Since the well-known PT symmetry has its fundamental significance and
implication in physics, where PT denotes the combined operation of
space-inversion P and time-reversal T, it is extremely important and intriguing
to completely classify exotic PT-invariant topological metals and to physically
realize them. Here we, for the first time, establish a rigorous classification
of topological metals that are protected by the PT symmetry using KO-theory. As
a physically realistic example, a PT-invariant nodal loop (NL) model in a 3D
Brillouin zone is constructed, whose topological stability is revealed through
its PT-symmetry-protected nontrivial Z2 topological charge. Based on these
exact results, we propose an experimental scheme to realize and to detect
tunable PT-invariant topological NL states with ultracold atoms in an optical
lattice, in which atoms with two hyperfine spin states are loaded in a
spin-dependent 3D OL and two pairs of Raman lasers are used to create
out-of-plane spin-flip hopping with site-dependent phase. Such a realistic
cold-atom setup can yield topological NL states, having a tunable ring-shaped
band-touching line with the two-fold degeneracy in the bulk spectrum and
non-trivial surface states. The states are actually protected by the combined
PT symmetry even in the absence of both P and T symmetries, and are
characterized by a Z2-type invariant (a quantized Berry phase). Remarkably, we
demonstrate with numerical simulations that (i) the characteristic NL can be
detected by measuring the atomic transfer fractions in a Bloch-Zener
oscillation; (ii) the topological invariant may be measured based on the
time-of-flight imaging; and (iii) the surface states may be probed through
Bragg spectroscopy. The present proposal for realizing topological NL states in
cold atom systems may provide a unique experimental platform for exploring
exotic PT-invariant topological physics.
Comment: 11 pages, 6 figures; accepted for publication in Phys. Rev.
Intergenic transcription by RNA Polymerase II coordinates Pol IV and Pol V in siRNA-directed transcriptional gene silencing in Arabidopsis
Intergenic transcription by RNA Polymerase II (Pol II) is widespread in plant and animal genomes, but the functions of intergenic transcription or the resulting noncoding transcripts are poorly understood. Here, we show that Arabidopsis Pol II is indispensable for endogenous siRNA-mediated transcriptional gene silencing (TGS) at intergenic low-copy-number loci, despite the presence of two other polymerases—Pol IV and Pol V—that specialize in TGS through siRNAs. We show that Pol II produces noncoding scaffold transcripts that originate outside of heterochromatic, siRNA-generating loci. Through these transcripts and physical interactions with the siRNA effector protein ARGONAUTE4 (AGO4), Pol II recruits AGO4/siRNAs to homologous loci to result in TGS. Meanwhile, Pol II transcription also recruits Pol IV and Pol V to different locations at heterochromatic loci to promote siRNA biogenesis and siRNA-mediated TGS, respectively. This study establishes that intergenic transcription by Pol II is required for siRNA-mediated TGS, and reveals an intricate collaboration and division of labor among the three polymerases in gene silencing
Multivariate Time Series Forecasting with Dynamic Graph Neural ODEs
Multivariate time series forecasting has long received significant attention
in real-world applications, such as energy consumption and traffic prediction.
While recent methods demonstrate good forecasting abilities, they have three
fundamental limitations. (i) Discrete neural architectures: Interlacing
individually parameterized spatial and temporal blocks to encode rich
underlying patterns leads to discontinuous latent state trajectories and higher
forecasting numerical errors. (ii) High complexity: Discrete approaches
complicate models with dedicated designs and redundant parameters, leading to
higher computational and memory overheads. (iii) Reliance on graph priors:
Relying on predefined static graph structures limits their effectiveness and
practicability in real-world applications. In this paper, we address all the
above limitations by proposing a continuous model to forecast
Multivariate Time series with dynamic Graph
neural Ordinary Differential Equations
(MTGODE). Specifically, we first abstract multivariate time series
into dynamic graphs with time-evolving node features and unknown graph
structures. Then, we design and solve a neural ODE to complement missing graph
topologies and unify both spatial and temporal message passing, allowing deeper
graph propagation and fine-grained temporal information aggregation to
characterize stable and precise latent spatial-temporal dynamics. Our
experiments demonstrate the superiority of MTGODE from various
perspectives on five time series benchmark datasets.
Comment: 14 pages, 6 figures, 5 tables
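The idea of continuous message passing on a graph can be illustrated with a toy graph ODE. The sketch below is not the authors' model: it assumes a fixed, hand-written adjacency matrix (the abstract describes learning the graph structure), uses the simple diffusion dynamics dH/dt = A_hat H - H, and integrates with a fixed-step Euler scheme. All function and variable names are hypothetical.

```python
def normalize_rows(adj):
    """Row-normalize an adjacency matrix so that each non-empty row sums to 1."""
    out = []
    for row in adj:
        total = sum(row)
        out.append([v / total for v in row] if total else list(row))
    return out


def derivative(a_hat, h):
    """dH/dt = A_hat @ H - H: each node drifts toward its neighbourhood average."""
    n, d = len(h), len(h[0])
    return [
        [sum(a_hat[i][j] * h[j][k] for j in range(n)) - h[i][k] for k in range(d)]
        for i in range(n)
    ]


def euler_integrate(a_hat, h0, t_end, steps):
    """Integrate the graph ODE from t = 0 to t_end with fixed-step explicit Euler."""
    dt = t_end / steps
    h = [row[:] for row in h0]
    for _ in range(steps):
        dh = derivative(a_hat, h)
        h = [
            [h[i][k] + dt * dh[i][k] for k in range(len(h[0]))]
            for i in range(len(h))
        ]
    return h


# Assumed toy graph: a 3-node path with self-loops, one feature per node.
adjacency = [
    [1.0, 1.0, 0.0],
    [1.0, 1.0, 1.0],
    [0.0, 1.0, 1.0],
]
h0 = [[1.0], [0.0], [-1.0]]

a_hat = normalize_rows(adjacency)
h_final = euler_integrate(a_hat, h0, t_end=2.0, steps=200)
```

In this toy setting the integration time `t_end` plays the role that the number of stacked layers plays in a discrete graph network: a longer integration propagates information further along the graph, without adding parameters.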
Hierarchical Integration Diffusion Model for Realistic Image Deblurring
Diffusion models (DMs) have recently been introduced in image deblurring and
exhibited promising performance, particularly in terms of detail
reconstruction. However, the diffusion model requires a large number of
inference iterations to recover the clean image from pure Gaussian noise, which
consumes massive computational resources. Moreover, the distribution
synthesized by the diffusion model is often misaligned with the target results,
leading to restrictions in distortion-based metrics. To address the above
issues, we propose the Hierarchical Integration Diffusion Model (HI-Diff), for
realistic image deblurring. Specifically, we perform the DM in a highly
compacted latent space to generate the prior feature for the deblurring
process. The deblurring process is implemented by a regression-based method to
obtain better distortion accuracy. Meanwhile, the highly compact latent space
ensures the efficiency of the DM. Furthermore, we design the hierarchical
integration module to fuse the prior into the regression-based model from
multiple scales, enabling better generalization in complex blurry scenarios.
Comprehensive experiments on synthetic and real-world blur datasets demonstrate
that our HI-Diff outperforms state-of-the-art methods. Code and trained models
are available at https://github.com/zhengchen1999/HI-Diff.
Gapless quantum spin liquid and global phase diagram of the spin-1/2 J1-J2 square antiferromagnetic Heisenberg model
The nature of the zero-temperature phase diagram of the spin-1/2
J1-J2 Heisenberg model on a square lattice has been debated for the past
three decades, and it may hold the key to understanding high-temperature
superconductivity. By using the state-of-the-art tensor network method,
specifically, the finite projected entangled pair state (PEPS) algorithm, to
simulate the global phase diagram of the J1-J2 Heisenberg model up to
[...] sites, we provide very solid evidence to show that the nature of
the intermediate nonmagnetic phase is a gapless quantum spin liquid (QSL),
whose spin-spin and dimer-dimer correlations both decay with a power law
behavior. There also exists a valence-bond solid (VBS) phase in a very narrow
region before the system enters the well known
collinear antiferromagnetic phase. The physical nature of the discovered
gapless QSL and potential experimental implications are also addressed. We
stress that we make the first detailed comparison between the results of PEPS
and the well-established density matrix renormalization group (DMRG) method
through one-to-one direct benchmark for small system sizes, and thus give rise
to a very solid PEPS calculation beyond DMRG. Our numerical evidence
explicitly demonstrates the huge power of PEPS for precisely capturing
long-range physics in highly frustrated systems, and also demonstrates that the
finite PEPS method is a very powerful approach to studying strongly correlated
quantum many-body problems.
The effect of the credit term structure of monetary policy on firms' "short-term debt for long-term investment" behavior: empirical evidence from China
This paper examines the effects and mechanism paths of monetary policy on firms' "short-term debt for long-term investment" (SDFLI) behavior using panel data on Chinese A-share listed firms from 2007 to 2019. The findings indicate that loose monetary policy suppresses corporate SDFLI behavior by lengthening firms' credit maturity structure, that is, through the credit term structure channel. In addition, heterogeneity analysis shows that loose monetary policy significantly inhibits the SDFLI behavior of state-owned enterprises (SOEs), non-high-tech firms, and firms in regions with high levels of bank competition through the credit term structure channel, whereas this channel fails for non-state-owned enterprises (non-SOEs), high-tech firms, and firms in regions with low levels of bank competition. The results of the heterogeneity analysis support the view that monetary policy affects firms' SDFLI behavior through the credit term structure channel.