150 research outputs found
Evaluating Instruction-Tuned Large Language Models on Code Comprehension and Generation
In this work, we evaluate 10 open-source instructed LLMs on four
representative code comprehension and generation tasks. We have the following
main findings. First, for the zero-shot setting, instructed LLMs are very
competitive on code comprehension and generation tasks and sometimes even
better than small SOTA models specifically fine-tuned on each downstream task.
We also find that larger instructed LLMs are not always better on code-related
tasks. Second, for the few-shot setting, we find that adding demonstration
examples substantially helps instructed LLMs perform better on most code
comprehension and generation tasks; however, the examples would sometimes
induce unstable or even worse performance. Furthermore, we find that the widely used
BM25-based shot selection strategy significantly outperforms basic random
selection or fixed selection only on generation problems. Third, for the
fine-tuning setting, we find that fine-tuning could further improve the model
performance on downstream code comprehension and generation tasks compared to
the zero-shot/one-shot performance. In addition, after being fine-tuned on the
same downstream task dataset, instructed LLMs outperform both the small SOTA
models and similar-scaled LLMs without instruction tuning. Based on our
findings, we further present practical implications for model and usage
recommendations, performance and cost trade-offs, and future directions.
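The BM25-based shot selection mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a pool of demonstration examples stored as dictionaries with a hypothetical `input` field, whitespace tokenization, and the common Okapi BM25 defaults `k1=1.5` and `b=0.75`; none of these details come from the paper, whose actual setup may differ.

```python
import math
from collections import Counter


def tokenize(text):
    # Simplistic whitespace tokenizer (an assumption; real setups may use code-aware tokenizers).
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return the Okapi BM25 score of each document against the query."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    if n == 0:
        return []
    # Average document length; fall back to 1 to avoid dividing by zero.
    avg_len = sum(len(t) for t in tokenized) / n or 1.0
    # Document frequency: number of documents containing each term.
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))
    query_terms = set(tokenize(query))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def select_shots(query, pool, k=2):
    """Pick the k pool examples whose 'input' field scores highest under BM25."""
    scores = bm25_scores(query, [ex["input"] for ex in pool])
    ranked = sorted(range(len(pool)), key=lambda i: scores[i], reverse=True)
    return [pool[i] for i in ranked[:k]]


# Hypothetical demonstration pool, for illustration only.
pool = [
    {"input": "sort a list of numbers in ascending order", "output": "sorted(xs)"},
    {"input": "open a file and read lines", "output": "open(p).readlines()"},
    {"input": "reverse a string", "output": "s[::-1]"},
]
shots = select_shots("sort a list of integers", pool, k=1)
```

Random or fixed selection would instead draw from `pool` without looking at the query, which is the baseline the abstract compares against.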
Numerical Simulation of Nonperiodic Rail Operation Diagram Characteristics
This paper uses a cellular automata (CA) model to simulate train operation under the four-aspect color light system and to obtain the nonperiodic operation diagram of mixed passenger and freight tracks. The model simulates well how trains are prevented from colliding when stopping and restarting, follows the real-time changes in train speed and displacement, and captures the current train states at departure and arrival. Finally, the model produces train diagrams for different ratios of the van and analyzes several characteristic parameters of train running, such as time, speed, through capacity, departure interval, and number of departures.
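As a rough illustration of the kind of CA update the abstract above describes, the following sketch moves trains along a one-dimensional open track and caps each follower's speed by a four-aspect signal derived from the number of clear blocks ahead. The block length, the per-aspect speed limits, instantaneous braking, and the per-train maximum speeds are all assumptions made for this example; this is not the paper's model.

```python
BLOCK_LEN = 5  # cells per signal block (assumed value)

# Assumed speed limit (cells per step) by number of fully clear blocks ahead:
# 0 -> red, 1 -> yellow, 2 -> green-yellow; 3 or more -> green (train's own maximum).
ASPECT_LIMIT = {0: 0, 1: 1, 2: 2}


def step(positions, speeds, v_max):
    """Advance all trains one time step in parallel.

    positions must be strictly increasing (the last entry is the lead train),
    and v_max holds one maximum speed per train (e.g. passenger vs. freight).
    """
    new_speeds = []
    last = len(positions) - 1
    for i, (x, v) in enumerate(zip(positions, speeds)):
        target = min(v + 1, v_max[i])  # accelerate by one cell/step up to the maximum
        if i < last:
            gap = positions[i + 1] - x - 1  # empty cells to the train ahead
            limit = ASPECT_LIMIT.get(gap // BLOCK_LEN, v_max[i])
            target = min(target, limit, gap)  # obey the signal and never reach the leader
        new_speeds.append(target)
    new_positions = [x + v for x, v in zip(positions, new_speeds)]
    return new_positions, new_speeds


def simulate(positions, speeds, v_max, steps):
    """Return the position history, i.e. the data behind a time-distance train diagram."""
    history = [list(positions)]
    for _ in range(steps):
        positions, speeds = step(positions, speeds, v_max)
        history.append(list(positions))
    return history


# Three trains: passenger (v_max 4), freight (v_max 2), passenger (v_max 4).
history = simulate([0, 3, 20], [0, 0, 0], [4, 2, 4], steps=30)
```

Plotting each train's column of `history` against the step index gives a simple train diagram; changing the mix of values in `v_max` changes the passenger-to-freight ratio.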
ClassEval: A Manually-Crafted Benchmark for Evaluating LLMs on Class-level Code Generation
In this work, we make the first attempt to evaluate LLMs in a more
challenging code generation scenario, i.e. class-level code generation. We
first manually construct the first class-level code generation benchmark
ClassEval of 100 class-level Python code generation tasks with approximately
500 person-hours. Based on it, we then perform the first study of 11
state-of-the-art LLMs on class-level code generation. Based on our results, we
have the following main findings. First, we find that all existing LLMs show
much worse performance on class-level code generation than on standalone
method-level code generation benchmarks like HumanEval, and method-level
coding ability does not equivalently reflect class-level coding ability among
LLMs. Second, we find that GPT-4 and GPT-3.5 still clearly outperform
other LLMs on class-level code generation, and the second-tier models
include Instruct-Starcoder, Instruct-Codegen, and Wizardcoder, with very
similar performance. Third, we find that generating the entire class all at
once (i.e., the holistic generation strategy) is the best generation strategy only
for GPT-4 and GPT-3.5, while method-by-method generation (i.e., incremental and
compositional) is the better strategy for the other models, which have limited ability
to understand long instructions and utilize information in the middle.
Lastly, we find that models have limited ability to generate method-dependent code
and discuss the frequent error types in generated classes. Our benchmark is
available at https://github.com/FudanSELab/ClassEval
Sensing Characteristics of Side-Hole Fiber-Based Long-Period Grating
Long-period gratings (LPGs) have been fabricated in a side-hole fiber (SHF) by using a pulsed CO2 laser. The sensing characteristics of this SHF-LPG with respect to temperature, surrounding refractive index, and bend have been investigated. Experimental results show that the resonant wavelength of the SHF-LPG has a blue shift with temperature, with a sensitivity of −0.11 nm/°C; a blue shift of increasing sensitivity as the surrounding refractive index ranges from 1.335 to 1.44 (the maximum sensitivity is achieved when the surrounding refractive index reaches the effective index of the fiber cladding); and a red shift with a bend-direction-dependent sensitivity of up to 9.36 nm/m⁻¹.
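To make the reported temperature sensitivity concrete, the short sketch below converts between a temperature change and a resonant-wavelength shift. It assumes the response is linear over the range of interest, which the abstract does not state; the function names are illustrative only.

```python
TEMP_SENSITIVITY_NM_PER_C = -0.11  # temperature sensitivity reported in the abstract


def resonance_shift_nm(delta_t_c, sensitivity=TEMP_SENSITIVITY_NM_PER_C):
    """Wavelength shift in nm for a temperature change in degrees C (linear response assumed)."""
    return sensitivity * delta_t_c


def temperature_change_c(shift_nm, sensitivity=TEMP_SENSITIVITY_NM_PER_C):
    """Inverse relation: temperature change inferred from a measured wavelength shift."""
    return shift_nm / sensitivity


shift = resonance_shift_nm(20.0)  # negative value, i.e. a blue shift
```

Under this linear assumption, a 20 °C rise corresponds to a shift of about −2.2 nm.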
Observation of Giant Spin Splitting and d-wave Spin Texture in Room Temperature Altermagnet RuO2
Recently, a novel magnetic phase called altermagnetism has been proposed,
ushering in a third distinct magnetic phase beyond ferromagnetism and
antiferromagnetism. It is expected that this groundbreaking phase exhibits
unique physical properties such as C-paired spin-valley locking, anomalous Hall
effect, nontrivial Berry phase, and giant magnetoresistance. Among all the
predicted candidates, several room temperature altermagnets are suggested to
host significant potential applications in the near future. Nevertheless,
direct evidence of the spin pattern of a room temperature altermagnet is
still lacking. Previous studies identified RuO2 as the most
promising candidate for room temperature d-wave altermagnetism, exhibiting a
substantial spin splitting of up to 1.4 eV. In this study, utilizing
angle-resolved photoemission spectroscopy (ARPES), we report experimental
observation of the spin splitting in RuO2. Furthermore, employing spin-ARPES,
we directly observed the d-wave spin pattern. Our results unequivocally show
that RuO2 is a perfect d-wave altermagnet with great potential for upcoming
spintronic applications. Comment: 32 pages, 12 figures
A new unconventional HLA-A2-restricted epitope from HBV core protein elicits antiviral cytotoxic T lymphocytes
Cytotoxic T cells (CTLs) play a key role in the control of hepatitis B virus (HBV) infection and in viral clearance. However, most identified CTL epitopes are derived from HBV of genotypes A and D, and few have been defined in virus of genotypes B and C, which are more prevalent in Asia. As HBV core protein (HBc) is the most conserved and immunogenic component, in this study we used an overlapping 9-mer peptide pool covering HBc to screen for and identify specific CTL epitopes. An unconventional HLA-A2-restricted epitope, HBc141–149, was discovered and structurally characterized by crystallization analysis. Its immunogenicity and anti-HBV activity were further determined in HBV and HLA-A2 transgenic mice. Finally, we show that mutations in the HBc141–149 epitope are associated with viral parameters and disease progression in HBV-infected patients. Our data therefore provide insights into the structural characteristics of this unconventional epitope binding to MHC-I molecules, as well as into the epitope-specific CTL activity that orchestrates T cell response and immune evasion in HBV-infected patients.
Breaking K+ Concentration Limit on Cu Nanoneedles for Acidic Electrocatalytic CO2 Reduction to Multi‐Carbon Products
The electrocatalytic CO2 reduction reaction (CO2RR) to multi-carbon products (C2+) in acidic electrolyte is one of the most advanced routes for tackling the current climate and energy crisis. However, the competing hydrogen evolution reaction (HER) and poor selectivity towards the valuable C2+ products are major obstacles to upscaling these technologies. A high local potassium ion (K+) concentration at the cathode surface can inhibit proton diffusion and accelerate the desirable carbon-carbon (C−C) coupling process. However, the solubility limit of potassium salts in bulk solution constrains the maximum achievable K+ concentration at the reaction sites and thus the overall acidic CO2RR performance of most electrocatalysts. In this work, we demonstrate that Cu nanoneedles induce ultrahigh local K+ concentrations (4.22 M), thus breaking the K+ solubility limit (3.5 M), which enables highly efficient CO2RR in 3 M KCl at pH = 1. As a result, a Faradaic efficiency for C2+ (FE_C2+) of 90.69 ± 2.15% can be achieved at 1400 mA cm⁻², together with a single-pass carbon efficiency (SPCE) of 25.49 ± 0.82% at a CO2 flow rate of 7 sccm.