CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion
Code completion models have made significant progress in recent years, yet
current popular evaluation datasets, such as HumanEval and MBPP, predominantly
focus on code completion tasks within a single file. This oversimplified
setting falls short of representing real-world software development, where
repositories span multiple files with numerous cross-file dependencies, and
where accessing and understanding cross-file context is often required to
complete the code correctly.
To fill this gap, we propose CrossCodeEval, a diverse and multilingual
code completion benchmark that necessitates an in-depth cross-file contextual
understanding to complete the code accurately. CrossCodeEval is built on a
diverse set of real-world, open-sourced, permissively-licensed repositories in
four popular programming languages: Python, Java, TypeScript, and C#. To create
examples that strictly require cross-file context for accurate completion, we
propose a straightforward yet efficient static-analysis-based approach to
pinpoint the use of cross-file context within the current file.
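To make the general idea concrete, here is a minimal sketch of such a static check, not the paper's actual tooling: using Python's ast module, it flags lines that reference names imported from other modules in the same repository. The function names and the repo_modules parameter are illustrative assumptions.

import ast

def local_import_names(tree, repo_modules):
    """Names bound by imports that resolve to other files in the same repo.

    repo_modules: set of dotted module paths known to live in this repository
    (an assumed input; a real tool would derive it from the file tree).
    """
    names = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.ImportFrom):
            if node.module in repo_modules:
                names.update(a.asname or a.name for a in node.names)
        elif isinstance(node, ast.Import):
            for a in node.names:
                if a.name in repo_modules:
                    names.add(a.asname or a.name.split(".")[0])
    return names

def cross_file_usage_lines(source, repo_modules):
    """Line numbers whose code references a symbol defined in another file."""
    tree = ast.parse(source)
    targets = local_import_names(tree, repo_modules)
    return sorted({n.lineno for n in ast.walk(tree)
                   if isinstance(n, ast.Name) and n.id in targets})

Lines flagged this way are candidate completion points that strictly require cross-file context, since the referenced symbol is defined elsewhere in the repository.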
Extensive experiments on state-of-the-art code language models like CodeGen
and StarCoder demonstrate that CrossCodeEval is extremely challenging when the
relevant cross-file context is absent, and we see clear improvements when this
context is added to the prompt. Even with these improvements, however, the
highest-performing model remains far from ceiling performance, indicating that
CrossCodeEval can also assess a model's capability to leverage extensive
context for better code completion. Finally, we benchmark various methods for
retrieving cross-file context and show that CrossCodeEval can also be used to
measure the capability of code retrievers.

Comment: To appear at NeurIPS 2023 (Datasets and Benchmarks Track)
ALEX: An Updatable Adaptive Learned Index
Recent work on "learned indexes" has changed the way we look at the decades-old field of DBMS indexing. The key idea is that indexes can be thought of as "models" that predict the position of a key in a dataset. Indexes can, thus, be learned. The original work by Kraska et al. shows that a learned index beats a B+ tree by a factor of up to three in search time and by an order of magnitude in memory footprint. However, it is limited to static, read-only workloads. In this paper, we present a new learned index called ALEX which addresses practical issues that arise when implementing learned indexes for workloads that contain a mix of point lookups, short range queries, inserts, updates, and deletes. ALEX effectively combines the core insights from learned indexes with proven storage and indexing techniques to achieve high performance and low memory footprint. On read-only workloads, ALEX beats the learned index from Kraska et al. by up to 2.2X in performance with up to 15X smaller index size. Across the spectrum of read-write workloads, ALEX beats B+ trees by up to 4.1X while never performing worse, with up to 2000X smaller index size. We believe ALEX presents a key step towards making learned indexes practical for a broader class of database workloads with dynamic updates.
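To make the "index as model" idea concrete, here is a toy sketch of the basic learned-index recipe, not ALEX's adaptive, updatable structure: a single linear model predicts a key's position in a sorted array, and a bounded local search corrects the prediction. The class name is hypothetical.

import bisect

class LinearLearnedIndex:
    """Toy learned index: position ~ slope * key + intercept, then local search."""

    def __init__(self, keys):
        self.keys = sorted(keys)  # assumes a non-empty list of numeric keys
        n = len(self.keys)
        # Least-squares fit of position as a linear function of the key.
        mean_k = sum(self.keys) / n
        mean_p = (n - 1) / 2
        var = sum((k - mean_k) ** 2 for k in self.keys) or 1.0
        cov = sum((k - mean_k) * (i - mean_p) for i, k in enumerate(self.keys))
        self.slope = cov / var
        self.intercept = mean_p - self.slope * mean_k
        # Worst-case prediction error bounds the search window at lookup time.
        self.err = max(abs(self._predict(k) - i) for i, k in enumerate(self.keys))

    def _predict(self, key):
        pos = int(self.slope * key + self.intercept)
        return min(max(pos, 0), len(self.keys) - 1)

    def lookup(self, key):
        """Return the key's position in the sorted array, or None if absent."""
        pos = self._predict(key)
        lo = max(0, pos - self.err)
        hi = min(len(self.keys), pos + self.err + 1)
        i = bisect.bisect_left(self.keys, key, lo, hi)
        return i if i < len(self.keys) and self.keys[i] == key else None

# Usage: LinearLearnedIndex(list(range(0, 100, 2))).lookup(42) -> 21

This static, read-only recipe is exactly the setting the abstract says the original learned index was limited to; ALEX's contribution is making the structure hold up under inserts, updates, and deletes as well.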