159 research outputs found

    CoderEval: A Benchmark of Pragmatic Code Generation with Generative Pre-trained Models

    Code generation models based on the pre-training and fine-tuning paradigm have been increasingly attempted by both academia and industry, resulting in well-known industrial models such as Codex, CodeGen, and PanGu-Coder. To evaluate the effectiveness of these models, multiple existing benchmarks are proposed, including only cases of generating a standalone function, i.e., a function that may invoke or access only built-in functions and standard libraries. However, non-standalone functions, which typically are not included in the existing benchmarks, constitute more than 70% of the functions in popular open-source projects, and evaluating models' effectiveness on standalone functions cannot reflect these models' effectiveness on pragmatic code generation scenarios. To help bridge the preceding gap, in this paper, we propose a benchmark named CoderEval, consisting of 230 Python and 230 Java code generation tasks carefully curated from popular real-world open-source projects and a self-contained execution platform to automatically assess the functional correctness of generated code. CoderEval supports code generation tasks from six levels of context dependency, where context refers to code elements such as types, APIs, variables, and consts defined outside the function under generation but within the dependent third-party libraries, current class, file, or project. CoderEval can be used to evaluate the effectiveness of models in generating code beyond only standalone functions. By evaluating three code generation models on CoderEval, we find that the effectiveness of these models in generating standalone functions is substantially higher than that in generating non-standalone functions. Our analysis highlights the current progress and pinpoints future directions to further improve a model's effectiveness by leveraging contextual information for pragmatic code generation.

    TOWARDS THE RATIONAL DESIGN OF ORGANIC SEMICONDUCTORS THROUGH COMPUTATIONAL APPROACHES

    Though organic semiconductors have illustrated potential as industry-relevant materials for electronics applications, there are few guidelines that can take one from molecular design to functional materials. This limitation is, in part, due to incomplete understanding as to how the atomic-scale construction of the π-conjugated molecules that comprise the organic semiconductors determines the nature and strength of both the noncovalent intramolecular interactions that govern molecular conformation and noncovalent intermolecular interactions that regulate the energetic preference for solid-state packing. Hence, there remain several fundamental questions that need to be resolved in order to design organic semiconductors from a priori knowledge, including: What is the relevance of the relatively weak noncovalent intramolecular interactions on determining molecular structure, are current hypotheses put forward as to important interactions valid, and how does chemical substitution at various positions along the π-conjugated backbone impact these interactions? How do the intermolecular noncovalent interactions regulate solid-state packing, are there features of the molecular structure – e.g. the π-conjugated backbone, heteroatoms, or pendent alkyl chains – that play a more important role? What connections can be made between the structures/properties of the π-conjugated molecules and the resulting organic semiconductors? In this dissertation, Chapter 1 provides an introductory discussion of these questions and a brief review of previous studies. Chapter 2 details the computational approaches that were implemented throughout the course of the thesis work. Chapter 3 describes the investigation of a series of pyrene-acene molecules to illustrate the importance of choosing the right molecular structure in π-conjugated chromophores. In Chapter 4, S...F noncovalent intramolecular interactions are systematically investigated in two separate cases to highlight the varied impact that these interactions can have on molecular and solid-state packing structures. Chapter 5 describes the investigation of an oscillatory crystal packing structure observed for a series of oligothiophenes that follow the odd-even carbon-atom counts of the pendant alkyl chains. In Chapter 6, the polymorphism of functionalized pentacene molecules is studied to reveal how seemingly simple atomic substitutions can drastically alter solid-state packing. To systematically address the aforementioned fundamental questions, Chapter 7 describes the construction and application of a database of crystalline molecular organic semiconductors. Finally, perspectives regarding future research are provided in Chapter 8.

    Can Programming Languages Boost Each Other via Instruction Tuning?

    Once human programmers have mastered one programming language, learning a new one becomes easier. In this report, we focus on exploring whether programming languages can boost each other during the instruction fine-tuning phase of code large language models. We conduct extensive experiments of 8 popular programming languages (Python, JavaScript, TypeScript, C, C++, Java, Go, HTML) on StarCoder. Results demonstrate that programming languages can significantly improve each other. For example, CodeM-Python 15B trained on Python is able to increase Java by an absolute 17.95% pass@1 on HumanEval-X. More surprisingly, we found that CodeM-HTML 7B trained on the HTML corpus can improve Java by an absolute 15.24% pass@1. Our training data is released at https://github.com/NL2Code/CodeM. Comment: Work in progress.
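
The pass@1 figures quoted in this abstract are instances of the pass@k metric. As a rough illustration only, the sketch below uses the widely used unbiased estimator 1 - C(n-c, k) / C(n, k), where n samples are drawn per problem and c of them pass the tests. The function name `pass_at_k` and the sample counts are assumptions for illustration; the abstract does not say how many samples the authors drew or how they computed the metric.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem from n samples, c of which are correct.

    Hypothetical helper for illustration; not taken from the listed paper.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # If fewer than k samples are incorrect, every size-k subset contains
    # at least one correct sample.
    if n - c < k:
        return 1.0
    # Probability that a random size-k subset contains no correct sample,
    # subtracted from 1.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over problems, each given as an (n, c) pair."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

For k = 1 the estimator reduces to c / n, so an "absolute" improvement in pass@1 is simply the difference between two such averages.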

    Message from the Steering Committee Chair

    The 10th International Conference on Quality Software (QSIC 2010), Zhangjiajie, China, 14-15 July 2010. In Conference Proceedings, 2010, p. xi

    Towards Software Architecture at Runtime

    Genetic Structure and Demographic History Should Inform Conservation: Chinese Cobras Currently Treated as Homogenous Show Population Divergence

    An understanding of population structure and genetic diversity is crucial for wildlife conservation and for determining the integrity of wildlife populations. The vulnerable Chinese cobra (Naja atra) has a distribution from the mouth of the Yangtze River down to northern Vietnam and Laos, within which several large mountain ranges and water bodies may influence population structure. We combined 12 microsatellite loci and 1117 bp of the mitochondrial cytochrome b gene to explore genetic structure and demographic history in this species, using 269 individuals from various localities in Mainland China and Vietnam. High levels of genetic variation were identified for both mtDNA and microsatellites. mtDNA data revealed two main (Vietnam + southern China + southwestern China; eastern + southeastern China) and one minor (comprising only two individuals from the westernmost site) clades. Microsatellite data divided the eastern + southeastern China clade further into two genetic clusters, which include individuals from the eastern and southeastern regions, respectively. The Luoxiao and Nanling Mountains may be important barriers affecting the diversification of lineages. In the haplotype network of cytochrome b, many haplotypes were represented within a “star” cluster and this and other tests suggest recent expansion. However, microsatellite analyses did not yield strong evidence for a recent bottleneck for any population or genetic cluster. The three main clusters identified here should be considered as independent management units for conservation purposes. The release of Chinese cobras into the wild should cease unless their origin can be determined, and this will avoid problems arising from unnatural homogenization.

    Application-centric Resource Provisioning for Amazon EC2 Spot Instances

    In late 2009, Amazon introduced spot instances to offer their unused resources at lower cost with reduced reliability. Amazon's spot instances allow customers to bid on unused Amazon EC2 capacity and run those instances for as long as their bid exceeds the current spot price. The spot price changes periodically based on supply and demand, and customers whose bids exceed it gain access to the available spot instances. Customers may expect their services at lower cost with spot instances compared to on-demand or reserved instances. However, reliability is compromised, since the instances (IaaS) providing the service (SaaS) may become unavailable at any time without any notice to the customer. Checkpointing and migration schemes are of great use in coping with such situations. In this paper we study various checkpointing schemes that can be used with spot instances. We also devise checkpointing algorithms on top of an application-centric resource provisioning framework that increase reliability while reducing cost significantly.
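
The bidding rule and the role of checkpointing described in this abstract can be sketched as a toy simulation. This is a minimal sketch under stated assumptions, not the paper's algorithms: the function `simulate_spot_run`, the hourly price trace, billing at the current spot price per running hour, and free, instantaneous checkpoints are all hypothetical simplifications.

```python
def simulate_spot_run(prices, bid, work_needed, checkpoint_every):
    """Simulate a job on a spot instance with periodic checkpointing.

    prices           -- hypothetical hourly spot prices
    bid              -- the customer's bid; the instance runs only while bid > price
    work_needed      -- hours of computation the job requires
    checkpoint_every -- save progress after this many uncheckpointed hours

    Returns (hours_elapsed, cost) when the job finishes, or None if the
    price trace ends first.
    """
    done = 0    # hours of work completed in the current run
    saved = 0   # hours of work preserved by the last checkpoint
    cost = 0.0
    for hour, price in enumerate(prices, start=1):
        if bid > price:
            # Instance is running: make progress and pay the spot price.
            done += 1
            cost += price
            if done - saved >= checkpoint_every:
                saved = done  # assumed free and instantaneous
            if done >= work_needed:
                return hour, cost
        else:
            # Out-of-bid: the instance is revoked without notice and all
            # work since the last checkpoint is lost.
            done = saved
    return None
```

With the trace `[0.10, 0.10, 0.30, 0.10, 0.10, 0.10]`, a bid of 0.20 and four hours of work, checkpointing every two hours lets the job finish in hour 5, while an interval longer than the job (no effective checkpointing) loses the first two hours at the price spike and does not finish within the trace.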

    Automated analysis of inter-parameter dependencies in web APIs

    Web services often impose constraints that restrict the way in which two or more input parameters can be combined to form valid calls to the service, i.e. inter-parameter dependencies. Current web API specification languages like the OpenAPI Specification (OAS) provide no support for the formal description of such dependencies, making it hardly possible to interact with the services without human intervention. We propose specifying and automatically analyzing inter-parameter dependencies in web APIs. To this end, we propose a domain-specific language to describe these dependencies, a constraint programming-aided tool supporting their automated analysis, and an OAS extension integrating our approach and easing its adoption. Together, these contributions open a new range of possibilities in areas such as source code generation and testing.
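
To make the notion of an inter-parameter dependency concrete, here is a minimal sketch that checks a request's parameters against two common dependency shapes. It is not the paper's domain-specific language or tool: the helpers `requires`, `only_one`, and `is_valid`, and the parameter names in the usage note, are hypothetical and chosen only for illustration.

```python
def requires(a, b):
    """Dependency: if parameter `a` is present, `b` must be present too."""
    return lambda params: a not in params or b in params


def only_one(*names):
    """Dependency: exactly one of the named parameters must be present."""
    return lambda params: sum(name in params for name in names) == 1


def is_valid(params, dependencies):
    """Return True if the request parameters satisfy every dependency."""
    return all(dep(params) for dep in dependencies)


# Hypothetical dependencies for an imaginary search endpoint.
DEPENDENCIES = [
    requires("radius", "location"),
    only_one("query", "type"),
]
```

Under these assumed rules, `is_valid({"query": "cafe"}, DEPENDENCIES)` holds, whereas `{"query": "cafe", "radius": 500}` is rejected because `radius` is given without `location`. A plain OAS description can mark each parameter as required or optional but cannot express either rule, which is the gap the abstract describes.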

    Software Performance Engineering for Cloud Applications – A Survey

    Cloud computing enables application service providers to lease their computing capabilities for deploying applications depending on user QoS (Quality of Service) requirements. Cloud applications have different composition, configuration and deployment requirements. Quantifying the performance of applications in Cloud computing environments is a challenging task. Software performance engineering (SPE) techniques enable us to assess performance requirements of software applications at the early stages of development. This assessment helps the developers to fine tune their design needs so that the targeted performance goals can be met. In this paper, we try to analyse performance related issues of cloud applications and identify any SPE techniques currently available for cloud applications.