4,411 research outputs found

    Predicting prime path coverage using regression analysis at method level

    Test coverage criteria help testers analyze the quality of a test suite, especially in an evolving system, where they can guide the prioritization of regression tests and the testing effort for new code. However, coverage analysis for more powerful criteria such as path coverage remains challenging because supporting tools are lacking. As a consequence, testers evaluate test suite quality using more basic coverage criteria (e.g., node coverage and edge coverage), which are the ones that tools support. In this work, we evaluate the opportunity of using machine learning algorithms to estimate the prime-path coverage of a test suite at the method level. We followed the Knowledge Discovery in Databases process and used a dataset built from 9 real-world projects to devise three regression models that predict prime-path coverage from code metrics. We compare four different machine learning algorithms and conduct a fine-grained feature analysis to investigate the factors that most affect prediction accuracy. Our experimental results show that accurate predictive models can be built from a small, easily obtained set of code metrics: the best model uses as input only five source code metrics and one basic test coverage metric. It achieves an MAE of 0.016 (1.6%) in cross-validation (internal validation) and an MAE of 0.06 (6%) in external validation. Finally, we observed that good prediction models can be generated from common code metrics, although a simple test metric such as branch coverage, when available, can further improve prediction performance.
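To make the reported error figures concrete, here is a minimal sketch of how a mean absolute error (MAE) over per-method coverage predictions could be computed. The coverage values below are hypothetical and are not taken from the paper; the function name is also an assumption made for illustration.

```python
def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over paired observations."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical prime-path coverage per method, as fractions in [0, 1].
actual_coverage = [0.50, 0.80, 1.00, 0.25]
predicted_coverage = [0.52, 0.79, 0.97, 0.27]

mae = mean_absolute_error(actual_coverage, predicted_coverage)
print(f"MAE = {mae:.3f}")
```

With coverage expressed as a fraction, an MAE of 0.016 means the predictions are off by 1.6 percentage points on average.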

    Gas phase homolytic bond dissociation enthalpies of common laboratory solvents: A G4 theoretical study

    Gas phase standard state (298.15 K, 1 atm) calculations were conducted at the Gaussian-4 (G4) composite method level of theory to estimate the bond dissociation enthalpies (BDEs) of various common laboratory solvents. Excellent agreement was obtained between experimental and G4 estimated BDEs. The current study demonstrates the BDE prediction accuracy of the G4 method, and is also intended to function as a potentially useful resource in any reevaluations of the preferred BDEs for these common laboratory solvents
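For reference, the homolytic bond dissociation enthalpy of a bond A-B is conventionally the enthalpy change of the gas-phase reaction A-B → A• + B•. A sketch in terms of computed enthalpies at 298.15 K follows; the symbol H_298 is notation assumed here, not taken from the abstract.

```latex
\[
\mathrm{BDE}(\mathrm{A{-}B})
  = H_{298}(\mathrm{A}^{\bullet})
  + H_{298}(\mathrm{B}^{\bullet})
  - H_{298}(\mathrm{A{-}B})
\]
```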

    Method-Level Bug Severity Prediction using Source Code Metrics and LLMs

    Over the past couple of decades, significant research effort has been devoted to the prediction of software bugs. However, most existing work in this domain treats all bugs the same, which is not the case in practice. It is important for a defect prediction method to estimate the severity of the identified bugs so that the higher-severity ones get immediate attention. In this study, we investigate source code metrics, source code representations using large language models (LLMs), and their combination in predicting bug severity labels on two prominent datasets. We leverage several source code metrics at method-level granularity to train eight different machine learning models. Our results suggest that Decision Tree and Random Forest models outperform the other models on several of our evaluation metrics. We then use the pre-trained CodeBERT LLM to study the effectiveness of source code representations in predicting bug severity. Fine-tuning CodeBERT improves the bug severity prediction results significantly, in the range of 29%-140% for several evaluation metrics, compared to the best classic prediction model based on source code metrics. Finally, we integrate source code metrics into CodeBERT as an additional input, using our two proposed architectures, both of which enhance the effectiveness of the CodeBERT model.

    Reverse Engineering Software Code in Java to Show Method Level Dependencies

    With the increased dependency on the Internet and computers, the software industry continues to grow. However, just as new software is being developed, older software is still in existence and must be maintained. This tends to be a difficult task, as the developers charged with maintaining the software are not always the developers who designed it. Reverse engineering is the study of an application's code and behavior, in order to better understand the system and its design. There are many existing tools that will assist the developer with this undertaking, such as Rational Rose®, jGRASP®, and Eclipse®. However, all the tools generate high level abstractions of the system in question, like the class diagram. It would be more beneficial to developers to have illustrations with more detailed information, such as the method level dependencies in the source code. In order to accomplish this task, a new framework has been developed that will allow the user to view both high level and lower level code detail. As users attempt to perform code maintenance, they will run the code through an existing tool, such as Rational Rose®, and then through the Method Level Dependency Generator component, to show the method level dependencies. These tools used together provide the software maintainer with more useful information, assisting with the software development process, including code design, implementation, and testing.

    A G4MP2 theoretical study on the gas phase enthalpies of formation for various polycyclic aromatic hydrocarbons (PAHs) and other C~10~ through C~20~ unsaturated hydrocarbons

    Gas phase enthalpies of formation at 298.15 K and 1 atm (Δ~f~H~(g),298K~) were calculated using the atomization approach at the G4MP2 composite method level of theory for 86 polyaromatic hydrocarbons (PAHs) and other C~10~ through C~20~ unsaturated hydrocarbons. Where available, good agreement with prior experimental data and/or high level theoretical estimates was obtained. Linear regressions between the Δ~f~H~(g),298K~ values estimated by the semiempirical MNDO, MNDO-d, AM1, PM3, RM1, and PM6 methods and the corresponding G4MP2 values were employed to obtain G4MP2 corrected semiempirical Δ~f~H~(g),298K~ for a suite of 156 C~11~ through C~42~ unsaturated hydrocarbons and PAHs.
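The correction step described above can be sketched as an ordinary least-squares line fitted on a calibration set and then applied to new semiempirical estimates. This is an illustrative sketch only: the direction of the fit (high-level value as a function of the semiempirical one), the function names, and all numbers are assumptions, with the data chosen to be exactly linear so the result is easy to check by hand.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical calibration set: semiempirical estimates paired with
# higher-level reference values (arbitrary units, not real data).
semiempirical = [10.0, 20.0, 30.0, 40.0]
reference = [21.0, 41.0, 61.0, 81.0]

slope, intercept = fit_line(semiempirical, reference)


def corrected(value):
    """Map a new semiempirical estimate onto the reference scale."""
    return slope * value + intercept


print(corrected(50.0))
```

In practice one such line would be fitted separately for each semiempirical method, since each has its own systematic error.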

    Gas phase bond dissociation enthalpies and enthalpies of isomerization/reaction for small hydrocarbon combustion related compounds between 300 and 1500 K: A comparison of Gaussian-4 (G4) theoretical values against experimental data

    Gas phase calculations at 1 atmosphere pressure between 300 and 1500 K at 200 K intervals were conducted using the Gaussian-4 (G4) composite method level of theory on a representative set of reactions having broad relevance in hydrocarbon combustion chemistry. Reasonable agreement between the experimental and theoretical data was obtained across the temperature range under consideration for all bond dissociation enthalpies, isomerization enthalpies, and enthalpies of reaction. For some reaction schemes, chemical accuracy for the theoretical method was maintained over the complete temperature range, whereas other systems displayed up to several kcal mol^-1^ deviations from experimental data. The direction of signed errors generally increased as the temperature was raised, and no general error trends were related to molecular size or reaction class

    Measuring the impact of object-oriented techniques in grande applications: a method-level analysis

    In this work we seek to provide a foundation for the study of the level of use of object-oriented techniques in Java programs in general, and scientific applications in particular. Specifically, we focus on the use of small methods, and the frequency with which they are called, since this forms the basis for the study of method inlining, an important optimisation technique. We compare the Grande and SPEC benchmark suites, and note a significant difference in the nature and composition of these suites