20,914 research outputs found

    Organizing the Technical Debt Landscape

    Get PDF
    To date, several methods and tools for detecting source code and design anomalies have been developed. While each method focuses on identifying certain classes of source code anomalies that potentially relate to technical debt (TD), the overlaps and gaps among these classes and TD have not been rigorously demonstrated. We propose to construct a seminal technical debt landscape as a way to visualize and organize research on the subjec

    PROCESS CONFORMANCE TESTING: A METHODOLOGY TO IDENTIFY AND UNDERSTAND PROCESS VIOLATIONS IN ENACTMENT OF SOFTWARE PROCESSES

    Get PDF
    Today's software development is driven by software processes and practices that when followed increase the chances of building high quality software products. Not following these guidelines results in increased risk that the goal for the software's quality characteristics cannot be reached. Current process analysis approaches are limited in identifying and understanding process deviations and ultimately fail in comprehending why a process does not work in a given environment and what steps of the process have to be changed and tailored. In this work I will present a methodology for formulating, identifying and investigating process violations in the execution of software processes. The methodology, which can be thought of as "Process Conformance Testing", consists of a four step iterative model, compromising templates and tools. A strong focus is set on identifying violations in a cost efficient and unobtrusive manner by utilizing automatically collected data gathered through commonly used software development tools, such as version control systems. To evaluate the usefulness and correctness of the model a series of four studies have been conducted in both classroom and professional environments. A total of eight different software processes have been investigated and tested. The results of the studies show that the steps and iterative character of the methodology are useful for formulating and tailoring violation detection strategies and investigating violations in classroom study environments and professional environments. All the investigated processes were violated in some way, which emphasizes the importance of conformance measurement. This is especially important when running an empirical study to evaluate the effectiveness of a software process, as the experimenters want to make sure they are evaluating the specified process and not a variation of it. Violation detection strategies were tailored based upon analysis of the history of violations and feedback from then enactors and mangers yielding greater precision of identification of non-conformities. The overhead cost of the approach is shown to be feasible with a 3.4% (professional environment) and 12.1% (classroom environment) overhead. One interesting side result is that process enactors did not always follow the process for good reason, e.g. the process was not tailored for the environment, it was not specified at the right level of granularity, or was too difficult to follow. Two specific examples in this thesis are XP Pair Switching and Test Driven Development. In XP Pair Switching, the practice was violated because the frequency of switching was too high. The definition of Test Driven Development is simple and clear but requires a fair amount of discipline to follow, especially by novice programmers

    Streamlining code smells: Using collective intelligence and visualization

    Get PDF
    Context. Code smells are seen as major source of technical debt and, as such, should be detected and removed. Code smells have long been catalogued with corresponding mitigating solutions called refactoring operations. However, while the latter are supported in current IDEs (e.g., Eclipse), code smells detection scaffolding has still many limitations. Researchers argue that the subjectiveness of the code smells detection process is a major hindrance to mitigate the problem of smells-infected code. Objective. This thesis presents a new approach to code smells detection that we have called CrowdSmelling and the results of a validation experiment for this approach. The latter is based on supervised machine learning techniques, where the wisdom of the crowd (of software developers) is used to collectively calibrate code smells detection algorithms, thereby lessening the subjectivity issue. Method. In the context of three consecutive years of a Software Engineering course, a total “crowd” of around a hundred teams, with an average of three members each, classified the presence of 3 code smells (Long Method, God Class, and Feature Envy) in Java source code. These classifications were the basis of the oracles used for training six machine learning algorithms. Over one hundred models were generated and evaluated to determine which machine learning algorithms had the best performance in detecting each of the aforementioned code smells. Results. Good performances were obtained for God Class detection (ROC=0.896 for Naive Bayes) and Long Method detection (ROC=0.870 for AdaBoostM1), but much lower for Feature Envy (ROC=0.570 for Random Forrest). Conclusions. Obtained results suggest that Crowdsmelling is a feasible approach for the detection of code smells, but further validation experiments are required to cover more code smells and to increase external validityContexto. Os cheiros de código são a principal causa de dívida técnica (technical debt), como tal, devem ser detectados e removidos. Os cheiros de código já foram há muito tempo catalogados juntamente com as correspondentes soluções mitigadoras chamadas operações de refabricação (refactoring). No entanto, embora estas últimas sejam suportadas nas IDEs actuais (por exemplo, Eclipse), a deteção de cheiros de código têm ainda muitas limitações. Os investigadores argumentam que a subjectividade do processo de deteção de cheiros de código é um dos principais obstáculo à mitigação do problema da qualidade do código. Objectivo. Esta tese apresenta uma nova abordagem à detecção de cheiros de código, a que chamámos CrowdSmelling, e os resultados de uma experiência de validação para esta abordagem. A nossa abordagem de CrowdSmelling baseia-se em técnicas de aprendizagem automática supervisionada, onde a sabedoria da multidão (dos programadores de software) é utilizada para calibrar colectivamente algoritmos de detecção de cheiros de código, diminuindo assim a questão da subjectividade. Método. Em três anos consecutivos, no âmbito da Unidade Curricular de Engenharia de Software, uma "multidão", num total de cerca de uma centena de equipas, com uma média de três membros cada, classificou a presença de 3 cheiros de código (Long Method, God Class, and Feature Envy) em código fonte Java. Estas classificações foram a base dos oráculos utilizados para o treino de seis algoritmos de aprendizagem automática. Mais de cem modelos foram gerados e avaliados para determinar quais os algoritmos de aprendizagem de máquinas com melhor desempenho na detecção de cada um dos cheiros de código acima mencionados. Resultados. Foram obtidos bons desempenhos na detecção do God Class (ROC=0,896 para Naive Bayes) e na detecção do Long Method (ROC=0,870 para AdaBoostM1), mas muito mais baixos para Feature Envy (ROC=0,570 para Random Forrest). Conclusões. Os resultados obtidos sugerem que o Crowdsmelling é uma abordagem viável para a detecção de cheiros de código, mas são necessárias mais experiências de validação para cobrir mais cheiros de código e para aumentar a validade externa

    Molecular Genetic Diversity Study of Forest Coffee Tree (Coffea arabica L.) Populations in Ethiopia: Implications for Conservation and Breeding

    Get PDF
    Coffee provides one of the most widely drunk beverages in the world, and is a very important source of foreign exchange income for many countries. Coffea arabica, which contributes over 70 percent of the world's coffee productions, is characterized by a low genetic diversity, attributed to its allopolyploidy origin, reproductive biology and evolution. C. arabica has originated in the southwest rain forests of Ethiopia, where it is grown under four different systems, namely forest coffee, small holders coffee, semi plantation coffee and plantation coffee. Genetic diversity of the forest coffee (C. arabica) gene pool in Ethiopia is being lost at an alarming rate because of habitat destruction (deforestation), competition from other cash crops and replacement by invariable disease resistant coffee cultivars. This study focused on molecular genetic diversity study of forest coffee populations in Ethiopia using PCR based DNA markers such as random amplified polymorphic DNA (RAPD), inverse sequence-tagged repeat (ISTR), inter-simple sequence repeats (ISSR) and simple sequence repeat (SSR) or microsatellites. The objectives of the study are to estimate the extent and distribution of molecular genetic diversity of forest coffee and to design conservation strategies for it’s sustainable use in future coffee breeding. In this study, considerable samples of forest coffee collected from four coffee growing regions (provinces) of Ethiopia were analysed. The results indicate that moderate genetic diversity exists within and among few forest coffee populations, which need due attention from a conservation and breeding point of view. The cluster analysis revealed that most of the samples from the same region (province) were grouped together which could be attributed to presence of substantial gene flow between adjacent populations in each region in the form of young coffee plants through transplantation by man. In addition wild animals such as monkeys also play a significant role in coffee trees gene flow between adjacent populations. The overall variation of the forest coffee is found to reside in few populations from each region. Therefore, considering few populations from each region for either in situ or ex situ conservation may preserve most of the variation within the species. For instance, Welega-2, Ilubabor-2, Jima-2 and Bench Maji-2 populations should be given higher priority. In addition, some populations or genotypes have displayed unique amplification profiles particularly for RAPD and ISTR markers. Whether these unique bands are linked to any of the important agronomic traits and serve in marker assisted selections in future coffee breeding requires further investigations

    A heuristic-based approach to code-smell detection

    Get PDF
    Encapsulation and data hiding are central tenets of the object oriented paradigm. Deciding what data and behaviour to form into a class and where to draw the line between its public and private details can make the difference between a class that is an understandable, flexible and reusable abstraction and one which is not. This decision is a difficult one and may easily result in poor encapsulation which can then have serious implications for a number of system qualities. It is often hard to identify such encapsulation problems within large software systems until they cause a maintenance problem (which is usually too late) and attempting to perform such analysis manually can also be tedious and error prone. Two of the common encapsulation problems that can arise as a consequence of this decomposition process are data classes and god classes. Typically, these two problems occur together – data classes are lacking in functionality that has typically been sucked into an over-complicated and domineering god class. This paper describes the architecture of a tool which automatically detects data and god classes that has been developed as a plug-in for the Eclipse IDE. The technique has been evaluated in a controlled study on two large open source systems which compare the tool results to similar work by Marinescu, who employs a metrics-based approach to detecting such features. The study provides some valuable insights into the strengths and weaknesses of the two approache

    Paraphrase Plagiarism Identifcation with Character-level Features

    Full text link
    [EN] Several methods have been proposed for determining plagiarism between pairs of sentences, passages or even full documents. However, the majority of these methods fail to reliably detect paraphrase plagiarism due to the high complexity of the task, even for human beings. Paraphrase plagiarism identi cation consists in automatically recognizing document fragments that contain re-used text, which is intentionally hidden by means of some rewording practices such as semantic equivalences, discursive changes, and morphological or lexical substitutions. Our main hypothesis establishes that the original author's writing style ngerprint prevails in the plagiarized text even when paraphrases occur. Thus, in this paper we propose a novel text representation scheme that gathers both content and style characteristics of texts, represented by means of character-level features. As an additional contribution, we describe the methodology followed for the construction of an appropriate corpus for the task of paraphrase plagiarism identi cation, which represents a new valuable resource to the NLP community for future research work in this field.This work is the result of the collaboration in the framework of the CONACYT Thematic Networks program (RedTTL Language Technologies Network) and the WIQ-EI IRSES project (Grant No. 269180) within the FP7 Marie Curie action. The first author was supported by CONACYT (Scholarship 258345/224483). The second, third, and sixth authors were partially supported by CONACyT (Project Grants 258588 and 2410). The work of the fourth author was partially supported by the SomEMBED TIN2015-71147-C2-1-P MINECO research project and by the Generalitat Valenciana under the Grant ALMAMATER (PrometeoII/2014/030).Sánchez-Vega, F.; Villatoro-Tello, E.; Montes-Y-Gómez, M.; Rosso, P.; Stamatatos, E.; Villaseñor-Pineda, L. (2019). Paraphrase Plagiarism Identifcation with Character-level Features. Pattern Analysis and Applications. 22(2):669-681. https://doi.org/10.1007/s10044-017-0674-zS66968122

    A Misuse-Based Intrusion Detection System for ITU-T G.9959 Wireless Networks

    Get PDF
    Wireless Sensor Networks (WSNs) provide low-cost, low-power, and low-complexity systems tightly integrating control and communication. Protocols based on the ITU-T G.9959 recommendation specifying narrow-band sub-GHz communications have significant growth potential. The Z-Wave protocol is the most common implementation. Z-Wave developers are required to sign nondisclosure and confidentiality agreements, limiting the availability of tools to perform open source research. This work discovers vulnerabilities allowing the injection of rogue devices or hiding information in Z-Wave packets as a type of covert channel attack. Given existing vulnerabilities and exploitations, defensive countermeasures are needed. A Misuse-Based Intrusion Detection System (MBIDS) is engineered, capable of monitoring Z-Wave networks. Experiments are designed to test the detection accuracy of the system against attacks. Results from the experiments demonstrate the MBIDS accurately detects intrusions in a Z-Wave network with a mean misuse detection rate of 99%. Overall, this research contributes new Z-Wave exploitations and an MBIDS to detect rogue devices and packet injection attacks, enabling a more secure Z-Wave network
    • …
    corecore