    An integrated search-based approach for automatic testing from extended finite state machine (EFSM) models

    This is the post-print version of the article (Copyright © 2011 Elsevier). The extended finite state machine (EFSM) is a modelling approach that has been used to represent a wide range of systems. When testing from an EFSM, it is normal to use a test criterion such as transition coverage, and such criteria are often expressed in terms of transition paths (TPs) through the EFSM. Despite the popularity of EFSMs, testing from an EFSM is difficult for two main reasons: path feasibility and path input sequence generation. The path feasibility problem concerns generating paths that are feasible, whereas the path input sequence generation problem is to find an input sequence that traverses a given feasible path. While search-based approaches have been used in test automation, relatively little work has applied them to testing from an EFSM. In this paper, we propose an integrated search-based approach to automate testing from an EFSM. The approach has two phases: the first phase produces a feasible TP (FTP), while the second phase searches for an input sequence to trigger this TP. The first phase uses a Genetic Algorithm whose fitness function is a TP feasibility metric based on dataflow dependence. The second phase uses a Genetic Algorithm whose fitness function combines a branch distance function with the approach level. Experimental results using five EFSMs found the first phase to be effective in generating FTPs, with a success rate of approximately 96.6%. Furthermore, the proposed input sequence generator could trigger all of the generated feasible TPs (success rate = 100%). The results demonstrate that the proposed approach is effective in automating testing from an EFSM.
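
    To make the second-phase objective concrete, the following is a minimal Python sketch of the standard combination of approach level and normalized branch distance used in search-based testing; the trace structure and all names are illustrative assumptions rather than the paper's implementation.

```python
# Illustrative sketch of a search-based fitness for triggering a transition path.
# The combination of approach level and normalized branch distance follows the
# common formulation in search-based testing; all names here are hypothetical.

def normalized_branch_distance(actual, expected):
    """Map the distance |actual - expected| at the diverging guard into [0, 1)."""
    d = abs(actual - expected)
    return d / (d + 1.0)

def fitness(trace, target_path):
    """Lower is better: 0 means the input sequence traversed the whole path.

    trace        -- transitions actually taken, plus the guard values observed
                    at the point of divergence (hypothetical structure).
    target_path  -- the feasible transition path produced by phase one.
    """
    # Approach level: how many transitions of the target path remain untaken.
    taken = 0
    for expected_t, actual_t in zip(target_path, trace.transitions):
        if expected_t != actual_t:
            break
        taken += 1
    approach_level = len(target_path) - taken
    if approach_level == 0:
        return 0.0
    # Branch distance at the first guard that was not satisfied.
    actual, expected = trace.diverging_guard_values
    return approach_level + normalized_branch_distance(actual, expected)
```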

    Discovering Communities of Community Discovery

    Discovering communities in complex networks means grouping nodes that are similar to each other, in order to uncover latent information about them. There are hundreds of different algorithms for the community detection task, each with its own understanding and definition of what a "community" is. Dozens of review works attempt to order such a diverse landscape -- classifying community discovery algorithms by the process they employ to detect communities, by their explicitly stated definition of community, or by their performance on a standardized task. In this paper, we classify community discovery algorithms according to a fourth criterion: the similarity of their results. We create an Algorithm Similarity Network (ASN), whose nodes are the community detection approaches, connected if they return similar groupings. We then perform community detection on this network, grouping algorithms that consistently return the same partitions or overlapping covers across more than one thousand synthetic and real-world networks. This paper is an attempt to create a similarity-based classification of community detection algorithms based on empirical data. It improves over the state of the art by comparing more than seventy approaches, and it finds that the ASN contains well-separated groups, making it a sensible tool for practitioners and aiding their choice of algorithms that fit their analytic needs.
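
    As an illustration of how such an Algorithm Similarity Network could be assembled from empirical outputs, the sketch below compares the partitions each algorithm returns on a set of benchmark graphs; the NMI similarity measure, the 0.8 edge threshold, and the modularity-based grouping step are illustrative assumptions, not the paper's choices.

```python
# Sketch: build an Algorithm Similarity Network (ASN) from the partitions each
# algorithm returns on a set of benchmark graphs, then group the algorithms.
# Similarity measure (NMI) and edge threshold are illustrative assumptions.
from itertools import combinations

import networkx as nx
from sklearn.metrics import normalized_mutual_info_score

def build_asn(results, threshold=0.8):
    """results: {algorithm_name: [labels_on_graph_1, labels_on_graph_2, ...]},
    where each labels_* is a list of community ids, one per node."""
    asn = nx.Graph()
    asn.add_nodes_from(results)
    for a, b in combinations(results, 2):
        # Average pairwise similarity of the partitions over all benchmark graphs.
        sims = [normalized_mutual_info_score(pa, pb)
                for pa, pb in zip(results[a], results[b])]
        score = sum(sims) / len(sims)
        if score >= threshold:
            asn.add_edge(a, b, weight=score)
    return asn

def algorithm_groups(asn):
    """Communities of community-discovery algorithms: detect groups on the ASN
    (greedy modularity is just one convenient choice available in networkx)."""
    return list(nx.algorithms.community.greedy_modularity_communities(asn, weight="weight"))
```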

    eCrash: a framework for performing evolutionary testing on third-party Java components

    This paper presents a tool for generating test data using evolutionary search techniques, based on the information provided by structural analysis and interpretation of the Java bytecode of third-party Java components, and on the dynamic execution of the instrumented test object. The main objective of this approach is to evolve a set of test cases that yields full structural code coverage of the test object. Such a test set can be used to perform the testing activity effectively, providing confidence in the quality and robustness of the test object. The rationale for working at the bytecode level is that, even when the source code is unavailable, structural testing requirements can still be derived and used to assess the quality of a test set and to guide the evolutionary search towards specific test goals.
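
    A minimal sketch of the set-level objective described above, assuming a hypothetical hook into the instrumented test object that reports which bytecode-derived structural requirements a test case exercises; all names are illustrative, not eCrash's actual API.

```python
# Sketch of a coverage-driven fitness for an evolved test set: structural
# requirements (e.g. branches) are derived from the component's bytecode, each
# candidate test case is run on the instrumented test object, and fitness is
# the fraction of requirements covered. All names are illustrative.

def covered_requirements(test_set, run_instrumented):
    """run_instrumented(test_case) -> set of structural requirement ids hit
    (hypothetical hook into the bytecode-instrumented test object)."""
    covered = set()
    for test_case in test_set:
        covered |= run_instrumented(test_case)
    return covered

def coverage_fitness(test_set, all_requirements, run_instrumented):
    """all_requirements is a set; 1.0 means full structural coverage."""
    if not all_requirements:
        return 1.0
    hit = covered_requirements(test_set, run_instrumented) & all_requirements
    return len(hit) / len(all_requirements)
```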

    UniFuzz: Optimizing Distributed Fuzzing via Dynamic Centralized Task Scheduling

    Fuzzing is one of the most efficient technologies for vulnerability detection. Since the fuzzing process is computing-intensive and the performance gains from algorithm optimization are limited, recent research seeks to improve fuzzing performance by utilizing parallel computing. However, parallel fuzzing has to overcome challenges such as task conflicts, scalability in a distributed environment, synchronization overhead, and workload imbalance. In this paper, we design and implement UniFuzz, a distributed fuzzing optimization based on dynamic centralized task scheduling. UniFuzz evaluates and distributes seeds in a centralized manner to avoid task conflicts. It uses a "request-response" scheme to dynamically distribute fuzzing tasks, which avoids workload imbalance. Besides, UniFuzz can adaptively switch the role of computing cores between evaluating and fuzzing, which avoids the potential bottleneck of seed evaluation. To improve synchronization efficiency, UniFuzz shares different kinds of fuzzing information in different ways according to their characteristics, and the average synchronization overhead is only about 0.4%. We evaluated UniFuzz with real-world programs, and the results show that UniFuzz outperforms state-of-the-art tools such as AFL, PAFL, and EnFuzz. Most importantly, the experiments reveal a counter-intuitive result: parallel fuzzing can achieve super-linear acceleration over single-core fuzzing. We provide a detailed explanation and confirm it with additional experiments. UniFuzz also discovered 16 real-world vulnerabilities.
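
    The following sketch illustrates the "request-response" idea of centralized task scheduling in plain Python: the central node scores seeds and hands out work only when a core asks for it, so no two cores fuzz the same seed and idle cores are never starved. This is an illustrative assumption of how such a scheduler could look, not UniFuzz's actual implementation.

```python
# Illustrative "request-response" centralized scheduler for distributed fuzzing.
import heapq
import itertools
import threading

class CentralScheduler:
    def __init__(self):
        self._lock = threading.Lock()
        self._queue = []                     # max-heap via negated scores
        self._counter = itertools.count()    # tie-breaker so seeds are never compared

    def add_seed(self, seed, score):
        """Called after centralized seed evaluation assigns a score to a seed."""
        with self._lock:
            heapq.heappush(self._queue, (-score, next(self._counter), seed))

    def request_task(self):
        """A fuzzing core calls this when idle; returns the most promising pending seed."""
        with self._lock:
            if self._queue:
                _, _, seed = heapq.heappop(self._queue)
                return seed
            return None    # no pending work: the core may switch to seed evaluation

    def report_result(self, scored_seeds):
        """The core's 'response': interesting inputs found while fuzzing, with scores."""
        for seed, score in scored_seeds:
            self.add_seed(seed, score)
```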

    Augmenting American Fuzzy Lop to Increase the Speed of Bug Detection

    Whitebox fuzz testing is a vital part of the software testing process in the software development life cycle (SDLC), used for both bug detection and security vulnerability checking. However, current tools lack the ability to detect all bugs and cover the entire code under test in a reasonable time. This study explores several whitebox fuzzing techniques and tools currently in use (AFL, SAGE, Driller, etc.), followed by a discussion of their strategies and the challenges facing them. One of the most popular state-of-the-art fuzzers, American Fuzzy Lop (AFL), is discussed in detail, and modifications are proposed to reduce the time it requires when running in QEMU emulation mode. The study found that the AFL fuzzer can be sped up by injecting an intermediary layer of code into the Tiny Code Generator (TCG) that helps translate blocks between the two architectures used for testing. The modified version of AFL found, on average, 1.6 more crashes than basic AFL running in QEMU mode. The study then recommends future research avenues in the form of hybrid techniques to resolve the challenges faced by state-of-the-art fuzzers and to create an optimal fuzzing tool. The motivation behind the study is to optimize the fuzzing process in order to reduce the time taken to perform software testing and to produce robust, error-free software products.
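
    The speed-up rests on not re-translating the same guest basic block on every execution. The toy sketch below illustrates only that caching idea; the real change is made in C inside QEMU's TCG, and everything here is an illustrative assumption rather than the modification described in the study.

```python
# Conceptual illustration of translation-block caching: the expensive
# guest-to-host translation step runs once per block address and is reused
# on subsequent executions. Purely illustrative; not QEMU/TCG code.

translation_cache = {}   # guest block address -> host code produced once

def translate_block(guest_pc, translate):
    """translate(guest_pc) stands in for the costly guest->host translation."""
    if guest_pc not in translation_cache:
        translation_cache[guest_pc] = translate(guest_pc)
    return translation_cache[guest_pc]
```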

    Ontology-based transformation of natural language queries into SPARQL queries by evolutionary algorithms

    Get PDF
    In this thesis, an ontology-driven evolutionary learning system for natural language querying of RDF graphs is presented. The learning system does not answer the query itself, but generates a SPARQL query against the database. For this purpose, the Evolutionary Dataflow Agents framework is introduced, a general learning framework that, based on evolutionary algorithms, creates agents that learn to solve a problem. The main idea of the framework is to support problems that combine a medium-sized search space (use case: analysis of natural language queries) of strictly, formally structured solutions (use case: synthesis of database queries) with rather local, classical structural and algorithmic aspects. For this, the agents combine local algorithmic functionality of nodes with a flexible dataflow between the nodes into a global problem-solving process. Roughly, there are nodes that generate informational fragments by combining input data and/or earlier fragments, often using heuristics-based guessing. Other nodes combine, collect, and reduce such fragments towards possible solutions, narrowing these down towards the unique final solution. For this, informational items flow through the agents. The configuration of these agents -- which nodes they combine, and where exactly the data items flow -- is subject to learning. The training starts with simple agents which, as usual in learning frameworks, solve a set of tasks and are evaluated on them. Since the produced answers usually have complex structures, the framework employs a novel fine-grained energy-based evaluation and selection step. The selected agents then form the basis for the population of the next round. Evolution is provided, as usual, by mutations and agent fusion. As a use case, EvolNLQ has been implemented, a system for answering natural language queries against RDF databases. For this, the underlying ontology metadata is (externally) algorithmically preprocessed. For the agents, appropriate data item types and node types are defined that break down the processes of language analysis and query synthesis into more or less elementary operations. The "size" of the operations is determined by the border between computations, i.e., purely algorithmic steps (implemented in individual powerful nodes) and simple heuristic steps (also realized by simple nodes), and free dataflow allowing for arbitrary chaining and branching configurations of the agents. EvolNLQ is compared with some other approaches, showing competitive results.
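
    A minimal sketch of the evolutionary loop described above, under the assumption that an agent is a configuration of nodes and dataflow edges, and that an "energy" function scores how much of a task's expected answer structure an agent produces; all names, operators, and parameters are illustrative, not the framework's actual interfaces.

```python
# Illustrative evolutionary loop over dataflow agents with energy-based selection.
import random

def evolve(initial_agents, tasks, energy, mutate, fuse, rounds=50, keep=20):
    """energy(agent, task) -> float in [0, 1]; higher means a better partial answer.
    mutate(agent) and fuse(agent_a, agent_b) produce new agent configurations."""
    population = list(initial_agents)
    for _ in range(rounds):
        # Fine-grained, energy-based evaluation over all training tasks.
        scored = [(sum(energy(a, t) for t in tasks), a) for a in population]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        survivors = [a for _, a in scored[:keep]]
        # Evolution as usual: mutations and agent fusion seed the next round.
        offspring = [mutate(random.choice(survivors)) for _ in range(keep)]
        fused = [fuse(random.choice(survivors), random.choice(survivors))
                 for _ in range(keep)]
        population = survivors + offspring + fused
    return max(population, key=lambda a: sum(energy(a, t) for t in tasks))
```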