55 research outputs found

    Flexible State-Merging for learning (P)DFAs in Python

    We present a Python package for learning (non-)probabilistic deterministic finite state automata and provide heuristics in the red-blue framework. Because our package is built around the API of the popular scikit-learn package, it is easy to use and new learning methods are easy to add. It offers data scientists PDFA learning as an additional tool for sequence prediction or classification, without requiring them to understand the algorithm itself, only the limitations of PDFAs as a model. With applications of automata learning in fields as diverse as network traffic analysis, software engineering, and biology, such a package opens opportunities for practitioners.
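
    The abstract does not spell out the package's class names, so the following is only a minimal sketch of the scikit-learn-style fit/predict interface such a learner could expose; the estimator is a hypothetical stand-in that counts symbol frequencies per class instead of performing real red-blue state merging.

        # Hypothetical sketch of a scikit-learn-compatible sequence classifier.
        # A real (P)DFA learner would build a prefix tree and merge states
        # (red-blue framework); this stand-in only counts symbol frequencies.
        from collections import Counter, defaultdict

        from sklearn.base import BaseEstimator, ClassifierMixin


        class ToySequenceClassifier(BaseEstimator, ClassifierMixin):
            def fit(self, X, y):
                # X: list of symbol sequences, y: class labels.
                self.counts_ = defaultdict(Counter)
                for seq, label in zip(X, y):
                    self.counts_[label].update(seq)
                return self

            def predict(self, X):
                def score(seq, label):
                    total = sum(self.counts_[label].values()) or 1
                    return sum(self.counts_[label][s] / total for s in seq)

                return [max(self.counts_, key=lambda lbl: score(seq, lbl)) for seq in X]


        clf = ToySequenceClassifier().fit([["a", "b", "a"], ["b", "b"]], [1, 0])
        print(clf.predict([["a", "a"]]))  # -> [1]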

    Reductions of Automata Used in Network Traffic Filtering

    The aim of this work is to propose scalable methods for reducing non-deterministic finite automata (NFAs) used in network traffic filtering. We introduce two approaches to NFA reduction based on state elimination. To achieve a substantial reduction of the automata, we use language non-preserving techniques with a primary focus on language over-approximation, since language-preserving methods may not provide sufficient reduction. We implemented the methods and evaluated the accuracy of the reduced automata on real traffic samples. Our approach does not provide any formal guarantee with respect to unseen input traffic, but it can be applied smoothly to automata of any size; this is a significant problem for existing methods, whose very high time complexity prevents them from being applied to really large automata.
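
    As a generic illustration of why over-approximation arises naturally in this setting (this is not the thesis's reduction method): redirecting an eliminated NFA state into a surviving one can only enlarge the accepted language, so a reduced filter may flag extra traffic but never misses traffic the original automaton matched. A minimal sketch:

        # Merge state `drop` into state `keep`; the resulting NFA accepts a
        # superset of the original language (an over-approximation).
        def merge_states(delta, accepting, keep, drop):
            merged = {}
            for (state, symbol), targets in delta.items():
                src = keep if state == drop else state
                dst = {keep if t == drop else t for t in targets}
                merged.setdefault((src, symbol), set()).update(dst)
            return merged, {keep if s == drop else s for s in accepting}

        def accepts(delta, accepting, start, word):
            current = {start}
            for symbol in word:
                nxt = set()
                for state in current:
                    nxt |= delta.get((state, symbol), set())
                current = nxt
            return bool(current & accepting)

        # NFA accepting exactly "ab": 0 -a-> 1 -b-> 2 (accepting).
        delta, accepting = {(0, "a"): {1}, (1, "b"): {2}}, {2}
        small_delta, small_accepting = merge_states(delta, accepting, keep=0, drop=1)
        print(accepts(delta, accepting, 0, "ab"))               # True
        print(accepts(small_delta, small_accepting, 0, "ab"))   # still True
        print(accepts(small_delta, small_accepting, 0, "b"))    # True: over-approximation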

    Facilitating High Performance Code Parallelization

    With the surge of social media on one hand, and the ease of obtaining information from cheap sensing devices and open-source APIs on the other, the amount of data to be processed is increasing rapidly. In addition, the world of computing has recently witnessed a growing shift towards massively parallel distributed systems, driven by the increasing importance of transforming data into knowledge in today's data-driven world. At the core of data analysis for all sorts of applications lies pattern matching, so parallelizing pattern matching algorithms efficiently is essential to cater to this ever-increasing abundance of data. We propose a method that automatically detects a user's single-threaded function call that searches for a pattern using Java's standard regular expression library, and replaces it with our own data-parallel implementation using Java bytecode injection. Our approach facilitates parallel processing on different platforms, including shared-memory systems (using multithreading and NVIDIA GPUs) and distributed systems (using MPI and Hadoop). The major contributions of our implementation are reducing execution time while remaining transparent to the user. In the same spirit of facilitating high-performance code parallelization, we also present a tool that automatically generates Spark Java code from minimal user-supplied inputs. Spark has emerged as the tool of choice for efficient big data analysis, yet users still have to learn the complicated Spark API to write even a simple application. Our tool is easy to use, interactive, and offers the performance of Spark's native Java API. To the best of our knowledge, at the time of this writing no such tool has yet been implemented.
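
    The bytecode injection and the GPU/MPI/Hadoop back ends are beyond a short excerpt, but the underlying data-parallel idea can be sketched. The Python below is only an illustration of that chunking scheme (not the dissertation's Java implementation): the input is split into overlapping chunks so that a match crossing a chunk boundary is still found by the chunk in which it starts.

        import re
        from concurrent.futures import ProcessPoolExecutor

        def _search_chunk(args):
            text, offset, own_len, pattern = args
            # Report only matches that *start* in this chunk's own region; the
            # overlap tail lets such matches run past the chunk boundary.
            return [(m.start() + offset, m.group())
                    for m in re.finditer(pattern, text)
                    if m.start() < own_len]

        def parallel_findall(text, pattern, workers=4, overlap=64):
            """Chunked regex search; assumes no match is longer than `overlap`."""
            size = max(1, len(text) // workers)
            tasks = [(text[i:i + size + overlap], i, min(size, len(text) - i), pattern)
                     for i in range(0, len(text), size)]
            with ProcessPoolExecutor(max_workers=workers) as pool:
                return sorted(hit for part in pool.map(_search_chunk, tasks) for hit in part)

        if __name__ == "__main__":
            print(parallel_findall("abc123def456 " * 5, r"\d+")[:4])
            # -> [(3, '123'), (9, '456'), (16, '123'), (22, '456')]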

    Tools and Algorithms for the Construction and Analysis of Systems

    This open access two-volume set constitutes the proceedings of the 27th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, TACAS 2021, which was held during March 27 – April 1, 2021, as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2021. The conference was planned to take place in Luxembourg but changed to an online format due to the COVID-19 pandemic. A total of 41 full papers presented in the proceedings were carefully reviewed and selected from 141 submissions. The volumes also contain 7 tool papers, 6 tool demo papers, and 9 SV-Comp competition papers. The papers are organized in the following topical sections. Part I: Game Theory; SMT Verification; Probabilities; Timed Systems; Neural Networks; Analysis of Network Communication. Part II: Verification Techniques (not SMT); Case Studies; Proof Generation/Validation; Tool Papers; Tool Demo Papers; SV-Comp Tool Competition Papers.

    The BG News August 26, 2005

    The BGSU campus student newspaper, August 26, 2005. Volume 96, Issue 6. https://scholarworks.bgsu.edu/bg-news/8463/thumbnail.jp

    Improving Programming Support for Hardware Accelerators Through Automata Processing Abstractions

    The adoption of hardware accelerators, such as Field-Programmable Gate Arrays, into general-purpose computation pipelines continues to rise, driven by recent trends in data collection and analysis as well as pressure from challenging physical design constraints in hardware. The architectural designs of many of these accelerators stand in stark contrast to the traditional von Neumann model of CPUs. Consequently, existing programming languages, maintenance tools, and techniques are not directly applicable to these devices, meaning that additional architectural knowledge is required for effective programming and configuration. Current programming models and techniques are akin to assembly-level programming on a CPU, thus placing significant burden on developers tasked with using these architectures. Because programming is currently performed at such low levels of abstraction, the software development process is tedious and challenging and hinders the adoption of hardware accelerators. This dissertation explores the thesis that theoretical finite automata provide a suitable abstraction for bridging the gap between high-level programming models and maintenance tools familiar to developers and the low-level hardware representations that enable high-performance execution on hardware accelerators. We adopt a principled hardware/software co-design methodology to develop a programming model providing the key properties that we observe are necessary for success, namely performance and scalability, ease of use, expressive power, and legacy support. First, we develop a framework that allows developers to port existing, legacy code to run on hardware accelerators by leveraging automata learning algorithms in a novel composition with software verification, string solvers, and high-performance automata architectures. Next, we design a domain-specific programming language to aid programmers writing pattern-searching algorithms and develop compilation algorithms to produce finite automata, which supports efficient execution on a wide variety of processing architectures. Then, we develop an interactive debugger for our new language, which allows developers to accurately identify the locations of bugs in software while maintaining support for high-throughput data processing. Finally, we develop two new automata-derived accelerator architectures to support additional applications, including the detection of security attacks and the parsing of recursive and tree-structured data. Using empirical studies, logical reasoning, and statistical analyses, we demonstrate that our prototype artifacts scale to real-world applications, maintain manageable overheads, and support developers' use of hardware accelerators. Collectively, the research efforts detailed in this dissertation help ease the adoption and use of hardware accelerators for data analysis applications, while supporting high-performance computation.
    Ph.D., Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/155224/1/angstadt_1.pd

    Runtime Monitoring for Uncertain Times

    In Runtime Verification (RV), monitors check programs for correct operation at execution time. Also called Runtime Monitoring, RV offers advantages over other approaches to program verification. Efficient monitoring is possible for programs where static checking is cost-prohibitive. Runtime monitors may test for execution faults like hardware failure, as well as logical faults. Unlike simple log checking, monitors are typically constructed using formal languages and methods that precisely define expectations and guarantees. Despite the advantages of RV, however, adoption remains low. Applying Runtime Monitoring techniques to real systems requires addressing practical concerns that have garnered little attention from researchers. System operators need monitors that provide immediate diagnostic information before and after failures, that are simple to operate over distributed systems, and that remain reliable when communication is not. These challenges are solvable, and solving them is a necessary step towards widespread RV deployment. This thesis provides solutions to these and other barriers to practical Runtime Monitoring. We address the need for reporting diagnostic information from monitored programs with nfer, a language and system for event stream abstraction. Nfer supports the automatic extraction of the structure of real-time software and includes integrations with popular programming languages. We also provide for the operation of nfer and other monitoring tools over distributed systems with Palisade, a framework built for low-latency detection of embedded system anomalies. Finally, we supply a method to ensure program properties may be monitored despite unreliable communication channels. We classify monitorable properties over general unreliable conditions and define an algorithm for when more specific conditions are known
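
    To make the notion of a runtime monitor concrete (a generic sketch, not nfer's interval-based language or Palisade's distributed machinery): a monitor is a small state machine that consumes the event stream as the program executes and reports the first violation of a stated property.

        # Monitor the property "every `request` is answered by a `response`
        # before `shutdown`" over a stream of events.
        def monitor(events):
            pending = 0  # outstanding requests
            for i, event in enumerate(events):
                if event == "request":
                    pending += 1
                elif event == "response" and pending:
                    pending -= 1
                elif event == "shutdown" and pending:
                    return f"violation at event {i}: {pending} request(s) unanswered"
            return "ok"

        print(monitor(["request", "response", "shutdown"]))             # ok
        print(monitor(["request", "request", "response", "shutdown"]))  # violation at event 3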

    Proceedings of the 22nd Conference on Formal Methods in Computer-Aided Design – FMCAD 2022

    The Conference on Formal Methods in Computer-Aided Design (FMCAD) is an annual conference on the theory and applications of formal methods in hardware and system verification. FMCAD provides a leading forum to researchers in academia and industry for presenting and discussing groundbreaking methods, technologies, theoretical results, and tools for reasoning formally about computing systems. FMCAD covers formal aspects of computer-aided system design including verification, specification, synthesis, and testing.

    Quality Improvement and Validation Techniques on Software Specification and Design

    Ph.D., Doctor of Philosophy