From computability to executability : a process-theoretic view on automata theory
The theory of automata and formal language was devised in the 1930s to provide models for and to reason about computation. Here we mean by computation a procedure that transforms input into output, which was the sole mode of operation of computers at the time. Nowadays, computers are systems that interact with us and also each other; they are non-deterministic, reactive systems. Concurrency theory, split off from classical automata theory a few decades ago, provides a model of computation similar to the model given by the theory of automata and formal language, but focuses on concurrent, reactive and interactive systems. This thesis investigates the integration of the two theories, exposing the differences and similarities between them. Where automata and formal language theory focuses on computations and languages, concurrency theory focuses on behaviour. To achieve integration, we look for process-theoretic analogies of classic results from automata theory. The most prominent difference is that we use an interpretation of automata as labelled transition systems modulo (divergence-preserving) branching bisimilarity instead of treating automata as language acceptors. We also consider similarities such as grammars as recursive specifications and finite automata as labelled finite transition systems. We investigate whether the classical results still hold and, if not, what extra conditions are sufficient to make them hold. We especially look into three levels of Chomsky's hierarchy: we study the notions of finite-state systems, pushdown systems, and computable systems. Additionally we investigate the notion of parallel pushdown systems. For each class we define the central notion of automaton and its behaviour by associating a transition system with it. Then we introduce a suitable specification language and investigate the correspondence with the respective automaton (via its associated transition system). 
Because we want to study not only interaction with the environment but also interaction within the automaton, we make the latter explicit by means of communicating parallel components: one component representing the finite control of the automaton and one component representing the memory. First, we study finite-state systems by reinvestigating the relation between finite-state automata, left- and right-linear grammars, and regular expressions, but now up to (divergence-preserving) branching bisimilarity. For pushdown systems we augment the finite-state systems with stack memory to obtain the pushdown automata and consider different termination styles: termination on empty stack, on final state, and on final state and empty stack. Unlike for language equivalence, up to (divergence-preserving) branching bisimilarity the associated transition systems for the different termination styles fall into different classes. We obtain (under some restrictions) the correspondence between context-free grammars and pushdown automata for termination on final state and empty stack. We show how for contrasimulation, a weaker equivalence than branching bisimilarity, we can obtain the correspondence result without some of the restrictions. Finally, we make the interaction within a pushdown automaton explicit, but in a different way depending on the termination style. By analogy with pushdown systems we investigate the parallel pushdown systems, obtained by augmenting finite-state systems with bag memory, and consider analogous termination styles. We investigate the correspondence between context-free grammars that use parallel composition instead of sequential composition and parallel pushdown automata. While the correspondence itself is rather tight, it unfortunately covers only a small subset of the parallel pushdown automata, namely the single-state parallel pushdown automata.
When making the interaction within parallel pushdown automata explicit, we obtain a rather uniform result for all termination styles. Finally, we study computable systems and their relation with executable and computable transition systems and Turing machines. For this we present the reactive Turing machine, a classical Turing machine augmented with capabilities for interaction. Again, we make the interaction in the reactive Turing machine between its finite control and the tape memory explicit.
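The three termination styles named in the abstract can be illustrated with a minimal sketch. It works at the level of accepted words only, not up to branching bisimilarity as the thesis does, and the function name `accepts`, the transition encoding, the bottom marker `Z`, and the example automaton are all assumptions made for illustration, not taken from the thesis.

```python
from collections import deque

def accepts(transitions, start, finals, word, style, bottom="Z"):
    """Check whether a pushdown automaton accepts `word` under a termination style.

    `transitions` maps (state, label, top) to a list of (next_state, push),
    where label is an input symbol or "" for a silent step, `top` is the
    popped stack symbol and `push` is a tuple pushed in its place (first
    element becomes the new top). This encoding is hypothetical.
    `style` is "final_state", "empty_stack" or "both".
    The search assumes silent steps cannot grow the stack without bound.
    """
    initial = (start, 0, (bottom,))
    seen = {initial}
    queue = deque([initial])
    while queue:
        state, pos, stack = queue.popleft()
        if pos == len(word):
            in_final = state in finals
            empty = not stack
            if style == "final_state" and in_final:
                return True
            if style == "empty_stack" and empty:
                return True
            if style == "both" and in_final and empty:
                return True
        if not stack:
            continue  # no transition is possible on an empty stack
        top, rest = stack[0], stack[1:]
        labels = [""]
        if pos < len(word):
            labels.append(word[pos])
        for label in labels:
            for nxt, push in transitions.get((state, label, top), []):
                config = (nxt, pos + len(label), tuple(push) + rest)
                if config not in seen:
                    seen.add(config)
                    queue.append(config)
    return False

# Example automaton (an assumption): counts a's on the stack, pops one per b.
EXAMPLE = {
    ("q0", "a", "Z"): [("q0", ("A", "Z"))],
    ("q0", "a", "A"): [("q0", ("A", "A"))],
    ("q0", "b", "A"): [("q1", ())],
    ("q1", "b", "A"): [("q1", ())],
    ("q1", "", "Z"): [("q2", ())],  # silent step that removes the bottom marker
}
FINALS = {"q1", "q2"}
```

With this example automaton, the word `aab` ends in the final state `q1` with a non-empty stack, so it is accepted on final state but not on empty stack, whereas `aabb` is accepted under all three styles.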
IDE for SCADA Development at CERN
The goal of this master's thesis is to design and implement an IDE (Integrated Development Environment) that makes development for SIMATIC WinCC Open Architecture more effective and secure. The thesis is based on research conducted by a team at Eindhoven University of Technology and meets the needs of the CERN EN ICE SCD section. The developed IDE is built on top of the Eclipse Platform and uses Xtext for code parsing, scoping, linking and static code analysis. The IDE also supports a new programming language that allows programmers to easily define templates for WinCC OA configuration files. The interpreter of this new language is able to parse a template and a configuration file and decide whether the configuration file matches the template. The practical result of this thesis is an IDE that supports WinCC OA developers at CERN and periodically analyzes CERN code written in the Control script language.
FINGERPRINTING MALICIOUS IP TRAFFIC
In the new global economy, cyber-attacks have become a central issue. The detection, mitigation and attribution of such cyber-attacks require efficient and practical techniques to fingerprint malicious IP traffic. By fingerprinting, we refer to: (1) the detection of malicious network flows and (2) the attribution of the detected flows to the malware families that generate them. In this thesis, we first address the detection problem and solve it by using a classification technique. This technique uses features that exploit only high-level properties of traffic flows and therefore does not rely on deep packet inspection. As such, it is effective even in the presence of encrypted traffic. Second, whenever a malicious flow is detected, we propose another technique to attribute the flow to the malware family that generated it. The attribution technique is built upon k-means clustering, sequence mining and Pushdown Automata (PDAs) to capture the network behaviors of malware family groups. The generated PDAs thus serve as network signatures for malware family groups. Our results show that the proposed detection and attribution techniques achieve high accuracy with few false alerts (positive and negative).
Automated Testing and Debugging for Big Data Analytics
The prevalence of big data analytics in almost every large-scale software system has generated a substantial push to build data-intensive scalable computing (DISC) frameworks such as Google MapReduce and Apache Spark that can fully harness the power of existing data centers. However, frameworks once used by domain experts are now being leveraged by data scientists, business analysts, and researchers. This shift in user demographics calls for immediate advancements in the development, debugging, and testing practices of big data applications, which are falling behind compared to the DISC framework design and implementation. In practice, big data applications often fail as users are unable to test all behaviors emerging from interleaving dataflow operators, user-defined functions, and framework code. "Testing based on a random sample" rarely guarantees reliability, and "trial and error" and "print" debugging methods are expensive and time-consuming. Thus, the current practice of developing a big data application must be improved, and the tools built to enhance the developer's productivity must adapt to the distinct characteristics of data-intensive scalable computing. By synthesizing ideas from software engineering and database systems, our hypothesis is that we can design effective and scalable testing and debugging algorithms for big data analytics without compromising the performance and efficiency of the underlying DISC framework. To design such techniques, we investigate how we can build interactive and responsive debugging primitives that significantly reduce the debugging time, yet do not pose much performance overhead on big data applications. Furthermore, we investigate how we can leverage data provenance techniques from databases and fault-isolation algorithms from software engineering to pinpoint the minimal subset of failure-inducing inputs efficiently.
To improve the reliability of big data analytics, we investigate how we can abstract the semantics of dataflow operators and use them in tandem with the semantics of user-defined functions to generate a minimum set of synthetic test inputs capable of revealing more defects than the entire input dataset. To examine the first hypothesis, we introduce interactive, real-time debugging primitives for big data analytics through innovative and scalable debugging features such as simulated breakpoint, dynamic watchpoint, and crash culprit identification. Second, we design a new automated fault localization approach that combines insights from both the software engineering and database literature to bring delta debugging closer to reality in big data applications by leveraging data provenance and by constructing systems optimizations for debugging provenance queries. Lastly, we devise a new symbolic-execution-based white-box testing algorithm for big data applications that abstracts the implementation of dataflow operators using logical specifications instead of modeling their implementations and combines them with the semantics of any arbitrary user-defined function. We instantiate the idea of an interactive debugging algorithm as BigDebug, the idea of an automated debugging algorithm as BigSift, and the idea of symbolic-execution-based testing as BigTest. Our investigation shows that the interactive debugging primitives can scale to terabytes: our record-level tracing incurs less than 25% overhead on average and provides up to 100% time saving compared to the baseline replay debugger. Second, we observe that by combining data provenance with delta debugging, we can identify the minimum faulty input in just under 30% of the original job execution time.
Lastly, we verify that by abstracting dataflow operators using logical specifications, we can efficiently generate the most concise test data suitable for local testing while revealing twice as many faults as prior approaches. Our investigations collectively demonstrate that developer productivity can be significantly improved through effective and scalable testing and debugging techniques for big data analytics, without impacting the DISC framework's performance. This dissertation affirms the feasibility of automated debugging and testing techniques for big data analytics, techniques that were previously considered infeasible for large-scale data processing.
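The delta debugging idea mentioned in the abstract can be sketched in a few lines. This is plain delta debugging on an in-memory list, without the data provenance step or the systems optimizations the abstract describes, and it is not BigSift's implementation; the names `ddmin` and `fails` are hypothetical, and `fails` is assumed to be a deterministic predicate that reports whether a job fails on a given subset of input records.

```python
def ddmin(inputs, fails):
    """Shrink `inputs` to a failing subset from which no single record can be removed.

    `fails(subset)` is a user-supplied, deterministic predicate (an assumption)
    that returns True when the job under test fails on `subset`.
    """
    if not fails(inputs):
        raise ValueError("the full input must reproduce the failure")
    n = 2  # current number of partitions
    while len(inputs) >= 2:
        chunk = max(1, len(inputs) // n)
        subsets = [inputs[i:i + chunk] for i in range(0, len(inputs), chunk)]
        reduced = False
        for i, subset in enumerate(subsets):
            complement = [x for j, s in enumerate(subsets) if j != i for x in s]
            if fails(subset):
                # The failure is reproduced by one partition alone.
                inputs, n, reduced = subset, 2, True
                break
            if fails(complement):
                # The failure survives removing one partition.
                inputs, n, reduced = complement, max(n - 1, 2), True
                break
        if not reduced:
            if n >= len(inputs):
                break  # already at single-record granularity
            n = min(len(inputs), n * 2)  # refine the partitioning
    return inputs

# Example: the job fails only when records 3 and 17 are both present.
culprits = ddmin(list(range(20)), lambda xs: 3 in xs and 17 in xs)
```

Each call to `fails` corresponds to re-running the job on a subset, which is why, on large datasets, narrowing the candidate inputs before the search matters.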
An Analytical Approach to Programs as Data Objects
This essay accompanies a selection of 32 articles (referred to in bold face in the text and marginally marked in the bibliographic references) submitted to Aarhus University towards a Doctor Scientiarum degree in Computer Science. The author's previous academic degree, beyond a doctoral degree in June 1986, is an "Habilitation à diriger les recherches" from the Université Pierre et Marie Curie (Paris VI) in France; the corresponding material was submitted in September 1992 and the degree was obtained in January 1993. The present 32 articles have all been written since 1993 and while at DAIMI. Except for one other PhD student, all co-authors are or have been the author's students here in Aarhus.
Foundations of Software Science and Computation Structures
This open access book constitutes the proceedings of the 24th International Conference on Foundations of Software Science and Computation Structures, FOSSACS 2021, which was held from March 27 to April 1, 2021, as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2021. The conference was planned to take place in Luxembourg and changed to an online format due to the COVID-19 pandemic. The 28 regular papers presented in this volume were carefully reviewed and selected from 88 submissions. They deal with research on theories and methods to support the analysis, integration, synthesis, transformation, and verification of programs and software systems.