68 research outputs found

    Left Recursion in Parsing Expression Grammars

    Full text link
    Parsing Expression Grammars (PEGs) are a formalism that can describe all deterministic context-free languages through a set of rules that specify a top-down parser for some language. PEGs are easy to use, and there are efficient implementations of PEG libraries in several programming languages. A frequently missed feature of PEGs is left recursion, which is commonly used in Context-Free Grammars (CFGs) to encode left-associative operations. We present a simple conservative extension to the semantics of PEGs that gives useful meaning to direct and indirect left-recursive rules, and show that our extensions make it easy to express left-recursive idioms from CFGs in PEGs, with similar results. We prove the conservativeness of these extensions, and also prove that they work with any left-recursive PEG. PEGs can also be compiled to programs in a low-level parsing machine. We present an extension to the semantics of the operations of this parsing machine that let it interpret left-recursive PEGs, and prove that this extension is correct with regards to our semantics for left-recursive PEGs.Comment: Extended version of the paper "Left Recursion in Parsing Expression Grammars", that was published on 2012 Brazilian Symposium on Programming Language

    PARALLELIZING TIME-SERIES SESSION DATA ANALYSIS WITH A TYPE-ERASURE BASED DSEL

    Get PDF
    The Science Information Network (SINET) is a Japanese academic backbone network.  SINET consists of more than 800 universities and research institutions.  In the operation of a huge academic backbone network, more flexible querying technology is required to cope with massive time series session data and analysis of sophisticated cyber-attacks. This paper proposes a parallelizing DSEL (Domain Specific Embedded Language) processing for huge time-series session data. In our DESL, the function object is implemented by type erasure for constructing internal DSL for processing time-series data. Type erasure enables our parser to store function pointer and function object into the same *void type with class templates. We apply to scatter/gather pattern for concurrent DSEL parsing. Each thread parses DSEL to extract the tuple timestamp, source IP, and destination IP in the gather phase. In the scattering phase, we use a concurrent hash map to handle multiple thread outputs with our DSEL. In the experiment, we have measured the elapsed time in parsing and inserting IPv4 address and timestamp data format ranging from 1,000 to 50,000 lines with 24-row items. We have also measured CPU idle time in processing 100,000,000 lines of session data with 5, 10 and 20 multiple threads. It has been turned out that the proposed method can work in feasible computing time in both cases

    Well-Formed and Scalable Invasive Software Composition

    Get PDF
    Software components provide essential means to structure and organize software effectively. However, frequently, required component abstractions are not available in a programming language or system, or are not adequately combinable with each other. Invasive software composition (ISC) is a general approach to software composition that unifies component-like abstractions such as templates, aspects and macros. ISC is based on fragment composition, and composes programs and other software artifacts at the level of syntax trees. Therefore, a unifying fragment component model is related to the context-free grammar of a language to identify extension and variation points in syntax trees as well as valid component types. By doing so, fragment components can be composed by transformations at respective extension and variation points so that always valid composition results regarding the underlying context-free grammar are yielded. However, given a language’s context-free grammar, the composition result may still be incorrect. Context-sensitive constraints such as type constraints may be violated so that the program cannot be compiled and/or interpreted correctly. While a compiler can detect such errors after composition, it is difficult to relate them back to the original transformation step in the composition system, especially in the case of complex compositions with several hundreds of such steps. To tackle this problem, this thesis proposes well-formed ISC—an extension to ISC that uses reference attribute grammars (RAGs) to specify fragment component models and fragment contracts to guard compositions with context-sensitive constraints. Additionally, well-formed ISC provides composition strategies as a means to configure composition algorithms and handle interferences between composition steps. Developing ISC systems for complex languages such as programming languages is a complex undertaking. Composition-system developers need to supply or develop adequate language and parser specifications that can be processed by an ISC composition engine. Moreover, the specifications may need to be extended with rules for the intended composition abstractions. Current approaches to ISC require complete grammars to be able to compose fragments in the respective languages. Hence, the specifications need to be developed exhaustively before any component model can be supplied. To tackle this problem, this thesis introduces scalable ISC—a variant of ISC that uses island component models as a means to define component models for partially specified languages while still the whole language is supported. Additionally, a scalable workflow for agile composition-system development is proposed which supports a development of ISC systems in small increments using modular extensions. All theoretical concepts introduced in this thesis are implemented in the Skeletons and Application Templates framework SkAT. It supports “classic”, well-formed and scalable ISC by leveraging RAGs as its main specification and implementation language. Moreover, several composition systems based on SkAT are discussed, e.g., a well-formed composition system for Java and a C preprocessor-like macro language. In turn, those composition systems are used as composers in several example applications such as a library of parallel algorithmic skeletons

    Grammars for Indentation-Sensitive Parsing

    Get PDF
    Adams\u27 extension of parsing expression grammars enables specifying indentation sensitivity using two non-standard grammar constructs - indentation by a binary relation and alignment. This paper is a theoretical study of Adams\u27 grammars. It proposes a step-by-step transformation of well-formed Adams\u27 grammars for elimination of the alignment construct from the grammar. The idea that alignment could be avoided was suggested by Adams but no process for achieving this aim has been described before. This paper also establishes general conditions that binary relations used in indentation constructs must satisfy in order to enable efficient parsing

    PyVerDetector: A Chrome Extension Detecting the Python Version of Stack Overflow Code Snippets

    Full text link
    Over the years, Stack Overflow (SO) has accumulated numerous code snippets, with developers going to SO for problem solutions and code references. However, in the case of the Python programming language, Python 3 is not necessarily backward compatible with Python 2. The major implication of this versioning problem is that code written in Python 2 may not be interpreted by Python 3 without modifications. This issue may affect the usability of Python code snippets on SO. We investigate how many Python code snippets on SO suffer from version compatibility issues, and find that about 10% of the snippets exhibit this problem. Moreover, of the code snippets that are interpretable only by Python 2 or Python 3, less than 17% are tagged with the Python version.In this paper, we present a Chrome extension called PyVerDetector. This extension allows the user to select a given version of Python and verifies whether the code snippets on a given SO question are compatible with the user's selected Python version, providing error messages if not. The tool parses snippets and can determine versioning errors due to differences in syntax and also provides the user with a list of Python versions capable of interpreting each code snippet.Yang S., Kanda T., Pizzolotto D., et al. PyVerDetector: A Chrome Extension Detecting the Python Version of Stack Overflow Code Snippets. IEEE International Conference on Program Comprehension 2023-May, 25 (2023); https://doi.org/10.1109/ICPC58990.2023.00013
    • 

    corecore