9 research outputs found

    Linear Matching of JavaScript Regular Expressions

    Full text link
    Modern regex languages have strayed far from well-understood traditional regular expressions: they include features that fundamentally transform the matching problem. In exchange for these features, modern regex engines at times suffer from exponential complexity blowups, a frequent source of denial-of-service vulnerabilities in JavaScript applications. Worse, regex semantics differ across languages, and the impact of these divergences on algorithmic design and worst-case matching complexity has seldom been investigated. This paper provides a novel perspective on JavaScript's regex semantics by identifying a larger-than-previously-understood subset of the language that can be matched with linear time guarantees. In the process, we discover several cases where state-of-the-art algorithms were either wrong (semantically incorrect), inefficient (suffering from superlinear complexity) or excessively restrictive (assuming certain features could not be matched linearly). We introduce novel algorithms to restore correctness and linear complexity. We further advance the state-of-the-art in linear regex matching by presenting the first nonbacktracking algorithms for matching lookarounds in linear time: one supporting captureless lookbehinds in any regex language, and another leveraging a JavaScript property to support unrestricted lookaheads and lookbehinds. Finally, we describe new time and space complexity tradeoffs for regex engines. All of our algorithms are practical: we validated them in a prototype implementation, and some have also been merged in the V8 JavaScript implementation used in Chrome and Node.js

    Backtracking reference stores

    Get PDF
    National audienceFrançois Pottier's unionFind library is parameterized over an underlying store of mutable references, and provides the usual references, transactional reference stores (for rolling back some changes in case of higher-level errors), and persistent reference stores. We extend this library with a new implementation of backtracking reference stores, to get a Union-Find implementation that efficiently supports arbitrary backtracking and also subsumes the transactional interface. Our backtracking reference stores are not specific to unionFind, they can be used to build arbitrary backtracking data structures. The natural implementation, using a journal to record all writes, provides amortized-constant-time operations with a space overhead linear in the number of store updates. A refined implementation reduces the memory overhead to be linear in the number of store cells updated, and gives performance that match non-backtracking references in practice

    Kindly Bent to Free Us

    Get PDF
    Systems programming often requires the manipulation of resources like file handles, network connections, or dynamically allocated memory. Programmers need to follow certain protocols to handle these resources correctly. Violating these protocols causes bugs ranging from type mismatches over data races to use-after-free errors and memory leaks. These bugs often lead to security vulnerabilities. While statically typed programming languages guarantee type soundness and memory safety by design, most of them do not address issues arising from improper handling of resources. An important step towards handling resources is the adoption of linear and affine types that enforce single-threaded resource usage. However, the few languages supporting such types require heavy type annotations. We present Affe, an extension of ML that manages linearity and affinity properties using kinds and constrained types. In addition Affe supports the exclusive and shared borrowing of affine resources, inspired by features of Rust. Moreover, Affe retains the defining features of the ML family: it is an impure, strict, functional expression language with complete principal type inference and type abstraction. Affe does not require any linearity annotations in expressions and supports common functional programming idioms.Comment: ICFP 202

    Programming Languages and Systems

    Get PDF
    This open access book constitutes the proceedings of the 31st European Symposium on Programming, ESOP 2022, which was held during April 5-7, 2022, in Munich, Germany, as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2022. The 21 regular papers presented in this volume were carefully reviewed and selected from 64 submissions. They deal with fundamental issues in the specification, design, analysis, and implementation of programming languages and systems

    Kindly bent to free us

    Get PDF
    International audienceSystems programming often requires the manipulation of resources like file handles, network connections, or dynamically allocated memory. Programmers need to follow certain protocols to handle these resources correctly. Violating these protocols causes bugs ranging from type mismatches over data races to use-after-free errors and memory leaks. These bugs often lead to security vulnerabilities. While statically typed programming languages guarantee type soundness and memory safety by design, most of them do not address issues arising from improper handling of resources. An important step towards handling resources is the adoption of linear and affine types that enforce single-threaded resource usage. However, the few languages supporting such types require heavy type annotations. We present Affe, an extension of ML that manages linearity and affinity properties using kinds and constrained types. In addition Affe supports the exclusive and shared borrowing of affine resources, inspired by features of Rust. Moreover, Affe retains the defining features of the ML family: it is an impure, strict, functional expression language with complete principal type inference and type abstraction. Affe does not require any linearity annotations in expressions and supports common functional programming idioms

    Programming Languages and Systems

    Get PDF
    This open access book constitutes the proceedings of the 31st European Symposium on Programming, ESOP 2022, which was held during April 5-7, 2022, in Munich, Germany, as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2022. The 21 regular papers presented in this volume were carefully reviewed and selected from 64 submissions. They deal with fundamental issues in the specification, design, analysis, and implementation of programming languages and systems

    Des types aux assertions logiques : preuve automatique ou assistée de propriétés sur les programmes fonctionnels.

    Get PDF
    This work studies two approaches to improve the safety of computer programs using static analysis.The first one is typing which guarantees that the evaluation of program cannot fail. The functionallanguage ML has a very rich type system and also an algorithm that infers automatically the types.We focus on its adaptation to generalized algebraic data types (GADTs). In this setting, efficientcomputation of a most general type is impossible. We propose a stratification of the language thatretain the usual characteristics of the ML fragment and make explicit the use of GADTs. The re-sulting language, MLGX, entails a burden on the programmer who must annotate its programs toomuch. A second stratum, MLGI, offers a mechanism to infer locally, in a predictable and efficient way,incomplete yet, most of the type annotations. The first part concludes on an illustration of the expres-siveness of GADTs to encode the invariants of pushdown automata used in LR parsing. The secondapproach augments the language with logic assertions that enables arbitrarily complex specificationsto be expressed. We check the compliance of the program semantics with respect to these specifica-tions thanks to a method called Hoare logic and thanks to semi-automatic computer-based proofs.The design choices permit to handle first-class functions. They are directed by an implementationwhich is illustrated by the certification of a module of trees that denote finite sets.Cette thèse étudie deux approches fondées sur l’analyse statique pour augmenter la sûreté defonctionnement et la correction des programmes informatiques.La première approche est le typage qui permet de prouver automatiquement qu’un programmes’évalue sans échouer. Le langage fonctionnel ML possède un système de type très riche et un algorithmeeffectuant une synthèse automatique de ces types. On s’intéresse à l’adaptation de cet algorithme auxtypes algébriques généralisés (GADT), une forme restreinte des inductifs de Coq, qui ont été introduitspar Hongwei Xi en 2003.Dans ce cadre, le calcul efficace d’un type plus général est impossible. On propose une stratificationqui maintient les caractéristiques habituelles sur le fragment ML et qui isole le traitement des GADTen explicitant leur utilisation. Le langage obtenu, MLGX, nécessite des annotations de type qui alour-dissent les programmes. Une seconde strate, MLGI, offre au programmeur un mécanisme de synthèselocale, prédictible et efficace bien qu’incomplet, de la plupart de ces annotations. La première parties’achève avec une démonstration de l’expressivité des GADT pour coder les invariants des automatesà pile utilisés par l’analyse syntaxique LR.La seconde approche augmente le langage de programmation par des assertions logiques permettantd’exprimer des spécifications de complexité arbitraire dans la logique d’ordre supérieur polymorphi-quement typée. On vérifie statiquement la conformité de la sémantique du programme vis-à-vis de cesspécifications à l’aide d’une technique appelée logique de Hoare qui consiste à engendrer un ensembled’obligations de preuves à partir d’un programme annoté. Une fois ces obligations de preuve traitées,si un programme est utilisé correctement et si il renvoie une valeur alors il est certain que celle-ci estcorrecte.Habituellement, cette technique est employée sur les langages impératifs. Avec un langage fonc-tionnel pur, les problèmes liés à l’état de la mémoire d’évanouissent tandis que l’ordre supérieur etle polymorphisme en posent de nouveaux. Nos choix de conceptions cherchent à maximiser les op-portunités d’utilisation de prouveurs automatiques en traduisant minutieusement les objets d’ordresupérieur en objets du premier ordre. Une implantation prototype du système en fournit une illustra-tion dans la preuve presque totalement automatique d’un module CAML d’arbres équilibrés dénotantdes ensembles finis

    A persistent union-find data structure

    No full text
    corecore