17 research outputs found

    Publication list of Zoltán Ésik

    Get PDF

    Derived-Term Automata of Multitape Expressions with Composition

    Get PDF
    Rational expressions are powerful tools to define automata, but often restricted to single-tape automata. Our goal is to unleash their expressive power for transducers, and more generally, any multitape automaton; for instance (a + |x+b + |y) ∗ . We generalize the construction of the derived-term automaton by using expansions. This approach generates small automata, and even allows us to support a composition operator

    Formalizing BPE Tokenization

    Full text link
    In this paper, we formalize practical byte pair encoding tokenization as it is used in large language models and other NLP systems, in particular we formally define and investigate the semantics of the SentencePiece and HuggingFace tokenizers, in particular how they relate to each other, depending on how the tokenization rules are constructed. Beyond this we consider how tokenization can be performed in an incremental fashion, as well as doing it left-to-right using an amount of memory constant in the length of the string, enabling e.g. using a finite state string-to-string transducer.Comment: In Proceedings NCMA 2023, arXiv:2309.0733

    Computer Aided Verification

    Get PDF
    This open access two-volume set LNCS 13371 and 13372 constitutes the refereed proceedings of the 34rd International Conference on Computer Aided Verification, CAV 2022, which was held in Haifa, Israel, in August 2022. The 40 full papers presented together with 9 tool papers and 2 case studies were carefully reviewed and selected from 209 submissions. The papers were organized in the following topical sections: Part I: Invited papers; formal methods for probabilistic programs; formal methods for neural networks; software Verification and model checking; hyperproperties and security; formal methods for hardware, cyber-physical, and hybrid systems. Part II: Probabilistic techniques; automata and logic; deductive verification and decision procedures; machine learning; synthesis and concurrency. This is an open access book

    Acta Cybernetica : Volume 23. Number 1.

    Get PDF
    corecore