
    Schemas and types for JSON data

    The last few years have seen the fast and ubiquitous diffusion of JSON as one of the most widely used formats for publishing and interchanging data, as it combines the flexibility of semistructured data models with well-known data structures like records and arrays. Users who want to manage JSON data collections effectively can rely on several schema languages, like JSON Schema, JSound, and Joi, or on the type abstractions offered by modern programming languages like Swift or TypeScript. The main aim of this tutorial is to provide the audience with the basic notions needed to enjoy all the benefits that schemas and types can offer when processing and manipulating JSON data. This tutorial focuses on four main aspects of the relation between JSON and schemas: (1) we survey existing schema language proposals and discuss their prominent features; (2) we review how modern programming languages support JSON data as first-class citizens; (3) we analyze tools that can infer schemas from data, or that exploit schema information for improving data parsing and management; and (4) we discuss some open research challenges and opportunities related to JSON data.

    Schemas and types for JSON data: From theory to practice

    The last few years have seen the fast and ubiquitous diffusion of JSON as one of the most widely used formats for publishing and interchanging data, as it combines the flexibility of semistructured data models with well-known data structures like records and arrays. Users who want to manage JSON data collections effectively can rely on several schema languages, like JSON Schema, JSound, and Joi, as well as on the type abstractions offered by modern programming and scripting languages like Swift or TypeScript. The main aim of this tutorial is to provide the audience (both researchers and practitioners) with the basic notions needed to enjoy all the benefits that schemas and types can offer when processing and manipulating JSON data. This tutorial focuses on three main aspects of the relation between JSON and schemas: (1) we survey existing schema language proposals and discuss their prominent features; (2) we analyze tools that can infer schemas from data, or that exploit schema information for improving data parsing and management; and (3) we discuss some open research challenges and opportunities related to JSON data.

    An Empirical Study on the “Usage of Not” in Real-World JSON Schema Documents

    We study the usage of negation in JSON Schema data modeling. Negation is a logical operator rarely present in type systems and schema description languages, since it complicates decision problems: many software tools, as well as formal frameworks for working with JSON Schema, do not fully support negation. This motivates us to study whether negation is actually used in practice, for what purposes, and whether it could, in principle, be replaced by simpler operators. We have collected a large corpus of 80k open source JSON Schema documents from GitHub. We perform a systematic analysis, quantify usage patterns of negation, and also qualitatively analyze schemas. We show that negation is indeed used, albeit infrequently, following a stable set of patterns.

    The Usage of Negation in Real-World JSON Schema Documents

    Many software tools, as well as formal frameworks for working with JSON Schema, do not fully support negation. This motivates us to study whether negation is actually used in practice, for what purposes, and whether it could, in principle, be replaced by simpler operators. We have collected a large corpus of 80k open source JSON Schema documents. We perform a systematic analysis, quantify usage patterns of negation, and also qualitatively analyze schemas. We show that negation is indeed used, albeit infrequently, following a stable set of patterns.

    Negation-closure for JSON Schema

    JSON Schema is an evolving standard for describing families of JSON documents. It is a logical language, based on a set of assertions that describe features of the JSON value under analysis and on logical or structural combinators for these assertions, including a negation operator. Most logical languages with negation enjoy negation closure: for every operator, they have a negation-dual that allows negation to be pushed through the operator. We show that this is not the case for JSON Schema, study how that changed with the latest versions of the Draft, and discuss how the language may be enriched accordingly. To this end, we exploit an algebraic reformulation of JSON Schema, which is helpful for the formal manipulation of the language.
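
    To make "pushing negation through an operator" concrete, the following is a minimal sketch, not the algebra used in the paper. It assumes schemas are Python dicts holding a single keyword, and it covers only operators whose duals are uncontroversial: `not`, `allOf`, `anyOf`, and `minimum` (with the numeric `exclusiveMaximum` of Draft-06 and later). The function name `push_not` is hypothetical. Anything else is left wrapped in `not`, which is where the lack of closure discussed in the abstract would show up.

```python
def push_not(schema):
    """Return a schema equivalent to {"not": schema}, pushing the negation
    inward where this sketch knows a dual operator."""
    if not isinstance(schema, dict):
        return {"not": schema}          # boolean schemas are not handled here
    keys = set(schema)
    if keys == {"not"}:
        return schema["not"]            # double negation cancels out
    if keys == {"allOf"}:
        # De Morgan: not(A and B) == not(A) or not(B)
        return {"anyOf": [push_not(s) for s in schema["allOf"]]}
    if keys == {"anyOf"}:
        # De Morgan: not(A or B) == not(A) and not(B)
        return {"allOf": [push_not(s) for s in schema["anyOf"]]}
    if keys == {"minimum"}:
        # "minimum" accepts every non-number, so its complement must
        # explicitly require a number below the bound.
        return {"type": "number", "exclusiveMaximum": schema["minimum"]}
    return {"not": schema}              # no dual known to this sketch


negated = push_not({"allOf": [{"minimum": 5}, {"not": {"type": "string"}}]})
```

    The `minimum` case shows why duals are subtle in JSON Schema: assertions are conditional on the type of the instance, so the complement has to add a `type` constraint.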

    Human-in-the-loop schema inference for massive JSON datasets

    JSON has established itself as a popular data format for representing data whose structure is irregular or unknown a priori. JSON collections are usually massive and schema-less. Inferring a schema describing the structure of these collections is crucial for formulating meaningful queries and for adopting schema-based optimizations. In recent work, we proposed a Map/Reduce schema inference approach that infers either a compact representation of the input collection or a precise description of every possible shape in the data. Since no single level of precision is ideal, it is more appealing to give the analyst the freedom to choose between different levels of precision in an interactive fashion. In this paper we describe a schema inference system offering this important functionality.
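
    The contrast between a precise and a compact inferred schema can be sketched as a map step followed by a reduce step. This is only an illustration under simplifying assumptions, not the system described in the abstract: the type encoding and the names `infer`, `precise`, `fuse`, and `compact` are hypothetical, the compact variant fuses top-level records only, and it assumes a non-empty collection in which every document is a JSON object.

```python
from functools import reduce


def infer(value):
    """Map step: infer a type for one JSON value."""
    if value is None:
        return "null"
    if isinstance(value, bool):          # check bool before int: bool is an int subclass
        return "bool"
    if isinstance(value, (int, float)):
        return "num"
    if isinstance(value, str):
        return "str"
    if isinstance(value, list):
        return ("array", frozenset(infer(v) for v in value))
    return ("record", tuple(sorted((k, infer(v)) for k, v in value.items())))


def precise(docs):
    """Reduce step, precise variant: keep every distinct shape."""
    return reduce(lambda acc, t: acc | {t}, map(infer, docs), frozenset())


def fuse(a, b):
    """Merge two fused records of the form {key: (types, optional)}."""
    out = {}
    for k in a.keys() | b.keys():
        types_a, opt_a = a.get(k, (frozenset(), True))   # missing key => optional
        types_b, opt_b = b.get(k, (frozenset(), True))
        out[k] = (types_a | types_b, opt_a or opt_b)
    return out


def compact(docs):
    """Reduce step, compact variant: one record with optional fields."""
    lifted = ({k: (frozenset([t]), False) for k, t in infer(d)[1]} for d in docs)
    return reduce(fuse, lifted)


docs = [{"id": 1, "name": "a"}, {"id": "x"}]
shapes = precise(docs)    # two distinct record shapes
summary = compact(docs)   # a single record; "name" is marked optional
```

    Because both reduce functions are associative, partial results from different partitions could be merged in any order, which is what makes this style of inference a natural fit for Map/Reduce.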

    Challenges in Checking JSON Schema Containment over Evolving Real-World Schemas

    JSON Schema is maturing into the de facto schema language for JSON documents. When JSON Schema declarations evolve, the question arises of how the new schema will deal with JSON documents that still adhere to the legacy schema. This is particularly crucial in the maintenance of software APIs. In this paper, we present the results of our empirical study of the first generation of tools for checking JSON Schema containment, which we apply to a diverse collection of over 230 real-world schemas and a total of 1k historical versions. We assess two such special-purpose tools w.r.t. their applicability to real-world schemas and identify weak spots. Based on this analysis, we enumerate specific open research challenges that are based on real-world problems.

    Witness Generation for JSON Schema

    JSON Schema is a schema language for JSON documents, based on a complex combination of structural operators, Boolean operators (negation included), and recursive variables. The static analysis of JSON Schema documents comprises practically relevant problems, including schema satisfiability, inclusion, and equivalence. These problems can be reduced to witness generation: given a schema, generate an element of the schema, if it exists, and report failure otherwise. Schema satisfiability, inclusion, and equivalence have been shown to be decidable. However, no witness generation algorithm has yet been formally described. We contribute a first, direct algorithm for JSON Schema witness generation, and study its effectiveness and efficiency in experiments over several schema collections, including thousands of real-world schemas.
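
    The witness generation problem itself (produce an instance if one exists, report failure otherwise) can be illustrated on a tiny fragment of JSON Schema. The sketch below is not the algorithm contributed by the paper: it ignores negation, recursion, and most keywords, and it assumes each schema dict uses exactly one of `const`, `enum`, `anyOf`, or `type` (optionally with `minimum`/`maximum` for numbers, or `properties`/`required` for objects). The names `witness` and `NoWitness` are hypothetical.

```python
import math


class NoWitness(Exception):
    """Raised when the (fragment) schema has no valid instance."""


def witness(schema):
    """Return one JSON value accepted by the schema, or raise NoWitness."""
    if "const" in schema:
        return schema["const"]
    if "enum" in schema:
        if not schema["enum"]:
            raise NoWitness("empty enum")
        return schema["enum"][0]
    if "anyOf" in schema:
        for branch in schema["anyOf"]:
            try:
                return witness(branch)          # first satisfiable branch wins
            except NoWitness:
                continue
        raise NoWitness("no satisfiable anyOf branch")
    kind = schema.get("type")
    if kind in ("integer", "number"):
        lo = schema.get("minimum", -math.inf)
        hi = schema.get("maximum", math.inf)
        if kind == "integer":                   # tighten bounds to whole numbers
            lo = math.ceil(lo) if lo != -math.inf else lo
            hi = math.floor(hi) if hi != math.inf else hi
        if lo > hi:
            raise NoWitness("empty numeric range")
        if lo <= 0 <= hi:
            return 0
        return lo if lo != -math.inf else hi
    if kind == "object":
        props = schema.get("properties", {})
        # Only required properties are generated; optional ones are omitted.
        return {k: witness(props.get(k, {})) for k in schema.get("required", [])}
    # Remaining types; None stands for JSON null and also satisfies {}.
    return {"string": "", "boolean": False, "array": []}.get(kind)


person = witness({
    "type": "object",
    "required": ["age"],
    "properties": {"age": {"type": "integer", "minimum": 17.5}},
})
```

    Failure is signalled with an exception rather than a return value because `None` is itself a legitimate witness (JSON `null`).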