    Rumble: Data Independence for Large Messy Data Sets

    This paper introduces Rumble, an engine that executes JSONiq queries on large, heterogeneous and nested collections of JSON objects, leveraging the parallel capabilities of Spark so as to provide a high degree of data independence. The design is based on two key insights: (i) how to map JSONiq expressions to Spark transformations on RDDs and (ii) how to map JSONiq FLWOR clauses to Spark SQL on DataFrames. We have developed a working implementation of these mappings showing that JSONiq can efficiently run on Spark to query billions of objects, at least into the terabyte range. The JSONiq code is concise in comparison to Spark's host languages while seamlessly supporting the nested, heterogeneous data sets that Spark SQL does not. The ability to process this kind of commonly found input is paramount for data cleaning and curation. The experimental analysis indicates that there is no excessive performance loss, and occasionally even a gain, over Spark SQL for structured data, and a performance gain over PySpark. This demonstrates that a language such as JSONiq is a simple and viable approach to large-scale querying of denormalized, heterogeneous, arborescent data sets, in the same way as SQL can be leveraged for structured data sets. The results also illustrate that Codd's concept of data independence makes as much sense for heterogeneous, nested data sets as it does for highly structured tables. Comment: Preprint, 9 pages
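To make "heterogeneous and nested" concrete, here is a minimal sketch in plain Python of the kind of FLWOR-style query (iterate, filter, order, return) that the abstract describes. It is not Rumble, JSONiq, or Spark code: the sample records, the field names (`name`, `tags`, `score`, `meta`) and the helpers `as_sequence`, `as_number` and `query` are all hypothetical, chosen only to show fields that are missing, nested, or typed differently from one object to the next.

```python
import json

# Hypothetical messy input: one JSON object per line, with fields that are
# missing, nested, or of differing types across records.
RAW_LINES = [
    '{"name": "a", "tags": ["x", "y"], "score": 3}',
    '{"name": "b", "tags": "x", "score": "7"}',
    '{"name": "c", "score": null}',
    '{"name": "d", "tags": ["y"], "score": 5, "meta": {"lang": "en"}}',
]


def as_sequence(value):
    """Normalize a missing value, a scalar, or an array to a list."""
    if value is None:
        return []
    if isinstance(value, list):
        return value
    return [value]


def as_number(value):
    """Return a float for numbers and numeric strings, otherwise None."""
    if isinstance(value, bool):
        return None
    if isinstance(value, (int, float)):
        return float(value)
    if isinstance(value, str):
        try:
            return float(value)
        except ValueError:
            return None
    return None


def query(lines, tag, min_score):
    """FLWOR-style query: iterate, filter, order by score descending, return names."""
    matches = []
    for line in lines:
        obj = json.loads(line)
        score = as_number(obj.get("score"))
        if score is None or score < min_score:
            continue  # missing or non-numeric scores never match
        if tag not in as_sequence(obj.get("tags")):
            continue  # a scalar tag is treated as a one-element sequence
        matches.append((score, obj["name"]))
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in matches]


RESULT = query(RAW_LINES, tag="x", min_score=1)
print(RESULT)
```

The normalization helpers are the part a fixed-schema table would force the user to write by hand; the abstract's claim is that a language designed for such data absorbs this work while the engine handles the parallel execution.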

    Experimental analysis and numerical simulation of sintered micro-fluidic

    This paper investigates the use of numerical simulations to describe solid-state diffusion during the sintering stage of a Powder Hot Embossing (PHE) process for micro-fluidic components. A finite element analysis based on a thermo-elasto-viscoplastic model was established to describe the densification of a porous PHE stainless steel component during sintering. The corresponding parameters, such as the bulk viscosity, shear viscosity and sintering stress, are identified from dilatometer experimental data. The numerical analyses, which were performed on a 3D micro-structured component, allowed comparison between the numerical predictions and the experimental results during the sintering stage. This comparison demonstrates that the FE simulation results are in better agreement with the experimental results at high temperatures.

    Asking the experts: developing and validating parental diaries to assess children's minor injuries

    The methodological issues involved in parental reporting of events in children's everyday lives are discussed with reference to the development and validation of an incident diary, collecting concurrent data on minor injuries in a community study of children under eight years old. Eighty-two mothers participated in a comparison over nine days of daily telephone interviews and structured incident diaries. Telephone methods resulted in more missing data, and participants in both groups expressed a preference for the diary method. This diary was then validated on a sample of 56 preschool and school-aged children by comparing injury recording by a research health visitor with that of their mothers. Each failed to report some injuries, but there was good agreement overall, and in descriptive data on injuries reported by both. Parental diaries have the potential to provide rich data, of acceptable validity, on minor events in everyday life.