1,775 research outputs found

    ArrayBridge: Interweaving declarative array processing with high-performance computing

    Full text link
    Scientists are increasingly turning to datacenter-scale computers to produce and analyze massive arrays. Despite decades of database research that extols the virtues of declarative query processing, scientists still write, debug and parallelize imperative HPC kernels even for the most mundane queries. This impedance mismatch has been partly attributed to the cumbersome data loading process; in response, the database community has proposed in situ mechanisms to access data in scientific file formats. Scientists, however, desire more than a passive access method that reads arrays from files. This paper describes ArrayBridge, a bi-directional array view mechanism for scientific file formats, that aims to make declarative array manipulations interoperable with imperative file-centric analyses. Our prototype implementation of ArrayBridge uses HDF5 as the underlying array storage library and seamlessly integrates into the SciDB open-source array database system. In addition to fast querying over external array objects, ArrayBridge produces arrays in the HDF5 file format just as easily as it can read from it. ArrayBridge also supports time travel queries from imperative kernels through the unmodified HDF5 API, and automatically deduplicates between array versions for space efficiency. Our extensive performance evaluation in NERSC, a large-scale scientific computing facility, shows that ArrayBridge exhibits statistically indistinguishable performance and I/O scalability to the native SciDB storage engine.Comment: 12 pages, 13 figure

    Efficient Change Management of XML Documents

    Get PDF
    XML-based documents play a major role in modern information architectures and their corresponding work-flows. In this context, the ability to identify and represent differences between two versions of a document is essential. A second important aspect is the merging of document versions, which becomes crucial in parallel editing processes. Many different approaches exist that meet these challenges. Most rely on operational transformation or document annotation. In both approaches, the operations leading to changes are tracked, which requires corresponding editing applications. In the context of software development, however, a state-based approach is common. Here, document versions are compared and merged using external tools, called diff and patch. This allows users for freely editing documents without being tightened to special tools. Approaches exist that are able to compare XML documents. A corresponding merge capability is still not available. In this thesis, I present a comprehensive framework that allows for comparing and merging of XML documents using a state-based approach. Its design is based on an analysis of XML documents and their modification patterns. The heart of the framework is a context-oriented delta model. I present a diff algorithm that appears to be highly efficient in terms of speed and delta quality. The patch algorithm is able to merge document versions efficiently and reliably. The efficiency and the reliability of my approach are verified using a competitive test scenario

    NiftyNet: a deep-learning platform for medical imaging

    Get PDF
    Medical image analysis and computer-assisted intervention problems are increasingly being addressed with deep-learning-based solutions. Established deep-learning platforms are flexible but do not provide specific functionality for medical image analysis and adapting them for this application requires substantial implementation effort. Thus, there has been substantial duplication of effort and incompatible infrastructure developed across many research groups. This work presents the open-source NiftyNet platform for deep learning in medical imaging. The ambition of NiftyNet is to accelerate and simplify the development of these solutions, and to provide a common mechanism for disseminating research outputs for the community to use, adapt and build upon. NiftyNet provides a modular deep-learning pipeline for a range of medical imaging applications including segmentation, regression, image generation and representation learning applications. Components of the NiftyNet pipeline including data loading, data augmentation, network architectures, loss functions and evaluation metrics are tailored to, and take advantage of, the idiosyncracies of medical image analysis and computer-assisted intervention. NiftyNet is built on TensorFlow and supports TensorBoard visualization of 2D and 3D images and computational graphs by default. We present 3 illustrative medical image analysis applications built using NiftyNet: (1) segmentation of multiple abdominal organs from computed tomography; (2) image regression to predict computed tomography attenuation maps from brain magnetic resonance images; and (3) generation of simulated ultrasound images for specified anatomical poses. NiftyNet enables researchers to rapidly develop and distribute deep learning solutions for segmentation, regression, image generation and representation learning applications, or extend the platform to new applications.Comment: Wenqi Li and Eli Gibson contributed equally to this work. M. Jorge Cardoso and Tom Vercauteren contributed equally to this work. 26 pages, 6 figures; Update includes additional applications, updated author list and formatting for journal submissio

    Enhanced Version Control for Unconventional Applications

    Get PDF
    The Extensible Markup Language (XML) is widely used to store, retrieve, and share digital documents. Recently, a form of Version Control System has been applied to the language, resulting in Version-Aware XML allowing for enhanced portability and scalability. While Version Control Systems are able to keep track of changes made to documents, we think that there is untapped potential in the technology. In this dissertation, we present novel ways of using Version Control System to enhance the security and performance of existing applications. We present a framework to maintain integrity in offline XML documents and provide non-repudiation security features that are independent of central certificate repositories. In addition, we use Version Control information to enhance the performance of Automated Policy Enforcement eXchange framework (APEX), an existing document security framework developed by Hewlett-Packard (HP) Labs. Finally, we present an interactive and scalable visualization framework to represent Version-Aware-related data that helps users visualize and understand version control data, delete specific revisions of a document, and access a comprehensive overview of the entire versioning history

    Putting the Semantics into Semantic Versioning

    Full text link
    The long-standing aspiration for software reuse has made astonishing strides in the past few years. Many modern software development ecosystems now come with rich sets of publicly-available components contributed by the community. Downstream developers can leverage these upstream components, boosting their productivity. However, components evolve at their own pace. This imposes obligations on and yields benefits for downstream developers, especially since changes can be breaking, requiring additional downstream work to adapt to. Upgrading too late leaves downstream vulnerable to security issues and missing out on useful improvements; upgrading too early results in excess work. Semantic versioning has been proposed as an elegant mechanism to communicate levels of compatibility, enabling downstream developers to automate dependency upgrades. While it is questionable whether a version number can adequately characterize version compatibility in general, we argue that developers would greatly benefit from tools such as semantic version calculators to help them upgrade safely. The time is now for the research community to develop such tools: large component ecosystems exist and are accessible, component interactions have become observable through automated builds, and recent advances in program analysis make the development of relevant tools feasible. In particular, contracts (both traditional and lightweight) are a promising input to semantic versioning calculators, which can suggest whether an upgrade is likely to be safe.Comment: to be published as Onward! Essays 202

    Develop a generic Rules Engine to quality control a CV database

    Get PDF
    This bachelor’s thesis presents a software solution to enhance Bouvet’s quality control process for employee CVs. By implementing a generic rule engine with extended functionalities, we identified that 90% of the CVs at Bouvet did not meet the company’s business standards. Using Scrum with Extreme Programming as our project management system, we developed a scalable and maintainable pilot, employing Microservices, Event-Driven, and Command and Query Responsibility Segregation architecture. Our pilot allows for future modifications using create, read, update and delete operations. The software solution presented in this thesis can be extended to a production-ready state by implementing an Role-based access control and an API-Gateway. When the event bus project by another group at Bouvet is completed, our implementation will be able to notify employees about their CVs’ status, further improving the quality control process. Overall, our results demonstrate the our software solution and project management system in enhancing the quality control of employee CVs at Bouvet.This bachelor’s thesis presents a software solution to enhance Bouvet’s quality control process for employee CVs. By implementing a generic rule engine with extended functionalities, we identified that 90% of the CVs at Bouvet did not meet the company’s business standards. Using Scrum with Extreme Programming as our project management system, we developed a scalable and maintainable pilot, employing Microservices, Event-Driven, and Command and Query Responsibility Segregation architecture. Our pilot allows for future modifications using create, read, update and delete operations. The software solution presented in this thesis can be extended to a production-ready state by implementing an Role-based access control and an API-Gateway. When the event bus project by another group at Bouvet is completed, our implementation will be able to notify employees about their CVs’ status, further improving the quality control process. Overall, our results demonstrate the our software solution and project management system in enhancing the quality control of employee CVs at Bouvet
    • …
    corecore