ArrayBridge: Interweaving declarative array processing with high-performance computing
Scientists are increasingly turning to datacenter-scale computers to produce
and analyze massive arrays. Despite decades of database research that extols
the virtues of declarative query processing, scientists still write, debug and
parallelize imperative HPC kernels even for the most mundane queries. This
impedance mismatch has been partly attributed to the cumbersome data loading
process; in response, the database community has proposed in situ mechanisms to
access data in scientific file formats. Scientists, however, desire more than a
passive access method that reads arrays from files.
This paper describes ArrayBridge, a bi-directional array view mechanism for
scientific file formats, that aims to make declarative array manipulations
interoperable with imperative file-centric analyses. Our prototype
implementation of ArrayBridge uses HDF5 as the underlying array storage library
and seamlessly integrates into the SciDB open-source array database system. In
addition to fast querying over external array objects, ArrayBridge produces
arrays in the HDF5 file format just as easily as it can read from it.
ArrayBridge also supports time travel queries from imperative kernels through
the unmodified HDF5 API, and automatically deduplicates between array versions
for space efficiency. Our extensive performance evaluation in NERSC, a
large-scale scientific computing facility, shows that ArrayBridge exhibits
statistically indistinguishable performance and I/O scalability to the native
SciDB storage engine.
Comment: 12 pages, 13 figures
Efficient Change Management of XML Documents
XML-based documents play a major role in modern information architectures and their corresponding workflows. In this context, the ability to identify and represent differences between two versions of a document is essential. A second important aspect is the merging of document versions, which becomes crucial in parallel editing processes. Many different approaches exist that meet these challenges. Most rely on operational transformation or document annotation. In both approaches, the operations leading to changes are tracked, which requires corresponding editing applications. In the context of software development, however, a state-based approach is common. Here, document versions are compared and merged using external tools, called diff and patch. This allows users to edit documents freely without being tied to special tools. Approaches exist that are able to compare XML documents, but a corresponding merge capability is still not available. In this thesis, I present a comprehensive framework that allows for comparing and merging XML documents using a state-based approach. Its design is based on an analysis of XML documents and their modification patterns. The heart of the framework is a context-oriented delta model. I present a diff algorithm that appears to be highly efficient in terms of speed and delta quality. The patch algorithm is able to merge document versions efficiently and reliably. The efficiency and the reliability of my approach are verified using a competitive test scenario.
NiftyNet: a deep-learning platform for medical imaging
Medical image analysis and computer-assisted intervention problems are
increasingly being addressed with deep-learning-based solutions. Established
deep-learning platforms are flexible but do not provide specific functionality
for medical image analysis and adapting them for this application requires
substantial implementation effort. Thus, there has been substantial duplication
of effort and incompatible infrastructure developed across many research
groups. This work presents the open-source NiftyNet platform for deep learning
in medical imaging. The ambition of NiftyNet is to accelerate and simplify the
development of these solutions, and to provide a common mechanism for
disseminating research outputs for the community to use, adapt and build upon.
NiftyNet provides a modular deep-learning pipeline for a range of medical
imaging applications including segmentation, regression, image generation and
representation learning applications. Components of the NiftyNet pipeline
including data loading, data augmentation, network architectures, loss
functions and evaluation metrics are tailored to, and take advantage of, the
idiosyncrasies of medical image analysis and computer-assisted intervention.
NiftyNet is built on TensorFlow and supports TensorBoard visualization of 2D
and 3D images and computational graphs by default.
We present 3 illustrative medical image analysis applications built using
NiftyNet: (1) segmentation of multiple abdominal organs from computed
tomography; (2) image regression to predict computed tomography attenuation
maps from brain magnetic resonance images; and (3) generation of simulated
ultrasound images for specified anatomical poses.
NiftyNet enables researchers to rapidly develop and distribute deep learning
solutions for segmentation, regression, image generation and representation
learning applications, or extend the platform to new applications.
Comment: Wenqi Li and Eli Gibson contributed equally to this work. M. Jorge
Cardoso and Tom Vercauteren contributed equally to this work. 26 pages, 6
figures; Update includes additional applications, updated author list and
formatting for journal submission
Enhanced Version Control for Unconventional Applications
The Extensible Markup Language (XML) is widely used to store, retrieve, and share digital documents. Recently, a form of Version Control System has been applied to the language, resulting in Version-Aware XML, which allows for enhanced portability and scalability. While Version Control Systems are able to keep track of changes made to documents, we think that there is untapped potential in the technology. In this dissertation, we present novel ways of using Version Control Systems to enhance the security and performance of existing applications. We present a framework to maintain integrity in offline XML documents and provide non-repudiation security features that are independent of central certificate repositories. In addition, we use Version Control information to enhance the performance of the Automated Policy Enforcement eXchange (APEX) framework, an existing document security framework developed by Hewlett-Packard (HP) Labs.
Finally, we present an interactive and scalable visualization framework to represent Version-Aware-related data that helps users visualize and understand version control data, delete specific revisions of a document, and access a comprehensive overview of the entire versioning history.
Putting the Semantics into Semantic Versioning
The long-standing aspiration for software reuse has made astonishing strides
in the past few years. Many modern software development ecosystems now come
with rich sets of publicly-available components contributed by the community.
Downstream developers can leverage these upstream components, boosting their
productivity.
However, components evolve at their own pace. This imposes obligations on and
yields benefits for downstream developers, especially since changes can be
breaking, requiring additional downstream work to adapt to. Upgrading too late
leaves downstream vulnerable to security issues and missing out on useful
improvements; upgrading too early results in excess work. Semantic versioning
has been proposed as an elegant mechanism to communicate levels of
compatibility, enabling downstream developers to automate dependency upgrades.
While it is questionable whether a version number can adequately characterize
version compatibility in general, we argue that developers would greatly
benefit from tools such as semantic version calculators to help them upgrade
safely. The time is now for the research community to develop such tools: large
component ecosystems exist and are accessible, component interactions have
become observable through automated builds, and recent advances in program
analysis make the development of relevant tools feasible. In particular,
contracts (both traditional and lightweight) are a promising input to semantic
versioning calculators, which can suggest whether an upgrade is likely to be
safe.
Comment: to be published as Onward! Essays 202
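As a rough illustration of the compatibility levels that semantic versioning communicates, the sketch below classifies an upgrade from the version numbers alone. It is not the semantic version calculator the essay calls for: the names `parse`, `classify_upgrade` and `is_nominally_safe` are hypothetical, only plain `MAJOR.MINOR.PATCH` strings are handled (no pre-release or build metadata), and "safe" here means only what the version number promises, which the essay notes may not reflect actual compatibility.

```python
import re
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int


# Assumption: only plain MAJOR.MINOR.PATCH strings are accepted.
_PATTERN = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")


def parse(text: str) -> Version:
    """Parse a 'MAJOR.MINOR.PATCH' string into a Version tuple."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {text!r}")
    return Version(*(int(part) for part in match.groups()))


def classify_upgrade(current: str, candidate: str) -> str:
    """Name the highest version component that changes in the upgrade."""
    old, new = parse(current), parse(candidate)
    if new <= old:
        return "not-an-upgrade"
    if new.major != old.major:
        return "major"
    if new.minor != old.minor:
        return "minor"
    return "patch"


def is_nominally_safe(current: str, candidate: str) -> bool:
    """True if the version numbers alone promise backward compatibility.

    Under semantic versioning, minor and patch releases are meant to be
    backward compatible, while 0.y.z releases make no such promise.
    """
    old = parse(current)
    kind = classify_upgrade(current, candidate)
    return old.major > 0 and kind in ("minor", "patch")
```

A dependency bot built on such a check would auto-apply `1.4.2` to `1.5.0` but hold back `1.4.2` to `2.0.0` for review; a calculator of the kind the essay proposes would instead inspect the code or its contracts rather than trust the number.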
Develop a generic Rules Engine to quality control a CV database
This bachelor’s thesis presents a software solution to enhance Bouvet’s quality control process
for employee CVs. By implementing a generic rule engine with extended functionalities, we
identified that 90% of the CVs at Bouvet did not meet the company’s business standards.
Using Scrum with Extreme Programming as our project management system, we developed a
scalable and maintainable pilot, employing Microservices, Event-Driven, and Command and
Query Responsibility Segregation architecture. Our pilot allows for future modifications using
create, read, update and delete operations. The software solution presented in this thesis can
be extended to a production-ready state by implementing role-based access control and
an API gateway. When the event bus project by another group at Bouvet is completed, our
implementation will be able to notify employees about their CVs’ status, further improving
the quality control process. Overall, our results demonstrate the effectiveness of our software solution and
project management system in enhancing the quality control of employee CVs at Bouvet.