31,637 research outputs found
Decision-focussed resource modelling for design decision support
Resource management including resource allocation, levelling, configuration and monitoring has been recognised as critical to design decision making. It has received increasing research interests in recent years. Different definitions, models and systems have been developed and published in literature. One common issue with existing research is that the resource modelling has focussed on the information view of resources. A few acknowledged the importance of resource capability to design management, but none has addressed the evaluation analysis of resource fitness to effectively support design decisions. This paper proposes a decision-focused resource model framework that addresses the combination of resource evaluation with resource information from multiple perspectives. A resource management system constructed on the resource model framework can provide functions for design engineers to efficiently search and retrieve the best fit resources (based on the evaluation results) to meet decision requirements. Thus, the system has the potential to provide improved decision making performance compared with existing resource management systems
Database independent Migration of Objects into an Object-Relational Database
This paper reports on the CERN-based WISDOM project which is studying the
serialisation and deserialisation of data to/from an object database
(objectivity) and ORACLE 9i.Comment: 26 pages, 18 figures; CMS CERN Conference Report cr02_01
Automatic Verification of Transactions on an Object-Oriented Database
In the context of the object-oriented data model, a compiletime approach is given that provides for a significant reduction of the amount of run-time transaction overhead due to integrity constraint checking. The higher-order logic Isabelle theorem prover is used to automatically prove which constraints might, or might not be violated by a given transaction in a manner analogous to the one used by Sheard and Stemple (1989) for the relational data model. A prototype transaction verification tool has been implemented, which automates the semantic mappings and generates proof goals for Isabelle. Test results are discussed to illustrate the effectiveness of our approach
Theory and Practice of Data Citation
Citations are the cornerstone of knowledge propagation and the primary means
of assessing the quality of research, as well as directing investments in
science. Science is increasingly becoming "data-intensive", where large volumes
of data are collected and analyzed to discover complex patterns through
simulations and experiments, and most scientific reference works have been
replaced by online curated datasets. Yet, given a dataset, there is no
quantitative, consistent and established way of knowing how it has been used
over time, who contributed to its curation, what results have been yielded or
what value it has.
The development of a theory and practice of data citation is fundamental for
considering data as first-class research objects with the same relevance and
centrality of traditional scientific products. Many works in recent years have
discussed data citation from different viewpoints: illustrating why data
citation is needed, defining the principles and outlining recommendations for
data citation systems, and providing computational methods for addressing
specific issues of data citation.
The current panorama is many-faceted and an overall view that brings together
diverse aspects of this topic is still missing. Therefore, this paper aims to
describe the lay of the land for data citation, both from the theoretical (the
why and what) and the practical (the how) angle.Comment: 24 pages, 2 tables, pre-print accepted in Journal of the Association
for Information Science and Technology (JASIST), 201
PlinyCompute: A Platform for High-Performance, Distributed, Data-Intensive Tool Development
This paper describes PlinyCompute, a system for development of
high-performance, data-intensive, distributed computing tools and libraries. In
the large, PlinyCompute presents the programmer with a very high-level,
declarative interface, relying on automatic, relational-database style
optimization to figure out how to stage distributed computations. However, in
the small, PlinyCompute presents the capable systems programmer with a
persistent object data model and API (the "PC object model") and associated
memory management system that has been designed from the ground-up for high
performance, distributed, data-intensive computing. This contrasts with most
other Big Data systems, which are constructed on top of the Java Virtual
Machine (JVM), and hence must at least partially cede performance-critical
concerns such as memory management (including layout and de/allocation) and
virtual method/function dispatch to the JVM. This hybrid approach---declarative
in the large, trusting the programmer's ability to utilize PC object model
efficiently in the small---results in a system that is ideal for the
development of reusable, data-intensive tools and libraries. Through extensive
benchmarking, we show that implementing complex objects manipulation and
non-trivial, library-style computations on top of PlinyCompute can result in a
speedup of 2x to more than 50x or more compared to equivalent implementations
on Spark.Comment: 48 pages, including references and Appendi
An introduction to Graph Data Management
A graph database is a database where the data structures for the schema
and/or instances are modeled as a (labeled)(directed) graph or generalizations
of it, and where querying is expressed by graph-oriented operations and type
constructors. In this article we present the basic notions of graph databases,
give an historical overview of its main development, and study the main current
systems that implement them
Analyzing high energy physics data using database computing: Preliminary report
A proof of concept system is described for analyzing high energy physics (HEP) data using data base computing. The system is designed to scale up to the size required for HEP experiments at the Superconducting SuperCollider (SSC) lab. These experiments will require collecting and analyzing approximately 10 to 100 million 'events' per year during proton colliding beam collisions. Each 'event' consists of a set of vectors with a total length of approx. one megabyte. This represents an increase of approx. 2 to 3 orders of magnitude in the amount of data accumulated by present HEP experiments. The system is called the HEPDBC System (High Energy Physics Database Computing System). At present, the Mark 0 HEPDBC System is completed, and can produce analysis of HEP experimental data approx. an order of magnitude faster than current production software on data sets of approx. 1 GB. The Mark 1 HEPDBC System is currently undergoing testing and is designed to analyze data sets 10 to 100 times larger
- …