Search CORE

7,844 research outputs found

User Applications Driven by the Community Contribution Framework MPContribs in the Materials Project

Author: Cholia Shreyas
Gunter Dan
Huck Patrick
N'Diaye Alpha
Persson Kristin
Winston Donald
Publication venue: 'Wiley'
Publication date: 19/10/2015
Field of study

This work discusses how the MPContribs framework in the Materials Project (MP) allows user-contributed data to be shown and analyzed alongside the core MP database. The Materials Project is a searchable database of electronic structure properties of over 65,000 bulk solid materials that is accessible through a web-based science-gateway. We describe the motivation for enabling user contributions to the materials data and present the framework's features and challenges in the context of two real applications. These use-cases illustrate how scientific collaborations can build applications with their own "user-contributed" data using MPContribs. The Nanoporous Materials Explorer application provides a unique search interface to a novel dataset of hundreds of thousands of materials, each with tables of user-contributed values related to material adsorption and density at varying temperature and pressure. The Unified Theoretical and Experimental x-ray Spectroscopy application discusses a full workflow for the association, dissemination and combined analyses of experimental data from the Advanced Light Source with MP's theoretical core data, using MPContribs tools for data formatting, management and exploration. The capabilities being developed for these collaborations are serving as the model for how new materials data can be incorporated into the Materials Project website with minimal staff overhead while giving powerful tools for data search and display to the user community.Comment: 12 pages, 5 figures, Proceedings of 10th Gateway Computing Environments Workshop (2015), to be published in "Concurrency in Computation: Practice and Experience

arXiv.org e-Print Archive

eScholarship - University of California

Distributed Management of Massive Data: an Efficient Fine-Grain Data Access Scheme

Author: A. Bassi
A. Thomasian
B. Allcock
B.S. White
G. Antoniu
G. Antoniu
G. Antoniu
K. Douglas
M. Nicola
M.A. Casey
O. Tatebe
P. Honeyman
P.Z. Kunszt
R. Bolze
R. Jin
R.C. Merkle
S. Rhea
Publication venue
Publication date: 01/01/2008
Field of study

This paper addresses the problem of efficiently storing and accessing massive data blocks in a large-scale distributed environment, while providing efficient fine-grain access to data subsets. This issue is crucial in the context of applications in the field of databases, data mining and multimedia. We propose a data sharing service based on distributed, RAM-based storage of data, while leveraging a DHT-based, natively parallel metadata management scheme. As opposed to the most commonly used grid storage infrastructures that provide mechanisms for explicit data localization and transfer, we provide a transparent access model, where data are accessed through global identifiers. Our proposal has been validated through a prototype implementation whose preliminary evaluation provides promising results

arXiv.org e-Print Archive

HAL-CentraleSupelec

CiteSeerX

Crossref

INRIA a CCSD electronic archive server

HAL-Rennes 1

A Provenance Tracking Model for Data Updates

Author: Ciobanu Gabriel
Horne Ross
Publication venue: 'Open Publishing Association'
Publication date: 01/01/2012
Field of study

For data-centric systems, provenance tracking is particularly important when the system is open and decentralised, such as the Web of Linked Data. In this paper, a concise but expressive calculus which models data updates is presented. The calculus is used to provide an operational semantics for a system where data and updates interact concurrently. The operational semantics of the calculus also tracks the provenance of data with respect to updates. This provides a new formal semantics extending provenance diagrams which takes into account the execution of processes in a concurrent setting. Moreover, a sound and complete model for the calculus based on ideals of series-parallel DAGs is provided. The notion of provenance introduced can be used as a subjective indicator of the quality of data in concurrent interacting systems.Comment: In Proceedings FOCLASA 2012, arXiv:1208.432

arXiv.org e-Print Archive

University of Strathclyde Institutional Repository

Directory of Open Access Journals

DR-NTU (Digital Repository of NTU)

Functional Data Analysis in Electronic Commerce Research

Author: Jank Wolfgang
Shmueli Galit
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2006
Field of study

This paper describes opportunities and challenges of using functional data analysis (FDA) for the exploration and analysis of data originating from electronic commerce (eCommerce). We discuss the special data structures that arise in the online environment and why FDA is a natural approach for representing and analyzing such data. The paper reviews several FDA methods and motivates their usefulness in eCommerce research by providing a glimpse into new domain insights that they allow. We argue that the wedding of eCommerce with FDA leads to innovations both in statistical methodology, due to the challenges and complications that arise in eCommerce data, and in online research, by being able to ask (and subsequently answer) new research questions that classical statistical methods are not able to address, and also by expanding on research questions beyond the ones traditionally asked in the offline environment. We describe several applications originating from online transactions which are new to the statistics literature, and point out statistical challenges accompanied by some solutions. We also discuss some promising future directions for joint research efforts between researchers in eCommerce and statistics.Comment: Published at http://dx.doi.org/10.1214/088342306000000132 in the Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org

arXiv.org e-Print Archive

CiteSeerX

Crossref

Protocols for Integrity Constraint Checking in Federated Databases

Author: Grefen Paul
Widom Jennifer
Publication venue: Kluwer Academic Publishers
Publication date: 01/01/1996
Field of study

A federated database is comprised of multiple interconnected database systems that primarily operate independently but cooperate to a certain extent. Global integrity constraints can be very useful in federated databases, but the lack of global queries, global transaction mechanisms, and global concurrency control renders traditional constraint management techniques inapplicable. This paper presents a threefold contribution to integrity constraint checking in federated databases: (1) The problem of constraint checking in a federated database environment is clearly formulated. (2) A family of protocols for constraint checking is presented. (3) The differences across protocols in the family are analyzed with respect to system requirements, properties guaranteed by the protocols, and processing and communication costs. Thus, our work yields a suite of options from which a protocol can be chosen to suit the system capabilities and integrity requirements of a particular federated database environment

CiteSeerX

University of Twente Research Information

Concurrency Control for Perceivedly Instantaneous Transactions in Valid-Time Databases

Author: Finger M
McBrien P
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/1997
Field of study

Although temporal databases have received considerable attention as a topic for research, little work in the area has paid attention to the concurrency control mechanisms that might be employed in temporal databases. This paper describes how the notion of the current time --- also called `now' --- in valid-time databases can cause standard serialisation theory to give what are at least unintuitive results, if not actually incorrect results. The paper then describes two modifications to standard serialisation theory which correct the behaviour to give what we term perceivably instantaneous transactions; transactions where serialising T 1 and T 2 as [T 1 ; T 2 ] always implies that the current time seen by T 1 is less than or equal to the current time seen by T 2 . 1 Introduction Query languages for valid-time temporal database normally contain a notion of "currenttime " [TCG + 93, Sno95], usually represented as the value of a special variable now. While it is agreed that the value of..

CiteSeerX

Spiral - Imperial College Digital Repository