The Dark Energy Survey Data Management System
The Dark Energy Survey collaboration will study cosmic acceleration with a
5000 deg^2 grizY survey in the southern sky over 525 nights from 2011-2016. The
DES data management (DESDM) system will be used to process and archive these
data and the resulting science ready data products. The DESDM system consists
of an integrated archive, a processing framework, an ensemble of astronomy
codes and a data access framework. We are developing the DESDM system for
operation in the high performance computing (HPC) environments at NCSA and
Fermilab. Operating the DESDM system in an HPC environment offers both speed
and flexibility. We will employ it for our regular nightly processing needs,
and for more compute-intensive tasks such as large scale image coaddition
campaigns, extraction of weak lensing shear from the full survey dataset, and
massive seasonal reprocessing of the DES data. Data products will be available
to the Collaboration and later to the public through a virtual-observatory
compatible web portal. Our approach leverages investments in publicly available
HPC systems, greatly reducing hardware and maintenance costs to the project,
which must deploy and maintain only the storage, database platforms and
orchestration and web portal nodes that are specific to DESDM. In Fall 2007, we
tested the current DESDM system on both simulated and real survey data. We used
TeraGrid to process 10 simulated DES nights (3TB of raw data), ingesting and
calibrating approximately 250 million objects into the DES Archive database. We
also used DESDM to process and calibrate over 50 nights of survey data acquired
with the Mosaic2 camera. Comparison to truth tables in the case of the
simulated data and internal crosschecks in the case of the real data indicate
that astrometric and photometric data quality is excellent.
Comment: To be published in the proceedings of the SPIE conference on
Astronomical Instrumentation (held in Marseille in June 2008). This preprint
is made available with the permission of SPIE. Further information together
with preprint containing full quality images is available at
http://desweb.cosmology.uiuc.edu/wik
Building knowledge repositories with enterprise modelling and patterns – from theory to practice
An approach to building knowledge repositories, Enterprise Knowledge Patterns (EKP), has been developed and applied throughout a number of research projects, most recently in the ELEKTRA, HyperKnowledge and EKLär projects. The EKP approach combines Enterprise Modelling with organisational patterns. Systematic evaluations of applying the approach have been carried out in two of the projects, while the third project is currently running. The aim of this paper is to provide an overview of the evaluation results and to share practical experiences from building knowledge repositories with Enterprise Modelling and organisational patterns. We discuss issues concerning the knowledge content of pattern-based knowledge repositories, the language used to express knowledge in organisational patterns, and technology support for storing and retrieving knowledge components.
On Constructing Persistent Identifiers with Persistent Resolution Targets
Persistent Identifiers (PIDs) are the foundation for referencing digital assets
in scientific publications, books, and digital repositories. In practice, PIDs
contain metadata and resolution targets in the form of URLs that point to data
sets located on the network. Unlike the PIDs themselves, the target URLs
typically change over time; thus, PIDs need continuous maintenance -- an effort that is
increasing tremendously with the advancement of e-Science and the advent of the
Internet-of-Things (IoT). Nowadays, billions of sensors and data sets are
subject to PID assignment. This paper presents a new approach to embedding
location independent targets into PIDs that allows the creation of
maintenance-free PIDs using content-centric network technology and overlay
networks. To demonstrate the validity of the presented approach, the Handle PID
System is used in conjunction with Magnet Link access information encoding,
state-of-the-art decentralized data distribution with BitTorrent, and Named
Data Networking (NDN) as location-independent data access technology for
networks. In contrast to existing approaches, no green-field PID implementation
or major modification of the Handle System is required to enable
location-independent data dissemination with maintenance-free PIDs.
Comment: Published IEEE paper of the FedCSIS 2016 (SoFAST-WS'16) conference,
11.-14. September 2016, Gdansk, Poland. Also available online:
http://ieeexplore.ieee.org/document/7733372
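The core idea above -- replacing a mutable URL target with a content-derived Magnet Link inside a Handle record -- can be illustrated with a short sketch. This is not the paper's implementation: the record layout, field names, and the use of a raw SHA-1 over the payload (a real BitTorrent client hashes the bencoded info dictionary) are simplifying assumptions for illustration only.

```python
import hashlib

def magnet_link_for(content: bytes, name: str) -> str:
    """Build a BitTorrent-style magnet URI whose target is derived from the
    content itself, so the target never changes and needs no URL maintenance.
    Simplified: hashes the raw payload rather than a bencoded info dict."""
    info_hash = hashlib.sha1(content).hexdigest()
    return f"magnet:?xt=urn:btih:{info_hash}&dn={name}"

def handle_record(prefix: str, suffix: str, content: bytes, name: str) -> dict:
    """Hypothetical Handle record whose resolution target is a
    location-independent magnet link instead of a mutable URL."""
    return {
        "handle": f"{prefix}/{suffix}",
        "values": [
            {"index": 1, "type": "URL",
             "data": magnet_link_for(content, name)},
        ],
    }

# Same content always yields the same target, no matter where
# the data is currently hosted -- the PID becomes maintenance-free.
record = handle_record("10.1000", "demo-dataset", b"sensor payload", "data.bin")
```

Because the target is a function of the content alone, moving the data to a new server never invalidates the record, which is the maintenance-free property the paper argues for.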
Electronic Data, Electronic Searching, Inadvertent Production of Privileged Data: A Perfect Storm
This article suggests that the practical impact of treating electronic searching as an expert function is to permit attorneys to focus and strategize on the process of electronic searching rather than on the completeness of document production. In effect, electronic searching permits attorneys to quit focusing on finding documents and begin focusing on identifying electronic sources of information on which relevant documents reside, documents that can then be extracted by means of electronic searching protocols.
A Guide to Distributed Digital Preservation
This volume is devoted to the broad topic of distributed digital preservation, a still-emerging field of practice for the cultural memory arena. Replication and distribution hold out the promise of indefinite preservation of materials without degradation, but establishing effective organizational and technical processes to enable this form of digital preservation is daunting. Institutions need practical examples of how this task can be accomplished in manageable, low-cost ways." --P. [4] of cover
Shared and Searchable Encrypted Data for Untrusted Servers
Current security mechanisms pose a risk for organisations that outsource their data management to untrusted servers. Encrypting and decrypting sensitive data at the client side is the normal approach in this situation but has high communication and computation overheads if only a subset of the data is required, for example, selecting records in a database table based on a keyword search. New cryptographic schemes have been proposed that support encrypted queries over encrypted data but all depend on a single set of secret keys, which implies single user access or sharing keys among multiple users, with key revocation requiring costly data re-encryption. In this paper, we propose an encryption scheme where each authorised user in the system has his own keys to encrypt and decrypt data. The scheme supports keyword search which enables the server to return only the encrypted data that satisfies an encrypted query without decrypting it. We provide two constructions of the scheme giving formal proofs of their security. We also report on the results of a prototype implementation.
This research was supported by the UK's EPSRC research grant EP/C537181/1. The authors would like to thank the members of the Policy Research Group at Imperial College for their support.