3 research outputs found

The MultiDark Database: Release of the Bolshoi and MultiDark Cosmological Simulations

Authors: Aarseth, Allgood, Bennett, Bower, Boylan-Kolchin, Bryan, Bullock, Conroy, Croton, Davis, De Lucia, Dubinski, Efstathiou, Gao, Gott, Iliev, Jenkins, Jing, Kauffmann, Kim, Klypin, Knollmann, Kravtsov, Kuhlen, Lahav, Macciö, Moore, More, Muñoz-Cuartas, Navarro, Neto, Peebles, Prada, Schneider, Sheth, Somerville, Springel, Stadel, Teyssier, Tinker, Trujillo-Gomez, Vale, van den Bosch, Warren, Wechsler, Wetzel, White, Zentner, Zhao
Publication venue: 'Wiley'
Publication date: 02/09/2011
We present the online MultiDark Database -- a Virtual Observatory-oriented, relational database for hosting various cosmological simulations. The data is accessible via an SQL (Structured Query Language) query interface, which also allows users to pose scientific questions directly, as shown in a number of examples in this paper. Further examples of database usage are given in its extensive online documentation (www.multidark.org). The database is based on the same technology as the Millennium Database, a fact that will greatly facilitate the usage of both suites of cosmological simulations. The first release of the MultiDark Database hosts two 8.6 billion particle cosmological N-body simulations: the Bolshoi (250/h Mpc simulation box, 1/h kpc resolution) and MultiDark Run1 simulation (MDR1, or BigBolshoi, 1000/h Mpc simulation box, 7/h kpc resolution). The methods for extracting halos/subhalos from the raw simulation data, and the way this data is structured in the database, are explained in this paper. With the first data release, users get full access to halo/subhalo catalogs, various profiles of the halos at redshifts z=0-15, and raw dark matter data for one time-step of the Bolshoi and four time-steps of the MultiDark simulation. Later releases will also include galaxy mock catalogs and additional merging trees for both simulations as well as new large volume simulations with high resolution. This project is further proof of the viability of storing and presenting complex data using relational database technology. We encourage other simulators to publish their results in a similar manner. Comment: 28 pages, 9 figures, submitted to New Astronom
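The idea of posing a scientific question directly as an SQL query can be illustrated with a minimal sketch. The example below does not use the MultiDark service or its actual schema: the table name `halos`, its columns, and the toy rows are all assumptions made for illustration, and an in-memory SQLite database stands in for the remote relational database.

```python
import sqlite3


def build_toy_catalog():
    """Create an in-memory halo table (hypothetical schema, not MultiDark's)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE halos ("
        "halo_id INTEGER PRIMARY KEY, snapshot INTEGER, mass REAL)"
    )
    # Toy rows: (halo_id, snapshot number, halo mass in arbitrary units).
    rows = [
        (1, 0, 2.0e12),
        (2, 0, 5.0e13),
        (3, 0, 8.0e11),
        (4, 1, 3.0e12),
        (5, 1, 9.0e10),
    ]
    conn.executemany("INSERT INTO halos VALUES (?, ?, ?)", rows)
    return conn


def count_massive_halos(conn, min_mass):
    """Return {snapshot: number of halos with mass >= min_mass}."""
    cursor = conn.execute(
        "SELECT snapshot, COUNT(*) FROM halos "
        "WHERE mass >= ? GROUP BY snapshot ORDER BY snapshot",
        (min_mass,),
    )
    return dict(cursor.fetchall())


if __name__ == "__main__":
    catalog = build_toy_catalog()
    print(count_massive_halos(catalog, 1.0e12))
```

Here the question "how many halos exceed a given mass at each time-step?" is answered entirely inside the database, so only the small aggregated result has to be transferred to the user.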

arXiv.org e-Print Archive

Crossref

MPG.PuRe

Implementing a General Spatial Indexing Library for Relational Databases of Large Numerical Simulations

Authors: E. Lawrence, H. Sagan, H. Samet, J. Diemand, M. Boylan-Kolchin, V. Springel
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2011
Crossref

Ranked Similarity Search of Scientific Datasets: An Information Retrieval Approach

Author: Megler, Veronika Margaret
Publication venue: PDXScholar
Publication date: 04/06/2014
In the past decade, the amount of scientific data collected and generated by scientists has grown dramatically. This growth has intensified an existing problem: in large archives consisting of datasets stored in many files, formats and locations, how can scientists find data relevant to their research interests? We approach this problem in a new way: by adapting Information Retrieval techniques, developed for searching text documents, to the world of (primarily numeric) scientific data. We propose an approach that uses a blend of automated and curated methods to extract metadata from large repositories of scientific data. We then perform searches over this metadata, returning results ranked by similarity to the search criteria. We present a model of this approach, and describe a specific implementation deployed at an ocean-observatory data archive and now running in production. Our prototype implements scanners that extract metadata from datasets that contain different kinds of environmental observations, and a search engine with a candidate similarity measure for comparing a set of search terms to the extracted metadata. We evaluate the utility of the prototype by performing two user studies; these studies show that the approach resonates with users, and that our proposed similarity measure performs well when analyzed using standard Information Retrieval evaluation methods. We ran performance tests to explore how continued archive growth will affect our goal of interactive response, developed and applied techniques that mitigate the effects of that growth, and show that the techniques are effective. Lastly, we describe some of the research needed to extend this initial work into a true Google for data.
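Ranking datasets by similarity between search terms and extracted metadata can be sketched as follows. This is not the similarity measure proposed in the thesis, which the abstract does not specify. As a stand-in, the sketch assumes each dataset's metadata is a mapping from variable name to an observed (min, max) range and scores a query by the mean fraction of each requested range that the dataset covers. The catalog entries and all function names are hypothetical.

```python
# Hypothetical extracted metadata: variable name -> (min, max) observed range.
CATALOG = {
    "cruise_a": {"temperature": (5.0, 15.0), "salinity": (30.0, 35.0)},
    "station_b": {"temperature": (10.0, 20.0)},
    "glider_c": {"temperature": (0.0, 4.0), "salinity": (20.0, 25.0)},
}


def range_overlap(query, actual):
    """Fraction of the query interval covered by the dataset interval (0 to 1)."""
    q_lo, q_hi = query
    a_lo, a_hi = actual
    if q_hi <= q_lo:
        # Degenerate query (a single point): full score if the point is covered.
        return 1.0 if a_lo <= q_lo <= a_hi else 0.0
    covered = min(q_hi, a_hi) - max(q_lo, a_lo)
    return max(0.0, covered) / (q_hi - q_lo)


def score(query, metadata):
    """Mean overlap across the query's variables; missing variables score 0."""
    if not query:
        return 0.0
    total = sum(
        range_overlap(q_range, metadata[name]) if name in metadata else 0.0
        for name, q_range in query.items()
    )
    return total / len(query)


def rank(query, catalog):
    """Return (dataset name, score) pairs, best match first, ties by name."""
    scored = [(name, score(query, meta)) for name, meta in catalog.items()]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored


if __name__ == "__main__":
    wanted = {"temperature": (10.0, 14.0), "salinity": (32.0, 34.0)}
    for name, value in rank(wanted, CATALOG):
        print(f"{name}: {value:.2f}")
```

Because every dataset receives a graded score instead of a match/no-match verdict, partially relevant datasets still appear in the results, only lower in the ranking.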

PDXScholar (Portland State University)