32,857 research outputs found
Toward porting Astrophysics Visual Analytics Services to the European Open Science Cloud
The European Open Science Cloud (EOSC) aims to create a federated environment
for hosting and processing research data to support science in all disciplines
without geographical boundaries, such that data, software, methods and
publications can be shared as part of an Open Science community of practice.
This work presents the ongoing activities related to the implementation of
visual analytics services, integrated into EOSC, towards addressing the diverse
astrophysics user communities needs. These services rely on visualisation to
manage the data life cycle process under FAIR principles, integrating data
processing for imaging and multidimensional map creation and mosaicing, and
applying machine learning techniques for detection of structures in large scale
multidimensional maps
Integration of Exploration and Search: A Case Study of the M3 Model
International audienceEffective support for multimedia analytics applications requires exploration and search to be integrated seamlessly into a single interaction model. Media metadata can be seen as defining a multidimensional media space, casting multimedia analytics tasks as exploration, manipulation and augmentation of that space. We present an initial case study of integrating exploration and search within this multidimensional media space. We extend the M3 model, initially proposed as a pure exploration tool, and show that it can be elegantly extended to allow searching within an exploration context and exploring within a search context. We then evaluate the suitability of relational database management systems, as representatives of today’s data management technologies, for implementing the extended M3 model. Based on our results, we finally propose some research directions for scalability of multimedia analytics
Benchmarking SciDB Data Import on HPC Systems
SciDB is a scalable, computational database management system that uses an
array model for data storage. The array data model of SciDB makes it ideally
suited for storing and managing large amounts of imaging data. SciDB is
designed to support advanced analytics in database, thus reducing the need for
extracting data for analysis. It is designed to be massively parallel and can
run on commodity hardware in a high performance computing (HPC) environment. In
this paper, we present the performance of SciDB using simulated image data. The
Dynamic Distributed Dimensional Data Model (D4M) software is used to implement
the benchmark on a cluster running the MIT SuperCloud software stack. A peak
performance of 2.2M database inserts per second was achieved on a single node
of this system. We also show that SciDB and the D4M toolbox provide more
efficient ways to access random sub-volumes of massive datasets compared to the
traditional approaches of reading volumetric data from individual files. This
work describes the D4M and SciDB tools we developed and presents the initial
performance results. This performance was achieved by using parallel inserts, a
in-database merging of arrays as well as supercomputing techniques, such as
distributed arrays and single-program-multiple-data programming.Comment: 5 pages, 4 figures, IEEE High Performance Extreme Computing (HPEC)
2016, best paper finalis
- …