IMP Science Gateway: from the Portal to the Hub of Virtual Experimental Labs in Materials Science

Author: Baskova Olexandra
Bekenev Lev
Gatsenko Olexander
Gordienko Yuri
Stirenko Sergii
Zasimchuk Elena
Publication date: 22/04/2014

The "science gateway" (SG) approach provides a user-friendly, intuitive interface between scientists (or scientific communities) and a combination of software components and distributed computing infrastructures (DCIs) such as grids, clouds, and clusters, so that researchers can focus more on their scientific goals and less on the peculiarities of the software or DCI. The "IMP Science Gateway Portal" (http://scigate.imp.kiev.ua) for complex workflow management and integration of distributed computing resources (such as clusters, service grids, desktop grids, and clouds) is presented. It is built on the WS-PGRADE and gUSE technologies: WS-PGRADE is designed for scientific workflow operation, and gUSE for smooth integration of the available parallel and distributed computing resources across heterogeneous DCIs. Typical scientific workflows, with possible scenarios for their preparation and use, are presented. Several typical use cases for these scientific workflows are considered for molecular dynamics (MD) simulations of the complex behavior of various nanostructures (nanoindentation of graphene layers, defect system relaxation in metal nanocrystals, thermal stability of boron nitride nanotubes, etc.). The user experience is analyzed in the context of practical MD simulations in materials science, physics, and nanotechnologies on the available heterogeneous DCIs.
In conclusion, the "science gateway" approach, which combines a workflow manager (like WS-PGRADE) with a DCI resource manager (like gUSE), makes it possible to use an SG portal (like the "IMP Science Gateway Portal") as a hub of virtual experimental labs (different software components with different resource requirements) for practical MD applications in materials science, physics, chemistry, biology, and nanotechnologies.
Comment: 6 pages, 5 figures, 3 tables; 6th International Workshop on Science Gateways, IWSG-2014 (Dublin, Ireland, 3-5 June, 2014). arXiv admin note: substantial text overlap with arXiv:1404.545

arXiv.org e-Print Archive

Crossref

Hardware-accelerated interactive data visualization for neuroscience in Python.

Author: Harris KD
Rossant C
Publication date: 01/01/2013

Large datasets are becoming more and more common in science, particularly in neuroscience where experimental techniques are rapidly evolving. Obtaining interpretable results from raw data can sometimes be done automatically; however, there are numerous situations where there is a need, at all processing stages, to visualize the data in an interactive way. This enables the scientist to gain intuition, discover unexpected patterns, and find guidance about subsequent analysis steps. Existing visualization tools mostly focus on static publication-quality figures and do not support interactive visualization of large datasets. While working on Python software for visualization of neurophysiological data, we developed techniques to leverage the computational power of modern graphics cards for high-performance interactive data visualization. We were able to achieve very high performance despite the interpreted and dynamic nature of Python, by using state-of-the-art, fast libraries such as NumPy, PyOpenGL, and PyTables. We present applications of these methods to visualization of neurophysiological data. We believe our tools will be useful in a broad range of domains, in neuroscience and beyond, where there is an increasing need for scalable and fast interactive visualization.

UCL Discovery

PubMed Central

Frontiers - Publisher Connector

AstroGrid-D: Grid Technology for Astronomical Science

Author: Aarseth
Alexander Beck-Ratzka
Alexander Reinefeld
Allan
Angelika Reiser
Arthur Carlson
Berczik
Berentzen
Elstner
Frank Breitling
Fukushige
Hans-Martin Adorf
Harfst
Harry Enke
Hessman
Hurley
Iliya Nickelt
Joachim Wambsganß
Jürgen Steinacker
Kuntschke
Makino
Matthias Steinmetz
Mikael Högqvist
Rainer Spurzem
Reinecke
Scholl
Springel
Spurzem
Steve White
Strassmeier
Thomas Brüsemeister
Thomas Radke
Tobias Scholl
Torsten Ensslin
Wolfgang Voges
Publication venue: 'Elsevier BV'
Publication date: 23/07/2010

We present the status and results of AstroGrid-D, a joint effort of astrophysicists and computer scientists to employ grid technology for scientific applications. AstroGrid-D provides access to a network of distributed machines through a set of commands as well as software interfaces. It simplifies the use of compute and storage facilities and allows users to schedule or monitor compute tasks and data management. It is based on the Globus Toolkit middleware (GT4). Chapter 1 describes the context that led to the demand for advanced software solutions in astrophysics and states the goals of the project. Chapter 2 presents characteristic astrophysical applications that have been implemented on AstroGrid-D: simulations of different complexity, compute-intensive calculations running on multiple sites, and advanced applications for specific scientific purposes, such as a connection to robotic telescopes. These examples show how grid execution improves, for example, the scientific workflow. Chapter 3 explains the software tools and services that we adapted or newly developed. Section 3.1 focuses on the administrative aspects of the infrastructure: managing users and monitoring activity. Section 3.2 characterises the central components of our architecture: the AstroGrid-D information service to collect and store metadata, a file management system, the data management system, and a job manager for automatic submission of compute tasks. Chapter 4 summarises the successfully established infrastructure and concludes with our future plans to establish AstroGrid-D as a platform of modern e-Astronomy.
Comment: 14 pages, 12 figures. Subjects: data analysis, image processing, robotic telescopes, simulations, grid. Accepted for publication in New Astronom

arXiv.org e-Print Archive

Crossref

MPG.PuRe

State-of-the-Art in Parallel Computing with R

Author: Eddelbuettel Dirk
Mansmann Ulrich
Morgan Martin
Schmidberger Markus
Tierney Luke
Yu Hao
Publication date: 01/01/2009

R is a mature open-source programming language for statistical computing and graphics. Many areas of statistical research are experiencing rapid growth in the size of data sets. Methodological advances drive increased use of simulations. A common approach is to use parallel computing. This paper presents an overview of techniques for parallel computing with R on computer clusters, on multi-core systems, and in grid computing. It reviews sixteen different packages, comparing them on their state of development, the parallel technology used, as well as on usability, acceptance, and performance. Two packages (snow, Rmpi) stand out as particularly useful for general use on computer clusters. Packages for grid computing are still in development, with only one package currently available to the end user. For multi-core systems four different packages exist, but a number of issues pose challenges to early adopters. The paper concludes with ideas for further developments in high performance computing with R. Example code is available in the appendix.

Crossref

Directory of Open Access Journals

Open Access LMU

Journal of Statistical Software

Simulation modelling and visualisation: toolkits for building artificial worlds

Author: Gerdelan A.P.
Hawick K.A.
Leist A.
Playne D.P.
Scogings C.J.
Publication venue: 'Massey University'
Publication date: 01/01/2008

Simulation users at all levels make heavy use of compute resources to drive computational simulations for widely varying application areas of research using different simulation paradigms. Simulations are implemented in many software forms, ranging from highly standardised and general models that run in proprietary software packages to ad hoc hand-crafted simulation codes for very specific applications. Visualisation of the workings or results of a simulation is another highly valuable capability for simulation developers and practitioners. There are many different software libraries and methods available for creating a visualisation layer for simulations, and it is often a difficult and time-consuming process to assemble a toolkit of these libraries and other resources that best suits a particular simulation model. We present here a breakdown of the main simulation paradigms, and discuss the toolkits and approaches that different researchers have taken to tackle coupled simulation and visualisation in each paradigm.

Massey Research Online

ROOT - A C++ Framework for Petabyte Data Storage, Statistical Analysis and Visualization

Author: Antcheva Ilka
Ballintijn Maarten
Bellenot Bertrand
Biskup Marek
Brun Rene
Buncic Nenad
Canal Philippe
Casadei Diego
Couet Olivier
Fine Valery
Franco Leandro
Ganis Gerardo
Gheata Andrei
Goto Masaharu
Iwaszkiewicz Jan
Kreshuk Anna
Maline David Gonzalez
Maunder Richard
Moneta Lorenzo
Naumann Axel
Offermann Eddy
Onuchin Valeriy
Panacek Suzanne
Rademakers Fons
Russo Paul
Segura Diego Marcos
Tadel Matevz
Publication venue: 'Elsevier BV'
Publication date: 31/08/2015

ROOT is an object-oriented C++ framework conceived in the high-energy physics (HEP) community, designed for storing and analyzing petabytes of data in an efficient way. Any instance of a C++ class can be stored into a ROOT file in a machine-independent compressed binary format. In ROOT the TTree object container is optimized for statistical data analysis over very large data sets by using vertical data storage techniques. These containers can span a large number of files on local disks, the web, or a number of different shared file systems. In order to analyze this data, the user can choose from a wide set of mathematical and statistical functions, including linear algebra classes, numerical algorithms such as integration and minimization, and various methods for performing regression analysis (fitting). In particular, ROOT offers packages for complex data modeling and fitting, as well as multivariate classification based on machine learning techniques. Central to these analysis tools are the histogram classes, which provide binning of one- and multi-dimensional data. Results can be saved in high-quality graphical formats like PostScript and PDF or in bitmap formats like JPG or GIF. The result can also be stored into ROOT macros that allow a full recreation and rework of the graphics. Users typically create their analysis macros step by step, making use of the interactive C++ interpreter CINT, while running over small data samples. Once the development is finished, they can run these macros at full compiled speed over large data sets, using on-the-fly compilation, or by creating a stand-alone batch program. Finally, if processing farms are available, the user can reduce the execution time of intrinsically parallel tasks (e.g. data mining in HEP) by using PROOF, which will take care of optimally distributing the work over the available resources in a transparent way.

arXiv.org e-Print Archive

CERN Document Server
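
The binning performed by histogram classes, as mentioned in the ROOT abstract above, can be illustrated with a minimal Python sketch. This is not ROOT code and does not use the ROOT API; the class `Hist1D` and its methods are hypothetical names chosen for illustration. The sketch assumes fixed-width bins over the half-open range [lo, hi), plus separate underflow and overflow slots for values outside that range.

```python
from typing import List


class Hist1D:
    """Fixed-width 1D histogram with underflow/overflow slots (illustrative only)."""

    def __init__(self, nbins: int, lo: float, hi: float) -> None:
        if nbins < 1 or not lo < hi:
            raise ValueError("need nbins >= 1 and lo < hi")
        self.nbins, self.lo, self.hi = nbins, lo, hi
        # counts[0] is underflow, counts[1..nbins] are the regular bins,
        # counts[nbins + 1] is overflow.
        self.counts: List[int] = [0] * (nbins + 2)

    def find_bin(self, x: float) -> int:
        """Return the slot index for x; values at or above hi go to overflow."""
        if x < self.lo:
            return 0
        if x >= self.hi:
            return self.nbins + 1
        width = (self.hi - self.lo) / self.nbins
        # min() guards against floating-point rounding just below hi.
        return 1 + min(int((x - self.lo) / width), self.nbins - 1)

    def fill(self, x: float) -> None:
        """Increment the slot that x falls into."""
        self.counts[self.find_bin(x)] += 1


h = Hist1D(nbins=4, lo=0.0, hi=2.0)
for value in [-0.3, 0.0, 0.49, 0.5, 1.2, 1.99, 2.0, 7.5]:
    h.fill(value)
```

Keeping out-of-range values in dedicated slots rather than discarding them means the total number of filled entries can always be recovered from the counts.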

Distributed-based massive processing of activity logs for efficient user modeling in a Virtual Campus

Author: Caballé Llobet Santiago
Xhafa Xhafa Fatos
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

This paper reports on a multi-fold approach to building user models based on the identification of navigation patterns in a virtual campus, which allows the campus’ usability to be adapted to the actual learners’ needs and thus greatly stimulates the learning experience. However, user modeling in this context implies constant processing and analysis of user interaction data during long-term learning activities, which produces huge amounts of valuable data typically stored in server log files. Due to the large or very large size of the log files generated daily, massive processing is a foremost step in extracting useful information. To this end, this work first studies the viability of processing large log data files of a real Virtual Campus using different distributed infrastructures. More precisely, we study the time performance of massive processing of daily log files implemented following the master-slave paradigm and evaluated using Cluster Computing and PlanetLab platforms. The study reveals the complexity and challenges of massive processing in the big data era, such as the need to carefully tune the log file processing in terms of the size of the log data chunks to be processed at slave nodes, as well as the processing bottleneck in truly geographically distributed infrastructures due to the overhead caused by the communication time between the master and slave nodes. Then, an application of the massive processing approach, resulting in log data processed and stored in a well-structured format, is presented. We show how to extract knowledge from the log data analysis by using the WEKA framework for data mining purposes, showing its usefulness for effectively building user models in terms of identifying interesting navigation patterns of on-line learners. The study is motivated and conducted in the context of the actual data logs of the Virtual Campus of the Open University of Catalonia.
Peer Reviewed. Postprint (author's final draft)

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

The Oberta in open access
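
The master-slave chunked log processing described in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the log line format (`user_id page`), the function names, and the use of a local thread pool as a stand-in for remote worker nodes are all assumptions made for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Iterator, List


def chunked(lines: List[str], chunk_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most chunk_size log lines."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(lines), chunk_size):
        yield lines[start:start + chunk_size]


def count_pages(chunk: List[str]) -> Counter:
    """Worker task: count page visits in one chunk.

    Assumes a hypothetical 'user_id page' line format.
    """
    counts: Counter = Counter()
    for line in chunk:
        parts = line.split()
        if len(parts) == 2:  # skip malformed lines
            counts[parts[1]] += 1
    return counts


def process_log(lines: List[str], chunk_size: int, workers: int = 4) -> Counter:
    """Master: split the log, hand chunks to workers, merge the partial counts."""
    total: Counter = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(count_pages, chunked(lines, chunk_size)):
            total.update(partial)
    return total


sample_log = [
    "u1 /home", "u2 /forum", "u1 /forum", "u3 /home", "malformed", "u2 /mail",
]
page_counts = process_log(sample_log, chunk_size=2)
```

Varying `chunk_size` changes how many tasks the master hands out, which corresponds to the tuning parameter the abstract mentions. In a geographically distributed setting each task would also incur a communication cost that this local sketch does not model.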

State of the Art in Parallel Computing with R

Author: Dirk Eddelbuettel
Hao Yu
Luke Tierney
Markus Schmidberger
Martin Morgan
Ulrich Mansmann

R is a mature open-source programming language for statistical computing and graphics. Many areas of statistical research are experiencing rapid growth in the size of data sets. Methodological advances drive increased use of simulations. A common approach is to use parallel computing. This paper presents an overview of techniques for parallel computing with R on computer clusters, on multi-core systems, and in grid computing. It reviews sixteen different packages, comparing them on their state of development, the parallel technology used, as well as on usability, acceptance, and performance. Two packages (snow, Rmpi) stand out as particularly suited to general use on computer clusters. Packages for grid computing are still in development, with only one package currently available to the end user. For multi-core systems five different packages exist, but a number of issues pose challenges to early adopters. The paper concludes with ideas for further developments in high performance computing with R. Example code is available in the appendix.

Research Papers in Economics