A Model-Driven Approach to Automate Data Visualization in Big Data Analytics
In big data analytics, advanced analytic techniques operate on big data sets, aimed at complementing the role of traditional OLAP in decision making. To enable companies to benefit from these techniques despite the lack of in-house technical skills, the H2020 TOREADOR Project adopts a model-driven architecture for streamlining analysis processes, from data preparation to visualization. In this paper we propose a new approach named SkyViz focused on the visualization area, in particular on (i) how to specify the user's objectives and describe the dataset to be visualized, (ii) how to translate this specification into a platform-independent visualization type, and (iii) how to concretely implement this visualization type on the target execution platform. To support step (i) we define a visualization context based on seven prioritizable coordinates for assessing the user's objectives and conceptually describing the data to be visualized. To automate step (ii) we propose a skyline-based technique that translates a visualization context into a set of most-suitable visualization types. Finally, to automate step (iii) we propose a skyline-based technique that, with reference to a specific platform, finds the best bindings between the columns of the dataset and the graphical coordinates used by the visualization type chosen by the user. SkyViz can be transparently extended to include additional visualization types as well as additional visualization coordinates. The paper is completed by an evaluation of SkyViz based on a case study excerpted from the pilot applications of the TOREADOR Project.
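The skyline operation at the heart of steps (ii) and (iii) is a Pareto-front computation: a candidate survives only if no other candidate is at least as good on every coordinate and strictly better on one. As a minimal sketch of that idea, and not the SkyViz algorithm itself, the following assumes hypothetical candidates scored on two coordinates where lower is better:

```python
# Minimal skyline (Pareto-front) sketch. The candidate visualization
# types and their coordinate scores below are hypothetical examples,
# not taken from the SkyViz paper. Lower scores are better.

def dominates(a, b):
    """a dominates b if a is no worse on every coordinate and
    strictly better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def skyline(candidates):
    """Keep only candidates whose score vector no other candidate dominates."""
    return {name: score for name, score in candidates.items()
            if not any(dominates(other, score)
                       for other_name, other in candidates.items()
                       if other_name != name)}

# Hypothetical scores: (interaction cost, visual complexity)
candidates = {
    "bar chart":    (1, 2),
    "scatter plot": (2, 1),
    "treemap":      (3, 3),  # dominated by both of the others
}
print(skyline(candidates))  # treemap is filtered out
```

The same routine applies unchanged to step (iii) if each candidate is a column-to-graphical-coordinate binding scored on the platform's criteria.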
Continuous Performance Benchmarking Framework for ROOT
Foundational software libraries such as ROOT are under intense pressure to
avoid software regression, including performance regressions. Continuous
performance benchmarking, as a part of continuous integration and other code
quality testing, is an industry best-practice to understand how the performance
of a software product evolves over time. We present a framework, built from
industry best practices and tools, to help understand ROOT code performance
and to monitor code efficiency across several processor architectures.
It additionally supports historical performance measurements for the ROOT I/O,
vectorization, and parallelization sub-systems.

Comment: 8 pages, 5 figures, CHEP 2018 - 23rd International Conference on
Computing in High Energy and Nuclear Physics
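The core loop of any such framework is: measure a workload, compare against a stored baseline, and flag a regression when the measurement exceeds a tolerance. The sketch below illustrates that pattern only; the workload, baseline file name, and tolerance are all hypothetical, and this is not ROOT's actual benchmarking infrastructure:

```python
# Minimal continuous performance-regression check (illustrative only;
# not the ROOT framework). Times a workload, compares it with a stored
# baseline, and raises if the run is more than TOLERANCE times slower.
import json
import pathlib
import timeit

BASELINE_FILE = pathlib.Path("baseline.json")  # hypothetical location
TOLERANCE = 1.10                               # fail if >10% slower

def workload():
    # Stand-in for a real benchmark kernel (e.g. an I/O or
    # vectorization micro-benchmark).
    sum(i * i for i in range(10_000))

def check_regression():
    # Take the minimum of several repeats to reduce timing noise.
    elapsed = min(timeit.repeat(workload, number=100, repeat=5))
    if BASELINE_FILE.exists():
        baseline = json.loads(BASELINE_FILE.read_text())["elapsed"]
        if elapsed > baseline * TOLERANCE:
            raise RuntimeError(
                f"regression: {elapsed:.4f}s vs baseline {baseline:.4f}s")
    # Record the measurement so the next CI run compares against it.
    BASELINE_FILE.write_text(json.dumps({"elapsed": elapsed}))
    return elapsed
```

In a real continuous-integration setup the baseline would be keyed by commit, processor architecture, and sub-system, which is what makes historical measurements per architecture possible.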
Data-driven model reduction and transfer operator approximation
In this review paper, we will present different data-driven dimension
reduction techniques for dynamical systems that are based on transfer operator
theory as well as methods to approximate transfer operators and their
eigenvalues, eigenfunctions, and eigenmodes. The goal is to point out
similarities and differences between methods developed independently by the
dynamical systems, fluid dynamics, and molecular dynamics communities such as
time-lagged independent component analysis (TICA), dynamic mode decomposition
(DMD), and their respective generalizations. As a result, extensions and best
practices developed for one particular method can be carried over to other
related methods.
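Of the methods surveyed, exact DMD is the most compact to state: given snapshots in time order, it fits the best linear operator mapping each state to the next and reads off that operator's eigenvalues. A minimal NumPy sketch on synthetic data with known eigenvalues, assuming nothing beyond the standard exact-DMD formulation:

```python
# Minimal exact-DMD sketch: from snapshot data x_0, x_1, ..., fit the
# best linear operator A with x_{k+1} ~ A x_k and return the
# eigenvalues of its SVD-reduced form. Data below is synthetic.
import numpy as np

def dmd_eigenvalues(X, r=None):
    """X: (n, m) snapshot matrix, columns ordered in time.
    r: optional truncation rank. Returns eigenvalues of the
    reduced best-fit linear operator A_tilde."""
    X1, X2 = X[:, :-1], X[:, 1:]
    U, s, Vh = np.linalg.svd(X1, full_matrices=False)
    if r is not None:
        U, s, Vh = U[:, :r], s[:r], Vh[:r]
    # A_tilde = U^H X2 V S^{-1} projects A onto the leading POD modes.
    A_tilde = U.conj().T @ X2 @ Vh.conj().T @ np.diag(1.0 / s)
    return np.linalg.eigvals(A_tilde)

# Snapshots generated by a known linear map with eigenvalues 0.9 and 0.5.
A = np.diag([0.9, 0.5])
x0 = np.array([1.0, 1.0])
X = np.column_stack([np.linalg.matrix_power(A, k) @ x0 for k in range(10)])
print(np.sort(dmd_eigenvalues(X).real))  # recovers ~[0.5, 0.9]
```

TICA follows the same data-driven template but works with time-lagged covariance matrices instead of the raw snapshot pair, which is one of the cross-community parallels the review draws out.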
Software Challenges For HL-LHC Data Analysis
The high energy physics community is discussing where investment is needed to
prepare software for the HL-LHC and its unprecedented challenges. The ROOT
project is one of the central software players in high energy physics since
decades. From its experience and expectations, the ROOT team has distilled a
comprehensive set of areas that should see research and development in the
context of data analysis software, for making best use of HL-LHC's physics
potential. This work shows what these areas could be, why the ROOT team
believes investing in them is needed, which gains are expected, and where
related work is ongoing. It can serve as an indication for future research
proposals and collaborations.
Cytoscape: the network visualization tool for GenomeSpace workflows.
Modern genomic analysis often requires workflows incorporating multiple best-of-breed tools. GenomeSpace is a web-based visual workbench that combines a selection of these tools with mechanisms that create data flows between them. One such tool is Cytoscape 3, a popular application that enables analysis and visualization of graph-oriented genomic networks. As Cytoscape runs on the desktop, and not in a web browser, integrating it into GenomeSpace required special care in creating a seamless user experience and enabling appropriate data flows. In this paper, we present the design and operation of the Cytoscape GenomeSpace app, which accomplishes this integration, thereby providing critical analysis and visualization functionality for GenomeSpace users. It has been downloaded over 850 times since the release of its first version in September 2013.
Obvious: a meta-toolkit to encapsulate information visualization toolkits. One toolkit to bind them all
This article describes “Obvious”: a meta-toolkit that abstracts and encapsulates information visualization toolkits implemented in the Java language. It aims to unify their use and postpone the choice of which concrete toolkit(s) to use until later in the development of visual analytics applications. We also report on the lessons we have learned when wrapping popular toolkits with Obvious, namely Prefuse, the InfoVis Toolkit, partly Improvise, JUNG, and other data management libraries. We show several examples of the use of Obvious and of how the different toolkits can be combined, for instance by sharing their data models. We also show how Weka and RapidMiner, two popular machine-learning toolkits, have been wrapped with Obvious and can be used directly with all the other wrapped toolkits. We expect Obvious to start a co-evolution process: Obvious is meant to evolve as more components of information visualization systems become consensual. It is also designed to help information visualization systems adhere to best practices, providing a higher level of interoperability and leveraging the domain of visual analytics.
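The meta-toolkit approach is essentially the adapter pattern applied to data models: a toolkit-neutral abstraction, plus thin wrappers so each concrete toolkit exposes it. As a rough illustration of the pattern only (Obvious itself defines Java interfaces; the classes below are invented for the sketch):

```python
# Adapter-pattern sketch of a meta-toolkit data model. The adapter and
# "toolkit" names are hypothetical, invented for illustration; they are
# not the Obvious API.
from abc import ABC, abstractmethod

class Table(ABC):
    """Toolkit-neutral tabular data model that every adapter exposes."""
    @abstractmethod
    def rows(self):
        """Return the table contents as a list of row tuples."""

class ListToolkitAdapter(Table):
    """Wraps a toolkit whose native model is a list of tuples."""
    def __init__(self, rows):
        self._rows = list(rows)
    def rows(self):
        return self._rows

class DictToolkitAdapter(Table):
    """Wraps a toolkit whose native model is a key/value mapping."""
    def __init__(self, mapping):
        self._mapping = dict(mapping)
    def rows(self):
        return [(k, v) for k, v in self._mapping.items()]

def render(table: Table):
    # A downstream "visualization" that depends only on the shared model,
    # so either adapter can feed it.
    return "\n".join(str(row) for row in table.rows())
```

Because `render` depends only on the abstract `Table`, swapping the underlying toolkit requires no change to the visualization code, which is the interoperability benefit the article claims for the meta-toolkit design.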
Metabolomics Data Processing and Data Analysis—Current Best Practices
Metabolomics data analysis strategies are central to transforming raw metabolomics data files into meaningful biochemical interpretations that answer biological questions or generate novel hypotheses. This book contains a variety of papers from a Special Issue around the theme “Best Practices in Metabolomics Data Analysis”. Reviews and strategies for the whole metabolomics pipeline are included, and key areas such as metabolite annotation and identification, compound and spectral databases and repositories, and statistical analysis are highlighted in individual papers. Altogether, this book contains valuable information for researchers just starting in their metabolomics career as well as for those who are more experienced and looking for additional knowledge and best practices to complement key parts of their metabolomics workflows.