Graph Summarization

Author: Bonifati Angela
Dumbrava Stefania
Kondylakis Haridimos
Publication date: 01/04/2020

The continuous and rapid growth of highly interconnected datasets, which are both voluminous and complex, calls for the development of adequate processing and analytical techniques. One method for condensing and simplifying such datasets is graph summarization. It denotes a series of application-specific algorithms designed to transform graphs into more compact representations while preserving structural patterns, query answers, or specific property distributions. As this problem is common to several areas studying graph topologies, different approaches, such as clustering, compression, sampling, or influence detection, have been proposed, primarily based on statistical and optimization methods. Our chapter pinpoints the main graph summarization methods, with particular attention to the most recent approaches and novel research trends not yet covered by previous surveys. Comment: To appear in the Encyclopedia of Big Data Technologies.

arXiv.org e-Print Archive

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

HAL

Hal-Diderot

Parallel, distributed and GPU computing technologies in single-particle electron microscopy

Author: Busche Boris
Hauer Florian
Heisen Burkhard C.
Knauber Karl-Heinz
Koske Tobias
Luettich Mario
Schmeisser Martin
Stark Holger
Publication venue: International Union of Crystallography
Publication date: 01/01/2009

An introduction to the current paradigm shift towards concurrency in software.

Crossref

PubMed Central

MPG.PuRe

Computational Strategies for Scalable Genomics Analysis.

Author: Shi Lizhen
Wang Zhong
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

The revolution in next-generation DNA sequencing technologies is leading to explosive data growth in genomics, posing a significant challenge to the computing infrastructure and software algorithms for genomics analysis. Various big data technologies have been explored to scale current bioinformatics solutions up or out so that they can mine large genomics datasets. In this review, we survey some of these exciting developments in the applications of parallel distributed computing and special hardware to genomics. We comment on the pros and cons of each strategy in the context of ease of development, robustness, scalability, and efficiency. Although this review is written for an audience from the genomics and bioinformatics fields, it may also be informative for computer scientists with an interest in genomics applications.

eScholarship - University of California

API design for machine learning software: experiences from the scikit-learn project

Author: Blondel Mathieu
Buitinck Lars
Gramfort Alexandre
Grisel Olivier
Grobler Jaques
Holt Brian
Joly Arnaud
Layton Robert
Louppe Gilles
Mueller Andreas
Niculae Vlad
Pedregosa Fabian
Prettenhofer Peter
Vanderplas Jake
Varoquaux Gaël
Publication date: 01/09/2013

Scikit-learn is an increasingly popular machine learning library. Written in Python, it is designed to be simple and efficient, accessible to non-experts, and reusable in various contexts. In this paper, we present and discuss our design choices for the application programming interface (API) of the project. In particular, we describe the simple and elegant interface shared by all learning and processing units in the library and then discuss its advantages in terms of composition and reusability. The paper also comments on implementation details specific to the Python ecosystem and analyzes obstacles faced by users and developers of the library.
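The shared interface and its composability, as described in the abstract above, can be illustrated with a minimal pure-Python sketch. It does not use scikit-learn itself: the classes `MeanCenterer`, `ThresholdClassifier`, and `SimplePipeline` are hypothetical stand-ins written for this example, and only the `fit` / `transform` / `predict` method convention mirrors the library's design. The toy classifier also assumes one-dimensional inputs and that class 1 has the larger values.

```python
from typing import List, Sequence


class MeanCenterer:
    """Hypothetical toy transformer: learns the mean in fit, subtracts it in transform."""

    def fit(self, X: Sequence[float], y=None) -> "MeanCenterer":
        self.mean_ = sum(X) / len(X)  # learned state carries a trailing underscore
        return self  # returning self allows call chaining

    def transform(self, X: Sequence[float]) -> List[float]:
        return [x - self.mean_ for x in X]


class ThresholdClassifier:
    """Hypothetical toy predictor: cut point midway between the two class means.

    Assumes labels are 0 and 1 and that class 1 has the larger values.
    """

    def fit(self, X: Sequence[float], y: Sequence[int]) -> "ThresholdClassifier":
        zeros = [x for x, label in zip(X, y) if label == 0]
        ones = [x for x, label in zip(X, y) if label == 1]
        self.threshold_ = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2
        return self

    def predict(self, X: Sequence[float]) -> List[int]:
        return [1 if x > self.threshold_ else 0 for x in X]


class SimplePipeline:
    """Chains transformers and a final predictor behind the same fit/predict interface."""

    def __init__(self, steps):
        self.steps = steps

    def fit(self, X, y):
        # Fit each transformer in turn and feed its output to the next step.
        for step in self.steps[:-1]:
            X = step.fit(X, y).transform(X)
        self.steps[-1].fit(X, y)
        return self

    def predict(self, X):
        # Apply the already-fitted transformers, then the final predictor.
        for step in self.steps[:-1]:
            X = step.transform(X)
        return self.steps[-1].predict(X)


X_train = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
y_train = [0, 0, 0, 1, 1, 1]
model = SimplePipeline([MeanCenterer(), ThresholdClassifier()]).fit(X_train, y_train)
predictions = model.predict([2.5, 7.5])
print(predictions)
```

Because the pipeline exposes the same `fit` and `predict` methods as its final step, it can be used wherever a single predictor is expected, which is the composition benefit the abstract refers to.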

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

Federation ResearchOnline

HAL-CEA

A Survey on Vertical and Horizontal Scaling Platforms for Big Data Analytics

Author: Ali Ahmed Hussein
Publication venue: Penerbit UTHM
Publication date: 12/09/2019

There is no doubt that we are entering the era of big data. The challenge is how to store, search, and analyze the huge amount of data that is being generated every second. One of the main obstacles for big data researchers is finding the appropriate big data analysis platform. The basic aim of this work is to present a complete investigation of the available platforms for big data analysis in terms of vertical and horizontal scaling, along with their compatible frameworks and applications. Finally, this article outlines some research trends and other open issues in big data analytics.

Journals of Universiti Tun Hussein Onn Malaysia (UTHM)

International Journal of Integrated Engineering