A Review of Infrastructures to Process Big Multimedia Data
In recent years, the volume of information has grown faster than ever before, moving from small, structured datasets to huge, unstructured ones such as text, images, audio and video. Processing these data aims to extract relevant information on trends, challenges and opportunities from large volumes of data. The increase in the power of parallel computing has enabled Machine Learning (ML) techniques to take advantage of the processing capabilities offered by new architectures on large volumes of data. It is therefore necessary to find mechanisms that classify and organize the data, making it easier for users to extract the required information. Processing these data requires classification techniques, which are reviewed here. This work analyzes different studies on the use of ML for processing large volumes of data (Big Multimedia Data) and proposes a classification that uses as its criterion the hardware infrastructure employed in parallel machine-learning approaches applied to large volumes of data.
Static and dynamic big data partitioning on Apache Spark
Many of today’s large datasets are organized as a graph. Due to their size, it is often infeasible to process these graphs on a single machine. Therefore, many software frameworks and tools have been proposed to process graphs on top of distributed infrastructures. This software is often bundled with generic data decomposition strategies that are not optimised for specific algorithms. In this paper we study how a specific data partitioning strategy affects the performance of graph algorithms executing on Apache Spark. To this end, we implemented different graph algorithms and compared their performance using a naive partitioning solution against more elaborate strategies, both static and dynamic.
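The naive baseline the abstract refers to typically assigns each edge to a partition by hashing one of its endpoints, analogous to Spark's default HashPartitioner. A minimal, self-contained sketch (plain Python rather than Spark, with a made-up toy graph) of this strategy and of the edge-cut metric commonly used to judge partition quality:

```python
def hash_partition(edges, num_partitions):
    """Assign each edge to a partition by hashing its source vertex,
    mimicking the behaviour of a default hash partitioner."""
    parts = [[] for _ in range(num_partitions)]
    for src, dst in edges:
        parts[hash(src) % num_partitions].append((src, dst))
    return parts

def cut_vertices(parts):
    """Count vertices that end up in more than one partition --
    a proxy for the communication cost of the partitioning."""
    seen = {}
    for i, part in enumerate(parts):
        for edge in part:
            for v in edge:
                seen.setdefault(v, set()).add(i)
    return sum(1 for owners in seen.values() if len(owners) > 1)

# A tiny example graph as an edge list.
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (5, 1)]
parts = hash_partition(edges, 2)
print(cut_vertices(parts))
```

More elaborate static strategies try to keep densely connected vertices in the same partition, and dynamic ones repartition while the algorithm runs; both aim to reduce exactly the cut measured above, at the cost of extra partitioning work.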