576 research outputs found
Parallel Processing of Large Graphs
More and more large data collections are gathered worldwide in various IT
systems. Many of them have a networked nature and need to be processed and
analysed as graph structures. Due to their size, they very often require a
parallel paradigm for efficient computation. Three parallel techniques have
been compared in the paper: MapReduce, its map-side join extension and Bulk
Synchronous Parallel (BSP). They are implemented for two different graph
problems: calculation of single source shortest paths (SSSP) and collective
classification of graph nodes by means of relational influence propagation
(RIP). The methods and algorithms are applied to several network datasets
differing in size and structural profile, originating from three domains:
telecommunication, multimedia and microblog. The results revealed that
iterative graph processing with the BSP implementation always and
significantly (by up to 10 times) outperforms MapReduce, especially for
algorithms with many iterations and sparse communication. The MapReduce
extension based on map-side join is also usually noticeably more efficient
than plain MapReduce, although not by as much as BSP. Nevertheless, MapReduce
remains a good alternative for enormous networks whose data structures do not
fit in local memories.
Comment: Preprint submitted to Future Generation Computer System
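To make the BSP idea in this abstract concrete, here is a minimal single-process sketch of SSSP written in a vertex-centric, superstep style. It is an illustration only, not the paper's implementation: the function name `bsp_sssp`, the adjacency-dict input format and the non-negative-weight assumption are all hypothetical choices made for this example.

```python
import math
from collections import defaultdict


def bsp_sssp(edges, source):
    """Single-source shortest paths in a BSP-like superstep loop.

    edges: dict mapping vertex -> list of (neighbour, weight) pairs.
    Weights are assumed non-negative (assumption of this sketch).
    Returns (distances, number_of_supersteps).
    """
    vertices = set(edges)
    for neighbours in edges.values():
        vertices.update(v for v, _ in neighbours)

    dist = {v: math.inf for v in vertices}
    inbox = {source: [0]}  # messages visible at the start of a superstep
    supersteps = 0

    while inbox:
        outbox = defaultdict(list)
        # Compute phase: each vertex with pending messages works only on
        # its own state, so these iterations could run in parallel.
        for v, messages in inbox.items():
            best = min(messages)
            if best < dist[v]:
                dist[v] = best
                for u, w in edges.get(v, []):
                    outbox[u].append(best + w)
        # Barrier: messages sent now are delivered in the next superstep.
        inbox = outbox
        supersteps += 1

    return dist, supersteps
```

For the graph `{"a": [("b", 1), ("c", 4)], "b": [("c", 2), ("d", 6)], "c": [("d", 1)], "d": []}` and source `"a"`, the distances converge to `a=0, b=1, c=3, d=4`. Only vertices that received a message are active in a superstep, which is one plausible reason sparse communication favours this model over re-reading the whole graph in every iteration.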
Deep Learning in the Automotive Industry: Applications and Tools
Deep Learning refers to a set of machine learning techniques that utilize
neural networks with many hidden layers for tasks such as image
classification, speech recognition, and language understanding. Deep learning has
been proven to be very effective in these domains and is pervasively used by
many Internet services. In this paper, we describe different automotive use
cases for deep learning, in particular in the domain of computer vision. We
survey the current state of the art in libraries, tools and infrastructures
(e.g., GPUs and clouds) for implementing, training and deploying deep neural
networks. We particularly focus on convolutional neural networks and computer
vision use cases, such as the visual inspection process in manufacturing plants
and the analysis of social media data. To train neural networks, curated and
labeled datasets are essential. In particular, both the availability and scope
of such datasets are typically very limited. A main contribution of this paper
is the creation of an automotive dataset that allows us to learn and
automatically recognize different vehicle properties. We describe an end-to-end
deep learning application utilizing a mobile app for data collection and
process support, and an Amazon-based cloud backend for storage and training.
For training we evaluate the use of cloud and on-premises infrastructures
(including multiple GPUs) in conjunction with different neural network
architectures and frameworks. We assess both the training times and the
accuracy of the classifier. Finally, we demonstrate the effectiveness of the
trained classifier in a real-world setting during the manufacturing process.
Comment: 10 page
Teadusarvutuse algoritmide taandamine hajusarvutuse raamistikele (Adapting scientific computing algorithms to distributed computing frameworks)
Scientific computing uses computers and algorithms to solve problems in various sciences such as genetics, biology and chemistry. Often the goal is to model and simulate natural phenomena that would otherwise be very difficult to study in real environments.
For example, it is possible to create a model of a solar storm or a meteor hit and run computer simulations to assess the impact of the disaster on the environment. The more sophisticated and accurate the simulations are, the more computing power is required. It is often necessary to use a large number of computers, all working simultaneously on a single problem. This kind of computation is called parallel or distributed computing.
However, creating distributed computing programs is complicated and requires a lot more time and resources, because the work done simultaneously on different computers has to be synchronized. A number of software frameworks have been created to simplify this process by automating part of distributed programming.
The goal of this research was to assess the suitability of such distributed computing frameworks for complex scientific computing algorithms. The results showed that existing frameworks are very different from each other and that none of them is suitable for all types of algorithms. Some frameworks are only suitable for simple algorithms; others are not suitable when data does not fit into the computer memory. Choosing the most appropriate distributed computing framework for an algorithm can be a very complex task, because it requires studying and applying the existing frameworks.
While searching for a solution to this problem, it was decided to create a Dynamic Algorithms Modelling Application (DAMA), which is able to simulate the implementation of an algorithm in different distributed computing frameworks. DAMA helps to estimate which distributed framework is the most appropriate for a given algorithm, without actually implementing it in any of the available frameworks.
The main contribution of this study is simplifying the adoption of distributed computing frameworks for researchers who are not yet familiar with using them. It should save significant time and resources, as it is not necessary to study each of the available distributed computing frameworks in detail.
Comparing MapReduce and pipeline implementations for counting triangles
A common way to define a parallel solution for a computational problem is to apply the Divide and Conquer paradigm so that each processor acts on its own data and the processors are scheduled in parallel. MapReduce is a programming model that follows this paradigm and allows efficient solutions to be defined by decomposing a problem into steps on subsets of the input data and combining the results of each step to produce the final results. Although it is used to implement a wide variety of computational problems, MapReduce performance can be negatively affected whenever the replication factor grows or the size of the input is larger than the resources available at each processor. In this paper we show an alternative approach to implementing the Divide and Conquer paradigm, named dynamic pipeline. The main features of dynamic pipelines are illustrated on a parallel implementation of the well-known problem of counting triangles in a graph. This problem is especially interesting when the input graph either does not fit in memory or is dynamically generated. To evaluate the properties of the pipeline, a dynamic pipeline of processes and an ad-hoc version of MapReduce are implemented in the language Go, exploiting its ability to deal with channels and spawned processes. An empirical evaluation is conducted on graphs of different topologies, sizes, and densities. Observed results suggest that dynamic pipelines allow for an efficient implementation of the problem of counting triangles in a graph, particularly in dense and large graphs, drastically reducing the execution time with respect to the MapReduce implementation.
Peer Reviewed
Postprint (published version)
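As a rough illustration of why intermediate data can grow in a map/reduce formulation of triangle counting, the sketch below runs a simple wedge-based count in a single process. It is neither the paper's dynamic pipeline nor its Go code: the function name `count_triangles`, the edge-list input and the "emit wedges, then check the closing edge" scheme are assumptions chosen for this example, and vertex labels are assumed to be mutually comparable.

```python
from collections import defaultdict
from itertools import combinations


def count_triangles(edge_list):
    """Count triangles in an undirected graph given as (u, v) pairs.

    Returns (triangle_count, wedge_count); the wedge count stands in for
    the amount of intermediate data a map phase would emit.
    """
    # Orient every edge from its smaller to its larger endpoint so that
    # each triangle is discovered exactly once (at its smallest vertex).
    edge_set = {(min(u, v), max(u, v)) for u, v in edge_list if u != v}
    higher = defaultdict(set)
    for u, v in edge_set:
        higher[u].add(v)

    # "Map" step: each vertex emits every pair of its higher neighbours.
    wedges = [
        pair
        for neighbours in higher.values()
        for pair in combinations(sorted(neighbours), 2)
    ]

    # "Reduce" step: a wedge is a triangle if its two ends are adjacent.
    triangles = sum(1 for pair in wedges if pair in edge_set)
    return triangles, len(wedges)
```

On the complete graph with four vertices this finds 4 triangles from 4 wedges, while a star with three leaves emits 3 wedges and finds no triangle. In graphs with high-degree vertices the number of emitted wedges can far exceed the number of edges, which gives some intuition for the replication concern raised in the abstract.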
Evaluation and Analysis of Distributed Graph-Parallel Processing Frameworks
A number of graph-parallel processing frameworks have been proposed in recent years to address the needs of processing complex and large-scale graph-structured datasets. Although significant performance improvements made by those frameworks were reported, the comparative advantages of each framework over the others have not been fully studied, which impedes the best utilization of those frameworks for a specific graph computing task and setting. In this work, we conducted a systematic comparison study of parallel processing systems for large-scale graph computations, aiming to reveal the characteristics of those systems in performing common graph algorithms with real-world datasets on the same ground. We selected three popular graph-parallel processing frameworks (Giraph, GPS and GraphLab) for the study and also included a representative general data-parallel computing system, Spark, in the comparison in order to understand how well a general data-parallel system can run graph problems. We applied basic performance metrics measuring speed, resource utilization, and scalability to answer the basic question of which graph-parallel processing platform is better suited for which applications and datasets. Three widely used graph algorithms (clustering coefficient, shortest path length, and PageRank score) were used for benchmarking on the targeted computing systems. We ran those algorithms against three real-world network datasets with diverse characteristics and scales on a research cluster and obtained a number of interesting observations. For instance, all evaluated systems showed poor scalability (i.e., the runtime increases with more computing nodes) with small datasets, likely due to communication overhead. Further, out of the evaluated graph-parallel computing platforms, PowerGraph consistently exhibits better performance than others.
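One of the benchmark algorithms named above, PageRank, can be sketched as a plain power iteration. This is a generic single-machine illustration, not the code run on any of the evaluated frameworks: the function name `pagerank`, the default damping factor of 0.85, the fixed iteration count and the uniform redistribution of rank from dangling nodes are all assumptions of this example.

```python
def pagerank(out_links, damping=0.85, iterations=50):
    """Power-iteration PageRank.

    out_links: dict mapping node -> list of nodes it links to.
    Returns a dict mapping node -> rank; ranks sum to 1.
    """
    nodes = set(out_links)
    for targets in out_links.values():
        nodes.update(targets)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}

    for _ in range(iterations):
        # Rank held by nodes without out-links is spread over all nodes.
        dangling = sum(rank[v] for v in nodes if not out_links.get(v))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {v: base for v in nodes}
        # Every node splits its damped rank evenly among its out-links.
        for v, targets in out_links.items():
            if targets:
                share = damping * rank[v] / len(targets)
                for t in targets:
                    new_rank[t] += share
        rank = new_rank

    return rank
```

Each iteration touches every edge once and then needs all updated ranks before the next iteration can start, so a distributed version pays a synchronization and communication cost per iteration regardless of graph size. That is consistent with the abstract's remark that small datasets may scale poorly.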