Search CORE

19 research outputs found

PiCo: A Domain-Specific Language for Data Analytics Pipelines

Author: Misale Claudia
Publication venue
Publication date: 01/01/2017
Field of study

In the world of Big Data analytics, there is a series of tools aiming at simplifying programming applications to be executed on clusters. Although each tool claims to provide better programming, data and execution models—for which only informal (and often confusing) semantics is generally provided—all share a common under- lying model, namely, the Dataflow model. Using this model as a starting point, it is possible to categorize and analyze almost all aspects about Big Data analytics tools from a high level perspective. This analysis can be considered as a first step toward a formal model to be exploited in the design of a (new) framework for Big Data analytics. By putting clear separations between all levels of abstraction (i.e., from the runtime to the user API), it is easier for a programmer or software designer to avoid mixing low level with high level aspects, as we are often used to see in state-of-the-art Big Data analytics frameworks. From the user-level perspective, we think that a clearer and simple semantics is preferable, together with a strong separation of concerns. For this reason, we use the Dataflow model as a starting point to build a programming environment with a simplified programming model implemented as a Domain-Specific Language, that is on top of a stack of layers that build a prototypical framework for Big Data analytics. The contribution of this thesis is twofold: first, we show that the proposed model is (at least) as general as existing batch and streaming frameworks (e.g., Spark, Flink, Storm, Google Dataflow), thus making it easier to understand high-level data-processing applications written in such frameworks. As result of this analysis, we provide a layered model that can represent tools and applications following the Dataflow paradigm and we show how the analyzed tools fit in each level. Second, we propose a programming environment based on such layered model in the form of a Domain-Specific Language (DSL) for processing data collections, called PiCo (Pipeline Composition). The main entity of this programming model is the Pipeline, basically a DAG-composition of processing elements. This model is intended to give the user an unique interface for both stream and batch processing, hiding completely data management and focusing only on operations, which are represented by Pipeline stages. Our DSL will be built on top of the FastFlow library, exploiting both shared and distributed parallelism, and implemented in C++11/14 with the aim of porting C++ into the Big Data world

ZENODO

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Institutional Research Information System University of Turin

Connected Attribute Filtering Based on Contour Smoothness

Author: Ouzounis Georgios
Urbach Erik R.
Wilkinson M.H.F.
Publication venue: The Russian Academie of Science
Publication date: 01/01/2013
Field of study

University of Groningen

Connected Attribute Filtering Based on Contour Smoothness

Author: Ouzounis Georgios
Urbach Erik R.
Wilkinson M.H.F.
Publication venue: The Russian Academie of Science
Publication date: 01/01/2013
Field of study

Dissertations of the University of Groningen

Connected Attribute Filtering Based on Contour Smoothness

Author: Ouzounis Georgios
Urbach Erik R.
Wilkinson M.H.F.
Publication venue: The Russian Academie of Science
Publication date: 01/01/2013
Field of study

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

Recent Advances in Signal Processing

Author
Publication venue: 'IntechOpen'
Publication date: 20/04/2021
Field of study

The signal processing task is a very critical issue in the majority of new technological inventions and challenges in a variety of applications in both science and engineering fields. Classical signal processing techniques have largely worked with mathematical models that are linear, local, stationary, and Gaussian. They have always favored closed-form tractability over real-world accuracy. These constraints were imposed by the lack of powerful computing tools. During the last few decades, signal processing theories, developments, and applications have matured rapidly and now include tools from many areas of mathematics, computer science, physics, and engineering. This book is targeted primarily toward both students and researchers who want to be exposed to a wide variety of signal processing techniques and algorithms. It includes 27 chapters that can be categorized into five different areas depending on the application at hand. These five categories are ordered to address image processing, speech processing, communication systems, time-series analysis, and educational packages respectively. The book has the advantage of providing a collection of applications that are completely independent and self-contained; thus, the interested reader can choose any chapter and skip to another without losing continuity

Directory of Open Access Books (DOAB)

The STAPL Parallel Container Framework

Author: Tanase Ilie Gabriel
Publication venue
Publication date
Field of study

The Standard Template Adaptive Parallel Library (STAPL) is a parallel programming infrastructure that extends C with support for parallelism. STAPL provides a run-time system, a collection of distributed data structures (pContainers) and parallel algorithms (pAlgorithms), and a generic methodology for extending them to provide customized functionality. Parallel containers are data structures addressing issues related to data partitioning, distribution, communication, synchronization, load balancing, and thread safety. This dissertation presents the STAPL Parallel Container Framework (PCF), which is designed to facilitate the development of generic parallel containers. We introduce a set of concepts and a methodology for assembling a pContainer from existing sequential or parallel containers without requiring the programmer to deal with concurrency or data distribution issues. The STAPL PCF provides a large number of basic data parallel structures (e.g., pArray, pList, pVector, pMatrix, pGraph, pMap, pSet). The STAPL PCF is distinguished from existing work by offering a class hierarchy and a composition mechanism which allows users to extend and customize the current container base for improved application expressivity and performance. We evaluate the performance of the STAPL pContainers on various parallel machines including a massively parallel CRAY XT4 system and an IBM P5-575 cluster. We show that the pContainer methods, generic pAlgorithms, and different applications, all provide good scalability on more than 10^4 processors

Texas A&M Repository

Multimedia Retrieval

Author
Publication venue: Springer
Publication date: 01/01/2007
Field of study

University of Twente Research Information

Runtime support for load balancing of parallel adaptive and irregular applications

Author: Barker Kevin James
Publication venue: W&M ScholarWorks
Publication date: 01/01/2004
Field of study

Applications critical to today\u27s engineering research often must make use of the increased memory and processing power of a parallel machine. While advances in architecture design are leading to more and more powerful parallel systems, the software tools needed to realize their full potential are in a much less advanced state. In particular, efficient, robust, and high-performance runtime support software is critical in the area of dynamic load balancing. While the load balancing of loosely synchronous codes, such as field solvers, has been studied extensively for the past 15 years, there exists a class of problems, known as asynchronous and highly adaptive , for which the dynamic load balancing problem remains open. as we discuss, characteristics of this class of problems render compile-time or static analysis of little benefit, and complicate the dynamic load balancing task immensely.;We make two contributions to this area of research. The first is the design and development of a runtime software toolkit, known as the Parallel Runtime Environment for Multi-computer Applications, or PREMA, which provides interprocessor communication, a global namespace, a framework for the implementation of customized scheduling policies, and several such policies which are prevalent in the load balancing literature. The PREMA system is designed to support coarse-grained domain decompositions with the goals of portability, flexibility, and maintainability in mind, so that developers will quickly feel comfortable incorporating it into existing codes and developing new codes which make use of its functionality. We demonstrate that the programming model and implementation are efficient and lead to the development of robust and high-performance applications.;Our second contribution is in the area of performance modeling. In order to make the most effective use of the PREMA runtime software, certain parameters governing its execution must be set off-line. Optimal values for these parameters may be determined through repeated executions of the target application; however, this is not always possible, particularly in large-scale environments and long-running applications. We present an analytic model that allows the user to quickly and inexpensively predict application performance and fine-tune applications built on the PREMA platform

College of William & Mary: W&M Publish

Semantic object watermark re-synchronization based on skeleton vertex corresponds

Author: Kollias S
Ntalianis K
Raftopoulos K
Tzouveli P
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2010
Field of study

DSpace at NTUA

Interactions in Virtual Worlds:Proceedings Twente Workshop on Language Technology 15

Author
Publication venue: 'University Library/University of Twente'
Publication date: 19/05/1999
Field of study

University of Twente Research Information