Search CORE

1,387 research outputs found

An efficient recursive shortest spanning tree algorithm using linking properties

Author: Constantinides AG
Kwok SH
Siu WC
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2004
Field of study

Published versio

The Hong Kong Polytechnic University Pao Yue-kong Library

Crossref

PolyU Institutional Repository

Spiral - Imperial College Digital Repository

Hong Kong University of Science and Technology Institutional Repository

Privaatsust säilitavad paralleelarvutused graafiülesannete jaoks

Author: Anagreh Mohammad
Publication venue
Publication date: 15/05/2023
Field of study

Turvalisel mitmeosalisel arvutusel põhinevate reaalsete privaatsusrakenduste loomine on SMC-protokolli arvutusosaliste ümmarguse keerukuse tõttu keeruline. Privaatsust säilitavate tehnoloogiate uudsuse ja nende probleemidega kaasnevate suurte arvutuskulude tõttu ei ole paralleelseid privaatsust säilitavaid graafikualgoritme veel uuritud. Graafikalgoritmid on paljude arvutiteaduse rakenduste selgroog, nagu navigatsioonisüsteemid, kogukonna tuvastamine, tarneahela võrk, hüperspektraalne kujutis ja hõredad lineaarsed lahendajad. Graafikalgoritmide suurte privaatsete andmekogumite töötlemise kiirendamiseks ja kõrgetasemeliste arvutusnõuete täitmiseks on vaja privaatsust säilitavaid paralleelseid algoritme. Seetõttu esitleb käesolev lõputöö tipptasemel protokolle privaatsuse säilitamise paralleelarvutustes erinevate graafikuprobleemide jaoks, ühe allika lühima tee, kõigi paaride lühima tee, minimaalse ulatuva puu ja metsa ning algebralise tee arvutamise. Need uued protokollid on üles ehitatud kombinatoorsete ja algebraliste graafikualgoritmide põhjal lisaks SMC protokollidele. Nende protokollide koostamiseks kasutatakse ka ühe käsuga mitut andmeoperatsiooni, et vooru keerukust tõhusalt vähendada. Oleme väljapakutud protokollid juurutanud Sharemind SMC platvormil, kasutades erinevaid graafikuid ja võrgukeskkondi. Selles lõputöös kirjeldatakse uudseid paralleelprotokolle koos nendega seotud algoritmide, tulemuste, kiirendamise, hindamiste ja ulatusliku võrdlusuuringuga. Privaatsust säilitavate ühe allika lühimate teede ja minimaalse ulatusega puuprotokollide tegelike juurutuste tulemused näitavad tõhusat meetodit, mis vähendas tööaega võrreldes varasemate töödega sadu kordi. Lisaks ei ole privaatsust säilitavate kõigi paaride lühima tee protokollide hindamine ja ulatuslik võrdlusuuringud sarnased ühegi varasema tööga. Lisaks pole kunagi varem käsitletud privaatsust säilitavaid metsa ja algebralise tee arvutamise protokolle.Constructing real-world privacy applications based on secure multiparty computation is challenging due to the round complexity of the computation parties of SMC protocol. Due to the novelty of privacy-preserving technologies and the high computational costs associated with these problems, parallel privacy-preserving graph algorithms have not yet been studied. Graph algorithms are the backbone of many applications in computer science, such as navigation systems, community detection, supply chain network, hyperspectral image, and sparse linear solvers. In order to expedite the processing of large private data sets for graphs algorithms and meet high-end computational demands, privacy-preserving parallel algorithms are needed. Therefore, this Thesis presents the state-of-the-art protocols in privacy-preserving parallel computations for different graphs problems, single-source shortest path (SSSP), All-pairs shortest path (APSP), minimum spanning tree (MST) and forest (MSF), and algebraic path computation. These new protocols have been constructed based on combinatorial and algebraic graph algorithms on top of the SMC protocols. Single-instruction-multiple-data (SIMD) operations are also used to build those protocols to reduce the round complexities efficiently. We have implemented the proposed protocols on the Sharemind SMC platform using various graphs and network environments. This Thesis outlines novel parallel protocols with their related algorithms, the results, speed-up, evaluations, and extensive benchmarking. The results of the real implementations of the privacy-preserving single-source shortest paths and minimum spanning tree protocols show an efficient method that reduced the running time hundreds of times compared with previous works. Furthermore, the evaluation and extensive benchmarking of privacy-preserving All-pairs shortest path protocols are not similar to any previous work. Moreover, the privacy-preserving minimum spanning forest and algebraic path computation protocols have never been addressed before.https://www.ester.ee/record=b555865

DSpace at Tartu University Library

Variational methods and its applications to computer vision

Author: Pellegrino Erika
Publication venue: Electrical and Electronic Engineering, Imperial College London
Publication date: 01/08/2022
Field of study

Many computer vision applications such as image segmentation can be formulated in a ''variational'' way as energy minimization problems. Unfortunately, the computational task of minimizing these energies is usually difficult as it generally involves non convex functions in a space with thousands of dimensions and often the associated combinatorial problems are NP-hard to solve. Furthermore, they are ill-posed inverse problems and therefore are extremely sensitive to perturbations (e.g. noise). For this reason in order to compute a physically reliable approximation from given noisy data, it is necessary to incorporate into the mathematical model appropriate regularizations that require complex computations. The main aim of this work is to describe variational segmentation methods that are particularly effective for curvilinear structures. Due to their complex geometry, classical regularization techniques cannot be adopted because they lead to the loss of most of low contrasted details. In contrast, the proposed method not only better preserves curvilinear structures, but also reconnects some parts that may have been disconnected by noise. Moreover, it can be easily extensible to graphs and successfully applied to different types of data such as medical imagery (i.e. vessels, hearth coronaries etc), material samples (i.e. concrete) and satellite signals (i.e. streets, rivers etc.). In particular, we will show results and performances about an implementation targeting new generation of High Performance Computing (HPC) architectures where different types of coprocessors cooperate. The involved dataset consists of approximately 200 images of cracks, captured in three different tunnels by a robotic machine designed for the European ROBO-SPECT project.Open Acces

Spiral - Imperial College Digital Repository

High-speed detection of emergent market clustering via an unsupervised parallel genetic algorithm

Author: Gebbie Tim
Hendricks Dieter
Wilcox Diane
Publication venue: 'Academy of Science of South Africa'
Publication date: 02/08/2015
Field of study

We implement a master-slave parallel genetic algorithm (PGA) with a bespoke log-likelihood fitness function to identify emergent clusters within price evolutions. We use graphics processing units (GPUs) to implement a PGA and visualise the results using disjoint minimal spanning trees (MSTs). We demonstrate that our GPU PGA, implemented on a commercially available general purpose GPU, is able to recover stock clusters in sub-second speed, based on a subset of stocks in the South African market. This represents a pragmatic choice for low-cost, scalable parallel computing and is significantly faster than a prototype serial implementation in an optimised C-based fourth-generation programming language, although the results are not directly comparable due to compiler differences. Combined with fast online intraday correlation matrix estimation from high frequency data for cluster identification, the proposed implementation offers cost-effective, near-real-time risk assessment for financial practitioners.Comment: 10 pages, 5 figures, 4 tables, More thorough discussion of implementatio

arXiv.org e-Print Archive

Crossref

Academy of Science of South Africa (ASSAf): Open Journal Systems

Directory of Open Access Journals

Active Object Classification from 3D Range Data with Mobile Robots

Author: Patten Timothy
Publication venue: Faculty of Engineering and Information Technologies, School of Aerospace, Mechanical and Mechatronic Engineering
Publication date: 01/01/2017
Field of study

This thesis addresses the problem of how to improve the acquisition of 3D range data with a mobile robot for the task of object classification. Establishing the identities of objects in unknown environments is fundamental for robotic systems and helps enable many abilities such as grasping, manipulation, or semantic mapping. Objects are recognised by data obtained from sensor observations, however, data is highly dependent on viewpoint; the variation in position and orientation of the sensor relative to an object can result in large variation in the perception quality. Additionally, cluttered environments present a further challenge because key data may be missing. These issues are not always solved by traditional passive systems where data are collected from a fixed navigation process then fed into a perception pipeline. This thesis considers an active approach to data collection by deciding where is most appropriate to make observations for the perception task. The core contributions of this thesis are a non-myopic planning strategy to collect data efficiently under resource constraints, and supporting viewpoint prediction and evaluation methods for object classification. Our approach to planning uses Monte Carlo methods coupled with a classifier based on non-parametric Bayesian regression. We present a novel anytime and non-myopic planning algorithm, Monte Carlo active perception, that extends Monte Carlo tree search to partially observable environments and the active perception problem. This is combined with a particle-based estimation process and a learned observation likelihood model that uses Gaussian process regression. To support planning, we present 3D point cloud prediction algorithms and utility functions that measure the quality of viewpoints by their discriminatory ability and effectiveness under occlusion. The utility of viewpoints is quantified by information-theoretic metrics, such as mutual information, and an alternative utility function that exploits learned data is developed for special cases. The algorithms in this thesis are demonstrated in a variety of scenarios. We extensively test our online planning and classification methods in simulation as well as with indoor and outdoor datasets. Furthermore, we perform hardware experiments with different mobile platforms equipped with different types of sensors. Most significantly, our hardware experiments with an outdoor robot are to our knowledge the first demonstrations of online active perception in a real outdoor environment. Active perception has broad significance in many applications. This thesis emphasises the advantages of an active approach to object classification and presents its assimilation with a wide range of robotic systems, sensors, and perception algorithms. By demonstration of performance enhancements and diversity, our hope is that the concept of considering perception and planning in an integrated manner will be of benefit in improving current systems that rely on passive data collection

Sydney eScholarship

Improving Scalability and Usability of Parallel Runtime Environments for High Availability and High Performance Systems

Author: Angskun Thara
Publication venue: TRACE: Tennessee Research and Creative Exchange
Publication date: 01/12/2007
Field of study

The number of processors embedded in high performance computing platforms is growing daily to solve larger and more complex problems. Hence, parallel runtime environments have to support and adapt to the underlying platforms that require scalability and fault management in more and more dynamic environments. This dissertation aims to analyze, understand and improve the state of the art mechanisms for managing highly dynamic, large scale applications. This dissertation demonstrates that the use of new scalable and fault-tolerant topologies, combined with rerouting techniques, builds parallel runtime environments, which are able to efficiently and reliably deliver sets of information to a large number of processes. Several important graph properties are provided to illustrate the theoretical capability of these topologies in terms of both scalability and fault-tolerance, such as reasonable degree, regular graph, low diameter, symmetric graph, low cost factor, low message traffic density, optimal connectivity, low fault-diameter and strongly resilient. The dissertation builds a communication framework based on these topologies to support parallel runtime environments. Such a framework can handle multiple types of messages, e.g., unicast, multicast, broadcast and all-gather. Additionally, the communication framework has been formally verified to work in both normal and failure circumstances without creating any of the common problems such as broadcast storm, deadlock and non-progress cycle

University of Tennessee, Knoxville: Trace

Recommended from our members

A High-Performance Domain-Specific Language and Code Generator for General N-body Problems

Author: Aghababaie Beni Laleh
Publication venue: eScholarship, University of California
Publication date: 01/01/2019
Field of study

General N-body problems are a set of problems in which an update to a single element in the system depends on every other element. N-body problems are ubiquitous, with applications in various domains ranging from scientific computing simulations in molecular dynamics, astrophysics, acoustics, and fluid dynamics all the way to computer vision, data mining and machine learning problems. Different N-body algorithms have been designed and implemented in these various fields. However, there is a big gap between the algorithm one designs on paper and the code that runs efficiently on a parallel system. It is time-consuming to write fast, parallel, and scalable code for these problems. On the other hand, the sheer scale and growth of modern scientific datasets necessitate exploiting the power of both parallel and approximation algorithms where there is a potential to trade-off accuracy for performance. The main problem that we are tackling in this thesis is how to automatically generate asymptotically optimal N-body algorithms from the high-level specification of the problem. We combine the body of work in performance optimizations, compilers and the domain of N-body problems to build a unified system where domain scientists can write programs at the high level while attaining performance of code written by an expert at the low level.In order to generate a high-performance, scalable code for this group of problems, we take the following steps in this thesis; first, we propose a unified algorithmic framework named PASCAL in order to address the challenge of designing a general algorithmic template to represent the class of N-body problems. PASCAL utilizes space-partitioning trees and user-controlled pruning/approximations to reduce the asymptotic runtime complexity from linear to logarithmic in the number of data points. In PASCAL, we design an algorithm that automatically generates conditions for pruning or approximation of an N-body problem considering the problem's definition. In order to evaluate PASCAL, we developed tree-based algorithms for six well-known problems: k-nearest neighbors, range search, minimum spanning tree, kernel density estimation, expectation maximization, and Hausdorff distance. We show that applying domain-specific optimizations and parallelization to the algorithms written in PASCAL achieves 10x to 230x speedup compared to state-of-the-art libraries on a dual-socket Intel Xeon processor with 16 cores on real-world datasets. Second, we extend the PASCAL framework to build PASCAL-X that adds support for NUMA-aware parallelization. PASCAL-X also presents insights on the influence of tuning parameters. Tuning parameters such as leaf size (influences the shape of the tree) and cut-off level (controls the granularity of tasks) of the space-partitioning trees result in performance improvement of up to 4.6x. A key goal is to generate scalable and high-performance code automatically without sacrificing productivity. That implies minimizing the effort the users have to put in to generate the desired high-performance code. Another critical factor is the adaptivity, which indicates the amount of effort that is required to extend the high-performance code generation to new N-body problems. Finally, we consider these factors and develop a domain-specific language and code generator named Portal, which is built on top of PASCAL-X. Portal's language design is inspired by the mathematical representation of N-body problems, resulting in an intuitive language for rapid implementation of a variety of problems. Portal's back-end is designed and implemented to generate optimized, parallel, and scalable implementations for multi-core systems. We demonstrate that the performance achieved by using Portal is comparable to that of expert hand-optimized code while providing productivity for domain scientists. For instance, using Portal for the k-nearest neighbors problem gains performance that is similar to the hand-optimized code, while reducing the lines of code by 68x. To the best of our knowledge, there are no known libraries or frameworks that implement parallel asymptotically optimal algorithms for the class of general N-body problems and this thesis primarily aims to fill this gap. Finally, we present a case study of Portal for the real-world problem of face clustering. In this case study, we show that Portal not only provides a fast solution for the face clustering problem with similar accuracy as the state-of-the-art algorithm, but also it provides productivity by implementing the face clustering algorithm in only 14 lines of Portal code

eScholarship - University of California

Fourteenth Biennial Status Report: März 2017 - February 2019

Author
Publication venue: Max-Planck-Institut für Informatik
Publication date: 01/01/2019
Field of study

MPG.PuRe

Towards Automatic and Adaptive Optimizations of MPI Collective Operations

Author: Pjesivac-Grbovic Jelena
Publication venue: TRACE: Tennessee Research and Creative Exchange
Publication date: 01/12/2007
Field of study

Message passing is one of the most commonly used paradigms of parallel programming. Message Passing Interface, MPI, is a standard used in scientific and high-performance computing. Collective operations are a subset of MPI standard that deals with processes synchronization, data exchange and computation among a group of processes. The collective operations are commonly used and can be application performance bottleneck. The performance of collective operations depends on many factors, some of which are the input parameters (e.g., communicator and message size); system characteristics (e.g., interconnect type); the application computation and communication pattern; and internal algorithm parameters (e.g., internal segment size). We refer to an algorithm and its internal parameters as a method. The goal of this dissertation is a performance improvement of MPI collective operations and applications that use them. In our framework, during a collective call, a system-specific decision function is invoked to select the most appropriate method for the particular collective instance. This dissertation focuses on automatic techniques for system-specific decision function generation. Our approach takes the following steps: first, we collect method performance information on the system of interest; second, we analyze this information using parallel communication models, graphical encoding methods, and decision trees; third, based on the previous step, we automatically generate the system-specific decision function to be used at run-time. In situation when a detailed performance measurement is not feasible, method performance models can be used to supplement the measured method performance information. We build and evaluate parallel communication models of 35 different collective algorithms. These models are built on top of the three commonly used point-to-point communication models, Hockney, LogGP, and PLogP.We use the method performance information on a system to build quadtrees and C4.5 decision trees of variable sizes and accuracies. The collective method selection functions are then generated automatically from these trees. Our experiments show that quadtrees of three or four levels are often enough to approximate experimentally optimal decision with a small mean performance penalty (less than 10%). The C4.5 decision trees are even more accurate (with mean performance penalty of less than 5%). The size and accuracy of C4.5 decision trees can be further improved with use of appropriate composite attributes (such as “total message size”, or “even communicator size”.) Finally, we apply these techniques to tune the collective operations on the Grig cluster at the University of Tennessee and to improve an application performance on the Cray XT4 system at Oak Ridge National Laboratory. The tuned collective is able to achieve more than 40% mean performance improvement over the native broadcast implementation. Using the platform-specific reduce on Cray XT4 lead to 10% improvement in the overall application performance. Our results show that the methods we explored are both applicable and effective for the system-specific optimizations of collective operations and are a right step toward automatically tunable, adaptive, MPI collectives

University of Tennessee, Knoxville: Trace

Region-based representations of image and video: segmentation tools for multimedia services

Author: Marqués Acosta Fernando
Salembier Clairon Philippe Jean
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/1999
Field of study

This paper discusses region-based representations of image and video that are useful for multimedia services such as those supported by the MPEG-4 and MPEG-7 standards. Classical tools related to the generation of the region-based representations are discussed. After a description of the main processing steps and the corresponding choices in terms of feature spaces, decision spaces, and decision algorithms, the state of the art in segmentation is reviewed. Mainly tools useful in the context of the MPEG-4 and MPEG-7 standards are discussed. The review is structured around the strategies used by the algorithms (transition based or homogeneity based) and the decision spaces (spatial, spatio-temporal, and temporal). The second part of this paper proposes a partition tree representation of images and introduces a processing strategy that involves a similarity estimation step followed by a partition creation step. This strategy tries to find a compromise between what can be done in a systematic and universal way and what has to be application dependent. It is shown in particular how a single partition tree created with an extremely simple similarity feature can support a large number of segmentation applications: spatial segmentation, motion estimation, region-based coding, semantic object extraction, and region-based retrieval.Peer ReviewedPostprint (published version

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC