
    Parallel programming in biomedical signal processing

    Dissertation for the degree of Master in Biomedical Engineering. Patients with neuromuscular and cardiorespiratory diseases need to be monitored continuously. This constant monitoring produces huge amounts of multivariate data that must be processed as soon as possible so that the most relevant features can be extracted. Parallel processing, an area of the computational sciences, comes naturally as a way to answer this problem. For parallel processing to succeed, pre-existing signal processing algorithms must be adapted to modern computer architectures with several processing units. In this work, parallel processing techniques are applied to biosignals, connecting computer science to the biomedical domain. Several considerations are made on how to design parallel algorithms for signal processing, following the data-parallel paradigm. The emphasis is on algorithm design rather than on the computing systems that execute these algorithms, although both shared-memory and distributed-memory systems are discussed. Two signal processing tools integrating some of the parallel programming concepts presented throughout this work were developed. These tools allow fast and efficient analysis of long-term biosignals. The two kinds of analysis focus on heart rate variability and breathing frequency, and target the processing of electrocardiograms and respiratory signals, respectively. The proposed tools exploit the several processing units that most current computers include, giving the clinician a fast tool without requiring a system specifically set up to run parallel programs.
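
    A minimal sketch of the data-parallel pattern described above, assuming the long-term recording is already available as a NumPy array; the chunking scheme, the crude threshold beat detector, and all names here are illustrative assumptions, not the dissertation's actual tools.

```python
# Sketch: data-parallel analysis of a long biosignal (illustrative only).
# Assumes the ECG is a 1-D NumPy array and that a crude threshold detector
# is good enough for demonstration purposes.
from multiprocessing import Pool

import numpy as np

FS = 250  # assumed sampling frequency in Hz


def beats_in_chunk(chunk):
    """Count threshold crossings as a stand-in for real QRS detection."""
    threshold = chunk.mean() + 2 * chunk.std()
    above = chunk > threshold
    # A beat is counted at each rising edge of the thresholded signal.
    return int(np.sum(~above[:-1] & above[1:]))


def mean_heart_rate(ecg, n_workers=4):
    """Split the signal into independent chunks and process them in parallel."""
    chunks = np.array_split(ecg, n_workers)
    with Pool(n_workers) as pool:
        beats = pool.map(beats_in_chunk, chunks)
    minutes = len(ecg) / FS / 60.0
    return sum(beats) / minutes  # beats per minute


if __name__ == "__main__":
    # Synthetic stand-in for a long-term recording: noise plus periodic spikes.
    rng = np.random.default_rng(0)
    ecg = rng.normal(0, 0.1, FS * 600)        # ten minutes of "signal"
    ecg[::FS] += 5.0                          # one artificial beat per second
    print(f"{mean_heart_rate(ecg):.1f} bpm")  # expect roughly 60 bpm
```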

    The Role of Distributed Computing in Big Data Science: Case Studies in Forensics and Bioinformatics

    2014 - 2015. The era of Big Data is leading to the generation of large amounts of data, which require storage and analysis capabilities that can only be addressed by distributed computing systems. To facilitate large-scale distributed computing, many programming paradigms and frameworks have been proposed, such as MapReduce and Apache Hadoop, which transparently address some issues of distributed systems and hide most of their technical details. Hadoop is currently the most popular and mature framework supporting the MapReduce paradigm, and it is widely used to store and process Big Data using a cluster of computers. Solutions such as Hadoop are attractive, since they simplify the transformation of an application from a non-parallel to a distributed one by means of general utilities and without requiring many skills. However, without any algorithm engineering activity, some target applications are not altogether fast and efficient, and they can suffer from several problems and drawbacks when executed on a distributed system. In fact, a distributed implementation is a necessary but not sufficient condition for remarkable performance with respect to a non-parallel counterpart. Therefore, it is necessary to assess how distributed solutions run on a Hadoop cluster, and how their performance can be improved to reduce resource consumption and completion times.

    In this dissertation, we show how Hadoop-based implementations can be enhanced by careful algorithm engineering, tuning, profiling and code improvements. We also analyze how to achieve these goals by working on some critical points, such as data-local computation, input split size, number and granularity of tasks, cluster configuration, and input/output representation. In particular, to address these issues, we choose case studies from two research areas where the amount of data is rapidly increasing, namely, Digital Image Forensics and Bioinformatics. We describe full-fledged implementations to show how to design, engineer, improve and evaluate Hadoop-based solutions for the Source Camera Identification problem, i.e., recognizing the camera used to take a given digital image, adopting the algorithm by Fridrich et al., and for two of the main problems in Bioinformatics, i.e., alignment-free sequence comparison and extraction of k-mer cumulative or local statistics.

    The results achieved by our improved implementations show that they are substantially faster than the non-parallel counterparts, and remarkably faster than the corresponding naive Hadoop-based implementations. For example, our solution for k-mer statistics is approximately 30× faster than our naive Hadoop-based implementation, and about 40× faster than an analogous tool built on Hadoop. In addition, our applications are scalable, i.e., execution times are (approximately) halved by doubling the computing units. Indeed, algorithm engineering based on smart improvements and supported by careful profiling and tuning may lead to much better experimental performance while avoiding potential problems. We also highlight how the proposed solutions, tips, tricks and insights can be used in other research areas and problems. Although Hadoop simplifies some tasks of distributed environments, it must be thoroughly understood to achieve remarkable performance. It is not enough to be an expert of the application domain to build Hadoop-based implementations; to achieve good performance, expertise in distributed systems, algorithm engineering, tuning, profiling, etc. is also required. Therefore, the best performance depends heavily on the degree of cooperation between the domain expert and the distributed algorithm engineer. [edited by Author]
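
    As a concrete illustration of the kind of job discussed above, here is a naive k-mer counting pass in the Hadoop Streaming style (plain stdin/stdout scripts); the value of k and the one-sequence-per-line input format are assumptions for the sketch, and none of the engineering refinements the dissertation applies (split sizing, task granularity, compact representations) are included.

```python
#!/usr/bin/env python3
# mapper.py -- emit (k-mer, 1) pairs for every k-mer in each input sequence.
import sys

K = 8  # assumed k-mer length

for line in sys.stdin:
    seq = line.strip().upper()
    for i in range(len(seq) - K + 1):
        print(f"{seq[i:i + K]}\t1")
```

```python
#!/usr/bin/env python3
# reducer.py -- sum the counts for each k-mer. Hadoop Streaming delivers
# lines sharing a key contiguously, so a running total suffices.
import sys

current, total = None, 0
for line in sys.stdin:
    kmer, count = line.rstrip("\n").split("\t")
    if kmer != current:
        if current is not None:
            print(f"{current}\t{total}")
        current, total = kmer, 0
    total += int(count)
if current is not None:
    print(f"{current}\t{total}")
```

    Such a pair would be launched with Hadoop Streaming by pointing -mapper and -reducer at the two scripts; since the reduction is a plain sum, reusing the reducer as a combiner is one simple example of the kind of tuning the dissertation advocates.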

    Deterministic voting in distributed systems using error-correcting codes

    Distributed voting is an important problem in reliable computing. In an N Modular Redundant (NMR) system, the N computational modules execute identical tasks and need to periodically vote on their current states. In this paper, we propose a deterministic majority voting algorithm for NMR systems. Our voting algorithm uses error-correcting codes to drastically reduce the average-case communication complexity. In particular, we show that the efficiency of our voting algorithm can be improved by choosing the parameters of the error-correcting code to match the probability of computational faults. For example, consider an NMR system with 31 modules, each with a state of m bits, where each module has an independent computational error probability of 10^-3. In this NMR system, our algorithm reduces the average-case communication complexity to approximately 1.0825m, compared with the communication complexity of 31m of the naive algorithm in which every module broadcasts its local result to all other modules. We have also implemented the voting algorithm over a network of workstations. The experimental performance results match the theoretical predictions well.
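
    For scale, the naive baseline cited above (every module broadcasts its full m-bit state, costing N*m bits per round) can be simulated in a few lines. The sketch below is only that baseline plus a bitwise majority, not the paper's ECC-based scheme; the fault model is the independent-error assumption stated in the abstract.

```python
# Naive NMR majority voting baseline (illustrative simulation only).
# Every module broadcasts its full m-bit state, so the total
# communication is N*m bits per round.
import random

N, M = 31, 64            # modules and state width, as in the example above
P_ERR = 1e-3             # independent per-module computational error rate

correct_state = random.getrandbits(M)

# Each module holds the correct state unless it suffered a computational
# fault, in which case its state is an arbitrary m-bit value.
states = [
    random.getrandbits(M) if random.random() < P_ERR else correct_state
    for _ in range(N)
]

# Bitwise majority over all broadcast states.
voted = 0
for bit in range(M):
    ones = sum((s >> bit) & 1 for s in states)
    if ones > N // 2:
        voted |= 1 << bit

print(voted == correct_state)                # True unless >15 modules faulted
print(f"naive communication: {N * M} bits")  # the 31m cost cited above
```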

    A Multi-Core Solver for Parity Games

    We describe a parallel algorithm for solving parity games, with applications in, e.g., modal mu-calculus model checking with arbitrary alternations, and (branching) bisimulation checking. The algorithm is based on Jurdzinski's Small Progress Measures; in fact, it is a class of algorithms, parameterized by a selection heuristic.

    Our algorithm operates lock-free, and mostly wait-free (except for infrequent termination detection), and thus allows maximum parallelism. Additionally, we conserve memory by avoiding storage of predecessor edges for the parity graph through strictly forward-looking heuristics.

    We evaluate our multi-core implementation's behaviour on parity games obtained from mu-calculus model checking problems for a set of communication protocols, randomly generated problem instances, and parametric problem instances from the literature.
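
    The sequential core that this work parallelizes is Jurdzinski's lifting iteration; the sketch below shows only that core (no lock-free lifting, no selection heuristic), and the game encoding and the two-vertex example are assumptions for illustration.

```python
# Sequential core of Jurdzinski's Small Progress Measures. Every vertex is
# assumed to have at least one successor, as usual for parity games.

TOP = None  # measure "Top": vertices stuck at Top are won by player Odd


def solve(prio, owner, succ):
    """prio[v]: priority; owner[v]: 0 (Even) or 1 (Odd); succ[v]: successors."""
    d = max(prio) + 1
    # Odd component j is bounded by the number of priority-j vertices.
    maxv = [sum(1 for p in prio if p == j) if j % 2 else 0 for j in range(d)]
    rho = [tuple([0] * d) for _ in prio]

    def key(m):  # total order on measures, with TOP strictly above everything
        return (1,) if m is TOP else (0,) + m

    def prog(v, w):
        i = prio[v]
        m = rho[w]
        if m is TOP:
            return TOP
        lst = [m[j] if j <= i else 0 for j in range(d)]
        if i % 2 == 0:               # even priority: least measure >=_i rho(w)
            return tuple(lst)
        for j in range(i, 0, -2):    # odd priority: increment with carry
            if lst[j] < maxv[j]:
                lst[j] += 1
                return tuple(lst)
            lst[j] = 0
        return TOP                   # overflow: lift to Top

    changed = True
    while changed:                   # keep lifting until a fixpoint is reached
        changed = False
        for v in range(len(prio)):
            pick = min if owner[v] == 0 else max
            new = pick((prog(v, w) for w in succ[v]), key=key)
            if key(new) > key(rho[v]):
                rho[v], changed = new, True
    return ["Even" if m is not TOP else "Odd" for m in rho]


# Two-vertex example: a priority-1 self-loop owned by Odd is won by Odd;
# a priority-0 self-loop owned by Even is won by Even.
print(solve(prio=[1, 0], owner=[1, 0], succ=[[0], [1]]))  # ['Odd', 'Even']
```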

    CCL: a portable and tunable collective communication library for scalable parallel computers

    A collective communication library for parallel computers includes frequently used operations such as broadcast, reduce, scatter, gather, concatenate, synchronize, and shift. Such a library provides users with a convenient programming interface, efficient communication operations, and the advantage of portability. One library of this nature, the Collective Communication Library (CCL), has been designed for IBM's line of scalable parallel computers. CCL is part of the parallel application programming interface of the recently announced IBM 9076 Scalable POWERparallel System 1 (SP1). In this paper, we examine several issues related to the functionality, correctness, and performance of a portable collective communication library, focusing on three novel aspects in the design and implementation of CCL: 1) the introduction of process groups, 2) the definition of semantics that ensures correctness, and 3) the design of new and tunable algorithms based on a realistic point-to-point communication model.
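
    CCL's own API is not reproduced here, but the operations it names map directly onto today's MPI collectives; the small mpi4py sketch below (an assumption standing in for CCL's interface) shows the style of programming convenience such a library provides.

```python
# Illustration of the collective operations named above, using mpi4py's MPI
# collectives rather than CCL's actual API.
# Run with: mpiexec -n 4 python collectives.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

# Broadcast: the root sends the same value to every process in the group.
config = comm.bcast({"iterations": 10} if rank == 0 else None, root=0)

# Scatter: the root deals one slice of the input to each process.
chunk = comm.scatter([[i] * 2 for i in range(size)] if rank == 0 else None,
                     root=0)

# Each process computes on its own slice, then the results are combined.
local_sum = sum(chunk)

# Reduce: combine per-process values with an associative operation at root.
total = comm.reduce(local_sum, op=MPI.SUM, root=0)

# Gather: the root collects one value from every process (a concatenation).
all_sums = comm.gather(local_sum, root=0)

if rank == 0:
    print(config, total, all_sums)
```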

    Parallel symbolic state-space exploration is difficult, but what is the alternative?

    State-space exploration is an essential step in many modeling and analysis problems. Its goal is to find the states reachable from the initial state of a discrete-state model. The state space can be used to answer important questions, e.g., "Is there a dead state?" and "Can N become negative?", or as a starting point for sophisticated investigations expressed in temporal logic. Unfortunately, the state space is often so large that ordinary explicit data structures and sequential algorithms cannot cope, prompting the exploration of (1) parallel approaches using multiple processors, from simple workstation networks to shared-memory supercomputers, to satisfy large memory and runtime requirements, and (2) symbolic approaches using decision diagrams to encode the large structured sets and relations manipulated during state-space generation. Both approaches have merits and limitations. Parallel explicit state-space generation is challenging, but almost linear speedup can be achieved; however, the analysis is ultimately limited by the memory and processors available. Symbolic methods are a heuristic that can efficiently encode many, but not all, functions over a structured and exponentially large domain; here the pitfalls are subtler: performance varies widely depending on the class of decision diagram chosen, the state variable order, and obscure algorithmic parameters. Since symbolic approaches are often much more efficient than explicit ones for many practical models, we argue for the need to parallelize symbolic state-space generation algorithms, so that we can realize the advantages of both approaches. This is a challenging endeavor, as the most efficient symbolic algorithm, Saturation, is inherently sequential. We conclude by discussing challenges, efforts, and promising directions toward this goal.
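
    To make the explicit side of the comparison concrete, the sketch below performs plain breadth-first explicit state-space exploration over an assumed toy model (a bounded counter, echoing the "Can N become negative?" question); real models exhaust explicit data structures long before symbolic encodings do.

```python
# Explicit-state breadth-first exploration: the straightforward approach the
# abstract contrasts with symbolic (decision-diagram) methods. The model
# below is an assumed toy; it stores one set entry per reachable state.
from collections import deque


def successors(n):
    """Toy discrete-state model: a counter bounded between 0 and 100."""
    nxt = []
    if n < 100:
        nxt.append(n + 1)
    if n > 0:
        nxt.append(n - 1)
    return nxt


def explore(initial):
    seen = {initial}
    frontier = deque([initial])
    while frontier:                  # classic BFS over the state graph
        state = frontier.popleft()
        for nxt in successors(state):
            if nxt not in seen:      # explicit set: one entry per state
                seen.add(nxt)
                frontier.append(nxt)
    return seen


states = explore(0)
print(len(states))                   # 101 reachable states
print(any(s < 0 for s in states))    # "Can N become negative?" -> False here
```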
