595 research outputs found
Evaluating linear recursive filters on clusters of workstations
The aim of this paper is to show that the recently developed high performance algorithm for solving linear recurrence systems with constant coefficients together with the new BLAS-based algorithm for narrow-banded triangular Toeplitz matrix-vector multiplication allow to evaluate linear recursive filters efficiently, even on clusters of workstations. The results of experiments performed on a cluster of twelve Linux workstations are also presented. The performance of the algorithm is comparable with the performance of two processors of Cray SV-1 for such kind of recursive problems
Vision Science and Technology at NASA: Results of a Workshop
A broad review is given of vision science and technology within NASA. The subject is defined and its applications in both NASA and the nation at large are noted. A survey of current NASA efforts is given, noting strengths and weaknesses of the NASA program
Automated Genome-Wide Protein Domain Exploration
Exploiting the exponentially growing genomics and proteomics data requires high quality, automated analysis. Protein domain modeling is a key area of molecular biology as it unravels the mysteries of evolution, protein structures, and protein functions. A plethora of sequences exist in protein databases with incomplete domain knowledge. Hence this research explores automated bioinformatics tools for faster protein domain analysis. Automated tool chains described in this dissertation generate new protein domain models thus enabling more effective genome-wide protein domain analysis. To validate the new tool chains, the Shewanella oneidensis and Escherichia coli genomes were processed, resulting in a new peptide domain database, detection of poor domain models, and identification of likely new domains. The automated tool chains will require months or years to model a small genome when executing on a single workstation. Therefore the dissertation investigates approaches with grid computing and parallel processing to significantly accelerate these bioinformatics tool chains
Optimal binning of X-ray spectra and response matrix design
A theoretical framework is developed to estimate the optimal binning of X-ray
spectra. We derived expressions for the optimal bin size for model spectra as
well as for observed data using different levels of sophistication. It is shown
that by taking into account both the number of photons in a given spectral
model bin and their average energy over the bin size, the number of model
energy bins and the size of the response matrix can be reduced by a factor of
. The response matrix should then contain the response at the bin
centre as well as its derivative with respect to the incoming photon energy. We
provide practical guidelines for how to construct optimal energy grids as well
as how to structure the response matrix. A few examples are presented to
illustrate the present methods.Comment: 16 pages, 7 figures, accepted for publication in Astronomy and
Astrophysic
A Survey on Enterprise Network Security: Asset Behavioral Monitoring and Distributed Attack Detection
Enterprise networks that host valuable assets and services are popular and
frequent targets of distributed network attacks. In order to cope with the
ever-increasing threats, industrial and research communities develop systems
and methods to monitor the behaviors of their assets and protect them from
critical attacks. In this paper, we systematically survey related research
articles and industrial systems to highlight the current status of this arms
race in enterprise network security. First, we discuss the taxonomy of
distributed network attacks on enterprise assets, including distributed
denial-of-service (DDoS) and reconnaissance attacks. Second, we review existing
methods in monitoring and classifying network behavior of enterprise hosts to
verify their benign activities and isolate potential anomalies. Third,
state-of-the-art detection methods for distributed network attacks sourced from
external attackers are elaborated, highlighting their merits and bottlenecks.
Fourth, as programmable networks and machine learning (ML) techniques are
increasingly becoming adopted by the community, their current applications in
network security are discussed. Finally, we highlight several research gaps on
enterprise network security to inspire future research.Comment: Journal paper submitted to Elseive
Implementation and Evaluation of Algorithmic Skeletons: Parallelisation of Computer Algebra Algorithms
This thesis presents design and implementation approaches for the parallel algorithms of computer algebra. We use algorithmic skeletons and also further approaches, like data parallel arithmetic and actors. We have implemented skeletons for divide and conquer algorithms and some special parallel loops, that we call ‘repeated computation with a possibility of premature termination’. We introduce in this thesis a rational data parallel arithmetic. We focus on parallel symbolic computation algorithms, for these algorithms our arithmetic provides a generic parallelisation approach.
The implementation is carried out in Eden, a parallel functional programming language based on Haskell. This choice enables us to encode both the skeletons and the programs in the same language. Moreover, it allows us to refrain from using two different languages—one for the implementation and one for the interface—for our implementation of computer algebra algorithms.
Further, this thesis presents methods for evaluation and estimation of parallel execution times. We partition the parallel execution time into two components. One of them accounts for the quality of the parallelisation, we call it the ‘parallel penalty’. The other is the sequential execution time. For the estimation, we predict both components separately, using statistical methods. This enables very confident estimations, although using drastically less measurement points than other methods. We have applied both our evaluation and estimation approaches to the parallel programs presented in this thesis. We haven also used existing estimation methods.
We developed divide and conquer skeletons for the implementation of fast parallel multiplication. We have implemented the Karatsuba algorithm, Strassen’s matrix multiplication algorithm and the fast Fourier transform. The latter was used to implement polynomial convolution that leads to a further fast multiplication algorithm. Specially for our implementation of Strassen algorithm we have designed and implemented a divide and conquer skeleton basing on actors. We have implemented the parallel fast Fourier transform, and not only did we use new divide and conquer skeletons, but also developed a map-and-transpose skeleton. It enables good parallelisation of the Fourier transform. The parallelisation of Karatsuba multiplication shows a very good performance. We have analysed the parallel penalty of our programs and compared it to the serial fraction—an approach, known from literature. We also performed execution time estimations of our divide and conquer programs.
This thesis presents a parallel map+reduce skeleton scheme. It allows us to combine the usual parallel map skeletons, like parMap, farm, workpool, with a premature termination property. We use this to implement the so-called ‘parallel repeated computation’, a special form of a speculative parallel loop. We have implemented two probabilistic primality tests: the Rabin–Miller test and the Jacobi sum test. We parallelised both with our approach. We analysed the task distribution and stated the fitting configurations of the Jacobi sum test. We have shown formally that the Jacobi sum test can be implemented in parallel. Subsequently, we parallelised it, analysed the load balancing issues, and produced an optimisation. The latter enabled a good implementation, as verified using the parallel penalty. We have also estimated the performance of the tests for further input sizes and numbers of processing elements. Parallelisation of the Jacobi sum test and our generic parallelisation scheme for the repeated computation is our original contribution.
The data parallel arithmetic was defined not only for integers, which is already known, but also for rationals. We handled the common factors of the numerator or denominator of the fraction with the modulus in a novel manner. This is required to obtain a true multiple-residue arithmetic, a novel result of our research. Using these mathematical advances, we have parallelised the determinant computation using the Gauß elimination. As always, we have performed task distribution analysis and estimation of the parallel execution time of our implementation. A similar computation in Maple emphasised the potential of our approach. Data parallel arithmetic enables parallelisation of entire classes of computer algebra algorithms.
Summarising, this thesis presents and thoroughly evaluates new and existing design decisions for high-level parallelisations of computer algebra algorithms
Adaptive remote visualization system with optimized network performance for large scale scientific data
This dissertation discusses algorithmic and implementation aspects of an automatically configurable remote visualization system, which optimally decomposes and adaptively maps the visualization pipeline to a wide-area network. The first node typically serves as a data server that generates or stores raw data sets and a remote client resides on the last node equipped with a display device ranging from a personal desktop to a powerwall. Intermediate nodes can be located anywhere on the network and often include workstations, clusters, or custom rendering engines. We employ a regression model-based network daemon to estimate the effective bandwidth and minimal delay of a transport path using active traffic measurement. Data processing time is predicted for various visualization algorithms using block partition and statistical technique. Based on the link measurements, node characteristics, and module properties, we strategically organize visualization pipeline modules such as filtering, geometry generation, rendering, and display into groups, and dynamically assign them to appropriate network nodes to achieve minimal total delay for post-processing or maximal frame rate for streaming applications. We propose polynomial-time algorithms using the dynamic programming method to compute the optimal solutions for the problems of pipeline decomposition and network mapping under different constraints. A parallel based remote visualization system, which comprises a logical group of autonomous nodes that cooperate to enable sharing, selection, and aggregation of various types of resources distributed over a network, is implemented and deployed at geographically distributed nodes for experimental testing. Our system is capable of handling a complete spectrum of remote visualization tasks expertly including post processing, computational steering and wireless sensor network monitoring. Visualization functionalities such as isosurface, ray casting, streamline, linear integral convolution (LIC) are supported in our system. The proposed decomposition and mapping scheme is generic and can be applied to other network-oriented computation applications whose computing components form a linear arrangement
- …