
    Computing Large-scale Distance Matrices on GPU

    Abstract: A distance matrix is an n×n two-dimensional array containing the pairwise distances of a set of n points in a metric space. It is used widely across scientific research, e.g., in data clustering, machine learning, pattern recognition, image analysis, information retrieval, signal processing, and bioinformatics. However, as n grows, computing the distance matrix becomes very slow or infeasible on traditional general-purpose computers. In this paper, we propose an inexpensive and scalable data-parallel solution to this problem that divides the computational tasks and data across GPUs. We demonstrate the performance of our method on a set of real-world biological networks constructed from a renowned breast cancer study.
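    As a rough illustration of the idea (not the paper's implementation), the sketch below computes a Euclidean distance matrix in independent tiles, the same decomposition that lets each tile be dispatched to a separate GPU or worker; the function names and tile size are hypothetical.

```python
# Illustrative sketch (not the paper's implementation): computing an n x n
# Euclidean distance matrix in independent tiles, so each tile could be
# dispatched to a separate GPU or worker. All names here are hypothetical.
import numpy as np

def distance_tile(points, row_slice, col_slice):
    """Compute one rectangular tile of the pairwise distance matrix."""
    a = points[row_slice]          # (r, d) block of rows
    b = points[col_slice]          # (c, d) block of columns
    # ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b, evaluated blockwise
    sq = (a ** 2).sum(1)[:, None] + (b ** 2).sum(1)[None, :] - 2.0 * a @ b.T
    return np.sqrt(np.maximum(sq, 0.0))

def distance_matrix(points, tile=1024):
    n = len(points)
    out = np.empty((n, n), dtype=points.dtype)
    for i in range(0, n, tile):
        for j in range(0, n, tile):
            rs, cs = slice(i, min(i + tile, n)), slice(j, min(j + tile, n))
            out[rs, cs] = distance_tile(points, rs, cs)   # independent work unit
    return out

if __name__ == "__main__":
    pts = np.random.rand(2000, 16).astype(np.float32)
    D = distance_matrix(pts)
    print(D.shape, D[0, 0])  # (2000, 2000), 0.0 on the diagonal
```

    Because every tile depends only on the input points, the double loop can be replaced by a pool of GPU kernels or distributed workers without changing the result.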

    SkelCL: enhancing OpenCL for high-level programming of multi-GPU systems

    Application development for modern high-performance systems with Graphics Processing Units (GPUs) currently relies on low-level programming approaches like CUDA and OpenCL, which leads to complex, lengthy and error-prone programs. In this paper, we present SkelCL – a high-level programming approach for systems with multiple GPUs and its implementation as a library on top of OpenCL. SkelCL provides three main enhancements to the OpenCL standard: 1) computations are conveniently expressed using parallel algorithmic patterns (skeletons); 2) memory management is simplified using parallel container data types (vectors and matrices); 3) an automatic data (re)distribution mechanism allows for implicit data movements between GPUs and ensures scalability when using multiple GPUs. We demonstrate how SkelCL is used to implement parallel applications on one- and two-dimensional data. We report experimental results to evaluate our approach in terms of programming effort and performance.
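    SkelCL itself is a C++ library on top of OpenCL; the sketch below is only a language-neutral illustration, in Python, of the skeleton idea the abstract describes: computations are expressed as patterns (map, zip, reduce) over container types rather than as hand-written kernels. The class and function names are invented for this illustration and are not SkelCL's API.

```python
# Conceptual sketch of the algorithmic-skeleton idea behind SkelCL (a C++/OpenCL
# library); this Python analogue only illustrates how computations are expressed
# as patterns over container types, not SkelCL's actual API.
import numpy as np

class Vector:
    """Hypothetical parallel container; on a real system its data would live
    on one or more GPUs and be (re)distributed automatically."""
    def __init__(self, data):
        self.data = np.asarray(data)

def skel_map(f, vec):              # Map skeleton: apply f to every element
    return Vector(np.vectorize(f)(vec.data))

def skel_zip(f, a, b):             # Zip skeleton: combine two vectors elementwise
    return Vector(np.vectorize(f)(a.data, b.data))

def skel_reduce(f, vec, init):     # Reduce skeleton: fold all elements into one value
    acc = init
    for x in vec.data:
        acc = f(acc, x)
    return acc

# Dot product expressed purely through skeletons:
a, b = Vector(np.arange(5.0)), Vector(np.ones(5))
dot = skel_reduce(lambda x, y: x + y, skel_zip(lambda x, y: x * y, a, b), 0.0)
print(dot)  # 10.0
```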

    On the feasibility of automatically selecting similar patients in highly individualized radiotherapy dose reconstruction for historic data of pediatric cancer survivors

    Purpose: The aim of this study is to establish the first step toward a novel and highly individualized three-dimensional (3D) dose distribution reconstruction method, based on CT scans and organ delineations of recently treated patients. Specifically, the feasibility of automatically selecting the CT scan of a recently treated childhood cancer patient who is similar to a given historically treated child who suffered from Wilms' tumor is assessed.
    Methods: A cohort of 37 recently treated children aged between 2 and 6 yr is considered. Five potential notions of ground-truth similarity are proposed, each focusing on different anatomical aspects. These notions are automatically computed from CT scans of the abdomen and 3D organ delineations (liver, spleen, spinal cord, external body contour). The first is based on deformable image registration, the second on the Dice similarity coefficient, the third on the Hausdorff distance, the fourth on pairwise organ distances, and the last is computed by means of the overlap volume histogram. The relationship between typically available features of historically treated patients and the proposed ground-truth notions of similarity is studied by adopting state-of-the-art machine learning techniques, including random forests. The feasibility of automatically selecting the most similar patient is also assessed by comparing ground-truth rankings of similarity with predicted rankings.
    Results: Similarities based (mainly) on the external abdomen shape and on the pairwise organ distances are highly correlated (Pearson rp ≥ 0.70) and are successfully modeled with random forests based on historically recorded features (pseudo-R2 ≥ 0.69). In contrast, similarities based on the shape of internal organs cannot be modeled. For the similarities that random forests can reliably model, an estimation of feature relevance indicates that abdominal diameters and weight are the most important. Experiments on automatically selecting similar patients lead to coarse yet quite robust results: the most similar patient is retrieved only 22% of the time; however, the error in worst-case scenarios is limited, with at worst the fourth most similar patient being retrieved.
    Conclusions: The results demonstrate that automatically selecting similar patients is feasible when focusing on the shape of the external abdomen and on the position of internal organs. Moreover, whereas the common practice in phantom-based dose reconstruction is to select a representative phantom using age, height, and weight as discriminant factors for any treatment scenario, our analysis of abdominal tumor treatment in children shows that the most relevant features are weight and the anterior-posterior and left-right abdominal diameters.
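    To make two of the proposed ground-truth notions concrete, the sketch below computes the Dice similarity coefficient and a simple organ-distance feature on binary 3D organ masks. It is an illustrative sketch under assumed inputs (toy spherical masks and unit voxels), not the study's registration or ranking pipeline.

```python
# Illustrative sketch of two similarity notions mentioned above, computed on
# binary 3D organ masks; not the study's actual pipeline.
import numpy as np

def dice_coefficient(mask_a, mask_b):
    """Dice similarity coefficient: 2|A ∩ B| / (|A| + |B|), in [0, 1]."""
    inter = np.logical_and(mask_a, mask_b).sum()
    denom = mask_a.sum() + mask_b.sum()
    return 2.0 * inter / denom if denom > 0 else 1.0

def centroid_distance(mask_a, mask_b, voxel_size=(1.0, 1.0, 1.0)):
    """Distance between organ centroids (a simple stand-in for pairwise
    organ-distance features), in the same units as voxel_size."""
    ca = np.array(np.nonzero(mask_a)).mean(axis=1) * voxel_size
    cb = np.array(np.nonzero(mask_b)).mean(axis=1) * voxel_size
    return float(np.linalg.norm(ca - cb))

# Toy example: two overlapping spheres on a small grid stand in for organs
grid = np.indices((40, 40, 40)).transpose(1, 2, 3, 0)
organ_a = np.linalg.norm(grid - [15, 20, 20], axis=-1) < 8
organ_b = np.linalg.norm(grid - [25, 20, 20], axis=-1) < 8
print(round(dice_coefficient(organ_a, organ_b), 3),
      round(centroid_distance(organ_a, organ_b), 1))
```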

    Exploiting multiple levels of parallelism of Convergent Cross Mapping

    Identifying causal relationships between variables remains an essential problem across many scientific fields. Such identification is particularly important, but challenging, in complex systems such as those involving human behaviour, sociotechnical contexts, and natural ecosystems. By exploiting state-space reconstruction via lagged embeddings of time series, convergent cross mapping (CCM) serves as an important method for addressing this problem. While powerful, CCM is computationally costly; moreover, its results are highly sensitive to several parameter values. Current best practice is to perform a systematic search over a range of parameters, but this imposes a high computational burden and raises barriers to practical use. In light of these challenges and the growing size of datasets from complex systems, inferring causality with confidence using CCM in a reasonable time becomes a major challenge. In this thesis, I investigate the performance of a variety of parallel techniques (CUDA, Thrust, OpenMP, MPI, and Spark) for accelerating convergent cross mapping. The performance of each method was collected and compared across multiple experiments to identify potential bottlenecks. Moreover, the work deployed and tested combinations of these techniques to more thoroughly exploit the available computational resources. The results of these experiments indicate that GPUs can accelerate the CCM algorithm only under certain circumstances and requirements; otherwise, the overhead of data transfer and communication becomes the limiting bottleneck. In cluster computing, the MPI/OpenMP framework outperforms the Spark framework by more than one order of magnitude in processing speed and provides more consistent performance for distributed computing, which also reflects the large size of the output from the CCM algorithm. However, Spark offers better cluster infrastructure management, ease of software engineering, and readier handling of concerns such as node failure and data replication. Furthermore, combinations of GPU and cluster frameworks were deployed and compared on GPU/CPU clusters. A clear speedup can be achieved in the Spark framework, while the MPI/OpenMP framework incurs extra time cost; the underlying reason is that the code complexity imposed by GPU utilization cannot be readily offset in the MPI/OpenMP framework. Overall, the experimental results on parallelized solutions demonstrate more than an order of magnitude performance improvement over the widely used rEDM library. Such savings in computation time can speed the learning and robust identification of causal drivers in complex systems. I conclude that these parallel techniques can achieve significant improvements, but the performance gain varies among techniques and frameworks. Although GPUs can accelerate the application, constraints remain to be considered, especially with regard to the scale of the input data; without proper usage, GPUs can even slow down the overall execution. Convergent cross mapping achieves its maximum speedup with the MPI/OpenMP framework, which is well suited to computation-intensive algorithms. By contrast, the Spark framework with integrated GPU accelerators still offers lower execution cost than the pure Spark version, and is mainly suited to data-intensive problems.
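    The following simplified, serial sketch shows the core computation that such parallelization targets: a lagged (time-delay) embedding of one series is used to predict another through nearest neighbours, and the all-pairs neighbour search is the computationally heavy part. It is meant only to locate the parallelizable work; it is not rEDM nor the thesis code, and the parameter choices are illustrative.

```python
# Simplified, serial sketch of the core of convergent cross mapping (CCM).
# The all-pairs distance computation is the part that GPU / MPI / Spark
# implementations parallelize; this is not rEDM or the thesis code.
import numpy as np

def lagged_embedding(x, E, tau):
    """Shadow manifold: rows are [x_t, x_{t-tau}, ..., x_{t-(E-1)tau}]."""
    n = len(x) - (E - 1) * tau
    return np.column_stack(
        [x[(E - 1 - i) * tau : (E - 1 - i) * tau + n] for i in range(E)])

def cross_map(x, y, E=3, tau=1):
    """Predict y from the manifold of x; high skill suggests y causally drives x."""
    Mx = lagged_embedding(x, E, tau)
    y_target = y[(E - 1) * tau:]
    # All-pairs distances on the manifold -- the computationally heavy part
    d = np.linalg.norm(Mx[:, None, :] - Mx[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    preds = np.empty(len(Mx))
    for t in range(len(Mx)):
        nn = np.argsort(d[t])[: E + 1]                 # E+1 nearest neighbours
        w = np.exp(-d[t, nn] / max(d[t, nn[0]], 1e-12))
        preds[t] = np.sum(w * y_target[nn]) / w.sum()  # weighted neighbour average
    return np.corrcoef(preds, y_target)[0, 1]          # cross-map skill (rho)

t = np.linspace(0, 20 * np.pi, 800)
driver = np.sin(t)
response = np.sin(t + 0.5) + 0.05 * np.random.randn(len(t))
print(round(cross_map(response, driver), 3))   # high for strongly coupled series
```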

    A New Incremental Decision Tree Learning for Cyber Security based on ILDA and Mahalanobis Distance

    Cyber-attack detection is currently essential for computer network protection. The fundamentals of protection are to detect cyber-attacks effectively, to be able to combat them in various ways, and to learn continuously from data such as internet traffic. With these functions, each cyber-attack can be memorized and defended against effectively at any time. This research presents a procedure for a cyber-attack detection system, Incremental Decision Tree Learning (IDTL), that uses Incremental Linear Discriminant Analysis (ILDA) together with the Mahalanobis distance to classify the hierarchical tree, reducing data features and thereby enhancing the classification of a variety of malicious data. The proposed model can learn a new incoming datum without involving the previously learned data and discard the datum after it has been learned. The results of the experiments reveal that the proposed method improves classification accuracy compared with other methods, showing the highest accuracy among the methods compared. Comparing effectiveness per class, the proposed method classifies both intrusion datasets and other datasets efficiently.
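    As a minimal illustration of one ingredient of the approach described above, the sketch below classifies samples by their Mahalanobis distance to per-class centroids. It is not the ILDA-based incremental tree itself; the class name, the synthetic data, and the regularisation constant are assumptions made for this example.

```python
# Minimal sketch of Mahalanobis-distance classification, one ingredient of the
# approach described above (not the proposed ILDA-based incremental tree).
import numpy as np

class MahalanobisClassifier:
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.means_ = {c: X[y == c].mean(axis=0) for c in self.classes_}
        # Shared covariance, regularised so the inverse always exists
        cov = np.cov(X, rowvar=False) + 1e-6 * np.eye(X.shape[1])
        self.cov_inv_ = np.linalg.inv(cov)
        return self

    def mahalanobis(self, x, c):
        d = x - self.means_[c]
        return float(np.sqrt(d @ self.cov_inv_ @ d))

    def predict(self, X):
        return np.array([min(self.classes_, key=lambda c: self.mahalanobis(x, c))
                         for x in X])

# Toy usage on two synthetic "traffic" classes
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (100, 4)), rng.normal(3, 1, (100, 4))])
y = np.array([0] * 100 + [1] * 100)
clf = MahalanobisClassifier().fit(X, y)
print((clf.predict(X) == y).mean())   # training accuracy, close to 1.0
```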

    Imputation Aided Methylation Analysis

    Genome-wide DNA methylation analysis is of broad interest to medical research because of its central role in human development and disease. However, generating high-quality methylomes on a large scale is particularly expensive due to technical issues inherent to DNA treatment with bisulfite, which requires deeper than usual sequencing. In silico methodologies, such as imputation, can be used to address this limitation and improve the coverage and quality of the data produced in these experiments. Imputation is a statistical technique in which missing values are substituted with computed values; the process leverages information from reference data to calculate probable values for missing data points. In this thesis, imputation is explored for its potential to increase the value of methylation datasets sequenced at different depths:
    1. First, a new R package, Methylation Analysis ToolkiT (MATT), was developed to deal with large numbers of WGBS datasets in a computationally and memory-efficient manner.
    2. Second, the performance of DNA methylation-specific and generic imputation tools was assessed by down-sampling high-quality (100x) WGBS datasets to determine the extent to which missing data can be recovered and the accuracy of imputed values.
    3. Third, to overcome shortfalls in existing tools, a novel imputation tool was developed, termed Global IMputation of cpg MEthylation (GIMMEcpg). GIMMEcpg's default implementation is based on model stacking and outperforms existing tools in accuracy and speed.
    4. Lastly, to demonstrate its potential, GIMMEcpg was used to impute ten shallow (17x) WGBS datasets from healthy volunteers of the Personal Genome Project UK with high accuracy.
    Moreover, the extent of missing and low-quality data, as well as the reproducibility and accuracy of methylation datasets, was explored for different data types (microarrays, Reduced Representation Bisulfite Sequencing (RRBS), Whole Genome Bisulfite Sequencing (WGBS), EM-Seq, and Nanopore sequencing).
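    To make the imputation idea concrete, the toy sketch below fills missing CpG beta values from covered neighbours with a simple distance-weighted average. It is purely illustrative and is not GIMMEcpg, MATT, or any other tool named above; the positions, coverage pattern, and weighting scheme are all assumed.

```python
# Toy sketch of methylation imputation: missing CpG beta values are predicted
# as a distance-weighted average of the covered CpGs. Purely illustrative; this
# is not GIMMEcpg, MATT, or any tool named in the abstract.
import numpy as np

def impute_from_neighbours(positions, betas):
    """betas contains NaN at uncovered CpGs; predict each NaN as a
    distance-weighted average over the covered CpGs."""
    covered = ~np.isnan(betas)
    known_pos, known_beta = positions[covered], betas[covered]
    out = betas.copy()
    for i in np.where(~covered)[0]:
        dist = np.abs(known_pos - positions[i]).astype(float)
        w = 1.0 / (dist + 1.0)                 # closer CpGs get more weight
        out[i] = np.sum(w * known_beta) / w.sum()
    return out

pos = np.arange(0, 1000, 50)                   # CpG coordinates
beta = np.clip(np.sin(pos / 200.0) * 0.5 + 0.5, 0, 1)
beta_missing = beta.copy()
beta_missing[[3, 7, 12]] = np.nan              # simulate dropped coverage
imputed = impute_from_neighbours(pos, beta_missing)
print(np.round(np.abs(imputed[[3, 7, 12]] - beta[[3, 7, 12]]), 3))  # imputation error
```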

    Towards High-Level Programming for Systems with Many Cores

    The final publication is available at Springer.

    A spatiotemporal complexity architecture of human brain activity


    Diffeomorphic Transformations for Time Series Analysis: An Efficient Approach to Nonlinear Warping

    The proliferation and ubiquity of temporal data across many disciplines has sparked interest in similarity, classification and clustering methods specifically designed to handle time series data. A core issue when dealing with time series is determining their pairwise similarity, i.e., the degree to which a given time series resembles another. Traditional distance measures such as the Euclidean are not well-suited due to the time-dependent nature of the data. Elastic metrics such as dynamic time warping (DTW) offer a promising approach, but are limited by their computational complexity, non-differentiability and sensitivity to noise and outliers. This thesis proposes novel elastic alignment methods that use parametric and diffeomorphic warping transformations as a means of overcoming the shortcomings of DTW-based metrics. The proposed method is differentiable and invertible, well-suited for deep learning architectures, robust to noise and outliers, computationally efficient, and expressive and flexible enough to capture complex patterns. Furthermore, a closed-form solution was developed for the gradient of these diffeomorphic transformations, which allows an efficient search in the parameter space, leading to better solutions at convergence. Leveraging the benefits of these closed-form diffeomorphic transformations, this thesis proposes a suite of advancements that include: (a) an enhanced temporal transformer network for time series alignment and averaging, (b) a deep-learning based time series classification model to simultaneously align and classify signals with high accuracy, (c) an incremental time series clustering algorithm that is warping-invariant, scalable, and can operate under limited computational and time resources, and finally, (d) a normalizing flow model that enhances the flexibility of affine transformations in coupling and autoregressive layers. Comment: PhD thesis, defended at the University of Navarra on July 17, 2023; 277 pages, 8 chapters, 1 appendix.
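    To illustrate the contrast the abstract draws with DTW, the sketch below applies a parametric, differentiable, and exactly invertible monotone warp to the time axis of a series. The one-parameter warp family used here is an assumption chosen for brevity; the thesis itself builds on richer families of diffeomorphic transformations with closed-form gradients.

```python
# Minimal sketch of a parametric, differentiable, invertible warp of the time
# axis, to contrast with DTW's discrete alignment. The one-parameter warp
# family is an illustrative assumption, not the thesis' transformation family.
import numpy as np

def warp(t, a):
    """Monotone warp of [0, 1] onto [0, 1]; a -> 0 recovers the identity."""
    if abs(a) < 1e-8:
        return t
    return (np.exp(a * t) - 1.0) / (np.exp(a) - 1.0)

def warp_inverse(s, a):
    """Closed-form inverse of the warp above."""
    if abs(a) < 1e-8:
        return s
    return np.log(1.0 + s * (np.exp(a) - 1.0)) / a

def apply_warp(x, a):
    """Resample x at the warped time points, i.e. y(t) = x(phi(t))."""
    t = np.linspace(0.0, 1.0, len(x))
    return np.interp(warp(t, a), t, x)

t = np.linspace(0.0, 1.0, 200)
x = np.sin(2 * np.pi * 3 * t)
x_warped = apply_warp(x, a=2.0)            # a nonlinearly re-timed copy of x
roundtrip = np.max(np.abs(warp_inverse(warp(t, 2.0), 2.0) - t))
print(round(float(roundtrip), 10))         # ~0: invertible up to floating point
```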