7 research outputs found

    Editorial for IEEE Access special section on theoretical foundations for big data applications : challenges and opportunities

    Big data is one of the hottest research topics in the science and technology communities, and it has great application potential in every sector of society, such as climate, the economy, health, and social science. Big data usually includes data sets whose sizes are beyond the ability of commonly used software tools to capture, curate, and manage. We conclude that big data is still in its infancy, and we will face many unprecedented problems and challenges as this chapter of human history unfolds.

    Straggler Robust Distributed Matrix Inverse Approximation

    Inverting large full-rank matrices is a cumbersome operation that appears in many processes and applications in numerical analysis, linear algebra, optimization, machine learning, and engineering algorithms. It raises issues of numerical stability and complexity, and it has a high expected computation time. We address the latter issue by proposing an algorithm that uses a black-box least squares optimization solver as a subroutine to estimate the inverse (and pseudoinverse) of a real nonsingular matrix column by column. This also allows the algorithm to be performed in a distributed manner, so the estimate can be obtained much faster and can be made robust to stragglers. Furthermore, we assume a centralized network with no message passing between the computing nodes, and we do not require a matrix factorization (e.g., LU, SVD, or QR decomposition) beforehand. Comment: 4 pages, 1 figure, conferenc
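
    The column-wise idea in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the solver below (steepest descent with exact line search) is only a stand-in for the black-box least squares solver, and the names `least_squares_column` and `approx_inverse` are hypothetical. Each column of the inverse is the solution of an independent problem min ||A x - e_i||^2, which is why the columns could be assigned to separate workers; distribution and straggler handling are not shown.

```python
def matvec(A, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def transpose(A):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*A)]


def least_squares_column(A, b, iters=500):
    """Stand-in 'black-box' solver (an assumption, not the paper's choice):
    minimize ||A x - b||^2 by steepest descent with exact line search."""
    At = transpose(A)
    x = [0.0] * len(A[0])
    for _ in range(iters):
        r = [ax - bi for ax, bi in zip(matvec(A, x), b)]  # residual A x - b
        g = matvec(At, r)                                  # gradient A^T r
        Ag = matvec(A, g)
        denom = sum(v * v for v in Ag)
        if denom == 0.0:                                   # gradient vanished
            break
        step = sum(v * v for v in g) / denom               # exact line search
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


def approx_inverse(A, iters=500):
    """Estimate A^{-1} one column at a time by solving A x_i = e_i."""
    n = len(A)
    cols = []
    for i in range(n):
        e_i = [1.0 if j == i else 0.0 for j in range(n)]   # i-th unit vector
        cols.append(least_squares_column(A, e_i, iters))   # independent task
    return transpose(cols)                                 # columns -> rows
```

    Calling `approx_inverse([[4.0, 1.0], [2.0, 3.0]])` returns an estimate of the 2x2 inverse; accuracy depends on the iteration count and on how well conditioned the matrix is.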

    Kernel solver design of FPGA-based real-time simulator for active distribution networks

    The field-programmable gate array (FPGA)-based real-time simulator benefits from the merits of FPGAs, such as a small time-step, high simulation precision, rich I/O interface resources, and low cost. The sparse linear equations formed by the node conductance matrix need to be solved repeatedly within each time-step, which poses great challenges to the performance of the real-time simulator. In this paper, a fine-grained solver of the FPGA-based real-time simulator for active distribution networks is designed to meet the computational demand. The framework of the solver, the offline process design on PC, and the online process design on FPGA are presented in detail. The modified IEEE 33-node system with photovoltaics is simulated on a 4-FPGA-based real-time simulator. Simulation results are compared with PSCAD/EMTDC under the same conditions to validate the solver design.

    An Improved Smith-Waterman Algorithm Based on Spark Parallelization

    This paper proposes the design and implementation of a Spark parallelization plan for improving the Smith-Waterman (SW) algorithm, named the Spark-OSW algorithm. The Spark-OSW algorithm was then verified through accuracy, performance, and acceleration tests. The results show that the proposed algorithm achieved 100% accuracy, ran much faster than SW, and performed well in a cluster environment. The research findings shed important new light on database search for gene sequences.
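
    For orientation, the baseline that this abstract sets out to improve can be sketched as follows. This is the textbook sequential Smith-Waterman local alignment score with a linear gap penalty, not the Spark-OSW algorithm; the function name and the scoring values (`match=2`, `mismatch=-1`, `gap=-1`) are illustrative assumptions.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score between sequences a and b.

    Textbook dynamic program with a linear gap penalty; the scoring
    parameters are illustrative defaults, not values from the paper.
    """
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]   # H[i][j]: best score ending at a[i-1], b[j-1]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = H[i - 1][j] + gap          # gap in b
            left = H[i][j - 1] + gap        # gap in a
            H[i][j] = max(0, diag, up, left)  # 0 restarts the local alignment
            best = max(best, H[i][j])
    return best
```

    Each query-versus-database-sequence comparison is independent of the others, which is the property a cluster framework can exploit when searching a sequence database.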

    Spark-Based Large-Scale Matrix Inversion for Big Data Processing

    No full text

    ParallelLCA : a foreground aware parallel calculator for life cycle assessment

    Life Cycle Assessment (LCA), which aims to assess the environmental impacts during the life cycle of a system product (S) (e.g., production of aluminum in Quebec), can be used to compare different systems built with different types of materials to determine which is the least harmful to the environment. The calculation in LCA represents a computational challenge, as it depends on the size of the system, the number of iterations in the Monte-Carlo simulation, and the number of uncertain variables in the system. First, the base case requires solving a linear system on the order of 10,000 equations in 10,000 unknown variables. Second, a graph that is iterative in nature must be built, with a minimum of 10,000 vertices. Third, a Monte-Carlo simulation requiring several thousand iterations to converge must be computed. Finally, a sensitivity analysis requires computing millions of correlations between vectors, each with a dimension proportional to the number of iterations in the Monte-Carlo simulation. To best solve the computational challenges present in LCA, this research benefits from well-established libraries that solve large sparse linear systems and perform large sparse matrix computations. This thesis also adopted mathematical optimizations that removed the very expensive matrix inverse step from the contribution analysis module, as well as other algorithmic optimizations that removed the large and variant part of the LCA supply chain from the matrix component of the various calculation phases. Furthermore, this research experimented with libraries such as OpenMP, MPI, and Apache Spark to parallelize the computation. First, the thesis discusses the literature regarding these computational opportunities. Second, it presents a proposed LCA calculator for implementing an efficient LCA computation. Finally, it presents the performance of computing the different phases of LCA for various dimensions of the system (S) and concludes with suggestions for improvement and future development.