73 research outputs found

    Programming matrix algorithms-by-blocks for thread-level parallelism

    With the emergence of thread-level parallelism as the primary means for continued improvement of performance, the programmability issue has reemerged as an obstacle to the use of architectural advances. We argue that evolving legacy libraries for dense and banded linear algebra is not a viable solution due to constraints imposed by early design decisions. We propose a philosophy of abstraction and separation of concerns that provides a promising solution in this problem domain. The first abstraction, FLASH, allows algorithms to express computation with matrices consisting of blocks, facilitating algorithms-by-blocks. Transparent to the library implementor, operand descriptions are registered for a particular operation a priori. A runtime system, SuperMatrix, uses this information to identify data dependencies between suboperations, allowing them to be scheduled to threads out-of-order and executed in parallel. But not all classical algorithms in linear algebra lend themselves to conversion to algorithms-by-blocks. We show how our recently proposed LU factorization with incremental pivoting and a closely related algorithm-by-blocks for the QR factorization, both originally designed for out-of-core computation, overcome this difficulty. Anecdotal evidence regarding the development of routines with a core functionality demonstrates how the methodology supports high productivity, while experimental results suggest that high performance is abundantly achievable.
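The dependency analysis described in this abstract can be illustrated with a minimal sketch. It is not the FLASH or SuperMatrix API: the `Task` class, the block names, and the wave-based scheduler are assumptions made for illustration. Each suboperation registers the blocks it reads and writes, and a task must wait for any earlier task with which it has a read-after-write, write-after-read, or write-after-write conflict.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """A suboperation on matrix blocks (hypothetical structure)."""
    name: str
    reads: frozenset
    writes: frozenset


def build_dependencies(tasks):
    """Map each task index to the earlier task indices it must wait for."""
    deps = {i: set() for i in range(len(tasks))}
    for i, later in enumerate(tasks):
        for j in range(i):
            earlier = tasks[j]
            raw = earlier.writes & later.reads    # read-after-write
            war = earlier.reads & later.writes    # write-after-read
            waw = earlier.writes & later.writes   # write-after-write
            if raw or war or waw:
                deps[i].add(j)
    return deps


def schedule_waves(tasks):
    """Group tasks into waves; tasks within one wave are independent."""
    deps = build_dependencies(tasks)
    done, waves = set(), []
    while len(done) < len(tasks):
        ready = [i for i in range(len(tasks))
                 if i not in done and deps[i] <= done]
        waves.append([tasks[i].name for i in ready])
        done.update(ready)
    return waves


def blocks(*names):
    return frozenset(names)


# First step of a blocked Cholesky factorization on a 3x3 grid of blocks.
demo_tasks = [
    Task("CHOL(A00)", blocks("A00"), blocks("A00")),
    Task("TRSM(A10)", blocks("A00", "A10"), blocks("A10")),
    Task("TRSM(A20)", blocks("A00", "A20"), blocks("A20")),
    Task("SYRK(A11)", blocks("A10", "A11"), blocks("A11")),
    Task("GEMM(A21)", blocks("A10", "A20", "A21"), blocks("A21")),
    Task("SYRK(A22)", blocks("A20", "A22"), blocks("A22")),
]

if __name__ == "__main__":
    for wave in schedule_waves(demo_tasks):
        print(wave)
```

A real runtime would dispatch ready tasks to threads as soon as their dependencies complete rather than in synchronized waves; the waves here only make the available parallelism visible.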

    Fast block QR update in digital signal processing

    The processing of digital sound signals often requires computing the QR factorization of a rectangular system matrix. Sometimes, however, only a given (and probably small) part of the system matrix varies from the current sample to the next one. We exploit this fact to reuse some of the computations carried out for the previous sample, saving execution time when processing the current one. These savings can be critical for real-time applications running on highly mobile, low-power devices. In addition, we propose a simple out-of-order task-parallel algorithm for the QR factorization using OpenMP that exploits the multicore capability of modern processors. Furthermore, when a Graphics Processing Unit (GPU) is present in the system, our algorithm can off-load some tasks to the GPU to accelerate the computation.
    This work was supported by the Spanish Ministry of Economy and Competitiveness under MINECO and FEDER projects TEC2015-67387-C4-1-R and TIN2014-53495-R, and by the Generalitat Valenciana under PROMETEOII/2014/003.
    Alventosa, FJ.; Alonso-Jordá, P.; Vidal Maciá, AM.; Piñero, G.; Quintana-Ortí, ES. (2019). Fast block QR update in digital signal processing. The Journal of Supercomputing. 75(3):1051-1064. https://doi.org/10.1007/s11227-018-2298-5

    A framework for argument-based task synchronization with automatic detection of dependencies

    Synchronization in parallel applications can be achieved either implicitly or explicitly. Implicit synchronization is typical of programming environments that provide predefined, and often simple, patterns of parallelism such as data-parallel libraries and languages and skeletal operations. Nevertheless, more flexible approaches that allow arbitrary task-level parallel computations to be expressed without a predefined structure require in turn that the user explicitly specify the synchronization needed among the parallel tasks. In this paper we present a library-based approach that enables arbitrary patterns of parallelism with minimal effort for the user. Our proposal is the first generic approach to expressing parallelism we know of that requires neither explicit synchronizations nor a detailed description of the dependencies of the parallel tasks. Our strategy relies on expressing the parallel tasks as functions that convey their dependencies implicitly by means of their arguments. These function arguments are analyzed by our library, called DepSpawn, when a parallel task is spawned in order to enforce its dependencies. Our experiments indicate that DepSpawn is very competitive, both in terms of performance and programmability, with respect to a widespread high-level approach like OpenMP.
    Funding: Xunta de Galicia; INCITE08PXIB105161PR. Ministerio de Ciencia e Innovación; TIN2010-16735. Ministerio de Educación de España; AP2009-475
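The idea of tasks that convey their dependencies through their arguments can be sketched as follows. This is not DepSpawn's interface: the `Spawner` class, the `Out` annotation marker, and the use of object identity to detect conflicts are all assumptions chosen for a short Python illustration. A parameter annotated with `Out` is treated as written (and possibly read); every other argument is treated as read-only.

```python
import inspect
from concurrent.futures import ThreadPoolExecutor, wait


class Out:
    """Hypothetical annotation marker: the task modifies this argument."""


class Spawner:
    """Runs tasks in parallel, ordering those that touch the same objects."""

    def __init__(self, workers=4):
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._history = []  # (future, ids read, ids written) per spawned task

    def spawn(self, func, *args):
        params = list(inspect.signature(func).parameters.values())
        writes = {id(a) for p, a in zip(params, args) if p.annotation is Out}
        reads = {id(a) for p, a in zip(params, args) if p.annotation is not Out}
        # Wait for earlier tasks that wrote something we touch,
        # or that read something we are about to write.
        blockers = [f for f, r, w in self._history
                    if (w & (reads | writes)) or (r & writes)]

        def run():
            wait(blockers)
            return func(*args)

        future = self._pool.submit(run)
        self._history.append((future, reads, writes))
        return future

    def join(self):
        """Wait for all tasks, re-raise any task error, and stop the pool."""
        for future, _, _ in self._history:
            future.result()
        self._pool.shutdown()


def fill(dst: Out, value):
    dst.append(value)


def double(dst: Out):
    dst[:] = [2 * x for x in dst]


def add(dst: Out, a, b):
    dst.append(a[0] + b[0])


def demo():
    x, y, z = [], [], []
    spawner = Spawner()
    spawner.spawn(fill, x, 1)   # independent of the next task
    spawner.spawn(fill, y, 2)
    spawner.spawn(double, x)    # must follow fill(x, 1)
    spawner.spawn(add, z, x, y) # must follow double(x) and fill(y, 2)
    spawner.join()
    return x, y, z


if __name__ == "__main__":
    print(demo())
```

No explicit synchronization appears in `demo`; the ordering follows entirely from which objects each call receives and how its parameters are annotated.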

    An ANP model to support decision-making in a Portuguese pharmaceutical supply chain

    Dissertation submitted for the degree of Master in Industrial Engineering and Management. In order to cope with a volatile and scarce environment, companies have had to adopt new ways of thinking. One of them is embracing Supply Chain Management (SCM) and treating it as a crucial asset for competing in the marketplace. In the context of SCM, it is important to understand how the Lean and Agile SCM paradigms are adopted as means of achieving an efficient Supply Chain (SC). Besides these paradigms, many Key Performance Indicators (KPIs) and management practices come along with SCM, and it is important that SC managers identify the ones that bring the greatest competitive advantage. This dissertation designs a model based on the Analytic Network Process (ANP) to assist SC managers from different entities of a pharmaceutical SC in exploring efficient decisions, with respect to KPIs and management practices, as means of achieving a highly competitive SC.

    Techno-economic model-based design space exploration of ‘combined’ ship propulsion systems

    The architecture of a ship propulsion system, developed during the early stages of the overall ship design process, has a very large impact on the overall design and performance of the ship. The design space exploration to arrive at the final ship propulsion architecture can be a rather complex process for high-performance 'combined' ship propulsion systems designed to achieve multiple, often conflicting, design objectives. This paper proposes a novel process for design space exploration based on a model-based 'Techno-economic & Environmental Risk Assessment' (TERA) approach, executed using a hybrid 'Multiple-Criteria Decision-Making' (MCDM) procedure, to select a compromise solution from competing propulsion system architectures populating the design space. To select this compromise solution, the process combines performance data generated by simulating the developed models with comparative metrics based on expert opinion for information not available early in the ship design process. The paper includes an illustrative example applying the proposed process to the design space exploration of a combined propulsion system architecture for a notional destroyer.

    An analytical decision approach to rural telecommunication infrastructure selection

    Telecommunications infrastructure is recognised as a fundamental factor for economic and social development, for it is the platform of communication and transaction within and beyond geographical boundaries. It is a necessity for social benefits, growth, connection and competition, all the more so in rural communities in developing countries. Its acquisition entails great investment and, given the emergence of various technologies, makes the selection a critical task. The research described in this thesis is concerned with a comprehensive examination of, and analytical procedures for, the selection of technologies for rural telecommunications infrastructure. A structured systematic approach is deemed necessary to reduce the time and effort in the decision-making process. A literature review was carried out to explore the knowledge in the areas of Multi-Criteria Decision-Making (MCDM) approaches, with particular focus on the analytical decision processes. The findings indicate that the Analytic Hierarchy Process (AHP) and the Analytic Network Process (ANP) are powerful decision methods capable of modelling such a complex problem. Initially, an AHP model is formulated; however, since the problem at hand involves many interactions and dependencies, a more holistic method is required to overcome its shortcomings by allowing for dependencies and feedback within the structure. Hence, the ANP is adopted and its network is established to represent the problem, allowing telecommunications experts to provide their judgements on the elements within the structure. The data collected are used to estimate the relative influence from which the overall synthesis is derived, forming a general ANP model for such a rural telecommunications selection problem.
To provide a more wide-ranging investigation into selecting a potential rural telecommunications infrastructure, another systematic analysis that utilises a BOCR-based (Benefits, Opportunities, Costs and Risks) ANP was conducted. The obtained results indicate that Microwave technology is the most preferred alternative within the context of the developing countries. Sensitivity analysis was performed to show the robustness of the obtained results. This framework provides the structure and the flexibility required for such decisions. It enables decision makers to examine the strengths and weaknesses of the problem by comparing several technology options with respect to an appropriate gauge for judgement. Moreover, using the ANP, the criteria for such a technology selection task were clearly identified and the problem was structured systematically. A case study was carried out in Libya involving its main telecommunications infrastructure provider to demonstrate how such rural technology selection decisions can be made within a specific developing country's rural area. Since the results of this case study were in agreement with the focus group's expectations, it can be concluded that applying the ANP to the selection of telecommunications technology is indeed beneficial. In addition, it is believed that telecommunications planners could, by using data pertaining to another rural area, utilise the developed model to propose appropriate solutions. If new criteria and/or alternatives emerge to satisfy changing business needs, they can also be included in the ANP model.
EThOS - Electronic Theses Online Service; Libyan Government; GB; United Kingdom
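As a rough illustration of the pairwise-comparison step that AHP and ANP share, the sketch below derives priority weights from a reciprocal judgement matrix by power iteration and reports Saaty's consistency index, CI = (lambda_max - n) / (n - 1). The function name and the three-criterion matrix are hypothetical and are not taken from the thesis; the ANP additionally assembles such priority vectors into a supermatrix to capture dependencies and feedback, which is not shown here.

```python
def ahp_priorities(matrix, iterations=100):
    """Return (weights, consistency_index) for a pairwise comparison matrix.

    matrix[i][j] states how strongly element i is preferred over element j,
    with matrix[j][i] == 1 / matrix[i][j] and ones on the diagonal.
    """
    n = len(matrix)
    weights = [1.0 / n] * n
    for _ in range(iterations):
        # Power iteration converges to the principal eigenvector.
        product = [sum(matrix[i][j] * weights[j] for j in range(n))
                   for i in range(n)]
        total = sum(product)
        weights = [value / total for value in product]

    # Estimate the principal eigenvalue as the mean of (A w)_i / w_i.
    product = [sum(matrix[i][j] * weights[j] for j in range(n))
               for i in range(n)]
    lambda_max = sum(product[i] / weights[i] for i in range(n)) / n
    consistency_index = (lambda_max - n) / (n - 1)
    return weights, consistency_index


# Hypothetical judgements over three criteria A, B and C:
# A is twice as important as B and six times as important as C.
demo_matrix = [
    [1.0, 2.0, 6.0],
    [1 / 2, 1.0, 3.0],
    [1 / 6, 1 / 3, 1.0],
]

if __name__ == "__main__":
    weights, ci = ahp_priorities(demo_matrix)
    print(weights, ci)
```

The demo judgements are perfectly consistent, so the consistency index is zero up to rounding; real expert judgements usually are not, and a large index signals that the comparisons should be revisited.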

    Balancing stakeholder goals in structural fire design of steel-framed buildings.

    When designing a steel-framed building, there are many design options available in terms of meeting the structural fire resistance objectives. Different stakeholders have different opinions about which approach is the most appropriate. A tool or procedure is needed that allows the integration of these diverse stakeholder desires to achieve the most appropriate option. Hence, this research aims to develop this tool. Firstly, extraction and understanding of stakeholder views, along with the capacity to rank them, are needed. However, the challenge is that there are many stakeholder views, so there is also the need to manage these views without ignoring any of them. To that end, this work identifies tools for managing different, and sometimes divergent, stakeholder views and ranking them for appropriate decision making. Secondly, to achieve consensus on multiple stakeholder views, the Weighted/Geometric Mean Method (W/GMM) is investigated. Decision analysis techniques including Analytic Hierarchy/Network Processes (AHP/ANP) and Technique of Order of Preference and Similarity to Ideal Solution (TOPSIS) are also studied to understand the influences of stakeholder views on competing design options and to rank the options in the decision-making process. Thirdly, to critically assess the ranking of the design options, a parametric study is needed to predict the suitability and cost-benefit of the various available options. This is carried out by probabilistic analysis of typical structural steel members considering varying parameters and limit state criteria. A probabilistic cost evaluation is also included. Hence, a hybrid design decision analysis tool is developed for the integration of the assessment outcomes to enable the identification of the most cost-effective design option. The final part of this work takes a case study of a realistic building and demonstrates how the process can be applied to structural fire design.
This is carried out by integrating and synthesising views from chartered stakeholders and outcomes of the parametric study on representative steel members of the building using the developed hybrid decision analysis tool. The case study follows a risk-based structural fire design decision-making procedure
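For readers unfamiliar with TOPSIS, one of the ranking techniques named in this abstract, the following minimal sketch shows the standard procedure: vector-normalise each criterion, apply weights, locate the ideal and anti-ideal points, and score each option by its relative closeness to the ideal. The function name, the two criteria, and all numbers are hypothetical and do not come from the thesis; the sketch also assumes that no criterion column is entirely zero.

```python
import math


def topsis(matrix, weights, benefit):
    """Return a closeness score in [0, 1] per option (higher is better).

    matrix[i][j] is the value of option i on criterion j; benefit[j] is
    True when larger values of criterion j are preferred, False for costs.
    """
    n_criteria = len(weights)
    # Vector normalisation: divide each column by its Euclidean norm.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix))
             for j in range(n_criteria)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_criteria)]
                for row in matrix]

    columns = list(zip(*weighted))
    ideal = [max(col) if benefit[j] else min(col)
             for j, col in enumerate(columns)]
    anti_ideal = [min(col) if benefit[j] else max(col)
                  for j, col in enumerate(columns)]

    scores = []
    for row in weighted:
        d_plus = math.dist(row, ideal)         # distance to the ideal point
        d_minus = math.dist(row, anti_ideal)   # distance to the anti-ideal
        total = d_plus + d_minus
        scores.append(d_minus / total if total else 0.0)
    return scores


# Hypothetical design options scored on cost (lower is better) and
# fire resistance in minutes (higher is better).
demo_options = [
    [100.0, 60.0],   # option A
    [150.0, 90.0],   # option B
    [200.0, 90.0],   # option C
]
demo_weights = [0.5, 0.5]
demo_benefit = [False, True]

if __name__ == "__main__":
    print(topsis(demo_options, demo_weights, demo_benefit))
```

In this sketch the weights are simply given; in a workflow like the one the abstract describes they would come from the aggregated stakeholder judgements.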