
    Probabilistic structural mechanics research for parallel processing computers

    Aerospace structures and spacecraft are complex assemblages of structural components subjected to a variety of complex, cyclic, and transient loading conditions. Significant modeling uncertainties are present in these structures, in addition to the inherent randomness of material properties and loads. To properly account for these uncertainties in evaluating and assessing the reliability of these components and structures, probabilistic structural mechanics (PSM) procedures must be used. Much research has focused on basic theory development and on approximate analytic solution methods in random vibrations and structural reliability. Practical application of PSM methods has been hampered by their computationally intensive nature: solving PSM problems requires repeated analyses of structures that are often large and that exhibit nonlinear and/or dynamic response behavior. These methods are inherently parallel and ideally suited to implementation on parallel processing computers. New hardware architectures, innovative control software, and new solution methodologies are needed to make the solution of large-scale PSM problems practical.
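
    The parallelism claimed here is of the simplest kind: each Monte Carlo sample is an independent structural evaluation. The sketch below (not from the paper; the limit-state function, distributions, and parameters are invented for illustration) shows the embarrassingly parallel core of such a reliability analysis in C with OpenMP.

        /* Minimal sketch of the embarrassingly parallel core of a PSM
         * reliability analysis: estimate P(failure) = P(g(X) < 0) by
         * Monte Carlo, one independent structural evaluation per
         * sample.  Limit state and distributions are invented. */
        #include <stdio.h>
        #include <stdlib.h>
        #include <omp.h>

        /* Hypothetical limit state: capacity minus demand. */
        static double limit_state(double strength, double load) {
            return strength - load;
        }

        int main(void) {
            const long n_samples = 10000000;
            long n_fail = 0;

            /* Samples are independent, so the loop parallelises with
             * no communication between iterations. */
            #pragma omp parallel reduction(+:n_fail)
            {
                unsigned int seed = 1234u + 17u * (unsigned)omp_get_thread_num();
                #pragma omp for
                for (long i = 0; i < n_samples; i++) {
                    /* Crude normal variates: sum of 12 uniforms
                     * (central limit theorem), enough for a sketch. */
                    double s = 0.0, l = 0.0;
                    for (int k = 0; k < 12; k++) {
                        s += rand_r(&seed) / (double)RAND_MAX;  /* POSIX */
                        l += rand_r(&seed) / (double)RAND_MAX;
                    }
                    double strength = 350.0 + 25.0 * (s - 6.0); /* mean 350, sd 25 */
                    double load     = 250.0 + 40.0 * (l - 6.0); /* mean 250, sd 40 */
                    if (limit_state(strength, load) < 0.0)
                        n_fail++;
                }
            }
            printf("estimated P(failure) = %.3e\n",
                   (double)n_fail / (double)n_samples);
            return 0;
        }

    In a real PSM run the body of the loop would be a full (often nonlinear or dynamic) structural analysis rather than a one-line limit state, which is exactly why the repeated-analysis cost dominates and why the independence of the iterations matters.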

    An FPGA implementation of an investigative many-core processor, Fynbos: in support of a Fortran autoparallelising software pipeline

    In light of the power, memory, ILP, and utilisation walls facing the computing industry, this work examines the hypothetical many-core approach to finding greater compute performance and efficiency. To achieve greater efficiency in an environment in which Moore's law continues but TDP has been capped, a means of deriving performance from dark and dim silicon is needed. The many-core hypothesis is one approach to exploiting these available transistors efficiently. As understood in this work, it involves trading hardware control complexity for hundreds to thousands of parallel simple processing elements, operating at a clock speed low enough to allow the efficiency gains of near-threshold-voltage operation. Performance is therefore dependent on exploiting a degree of fine-grained parallelism currently found only in GPGPUs, but in a manner less restrictive in its range of application domains. While removing the complex control hardware of traditional CPUs provides space for more arithmetic hardware, a basic level of control is still required. For a number of reasons this work chooses to replace that control largely with static scheduling, pushing the burden of control primarily onto the software, and specifically the compiler, rather than onto the programmer or an application-specific means of control simplification. An existing legacy toolchain capable of autoparallelising sequential Fortran code to the degree of parallelism necessary for many-core already exists; this work implements a many-core architecture to match it. By prototyping the design on an FPGA, it is possible to examine the real-world performance of the compiler-architecture system to a greater degree than simulation alone would allow. Comparing theoretical peak performance and measured performance in a case-study application, the system is found to be more efficient than any other reviewed, but also to significantly underperform relative to current competing architectures. This failing is attributed to taking the need for simple hardware too far, and to an inability to implement tactics that mitigate the limits of static scheduling, owing to a lack of compiler support for them.
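
    A toy illustration of the static scheduling the abstract describes: a compiler-side list scheduler that fixes every operation to a processing element and a cycle before execution, so the hardware needs no dynamic issue logic. The dependence graph, latencies, and PE count below are invented; this is a sketch of the general technique, not of Fynbos or its toolchain.

        /* Toy compile-time list scheduler in the spirit of a statically
         * scheduled many-core: each operation is bound to a processing
         * element (PE) and a start cycle before the program runs.
         * Graph and latencies are invented; all ops take one cycle. */
        #include <stdio.h>

        #define N_OPS 6
        #define N_PES 2

        /* op i depends on deps[i][0..ndeps[i]-1]; listed in
         * topological order so producers precede consumers. */
        static const int deps[N_OPS][2] = {{-1}, {-1}, {0, 1}, {0}, {2, 3}, {4}};
        static const int ndeps[N_OPS]   = {0, 0, 2, 1, 2, 1};

        int main(void) {
            int finish[N_OPS];          /* cycle an op's result is ready */
            int pe_free[N_PES] = {0};   /* cycle each PE becomes idle    */
            for (int i = 0; i < N_OPS; i++) {
                int ready = 0;          /* earliest cycle inputs exist   */
                for (int d = 0; d < ndeps[i]; d++)
                    if (finish[deps[i][d]] > ready)
                        ready = finish[deps[i][d]];
                /* Greedy choice: the PE that frees up first. */
                int pe = 0;
                for (int p = 1; p < N_PES; p++)
                    if (pe_free[p] < pe_free[pe]) pe = p;
                int start = ready > pe_free[pe] ? ready : pe_free[pe];
                finish[i] = start + 1;
                pe_free[pe] = finish[i];
                printf("op %d -> PE %d, cycle %d\n", i, pe, start);
            }
            return 0;
        }

    Because every placement decision is made here, at compile time, each PE needs little more than a program counter and an ALU; the cost, as the abstract notes, is that any mitigation of stalls or load imbalance must also come from the compiler.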

    CERN openlab Whitepaper on Future IT Challenges in Scientific Research

    This whitepaper describes the major IT challenges in scientific research at CERN and several other European and international research laboratories and projects. Each challenge is exemplified through a set of concrete use cases drawn from the requirements of large-scale scientific programs. The paper is based on contributions from many researchers and IT experts of the participating laboratories, as well as input from the existing CERN openlab industrial sponsors. The views expressed in this document are those of the individual contributors and do not necessarily reflect the views of their organisations and/or affiliates.

    Memory system support for image processing

    Processor speeds are increasing rapidly, but memory speeds are not keeping pace. Image processing is an important application domain that is particularly affected by this growing performance gap. Image processing algorithms tend to have poor memory locality because they access their data in a non-sequential fashion and reuse that data infrequently. As a result, they often exhibit poor cache and TLB hit rates on conventional memory systems, which limits overall performance. Most current approaches to addressing the memory bottleneck focus on modifying cache organizations or introducing processor-based prefetching. The Impulse memory system takes a different approach: it allows application software to control how, when, and where data are loaded into a conventional processor cache. Impulse does this by letting software configure how the memory controller interprets the physical addresses exported by a processor, introducing an extra level of address translation in the memory controller. Data that is sparse in memory can be accessed densely, which improves both cache and TLB utilization, and Impulse hides memory latency by prefetching data within the memory controller. We describe how Impulse improves the performance of three image processing algorithms: an Impulse memory system yields speedups of 40% to 226% over an otherwise identical machine with a conventional memory system.
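
    The remapping idea can be mimicked in ordinary software, although in Impulse it is the memory controller that performs it transparently. The hypothetical sketch below gathers a strided matrix column (one useful element per cache line) into a dense buffer so that later passes run at unit stride; the sizes and data are arbitrary.

        /* Software mimicry of the remapping idea behind Impulse: a
         * strided column walk touches one useful element per cache
         * line, but gathering the column into a dense buffer first
         * gives every later pass unit stride.  In Impulse itself the
         * memory controller performs the gather transparently through
         * an extra level of address translation; this sketch only
         * imitates the effect in ordinary code. */
        #include <stdio.h>
        #include <stdlib.h>

        #define ROWS 4096
        #define COLS 4096

        int main(void) {
            double *image = malloc((size_t)ROWS * COLS * sizeof *image);
            double *dense = malloc(ROWS * sizeof *dense);
            if (!image || !dense) return 1;
            for (long i = 0; i < (long)ROWS * COLS; i++)
                image[i] = i % 7;

            int col = 123;
            /* Sparse access pattern: stride of COLS doubles. */
            for (int r = 0; r < ROWS; r++)
                dense[r] = image[(long)r * COLS + col];   /* the gather */

            /* Dense pattern: later passes over the column are unit-stride,
             * so cache lines and TLB entries are fully used. */
            double sum = 0.0;
            for (int r = 0; r < ROWS; r++)
                sum += dense[r];
            printf("column %d sum = %f\n", col, sum);

            free(image);
            free(dense);
            return 0;
        }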

    Impulse: Memory System Support for Scientific Applications


    Performance analysis and tuning in multicore environments

    Performance analysis is the task of monitoring the behavior of a program's execution. The main goal is to find the adjustments that can be made to improve performance; obtaining that improvement requires identifying the various causes of overhead. We are now in the multicore era, but there is a gap between the levels of development of the two main divisions of multicore technology, hardware and software. Multicore also means shared-memory systems, and this master's thesis addresses the issues involved in the performance analysis and tuning of applications running specifically on shared-memory systems. We take the analysis one step further by examining application structure and patterns, and we present tools specifically addressed to the performance analysis of multithreaded OpenMP applications. Finally, we present the results of experiments performed with a set of OpenMP scientific applications.
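
    A minimal example of the kind of measurement such a tuning study automates, assuming nothing about the thesis's own tools: time one OpenMP kernel at several thread counts and report speedup over the single-threaded run. The kernel is an arbitrary stand-in for a scientific loop.

        /* Time an OpenMP kernel at 1, 2, 4, and 8 threads and report
         * speedup over the single-threaded run.  Sub-linear scaling
         * points at the overheads a tuning study would then chase. */
        #include <math.h>
        #include <stdio.h>
        #include <stdlib.h>
        #include <omp.h>

        #define N 20000000

        int main(void) {
            double *a = malloc(N * sizeof *a);
            if (!a) return 1;
            double t_serial = 0.0;
            for (int threads = 1; threads <= 8; threads *= 2) {
                omp_set_num_threads(threads);
                double t0 = omp_get_wtime();
                #pragma omp parallel for
                for (int i = 0; i < N; i++)
                    a[i] = sin(i * 0.001) * cos(i * 0.002);  /* stand-in kernel */
                double dt = omp_get_wtime() - t0;
                if (threads == 1) t_serial = dt;
                printf("%d thread(s): %.3f s, speedup %.2fx\n",
                       threads, dt, t_serial / dt);
            }
            free(a);
            return 0;
        }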

    Factors shaping the evolution of electronic documentation systems

    The main goal is to prepare the space station's technical and managerial structure for likely changes in the creation, capture, transfer, and utilization of knowledge. By anticipating advances, the design of Space Station Project (SSP) information systems can be tailored to facilitate a progression of increasingly sophisticated strategies as the space station evolves. Future generations of advanced information systems will use increases in computing power to deliver environmentally meaningful, contextually targeted, interconnected data (knowledge). The concept of a Knowledge Base Management System emerges when the problem is framed as how information systems can perform such a conversion of raw data. Such a system would include traditional management functions for large space databases. Added artificial intelligence features might encompass co-existing knowledge representation schemes; effective control structures for deductive, plausible, and inductive reasoning; means for knowledge acquisition, refinement, and validation; explanation facilities; and dynamic human intervention. The major areas covered include: alternative knowledge representation approaches; advanced user interface capabilities; computer-supported cooperative work; the evolution of information system hardware; standardization, compatibility, and connectivity; and organizational impacts of information-intensive environments.
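
    As a concrete, if toy, illustration of the deductive reasoning component such a Knowledge Base Management System might include (the facts, rules, and representation below are all invented), a forward-chaining loop saturates a fact base against if-then rules:

        /* Toy deductive core for the kind of Knowledge Base Management
         * System the abstract envisions: propositional facts plus
         * if-then rules, saturated by forward chaining.  Facts and
         * rules are invented for illustration. */
        #include <stdio.h>
        #include <string.h>

        #define MAX_FACTS 32

        static const char *facts[MAX_FACTS] = {"sensor_A_failed", "in_eclipse"};
        static int n_facts = 2;

        /* Rule: IF p1 (AND p2, if non-NULL) THEN concl. */
        struct rule { const char *p1, *p2, *concl; };
        static const struct rule rules[] = {
            {"sensor_A_failed", NULL,                   "attitude_data_degraded"},
            {"attitude_data_degraded", "in_eclipse",    "switch_to_gyro_mode"},
        };

        static int known(const char *f) {
            for (int i = 0; i < n_facts; i++)
                if (strcmp(facts[i], f) == 0) return 1;
            return 0;
        }

        int main(void) {
            int changed = 1;
            while (changed) {           /* iterate to a fixed point */
                changed = 0;
                for (size_t r = 0; r < sizeof rules / sizeof *rules; r++) {
                    if (known(rules[r].p1) &&
                        (rules[r].p2 == NULL || known(rules[r].p2)) &&
                        !known(rules[r].concl) && n_facts < MAX_FACTS) {
                        facts[n_facts++] = rules[r].concl;
                        printf("derived: %s\n", rules[r].concl);
                        changed = 1;
                    }
                }
            }
            return 0;
        }

    A production KBMS would of course add the other facilities the abstract lists, such as plausible and inductive reasoning, knowledge acquisition, and explanation, on top of a database engine; this sketch shows only the deductive kernel.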