27 research outputs found

    Towards seismic wave modeling on heterogeneous many-core architectures using task-based runtime system

    Understanding three-dimensional seismic wave propagation in complex media remains one of the main challenges of quantitative seismology. Because of its simplicity and numerical efficiency, the finite-difference method is one of the standard techniques used to solve the elastodynamics equation. This class of modelling also relies heavily on parallel architectures in order to tackle large-scale geometries that include a detailed description of the physics. Over the last decade, significant effort has been devoted to efficient implementations of the finite-difference method on emerging architectures, and these contributions have demonstrated their efficiency, leading to robust industrial applications. The growing prevalence of heterogeneous architectures combining general-purpose multicore platforms with accelerators calls for a redesign of current parallel applications. In this paper, we consider the StarPU task-based runtime system in order to harness the power of heterogeneous CPU+GPU computing nodes. We detail our implementation and compare the performance obtained against the classical CPU-only and GPU-only versions. Preliminary results demonstrate significant speedups in comparison with the best implementation suitable for homogeneous cores.
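    As a rough illustration of the task-based approach described above, the sketch below expresses a single stencil update as a StarPU codelet and submits it as a task. It is a minimal, assumed example: the kernel (a 1-D 3-point stencil standing in for the 3-D elastodynamic update), the function names, and the problem size are illustrative rather than taken from the paper, and a real heterogeneous version would additionally register a .cuda_funcs implementation so the runtime can schedule the same task on GPUs.

```c
/* Minimal StarPU codelet sketch: a 1-D 3-point stencil standing in for one
 * time step of a finite-difference update.  Illustrative only.
 * Build (version-dependent): gcc fd_starpu.c $(pkg-config --cflags --libs starpu-1.3) */
#include <starpu.h>
#include <stdint.h>
#include <stdlib.h>

#define N 1024

/* CPU implementation of the stencil task; a heterogeneous version would also
 * provide a .cuda_funcs entry so StarPU can run the same task on a GPU. */
static void fd_update_cpu(void *buffers[], void *cl_arg)
{
    (void)cl_arg;
    const float *u = (const float *)STARPU_VECTOR_GET_PTR(buffers[0]);
    float *unew    = (float *)STARPU_VECTOR_GET_PTR(buffers[1]);
    unsigned n     = STARPU_VECTOR_GET_NX(buffers[0]);
    for (unsigned i = 1; i + 1 < n; i++)
        unew[i] = 0.25f * (u[i - 1] + 2.0f * u[i] + u[i + 1]);
}

static struct starpu_codelet fd_cl = {
    .cpu_funcs = { fd_update_cpu },
    .nbuffers  = 2,
    .modes     = { STARPU_R, STARPU_W },
};

int main(void)
{
    float *u = calloc(N, sizeof(float)), *unew = calloc(N, sizeof(float));
    u[N / 2] = 1.0f;                         /* point source in the middle */

    if (starpu_init(NULL) != 0) return 1;

    starpu_data_handle_t hu, hnew;
    starpu_vector_data_register(&hu,   STARPU_MAIN_RAM, (uintptr_t)u,    N, sizeof(float));
    starpu_vector_data_register(&hnew, STARPU_MAIN_RAM, (uintptr_t)unew, N, sizeof(float));

    /* Submit one time step; the runtime infers dependencies between tasks
     * from the declared data access modes. */
    starpu_task_insert(&fd_cl, STARPU_R, hu, STARPU_W, hnew, 0);
    starpu_task_wait_for_all();

    starpu_data_unregister(hu);
    starpu_data_unregister(hnew);
    starpu_shutdown();
    free(u); free(unew);
    return 0;
}
```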

    PERFORMANCE ANALYSIS AND FITNESS OF GPGPU AND MULTICORE ARCHITECTURES FOR SCIENTIFIC APPLICATIONS

    Recent trends in computing architecture development have focused on exploiting task- and data-level parallelism from applications. Major hardware vendors are experimenting with novel parallel architectures, such as the Many Integrated Core (MIC) from Intel, which integrates 50 or more x86 cores on a single chip, and the Accelerated Processing Unit from AMD, which integrates a multicore x86 processor with a graphics processing unit (GPU); many other initiatives from other hardware vendors are underway. Various types of architectures are therefore available to developers for accelerating an application, and a performance model that predicts the suitability of an architecture for a given application would be very helpful prior to implementation. Thus, in this research, a Fitness model that ranks the potential performance of accelerators for an application is proposed. The Fitness model is then extended using statistical multiple regression to model both the runtime performance of accelerators and the impact of programming models on accelerator performance with a high degree of accuracy. We have validated both performance models for all the case studies; the error rate of these models, calculated using the experimental performance data, is tolerable in the high-performance computing field.
    To develop and validate the two performance models, we have also analyzed the performance of several multicore CPU and GPGPU architectures and the corresponding programming models using multiple case studies. The first case study is a matrix-matrix multiplication algorithm; by varying the matrix size from small to very large, the performance of the multicore and GPGPU architectures is studied. The second case study is a biological spiking neural network (SNN), implemented with four neuron models that have varying requirements for communication and computation, making them useful for performance analysis of the hardware platforms. We report and analyze the performance variation of four popular accelerators (Intel Xeon, AMD Opteron, Nvidia Fermi, and IBM PS3) and four advanced CPU architectures (Intel 32-core, AMD 32-core, IBM 16-core, and Sun 32-core) with problem size (matrix and network size) scaling, available optimization techniques, and execution configuration. This thorough analysis provides insight into how the performance of an accelerator is affected by problem size, optimization techniques, and accelerator configuration.
    We have also analyzed the performance impact of four popular multicore parallel programming models, POSIX threads, Open Multi-Processing (OpenMP), Open Computing Language (OpenCL), and Concurrency Runtime, on an Intel i7 multicore architecture, and of two GPGPU programming models, Compute Unified Device Architecture (CUDA) and OpenCL, on an NVIDIA GPGPU. With this broad study, conducted over a wide range of application complexity, multiple optimizations, and varying problem size, it was found that, according to their achievable performance, the programming models for the x86 processor cannot be ranked consistently across all applications, whereas the programming models for the GPGPU can be ranked conclusively. We have also qualitatively and quantitatively ranked all six programming models in terms of their perceived programming effort. The results and analysis in this research, supported by the proposed performance models, indicate that for a given hardware system, the best performance for an application is obtained with a proper match of programming model and architecture.
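    In its simplest multicore form, the matrix-matrix multiplication case study mentioned above is a triple loop parallelized with OpenMP. The sketch below is a generic, assumed baseline of that kind; the matrix size, loop order, and scheduling are illustrative and are not the tuned kernels or the regression models evaluated in the thesis.

```c
/* Naive C = A * B matrix-matrix multiplication parallelized with OpenMP,
 * the kind of baseline kernel used for problem-size scaling studies.
 * Build: gcc -O2 -fopenmp matmul.c */
#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

static void matmul(const double *A, const double *B, double *C, int n)
{
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++) {
            double acc = 0.0;
            for (int k = 0; k < n; k++)
                acc += A[i * n + k] * B[k * n + j];
            C[i * n + j] = acc;
        }
}

int main(void)
{
    int n = 512;                               /* problem size scaled in the study */
    double *A = malloc(sizeof(double) * n * n);
    double *B = malloc(sizeof(double) * n * n);
    double *C = malloc(sizeof(double) * n * n);
    for (int i = 0; i < n * n; i++) { A[i] = 1.0; B[i] = 2.0; }

    double t0 = omp_get_wtime();
    matmul(A, B, C, n);
    double t1 = omp_get_wtime();

    /* Timings like this, repeated over matrix sizes, thread counts, and
     * programming models, are the raw data a regression-based model fits. */
    printf("n=%d  time=%.3f s  C[0]=%.0f\n", n, t1 - t0, C[0]);
    free(A); free(B); free(C);
    return 0;
}
```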

    SOLID-SHELL FINITE ELEMENT MODELS FOR EXPLICIT SIMULATIONS OF CRACK PROPAGATION IN THIN STRUCTURES

    Crack propagation in thin shell structures due to cutting is conveniently simulated using explicit finite element approaches, in view of the high nonlinearity of the problem. Solid-shell elements are usually preferred for the discretization in the presence of complex material behavior and degradation phenomena such as delamination, since they allow for a correct representation of the thickness geometry. However, in solid-shell elements the small thickness leads to a very high maximum eigenfrequency, which implies very small stable time steps. A new selective mass scaling technique is proposed to increase the time-step size without affecting accuracy. New "directional" cohesive interface elements are used in conjunction with selective mass scaling to account for the interaction with a sharp blade in cutting processes of thin ductile shells.
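    To make the time-step argument concrete, the back-of-the-envelope C sketch below shows how a small through-thickness element size drives the critical time step of an explicit central-difference scheme, and how scaling the lumped mass relaxes it. It illustrates conventional, uniform mass scaling on a single degree of freedom with invented material values; it is not the selective mass scaling operator proposed in the work.

```c
/* Why a thin solid-shell element limits the explicit time step, and how mass
 * scaling relaxes it.  Generic illustration only; values are made up. */
#include <math.h>
#include <stdio.h>

int main(void)
{
    double E   = 210e9;      /* Young's modulus [Pa], steel-like        */
    double rho = 7800.0;     /* density [kg/m^3]                        */
    double h   = 0.5e-3;     /* element size in the thickness direction */

    /* For explicit central differences, dt_crit = 2 / omega_max, which for a
     * 1-D element reduces to the CFL-like estimate h / c.                   */
    double c       = sqrt(E / rho);      /* dilatational wave speed */
    double dt_crit = h / c;
    printf("unscaled   dt_crit ~ %.3e s\n", dt_crit);

    /* Scaling the lumped mass by a factor s scales omega_max by 1/sqrt(s),
     * so the stable time step grows by sqrt(s).                             */
    double s = 100.0;
    printf("mass x%-4g dt_crit ~ %.3e s\n", s, dt_crit * sqrt(s));
    return 0;
}
```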

    GPU data structures for graphics and vision

    Graphics hardware has in recent years become increasingly programmable, and its programming APIs use the stream processor model to expose massive parallelization to the programmer. Unfortunately, the inherent restrictions of the stream processor model, which the GPU relies on to maintain high performance, often pose a problem when porting CPU algorithms for video and volume processing to graphics hardware: serial data dependencies that accelerate CPU processing are counterproductive for the data-parallel GPU. This thesis demonstrates new ways of tackling well-known problems of large-scale video/volume analysis. In some instances, we enable processing on the restricted hardware model by re-introducing algorithms from early computer graphics research. On other occasions, we use newly discovered, hierarchical data structures to circumvent the random-access-read/fixed-write restriction that had previously kept sophisticated analysis algorithms from running solely on graphics hardware. For 3D processing, we apply known game graphics concepts such as mip-maps, projective texturing, and dependent texture lookups to show how video/volume processing can benefit algorithmically from being implemented in a graphics API. The novel GPU data structures provide drastically increased processing speed and lift processing-heavy operations to real-time performance levels, paving the way for new and interactive vision/graphics applications.
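    As a small, assumed illustration of the hierarchical (mip-map style) data structures mentioned above, the CPU reference below builds a max-reduction pyramid: each level halves the resolution and keeps the 2x2 maximum, the kind of multi-pass gather that replaces scattered writes under the stream model. The layout and function names are illustrative, not the thesis's actual GPU data structures.

```c
/* CPU reference for a mip-map style reduction pyramid: each texel at level
 * L+1 is the maximum of a 2x2 block at level L.  On the GPU, repeated
 * render-to-texture passes over such a pyramid compute the same result
 * without scattered writes.  Illustrative sketch only. */
#include <stdio.h>
#include <stdlib.h>

/* Reduce an n x n image (n a power of two) to an (n/2) x (n/2) image. */
static void reduce_max(const float *src, float *dst, int n)
{
    int m = n / 2;
    for (int y = 0; y < m; y++)
        for (int x = 0; x < m; x++) {
            float a = src[(2 * y)     * n + 2 * x];
            float b = src[(2 * y)     * n + 2 * x + 1];
            float c = src[(2 * y + 1) * n + 2 * x];
            float d = src[(2 * y + 1) * n + 2 * x + 1];
            float mx = a > b ? a : b;
            if (c > mx) mx = c;
            if (d > mx) mx = d;
            dst[y * m + x] = mx;
        }
}

int main(void)
{
    int n = 256;
    float *level = malloc(sizeof(float) * n * n);
    for (int i = 0; i < n * n; i++) level[i] = (float)(i % 1000);

    /* Build the pyramid down to 1x1: log2(n) reduction passes. */
    while (n > 1) {
        float *next = malloc(sizeof(float) * (n / 2) * (n / 2));
        reduce_max(level, next, n);
        free(level);
        level = next;
        n /= 2;
    }
    printf("global max = %f\n", level[0]);
    free(level);
    return 0;
}
```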

    Virtual Reality Methods for Research in the Geosciences

    In the presented work, I evaluate if and how Virtual Reality (VR) technologies can be used to support researchers working in the geosciences by providing immersive, collaborative visualization systems as well as virtual tools for data analysis. Technical challenges encountered in the development of these systems are identified and solutions for these are provided. To enable geologists to explore large digital terrain models (DTMs) in an immersive, explorative fashion within a VR environment, a suitable terrain rendering algorithm is required. For realistic perception of planetary curvature at large viewer altitudes, spherical rendering of the surface is necessary. Furthermore, rendering must sustain interactive frame rates of about 30 frames per second to avoid sensory confusion of the user. At the same time, the data structures used for visualization should also be suitable for efficiently computing spatial properties such as height profiles or volumes in order to implement virtual analysis tools. To address these requirements, I have developed a novel terrain rendering algorithm based on tiled quadtree hierarchies using the HEALPix parametrization of a sphere. For evaluation purposes, the system is applied to a 500 GiB dataset representing the surface of Mars. Considering the current development of inexpensive remote surveillance equipment such as quadcopters, it seems inevitable that these devices will play a major role in future disaster management applications. Virtual reality installations in disaster management headquarters which provide an immersive visualization of near-live, three-dimensional situational data could then be a valuable asset for rapid, collaborative decision making. Most terrain visualization algorithms, however, require a computationally expensive pre-processing step to construct a terrain database. To address this problem, I present an on-the-fly pre-processing system for cartographic data. The system consists of a frontend for rendering and interaction as well as a distributed processing backend executing on a small cluster which produces tiled data in the format required by the frontend on demand. The backend employs a CUDA-based algorithm on graphics cards to perform efficient conversion from cartographic standard projections to the HEALPix-based grid used by the frontend. Measurement of spatial properties is an important step in quantifying geological phenomena. When performing these tasks in a VR environment, a suitable input device and abstraction for the interaction (a “virtual tool”) must be provided. This tool should enable the user to precisely select the location of the measurement even under a perspective projection. Furthermore, the measurement process should be accurate to the resolution of the data available and should not have a large impact on the frame rate in order not to violate interactivity requirements. I have implemented virtual tools based on the HEALPix data structure for measurement of height profiles as well as volumes. For interaction, a ray-based picking metaphor was employed, using a virtual selection ray extending from the user’s hand holding a VR interaction device. To provide maximum accuracy, the algorithms access the quadtree terrain database at the highest available resolution level while at the same time maintaining interactivity in rendering. Geological faults are cracks in the Earth’s crust along which a differential movement of rock volumes can be observed.
    Quantifying the direction and magnitude of such translations is an essential requirement in understanding Earth's geological history. For this purpose, geologists traditionally use maps in top-down projection which are cut (e.g., using image-editing software) along the suspected fault trace. The two resulting pieces of the map are then translated in parallel against each other until surface features which have been cut by the fault motion come back into alignment. The amount of translation applied is then used as a hypothesis for the magnitude of the fault action. In the scope of this work it is shown, however, that performing this study in a top-down perspective can lead to the acceptance of faulty reconstructions, since the three-dimensional structure of topography is not considered. To address this problem, I present a novel terrain deformation algorithm which allows the user to trace a fault line directly within a 3D terrain visualization system and interactively deform the terrain model while inspecting the resulting reconstruction from arbitrary perspectives. I demonstrate that the application of 3D visualization allows for a more informed interpretation of fault reconstruction hypotheses. The algorithm is implemented on graphics cards and performs real-time geometric deformation of the terrain model, guaranteeing interactivity with respect to all parameters. Paleoceanography is the study of the prehistoric evolution of the ocean. One of the key data sources used in this research is coring experiments, which provide point samples of layered sediment depositions at the ocean floor. The samples obtained in these experiments document the time-varying sediment concentrations within the ocean water at the point of measurement. The task of recovering the ocean flow patterns based on these deposition records is, however, a challenging inverse numerical problem. To support domain scientists working on this problem, I have developed a VR visualization tool to aid in the verification of model parameters by providing simultaneous visualization of experimental data from coring as well as the resulting predicted flow field obtained from numerical simulation. Earth is visualized as a globe in the VR environment, with coring data presented using a billboard rendering technique while the time-variant flow field is indicated using Line-Integral-Convolution (LIC). To study individual sediment transport pathways and their correlation with the depositional record, interactive particle injection and real-time advection are supported.
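    The tiled quadtree on the HEALPix sphere that underlies both the renderer and the measurement tools described above lends itself to very compact tile addressing. The sketch below shows one plausible addressing scheme built on the 12 HEALPix base cells; the struct layout and naming are assumptions for illustration, not the thesis's actual data structures.

```c
/* Sketch of tile addressing in a tiled quadtree built on the 12 HEALPix base
 * cells, as used for out-of-core spherical terrain rendering.  Illustrative. */
#include <stdio.h>

typedef struct {
    int  base;     /* which of the 12 HEALPix base cells (0..11)          */
    int  level;    /* quadtree subdivision level, 0 = base cell           */
    long x, y;     /* tile coordinates within the base cell, 0..2^level-1 */
} TileAddr;

/* Each tile has four children; child c in {0,1,2,3} selects a quadrant. */
static TileAddr child_of(TileAddr t, int c)
{
    TileAddr ch = { t.base, t.level + 1, 2 * t.x + (c & 1), 2 * t.y + (c >> 1) };
    return ch;
}

/* Total number of tiles at a given level over the whole sphere: 12 * 4^level. */
static long tiles_at_level(int level)
{
    return 12L * (1L << level) * (1L << level);
}

int main(void)
{
    TileAddr root = { 3, 0, 0, 0 };
    TileAddr t = child_of(child_of(root, 2), 1);
    printf("base %d level %d tile (%ld,%ld); %ld tiles at level %d\n",
           t.base, t.level, t.x, t.y, tiles_at_level(t.level), t.level);
    return 0;
}
```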

    Proceedings of the ECCOMAS Thematic Conference on Multibody Dynamics 2015

    This volume contains the full papers accepted for presentation at the ECCOMAS Thematic Conference on Multibody Dynamics 2015 held in the Barcelona School of Industrial Engineering, Universitat Politècnica de Catalunya, on June 29 - July 2, 2015. The ECCOMAS Thematic Conference on Multibody Dynamics is an international meeting held once every two years in a European country. Continuing the very successful series of past conferences that have been organized in Lisbon (2003), Madrid (2005), Milan (2007), Warsaw (2009), Brussels (2011) and Zagreb (2013), this edition will once again serve as a meeting point for international researchers, scientists and experts from academia, research laboratories and industry working in the area of multibody dynamics. Applications are related to many fields of contemporary engineering, such as vehicle and railway systems, aeronautical and space vehicles, robotic manipulators, mechatronic and autonomous systems, smart structures, biomechanical systems and nanotechnologies. The topics of the conference include, but are not restricted to:
    ● Formulations and Numerical Methods
    ● Efficient Methods and Real-Time Applications
    ● Flexible Multibody Dynamics
    ● Contact Dynamics and Constraints
    ● Multiphysics and Coupled Problems
    ● Control and Optimization
    ● Software Development and Computer Technology
    ● Aerospace and Maritime Applications
    ● Biomechanics
    ● Railroad Vehicle Dynamics
    ● Road Vehicle Dynamics
    ● Robotics
    ● Benchmark Problems

    Multibody dynamics 2015

    This volume contains the full papers accepted for presentation at the ECCOMAS Thematic Conference on Multibody Dynamics 2015 held in the Barcelona School of Industrial Engineering, Universitat Politècnica de Catalunya, on June 29 - July 2, 2015. The ECCOMAS Thematic Conference on Multibody Dynamics is an international meeting held once every two years in a European country. Continuing the very successful series of past conferences that have been organized in Lisbon (2003), Madrid (2005), Milan (2007), Warsaw (2009), Brussels (2011) and Zagreb (2013), this edition will once again serve as a meeting point for international researchers, scientists and experts from academia, research laboratories and industry working in the area of multibody dynamics. Applications are related to many fields of contemporary engineering, such as vehicle and railway systems, aeronautical and space vehicles, robotic manipulators, mechatronic and autonomous systems, smart structures, biomechanical systems and nanotechnologies. The topics of the conference include, but are not restricted to: Formulations and Numerical Methods, Efficient Methods and Real-Time Applications, Flexible Multibody Dynamics, Contact Dynamics and Constraints, Multiphysics and Coupled Problems, Control and Optimization, Software Development and Computer Technology, Aerospace and Maritime Applications, Biomechanics, Railroad Vehicle Dynamics, Road Vehicle Dynamics, Robotics, and Benchmark Problems. The conference is organized by the Department of Mechanical Engineering of the Universitat Politècnica de Catalunya (UPC) in Barcelona. The organizers would like to thank the authors for submitting their contributions, the keynote lecturers for accepting the invitation and for the quality of their talks, the awards and scientific committees for their support to the organization of the conference, and finally the topic organizers for reviewing all extended abstracts and selecting the awards nominees.

    Electromagnetic ray-tracing for the investigation of multipath and vibration signatures in radar imagery

    Synthetic Aperture Radar (SAR) imagery has been used extensively within UK Defence and Intelligence for many years. Despite this, the exploitation of SAR imagery is still challenging for the inexperienced imagery analyst, as the non-literal image provided for exploitation requires careful consideration of the imaging geometry, the target being imaged, and the physics of radar interactions with objects. It is therefore not surprising that in 2017 the most useful tool available to a radar imagery analyst is a contextual optical image of the same area. This body of work presents a way to address this by adopting recent advances in radar signal processing and computational geometry to develop a SAR simulator called SARCASTIC (SAR Ray-Caster for the Intelligence Community) that can rapidly render a scene with the precise collection geometry of an image being exploited. The work provides a detailed derivation of the simulator from first principles. It is then validated against a range of real-world SAR collection systems. The work shows that such a simulator can provide an analyst with the tools needed to extract intelligence from a collection that is unavailable to a conventional imaging system. The thesis then describes a new technique that allows a vibrating target to be detected within a SAR collection. The simulator is used to predict a unique scattering signature, described as a one-sided paired echo. Finally, an experiment is described that was performed by Cranfield University to specifications determined by SARCASTIC, which shows that this unique radar signature can indeed occur within a SAR collection.
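    For background on why a vibrating scatterer leaves a detectable signature at all, the sketch below reproduces the classical (two-sided) paired-echo effect numerically: a small sinusoidal phase modulation of amplitude beta radians places sidebands at plus and minus the vibration frequency with relative amplitude of roughly beta/2. All parameter values are invented for illustration, and the one-sided paired echo predicted by SARCASTIC in the thesis is a distinct result that this toy calculation does not reproduce.

```c
/* Classical paired-echo illustration: exp(j*beta*sin(2*pi*fv*t)) expands into
 * Bessel-weighted sidebands; for small beta the +/-fv sidebands have relative
 * amplitude ~ beta/2.  Background sketch only, with invented parameters. */
#include <complex.h>
#include <math.h>
#include <stdio.h>

int main(void)
{
    const double PI   = acos(-1.0);
    const int    N    = 4000;     /* integer number of vibration periods */
    const double fs   = 1000.0;   /* sample rate [Hz]                    */
    const double fv   = 50.0;     /* vibration frequency [Hz]            */
    const double beta = 0.2;      /* peak phase deviation [rad]          */

    /* Measure the complex amplitude at 0, +fv and -fv by direct correlation. */
    double complex a0 = 0, ap = 0, am = 0;
    for (int n = 0; n < N; n++) {
        double t = n / fs;
        double complex s = cexp(I * beta * sin(2.0 * PI * fv * t));
        a0 += s;
        ap += s * cexp(-I * 2.0 * PI * fv * t);   /* sideband at +fv */
        am += s * cexp(+I * 2.0 * PI * fv * t);   /* sideband at -fv */
    }
    printf("carrier %.3f  +fv %.3f  -fv %.3f  (beta/2 = %.3f)\n",
           cabs(a0) / N, cabs(ap) / N, cabs(am) / N, beta / 2.0);
    return 0;
}
```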