Search CORE

36,072 research outputs found

AlSub: Fully Parallel and Modular Subdivision

Author: Mlakar Daniel
Seidel Hans-Peter
Steinberger Markus
Winter Martin
Zayer Rhaleb
Publication venue
Publication date: 01/01/2018
Field of study

In recent years, mesh subdivision---the process of forging smooth free-form surfaces from coarse polygonal meshes---has become an indispensable production instrument. Although subdivision performance is crucial during simulation, animation and rendering, state-of-the-art approaches still rely on serial implementations for complex parts of the subdivision process. Therefore, they often fail to harness the power of modern parallel devices, like the graphics processing unit (GPU), for large parts of the algorithm and must resort to time-consuming serial preprocessing. In this paper, we show that a complete parallelization of the subdivision process for modern architectures is possible. Building on sparse matrix linear algebra, we show how to structure the complete subdivision process into a sequence of algebra operations. By restructuring and grouping these operations, we adapt the process for different use cases, such as regular subdivision of dynamic meshes, uniform subdivision for immutable topology, and feature-adaptive subdivision for efficient rendering of animated models. As the same machinery is used for all use cases, identical subdivision results are achieved in all parts of the production pipeline. As a second contribution, we show how these linear algebra formulations can effectively be translated into efficient GPU kernels. Applying our strategies to

\sqrt{3}

, Loop and Catmull-Clark subdivision shows significant speedups of our approach compared to state-of-the-art solutions, while we completely avoid serial preprocessing.Comment: Changed structure Added content Improved description

arXiv.org e-Print Archive

MPG.PuRe

Recommended from our members

Computing infrastructure issues in distributed communications systems : a survey of operating system transport system architectures

Author: Schmidt Douglas C.
Suda Tatsuya
Publication venue: eScholarship, University of California
Publication date: 01/01/1992
Field of study

The performance of distributed applications (such as file transfer, remote login, tele-conferencing, full-motion video, and scientific visualization) is influenced by several factors that interact in complex ways. In particular, application performance is significantly affected both by communication infrastructure factors and computing infrastructure factors. Several communication infrastructure factors include channel speed, bit-error rate, and congestion at intermediate switching nodes. Computing infrastructure factors include (among other things) both protocol processing activities (such as connection management, flow control, error detection, and retransmission) and general operating system factors (such as memory latency, CPU speed, interrupt and context switching overhead, process architecture, and message buffering). Due to a several orders of magnitude increase in network channel speed and an increase in application diversity, performance bottlenecks are shifting from the network factors to the transport system factors.This paper defines an abstraction called an "Operating System Transport System Architecture" (OSTSA) that is used to classify the major components and services in the computing infrastructure. End-to-end network protocols such as TCP, TP4, VMTP, XTP, and Delta-t typically run on general-purpose computers, where they utilize various operating system resources such as processors, virtual memory, and network controllers. The OSTSA provides services that integrate these resources to support distributed applications running on local and wide area networks.A taxonomy is presented to evaluate OSTSAs in terms of their support for protocol processing activities. We use this taxonomy to compare and contrast five general-purpose commercial and experimental operating systems including System V UNIX, BSD UNIX, the x-kernel, Choices, and Xinu

eScholarship - University of California

Optimisation of a parallel ocean general circulation model

Author: Beare MI
Stevens DP
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/1997
Field of study

Abstract. This paper presents the development of a general-purpose parallel ocean circulation model, for use on a wide range of computer platforms, from traditional scalar machines to workstation clusters and massively parallel processors. Parallelism is provided, as a modular option, via high-level message-passing rou- tines, thus hiding the technical intricacies from the user. An initial implementation highlights that the parallel e?ciency of the model is adversely a?ected by a number of factors, for which optimisations are discussed and implemented. The resulting ocean code is portable and, in particular, allows science to be achieved on local workstations that could otherwise only be undertaken on state-of-the-art supercomputers

Crossref

Directory of Open Access Journals

HAL-INSU

University of East Anglia digital repository

Connecting the World of Embedded Mobiles: The RIOT Approach to Ubiquitous Networking for the Internet of Things

Author: Baccelli Emmanuel
Gündoğan Cenk
Hahm Oliver
Kietzmann Peter
Lenders Martine
Petersen Hauke
Schleiser Kaspar
Schmidt Thomas C.
Wählisch Matthias
Publication venue
Publication date: 01/01/2018
Field of study

The Internet of Things (IoT) is rapidly evolving based on low-power compliant protocol standards that extend the Internet into the embedded world. Pioneering implementations have proven it is feasible to inter-network very constrained devices, but had to rely on peculiar cross-layered designs and offer a minimalistic set of features. In the long run, however, professional use and massive deployment of IoT devices require full-featured, cleanly composed, and flexible network stacks. This paper introduces the networking architecture that turns RIOT into a powerful IoT system, to enable low-power wireless scenarios. RIOT networking offers (i) a modular architecture with generic interfaces for plugging in drivers, protocols, or entire stacks, (ii) support for multiple heterogeneous interfaces and stacks that can concurrently operate, and (iii) GNRC, its cleanly layered, recursively composed default network stack. We contribute an in-depth analysis of the communication performance and resource efficiency of RIOT, both on a micro-benchmarking level as well as by comparing IoT communication across different platforms. Our findings show that, though it is based on significantly different design trade-offs, the networking subsystem of RIOT achieves a performance equivalent to that of Contiki and TinyOS, the two operating systems which pioneered IoT software platforms

arXiv.org e-Print Archive

REPOSIT