Search CORE

2,388 research outputs found

Optimizing sorting algorithms for the Cell Broadband Engine

Author: Offermann Eric J.
Publication venue: RIT Scholar Works
Publication date: 01/05/2010
Field of study

The quest for higher performance in computationally intensive tasks is and will always be an ongoing effort. General purpose processors (GPP) have not been sufficient for many of these tasks which has led to research focused towards computing on specialty processors and graphics processing units (GPU). While GPU provide sufficient speedups for some tasks, other specialty processors may be better suited, more economical, or more efficient for different types of tasks. Sorting is an important task in many applications and can be computationally intensive when dealing with large data sets. One such specialty processor that has proven to be a viable solution for sorting is the Cell Broadband Engine (CBE). The CBE is being used as the main platform for this thesis since there are already applications for it that require sorting software. The Cell processor is a general purpose processor that combines one master PowerPC core with eight other vector processors connected via a high bandwidth interconnect bus. The user must explicitly manage the communication, scheduling, and load-balancing between the vector processors and the PowerPC processor to achieve the highest efficiency. By optimizing the sorting algorithms for the vector processors, large speedups can be achieved because multiple operations occur simultaneously. Optimized sorting software is often sought when sorting is not the main purpose of the application. This keeps overheads low so that the performance gains can be realized from the actual code that is to be optimized on specialty processors. Often having sorted datasets enable algorithms to run faster and are more predictably. The motivation behind this thesis is that there is currently no standard library of sorting algorithms that have been optimized for the CBE. Lack of standard libraries makes writing code for the CBE difficult. Results from previous works have also not been sufficient in providing specific measurements of sorting performance. This thesis will explore the development and analysis of a variety of optimized parallel sorting algorithms written for the Cell processor. This thesis will focus on the sorting of both individual elements within vectors as well as sorting entire vectors within arrays. The sorting algorithms, written in C++, that will be optimized and analyzed include, but are not limited to bitonic sort, heap sort, merge sort, and quick sort. A communication management framework will also be created as a main focus of this thesis in order to better understand the architecture of the processor

RIT Scholar Works

Mixing multi-core CPUs and GPUs for scientific simulation software

Author: Hawick K.A.
Leist A.
Playne D.P.
Publication venue: 'Massey University'
Publication date: 01/01/2010
Field of study

Recent technological and economic developments have led to widespread availability of multi-core CPUs and specialist accelerator processors such as graphical processing units (GPUs). The accelerated computational performance possible from these devices can be very high for some applications paradigms. Software languages and systems such as NVIDIA's CUDA and Khronos consortium's open compute language (OpenCL) support a number of individual parallel application programming paradigms. To scale up the performance of some complex systems simulations, a hybrid of multi-core CPUs for coarse-grained parallelism and very many core GPUs for data parallelism is necessary. We describe our use of hybrid applica- tions using threading approaches and multi-core CPUs to control independent GPU devices. We present speed-up data and discuss multi-threading software issues for the applications level programmer and o er some suggested areas for language development and integration between coarse-grained and ne-grained multi-thread systems. We discuss results from three common simulation algorithmic areas including: partial di erential equations; graph cluster metric calculations and random number generation. We report on programming experiences and selected performance for these algorithms on: single and multiple GPUs; multi-core CPUs; a CellBE; and using OpenCL. We discuss programmer usability issues and the outlook and trends in multi-core programming for scienti c applications developers

Massey Research Online

A distributed QoS Routing and CAC framework: performance evaluation of its SSRA and InterD Agents

Author: Barolli Leonard
Durresi Arjan
Ikeda Makoto
Koyama Akio
Xhafa Xhafa Fatos
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2007
Field of study

In order to support multimedia communication, it is necessary to develop routing algorithms which use for routing more than one QoS parameters. This is because new services such as video on demand and remote meeting systems require better QoS. Also, for admission control of multimedia applications different QoS parameters should be considered. In our previous work, we proposed an intelligent routing and CAC strategy using cooperative agents. In this paper, we propose and evaluate the performance of SSRA algorithm and a GA-based InterD agent. Performace evaluation shows that proposed agents have a good behaviorPeer ReviewedPostprint (published version

Crossref

UPCommons. Portal del coneixement obert de la UPC

Broadband Continuous-time MASH Sigma-Delta ADCs

Author: Zhang Chenming
Publication venue: Eindhoven University of Technology
Publication date: 21/09/2021
Field of study

Pure OAI Repository

A quadri-dimensional approach for poor performance prioritization in mobile networks using Big Data

Author: Mampaka Maluambanzila Minerve
Sumbwanyambe Mbuyu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/02/2019
Field of study

Abstract The Management of mobile networks has become so complex due to a huge number of devices, technologies and services involved. Network optimization and incidents management in mobile networks determine the level of the quality of service provided by the communication service providers (CSPs). Generally, the down time of a system and the time taken to repair [mean time to repair (MTTR)] has a direct impact on the revenue, especially on the operational expenditure (OPEX). A fast root cause analysis (RCA) mechanism is therefore crucial to improve the efficiency of the operational team within the CSPs. This paper proposes a quadri-dimensional approach (i.e. services, subscribers, handsets and cells) to build a service quality management (SQM) tree in a Big Data platform. This is meant to speed up the root cause analysis and prioritize the elements impacting the performance of the network. Two algorithms have been proposed; the first one, to normalize the performance indicators and the second one to build the SQM tree by aggregating the performance indicators for different dimensions to allow ranking and detection of tree paths with the worst performance. Additionally, the proposed approach will allow CSPs to detect the mobile network dimensions causing network issues in a faster way and protect their revenue while improving the quality of the service delivered

Directory of Open Access Journals

Unisa Institutional Repository