2,388 research outputs found

    Optimizing sorting algorithms for the Cell Broadband Engine

    The quest for higher performance in computationally intensive tasks is, and will always be, an ongoing effort. General-purpose processors (GPPs) have not been sufficient for many of these tasks, which has led to research on computing with specialty processors and graphics processing units (GPUs). While GPUs provide sufficient speedups for some tasks, other specialty processors may be better suited, more economical, or more efficient for different types of tasks. Sorting is an important task in many applications and can be computationally intensive when dealing with large data sets. One such specialty processor that has proven to be a viable solution for sorting is the Cell Broadband Engine (CBE). The CBE is used as the main platform for this thesis since there are already applications for it that require sorting software. The Cell processor is a general-purpose processor that combines one master PowerPC core with eight vector processors connected via a high-bandwidth interconnect bus. The user must explicitly manage the communication, scheduling, and load balancing between the vector processors and the PowerPC processor to achieve the highest efficiency. By optimizing the sorting algorithms for the vector processors, large speedups can be achieved because multiple operations occur simultaneously. Optimized sorting software is often sought when sorting is not the main purpose of the application. This keeps overheads low so that the performance gains can be realized from the actual code that is to be optimized on specialty processors. Having sorted datasets often enables algorithms to run faster and more predictably. The motivation behind this thesis is that there is currently no standard library of sorting algorithms optimized for the CBE. The lack of standard libraries makes writing code for the CBE difficult. Previous work has also fallen short in providing specific measurements of sorting performance.
This thesis explores the development and analysis of a variety of optimized parallel sorting algorithms written for the Cell processor. It focuses on sorting individual elements within vectors as well as sorting entire vectors within arrays. The sorting algorithms, written in C++, that will be optimized and analyzed include, but are not limited to, bitonic sort, heap sort, merge sort, and quick sort. A communication management framework will also be created as a main focus of this thesis in order to better understand the architecture of the processor.
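As a rough illustration of why bitonic sort suits vector processors, here is a minimal scalar sketch in portable C++. It is not the thesis's code and uses no Cell-specific intrinsics; the function name `bitonic_sort` and the power-of-two size restriction are assumptions made for this example. For a fixed pair of loop values `k` and `j`, every compare-exchange in the inner loop touches a disjoint pair of elements, so those operations are independent and could in principle be executed in parallel or with SIMD instructions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Sorts `data` in ascending order with an iterative bitonic sorting network.
// Assumption for this sketch: data.size() is zero or a power of two.
void bitonic_sort(std::vector<int>& data) {
    const std::size_t n = data.size();
    assert((n & (n - 1)) == 0 && "size must be zero or a power of two");

    // k: length of the bitonic sequences merged in this stage.
    for (std::size_t k = 2; k <= n; k <<= 1) {
        // j: distance between the two elements of each compare-exchange.
        for (std::size_t j = k >> 1; j > 0; j >>= 1) {
            // All iterations of this loop are independent of each other.
            for (std::size_t i = 0; i < n; ++i) {
                const std::size_t partner = i ^ j;
                if (partner <= i) continue;  // handle each pair once

                // Blocks of length k alternate between ascending and descending.
                const bool ascending = (i & k) == 0;
                const bool out_of_order = ascending ? data[i] > data[partner]
                                                    : data[i] < data[partner];
                if (out_of_order) std::swap(data[i], data[partner]);
            }
        }
    }
}
```

Calling `bitonic_sort` on a `std::vector<int>` holding eight values leaves the vector in ascending order. The sequence of comparisons depends only on the input size, never on the data, which is the property that makes the network easy to map onto fixed-width vector units.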

    Mixing multi-core CPUs and GPUs for scientific simulation software

    Recent technological and economic developments have led to widespread availability of multi-core CPUs and specialist accelerator processors such as graphical processing units (GPUs). The accelerated computational performance possible from these devices can be very high for some application paradigms. Software languages and systems such as NVIDIA's CUDA and the Khronos consortium's open compute language (OpenCL) support a number of individual parallel application programming paradigms. To scale up the performance of some complex systems simulations, a hybrid of multi-core CPUs for coarse-grained parallelism and very-many-core GPUs for data parallelism is necessary. We describe our use of hybrid applications using threading approaches and multi-core CPUs to control independent GPU devices. We present speed-up data, discuss multi-threading software issues for the application-level programmer, and offer some suggested areas for language development and integration between coarse-grained and fine-grained multi-thread systems. We discuss results from three common simulation algorithmic areas: partial differential equations, graph cluster metric calculations, and random number generation. We report on programming experiences and selected performance for these algorithms on single and multiple GPUs, multi-core CPUs, a CellBE, and using OpenCL. We discuss programmer usability issues and the outlook and trends in multi-core programming for scientific applications developers.

    A distributed QoS Routing and CAC framework: performance evaluation of its SSRA and InterD Agents

    In order to support multimedia communication, it is necessary to develop routing algorithms that route on more than one QoS parameter. This is because new services such as video on demand and remote meeting systems require better QoS. Also, different QoS parameters should be considered for admission control of multimedia applications. In our previous work, we proposed an intelligent routing and CAC strategy using cooperative agents. In this paper, we propose and evaluate the performance of the SSRA algorithm and a GA-based InterD agent. Performance evaluation shows that the proposed agents behave well. Peer Reviewed. Postprint (published version).

    Broadband Continuous-time MASH Sigma-Delta ADCs


    A quadri-dimensional approach for poor performance prioritization in mobile networks using Big Data

    The management of mobile networks has become highly complex due to the huge number of devices, technologies and services involved. Network optimization and incident management in mobile networks determine the level of the quality of service provided by the communication service providers (CSPs). Generally, the downtime of a system and the time taken to repair it [mean time to repair (MTTR)] have a direct impact on revenue, especially on the operational expenditure (OPEX). A fast root cause analysis (RCA) mechanism is therefore crucial to improve the efficiency of the operational team within the CSPs. This paper proposes a quadri-dimensional approach (i.e. services, subscribers, handsets and cells) to build a service quality management (SQM) tree in a Big Data platform. This is meant to speed up the root cause analysis and prioritize the elements impacting the performance of the network. Two algorithms are proposed: the first normalizes the performance indicators, and the second builds the SQM tree by aggregating the performance indicators for different dimensions to allow ranking and detection of the tree paths with the worst performance. Additionally, the proposed approach will allow CSPs to detect the mobile network dimensions causing network issues faster and protect their revenue while improving the quality of the service delivered.
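The abstract does not say how the indicators are normalized or aggregated, so the following is only a loose sketch of the general idea of normalizing, aggregating per path, and ranking worst first. Everything specific here is an assumption for illustration: the `Observation` struct, min-max normalization, averaging as the aggregate, the convention that a higher indicator means worse performance, and the use of only two of the four dimensions (service and cell) as the path.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record: one indicator observed for a combination of the four
// dimensions. Assumption: a higher value means worse performance.
struct Observation {
    std::string service, subscriber, handset, cell;
    double indicator;
};

// Min-max normalization to [0, 1] (an assumed scheme, not the paper's).
std::vector<double> normalize(const std::vector<double>& values) {
    std::vector<double> out(values.size(), 0.0);
    if (values.empty()) return out;
    const auto mm = std::minmax_element(values.begin(), values.end());
    const double lo = *mm.first;
    const double hi = *mm.second;
    if (hi == lo) return out;  // all values equal: map everything to 0
    for (std::size_t i = 0; i < values.size(); ++i)
        out[i] = (values[i] - lo) / (hi - lo);
    return out;
}

using RankedPath = std::pair<std::string, double>;

// Averages the normalized indicator per "service/cell" path and returns the
// paths ordered from worst (highest average) to best.
std::vector<RankedPath> rank_paths(const std::vector<Observation>& obs) {
    std::vector<double> raw;
    for (const Observation& o : obs) raw.push_back(o.indicator);
    const std::vector<double> norm = normalize(raw);

    std::map<std::string, std::pair<double, int>> acc;  // path -> (sum, count)
    for (std::size_t i = 0; i < obs.size(); ++i) {
        std::pair<double, int>& a = acc[obs[i].service + "/" + obs[i].cell];
        a.first += norm[i];
        a.second += 1;
    }

    std::vector<RankedPath> ranked;
    for (const auto& kv : acc)
        ranked.emplace_back(kv.first, kv.second.first / kv.second.second);
    std::sort(ranked.begin(), ranked.end(),
              [](const RankedPath& a, const RankedPath& b) { return a.second > b.second; });
    return ranked;
}
```

Extending the path key with the subscriber and handset fields would give deeper tree paths at the cost of fewer observations per path.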