
    Dynamic Hierarchical Cache Management for Cloud RAN and Multi-Access Edge Computing in 5G Networks

    Cloud Radio Access Networks (CRAN) and Multi-Access Edge Computing (MEC) are two of the many emerging technologies proposed for 5G mobile networks. CRAN provides scalability, flexibility, and better resource utilization to support the dramatic increase of Internet of Things (IoT) and mobile devices. MEC aims to provide low-latency, high-bandwidth, real-time access to radio networks. In CRAN, a cloud architecture is built on top of the traditional Radio Access Network (RAN); in MEC, cloud computing services are brought near users to improve the user experience. A cache is added in both the CRAN and MEC architectures to speed up mobile network services. This research focuses on cache management for CRAN and MEC because this limited cache resource must be managed and utilized efficiently. First, a new cache management algorithm, H-EXD-AHP (Hierarchical Exponential Decay and Analytical Hierarchy Process), is proposed to improve on the existing EXD-AHP algorithm. Next, three dynamic cache management algorithms are designed and implemented on top of the proposed algorithm, H-EXD-AHP, and an existing algorithm, H-PBPS (Hierarchical Probability Based Popularity Scoring). In these designs, the cache shares allocated to users with different Service Level Agreements (SLAs) are adjusted dynamically to meet the cache hit rate guaranteed for each SLA class; our setting uses a minimum guaranteed cache hit rate, and as net neutrality rules are relaxed, such prioritized treatment is expected to become common practice. Finally, performance evaluation results show that these designs achieve the guaranteed cache hit rate for differentiated users according to their SLAs.
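
    The abstract does not reproduce the H-EXD-AHP scoring formulas, so the following is only a minimal sketch of the general idea behind exponential-decay cache scoring with SLA weighting: each cached item's score decays over time, is boosted on access by an SLA-dependent weight, and the lowest-scoring item is evicted when the cache is full. The class name, decay formula, and weights are illustrative assumptions, not the paper's definitions.

    ```python
    import math
    import time

    class DecayCache:
        """Sketch of exponential-decay cache scoring with SLA weights
        (illustrative only; the actual EXD-AHP/H-EXD-AHP scoring is
        defined in the paper, not reproduced here)."""

        def __init__(self, capacity, decay=0.01):
            self.capacity = capacity
            self.decay = decay      # score decay rate per second
            self.items = {}         # key -> (score, last_update_time)

        def _decayed(self, key, now):
            score, t = self.items[key]
            return score * math.exp(-self.decay * (now - t))

        def access(self, key, sla_weight=1.0, now=None):
            """Record an access; a higher sla_weight makes an item
            harder to evict."""
            now = time.time() if now is None else now
            if key in self.items:
                score = self._decayed(key, now) + sla_weight
            else:
                if len(self.items) >= self.capacity:
                    # Evict the entry with the lowest decayed score.
                    victim = min(self.items, key=lambda k: self._decayed(k, now))
                    del self.items[victim]
                score = sla_weight
            self.items[key] = (score, now)
    ```

    For example, accesses by gold-SLA users might call access(key, sla_weight=3.0) while bronze users use 1.0, so content popular with higher-SLA users survives eviction longer; the paper's dynamic designs then tune the resulting cache shares per SLA class.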

    Performance and Power Analysis of HPC Workloads on Heterogeneous Multi-Node Clusters

    Performance analysis tools allow application developers to identify and characterize the inefficiencies that cause performance degradation in their codes, enabling application optimizations. Given the High Performance Computing (HPC) community's increasing interest in energy-efficiency issues, it is of paramount importance to be able to correlate performance and power figures within the same profiling and analysis tools. For this reason, we present a performance and energy-efficiency study aimed at demonstrating how a single tool can be used to collect most of the relevant metrics. In particular, we show how the same analysis techniques are applicable on different architectures, analyzing the same HPC application on a high-end and a low-power cluster. The former cluster embeds Intel Haswell CPUs and NVIDIA K80 GPUs, while the latter is made up of NVIDIA Jetson TX1 boards, each hosting an Arm Cortex-A57 CPU and an NVIDIA Tegra X1 Maxwell GPU. The research leading to these results has received funding from the European Community's Seventh Framework Programme [FP7/2007-2013] and Horizon 2020 under the Mont-Blanc projects [17], grant agreements n. 288777, 610402 and 671697. E.C. was partially funded by "Contributo 5 per mille assegnato all'Università degli Studi di Ferrara-dichiarazione dei redditi dell'anno 2014". We thank the University of Ferrara and INFN Ferrara for access to the COKA cluster. We warmly thank the BSC tools group for supporting the smooth integration and testing of our setup within Extrae and Paraver.
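
    As a toy illustration of the kind of correlation such a tool enables (independent of Extrae/Paraver specifics, and using synthetic numbers), a sampled power trace can be integrated over a profiled code region to obtain its energy-to-solution:

    ```python
    import numpy as np

    def region_energy(t, p, t_start, t_end):
        """Trapezoidal integration of a power trace (t in seconds, p in
        watts) restricted to one profiled region; returns joules."""
        m = (t >= t_start) & (t <= t_end)
        return float(np.sum(0.5 * (p[m][1:] + p[m][:-1]) * np.diff(t[m])))

    t = np.linspace(0.0, 10.0, 1001)    # synthetic 10 s trace, 10 ms sampling
    p = 80.0 + 20.0 * np.sin(t)         # synthetic power samples (W)
    e = region_energy(t, p, 2.0, 8.0)   # energy of the region [2 s, 8 s]
    print(f"energy = {e:.1f} J, average power = {e / 6.0:.1f} W")
    ```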

    Optimising Simulation Data Structures for the Xeon Phi

    In this paper, we propose a lock-free architecture to accelerate logic gate circuit simulation using SIMD multi-core machines. We evaluate its performance on different test circuits simulated on the Intel Xeon Phi and two other machines. We compare this software/hardware combination with reported performance of GPU-based and other multi-core simulation platforms, and also compare the lock-free architecture with a leading commercial simulator running on the same Intel hardware.
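
    The abstract does not detail the lock-free design itself, but one standard reason gate-level simulation maps well onto SIMD multi-core hardware is levelization: gates are grouped by topological depth, so all gates within one level are mutually independent and can be evaluated concurrently without locks. A minimal sketch, with NumPy vectorization standing in for SIMD lanes and an invented netlist encoding:

    ```python
    import numpy as np

    def simulate(levels, values):
        """Evaluate a levelized netlist: gates within a level never depend
        on each other, so each level can run in parallel without locking."""
        ops = {"AND": np.logical_and, "OR": np.logical_or, "XOR": np.logical_xor}
        for level in levels:                  # levels are in topological order
            outs = np.array([g[0] for g in level])
            a = values[[g[2] for g in level]] # gather all inputs before writing
            b = values[[g[3] for g in level]]
            for name, op in ops.items():      # one vector op per gate type
                sel = np.array([g[1] == name for g in level])
                if sel.any():
                    values[outs[sel]] = op(a[sel], b[sel])
        return values

    # Hypothetical two-level netlist on 6 nets: gate = (output, op, in_a, in_b).
    values = np.zeros(6, dtype=bool)
    values[[0, 1, 2]] = [True, True, False]   # primary inputs on nets 0-2
    levels = [[(3, "AND", 0, 1), (4, "OR", 1, 2)],
              [(5, "XOR", 3, 4)]]
    print(simulate(levels, values))           # nets 3, 4, 5 -> True, True, False
    ```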

    Cache Management and Load Balancing for 5G Cloud Radio Access Networks

    Cloud radio access network (CRAN) has been proposed for 5G mobile networks, offering better scalability, flexibility, and performance. This paper introduces a cache management algorithm for the baseband unit of a CRAN and load balancing algorithms for the virtual machine load within the CRAN. The proposed scheme, exponential decay (EXD) with the analytical hierarchy process (AHP), increases the hit rate and reduces network traffic, while also providing preferential service to users with a higher service level agreement (SLA). Finally, experiments show that the proposed load balancing algorithm can reduce the virtual machines' (VM) queue size and wait time.
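
    The cache side of this scheme follows the exponential-decay idea sketched under the first entry above; for the load balancing side, the abstract reports reduced VM queue sizes and wait times. One common policy with exactly that goal is join-the-shortest-queue, sketched here purely as an illustration (the paper's actual algorithm may differ):

    ```python
    import heapq

    def dispatch(requests, n_vms):
        """Assign each request, in arrival order, to the VM with the
        currently shortest queue (illustrative balancing policy)."""
        queues = [(0, vm) for vm in range(n_vms)]  # (queue length, VM id)
        heapq.heapify(queues)
        assignment = []
        for req in requests:
            qlen, vm = heapq.heappop(queues)       # shortest queue wins
            assignment.append((req, vm))
            heapq.heappush(queues, (qlen + 1, vm))
        return assignment

    print(dispatch(["r1", "r2", "r3", "r4", "r5"], 2))
    # [('r1', 0), ('r2', 1), ('r3', 0), ('r4', 1), ('r5', 0)]
    ```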

    Reproducibility, accuracy and performance of the Feltor code and library on parallel computer architectures

    Feltor is a modular and free scientific software package. It allows developing platform-independent code that runs on a variety of parallel computer architectures, ranging from laptop CPUs to multi-GPU distributed-memory systems. Feltor consists of both a numerical library and a collection of application codes built on top of the library. Its main targets are two- and three-dimensional drift- and gyro-fluid simulations, with discontinuous Galerkin methods as the main numerical discretization technique. We observe that numerical simulations of a recently developed gyro-fluid model produce non-deterministic results in parallel computations. First, we show how we restore accuracy and bitwise reproducibility algorithmically and programmatically. In particular, we adopt an implementation of the exactly rounded dot product based on long accumulators, which avoids accuracy losses, especially in parallel applications. However, reproducibility and accuracy alone fail to indicate correct simulation behaviour: in the physical model, slightly different initial conditions lead to vastly different end states, and this behaviour carries over to its numerical representation, so pointwise convergence becomes impossible, even in principle, for long simulation times. In the second part, we explore important performance-tuning considerations. We identify latency and memory bandwidth as the main performance indicators of our routines and, based on these, propose a parallel performance model that predicts the execution time of algorithms implemented in Feltor, testing it on a selection of parallel hardware architectures. We are able to predict the execution time with a relative error of less than 25% for problem sizes between 0.1 and 1000 MB. Finally, we find that the product of latency and bandwidth gives a minimum array size per compute node needed to achieve a scaling efficiency above 50% (both strong and weak).
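
    The scaling rule quoted at the end can be made concrete. If a routine that moves s bytes per node costs t(s) = t_lat + s/B, its efficiency relative to purely bandwidth-bound execution is (s/B) / (t_lat + s/B), and requiring at least 50% efficiency gives s >= t_lat * B, i.e. the latency-bandwidth product as the minimum array size per node. A sketch with illustrative hardware numbers:

    ```python
    def exec_time(size_bytes, t_lat, bandwidth):
        """Latency-bandwidth model: t(s) = t_lat + s / B."""
        return t_lat + size_bytes / bandwidth

    def efficiency(size_bytes, t_lat, bandwidth):
        """Fraction of the time spent on useful, bandwidth-bound work."""
        return (size_bytes / bandwidth) / exec_time(size_bytes, t_lat, bandwidth)

    t_lat, bw = 5e-6, 500e9  # assumed: 5 us kernel latency, 500 GB/s bandwidth
    s_min = t_lat * bw       # minimum bytes per node for >= 50% efficiency
    print(f"s_min = {s_min / 1e6:.1f} MB, "
          f"efficiency(s_min) = {efficiency(s_min, t_lat, bw):.2f}")
    # s_min = 2.5 MB, efficiency(s_min) = 0.50
    ```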

    Distributed Environment for Efficient Virtual Machine Image Management in Federated Cloud Architectures

    The use of Virtual Machines (VMs) in Cloud computing provides various benefits across the software engineering lifecycle, including efficient elasticity mechanisms that result in higher resource utilization and lower operational costs. VMs as software artifacts are created using provider-specific templates, called VM images (VMIs), and are stored in proprietary or public repositories for further use. However, technology-specific choices can limit interoperability among Cloud providers and bundle VMIs with nonessential or redundant software packages, leading to increased storage size, prolonged VMI delivery, sluggish VMI instantiation and, ultimately, vendor lock-in. To address these challenges, we present a set of novel functionalities and design approaches for the efficient operation of distributed VMI repositories, specifically tailored to enable: (i) simplified creation of lightweight, size-optimized VMIs tuned for specific application requirements; (ii) multi-objective VMI repository optimization; and (iii) an efficient reasoning mechanism to help optimize complex VMI operations. The evaluation results confirm that the presented approaches can reduce VMI size by up to 55% while trimming image creation time by 66%. Furthermore, the repository optimization algorithms can reduce VMI delivery time by up to 51% and cut storage expenses by 3%. Moreover, by implementing replication strategies, the optimization algorithms can increase system reliability by 74%.
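
    The abstract names multi-objective repository optimization without giving the objective function; a heavily simplified sketch of weighted multi-objective scoring for picking a VMI replication target might look as follows (all attributes, weights, and site names are invented for illustration and do not reflect the paper's algorithms):

    ```python
    # Hypothetical weighted scoring over three objectives: delivery time and
    # storage cost (lower is better), reliability (higher is better).
    def score(site, w_delivery=0.5, w_storage=0.2, w_reliability=0.3):
        return (w_delivery * (1.0 / site["delivery_s"])
                + w_storage * (1.0 / site["storage_cost"])
                + w_reliability * site["reliability"])

    sites = [
        {"name": "repo-a", "delivery_s": 40.0, "storage_cost": 1.0, "reliability": 0.99},
        {"name": "repo-b", "delivery_s": 90.0, "storage_cost": 0.6, "reliability": 0.95},
    ]
    best = max(sites, key=score)  # site with the best weighted trade-off
    print(best["name"])
    ```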

    A survey on pseudonym changing strategies for Vehicular Ad-Hoc Networks

    The initial phase of the deployment of Vehicular Ad-Hoc Networks (VANETs) has begun, and many research challenges still need to be addressed; location privacy remains among the foremost. Both academia and industry have agreed on the pseudonym changing approach as a solution to protect the location privacy of VANET users. However, due to pseudonym linking attacks, a simple change of pseudonym has been shown to be insufficient to provide the required protection. For this reason, many pseudonym changing strategies have been suggested to make pseudonym changes effective. Unfortunately, the development of an effective pseudonym changing strategy for VANETs is still an open issue. In this paper, we present a comprehensive survey and classification of pseudonym changing strategies. We then discuss and compare them with respect to relevant criteria. Finally, we highlight current research efforts and open issues, and give some future directions.
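
    Among the strategy families such surveys cover are cooperative schemes, in which a vehicle changes its pseudonym only together with enough neighbours, so an eavesdropper cannot trivially link the old and new identifiers. A toy sketch of that rule (the threshold and pseudonym format are illustrative, not taken from any specific scheme):

    ```python
    import secrets

    K_NEIGHBOURS = 3  # assumed cooperation threshold

    def maybe_change_pseudonym(ready_neighbours, my_ready):
        """Change pseudonym only if this vehicle and at least K_NEIGHBOURS
        nearby vehicles are ready to change simultaneously."""
        if my_ready and ready_neighbours >= K_NEIGHBOURS:
            return "PSNYM-" + secrets.token_hex(4)  # fresh random pseudonym
        return None                                 # keep the current one

    print(maybe_change_pseudonym(ready_neighbours=4, my_ready=True))
    ```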

    Rack Level Study of Hybrid Liquid/Air Cooled Servers: The Impact of Flow Distribution and Pumping Configuration on Central Processing Units Temperature

    The flow distribution and central processing unit (CPU) temperatures inside a rack of thirty 1U (single rack unit) Sun Fire V20z servers retrofitted with direct-to-chip liquid cooling are investigated for two coolant pumping configurations (central and distributed) using the EPANET open-source network flow software. The results reveal that, depending on the pumping scenario, servers at the top of the rack, close to the cooling distribution unit, can receive up to 30% more flow than servers at the bottom of the rack, producing a corresponding variation in CPU temperatures with position in the rack. An optimization analysis shows that enlarging the flow distribution manifold reduces the flow variation across the servers and increases the total coolant flow rate in the rack by roughly 10%. In addition, activating the small pumps in the direct-to-chip liquid cooling loops inside the servers (distributed pumping) raised CPU temperatures by 2 °C at high computational workload.
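
    The study itself used the EPANET solver; as a rough illustration of why tap position matters, the supply and return manifolds with their server branches can be modelled as a conductance ladder and solved by nodal analysis (all conductance values below are invented):

    ```python
    import numpy as np

    N = 30      # servers in the rack
    g_m = 50.0  # conductance of one manifold segment (arbitrary units)
    g_b = 1.0   # conductance of one server's cooling branch

    # Nodes 0..N-1: supply manifold taps; N..2N-1: return manifold taps.
    L = np.zeros((2 * N, 2 * N))

    def link(i, j, g):  # add a pipe of conductance g between nodes i and j
        L[i, i] += g; L[j, j] += g; L[i, j] -= g; L[j, i] -= g

    for i in range(N - 1):
        link(i, i + 1, g_m)          # supply manifold segments
        link(N + i, N + i + 1, g_m)  # return manifold segments
    for i in range(N):
        link(i, N + i, g_b)          # server branches

    # Fixed pressures at the inlet (supply tap 0) and outlet (return tap 0);
    # every other node satisfies flow conservation, L @ p = 0.
    rhs = np.zeros(2 * N)
    for node, p_fixed in [(0, 1.0), (N, 0.0)]:
        L[node, :] = 0.0; L[node, node] = 1.0; rhs[node] = p_fixed

    p = np.linalg.solve(L, rhs)
    q = g_b * (p[:N] - p[N:])        # per-server branch flow
    print(f"server 1 receives {q[0] / q[-1]:.2f}x the flow of server {N}")
    ```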