2,254 research outputs found

    Bio-inspired call-stack reconstruction for performance analysis

    Get PDF
    The correlation of performance bottlenecks and their associated source code has become a cornerstone of performance analysis. It allows understanding why the efficiency of an application falls behind the computer's peak performance and enabling optimizations on the code ultimately. To this end, performance analysis tools collect the processor call-stack and then combine this information with measurements to allow the analyst comprehend the application behavior. Some tools modify the call-stack during run-time to diminish the collection expense but at the cost of resulting in non-portable solutions. In this paper, we present a novel portable approach to associate performance issues with their source code counterpart. To address it, we capture a reduced segment of the call-stack (up to three levels) and then process the segments using an algorithm inspired by multi-sequence alignment techniques. The results of our approach are easily mapped to detailed performance views, enabling the analyst to unveil the application behavior and its corresponding region of code. To demonstrate the usefulness of our approach, we have applied the algorithm to several first-time seen in-production applications to describe them finely, and optimize them by using tiny modifications based on the analyses.We thankfully acknowledge Mathis Bode for giving us access to the Arts CF binaries, and Miguel Castrillo and Kim Serradell for their valuable insight regarding Nemo. We would like to thank Forschungszentrum JĂĽlich for the computation time on their Blue Gene/Q system. This research has been partially funded by the CICYT under contracts No. TIN2012-34557 and TIN2015-65316-P.Peer ReviewedPostprint (author's final draft

    Big Data Caching for Networking: Moving from Cloud to Edge

    Full text link
    In order to cope with the relentless data tsunami in 5G5G wireless networks, current approaches such as acquiring new spectrum, deploying more base stations (BSs) and increasing nodes in mobile packet core networks are becoming ineffective in terms of scalability, cost and flexibility. In this regard, context-aware 55G networks with edge/cloud computing and exploitation of \emph{big data} analytics can yield significant gains to mobile operators. In this article, proactive content caching in 55G wireless networks is investigated in which a big data-enabled architecture is proposed. In this practical architecture, vast amount of data is harnessed for content popularity estimation and strategic contents are cached at the BSs to achieve higher users' satisfaction and backhaul offloading. To validate the proposed solution, we consider a real-world case study where several hours of mobile data traffic is collected from a major telecom operator in Turkey and a big data-enabled analysis is carried out leveraging tools from machine learning. Based on the available information and storage capacity, numerical studies show that several gains are achieved both in terms of users' satisfaction and backhaul offloading. For example, in the case of 1616 BSs with 30%30\% of content ratings and 1313 Gbyte of storage size (78%78\% of total library size), proactive caching yields 100%100\% of users' satisfaction and offloads 98%98\% of the backhaul.Comment: accepted for publication in IEEE Communications Magazine, Special Issue on Communications, Caching, and Computing for Content-Centric Mobile Network

    PRL: standardizing performance monitoring library for high-integrity real-time systems

    Get PDF
    The use of complex processors is becoming ubiquitous in High-Integrity Systems (HIS). To deal with processor’s increased complexity, Performance Monitoring Counters (PMCs) are increasingly used to reason on software behavior and provide the necessary evidence to support software certification. However, the use of PMCs in HIS is relatively recent and hence far from being standardized. As a result, software engineers are forced to resort to highly-customized, low-level programming of platform-specific PMC control registers, which is both error prone and time consuming. To cover this gap, we propose building on the PAPI library, a standardized performance monitoring solution in the mainstream domain, and develop a PMC Reading Library (PRL) for configuring and collecting traceable events while capturing HIS specific requirements and peculiarities. We instantiate PRL in a reference automotive configuration to show that PRL meets key HIS requirements: negligible footprint, limited and predictable overhead, and accuracy collecting hardware events by filtering out the impact of interrupts and context switches.This work has been supported by the Spanish Ministry of Science and Innovation under grant PID2019-107255GBC21/AEI/10.13039/501100011033, the European Unions Horizon 2020 Framework Programme under grant agreement No. 878752 (MASTECS), and the European Research Council (ERC) grant agreement No. 772773 (SuPerCom).Peer ReviewedPostprint (author's final draft
    • …
    corecore