334 research outputs found

    Code Cache Management in Managed Language VMs to Reduce Memory Consumption for Embedded Systems

    The compiled native code generated by a just-in-time (JIT) compiler in managed language virtual machines (VMs) is placed in a region of memory called the code cache. Code cache management (CCM) in a VM is responsible for finding and evicting methods from the code cache to maintain execution correctness and manage program performance for a given code cache size or memory budget. Effective CCM can also boost program speed by enabling more aggressive JIT compilation, powerful optimizations, and improved hardware instruction cache and I-TLB performance. Though important, CCM is an overlooked component in VMs. We find that the default CCM policies in Oracle’s production-grade HotSpot VM perform poorly even at modest memory pressure. We develop a detailed simulation-based framework to model and evaluate the potential efficiency of many different CCM policies in a controlled and realistic, but VM-independent, environment. We make the encouraging discovery that effective CCM policies can sustain high program performance even for very small cache sizes. Our simulation study provides the rationale and motivation to improve CCM strategies in existing VMs. We implement and study the properties of several CCM policies in HotSpot. We find that, in spite of working within the bounds of the HotSpot VM’s current CCM sub-system, our best CCM policy implementation in HotSpot improves program performance over the default CCM algorithm by an average of 39%, 41%, 55%, and 50% at code cache sizes that are 90%, 75%, 50%, and 25% of the desired cache size, respectively.
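
    The eviction step described in this abstract can be illustrated with a minimal sketch. It assumes a least-recently-used (LRU) policy and a fixed byte budget; the abstract does not say which policies the authors evaluated, and this is not HotSpot's algorithm. The class name, method names, and sizes are hypothetical and chosen only for illustration.

```python
from collections import OrderedDict


class CodeCache:
    """Toy code cache with a byte budget and LRU eviction (an assumed policy)."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self.methods = OrderedDict()  # method name -> compiled size in bytes
        self.evictions = 0

    def touch(self, name):
        """Record an invocation; return True if compiled code is cached."""
        if name in self.methods:
            self.methods.move_to_end(name)  # mark as most recently used
            return True
        return False

    def install(self, name, size_bytes):
        """Insert compiled code, evicting LRU methods until it fits."""
        if size_bytes > self.capacity_bytes:
            return False  # can never fit within the budget
        if name in self.methods:
            self.used_bytes -= self.methods.pop(name)  # replace an old version
        while self.used_bytes + size_bytes > self.capacity_bytes:
            _, freed = self.methods.popitem(last=False)  # oldest entry first
            self.used_bytes -= freed
            self.evictions += 1
        self.methods[name] = size_bytes
        self.used_bytes += size_bytes
        return True


cache = CodeCache(capacity_bytes=100)
cache.install("parse", 40)
cache.install("render", 40)
cache.touch("parse")         # "parse" is now more recent than "render"
cache.install("encode", 40)  # does not fit, so "render" is evicted
print(sorted(cache.methods), cache.evictions)
```

    In a real VM an evicted method would fall back to the interpreter or be recompiled later, which is why the choice of eviction policy affects program speed at small cache sizes.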

    Analyse des performances de stockage, en mémoire et sur les périphériques d'entrée/sortie, à partir d'une trace d'exécution (Analysis of storage performance, in memory and on input/output devices, from an execution trace)

    Data storage is an essential resource for the computer industry. Storage devices must be fast and reliable to meet the growing demands of the data-driven economy. Storage technologies can be classified into two main categories: mass storage and main memory storage. Mass storage can store large amounts of data persistently. Data is saved locally on input/output devices, such as Hard Disk Drives (HDD) and Solid-State Drives (SSD), or remotely on distributed storage systems. Main memory storage temporarily holds the data needed by running programs. Main memory is characterized by its high access speed, which is essential for supplying data quickly to the Central Processing Unit (CPU). Operating systems use several mechanisms to manage storage devices, such as disk schedulers and memory allocators. The processing time of a storage request is affected by the interaction between several subsystems, which complicates debugging. Existing tools, such as benchmarking tools, give a general idea of overall system performance but do not accurately identify the causes of poor performance. Dynamic analysis through execution tracing is a solution for the detailed runtime analysis of storage systems. Tracing collects precise data about the internal behavior of the system, which helps in detecting performance problems that are difficult to identify. The goal of this thesis is to provide a tool to analyze storage performance, in memory and on input/output devices, based on low-level trace events.
The main challenges addressed by this tool are: collecting the required data using kernel and userspace tracing, limiting the overhead of tracing and the size of the generated traces, synchronizing the traces collected from different sources, providing multi-level analyses covering several aspects of storage performance, and lastly proposing abstractions that allow users to easily understand the traces. We carefully designed and inserted the instrumentation needed for the analyses. The tracepoints provide full visibility into the system and track the lifecycle of storage requests, from creation to processing. The Linux Trace Toolkit Next Generation (LTTng), a free and low-overhead tracer, is used for data collection. This tracer is characterized by its stability and by its efficiency with highly parallel applications, thanks to the lock-free synchronization mechanisms used to update the content of the trace buffers. We also contributed to the creation of a patch that allows LTTng to capture the call stacks of userspace events.

    Lean manufacturing in small and medium-sized food processing enterprises : practice, performance and its determining factors

    Why do only a few food processing SMEs take advantage of lean manufacturing? Is there anything inherent to food processing SMEs with respect to plant, product, process and organizational behavior that influences the applicability and effectiveness of lean manufacturing? In other words: what are the determining factors that contribute to the variations in operational performance in food processing SMEs and, most importantly, how do they contribute? This doctoral research provides some insights into this topic. Firstly, food processing SMEs focus mainly on quality assurance (food safety) and less on quality improvement. Secondly, lean manufacturing implementation improves operational performance, especially in relation to productivity and quality. Thirdly, variations in the use of lean manufacturing practices are substantial, and some practices are yet to be fully used in the food sector. Fourthly, the size of the company is positively correlated with the degree of use of lean practices. Fifthly, top management commitment, training, a change agent, and the product and process characteristics of the food sector are critical for the success of lean manufacturing implementation. Food processing SMEs that manage these determining factors effectively have a higher probability of implementation success. Finally, a framework, the house of lean for food SMEs, that takes into consideration the needs and characteristics of food processing SMEs has been proposed to assist managers in implementing lean practices.

    Autotuning for Automatic Parallelization on Heterogeneous Systems


    Exploring Processor and Memory Architectures for Multimedia

    Multimedia has become one of the cornerstones of our 21st century society and, when combined with mobility, has enabled a tremendous evolution of that society. However, joining these two concepts introduces many technical challenges. These range from having sufficient performance for handling multimedia content to having the battery stamina for acceptable mobile usage. Projecting where we are heading, we see these issues becoming ever more challenging with increased mobility and with advances in multimedia content, such as the introduction of stereoscopic 3D and augmented reality. The increased performance needs for handling multimedia come not only from an ongoing step-up in resolution, going from QVGA (320x240) to Full HD (1920x1080), a 27x increase in pixel count in less than half a decade, but also from codec evolution (MPEG-2 to H.264 AVC), which adds to the computational load. To meet these performance challenges there have been advances in processing and memory architecture (SIMD, out-of-order superscalarity, multicore processing and heterogeneous multilevel memories) in the mobile domain, in conjunction with ever-increasing operating frequencies (200MHz to 2GHz) and on-chip memory sizes (128KB to 2-3MB). At the same time there is an increase in requirements for mobility, placing higher demands on battery-powered systems despite the steady increase in battery capacity (500 to 2000mAh). This leaves a negative net result in terms of battery capacity versus performance advances. To make optimal use of these architectural advances and to meet the power limitations of mobile systems, an overall approach to how best to utilize these systems is needed. The right trade-off between performance and power is crucial. On top of these constraints, the flexibility aspects of the system need to be addressed. All this makes it very important to reach the right architectural balance in the system.
The first goal of this thesis is to examine multimedia applications and propose a flexible solution that can meet the architectural requirements in a mobile system. The second is to propose an automated methodology for optimally mapping multimedia data and instructions to a heterogeneous multilevel memory subsystem. The proposed methodology uses constraint programming to solve a multidimensional optimization problem. Results from this work indicate that today’s most advanced mobile processor technology, together with a multi-level heterogeneous on-chip memory subsystem, can meet the performance requirements for handling multimedia. By utilizing the automated optimal memory mapping method presented in this thesis, lower total power consumption can be achieved while performance for multimedia applications is improved through enhanced memory management. This is achieved through reduced external accesses and better reuse of memory objects. This automatic method shows high accuracy, up to 90%, in predicting multimedia memory accesses for a given architecture.

    Optimal program variant generation for hybrid manycore systems

    Field Programmable Gate Arrays (FPGAs) promise to deliver superior energy efficiency in heterogeneous high performance computing, as compared to multicore CPUs and GPUs. The rate of adoption is, however, hampered by the relative difficulty of programming FPGAs. High-level synthesis (HLS) tools such as Xilinx Vivado, Altera OpenCL or Intel's HLS address a large part of the programmability issue by synthesizing a Hardware Description Language representation from a high-level specification of the application, given in programming languages such as OpenCL C that are typically used to program CPUs and GPUs. Although HLS solutions make programming easier, they fail to also lighten the burden of optimization. Application developers must rely on expert knowledge to manually optimize their applications for each target device, meaning that traditional HLS solutions do not solve the issue of performance portability. This state of affairs prompted the development of compiler frameworks such as TyTra that operate at an even higher level of abstraction, one that is amenable to the use of Design Space Exploration (DSE). With DSE the initial program specification can be seen as the starting location in a search space of correct-by-construction program transformations. In TyTra the search space is generated from the transitive closure of term-level transformations derived from type-level transformations. Compiler frameworks such as TyTra theoretically solve the issue of performance portability by providing a way to automatically generate alternative correct program variants. They suffer, however, from the very practical issue that the generated space is often too large to explore fully. As a consequence, the globally optimal solution may be overlooked. In this work we provide a novel solution to the issue of performance portability by deriving an efficient yet effective DSE strategy for the TyTra compiler framework.
We make use of categorical data types to derive categorical semantics for the formal languages that describe the terms, types, cost-performance estimates and their transformations. From these we define a category of interpretations for TyTra applications, from which we derive a DSE strategy that finds the globally optimal transformation sequence in polynomial time. This is achieved by reducing the size of the generated search space. We formally state and prove a theorem for this claim and then show that the polynomial run-time of our DSE strategy has practically negligible coefficients, leading to sub-second exploration times for realistic applications.

    A multimodal framework for interactive sonification and sound-based communication


    Sustainable Industrial Engineering along Product-Service Life Cycle/Supply Chain

    Sustainable industrial engineering addresses the sustainability issue from economic, environmental, and social points of view. Its application fields are the whole value chain and lifecycle of products and services, from the development to the end-of-life stages. This book aims to address many of the challenges faced by industrial organizations and supply chains in becoming more sustainable. They can do so by reinventing their processes and practices and by continuously incorporating sustainability guidelines and practices in their decisions, such as adopting the circular economy, collaborating with suppliers and customers, using information technologies and systems, tracking their products’ life-cycle, using optimization methods to reduce resource use, and applying new management paradigms to help mitigate many of the wastes that exist across organizations and supply chains. This book will be of interest to the fast-growing body of academics studying and researching sustainability, as well as to industry managers involved in sustainability management.