199 research outputs found

    Automatic SIMD vectorization of chains of recurrences

    An efficient algorithm for pointer-to-array access conversion for compiling and optimizing DSP applications

    Automated and accurate cache behavior analysis for codes with irregular access patterns

    This is the peer reviewed version of the following article: Andrade, D., Arenaz, M., Fraguela, B. B., Touriño, J. and Doallo, R. (2007), Automated and accurate cache behavior analysis for codes with irregular access patterns. Concurrency Computat.: Pract. Exper., 19: 2407-2423. doi:10.1002/cpe.1173, which has been published in final form at https://doi.org/10.1002/cpe.1173. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Use of Self-Archived Versions.

    [Abstract] The memory hierarchy plays an essential role in the performance of current computers, so good analysis tools that help in predicting and understanding its behavior are required. Analytical modeling is the ideal basis for such tools if its traditional limitations in accuracy and scope of application can be overcome. While there has been extensive research on the modeling of codes with regular access patterns, less attention has been paid to codes with irregular patterns due to the increased difficulty in analyzing them. Nevertheless, many important applications exhibit this kind of pattern, and their lack of locality makes them more cache-demanding, which makes their study more relevant. The focus of this paper is the automation of the Probabilistic Miss Equations (PME) model, an analytical model of cache behavior that provides fast and accurate predictions for codes with irregular access patterns. The information requirements of the PME model are defined and its integration in the XARK compiler, a research compiler oriented to automatic kernel recognition in scientific codes, is described. We show how to exploit the powerful information-gathering capabilities provided by this compiler to allow the automated modeling of loop-oriented scientific codes. Experimental results that validate the correctness of the automated PME model are also presented.

    Ministerio de Educación y Ciencia; TIN2004-07797-C02
    Xunta de Galicia; PGIDIT03TIC10502PR
    Xunta de Galicia; PGIDT05PXIC10504P
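    As a hypothetical illustration (not code from the paper) of the kind of irregular, indirection-driven access pattern the PME model targets, consider a sparse matrix-vector product in compressed row storage: the reference x[col[j]] is governed by the index array col, so its locality cannot be deduced from the loop structure alone.

        /* Hypothetical example of an irregular access pattern caused by an
         * indirection: sparse matrix-vector product in compressed row storage.
         * The reference x[col[j]] depends on runtime data in col[], so its
         * cache behavior can only be modeled probabilistically. */
        void spmv_crs(int n, const int *row_ptr, const int *col,
                      const double *val, const double *x, double *y)
        {
            for (int i = 0; i < n; i++) {
                double sum = 0.0;
                for (int j = row_ptr[i]; j < row_ptr[i + 1]; j++)
                    sum += val[j] * x[col[j]];   /* irregular access through col[] */
                y[i] = sum;
            }
        }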

    PICO: A Presburger In-bounds Check Optimization for Compiler-based Memory Safety Instrumentations

    Memory safety violations such as buffer overflows are a threat to security to this day. A common solution to ensure memory safety for C is code instrumentation. However, this often causes high execution-time overhead and is therefore rarely used in production. Static analyses can reduce this overhead by proving some memory accesses in bounds at compile time. In practice, however, static analyses may fail to verify in-bounds accesses due to over-approximation. Therefore, it is important to additionally optimize the checks that reside in the program. In this article, we present PICO, an approach to eliminate and replace in-bounds checks. PICO exactly captures the spatial memory safety of accesses using Presburger formulas to either verify them statically or substitute existing checks with more efficient ones. Thereby, PICO can generate checks each of which covers multiple accesses, and place them at infrequently executed locations. We evaluate our LLVM-based PICO prototype with the well-known SoftBound instrumentation on SPEC benchmarks commonly used in related work. PICO reduces the execution-time overhead introduced by SoftBound by 36% on average (and the code-size overhead by 24%). Our evaluation shows that the impact of substituting checks dominates that of removing provably redundant checks.
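    As a rough sketch of the kind of transformation described above (hand-written for illustration; it is neither PICO's actual algorithm nor SoftBound's instrumentation API), a per-iteration in-bounds check inside a loop can be replaced by a single check, placed before the loop, that covers all of the loop's accesses:

        #include <stdlib.h>

        /* Naively instrumented version: one in-bounds check per access. */
        void scale_checked(double *a, size_t a_len, size_t n, double k)
        {
            for (size_t i = 0; i < n; i++) {
                if (i >= a_len)          /* per-iteration check */
                    abort();
                a[i] *= k;
            }
        }

        /* After a PICO-like optimization: one check before the loop covers
         * every access a[0..n-1], so the loop body itself is check-free. */
        void scale_hoisted(double *a, size_t a_len, size_t n, double k)
        {
            if (n > a_len)               /* single check for all n accesses */
                abort();
            for (size_t i = 0; i < n; i++)
                a[i] *= k;
        }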

    Worst-Case Energy Consumption Analysis for Energy-Constrained Embedded Systems

    [Abstract] The fact that energy is a scarce resource in many embedded real-time systems creates the need for energy-aware task schedulers, which not only guarantee timing constraints but also consider energy consumption. Unfortunately, existing approaches to analyze the worst-case execution time (WCET) of a task usually cannot be directly applied to determine its worst-case energy consumption (WCEC), because execution time and energy consumption are not closely correlated on many state-of-the-art processors. Instead, a WCEC analyzer must take into account the particular energy characteristics of a target platform. In this paper, we present 0g, a comprehensive approach to WCEC analysis that combines different techniques to speed up the analysis and to improve results. If detailed knowledge about the energy costs of instructions on the target platform is available, our tool is able to compute upper bounds for the WCEC by statically analyzing the program code. Otherwise, a novel approach allows 0g to determine the WCEC by measurement after having identified a set of suitable program inputs based on an auxiliary energy model, which specifies the energy consumption of instructions in relation to each other. Our experiments for three target platforms show that 0g provides precise WCEC estimates.
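    The abstract does not spell out how the static bound is computed; a common way to phrase such an upper bound in the WCET/WCEC literature is an IPET-style integer linear program, sketched below under the assumption that a per-basic-block energy cost e_b can be derived from the instruction energy model (the symbols B, E, e_b, x_b, f and L_b are ours, not the paper's):

        % Hypothetical IPET-style formulation of a WCEC upper bound.
        % B: basic blocks, E: CFG edges, e_b: energy cost of block b,
        % x_b: execution count of b, f_{u,v}: frequency of CFG edge (u,v),
        % L_b: a known bound on the executions of b (e.g. from loop bounds).
        \[
          \mathrm{WCEC} \;\le\; \max \sum_{b \in B} e_b \, x_b
          \quad\text{subject to}\quad
          x_b = \sum_{(p,b) \in E} f_{p,b} = \sum_{(b,s) \in E} f_{b,s},
          \qquad x_b \le L_b .
        \]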

    Compilation techniques for automatic extraction of parallelism and locality in heterogeneous architectures

    [Abstract] High performance computing has become a key enabler for innovation in science and industry. This fact has unleashed a continuous demand for more computing power that the silicon industry has satisfied with parallel and heterogeneous architectures and complex memory hierarchies. As a consequence, software developers have been challenged to write new codes and rewrite the old ones to be efficient in these new systems. Unfortunately, success cases are scarce and require huge investments in human workforce. Current compilers generate peak-performance binary code for monocore architectures. Building on this success, this thesis explores new ideas in compiler design to overcome this challenge through the automatic extraction of parallelism and locality. First, we present a new compiler intermediate representation based on diKernels, named KIR, which is insensitive to syntactic variations in the source code and exposes multiple levels of parallelism. On top of the KIR, we build a source-to-source approach that generates parallel code annotated with compiler directives: OpenMP for multicores and OpenHMPP for GPUs. Finally, we model program behavior from the point of view of the memory accesses through the reconstruction of affine loops for sequential and parallel codes. The experimental evaluations throughout the thesis corroborate the effectiveness and efficiency of the proposed solutions.
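    As an illustration of the kind of output such a source-to-source approach produces (a made-up example, not taken from the thesis), a reduction kernel recognized in the intermediate representation could be emitted as an OpenMP-annotated loop for multicores:

        /* Hypothetical directive-annotated output of a source-to-source
         * parallelizing compiler: the reduction is detected in the IR and
         * rendered with an OpenMP pragma. */
        double dot(int n, const double *x, const double *y)
        {
            double sum = 0.0;
            #pragma omp parallel for reduction(+:sum)
            for (int i = 0; i < n; i++)
                sum += x[i] * y[i];
            return sum;
        }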

    Fabrication and characterization of a magnetic bacterial nanocellulose for neurovascular reconstruction of cerebral aneurysms

    A cerebral aneurysm is a condition in which a defect protrudes from the arterial wall and which is formed, among other reasons, due to abnormally high hemodynamic stresses that contribute to the deterioration and dilation of the blood vessel. A desirable treatment of cerebral aneurysms is the complete cut-off of the defect from the parent artery with minimum luminal obstruction. Even though different approaches have been shown to exclude the defect from the parent artery, the main shortcoming remains the delayed reconstructive occlusion of the aneurysm, which occurs over a period of weeks to months. We hypothesized that a material with magnetic properties can provide the force required to speed up re-endothelization across the aneurysm defect, since it can facilitate high cell density coverage at the damaged site by virtue of its ability to capture and retain magnetically functionalized endothelial cells. The aneurysmal neck is a hostile environment for tissue growth because the blood's shear stress precludes cell adhesion and proliferation. Therefore, this strategy aims to design a magnetic material for rapid endothelial cell uptake and retention against hemodynamic forces. This magnetic material is also required to satisfy other important features such as biocompatibility, appropriate mechanical properties (e.g. tensile strength and compliance), and blood compatibility (non-thrombogenicity). In the present work, we have used bacterial nanocellulose (BNC) as the starting material for the production of a magnetic hydrogel, which we named magnetic bacterial nanocellulose (MBNC). BNC is a natural polymer produced by the bacterial strain Acetobacter xylinum, which is extruded as a pellicle at the liquid/air interface to protect the bacteria from dehydration and UV radiation. BNC possesses a multitude of desirable physical and chemical properties for tissue engineering applications, such as biocompatibility, a high swell ratio, and high tensile strength. The chemical structure of BNC consists of D-glucose chains abundant in hydroxyl groups, which are able to adsorb metallic ions and compounds with functional groups active in hydrogen-bond formation. A brief review of BNC and magnetic hydrogels is presented in chapter I. In chapter II, we describe the production of BNC and its purification to subsequently synthesize the MBNC through an in-situ precipitation method, in which superparamagnetic iron oxide nanoparticles (SPION) are formed inside the BNC using ammonium hydroxide as the precipitating agent. Different concentrations of Fe3+ and Fe2+ iron salts were used for the synthesis of MBNC, and their effect on BNC permeability, porosity and magnetic saturation was analyzed. The permeability testing was performed using a side-by-side diffusion cell. MBNC porosity was estimated using a mass balance equation. Magnetization testing was performed using a vibrating sample magnetometer. Scanning electron microscopy (SEM) and magnetic force microscopy (MFM) were used to reveal the morphology and the magnetic domains of the MBNC, respectively. Chemical characterization of the MBNC was performed via X-ray photoelectron spectroscopy (XPS). Because naked SPION are easily oxidized to Fe2O3 under environmental conditions, dextran was used to coat the SPION embedded in the MBNC. In chapter III, once the optimal reaction conditions for the MBNC synthesis were established, MBNC pellicles were tested for biocompatibility and cell capture under dynamic fluid flow conditions. Cell adhesion sites were introduced on the surface of the MBNC via collagen conjugation using CDAP as the activating agent. Our results showed a satisfactory MBNC magnetization, which was able to capture magnetically functionalized cells under dynamic flow conditions, in contrast to non-magnetized MBNC.

    Systematic analysis of the cache behavior of irregular codes

    [Abstract] The performance of memory hierarchies, in which the cache plays a fundamental role, is critical in current general-purpose computers and embedded systems, due to the growing problem of the memory-system bottleneck. Unfortunately, cache behavior is very unstable and hard to predict. This is especially true in the presence of irregular access patterns, which exhibit little locality. Such patterns are very common, for example, in applications in which some references are affected by conditional statements, or in which the compressed storage of sparse matrices gives rise to indirections. Nevertheless, cache behavior in the presence of irregular access patterns has not been widely studied. In this thesis we present extensions of a systematic analytical modeling technique based on PMEs (Probabilistic Miss Equations) that enable the automated analysis of the cache behavior of codes that include conditional statements whose truth value may not be determinable at compile time and codes with irregular references due to indirections, respectively. The model generates very accurate predictions despite the irregularity and has a low computational cost, being the first model that combines these two characteristics and is able to analyze this class of codes automatically. These properties make the model suitable for guiding compiler optimizations. The extension of the model for irregular codes with indirections has been integrated in the XARK compiler, a compiler oriented to the automatic recognition of kernels in scientific applications. We show how to exploit the powerful information-gathering capabilities of this compiler to enable the automated modeling of loop-oriented scientific codes.
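    To complement the indirection example given earlier, the other source of irregularity handled by the extended model is a reference guarded by a data-dependent condition; the fragment below is a hypothetical illustration, not code from the thesis:

        /* Hypothetical example of an access guarded by a condition whose truth
         * value cannot be determined at compile time: whether b[i] is accessed
         * in a given iteration depends on runtime data, so its cache behavior
         * must be modeled probabilistically. */
        void threshold_update(int n, const double *a, double *b, double t)
        {
            for (int i = 0; i < n; i++) {
                if (a[i] > t)        /* data-dependent condition */
                    b[i] += a[i];    /* irregular reference */
            }
        }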

    Impact and Implications of the WTO Trade Facilitation Agreement in East and Southern Africa: 2nd WCO ESA Regional Research Conference

    This book presents the papers, report and outcomes of the 2nd WCO ESA Regional Research Conference, which was hosted by the Regional Training Centre (RTC) Kenya on 23 and 24 November 2017 at the Kenya School of Monetary Studies (KSMS) in Nairobi, Kenya. It was co-organized by the ROCB and the RTC Kenya and attended by more than 200 participants from 20 nations. Participants included researchers and officials from various Customs administrations in the East and Southern Africa region, WCO ESA Regional Training Centres (RTCs), the WCO, the African Union, the World Bank, the African Development Bank, Regional Economic Committees (RECs) (the East African Community), the Government of Australia, Kenyan ministries, the private sector, academia, and other cooperating partners. The theme of the conference was “Impacts and Implication of the Trade Facilitation Agreement and the WCO Mercator Programme to the ESA region” and covered the following topics: Impacts of the WTO Trade Facilitation Agreement in East and Southern Africa; Data Analysis for Effective Border Management in East and Southern Africa; Best Practices in Digital Customs in East and Southern Africa; E-commerce as a Driver for Economic Growth in East and Southern Africa; Securing and Facilitating Trade in East and Southern Africa; and Regional Integration: Addressing Levels of Intraregional Trade in East and Southern Africa. The Governing Council of the World Customs Organization, East and Southern Africa region, established the regional research programme with the aim of building institutional capacity and the body of knowledge in customs through research. The objective of the programme is to encourage research on topical themes for customs in East and Southern Africa. The programme also aims to develop a body of knowledge to guide the decision-making process concerning trade facilitation and regional economic integration in the region. It is also hoped that the research programme and the findings from its research initiatives will assist countries in sharing experiences, ideas, knowledge, and information on new innovations to improve Customs operations, while creating new inventions to continue modernizing customs and to ease the facilitation of trade in East and Southern Africa. The envisaged output from this process will always be the publication of an e-book (and book) consisting of a consolidation of the papers presented during the conference.

    Modulated Properties of Fully Absorbable Bicomponent Meshes

    Current meshes used for soft-tissue repair are mostly composed of single-component, nonabsorbable yarn constructions, limiting the ability to modulate their properties. This situation has left the majority of load-bearing soft-tissue repair applications to suffer distinctly from undesirable features associated, in part, with the mesh's inability to (1) possess short-term stiffness to facilitate tissue stability during the development of wound strength; (2) gradually transfer the perceived mechanical load as the wound builds mechanical integrity; and (3) provide compliance with load transfer to the remodeling and maturing mesh/tissue complex. The likelihood of long-term complications is reduced for fully absorbable systems with degradation and absorption at the conclusion of their intended functional performance. The primary goal of this dissertation was to develop and characterize a fully absorbable bicomponent mesh (ABM) for hernia repair which can modulate biomechanical and physical properties to match the expected needs of the wound healing process. The first study reviewed the current state of hernioplasty and proposed the subject device. The second study investigated different knitting technologies to establish a mesh construction that temporally modulates its properties. To this end, a novel construction using warp knitting was developed in which two degradable copolyester yarns with different degradation profiles were coknit into an initially interdependent knit construction. The developed knit construction provided an initial high level of structural stiffness; however, upon degradation of the fast-degrading yarn, the mesh comprised of the slow-degrading yarn was liberated and afforded high compliance. In the third study, the segmented, triaxial, high-glycolide copolyester used as the fast-degrading yarn was optimized to retain strength for more than 18 days. As such, the physical and biomechanical transition of the ABM was designed to temporally coincide with the expected commencement of wound strength. The fourth study investigated the in vivo tissue response to and integration of the developed degradable copolyester yarns in a novel construct simulating the ABM. Results indicated a strong initial inflammatory response which resolved quickly, and an integration process that produced a dense, compacted, and oriented collagen capsule around the implant during the transition phase. For the final study, the clinically relevant biomechanical properties of two different ABM constructions were compared against traditional hernia meshes. Using a novel synthetic in vitro simulated mesh/tissue complex, the ABMs were found to provide significantly greater early stability, subsequent biomechanics that approximated those of the abdominal wall, and evidence of restoring endogenous tension to the surrounding tissue. These results were in marked contrast to traditional hernia meshes, which showed stress shielding and significantly greater stiffness than the abdominal wall.