779 research outputs found
High-performance generalized tensor operations: A compiler-oriented approach
The efficiency of tensor contraction is of great importance. Compilers cannot optimize it well enough to come close to the performance of expert-tuned implementations. All existing approaches that provide competitive performance require optimized external code. We introduce a compiler optimization that reaches the performance of optimized BLAS libraries without the need for an external implementation or automatic tuning. Our approach provides competitive performance across hardware architectures and can be generalized to deliver the same benefits for algebraic path problems. By making fast linear algebra kernels available to everyone, we expect productivity increases when optimized libraries are not available. © 2018 Association for Computing Machinery
Kerncraft: A Tool for Analytic Performance Modeling of Loop Kernels
Achieving optimal program performance requires deep insight into the
interaction between hardware and software. For software developers without an
in-depth background in computer architecture, understanding and fully utilizing
modern architectures is close to impossible. Analytic loop performance modeling
is a useful way to understand the relevant bottlenecks of code execution based
on simple machine models. The Roofline Model and the Execution-Cache-Memory
(ECM) model are proven approaches to performance modeling of loop nests. In
comparison to the Roofline model, the ECM model can also describe the
single-core performance and saturation behavior on a multicore chip. We give an
introduction to the Roofline and ECM models, and to stencil performance
modeling using layer conditions (LC). We then present Kerncraft, a tool that
can automatically construct Roofline and ECM models for loop nests by
performing the required code, data transfer, and LC analysis. The layer
condition analysis makes it possible to predict optimal spatial blocking factors for loop
nests. Together with the models it enables an ab-initio estimate of the
potential benefits of loop blocking optimizations and of useful block sizes. In
cases where LC analysis is not easily possible, Kerncraft supports a cache
simulator as a fallback option. Using a 25-point long-range stencil we
demonstrate the usefulness and predictive power of the Kerncraft tool. Comment: 22 pages, 5 figures
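The Roofline bound mentioned in this abstract can be illustrated with a minimal sketch. This is not Kerncraft's API. The function name, the machine numbers, and the kernel traffic counts below are hypothetical values chosen only for illustration, and the byte count assumes no write-allocate traffic.

```python
def roofline_gflops(peak_gflops, mem_bandwidth_gbs, flops_per_iter, bytes_per_iter):
    """Roofline bound: the lower of peak compute and intensity * bandwidth."""
    intensity = flops_per_iter / bytes_per_iter  # arithmetic intensity in FLOP/byte
    return min(peak_gflops, intensity * mem_bandwidth_gbs)


# Hypothetical machine: 40 GFLOP/s peak, 20 GB/s memory bandwidth.
PEAK_GFLOPS = 40.0
BANDWIDTH_GBS = 20.0

# Hypothetical kernel a[i] = b[i] + s * c[i] in double precision:
# 2 FLOPs per iteration; 2 loads + 1 store = 24 bytes (write-allocate ignored).
triad_bound = roofline_gflops(PEAK_GFLOPS, BANDWIDTH_GBS, 2, 24)

# Hypothetical compute-heavy kernel: 100 FLOPs per 8 bytes moved.
dense_bound = roofline_gflops(PEAK_GFLOPS, BANDWIDTH_GBS, 100, 8)

print(f"triad bound: {triad_bound:.3f} GFLOP/s (memory-bound)")
print(f"dense bound: {dense_bound:.3f} GFLOP/s (compute-bound)")
```

With these assumed numbers the first kernel is limited by memory bandwidth and the second by peak compute, which is the distinction the Roofline model is meant to expose.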
Mixed-data-model heterogeneous compilation and OpenMP offloading
Heterogeneous computers combine a general-purpose host processor with domain-specific programmable many-core accelerators, uniting high versatility with high performance and energy efficiency. While the host manages ever-more application memory, accelerators are designed to work mainly on their local memory. This difference in addressed memory leads to a discrepancy between the optimal address width of the host and the accelerator. Today 64-bit host processors are commonplace, but few accelerators exceed 32-bit addressable local memory, a difference expected to increase with 128-bit hosts in the exascale era. Managing this discrepancy requires support for multiple data models in heterogeneous compilers. So far, compiler support for multiple data models has not been explored, which hampers the programmability of such systems and inhibits their adoption. In this work, we perform the first exploration of the feasibility and performance of implementing a mixed-data-model heterogeneous system. To support this, we present and evaluate the first mixed-data-model compiler, supporting arbitrary address widths on host and accelerator. To hide the inherent complexity and to enable high programmer productivity, we implement transparent offloading on top of OpenMP. The proposed compiler techniques are implemented in LLVM and evaluated on a 64+32-bit heterogeneous SoC. Results on benchmarks from the PolyBench-ACC suite show that memory can be transparently shared between host and accelerator at overheads below 0.7 % compared to 32-bit-only execution, enabling mixed-data-model computers to execute at near-native performance.
Towards an Achievable Performance for the Loop Nests
Numerous code optimization techniques, including loop nest optimizations,
have been developed over the last four decades. Loop optimization techniques
transform loop nests to improve the performance of the code on a target
architecture, including exposing parallelism. Finding and evaluating an
optimal, semantic-preserving sequence of transformations is a complex problem.
The sequence is guided using heuristics and/or analytical models and there is
no way of knowing how close it gets to optimal performance or if there is any
headroom for improvement. This paper makes two contributions. First, it uses a
comparative analysis of loop optimizations/transformations across multiple
compilers to determine how much headroom may exist for each compiler. And
second, it presents an approach to characterize the loop nests based on their
hardware performance counter values and a Machine Learning approach that
predicts which compiler will generate the fastest code for a loop nest. The
prediction is made for both auto-vectorized, serial compilation and for
auto-parallelization. The results show that the headroom for state-of-the-art
compilers ranges from 1.10x to 1.42x for the serial code and from 1.30x to
1.71x for the auto-parallelized code. These results are based on the Machine
Learning predictions. Comment: Accepted at the 31st International Workshop on Languages and
Compilers for Parallel Computing (LCPC 2018)
Simultaneous magma and gas eruptions at three volcanoes in southern Italy: an earthquake trigger?
In September 2002, a series of tectonic earthquakes occurred north of Sicily, Italy, followed by three events of volcanic unrest within 150 km. On October 28, 2002, Mt. Etna erupted; on November 3, 2002, submarine degassing occurred near Panarea Island; and on December 28, 2002, Stromboli Island erupted. All of these events were considered unusual: the Mt. Etna NE-rift eruption was the largest in 55 yr, the Panarea degassing was one of the strongest ever detected there, and the Stromboli eruption, which produced a landslide and tsunami, was the largest effusive eruption in 17 yr. Here, we investigate the synchronous occurrence of these clustered unrest events, and develop a possible explanatory model. We compute short-term earthquake-induced dynamic strain changes and compare them to long-term tectonic effects. Results suggest that the earthquake-induced strain changes exceeded annual tectonic strains by at least an order of magnitude. This agitation occurred in seconds, and may have induced fluid and gas pressure migration within the already active hydrothermal and magmatic systems
Removal of temporary pacemaker after cardiac surgery in infants: A harmless procedure?
External pacemakers (PM) connected via temporary epicardial leads are routinely applied to infants and children during heart surgery and, after an uneventful postsurgical course, can usually be removed without complications. We report on two infants with complex congenital heart defects after cardiac surgery (arterial switch and Mustard operation for transposition of the great arteries). Intraoperatively, these patients received temporary epicardial PM wires. Thirteen and 18 days after surgery, respectively, the PM wires were removed under electrocardiogram (ECG) monitoring. The patients showed acute ECG changes, namely significant ST elevation, during and after removal of their pacing wires. Clinically, the patients were stable, and subsequent echocardiographic examination showed no evidence of myocardial dysfunction or pericardial effusion. Over time, the patients showed no signs of arrhythmia or abnormal ECG changes. The decision to place temporary pacing wires during cardiac surgery in patients with congenital heart defects should be considered carefully, and the wires should be removed under ECG monitoring as soon as the patient's condition allows. It should be taken into consideration that a complication such as the one reported here may be related to delayed removal of temporary PM leads. © 2012 - IOS Press and the authors
Design of a butyl acetate synthesis unit
Object of development: the production of butyl acetate by esterification with sulfuric acid as the catalyst. Aim of the work: to study the physicochemical properties of the process and their influence on the course of the reaction, and to design the main apparatus for butyl acetate synthesis. As a result of the study, the material and heat balances and the structural and mechanical calculations were carried out, and a drawing of the main apparatus was produced on their basis. Keywords include esterification and feasibility study, among others. The final qualifying work was carried out at the Department of TOVPM by Marina Filippova, a student of group 2D2A, under the supervision of Candidate of Chemical Sciences Ann Manankova.
An axiomatic approach to the non-linear theory of generalized functions and consistency of Laplace transforms
We offer an axiomatic definition of a differential algebra of generalized
functions over an algebraically closed non-Archimedean field. This algebra is
of Colombeau type in the sense that it contains a copy of the space of Schwartz
distributions. We study the uniqueness of the objects we define and the
consistency of our axioms. Next, we identify an inconsistency in the
conventional Laplace transform theory. As an application we offer a
contradiction-free alternative in the framework of our algebra of generalized
functions. The article is aimed at mathematicians, physicists and engineers who
are interested in the non-linear theory of generalized functions, but who are
not necessarily familiar with the original Colombeau theory. We assume,
however, some basic familiarity with the Schwartz theory of distributions. Comment: 23 pages
Nonsteroidal Anti-Inflammatory Drugs and Opioids in Postsurgical Dental Pain
Postsurgical dental pain is mainly driven by inflammation, particularly through the generation of prostaglandins via the cyclooxygenase system. Thus, it is no surprise that numerous randomized placebo-controlled trials studying acute pain following the surgical extraction of impacted third molars have demonstrated the remarkable efficacy of nonsteroidal anti-inflammatory drugs (NSAIDs) such as ibuprofen, naproxen sodium, etodolac, diclofenac, and ketorolac in this prototypic condition of acute inflammatory pain. Combining an optimal dose of an NSAID with an appropriate dose of acetaminophen appears to further enhance analgesic efficacy and potentially reduce the need for opioids. In addition to being on average inferior to NSAIDs as analgesics in postsurgical dental pain, opioids produce a higher incidence of side effects in dental outpatients, including dizziness, drowsiness, psychomotor impairment, nausea/vomiting, and constipation. Unused opioids are also subject to misuse and diversion, and they may cause addiction. Despite these risks, some dental surgical outpatients may benefit from a 1- or 2-d course of opioids added to their NSAID regimen. NSAID use may carry significant risks in certain patient populations, in which a short course of an acetaminophen/opioid combination may provide a more favorable benefit versus risk ratio than an NSAID regimen. © International & American Associations for Dental Research 2020
Use of Non-Steroidal Anti-Inflammatory Drugs That Elevate Cardiovascular Risk: An Examination of Sales and Essential Medicines Lists in Low-, Middle-, and High-Income Countries
PMCID: PMC3570554. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.