Kerncraft: A Tool for Analytic Performance Modeling of Loop Kernels
Achieving optimal program performance requires deep insight into the
interaction between hardware and software. For software developers without an
in-depth background in computer architecture, understanding and fully utilizing
modern architectures is close to impossible. Analytic loop performance modeling
is a useful way to understand the relevant bottlenecks of code execution based
on simple machine models. The Roofline Model and the Execution-Cache-Memory
(ECM) model are proven approaches to performance modeling of loop nests. In
comparison to the Roofline model, the ECM model can also describe the
single-core performance and saturation behavior on a multicore chip. We give an
introduction to the Roofline and ECM models, and to stencil performance
modeling using layer conditions (LC). We then present Kerncraft, a tool that
can automatically construct Roofline and ECM models for loop nests by
performing the required code, data transfer, and LC analysis. The layer
condition analysis makes it possible to predict optimal spatial blocking
factors for loop nests. Together with the models, it enables an ab-initio estimate of the
potential benefits of loop blocking optimizations and of useful block sizes. In
cases where LC analysis is not easily possible, Kerncraft supports a cache
simulator as a fallback option. Using a 25-point long-range stencil we
demonstrate the usefulness and predictive power of the Kerncraft tool.
Comment: 22 pages, 5 figures
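The Roofline model named in this abstract can be illustrated with a minimal sketch. It assumes the textbook form of the model, in which attainable performance is the smaller of peak compute performance and arithmetic intensity times memory bandwidth. The function name and the machine numbers are hypothetical and are not taken from Kerncraft or from the paper.

```python
def roofline(peak_flops, mem_bandwidth, intensity):
    """Attainable performance in FLOP/s under the basic Roofline model.

    peak_flops    -- peak compute performance in FLOP/s
    mem_bandwidth -- memory bandwidth in byte/s
    intensity     -- arithmetic intensity of the loop in FLOP/byte
    """
    return min(peak_flops, intensity * mem_bandwidth)


# Hypothetical machine, chosen only for illustration.
PEAK = 40e9       # 40 GFLOP/s
BANDWIDTH = 20e9  # 20 GB/s

# A low-intensity loop is limited by memory bandwidth,
# a high-intensity loop by peak compute performance.
memory_bound = roofline(PEAK, BANDWIDTH, 0.5)
compute_bound = roofline(PEAK, BANDWIDTH, 4.0)
```

The ECM model refines this picture with per-cache-level transfer times, which this sketch does not attempt to reproduce.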
Towards an Achievable Performance for the Loop Nests
Numerous code optimization techniques, including loop nest optimizations,
have been developed over the last four decades. Loop optimization techniques
transform loop nests to improve the performance of the code on a target
architecture, including exposing parallelism. Finding and evaluating an
optimal, semantic-preserving sequence of transformations is a complex problem.
The sequence is guided using heuristics and/or analytical models and there is
no way of knowing how close it gets to optimal performance or if there is any
headroom for improvement. This paper makes two contributions. First, it uses a
comparative analysis of loop optimizations/transformations across multiple
compilers to determine how much headroom may exist for each compiler.
Second, it presents an approach to characterize loop nests based on their
hardware performance counter values and a Machine Learning approach that
predicts which compiler will generate the fastest code for a loop nest. The
prediction is made for both auto-vectorized, serial compilation and for
auto-parallelization. The results show that the headroom for state-of-the-art
compilers ranges from 1.10x to 1.42x for the serial code and from 1.30x to
1.71x for the auto-parallelized code. These results are based on the Machine
Learning predictions.
Comment: Accepted at the 31st International Workshop on Languages and
Compilers for Parallel Computing (LCPC 2018)
An axiomatic approach to the non-linear theory of generalized functions and consistency of Laplace transforms
We offer an axiomatic definition of a differential algebra of generalized
functions over an algebraically closed non-Archimedean field. This algebra is
of Colombeau type in the sense that it contains a copy of the space of Schwartz
distributions. We study the uniqueness of the objects we define and the
consistency of our axioms. Next, we identify an inconsistency in the
conventional Laplace transform theory. As an application, we offer a
contradiction-free alternative in the framework of our algebra of generalized
functions. The article is aimed at mathematicians, physicists and engineers who
are interested in the non-linear theory of generalized functions, but who are
not necessarily familiar with the original Colombeau theory. We assume,
however, some basic familiarity with the Schwartz theory of distributions.
Comment: 23 pages
Use of Non-Steroidal Anti-Inflammatory Drugs That Elevate Cardiovascular Risk: An Examination of Sales and Essential Medicines Lists in Low-, Middle-, and High-Income Countries
PMCID: PMC3570554
This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Optimizing end-labeled free-solution electrophoresis by increasing the hydrodynamic friction of the drag-tag
We study the electrophoretic separation of polyelectrolytes of varying
lengths by means of end-labeled free-solution electrophoresis (ELFSE). A
coarse-grained molecular dynamics simulation model, using full electrostatic
interactions and a mesoscopic Lattice Boltzmann fluid to account for
hydrodynamic interactions, is used to characterize the drag coefficients of
different label types: linear and branched polymeric labels, as well as
transiently bound micelles.
It is specifically shown that the label's drag coefficient is determined by
its hydrodynamic size, and that the drag per label monomer is largest for
linear labels. However, the addition of side chains to a linear label offers
the possibility to increase the hydrodynamic size, and therefore the label
efficiency, without having to increase the linear length of the label, thereby
simplifying synthesis. The third class of labels investigated, transiently
bound micelles, seems very promising for use in ELFSE, as they provide a
significantly higher hydrodynamic drag than the other label types.
The results are compared to theoretical predictions, and we investigate how
the efficiency of the ELFSE method can be improved by using smartly designed
drag-tags.
Comment: 32 pages, 11 figures, submitted to Macromolecules
Thrombosis Is Reduced by Inhibition of COX-1, but Unaffected by Inhibition of COX-2, in an Acute Model of Platelet Activation in the Mouse
This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
High-performance generalized tensor operations: A compiler-oriented approach
The efficiency of tensor contraction is of great importance. Compilers cannot optimize it well enough to come close to the performance of expert-tuned implementations. All existing approaches that provide competitive performance require optimized external code. We introduce a compiler optimization that reaches the performance of optimized BLAS libraries without the need for an external implementation or automatic tuning. Our approach provides competitive performance across hardware architectures and can be generalized to deliver the same benefits for algebraic path problems. By making fast linear algebra kernels available to everyone, we expect productivity increases when optimized libraries are not available. © 2018 Association for Computing Machinery
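For readers unfamiliar with the term, a tensor contraction can be written as a plain loop nest, which is the kind of code a compiler would have to optimize. The sketch below is only an illustration of the operation itself, not the optimization described in the abstract. The function name, the choice of contracted indices, and the nested-list data layout are assumptions made for this example.

```python
def contract(a, b):
    """Naive contraction C[i][j] = sum over k, l of A[i][k][l] * B[k][l][j].

    a and b are nested lists with shapes (ni, nk, nl) and (nk, nl, nj).
    """
    ni, nk, nl = len(a), len(a[0]), len(a[0][0])
    nj = len(b[0][0])
    c = [[0] * nj for _ in range(ni)]
    for i in range(ni):
        for j in range(nj):
            for k in range(nk):
                for l in range(nl):
                    c[i][j] += a[i][k][l] * b[k][l][j]
    return c


# Small hypothetical inputs: A has shape (1, 2, 2), B has shape (2, 2, 1).
a = [[[1, 2], [3, 4]]]
b = [[[1], [1]], [[1], [1]]]
c = contract(a, b)
```

Merging the k and l loops into a single index turns this loop nest into an ordinary matrix multiplication, which is why contraction performance is usually compared against BLAS.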
Remarks on the distributional Schwarzschild geometry
This work is devoted to a mathematical analysis of the distributional Schwarzschild geometry. The Schwarzschild solution is extended to include the singularity; the energy momentum tensor becomes a delta-distribution supported at r=0. Using generalized distributional geometry in the sense of Colombeau's (special) construction, the nonlinearities are treated in a mathematically rigorous way. Moreover, generalized function techniques are used as a tool to give a unified discussion of the various approaches taken in the literature so far; in particular, we comment on geometrical issues.
Energy performance contracting (EPC): a suitable mechanism for achieving energy savings in housing cooperatives? Results from a Norwegian pilot project
The barriers to energy savings in institutions and private homes are well known and include people’s lack of interest, awareness, knowledge, and human and financial capacity. Experience from several countries shows that energy performance contracting (EPC) may be used to overcome many of these barriers. A typical EPC project is delivered by an energy service company (ESCO), and the contract is accompanied by a guarantee of energy savings. EPC is increasingly used in the professional market (firms and the public sector) but is less common in the residential sector. It has been suggested that there are several barriers to using EPC in the domestic sector, such as the uncertainty involved in estimating future reductions in private consumption. In this paper, we present the results from a pilot project on the use of EPC in a housing cooperative in Oslo. The project was initiated and observed by the researchers. The research followed a transdisciplinary methodology in that it was conducted by both researcher and practitioner (co-authors) in close collaboration with members of the housing cooperative and the ESCOs, who also contributed to the interpretation of results. We document the process: why the Board decided to join the EPC pilot; the call for offers from ESCOs, who guaranteed that purchased annual energy would be reduced by one third; the responses to and negotiations of the offer from the ESCO that was contracted in the initial phase; and the moment when the General Assembly finally decided not to invest in the proposed energy saving measures. We find that the residents not only had limited interest in energy savings but also lacked confidence in the EPC process. This contributed to the outcome. We discuss the findings in relation to the barriers to using EPC among housing cooperatives. We highlight the need for more knowledge about the client side for understanding how barriers may be overcome.
Three specific recommendations for how EPC may successfully be employed among housing cooperatives are suggested as follows: (i) include refurbishment and not only energy savings in the EPC, (ii) identify the residents’ needs in an early phase and (iii) communicate the EPC principle to the residents throughout the process.
