Clock Math — a System for Solving SLEs Exactly
In this paper, we present a GPU-accelerated hybrid system that solves ill-conditioned systems of linear equations (SLEs) exactly. Exactly means without rounding errors, which is achieved by working in integer arithmetic. First, we scale floating-point numbers up to integers; we then solve dozens of SLEs in different modular arithmetics and assemble the sub-solutions using the Chinese remainder theorem. This approach effectively bypasses the limitations of CPU floating-point arithmetic. The system can solve systems with Hilbert matrices without losing a single bit of precision, and with a significant speedup compared to existing CPU solvers.
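As a rough illustration of the residue-number-system pipeline the abstract describes (not the paper's GPU implementation), the following Python sketch solves a small integer system modulo several primes and recombines the per-prime solutions with the Chinese remainder theorem; the matrix, right-hand side, and moduli are illustrative assumptions.

```python
# Toy sketch: exact solution of an integer system via per-prime modular
# solves plus CRT recombination. Assumes the exact solution is an integer
# vector small enough to fit in the product of the chosen moduli.

def solve_mod_p(A, b, p):
    """Gauss-Jordan elimination over Z/pZ (A assumed nonsingular mod p)."""
    n = len(A)
    M = [[A[i][j] % p for j in range(n)] + [b[i] % p] for i in range(n)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] % p != 0)
        M[col], M[piv] = M[piv], M[col]
        inv = pow(M[col][col], -1, p)            # modular inverse of pivot
        M[col] = [(v * inv) % p for v in M[col]]
        for r in range(n):
            if r != col and M[r][col]:
                f = M[r][col]
                M[r] = [(M[r][j] - f * M[col][j]) % p for j in range(n + 1)]
    return [row[n] for row in M]

def crt_pair(r1, m1, r2, m2):
    """Combine x = r1 (mod m1) and x = r2 (mod m2) into x mod m1*m2."""
    t = ((r2 - r1) * pow(m1, -1, m2)) % m2
    return r1 + m1 * t

A = [[3, 1], [1, 2]]
b = [9, 8]                        # exact solution is x = (2, 3)
primes = [10007, 10009, 10037]

residues = solve_mod_p(A, b, primes[0])
modulus = primes[0]
for p in primes[1:]:
    xp = solve_mod_p(A, b, p)
    residues = [crt_pair(residues[i], modulus, xp[i], p) for i in range(len(b))]
    modulus *= p

# Map from [0, modulus) to a symmetric range to recover signed integers.
x = [r - modulus if r > modulus // 2 else r for r in residues]
print(x)                          # -> [2, 3], with no rounding error
```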
Simit: A Language for Physical Simulation
Using existing programming tools, writing high-performance simulation code is labor intensive and requires sacrificing readability and portability. The alternative is to prototype simulations in a high-level language like Matlab, thereby sacrificing performance. The Matlab programming model naturally describes the behavior of an entire physical system using the language of linear algebra. However, simulations also manipulate individual geometric elements, which are best represented using linked data structures like meshes. Translating between the linked data structures and linear algebra comes at significant cost, both to the programmer and the machine. High-performance implementations avoid the cost by rephrasing the computation in terms of linked or indexed data structures, leaving the code complicated and monolithic, often increasing its size by an order of magnitude. In this paper, we present Simit, a new language for physical simulations that lets the programmer view the system both as a linked data structure in the form of a hypergraph, and as a set of global vectors, matrices and tensors, depending on what is convenient at any given time. Simit provides a novel assembly construct that makes it conceptually easy and computationally efficient to move between the two abstractions. Using the information provided by the assembly construct, the compiler generates efficient in-place computation on the graph. We demonstrate that Simit is easy to use: a Simit program is typically shorter than a Matlab program; that it is high-performance: a Simit program running sequentially on a CPU performs comparably to hand-optimized simulations; and that it is portable: Simit programs can be compiled for GPUs with no change to the program, delivering 5-25x speedups over our optimized CPU code.
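The assembly idea generalizes the familiar pattern of scattering per-element contributions into global operators. The sketch below is plain Python with SciPy, not Simit, and the spring "hypergraph" and stiffness value are invented for illustration; it only shows the two views the abstract contrasts: a linked collection of edges and a global sparse matrix.

```python
# Minimal assembly sketch (assumed example, not Simit): per-edge data is
# scattered into a global sparse matrix, after which ordinary linear algebra
# on global vectors applies.

import numpy as np
from scipy.sparse import coo_matrix

springs = [(0, 1), (1, 2), (2, 3)]    # edges of a tiny hypergraph
k = 10.0                              # assumed spring stiffness

rows, cols, vals = [], [], []
for i, j in springs:
    # Local 2x2 contribution of one spring, scattered into global indices.
    for (r, c, v) in [(i, i, k), (i, j, -k), (j, i, -k), (j, j, k)]:
        rows.append(r); cols.append(c); vals.append(v)

K = coo_matrix((vals, (rows, cols)), shape=(4, 4)).tocsr()  # global matrix view
f = np.array([0.0, 1.0, 0.0, -1.0])                         # global vector view
print(K @ f)                          # global linear-algebra operation
```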
Hybrid Analog-Digital Co-Processing for Scientific Computation
In the past 10 years, computer architecture research has moved toward more heterogeneity and less adherence to conventional abstractions. Scientists and engineers hold an unshakable belief that computing holds the keys to unlocking humanity's Grand Challenges. Acting on that belief, they have looked deeper into computer architecture to find specialized support for their applications. Likewise, computer architects have looked deeper into circuits and devices in search of untapped performance and efficiency. The lines between computer architecture layers---applications, algorithms, architectures, microarchitectures, circuits and devices---have blurred. Against this backdrop, a menagerie of computer architectures is on the horizon: architectures that forgo basic assumptions about computer hardware and require new thinking about how such hardware supports problems and algorithms.
This thesis is about revisiting hybrid analog-digital computing in support of diverse modern workloads. Hybrid computing had extensive applications in early computing history, and has been revisited for small-scale applications in embedded systems. But architectural support for using hybrid computing in modern workloads, at scale and with high accuracy solutions, has been lacking.
I demonstrate solving a variety of scientific computing problems, including stochastic ODEs, partial differential equations, linear algebra, and nonlinear systems of equations, as case studies in hybrid computing. I solve these problems on a system of multiple prototype analog accelerator chips built by a team at Columbia University. On that team I made contributions toward programming the chips, building the digital interface, and validating the chips' functionality. The analog accelerator chip is intended for use in conjunction with a conventional digital host computer.
The appeal and motivation for using an analog accelerator are efficiency and performance, but the accelerator comes with limitations in accuracy and problem size that we have to work around.
The first problem is how to solve problems in this unconventional computation model. Scientific computing phrases problems as differential equations and algebraic equations. Differential equations are a continuous view of the world, while algebraic equations are a discrete one. Prior work in analog computing focused mostly on differential equations; algebraic equations played only a minor role. The secret to using the analog accelerator to support modern workloads on conventional computers is that these two viewpoints are interchangeable: the algebraic equations that underlie most workloads can be solved as differential equations, and differential equations are naturally solvable in the analog accelerator chip. A hybrid analog-digital computer architecture can therefore focus on solving linear and nonlinear algebra problems to support many workloads.
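One standard way to make that recasting concrete (a generic formulation, not necessarily the thesis's exact mapping onto the hardware) is to solve the algebraic system A x = b by integrating a gradient-flow ODE until it reaches steady state; in the sketch below, an explicit Euler loop stands in for continuous-time analog integration.

```python
# Solve A x = b by following dx/dt = -A^T (A x - b) to its steady state.
# Generic illustration with assumed matrix, step size, and iteration count.

import numpy as np

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])

x = np.zeros(2)
dt = 0.01                                   # small step, stable for this A
for _ in range(5000):
    x = x + dt * (-A.T @ (A @ x - b))       # flow toward the solution of A x = b

print(x, np.linalg.solve(A, b))             # the two agree at steady state
```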
The second problem is how to get accurate solutions using hybrid analog-digital computing. The reason the analog computation model gives less accurate solutions is that it gives up representing numbers as digital binary numbers, and instead uses the full range of analog voltage and current to represent real numbers. Prior work has established that encoding data in analog signals gives an energy efficiency advantage as long as the analog data precision is limited. While the analog accelerator alone may be useful for energy-constrained applications where inputs and outputs are imprecise, we are more interested in using analog in conjunction with digital for precise solutions. This thesis offers the novel insight that the trick to doing so is to solve nonlinear problems, where low-precision guesses are useful starting points for conventional digital algorithms.
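A minimal sketch of that idea, with the accelerator's limited precision simulated by a noisy solve (an assumption, not the chip's actual error model): the rough analog guess seeds a digital iterative-refinement loop, which recovers a solution accurate to full double precision.

```python
# Low-precision "analog" solves seeding digital iterative refinement.
# The noise model and the 2x2 system are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])

def analog_solve(rhs):
    """Stand-in for the analog accelerator: a solution good to ~3 digits."""
    return np.linalg.solve(A, rhs) * (1 + 1e-3 * rng.standard_normal(2))

x = analog_solve(b)                          # cheap, imprecise first guess
for _ in range(5):                           # digital refinement steps
    r = b - A @ x                            # residual in full precision
    x = x + analog_solve(r)                  # correct with another rough solve

print(np.linalg.norm(b - A @ x))             # residual near machine precision
```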
The third problem is how to solve large problems using hybrid analog-digital computing. The reason the analog computation model cannot handle large problems is that it gives up step-by-step discrete-time operation, instead allowing variables to evolve smoothly in continuous time. To make that happen, the analog accelerator chains hardware for mathematical operations end-to-end; during computation, analog data flows through the hardware with no overheads from control logic and memory accesses. The downside is that the required hardware grows with problem size. While scientific computing researchers have long split large problems into smaller subproblems to fit the constraints of digital computers, this thesis is a first attempt to treat these divide-and-conquer algorithms as an essential tool for using the analog model of computation.
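As a generic illustration of the divide-and-conquer framing (not the thesis's specific decomposition), the sketch below runs a block Jacobi iteration in which each small diagonal block is the size a fixed-capacity accelerator could solve at once; the 4x4 system and 2x2 block size are assumptions.

```python
# Block Jacobi: a large system is swept block by block, each block solve
# being small enough for a fixed-size accelerator. Illustrative example.

import numpy as np

A = np.array([[10.0, 1.0, 0.0, 0.0],
              [ 1.0, 8.0, 1.0, 0.0],
              [ 0.0, 1.0, 9.0, 1.0],
              [ 0.0, 0.0, 1.0, 7.0]])
b = np.array([1.0, 2.0, 3.0, 4.0])
blocks = [slice(0, 2), slice(2, 4)]          # subproblems that "fit" on-chip

x = np.zeros(4)
for _ in range(50):
    x_new = x.copy()
    for s in blocks:
        r = b[s] - A[s, :] @ x + A[s, s] @ x[s]   # RHS seen by this block
        x_new[s] = np.linalg.solve(A[s, s], r)    # small solve per block
    x = x_new

print(np.allclose(A @ x, b, atol=1e-8))      # the pieces converge to A x = b
```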
As we enter the post-Moore's-law era of computing, unconventional architectures will offer specialized models of computation that uniquely support specific problem types. Two prominent examples are deep neural networks and quantum computers. Recent trends in computer science research suggest these unconventional architectures will soon see broad adoption. In this thesis I show that another specialized, unconventional architecture, the analog accelerator, can solve problems in scientific computing. Computer architecture researchers will discover other important models of computation in the future. This thesis is an example of the discovery process, implementation, and evaluation of how an unconventional architecture supports specialized workloads.
Enhanced Lanczos Algorithms for Solving Systems of Linear Equations with Embedding Interpolation and Extrapolation
Lanczos-type algorithms are prone to breaking down before convergence to an acceptable solution is achieved. This study investigates a number of ways to deal with this issue. In the first instance, we investigate the quality of three types of restarting points in the restarting strategy when applied to a particular Lanczos-type algorithm, namely Orthodir. The main contribution of the thesis, however, concerns using regression as an alternative way to deal with breakdown. A Lanczos-type algorithm is run for a number of iterations and then stopped, ideally just before breakdown occurs. The sequence of generated iterates is used to build a regression model that captures the characteristics of this sequence. The model is then used to generate new iterates that belong to that sequence. Since the iterative process of Lanczos is circumvented, or ignored, while using the model to find new points, the breakdown issue is resolved, at least temporarily, unless convergence is achieved. This new approach, called EIEMLA, is shown formally, through extrapolation, to generate a new point which is at least as good as the last point generated by the Lanczos-type algorithm prior to stoppage. The remaining part of the thesis reports on the implementation of EIEMLA, sequentially and in parallel, on a standard parallel machine provided locally and on a cloud computing platform, namely Domino Data Lab. Through these implementations, we have shown that problems with up to variables and equations can be solved with the new approach. Extensive numerical results are included in this thesis. Moreover, we point out some important issues for further investigation.
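To make the extrapolation idea concrete, the sketch below runs a few steps of a simple iterative solver, stores the iterates, and then extrapolates a better point from the stored sequence; Aitken's delta-squared formula is used here per coordinate purely as a stand-in for the regression model EIEMLA fits, and the system, iteration, and step size are assumptions.

```python
# Stop an iterative solver early and extrapolate ahead from its iterates.
# Aitken's delta-squared acceleration substitutes for the regression model.

import numpy as np

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])

# A few Richardson iterations x_{k+1} = x_k + omega * (b - A x_k).
omega = 0.2
xs = [np.zeros(2)]
for _ in range(6):
    xs.append(xs[-1] + omega * (b - A @ xs[-1]))

# Extrapolate each coordinate from the last three stored iterates.
x0, x1, x2 = xs[-3], xs[-2], xs[-1]
x_extrap = x2 - (x2 - x1) ** 2 / ((x2 - x1) - (x1 - x0))

print(x_extrap, np.linalg.solve(A, b))   # extrapolated point vs. true solution
```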
Proceedings, MSVSCC 2011
Proceedings of the 5th Annual Modeling, Simulation & Visualization Student Capstone Conference held on April 14, 2011 at VMASC in Suffolk, Virginia. 186 pp
Practical synthesis from real-world oracles
As software systems become increasingly heterogeneous, the ability of compilers to reason about an entire system has decreased. When components of a system are not implemented as traditional programs, but rather as specialised hardware, optimised architecture-specific libraries, or network services, the compiler is unable to cross these abstraction barriers and analyse the system as a whole.
If these components could be modelled or understood as programs, then the compiler would be able to reason about their behaviour without concern for their internal implementation details: a homogeneous view of the entire system would be afforded. However, it is rarely the case that such components ever corresponded to an original program. This means that to facilitate this homogeneous analysis, programmatic models of component behaviour must be learned or constructed automatically.
Constructing these models is an inductive program synthesis problem, albeit a challenging one that is largely beyond the ability of existing implementations. In order for the problem to be made tractable, information provided by the underlying context (i.e. the real component behaviour to be matched) must be integrated.
This thesis presents three program synthesis approaches that integrate contextual information to synthesise programmatic models for real, existing components. The first, Annote, exploits informally-encoded information about a component's interface (e.g. from documentation) by weaving that information into an extended type-and-attribute system for component interfaces. The second, Presyn, learns a pair of cooperating probabilistic models from prior syntheses, that aim to predict likely program structure based on a component's interface. Finally, Haze uses observations of common side-effects of component executions to bias the search for programs. These approaches are each evaluated against comparable synthesisers from the literature, on a set of benchmark problems derived from real components.
Learning models for component behaviour is only a partial solution; the compiler must also have some mechanism to use those models for program analysis and transformation. This thesis additionally proposes a novel mechanism for context-sensitive automatic API migration based on synthesised programmatic models, and evaluates the effectiveness of doing so on real application code.
In summary, this thesis proposes a new framing for program synthesis problems that target the behaviour of real components, and demonstrates three different potential approaches to synthesis in this spirit. The success of these approaches is evaluated against implementations from the literature, and their results are used to drive a novel API migration technique.
SusTrainable: Promoting Sustainability as a Fundamental Driver in Software Development Training and Education. 2nd Teacher Training, January 23-27, 2023, Pula, Croatia. Revised lecture notes
This volume presents the revised lecture notes of the 2nd teacher training organized as part of the project Promoting Sustainability as a Fundamental Driver in Software Development Training and Education, held at the Juraj Dobrila University of Pula, Croatia, in the week of January 23-27, 2023. The project is Erasmus+ project No. 2020-1-PT01-KA203-078646 - SusTrainable. More details can be found at the project web site https://sustrainable.github.io/
Among the most important contributions of the project are two summer schools. The 2nd SusTrainable Summer School (SusTrainable-23) will be organized at the University of Coimbra, Portugal, in the week of July 10-14, 2023. The summer school will consist of lectures and practical work for master and PhD students in computing science and closely related fields. There will be contributions from Babeş-Bolyai University, Eötvös Loránd University, Juraj Dobrila University of Pula, Radboud University Nijmegen, Roskilde University, Technical University of Košice, University of Amsterdam, University of Coimbra, University of Minho, University of Plovdiv, University of Porto, and University of Rijeka.
To prepare and streamline the summer school, the consortium organized a teacher training in Pula, Croatia. This was an event of five full days, organized by Tihana Galinac Grbac and Neven Grbac. The Juraj Dobrila University of Pula is strongly committed to sustainability: its education, research, and management are conducted with sustainability goals in mind.
The contributions in these proceedings were reviewed and provide a good overview of the range of topics that will be covered at the summer school. The papers in the proceedings, as well as the very constructive and cooperative teacher training, promise a high-quality and beneficial summer school for all participants.
Comment: 85 pages, 8 figures, 3 code listings and 1 table; editors: Tihana Galinac Grbac, Csaba Szabó, João Paulo Fernandes
WOFEX 2021: 19th annual workshop, Ostrava, 1st September 2021: proceedings of papers
The workshop WOFEX 2021 (PhD workshop of the Faculty of Electrical Engineering and Computer Science) was held on September 1st, 2021 at the VSB – Technical University of Ostrava. The workshop offers an opportunity for students to meet and share their research experiences, to discover commonalities in research and studentship, and to foster a collaborative environment for joint problem solving. PhD students are encouraged to attend in order to ensure a broad, unconfined discussion. In that view, this workshop is intended for students and researchers of this faculty, offering opportunities to meet new colleagues.