Thread algebra for poly-threading
It is a fact of life that sequential programs are often fragmented. Consequently, fragmented program behaviours are frequently found. We consider this phenomenon in the setting of thread algebra. We extend basic thread algebra with poly-threading, the barest mechanism for sequencing of threads that are taken for program fragment behaviours. This mechanism is the counterpart of program overlaying at the level of program behaviours. We relate the resulting theory to the process theory known as ACP and use it to describe analytic execution architectures suited for fragmented programs. We also consider the case where the steps of fragmented program behaviours are interleaved in the ways of non-distributed and distributed multi-threading.
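To make the sequencing mechanism concrete, here is a minimal Python sketch, not the formal algebra of the paper: threads are built from termination, deadlock and postconditional composition, and a switch construct hands control from one program fragment behaviour to another. All class and function names are invented for this illustration.

```python
# Toy model: a thread either terminates, deadlocks, performs an action and
# branches on the environment's reply, or switches to another fragment's behaviour.
from dataclasses import dataclass

@dataclass
class Stop:        # successful termination
    pass

@dataclass
class Deadlock:    # inaction
    pass

@dataclass
class Switch:      # poly-threading step: continue with fragment `index`
    index: int

@dataclass
class Post:        # postconditional composition: perform `action`, branch on the reply
    action: str
    if_true: object
    if_false: object

def run(fragments, start, reply):
    """Follow one fragment behaviour, switching fragments when asked; return the action trace."""
    trace, current = [], fragments[start]
    while True:
        match current:
            case Stop() | Deadlock():
                return trace
            case Switch(index=i):
                current = fragments[i]          # sequencing of fragment behaviours
            case Post(action=a, if_true=t, if_false=f):
                trace.append(a)
                current = t if reply(a) else f

# Fragment 0 performs `a` and hands over to fragment 1, which performs `b` and terminates.
frags = [Post("a", Switch(1), Deadlock()), Post("b", Stop(), Stop())]
print(run(frags, 0, reply=lambda action: True))  # ['a', 'b']
```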
Personal Multi-threading
Multi-threading allows agents to pursue a heterogeneous collection of tasks
in an orderly manner. The view of multi-threading that emerges from thread
algebra is applied to the case where a single agent, who may be human,
maintains a hierarchical multithread as an architecture of its own activities.
Decision Taking for Selling Thread Startup
Decision Taking is discussed in the context of the role it may play for a
selling agent in a search market, in particular for agents involved in the sale
of valuable and relatively unique items, such as a dwelling, a second-hand car,
or a second-hand recreational vessel.
Detailed connections are made between the architecture of decision-making
processes and a sample of software-technology-based concepts, including
instruction sequences, multi-threading, and thread algebra.
Ample attention is paid to the initialization or startup of a thread
dedicated to achieving a given objective, and to corresponding decision taking.
As an application, the selling of an item is taken as an objective to be
achieved by running a thread that was designed for that purpose.
Thread extraction for polyadic instruction sequences
In this paper, we study the phenomenon that instruction sequences are split
into fragments which somehow produce a joint behaviour. In order to bring this
phenomenon better into the picture, we formalize a simple mechanism by which
several instruction sequence fragments can produce a joint behaviour. We also
show that, even in the case of this simple mechanism, it is a non-trivial
matter to explain by means of a translation into a single instruction sequence
what takes place on execution of a collection of instruction sequence
fragments.
Instruction sequences for the production of processes
Single-pass instruction sequences under execution are considered to produce
behaviours to be controlled by some execution environment. Threads as
considered in thread algebra model such behaviours: upon each action performed
by a thread, a reply from its execution environment determines how the thread
proceeds. Threads in turn can be looked upon as producing processes as
considered in process algebra. We show that, by apposite choice of basic
instructions, all processes that can only be in a finite number of states can
be produced by single-pass instruction sequences.
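For a flavour of how a single-pass instruction sequence under execution yields such a behaviour, the sketch below interprets the usual program-algebra instruction forms (a basic instruction a, positive and negative tests +a and -a, a forward jump #k, and termination !) against an environment that returns boolean replies. The fixed-reply dictionary and the interpreter itself are simplifications for illustration, not the semantics defined in the paper.

```python
def run(instructions, replies):
    """Execute a single-pass instruction sequence; return the trace of performed actions."""
    trace, pc = [], 0
    while 0 <= pc < len(instructions):
        ins = instructions[pc]
        if ins == "!":                      # termination instruction
            break
        if ins.startswith("#"):             # #k: jump k instructions forward
            pc += int(ins[1:])
            continue
        prefix, action = (ins[0], ins[1:]) if ins[0] in "+-" else ("", ins)
        trace.append(action)                # the action is offered to the environment
        reply = replies.get(action, True)   # environment's boolean reply
        if (prefix == "+" and not reply) or (prefix == "-" and reply):
            pc += 1                         # this reply directs execution past the next instruction
        pc += 1
    return trace

# +a tests a; the instruction right after it (b) is executed only on a positive reply.
print(run(["+a", "b", "c", "!"], {"a": True}))   # ['a', 'b', 'c']
print(run(["+a", "b", "c", "!"], {"a": False}))  # ['a', 'c']
```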
WCRT Algebra and Scheduling Interfaces for Esterel-Style Synchronous Multithreading
The abstractions used in system design typically limit themselves to encapsulating and guaranteeing functionality, not timing. Hence, it is very difficult to transfer results on timing behavior across layers, e.g., from the application level through the operating system level to the hardware level. The choice of the model of computation plays a big role in facilitating this transfer. In the realm of reactive systems, the synchronous model of computation has some appeal here, as it inherently limits the number of operations per reaction, and addresses concurrency and preemptive behavior at the language level. Recently, reactive processing architectures have been proposed as an execution platform for synchronous languages, notably Esterel. Initially, these architectures were driven by the desire for high performance with low resource usage, including low power consumption. However, by now they have also demonstrated their benefits in terms of predictability. Preliminary work on worst case reaction time (WCRT) analysis has been promising: fairly simple heuristics already achieve an accuracy typically in the 30--40% range. However, these methods so far lack formal grounding, and do not exploit knowledge about signal consistency etc. To provide a formal basis for WCRT analysis, we here propose a type-theoretic, algebraic approach. This approach not only allows us to verify the correctness of WCRT analysis methods, but also opens the door for more exact analyses, as it allows functionality and timing to be captured precisely and precision to be traded off against analysis effort. This approach is still under development; this report presents first results on suitable interface types and the proper characterization of instantaneous nodes, delay nodes and concurrency. As a concrete application, it builds on a multi-threaded Esterel processor, the Kiel Esterel Processor (KEP).
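As a rough illustration of the quantity such an analysis bounds, the sketch below uses a deliberately coarse cost model: every node costs a fixed number of instruction cycles, sequential composition adds costs, and concurrent threads interleaved on a single reactive processor (as on the KEP) add as well. It ignores delay nodes, tick boundaries and signal knowledge, which is exactly the information the report's interface types are meant to exploit; all names and costs here are invented.

```python
from dataclasses import dataclass

@dataclass
class Instantaneous:   # combinational work finished within the current reaction
    cost: int          # assumed worst-case cost in instruction cycles

@dataclass
class Seq:             # one part after the other within the same reaction
    first: object
    second: object

@dataclass
class Par:             # concurrent threads, interleaved on a single reactive processor
    left: object
    right: object

def wcrt(node):
    """Very coarse upper bound on the cycles spent in one reaction."""
    match node:
        case Instantaneous(cost=c):
            return c
        case Seq(first=a, second=b) | Par(left=a, right=b):
            return wcrt(a) + wcrt(b)   # on one core, concurrent threads' costs add too
    raise ValueError("unknown node")

# Two concurrent 5- and 7-cycle threads followed by a 3-cycle continuation: bound of 15.
print(wcrt(Seq(Par(Instantaneous(5), Instantaneous(7)), Instantaneous(3))))
```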
The Universe at Extreme Scale: Multi-Petaflop Sky Simulation on the BG/Q
Remarkable observational advances have established a compelling
cross-validated model of the Universe. Yet, two key pillars of this model --
dark matter and dark energy -- remain mysterious. Sky surveys that map billions
of galaxies to explore the 'Dark Universe' demand a corresponding
extreme-scale simulation capability; the HACC (Hybrid/Hardware Accelerated
Cosmology Code) framework has been designed to deliver this level of
performance now, and into the future. With its novel algorithmic structure,
HACC allows flexible tuning across diverse architectures, including accelerated
and multi-core systems.
On the IBM BG/Q, HACC attains unprecedented scalable performance -- currently
13.94 PFlops at 69.2% of peak and 90% parallel efficiency on 1,572,864 cores
with an equal number of MPI ranks, and a concurrency of 6.3 million. This level
of performance was achieved at extreme problem sizes, including a benchmark run
with more than 3.6 trillion particles, significantly larger than any
cosmological simulation yet performed.
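The quoted figures can be cross-checked with a line of arithmetic; the implied machine peak and threads-per-core value below are derived from the numbers above, not taken from the paper.

```python
# Consistency check of the quoted performance figures (derived values only).
sustained_pflops = 13.94
fraction_of_peak = 0.692
cores = 1_572_864
concurrency = 6.3e6   # assumed to count hardware threads across all cores

print(f"implied machine peak: {sustained_pflops / fraction_of_peak:.2f} PFlops")  # ~20.14
print(f"threads per core: {concurrency / cores:.1f}")  # ~4.0, matching BG/Q's four hardware threads per core
```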