Reliable massively parallel symbolic computing : fault tolerance for a distributed Haskell
As the number of cores in manycore systems grows exponentially, the number of failures is
also predicted to grow exponentially. Hence massively parallel computations must be able to
tolerate faults. Moreover new approaches to language design and system architecture are needed
to address the resilience of massively parallel heterogeneous architectures.
Symbolic computation has underpinned key advances in Mathematics and Computer Science,
for example in number theory, cryptography, and coding theory. Computer algebra software
systems facilitate symbolic mathematics. Developing these at scale has its own distinctive
set of challenges, as symbolic algorithms tend to employ complex irregular data and control
structures. SymGridParII is a middleware for parallel symbolic computing on massively parallel
High Performance Computing platforms. A key element of SymGridParII is a domain specific
language (DSL) called Haskell Distributed Parallel Haskell (HdpH). It is explicitly designed for
scalable distributed-memory parallelism, and employs work stealing to load balance dynamically
generated irregular task sizes.
To investigate providing scalable fault tolerant symbolic computation we design, implement
and evaluate a reliable version of HdpH, HdpH-RS. Its reliable scheduler detects and handles
faults, using task replication as a key recovery strategy. The scheduler supports load balancing
with a fault tolerant work stealing protocol. The reliable scheduler is invoked with two fault
tolerance primitives for implicit and explicit work placement, and 10 fault tolerant parallel
skeletons that encapsulate common parallel programming patterns. The user is oblivious to
many failures; the scheduler handles them instead.
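The recovery idea described above, a supervisor that re-spawns a task when the node holding it fails, can be illustrated with a small simulation. This is only a sketch under stated assumptions: it is written in Python rather than Haskell, and `Worker`, `SupervisedFuture` and `supervised_spawn` are hypothetical names invented for illustration, not the HdpH-RS primitives or scheduler.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


class NodeFailure(Exception):
    """Raised when a simulated worker node dies before returning a result."""


@dataclass
class Worker:
    # Hypothetical stand-in for a remote node.
    name: str
    alive: bool = True

    def run(self, task: Callable[[], int]) -> int:
        if not self.alive:
            raise NodeFailure(self.name)
        return task()


@dataclass
class SupervisedFuture:
    # Starts empty; the supervisor fills it once some replica completes.
    value: Optional[int] = None
    attempts: List[str] = field(default_factory=list)

    @property
    def full(self) -> bool:
        return self.value is not None


def supervised_spawn(task: Callable[[], int],
                     workers: List[Worker]) -> SupervisedFuture:
    """Try the task on each worker in turn, replicating it after a failure."""
    future = SupervisedFuture()
    for worker in workers:
        future.attempts.append(worker.name)
        try:
            future.value = worker.run(task)
            return future
        except NodeFailure:
            continue  # the node died: replicate the task on the next worker
    raise RuntimeError("all workers failed")


# The first node is dead, so the task is replicated on the second one.
workers = [Worker("node-1", alive=False), Worker("node-2")]
result = supervised_spawn(lambda: sum(range(10)), workers)
```

The caller only sees `result.value`; the failed attempt on `node-1` is recorded in `result.attempts` but never surfaces as an error, which mirrors the point that the user is oblivious to many failures.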
An operational semantics describes small-step reductions on states. A simple abstract machine
for scheduling transitions and task evaluation is presented. It defines the semantics of
supervised futures, and the transition rules for recovering tasks in the presence of failure. The
transition rules are demonstrated with a fault-free execution, and three executions that recover
from faults.
The fault tolerant work stealing protocol has been abstracted into a Promela model. The SPIN
model checker is used to exhaustively search the intersection of states in this automaton to
validate a key resiliency property of the protocol. It asserts that an initially empty supervised
future on the supervisor node will eventually be full in the presence of all possible combinations
of failures.
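The resiliency property stated above can be illustrated by exhaustively exploring a toy state space. The following is a minimal sketch, assuming a much simpler model than the one in the thesis: it is plain Python rather than Promela and SPIN, the supervisor is assumed never to fail, and a task lost to a dead worker is assumed to be re-spawned on the next live worker or, failing that, on the supervisor itself. All names are hypothetical.

```python
SUPERVISOR = -1  # hypothetical id for the supervisor node, assumed never to fail


def successors(state):
    """Return every state reachable in one step from `state`.

    A state is (task location, set of live workers, future is full).
    """
    location, alive, full = state
    if full:
        return []  # terminal: the future has been filled
    if location == SUPERVISOR:
        # The supervisor evaluates the task itself and fills the future.
        return [(SUPERVISOR, alive, True)]
    # The worker holding the task either finishes it or dies.
    finished = (location, alive, True)
    remaining = alive - {location}
    target = min(remaining) if remaining else SUPERVISOR
    died = (target, remaining, False)
    return [finished, died]


def check_eventually_full(n_workers):
    """Explore all failure combinations; report whether every run fills the future."""
    alive = frozenset(range(n_workers))
    initial = (0 if n_workers else SUPERVISOR, alive, False)
    stack, seen, terminals = [initial], set(), []
    while stack:
        state = stack.pop()
        if state in seen:
            continue
        seen.add(state)
        nxt = successors(state)
        if not nxt:
            terminals.append(state)
        stack.extend(nxt)
    return all(full for _, _, full in terminals), len(seen)


ok, explored = check_eventually_full(3)
```

Every failure removes a worker, so the toy state graph is finite and acyclic. Checking that all terminal states have a full future is therefore enough to conclude that the future is eventually filled on every path of this model.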
The performance of HdpH-RS is measured using five benchmarks. Supervised scheduling
achieves a speedup of 757 with explicit task placement and 340 with lazy work stealing when
executing Summatory Liouville on up to 1400 cores of an HPC architecture. Moreover, supervision
overheads are consistently low scaling up to 1400 cores. Low recovery overheads are observed in
the presence of frequent failure when lazy on-demand work stealing is used. A Chaos Monkey
mechanism has been developed for stress testing resiliency with random failure combinations.
All unit tests pass in the presence of random failure, terminating with the expected results.
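To put the reported speedups in context, parallel efficiency can be computed as speedup divided by core count. The sketch below assumes both speedups were measured at the full 1400 cores, which the abstract does not state explicitly. The function name is hypothetical.

```python
def parallel_efficiency(speedup: float, cores: int) -> float:
    """Fraction of ideal linear speedup achieved on `cores` cores."""
    return speedup / cores


# Assumption: both figures refer to runs on all 1400 cores.
explicit_placement = parallel_efficiency(757, 1400)
lazy_stealing = parallel_efficiency(340, 1400)

print(f"explicit placement: {explicit_placement:.3f}")  # 0.541
print(f"lazy work stealing: {lazy_stealing:.3f}")       # 0.243
```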
Abductive reasoning in neural-symbolic learning systems
Abduction is or subsumes a process of inference. It entertains possible hypotheses and it chooses hypotheses for further scrutiny. There is a large literature on various aspects of non-symbolic, subconscious abduction. There is also a very active research community working on the symbolic (logical) characterisation of abduction, which typically treats it as a form of hypothetico-deductive reasoning. In this paper we start to bridge the gap between the symbolic and sub-symbolic approaches to abduction. We are interested in benefiting from developments made by each community. In particular, we are interested in the ability of non-symbolic systems (neural networks) to learn from experience using efficient algorithms and to perform massively parallel computations of alternative abductive explanations. At the same time, we would like to benefit from the rigour and semantic clarity of symbolic logic. We present two approaches to dealing with abduction in neural networks. One of them uses Connectionist Modal Logic and a translation of Horn clauses into modal clauses to come up with a neural network ensemble that computes abductive explanations in a top-down fashion. The other combines neural-symbolic systems and abductive logic programming and proposes a neural architecture which performs a more systematic, bottom-up computation of alternative abductive explanations. Both approaches employ standard neural network architectures which are already known to be highly effective in practical learning applications. Differently from previous work in the area, our aim is to promote the integration of reasoning and learning in a way that the neural network provides the machinery for cognitive computation, inductive learning and hypothetical reasoning, while logic provides the rigour and explanation capability to the systems, facilitating the interaction with the outside world. 
Although it is left as future work to determine whether the structure of one of the proposed approaches is more amenable to learning than the other, we hope to have contributed to the development of the area by approaching it from the perspective of symbolic and sub-symbolic integration.
Fewer epistemological challenges for connectionism
Seventeen years ago, John McCarthy wrote the note 'Epistemological challenges for connectionism' as a response to Paul Smolensky’s paper 'On the proper treatment of connectionism'. I will discuss the extent to which the four key challenges put forward by McCarthy have been solved, and what new challenges lie ahead. I argue that there are fewer epistemological challenges for connectionism, but progress has been slow. Nevertheless, recent developments in the field give strong indication that neural-symbolic integration can provide effective systems of expressive reasoning and robust learning.
Platform Dependent Verification: On Engineering Verification Tools for 21st Century
The paper overviews recent developments in platform-dependent explicit-state
LTL model checking. Comment: In Proceedings PDMC 2011, arXiv:1111.006
Report from the MPP Working Group to the NASA Associate Administrator for Space Science and Applications
NASA's Office of Space Science and Applications (OSSA) gave a select group of scientists the opportunity to test and implement their computational algorithms on the Massively Parallel Processor (MPP) located at Goddard Space Flight Center, beginning in late 1985. One year later, the Working Group presented its report, which addressed the following: algorithms, programming languages, architecture, programming environments, the relationship to theory, and measured performance. The findings point to a number of demonstrated computational techniques for which the MPP architecture is ideally suited. For example, besides executing much faster on the MPP than on conventional computers, systolic VLSI simulation (where distances are short), lattice simulation, neural network simulation, and image problems were found to be easier to program on the MPP's architecture than on a CYBER 205 or even a VAX. The report also makes technical recommendations covering all aspects of MPP use, and recommendations concerning the future of the MPP and machines based on similar architectures, expansion of the Working Group, and study of the role of future parallel processors for the space station, EOS, and the Great Observatories era.
Value-based argumentation frameworks as neural-symbolic learning systems
While neural networks have been successfully used in a number of machine learning applications, logical languages have been the standard for the representation of argumentative reasoning. In this paper, we establish a relationship between neural networks and argumentation networks, combining reasoning and learning in the same argumentation framework. We do so by presenting a new neural argumentation algorithm, responsible for translating argumentation networks into standard neural networks. We then show a correspondence between the two networks. The algorithm works not only for acyclic argumentation networks, but also for circular networks, and it enables the accrual of arguments through learning as well as the parallel computation of arguments.