138 research outputs found

    Macroservers: An Execution Model for DRAM Processor-In-Memory Arrays

    The emergence of semiconductor fabrication technology allowing a tight coupling between high-density DRAM and CMOS logic on the same chip has led to the important new class of Processor-In-Memory (PIM) architectures. Newer developments provide powerful parallel processing capabilities on the chip, exploiting the facility to load wide words in single memory accesses and supporting complex address manipulations in the memory. Furthermore, large arrays of PIMs can be arranged into a massively parallel architecture. In this report, we describe an object-based programming model based on the notion of a macroserver. Macroservers encapsulate a set of variables and methods; threads, spawned by the activation of methods, operate asynchronously on the variables' state space. Data distributions provide a mechanism for mapping large data structures across the memory region of a macroserver, while work distributions allow explicit control of bindings between threads and data. Both data and work distributions are first-class objects of the model, supporting the dynamic management of data and threads in memory. This offers the flexibility required for fully exploiting the processing power and memory bandwidth of a PIM array, in particular for irregular and adaptive applications. Thread synchronization is based on atomic methods, condition variables, and futures. A special type of lightweight macroserver allows the formulation of flexible scheduling strategies for the access to resources, using a monitor-like mechanism.

    Polyglot Programming in Applications Used for Genetic Data Analysis

    Replication Techniques for Speeding up Parallel Applications on Distributed Systems

    This paper discusses the design choices involved in replicating objects and their effect on performance. Important issues are: how to maintain consistency among different copies of an object; how to implement changes to objects; and which strategy for object replication to use. We have implemented several options to determine which ones are most efficient.

    Distributed Shared Memory: A Survey of Issues and Algorithms

    26 pages. A distributed shared memory (DSM) is an implementation of the shared memory abstraction on a multicomputer architecture which has no physically shared memory. Shared memory is important (as a programming model) not only because of the vast number of existing applications which use it, but also because it is a more appropriate paradigm for certain algorithms. The DSM concept was demonstrated to be viable by Li, in IVY. Recently, there has been a surge of new projects which implement DSM in a variety of software and hardware environments. This paper gives an integrated overview of distributed shared memory. We discuss theoretical lower bounds on the performance of DSM systems, design choices such as structure and granularity, access, coherence semantics, scalability, and heterogeneity, and open problems in DSM. In addition, we describe algorithms used to implement and improve efficiency: reducing thrashing, eliminating false sharing, matching the coherence protocol to the type of sharing, and relaxing the semantics of the memory coherence provided. A spectrum of current DSM systems is used as illustrative examples.

    Orca: A Language for Parallel Programming of Distributed Systems

    Orca is a language for implementing parallel applications on loosely coupled distributed systems. Unlike most languages for distributed programming, it allows processes on different machines to share data. Such data are encapsulated in data-objects, which are instances of user-defined abstract data types. The implementation of Orca takes care of the physical distribution of objects among the local memories of the processors. In particular, an implementation may replicate and/or migrate objects in order to decrease access times to objects and increase parallelism. This paper gives a detailed description of the Orca language design and motivates the design choices. Orca is intended for applications programmers rather than systems programmers. This is reflected in its design goals to provide a simple, easy to use language that is type-secure and provides clean semantics. The paper discusses three example parallel applications in Orca, one of which is described in detail. It also describes..

    Installation guide to mpich, a portable implementation of MPI

    A Comparison of Two Paradigms for Distributed Shared Memory

    This paper compares two paradigms for Distributed Shared Memory on loosely coupled computing systems: the shared data-object model as used in Orca, a programming language specially designed for loosely coupled computing systems, and the Shared Virtual Memory model. For both paradigms two systems are described, one using only point-to-point messages, the other using broadcasting as well. The two paradigms and their implementations are described briefly. Their performance on four applications is compared: the travelling-salesman problem, alpha-beta search, matrix multiplication, and the all-pairs shortest paths problem. The relevant measurements were obtained on a system consisting of 10 MC68020 processors connected by an Ethernet. For comparison purposes, the applications have also been run on a system with physical shared memory. In addition, the paper gives measurements for the first two applications above when Remote Procedure Call is used as the communication mechanism. The measurements show that both paradigms can be used efficiently for programming large-grain parallel applications, with significant speed-ups. The structured shared data-object model achieves the highest speed-ups and is easiest to program and to debug. KEYWORDS: Amoeba, Distributed shared memory, Distributed programming, Orc

    A language for the execution of graded BDI agents

    We are interested in the specification and deployment of multi-agent systems, and in particular we focus on the execution of agents. Along this research line, we have proposed a general model for graded BDI agents, specifying an architecture based on multi-context systems (MCSs) that is able to deal with environment uncertainty (via graded beliefs) and with graded mental proactive attitudes (via desires and intentions). These graded attitudes are represented using appropriate fuzzy modal logics. In this article, we address the operational semantics of this agent model. We present a Multi-context calculus, based on Ambient calculus, for the execution of MCSs with its corresponding semantics. This calculus is general enough to support different kinds of MCSs and, in particular, we show how a graded BDI agent can be mapped into the language of the calculus. © The Author 2011. Published by Oxford University Press. All rights reserved. The authors are thankful to the anonymous reviewers for their helpful comments for improving the paper. Ana Casali acknowledges partial support by the PID-UNR ING308 project. Lluís Godo and Carles Sierra acknowledge partial support by the Spanish project Agreement Technologies (CONSOLIDER CSD2007-0022, INGENIO 2010). Peer Reviewe

    HOLMeS: eHealth in the Big Data and Deep Learning Era

    Data collection and analysis are becoming more and more important in a variety of application domains as novel technologies advance. At the same time, we are experiencing a growing need for human–machine interaction with expert systems, pushing research toward new knowledge representation models and interaction paradigms. In particular, in the last few years, eHealth, which usually indicates all the healthcare practices supported by electronic processing and remote communications, has called for the availability of a smart environment and large computational resources able to offer more and more advanced analytics and new human–computer interaction paradigms. The aim of this paper is to introduce the HOLMeS (health online medical suggestions) system: a big data platform aiming at supporting several eHealth applications. As its main novelty, HOLMeS exploits a machine learning algorithm, deployed on a cluster-computing environment, in order to provide medical suggestions via both chat-bot and web-app modules, especially for prevention purposes. The chat-bot, suitably trained by leveraging a deep learning approach, helps to overcome the limitations of a cold interaction between users and software, exhibiting a more human-like behavior. The obtained results demonstrate the effectiveness of the machine learning algorithms, showing an area under the ROC (receiver operating characteristic) curve (AUC) of 74.65% when some first-level features are used to assess the occurrence of different chronic diseases within specific prevention pathways. When disease-specific features are added, HOLMeS shows an AUC of 86.78%, achieving a greater effectiveness in supporting clinical decisions.

    The Polylith Software Bus

    We describe a system called POLYLITH that helps programmers prepare and interconnect mixed-language software components for execution in heterogeneous environments. POLYLITH's principal benefit is that programmers are free to implement functional requirements separately from their treatment of interfacing requirements; this means that once an application has been developed for use in one execution environment (such as a distributed network) it can be adapted for reuse in other environments (such as a shared-memory multiprocessor) by automatic techniques. This flexibility is provided without loss of performance. We accomplish this by creating a new run-time organization for software. An abstract decoupling agent, called the software toolbus, is introduced between the system components. Heterogeneity in language and architecture is accommodated since program units are prepared to interface directly to the toolbus, not to other program units. Programmers specify application structure in terms of a module interconnection language (MIL); POLYLITH uses this specification to guide packaging (static interfacing activities such as stub generation, source program adaptation, compilation and linking). At run time, an implementation of the toolbus abstraction may assist in message delivery, name service or system reconfiguration. (Also cross-referenced as UMIACS-TR-90-65)