242 research outputs found

    An Expressive Language and Efficient Execution System for Software Agents

    Full text link
    Software agents can be used to automate many of the tedious, time-consuming information processing tasks that humans currently have to complete manually. However, to do so, agent plans must be capable of representing the myriad of actions and control flows required to perform those tasks. In addition, since these tasks can require integrating multiple sources of remote information ? typically, a slow, I/O-bound process ? it is desirable to make execution as efficient as possible. To address both of these needs, we present a flexible software agent plan language and a highly parallel execution system that enable the efficient execution of expressive agent plans. The plan language allows complex tasks to be more easily expressed by providing a variety of operators for flexibly processing the data as well as supporting subplans (for modularity) and recursion (for indeterminate looping). The executor is based on a streaming dataflow model of execution to maximize the amount of operator and data parallelism possible at runtime. We have implemented both the language and executor in a system called THESEUS. Our results from testing THESEUS show that streaming dataflow execution can yield significant speedups over both traditional serial (von Neumann) as well as non-streaming dataflow-style execution that existing software and robot agent execution systems currently support. In addition, we show how plans written in the language we present can represent certain types of subtasks that cannot be accomplished using the languages supported by network query engines. Finally, we demonstrate that the increased expressivity of our plan language does not hamper performance; specifically, we show how data can be integrated from multiple remote sources just as efficiently using our architecture as is possible with a state-of-the-art streaming-dataflow network query engine

    Scheduling Optimisations for SPIN to Minimise Buffer Requirements in Synchronous Data Flow

    Get PDF
    Synchronous Data flow (SDF) graphs have a simple and elegant semantics (essentially linear algebra) which makes SDF graphs eminently suitable as a vehicle for studying scheduling optimisations. We extend related work on using SPIN to experiment with scheduling optimisations aimed at minimising buffer requirements.We show that for a benchmark of commonly used case studies the performance of our SPIN based scheduler is comparable to that of state of the art research tools. The key to success is using the semantics of SDF to prove when using (even unsound and/or incomplete) optimisations are justified. The main benefit of our approach lies in gaining deep insight in the optimisations at relatively low cost

    Parameterized Looped Schedules

    Get PDF
    This paper is concerned with the compact representation of execution sequences in terms of efficient looping constructs. Here, by a looping construct we mean a compact way of specifying a finite repetition of a set of execution primitives (“instructions”). Such compaction, which can be viewed as a form of hierarchical run-length encoding, has application in many embedded software contexts, including efficient control generation for Kahn processes, and software synthesis for static dataflow models of computation, such as synchronous dataflow and cyclo-static dataflow. In this paper, we significantly generalize previous models for loop-based code compaction of DSP programs to yield a configurable code compression methodology that exhibits a broad range of achievable trade-offs. Specifically, we formally develop and apply to DSP hardware and software implementation a parameterizable loop scheduling approach with compact format, dynamic reconfigurability, and low overhead decompression

    Pregelix: Big(ger) Graph Analytics on A Dataflow Engine

    Full text link
    There is a growing need for distributed graph processing systems that are capable of gracefully scaling to very large graph datasets. Unfortunately, this challenge has not been easily met due to the intense memory pressure imposed by process-centric, message passing designs that many graph processing systems follow. Pregelix is a new open source distributed graph processing system that is based on an iterative dataflow design that is better tuned to handle both in-memory and out-of-core workloads. As such, Pregelix offers improved performance characteristics and scaling properties over current open source systems (e.g., we have seen up to 15x speedup compared to Apache Giraph and up to 35x speedup compared to distributed GraphLab), and makes more effective use of available machine resources to support Big(ger) Graph Analytics

    Integrated Software Synthesis for Signal Processing Applications

    Get PDF
    Signal processing applications usually encounter multi-dimensional real-time performance requirements and restrictions on resources, which makes software implementation complex. Although major advances have been made in embedded processor technology for this application domain -- in particular, in technology for programmable digital signal processors -- traditional compiler techniques applied to such platforms do not generate machine code of desired quality. As a result, low-level, human-driven fine tuning of software implementations is needed, and we are therefore in need of more effective strategies for software implementation for signal processing applications. In this thesis, a number of important memory and performance optimization problems are addressed for translating high-level representations of signal processing applications into embedded software implementations. This investigation centers around signal processing-oriented dataflow models of computation. This form of dataflow provides a coarse grained modeling approach that is well-suited to the signal processing domain and is increasingly supported by commercial and research-oriented tools for design and implementation of signal processing systems. Well-developed dataflow models of signal processing systems expose high-level application structure that can be used by designers and design tools to guide optimization of hardware and software implementations. This thesis advances the suite of techniques available for optimization of software implementations that are derived from the application structure exposed from dataflow representations. In addition, the specialized architecture of programmable digital signal processors is considered jointly with dataflow-based analysis to streamline the optimization process for this important family of embedded processors. The specialized features of programmable digital signal processors that are addressed in this thesis include parallel memory banks to facilitate data parallelism, and signal-processing-oriented addressing modes and address register management capabilities. The problems addressed in this thesis involve several inter-related features, and therefore an integrated approach is required to solve them effectively. This thesis proposes such an integrated approach, and develops the approach through formal problem formulations, in-depth theoretical analysis, and extensive experimentation

    Design methodology for embedded computer vision systems

    Get PDF
    Computer vision has emerged as one of the most popular domains of embedded appli¬cations. Though various new powerful embedded platforms to support such applica¬tions have emerged in recent years, there is a distinct lack of efficient domain-specific synthesis techniques for optimized implementation of such systems. In this thesis, four different aspects that contribute to efficient design and synthesis of such systems are explored: (1) Graph Transformations: Dataflow modeling is widely used in digital signal processing (DSP) systems. However, support for dynamic behavior in such systems exists mainly at the modeling level and there is a lack of optimized synthesis tech¬niques for these models. New transformation techniques for efficient system-on-chip (SoC) design methods are proposed and implemented for cyclo-static dataflow and its parameterized version (parameterized cyclo-static dataflow) -- two powerful models that allow dynamic reconfigurability and phased behavior in DSP systems. (2) Design Space Exploration: The broad range of target platforms along with the complexity of applications provides a vast design space, calling for efficient tools to explore this space and produce effective design choices. A novel architectural level design methodology based on a formalism called multirate synchronization graphs is presented along with methods for performance evaluation. (3) Multiprocessor Communication Interface: Efficient code synthesis for emerg¬ing new parallel architectures is an important and sparsely-explored problem. A widely-encountered problem in this regard is efficient communication between pro¬cessors running different sub-systems. A widely used tool in the domain of general-purpose multiprocessor clusters is MPI (Message Passing Interface). However, this does not scale well for embedded DSP systems. A new, powerful and highly optimized communication interface for multiprocessor signal processing systems is presented in this work that is based on the integration of relevant properties of MPI with dataflow semantics. (4) Parameterized Design Framework for Particle Filters: Particle filter systems constitute an important class of applications used in a wide number of fields. An effi¬cient design and implementation framework for such systems has been implemented based on the observation that a large number of such applications exhibit similar prop¬erties. The key properties of such applications are identified and parameterized appro¬priately to realize different systems that represent useful trade-off points in the space of possible implementations
    corecore