52 research outputs found

    Customising compilers for customisable processors

    The automatic generation of instruction set extensions to provide application-specific acceleration for embedded processors has been a productive area of research in recent years. There have been incremental improvements in the quality of the algorithms that discover and select which instructions to add to a processor. The use of automatic algorithms, however, results in instructions which are radically different from those found in conventional, human-designed, RISC or CISC ISAs. This has resulted in a gap between the hardware’s capabilities and the compiler’s ability to exploit them. This thesis proposes and investigates the use of a high-level compiler pass that uses graph-subgraph isomorphism checking to exploit these complex instructions. Operating in a separate pass permits techniques to be applied that are uniquely suited for mapping complex instructions, but unsuitable for conventional instruction selection. The existing, mature compiler back-end can then handle the remainder of the compilation. With this method, the high-level pass was able to use 1965 different automatically produced instructions to obtain an initial average speed-up of 1.11x over 179 benchmarks evaluated on a hardware-verified cycle-accurate simulator. This result was improved following an investigation of how the produced instructions were being used by the compiler. It was established that the models the automatic tools were using to develop instructions did not take account of how well the compiler could realistically use them. Adding additional parameters to the search heuristic to account for compiler issues increased the speed-up from 1.11x to 1.24x. An alternative approach using a re-designed hardware interface was also investigated; this achieved a speed-up of 1.26x while reducing hardware and compiler complexity. A complementary, high-level method of exploiting dual memory banks was created to increase memory bandwidth to accommodate the increased data-processing bandwidth provided by extension instructions. Finally, the compiler was considered for use in a non-conventional role where, rather than generating code, it is used to apply source-level transformations prior to the generation of extension instructions and thus affect the shape of the instructions that are generated.
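    As an illustrative aside (not the thesis implementation), the core idea of such a high-level pass can be sketched with an off-the-shelf graph-subgraph isomorphism matcher: the data-flow graph of a basic block is searched for occurrences of an extension-instruction pattern, and each match is a candidate site for rewriting. The graphs, the "op" labels and the fused multiply-add pattern below are invented purely for illustration.

        # Hypothetical sketch: find sites where an extension-instruction pattern
        # occurs inside a basic block's data-flow graph, using networkx's
        # subgraph isomorphism machinery.
        import networkx as nx
        from networkx.algorithms import isomorphism

        # Data-flow graph of a basic block: nodes are operations, edges carry values.
        dfg = nx.DiGraph()
        dfg.add_nodes_from([(0, {"op": "load"}), (1, {"op": "mul"}),
                            (2, {"op": "add"}), (3, {"op": "store"})])
        dfg.add_edges_from([(0, 1), (1, 2), (2, 3)])

        # Pattern of a hypothetical fused multiply-add extension instruction.
        pattern = nx.DiGraph()
        pattern.add_nodes_from([("a", {"op": "mul"}), ("b", {"op": "add"})])
        pattern.add_edge("a", "b")

        matcher = isomorphism.DiGraphMatcher(
            dfg, pattern,
            node_match=isomorphism.categorical_node_match("op", None))

        # Each mapping is a candidate site; a real pass would also check operand
        # counts, latencies and register-file constraints before rewriting.
        for mapping in matcher.subgraph_isomorphisms_iter():
            print("candidate match:", mapping)

    Running such a matcher as a separate pass, ahead of conventional instruction selection, is what allows the mature back-end to remain untouched.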

    Run-time management for future MPSoC platforms

    In recent years, we are witnessing the dawning of the Multi-Processor System-on-Chip (MPSoC) era. In essence, this era is triggered by the need to handle more complex applications while reducing the overall cost of embedded (handheld) devices. This cost will mainly be determined by the cost of the hardware platform and the cost of designing applications for that platform. The cost of a hardware platform will partly depend on its production volume. In turn, this means that flexible, (easily) programmable multi-purpose platforms will exhibit a lower cost. A multi-purpose platform not only requires flexibility, but should also combine high performance with low power consumption. To this end, MPSoC devices integrate computer architectural properties of various computing domains. Just like large-scale parallel and distributed systems, they contain multiple heterogeneous processing elements interconnected by a scalable, network-like structure. This helps in achieving scalable high performance. As in most mobile or portable embedded systems, there is a need for low-power operation and real-time behavior. The cost of designing applications is equally important. Indeed, the actual value of future MPSoC devices is not contained within the embedded multiprocessor IC, but in their capability to provide the user of the device with a range of services and experiences. So from an application viewpoint, MPSoCs are designed to efficiently process multimedia content in applications like video players, video conferencing, 3D gaming, augmented reality, etc. Such applications typically require a lot of processing power and a significant amount of memory. To keep up with ever-evolving user needs and with new application standards appearing at a fast pace, MPSoC platforms need to be easily programmable. Application scalability, i.e. the ability to use just enough platform resources according to the user requirements and with respect to the device capabilities, is also an important factor. Hence scalability, flexibility, real-time behavior, high performance, low power consumption and, finally, programmability are key components in realizing the success of MPSoC platforms. The run-time manager is logically located between the application layer and the platform layer. It has a crucial role in realizing these MPSoC requirements. As it abstracts the platform hardware, it improves platform programmability. By deciding on resource assignment at run-time, based on the performance requirements of the user, the needs of the application and the capabilities of the platform, it contributes to flexibility, scalability and low-power operation. As it arbitrates between different applications, it enables real-time behavior. This thesis details the key components of such an MPSoC run-time manager and provides a proof-of-concept implementation. These key components include application quality management algorithms linked to MPSoC resource management mechanisms and policies, adapted to the provided MPSoC platform services. First, we describe the role, the responsibilities and the boundary conditions of an MPSoC run-time manager in a generic way. This includes a definition of the multiprocessor run-time management design space, a description of the run-time manager design trade-offs and a brief discussion of how these trade-offs affect the key MPSoC requirements.
    This design space definition and the trade-offs are illustrated based on ongoing research and on existing commercial and academic multiprocessor run-time management solutions. Subsequently, we introduce a fast and efficient resource allocation heuristic that considers FPGA fabric properties such as fragmentation. In addition, this thesis introduces a novel task assignment algorithm for handling soft IP cores, denoted hierarchical configuration. Hierarchical configuration managed by the run-time manager enables easier application design and increases the run-time spatial mapping freedom. In turn, this improves the performance of the resource assignment algorithm. Furthermore, we introduce run-time task migration components. We detail a new run-time task migration policy closely coupled to the run-time resource assignment algorithm. In addition to detailing a design-environment-supported mechanism that enables moving tasks between an ISP and fine-grained reconfigurable hardware, we also propose two novel task migration mechanisms tailored to the Network-on-Chip environment. Finally, we propose a novel mechanism for task migration initiation, based on reusing debug registers in modern embedded microprocessors. We propose a reactive on-chip communication management mechanism. We show that by exploiting an injection rate control mechanism it is possible to provide a communication management system capable of delivering soft (reactive) QoS in a NoC. We introduce a novel, platform-independent run-time algorithm to perform quality management, i.e. to select an application quality operating point at run-time based on the user requirements and the available platform resources, as reported by the resource manager. This contribution also proposes a novel way to manage the interaction between the quality manager and the resource manager. In order to have a realistic, reproducible and flexible run-time manager testbench with respect to applications with multiple quality levels and implementation trade-offs, we have created an input data generation tool denoted Pareto Surfaces For Free (PSFF). The PSFF tool is, to the best of our knowledge, the first tool that generates multiple realistic application operating points either based on profiling information of a real-life application or based on a designer-controlled random generator. Finally, we provide a proof-of-concept demonstrator that combines these concepts and shows how these mechanisms and policies can operate in real-life situations. In addition, we show that the proposed solutions can be integrated into existing platform operating systems.
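    To make the quality-manager/resource-manager interaction concrete, the sketch below shows one naive way such a decision could be made: each application exposes a set of operating points (quality versus resource demand) and the manager greedily keeps the best-quality point that still fits the remaining platform budget. The OperatingPoint fields, the greedy policy and the numbers are illustrative assumptions, not the algorithm proposed in the thesis.

        # Hypothetical sketch of run-time quality management: pick, per application,
        # the highest-quality operating point that still fits the remaining budget.
        from dataclasses import dataclass

        @dataclass
        class OperatingPoint:      # invented fields, for illustration only
            quality: float         # user-perceived quality of this point
            proc_units: int        # processing elements required
            memory_kb: int         # memory required

        def select_points(apps, proc_budget, mem_budget):
            chosen = {}
            for name, points in apps.items():
                # Try points from best to worst quality; keep the first that fits.
                for p in sorted(points, key=lambda p: p.quality, reverse=True):
                    if p.proc_units <= proc_budget and p.memory_kb <= mem_budget:
                        chosen[name] = p
                        proc_budget -= p.proc_units
                        mem_budget -= p.memory_kb
                        break
            return chosen

        apps = {
            "video": [OperatingPoint(1.0, 3, 512), OperatingPoint(0.6, 1, 128)],
            "game":  [OperatingPoint(0.9, 2, 256), OperatingPoint(0.5, 1, 64)],
        }
        print(select_points(apps, proc_budget=4, mem_budget=640))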

    Particle Swarm Optimization

    Particle swarm optimization (PSO) is a population-based stochastic optimization technique inspired by the social behavior of bird flocking or fish schooling. PSO shares many similarities with evolutionary computation techniques such as Genetic Algorithms (GA). The system is initialized with a population of random solutions and searches for optima by updating generations. However, unlike GA, PSO has no evolution operators such as crossover and mutation. In PSO, the potential solutions, called particles, fly through the problem space by following the current optimum particles. This book represents the contributions of top researchers in this field and will serve as a valuable tool for professionals in this interdisciplinary field.
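    A minimal sketch of the canonical PSO update rules described above follows; the inertia and acceleration coefficients and the sphere test function are illustrative defaults, not values endorsed by the book.

        # Minimal particle swarm optimizer: velocities are pulled toward each
        # particle's personal best and the swarm's global best; no crossover or
        # mutation operators are used.
        import numpy as np

        def pso(objective, dim=2, n_particles=30, iters=200, w=0.7, c1=1.5, c2=1.5):
            rng = np.random.default_rng(0)
            pos = rng.uniform(-5, 5, (n_particles, dim))        # particle positions
            vel = np.zeros_like(pos)                            # particle velocities
            pbest = pos.copy()                                  # personal bests
            pbest_val = np.apply_along_axis(objective, 1, pos)
            gbest = pbest[np.argmin(pbest_val)]                 # global best
            for _ in range(iters):
                r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
                vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
                pos = pos + vel
                vals = np.apply_along_axis(objective, 1, pos)
                improved = vals < pbest_val
                pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
                gbest = pbest[np.argmin(pbest_val)]
            return gbest, float(pbest_val.min())

        # Example: minimise the sphere function, whose optimum is at the origin.
        print(pso(lambda x: float(np.sum(x ** 2))))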

    Use of Inferential Statistics to Design Effective Communication Protocols for Wireless Sensor Networks

    This thesis explores the issues and techniques associated with employing the principles of inferential statistics to design effective Medium Access Control (MAC), routing and duty cycle management strategies for multihop Wireless Sensor Networks (WSNs). The main objectives of these protocols are to maximise the throughput of the network, to prolong the lifetime of nodes and to reduce the end-to-end delay of packets over a general network scenario, without particular consideration of specific topology configurations, traffic patterns or routing policies. WSNs represent one of the leading-edge technologies that have received substantial research effort due to their prominent roles in many applications. However, designing effective communication protocols for WSNs is particularly challenging due to the scarce resources of these networks and the requirement for large-scale deployment. The MAC, routing and duty cycle management protocols are amongst the important strategies required to ensure correct operation of WSNs. This thesis makes use of the field of inferential statistics to design these protocols; inferential statistics was selected as it provides a rich design space with powerful approaches and methods. The MAC protocol proposed in this thesis exploits the statistical characteristics of the Gamma distribution to enable each node to adjust its contention parameters dynamically based on its inference of the channel occupancy. This technique reduces the service time of packets and leverages the throughput by improving channel utilisation. Reducing the service time minimises the energy consumed in contention to access the channel, which in turn prolongs the lifetime of nodes. The proposed duty cycle management scheme uses non-parametric Bayesian inference to enable each node to determine the best times and durations for its sleeping periods without imposing overheads on the network. Hence the lifetime of each node is prolonged by mitigating the amount of energy wasted in overhearing and idle listening. Prolonging the lifetime of nodes increases the throughput of the network and reduces the end-to-end delay, as it allows nodes to route their packets over optimal paths for longer periods. The proposed routing protocol uses a state-of-the-art inference technique dubbed spatial reasoning that enables each node to figure out the spatial relationships between nodes without overwhelming the network with control packets. As a result, the end-to-end delay is reduced while the throughput and lifetime are increased. Besides the proposed protocols, this thesis utilises the analytical aspects of statistics to develop rigorous analytical models that can accurately predict queuing and medium access delay and energy consumption over multihop networks. Moreover, this thesis provides a broader perspective on the design of communication protocols for WSNs by casting the operations of these networks in the domains of the artificial chemistry discipline and the harmony search optimisation algorithm.
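    The flavour of Gamma-based contention adjustment can be illustrated with a small sketch (a generic construction, not the thesis protocol): a node fits a Gamma distribution to the channel-busy durations it has observed and derives its backoff from a quantile of that inferred model, so the contention parameters track the actual channel occupancy.

        # Illustrative only: infer a Gamma model of channel-busy durations and
        # choose a backoff that outlasts the busy period with a chosen probability.
        import numpy as np
        from scipy.stats import gamma

        rng = np.random.default_rng(1)
        # Stand-in for durations a node would measure by carrier sensing.
        busy_durations_ms = rng.gamma(shape=2.0, scale=1.5, size=200)

        # Fit the occupancy model from the observations (location fixed at zero).
        shape, loc, scale = gamma.fit(busy_durations_ms, floc=0)

        def backoff_delay(quantile=0.8):
            # Wait long enough that, with probability `quantile`, the current
            # busy period has ended before the node retries.
            return gamma.ppf(quantile, shape, loc=loc, scale=scale)

        print(f"estimated shape={shape:.2f}, scale={scale:.2f}, "
              f"backoff={backoff_delay():.2f} ms")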

    Combining SOA and BPM Technologies for Cross-System Process Automation

    This paper summarizes the results of an industry case study that introduced a cross-system business process automation solution based on a combination of SOA and BPM standard technologies (i.e., BPMN, BPEL, WSDL). Besides discussing major weaknesses of the existing, custom-built solution and comparing them against experiences with the developed prototype, the paper presents a course of action for transforming the current solution into the proposed solution. This includes a general approach, consisting of four distinct steps, as well as specific action items to be performed at every step. The discussion also covers language and tool support and challenges arising from the transformation.

    A systematic approach for integrated product, materials, and design-process design

    Designers are challenged to manage customer, technology, and socio-economic uncertainty causing dynamic, unquenchable demands on limited resources. In this context, increased concept flexibility, referring to a designer's ability to generate concepts, is crucial. Concept flexibility can be significantly increased through the integrated design of product and material concepts. Hence, the challenge is to leverage knowledge of material structure-property relations that significantly affect system concepts for function-based, systematic design of product and material concepts in an integrated fashion. However, having selected an integrated product and material system concept, managing complexity in embodiment design-processes is important. Facing a complex network of decisions and evolving analysis models, a designer needs the flexibility to systematically generate and evaluate embodiment design-process alternatives. In order to address these challenges and respond to the primary research question of how to increase a designer's concept and design-process flexibility to enhance product creation in the conceptual and early embodiment design phases, the primary hypothesis in this dissertation is embodied as a systematic approach for integrated product, materials and design-process design. The systematic approach consists of two components: i) a function-based, systematic approach to the integrated design of product and material concepts from a systems perspective, and ii) a systematic strategy for design-process generation and selection based on a decision-centric perspective and a value-of-information-based Process Performance Indicator. The systematic approach is validated using the validation-square approach, which consists of theoretical and empirical validation. Empirical validation of the framework is carried out using various examples, including: i) design of a reactive material containment system, and ii) design of an optoelectronic communication system. Ph.D. Committee Chair: Allen, Janet K.; Committee Member: Aidun, Cyrus K.; Committee Member: Klein, Benjamin; Committee Member: McDowell, David L.; Committee Member: Mistree, Farrokh; Committee Member: Yoder, Douglas P.

    Proceedings of the 5th International Workshop on Reconfigurable Communication-centric Systems on Chip 2010 - ReCoSoC'10 - May 17-19, 2010, Karlsruhe, Germany. (KIT Scientific Reports ; 7551)

    ReCoSoC is intended to be an annual meeting to expose and discuss gathered expertise as well as state-of-the-art research around SoC-related topics through plenary invited papers and posters. The workshop aims to provide a prospective view of tomorrow's challenges in the multibillion-transistor era, taking into account the emerging techniques and architectures exploring the synergy between flexible on-chip communication and system reconfigurability.

    Exploiting development to enhance the scalability of hardware evolution.

    Evolutionary algorithms do not scale well to the large, complex circuit design problems typical of the real world. Although techniques based on traditional design decomposition have been proposed to enhance hardware evolution's scalability, they often rely on traditional domain knowledge that may not be appropriate for evolutionary search and might limit evolution's opportunity to innovate. It has been proposed that reliance on such knowledge can be avoided by introducing a model of biological development to the evolutionary algorithm, but this approach has not yet achieved its potential. Prior demonstrations of how development can enhance scalability used toy problems that are not indicative of evolving hardware. Prior attempts to apply development to hardware evolution have rarely been successful and have never explored its effect on scalability in detail. This thesis demonstrates that development can enhance scalability in hardware evolution, primarily through a statistical comparison of hardware evolution's performance with and without development on circuit design problems of various sizes. This is reinforced by proposing and demonstrating three key mechanisms that development uses to enhance scalability: the creation of modules, the reuse of modules, and the discovery of design abstractions. The thesis includes several minor contributions: hardware is evolved using a common reconfigurable architecture at a lower level of abstraction than reported elsewhere, and it is shown that this can allow evolution to exploit the architecture more efficiently and perhaps search more effectively. The benefits of several features of developmental models are also explored through the biases they impose on the evolutionary search. Features explored include the type of environmental context development uses and the constraints on symmetry and information transmission they impose, genetic operators that may improve the robustness of gene networks, and how development is mapped to hardware. Performance is also compared against contemporary developmental models.
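    As a toy illustration of why a developmental mapping can create and reuse modules (this is not the model used in the thesis), the genotype below is a set of rewrite rules and the phenotype is grown by repeatedly expanding symbols; because one evolved rule can be instantiated many times, a module discovered once is reused throughout the grown circuit.

        # Toy genotype-to-phenotype development: rewrite rules expand a seed
        # symbol into primitive gates, so a single "module" rule is reused.
        genotype = {
            "CIRCUIT": ["ADDER", "ADDER"],   # the ADDER module is reused twice
            "ADDER":   ["XOR", "AND"],       # a module expands into primitive gates
        }
        PRIMITIVES = {"XOR", "AND"}

        def develop(symbol, rules, depth=0, max_depth=8):
            # Recursively grow the phenotype; primitives end the expansion.
            if symbol in PRIMITIVES or depth >= max_depth:
                return [symbol]
            gates = []
            for child in rules.get(symbol, []):
                gates.extend(develop(child, rules, depth + 1, max_depth))
            return gates

        print(develop("CIRCUIT", genotype))  # -> ['XOR', 'AND', 'XOR', 'AND']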

    Design of large polyphase filters in the Quadratic Residue Number System
