
    Submicron Systems Architecture Project : Semiannual Technical Report

    The Mosaic C is an experimental fine-grain multicomputer based on single-chip nodes. The Mosaic C chip includes 64KB of fast dynamic RAM, a processor, a packet interface, ROM for bootstrap and self-test, and a two-dimensional self-timed router. The chip architecture provides low-overhead and low-latency handling of message packets, and high memory and network bandwidth. Sixty-four Mosaic chips are packaged by tape-automated bonding (TAB) in an 8 x 8 array on circuit boards that can, in turn, be arrayed in two dimensions to build arbitrarily large machines. These 8 x 8 boards are now in prototype production under a subcontract with Hewlett-Packard. We are planning to construct a 16K-node Mosaic C system from 256 of these boards. The suite of Mosaic C hardware also includes host-interface boards and high-speed communication cables. The hardware developments and activities of the past eight months are described in section 2.1. The programming system that we are developing for the Mosaic C is based on the same message-passing, reactive-process computational model that we have used with earlier multicomputers, but the model is implemented for the Mosaic in a way that supports fine-grain concurrency. A process executes only in response to receiving a message, and during execution may send messages, create new processes, and modify its persistent variables before it either exits or becomes dormant in preparation for receiving another message. These computations are expressed in an object-oriented programming notation, a derivative of C++ called C+-. The computational model and the C+- programming notation are described in section 2.2. The Mosaic C runtime system, which is written in C+-, provides automatic process placement and highly distributed management of system resources. The Mosaic C runtime system is described in section 2.3.
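
    The reactive-process model above can be made concrete with a short sketch. The actual Mosaic programming notation is C+-, a C++ derivative; the Python rendering below is purely illustrative and all class and method names are hypothetical. It shows only the control flow: a process runs solely in response to a message, may send messages and update its persistent variables while handling it, and then exits or becomes dormant until the next message.

    from collections import deque


    class Process:
        def __init__(self, system):
            self.system = system       # handle used to send messages or spawn processes
            self.alive = True          # becomes False once the process exits

        def receive(self, message):
            raise NotImplementedError  # each process type defines its own reaction


    class Counter(Process):
        # Persistent variable `count` survives between message handlings.
        def __init__(self, system):
            super().__init__(system)
            self.count = 0             # persistent state

        def receive(self, message):
            self.count += 1
            if message == "stop":
                self.alive = False     # exit: no further messages are handled
            # otherwise the process simply becomes dormant again


    class System:
        # Toy scheduler: delivers queued messages, running one handler at a time.
        def __init__(self):
            self.queue = deque()

        def send(self, process, message):
            self.queue.append((process, message))

        def run(self):
            while self.queue:
                process, message = self.queue.popleft()
                if process.alive:
                    process.receive(message)


    if __name__ == "__main__":
        system = System()
        counter = Counter(system)
        for msg in ("tick", "tick", "stop", "tick"):
            system.send(counter, msg)
        system.run()
        print(counter.count)           # 3: the message after "stop" is ignored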

    Single system image: A survey

    Single system image (SSI) is a computing paradigm in which a number of distributed computing resources are aggregated and presented via an interface that maintains the illusion of interaction with a single system. This approach encompasses decades of research using a broad variety of techniques at varying levels of abstraction, from custom hardware and distributed hypervisors to specialized operating system kernels and user-level tools. Existing classification schemes for SSI technologies are reviewed, and an updated classification scheme is proposed. A survey of implementation techniques is provided along with relevant examples. Notable deployments are examined and insights gained from hands-on experience are summarized. Issues affecting the adoption of kernel-level SSI are identified and discussed in the context of the technology adoption literature.

    Reliability impacts of increased wind generation in the Australian national electricity grid


    Integrating multiple clusters for compute-intensive applications

    Multicluster grids provide one promising solution to satisfying the growing computational demands of compute-intensive applications. However, it is challenging to seamlessly integrate all participating clusters in different domains into a single virtual computational platform. In order to fully utilize the capabilities of multicluster grids, computer scientists need to deal with the issue of joining together participating autonomic systems practically and efficiently to execute grid-enabled applications. Driven by several compute-intensive applications, this thesis develops a multicluster grid management toolkit called Pelecanus to bridge the gap between users' needs and the system's heterogeneity. Application scientists will be able to conduct very large-scale execution across multiple clusters with transparent QoS assurance. A novel model called DA-TC (Dynamic Assignment with Task Containers) is developed and integrated into Pelecanus. This model uses the concept of a task container to decouple resource allocation from resource binding. It employs static load balancing for task container distribution and dynamic load balancing for task assignment. In this manner, the slowest resources become useful contributors rather than bottlenecks. A cluster abstraction is implemented, which not only provides cluster information for the DA-TC execution model but can also be used as a standalone toolkit to monitor and evaluate a cluster's functionality and performance. The performance of the proposed DA-TC model is evaluated both theoretically and experimentally. Results demonstrate the importance of reducing queuing time in decreasing the total turnaround time of an application. Experiments were conducted to understand the performance of various aspects of the DA-TC model; they showed that the model can significantly reduce turnaround time and increase resource utilization for the targeted application scenarios. Four applications are implemented as case studies to determine the applicability of the DA-TC model. In each case the turnaround time is greatly reduced, which demonstrates that the DA-TC model is effective in assisting application scientists in conducting their research. In addition, virtual resources were integrated into the DA-TC model for application execution. Experiments show that the execution model proposed in this thesis can work seamlessly with multiple hybrid grid/cloud resources to achieve reduced turnaround time.
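
    A minimal sketch of the task-container idea behind DA-TC may help, under simplifying assumptions: the function and parameter names below are hypothetical, and cluster behaviour is reduced to a per-task service time. Containers are placed on clusters statically, while tasks are bound dynamically to whichever container frees up first, so slower clusters keep contributing without becoming bottlenecks.

    import heapq
    import itertools


    def da_tc_schedule(task_times, containers_per_cluster, cluster_speed):
        # task_times: nominal cost of each task
        # containers_per_cluster: how many containers each cluster hosts (static placement)
        # cluster_speed: relative speed of each cluster (1.0 = nominal)
        # Returns the makespan of the simulated run.
        free_at = []                  # (time container becomes free, container id, cluster id)
        ids = itertools.count()
        for cluster, n in enumerate(containers_per_cluster):
            for _ in range(n):
                heapq.heappush(free_at, (0.0, next(ids), cluster))

        makespan = 0.0
        for cost in task_times:       # dynamic binding: next task goes to the first free container
            t_free, cid, cluster = heapq.heappop(free_at)
            finish = t_free + cost / cluster_speed[cluster]
            makespan = max(makespan, finish)
            heapq.heappush(free_at, (finish, cid, cluster))
        return makespan


    if __name__ == "__main__":
        tasks = [3.0] * 20
        print(da_tc_schedule(tasks,
                             containers_per_cluster=[4, 2],
                             cluster_speed=[1.0, 0.5]))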

    Reliability of power systems with climate change effects on PV and wind power generation

    Concerns over global climate change have led utilities to reduce greenhouse gas (GHG) emissions by decarbonising the power sector. The accelerating rate of climate change is likely to expose a decarbonised power system to climate-related stresses. In particular, photovoltaic (PV) and wind power generation systems comprise a significant share of the power grid, are potentially vulnerable to climate change, and may therefore affect the reliability of the power systems into which they are integrated. Typical reliability assessments do not consider climate effects and related stresses on PV or wind power generating systems or on their components. This thesis therefore investigates and addresses the challenges of assessing the reliability of a power grid under the interaction of climate change and renewable power generation systems. As part of the investigation, the thesis proposes a novel systematic framework to assess the availability of PV system components under future changes in climate. The framework quantifies climate-related stresses across the hierarchical levels of a PV system: component, subsystem, PV system, and grid. It combines multiple elements, including thermal stress, the bathtub curve, and ageing and degradation levels, and operates on a Markov chain embedded Monte Carlo simulation. The uniqueness of the framework is its ability to identify the critical components in a PV system that lead to climate-associated failures. The thesis also proposes a comprehensive framework to assess the reliability of a PV- and wind-integrated power system accounting for climate change impacts under diverse GHG emission scenarios. Uncertainties in future climate scenarios were captured by an advanced stochastic model that uses a likelihood-based Markov chain method to generate future climate scenarios. The proposed model is integrated into the reliability assessment framework to assess realistic impacts on power system reliability. The investigations suggest that the impacts of climate change on PV and wind power generation systems are real and that, in quantitative terms, PV systems are more vulnerable to climate change than wind power generating systems. These impacts could be mitigated by quantifying the change in impacts and then systematically replacing vulnerable subsystem components before their end of life. Further investigations suggest that IGBTs and capacitors are the key components most sensitive to the thermal stresses of climate change, with considerable impacts on their availability and, in turn, on the reliability of the power system. Further assessments also revealed that the impacts on power system reliability due to climate change effects on PV and wind power generation are not uniform over the long run, which further emphasises the need for a quantitative, system-level assessment to expose the true impacts of climate change on PV and wind power generation systems and on overall power system reliability. The thesis provides a solid foundation of the frameworks required for this quantitative assessment.
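
    The Markov chain embedded Monte Carlo idea behind the component-availability framework can be illustrated with a small sketch. All rates and the thermal-stress multiplier below are invented for illustration, not values from the thesis: a two-state (up/down) component is simulated over many years, availability is estimated as the fraction of time spent up, and a stress factor stands in for climate-driven acceleration of the failure rate.

    import random


    def simulate_availability(failure_rate, repair_rate, stress_factor,
                              horizon_hours, n_runs, seed=0):
        # Two-state (up/down) continuous-time Markov chain simulated by sampling
        # exponential sojourn times; stress_factor > 1 models a climate-driven
        # increase in the effective failure rate.
        rng = random.Random(seed)
        effective_failure = failure_rate * stress_factor
        total_up_fraction = 0.0
        for _ in range(n_runs):
            t, up_time, state = 0.0, 0.0, "up"
            while t < horizon_hours:
                rate = effective_failure if state == "up" else repair_rate
                sojourn = min(rng.expovariate(rate), horizon_hours - t)
                if state == "up":
                    up_time += sojourn
                    state = "down"
                else:
                    state = "up"
                t += sojourn
            total_up_fraction += up_time / horizon_hours
        return total_up_fraction / n_runs


    if __name__ == "__main__":
        base = simulate_availability(1e-4, 1e-2, 1.0, 8760 * 20, 200)
        stressed = simulate_availability(1e-4, 1e-2, 1.5, 8760 * 20, 200)
        print(f"availability: baseline ~ {base:.4f}, thermally stressed ~ {stressed:.4f}")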

    Modeling and optimization of high-performance many-core systems for energy-efficient and reliable computing

    Thesis (Ph.D.)--Boston University. Many-core systems, ranging from small-scale many-core processors to large-scale high performance computing (HPC) data centers, have become the main trend in computing system design owing to their potential to deliver higher throughput per watt. However, power densities and temperatures increase following the growth in performance capacity, and bring major challenges in energy efficiency, cooling costs, and reliability. These challenges require a joint assessment of performance, power, and temperature tradeoffs as well as the design of runtime optimization techniques that monitor and manage the interplay among them. This thesis proposes novel modeling and runtime management techniques that evaluate and optimize the performance, energy, and reliability of many-core systems. We first address the energy and thermal challenges in 3D-stacked many-core processors. 3D processors with stacked DRAM have the potential to dramatically improve performance owing to lower memory access latency and higher bandwidth. However, the performance increase may cause 3D systems to exceed their power budgets or create thermal hot spots. In order to provide an accurate analysis and enable the design of efficient management policies, this thesis introduces a simulation framework to jointly analyze performance, power, and temperature for 3D systems. We then propose a runtime optimization policy that maximizes system performance by characterizing the application behavior and predicting the operating points that satisfy the power and thermal constraints. Our policy reduces the energy-delay product (EDP) by up to 61.9% compared to existing strategies. Performance, cooling energy, and reliability are also critical aspects in HPC data centers. In addition to causing reliability degradation, high temperatures increase the required cooling energy. Communication cost, on the other hand, has a significant impact on system performance in HPC data centers. This thesis proposes a topology-aware technique that maximizes system reliability by selecting between workload clustering and balancing. Our policy improves system reliability by up to 123.3% compared to existing temperature balancing approaches. We also introduce a job allocation methodology to simultaneously optimize the communication cost and the cooling energy in a data center. Our policy reduces the cooling cost by 40% compared to cooling-aware and performance-aware policies, while achieving performance comparable to the performance-aware policy.
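
    The operating-point selection step of the 3D-system policy can be sketched as a constrained search: pick the core count and frequency that maximize predicted performance while respecting power and temperature budgets. The performance, power, and thermal models below are toy placeholders rather than the models developed in the thesis, and all names are illustrative.

    from itertools import product


    def predict(cores, freq_ghz, scalability):
        # Return (performance, power in W, temperature in C) under crude assumed models.
        perf = scalability(cores) * freq_ghz           # relative throughput
        power = cores * (0.5 + 1.8 * freq_ghz ** 2)    # per-core static + dynamic power
        temp = 45.0 + 0.35 * power                     # simple linear thermal model
        return perf, power, temp


    def choose_operating_point(power_budget_w, temp_limit_c, scalability,
                               core_options=(2, 4, 8, 16),
                               freq_options=(1.0, 1.5, 2.0, 2.5)):
        # Exhaustive search over candidate operating points; keep the best
        # predicted performance that satisfies both constraints.
        best, best_perf = None, -1.0
        for cores, freq in product(core_options, freq_options):
            perf, power, temp = predict(cores, freq, scalability)
            if power <= power_budget_w and temp <= temp_limit_c and perf > best_perf:
                best, best_perf = (cores, freq), perf
        return best, best_perf


    if __name__ == "__main__":
        # Application that scales well up to 8 cores, then saturates.
        scalability = lambda n: min(n, 8) + 0.1 * max(0, n - 8)
        print(choose_operating_point(power_budget_w=60.0, temp_limit_c=75.0,
                                     scalability=scalability))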

    Dynamic load balancing strategies in heterogeneous distributed system

    Distributed heterogeneous computing is being widely applied to a variety of large computational problems. Such a computational environment consists of multiple heterogeneous computing modules that interact with each other to solve the problem. Dynamic load balancing in a distributed computing system is desirable because it is an important key to establishing dependability in a Heterogeneous Distributed Computing System (HDCS). The load balancing problem is an optimization problem with an exponential solution space; its complexity increases with the size of an HDCS and it becomes difficult to solve effectively. Solutions to this intractable problem are discussed under different algorithmic paradigms. The load submitted to an HDCS is assumed to be in the form of tasks. Dynamic allocation of n independent tasks to m computing nodes in a heterogeneous distributed computing system is possible through centralized or decentralized control. In the centralized approach, we formulate the load balancing problem, considering task and machine heterogeneity, as a linear programming problem that minimizes the makespan, the time by which all tasks complete execution. The load balancing problem in an HDCS aims to maintain a balanced allocation of tasks while using the computational resources. The system state changes with time as tasks arrive from the users; therefore, the heterogeneous distributed system is modeled as an M/M/m queue. The task model is represented either as a consistent or an inconsistent expected time to compute (ETC) matrix. A batch-mode heuristic has been used to design dynamic load balancing algorithms for heterogeneous distributed computing systems with four different types of machine heterogeneity. A number of experiments have been conducted to study the performance of the load balancing algorithms with three different arrival rates for the tasks. Better performance of the algorithms is observed with increasing heterogeneity in the HDCS. A new codification scheme suitable for simulated annealing and genetic algorithms has been introduced to design dynamic load balancing algorithms for HDCS. These stochastic iterative load balancing algorithms use a sliding-window technique to select a batch of tasks and allocate them to the computing nodes in the HDCS. The proposed dynamic genetic-algorithm-based load balancer has been found to be effective, especially in the case of a large number of tasks.
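
    As an illustration of the batch-mode mapping step over an expected time to compute (ETC) matrix, the sketch below applies the standard min-min heuristic; this is a well-known batch-mode heuristic chosen for illustration and is not claimed to be the exact heuristic designed in the thesis.

    def min_min(etc, machine_ready):
        # etc[i][j]: expected time to compute task i on machine j.
        # machine_ready[j]: time at which machine j becomes free.
        # Returns a task -> machine mapping and the resulting makespan.
        unassigned = set(range(len(etc)))
        ready = list(machine_ready)
        mapping = {}
        while unassigned:
            # For each unassigned task, find its minimum completion time, then
            # commit the task whose minimum completion time is smallest overall.
            best_task, best_machine, best_ct = None, None, float("inf")
            for i in unassigned:
                for j, r in enumerate(ready):
                    ct = r + etc[i][j]
                    if ct < best_ct:
                        best_task, best_machine, best_ct = i, j, ct
            mapping[best_task] = best_machine
            ready[best_machine] = best_ct
            unassigned.remove(best_task)
        return mapping, max(ready)


    if __name__ == "__main__":
        etc = [[4, 8], [3, 2], [6, 5], [2, 9]]   # 4 tasks on 2 heterogeneous machines
        print(min_min(etc, machine_ready=[0.0, 0.0]))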