
    Designing a scalable dynamic load-balancing algorithm for pipelined single program multiple data applications on a non-dedicated heterogeneous network of workstations

    Dynamic load balancing strategies have been shown to be the most critical part of an efficient implementation of various applications on large distributed computing systems. The need for dynamic load balancing strategies increases when the underlying hardware is a non-dedicated heterogeneous network of workstations (HNOW). This research focuses on the single program multiple data (SPMD) programming model, as it has been used extensively in parallel programming for its simplicity and its scalability in terms of computational power and memory size.

    This dissertation formally defines and addresses the problem of designing a scalable dynamic load-balancing algorithm for pipelined SPMD applications on a non-dedicated HNOW. In the process, the HNOW parameters, SPMD application characteristics, and load-balancing performance parameters are identified.

    The dissertation presents a taxonomy that categorizes general load-balancing algorithms and a methodology that facilitates creating new algorithms that can harness the HNOW computing power while preserving the scalability of the SPMD application.

    The dissertation devises a new algorithm, DLAH (Dynamic Load-balancing Algorithm for HNOW). DLAH is based on a modified diffusion technique that incorporates the HNOW parameters. An analytical performance bound for the worst-case scenario of the diffusion technique is derived.

    Finally, the dissertation develops and uses an HNOW simulation model to conduct extensive simulations, which were used to validate DLAH and compare its performance to related dynamic algorithms. The simulation results show that DLAH is scalable and performs well for both homogeneous and heterogeneous networks. A detailed sensitivity analysis was conducted to study the effects of key parameters on performance.
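    The abstract does not spell out DLAH itself, but the diffusion technique it builds on can be sketched: each workstation repeatedly exchanges a fraction of its load imbalance with its neighbours, here weighted by relative node speed to mimic heterogeneity. The ring topology, the diffusion coefficient, and all names below are illustrative assumptions, not details from the dissertation.

```python
# Sketch of a diffusion-style load balancer on a ring of workstations.
# `speeds` models heterogeneity: faster nodes should hold more load.
# alpha and the ring topology are illustrative, not taken from DLAH.

def diffusion_step(load, speeds, alpha=0.1):
    """One round: every node trades load with its ring neighbours,
    diffusing a fraction of the pairwise per-speed imbalance."""
    n = len(load)
    new = list(load)
    for i in range(n):
        for j in ((i - 1) % n, (i + 1) % n):
            # Compare load normalised by speed, so fast nodes aim for
            # a proportionally larger share of the absolute load.
            imbalance = load[i] / speeds[i] - load[j] / speeds[j]
            flow = alpha * imbalance * min(speeds[i], speeds[j])
            new[i] -= flow
            new[j] += flow
    return new

loads = [100.0, 0.0, 0.0, 0.0]   # all work starts on node 0
speeds = [1.0, 2.0, 1.0, 2.0]    # heterogeneous node speeds

for _ in range(300):
    loads = diffusion_step(loads, speeds)

# At equilibrium each node's load is proportional to its speed.
print([round(x, 1) for x in loads])   # → [16.7, 33.3, 16.7, 33.3]
```

    Total load is conserved at every exchange, and the fixed point equalises load per unit of speed, so faster nodes converge to proportionally larger shares — the property a heterogeneity-aware diffusion scheme needs.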

    Swift: A modern highly-parallel gravity and smoothed particle hydrodynamics solver for astrophysical and cosmological applications

    Numerical simulations have become one of the key tools used by theorists in all fields of astrophysics and cosmology. The development of modern tools that target the largest existing computing systems and exploit state-of-the-art numerical methods and algorithms is thus crucial. In this paper, we introduce the fully open-source, highly-parallel, versatile, and modular coupled hydrodynamics, gravity, cosmology, and galaxy-formation code Swift. The software package exploits hybrid task-based parallelism, asynchronous communications, and domain-decomposition algorithms based on balancing the workload, rather than the data, to efficiently exploit modern high-performance computing cluster architectures. Gravity is solved using a fast multipole method, optionally coupled to a particle-mesh solver in Fourier space to handle periodic volumes. For gas evolution, multiple modern flavours of Smoothed Particle Hydrodynamics are implemented. Swift also evolves neutrinos using a state-of-the-art particle-based method. Two complementary networks of sub-grid models for galaxy formation, as well as extensions to simulate planetary physics, are also released as part of the code. An extensive set of output options, including snapshots, light-cones, power spectra, and a coupling to structure finders, is also included. We describe the overall code architecture, summarize the consistency and accuracy tests that were performed, and demonstrate the excellent weak-scaling performance of the code using a representative cosmological hydrodynamical problem with ≈300 billion particles. The code is released to the community alongside extensive documentation for both users and developers, a large selection of example test problems, and a suite of tools to aid in the analysis of large simulations run with Swift. Comment: 39 pages, 18 figures, submitted to MNRAS. Code, documentation, and examples available at www.swiftsim.co

    Taming and Leveraging Directionality and Blockage in Millimeter Wave Communications

    To cope with the challenge of high-rate data transmission, millimeter wave (mmWave) is one potential solution. The short wavelength has opened the era of directional mobile communication, and this semi-optical form of communication requires revolutionary thinking. To assist the research and evaluate various algorithms, we built a motion-sensitive mmWave testbed with two degrees of freedom for environmental sensing and general wireless communication.

    The first part of this thesis contains two approaches to maintaining the connection in mmWave mobile communication. The first seeks to solve the beam tracking problem using motion sensors within the mobile device. A tracking algorithm is given and integrated into the tracking protocol. Detailed experiments and numerical simulations compared several compensation schemes against an optical benchmark and demonstrated the efficiency of the overhead reduction. The second strategy for mitigating intermittent connections during roaming is multi-connectivity. Taking advantage of the properties of rateless erasure codes, a fountain-code-based multi-connectivity mechanism is proposed to increase link reliability with a simplified backhaul mechanism. Simulation with a multi-link channel record demonstrates the efficiency and robustness of the system design.

    The second topic in this thesis explores various techniques for blockage mitigation. A fast heartbeat-like channel with heavy blockage loss, caused by propeller blockage, was identified in an mmWave Unmanned Aerial Vehicle (UAV) communication experiment. These blockage patterns are detected through Holm's procedure, treated as a problem of multi-time-series edge detection. To reduce the blockage effect, an adaptive modulation and coding scheme is designed; the simulation results show that it can greatly improve throughput given appropriately predicted patterns.

    Last but not least, the blockage of directional communication also appears as a blessing, because the geometrical information and blockage events of ancillary signal paths can be used to predict the blockage timing of the current transmission path. A geometrical model and prediction algorithm are derived to resolve the blockage time and initiate active handovers. An experiment provides solid proof of the multi-path properties, and numerical simulation demonstrates the efficiency of the proposed algorithm.
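    The fountain-code multi-connectivity idea can be illustrated with a toy systematic rateless code: each link carries a different subset of coded symbols, and the receiver recovers the source blocks from whatever survives across both links. The encoder, the peeling decoder, and the even/odd link split below are illustrative assumptions, not the thesis's actual mechanism.

```python
import random

def encode(blocks, n_repair, seed=7):
    """Systematic rateless encoding: the first len(blocks) symbols are
    the source blocks themselves; repair symbols are XORs of random
    subsets (a toy stand-in for a tuned fountain-code distribution)."""
    rng = random.Random(seed)
    k = len(blocks)
    symbols = [({i}, blocks[i]) for i in range(k)]
    for _ in range(n_repair):
        idx = set(rng.sample(range(k), rng.randint(1, k)))
        val = 0
        for i in idx:
            val ^= blocks[i]
        symbols.append((idx, val))
    return symbols

def peel_decode(received, k):
    """Peeling decoder: resolve degree-1 symbols, substitute the
    recovered blocks into the remaining symbols, and repeat."""
    pending = [[set(idx), val] for idx, val in received]
    known = {}
    progress = True
    while progress and len(known) < k:
        progress = False
        for sym in pending:
            idx, val = sym
            for i in [j for j in idx if j in known]:
                idx.discard(i)
                val ^= known[i]
            sym[1] = val
            if len(idx) == 1:
                (i,) = idx
                if i not in known:
                    known[i] = val
                    progress = True
    return [known.get(i) for i in range(k)]

blocks = [0x1A, 0x2B, 0x3C, 0x4D]       # four source blocks
symbols = encode(blocks, n_repair=6)

# Two mmWave links, each intermittently blocked: link A delivers the
# even-indexed symbols, link B the odd-indexed ones.
link_a = symbols[0::2]
link_b = symbols[1::2]

decoded = peel_decode(link_a + link_b, k=len(blocks))
print(decoded == blocks)   # the union of both links recovers every block
```

    Either link alone may leave blocks unresolved; the union across links decodes everything. That is the reliability gain multi-connectivity buys against per-link blockage, without the sender needing per-link acknowledgements.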

    Scheduling Heterogeneous HPC Applications in Next-Generation Exascale Systems

    Next-generation HPC applications will increasingly time-share system resources with emerging workloads such as in-situ analytics, resilience tasks, runtime adaptation services, and power management activities. HPC systems must carefully schedule these co-located codes in order to reduce their impact on application performance. Among the techniques traditionally used to mitigate the performance effects of time-shared systems is gang scheduling. This approach, however, leverages global synchronization and time-agreement mechanisms that will become hard to support as systems increase in size. Alternative performance-interference mitigation approaches must be explored for future HPC systems.

    This dissertation evaluates the impacts of workload concurrency in future HPC systems. It uses simulation and modeling techniques to study the performance impacts of existing and emerging interference sources on a selection of HPC benchmarks, mini-applications, and applications. It also quantifies the costs and benefits of different approaches to scheduling co-located workloads, studies performance-interference mitigation solutions based on gang scheduling, and examines their synchronization requirements. To do so, this dissertation presents and leverages a new Extreme Value Theory-based model to characterize interference sources and investigate their impact on Bulk Synchronous Parallel (BSP) applications. It demonstrates how this model can be used to analyze the interference-attenuation effects of alternative fine-grained OS scheduling approaches based on periodic real-time schedulers. This analysis can, in turn, guide the design of those mitigation techniques by providing tools to understand the tradeoffs of selecting scheduling parameters.
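    Why barrier-synchronised applications are so sensitive to interference, and why extreme-value statistics are the natural tool, can be seen in a few lines: a BSP superstep finishes only when the slowest rank does, so the iteration time is a maximum over ranks, and even rare per-rank interference events dominate at scale. The parameter values below are illustrative, not the dissertation's model.

```python
import random

def bsp_iteration_time(n_ranks, work=1.0, p_interf=0.01,
                       interf=0.5, rng=None):
    """Time of one BSP superstep: every rank must reach the barrier,
    so the iteration takes the MAX over ranks of (work + occasional
    interference delay). All parameter values are illustrative."""
    rng = rng or random
    return max(work + (interf if rng.random() < p_interf else 0.0)
               for _ in range(n_ranks))

rng = random.Random(42)
iters = 1000
for n in (1, 16, 256, 4096):
    mean = sum(bsp_iteration_time(n, rng=rng) for _ in range(iters)) / iters
    print(f"{n:5d} ranks: mean superstep {mean:.3f}")
```

    With a 1% per-rank interference probability, a single rank almost never pays the delay, but at 4096 ranks some rank is hit in nearly every superstep, so the whole machine runs at the interfered speed — the amplification that extreme-value modeling captures and that fine-grained OS scheduling tries to attenuate.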

    Use It or Lose It: Proactive, Deterministic Longevity in Future Chip Multiprocessors

    Ever since VLSI process technology crossed the sub-micron threshold, there has been increased interest in the design of fault-tolerant systems to mitigate transistor wearout. Hot Carrier Injection (HCI) and Negative Bias Temperature Instability (NBTI) are two prominent usage-based transistor degradation mechanisms in deep sub-micron process technologies. This wearout of transistors can lead to timing violations along critical paths, which will eventually lead to permanent failures of the chip. While many studies concentrate on decreasing wearout in a single core, the failure of an individual core need not be catastrophic in the context of Chip Multi-Processors (CMPs). However, a failure in the interconnect of these CMPs can cause the failure of the entire chip, as it could lead to protocol-level deadlocks or even partition away vital components such as the memory controller or other critical I/O. Analysis of the HCI and NBTI stresses caused by real workloads on the interconnect microarchitecture shows that wearout in the CMP on-chip interconnect is correlated with a lack of load observed in the network-on-chip routers. It has been shown that exercising the wearout-sensitive components of routers under low load with random inputs can decelerate NBTI wearout. In this work, we propose a novel deterministic approach for generating appropriate exercise-mode data to maximize the lifetime improvement while ensuring design parameter targets are met. The proposed design yields a ~2300x decrease in the rate of CMP wear due to NBTI, compared to the ~28x decrease shown by previous work.

    Technical approaches for measurement of human errors

    Human error is a significant contributing factor in a very high proportion of civil transport, general aviation, and rotorcraft accidents. The technical details of a variety of proven approaches for the measurement of human errors in the context of the national airspace system are presented. Unobtrusive measurements suitable for cockpit operations and procedures in part- or full-mission simulation are emphasized. Procedure-centered, system-performance-centered, and human-operator-centered measurements are discussed as they apply to the manual control, communication, supervisory, and monitoring tasks relevant to aviation operations.

    Spectrum measurement, sensing, analysis and simulation in the context of cognitive radio

    The radio frequency (RF) spectrum is a scarce natural resource, currently regulated locally by national agencies. Spectrum has been assigned to different services, and it is very difficult for emerging wireless technologies to gain access due to rigid spectrum policy and heavy opportunity cost. Current spectrum management by licensing causes artificial spectrum scarcity, and spectrum monitoring shows that many frequencies and times are unused. Dynamic spectrum access (DSA) is a potential solution to low spectrum efficiency. In DSA, an unlicensed user opportunistically uses vacant licensed spectrum with the help of cognitive radio, the key enabling technology for DSA. In a cognitive radio system, an unlicensed Secondary User (SU) identifies vacant licensed spectrum allocated to a Primary User (PU) and uses it without harmful interference to the PU. Cognitive radio increases spectrum usage efficiency while protecting legacy licensed systems.

    The purpose of this thesis is to bring together a group of CR concepts and explore how to make the transition from conventional radio to cognitive radio. The specific goals are, firstly, to measure the radio spectrum in order to understand current spectrum usage in the Humber region, UK, in the context of cognitive radio; secondly, to characterise the performance of cyclostationary feature detectors through theoretical analysis, hardware implementation, and real-time performance measurements; and thirdly, to mitigate the degradation due to multipath fading and shadowing through wideband cooperative sensing, using an adaptive sensing technique and multi-bit soft decisions, which it is believed will introduce more spectral opportunities over wider frequency ranges and achieve higher opportunistic aggregate throughput.

    Understanding spectrum usage is the first step toward the future deployment of cognitive radio systems. Several spectrum usage measurement campaigns have been performed, mainly in the USA and Europe, and these studies show locality and time dependence. In the first part of this thesis, a spectrum usage measurement campaign in the Humber region is reported; spectrum usage patterns are identified and noise is characterised. A significant amount of spectrum is shown to be underutilized and available for secondary use.

    The second part addresses the question: how can you tell whether a spectrum channel is being used? Two spectrum sensing techniques are evaluated: energy detection and cyclostationary feature detection. The performance of these techniques is compared using the measurements from the first part of the thesis. Cyclostationary feature detection is shown to be more robust to noise.

    The final part of the thesis considers the identification of vacant channels by combining spectrum measurements from multiple locations, known as cooperative sensing. Wideband cooperative sensing is proposed using multi-resolution spectrum sensing (MRSS) with a multi-bit decision technique. Next, a two-stage adaptive system with cooperative wideband sensing is proposed, based on the combination of energy detection and cyclostationary feature detection. Simulations indicate that the two-stage adaptive wideband cooperative sensing system outperforms single-site detection in terms of detection success and mean detection time.
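    Of the two detectors compared, energy detection is the simpler and can be sketched directly: average the received power and compare it against a threshold set above the known noise floor. The threshold rule and all parameters below are illustrative assumptions; a real detector derives its threshold from a target false-alarm probability.

```python
import math, random

def energy_detect(samples, noise_power, margin=4.0):
    """Energy detection: declare the channel occupied when measured
    average power exceeds the noise floor by a margin. `margin` is an
    illustrative stand-in for a threshold derived from a target
    probability of false alarm."""
    n = len(samples)
    power = sum(s * s for s in samples) / n
    # The noise-only power estimate fluctuates with std ~ sqrt(2/n),
    # so the threshold sits `margin` standard deviations above it.
    threshold = noise_power * (1 + margin * math.sqrt(2.0 / n))
    return power > threshold

rng = random.Random(3)
n, noise_power = 4000, 1.0

noise_only = [rng.gauss(0, math.sqrt(noise_power)) for _ in range(n)]
tone = [math.sin(2 * math.pi * 0.1 * t) for t in range(n)]  # a PU carrier
signal = [x + 0.8 * s for x, s in zip(noise_only, tone)]    # SNR ≈ -5 dB

print(energy_detect(noise_only, noise_power))  # expect: vacant (False)
print(energy_detect(signal, noise_power))      # expect: occupied (True)
```

    The detector's weakness, which motivates cyclostationary feature detection, is visible in the threshold line: it depends on knowing `noise_power`, so uncertainty in the noise floor erodes performance at low SNR, whereas cyclostationary features of a modulated signal remain distinguishable from stationary noise.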

    Towards effective dynamic resource allocation for enterprise applications

    The growing use of online services requires substantial supporting infrastructure. The efficient deployment of applications relies on the cost effectiveness of commercial hosting providers, who deliver an agreed quality of service, governed by a service level agreement, for a fee. The priorities of the commercial hosting provider are to maximise revenue, by delivering agreed service levels, and to minimise costs, through high resource utilisation. In order to deliver high service levels and high resource utilisation, it may be necessary to reorganise resources during periods of high demand. This reorganisation process may be manual, or it may be controlled by an autonomous process governed by a dynamic resource allocation algorithm. Dynamic resource allocation has been shown to improve service levels and utilisation and, hence, profitability.

    In this thesis several facets of dynamic resource allocation are examined to assess its suitability for the modern data centre. Firstly, three theoretically derived policies are implemented as middleware for a modern multi-tier Web application, and their performance is examined under a range of workloads in a real-world test bed. The scalability of state-of-the-art resource allocation policies is explored in two dimensions: the number of applications and the quantity of servers under the control of the resource allocation policy. The results show that current policies in the literature scale poorly in one or both of these dimensions. A new policy with significantly improved scalability characteristics is proposed and demonstrated at scale through simulation.

    The placement of applications across a data centre makes them susceptible to failures in shared infrastructure. To address this issue, an application placement mechanism is developed to augment any dynamic resource allocation policy. The results of this placement mechanism demonstrate a significant improvement in the worst case compared to a random allocation mechanism. A model for the reallocation of resources in a dynamic resource allocation system is also devised. The model demonstrates that the assumption of a constant resource reallocation cost is invalid under both physical reallocation and migration of virtualised resources.
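    The kind of policy-plus-cost question studied here can be sketched minimally, assuming a simple demand-proportional policy and a flat per-server switching cost (the very constant-cost assumption the thesis shows to be invalid for virtualised resources). All names and numbers are illustrative, not policies from the thesis.

```python
def allocate(demands, total_servers):
    """Greedy proportional allocation: give each application a server
    share proportional to its current demand (illustrative policy)."""
    total = sum(demands)
    alloc = [max(1, round(total_servers * d / total)) for d in demands]
    # Trim or pad so the allocation exactly matches the server budget.
    while sum(alloc) > total_servers:
        alloc[alloc.index(max(alloc))] -= 1
    while sum(alloc) < total_servers:
        alloc[alloc.index(min(alloc))] += 1
    return alloc

def reallocation_cost(old, new, cost_per_move=1.0):
    """Flat cost model: every server moved between applications pays a
    fixed switching cost -- the constant-cost assumption under test."""
    moves = sum(abs(a - b) for a, b in zip(old, new)) // 2
    return moves * cost_per_move

day = allocate([80, 10, 10], total_servers=20)    # daytime demand mix
night = allocate([10, 10, 80], total_servers=20)  # demand shifts at night
print(day, night, reallocation_cost(day, night))  # 14 servers must move
```

    Even this toy shows why the cost model matters: a demand swing forces most of the pool to move, so whether each move costs a fixed amount or varies with migration state (as for virtualised resources) dominates whether frequent reallocation is worthwhile.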

    Integrated Application of Active Controls (IAAC) technology to an advanced subsonic transport project. ACT/Control/Guidance System study, volume 1

    The active control technology (ACT) control/guidance system task of the Integrated Application of Active Controls (IAAC) technology project within the NASA Energy Efficient Transport program is documented. The air traffic environment, including navigation and air traffic control systems and procedures, was extrapolated. An approach to listing the flight functions that will be performed by the systems and crew of an ACT-configured airplane of the 1990s, together with a determination of the criticality of each function to flight safety, forms the basis of a candidate integrated ACT/Control/Guidance System architecture. The system mechanizes five active control functions: pitch-augmented stability, angle-of-attack limiting, lateral/directional augmented stability, gust load alleviation, and maneuver load control. The scope and requirements of a program for simulating the integrated ACT avionics and flight deck system, with the pilot in the loop, are defined; the system and crew interface elements are simulated; and a mechanization is recommended. Relationships between system design and crew roles and procedures are evaluated.

    Wavelet transform methods for identifying onset of SEMG activity

    Quantifying improvements in motor control is predicated on the accurate identification of the onset of surface electromyographic (sEMG) activity. Applying methods from wavelet theory developed in the past decade to digitized signals, a robust algorithm has been designed for use with sEMG collected during reaching tasks executed with the less-affected arm of stroke patients. The method applied both discretized Continuous Wavelet Transforms (CWT) and Discrete Wavelet Transforms (DWT), for event detection and no-lag filtering respectively. Input parameters were extracted from the assessed signals. The onset times found in the sEMG signals using the wavelet method were compared with the physiological instants of motion onset determined from video data. Robustness was evaluated by considering the response of the onset time to variations in input-parameter values. The wavelet method found physiologically relevant onset times in all signals, averaging 147 ms prior to motion onset, compared to predicted onset latencies of 90-110 ms. Latency exhibited a slight dependence on subject, but on no other variables.
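    A drastically simplified version of the detection idea can be sketched with a single-level Haar transform and a fixed threshold multiple, rather than the paper's full discretized-CWT/DWT pipeline; the constants, window sizes, and the synthetic signal below are all illustrative assumptions.

```python
import math, random

def haar_details(signal):
    """Level-1 Haar wavelet detail coefficients: scaled differences of
    adjacent sample pairs, which respond to abrupt amplitude changes."""
    return [(signal[2 * i] - signal[2 * i + 1]) / 2
            for i in range(len(signal) // 2)]

def detect_onset(signal, k=6.0, baseline_pairs=50):
    """Flag the first detail coefficient whose magnitude exceeds k times
    the baseline level estimated from an initial quiet segment.
    (A toy single-level sketch; k and the window are illustrative.)"""
    d = haar_details(signal)
    base = sum(abs(x) for x in d[:baseline_pairs]) / baseline_pairs
    for i, x in enumerate(d):
        if abs(x) > k * base:
            return 2 * i          # onset index in original samples
    return None

rng = random.Random(1)
quiet = [rng.gauss(0, 0.05) for _ in range(200)]             # rest
burst = [math.sin(0.9 * t) + rng.gauss(0, 0.05) for t in range(200)]
semg = quiet + burst             # synthetic sEMG: rest, then activity

print(detect_onset(semg))        # near sample 200, the true onset
```

    The Haar detail coefficients stay near the noise floor during rest and jump as soon as activity begins, so a threshold relative to the quiet baseline localizes onset; the paper's CWT/DWT combination serves the same purpose while also providing no-lag filtering and robustness to parameter choice.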