1,435 research outputs found

    Topology-aware GPU scheduling for learning workloads in cloud environments

    Recent advances in hardware, such as systems with multiple GPUs and their availability in the cloud, are enabling deep learning in various domains including health care, autonomous vehicles, and the Internet of Things. Multi-GPU systems exhibit complex connectivity among GPUs and between GPUs and CPUs. Workload schedulers must consider hardware topology and workload communication requirements in order to allocate CPU and GPU resources for optimal execution time and improved utilization in shared cloud environments. This paper presents a new topology-aware workload placement strategy to schedule deep learning jobs on multi-GPU systems. The placement strategy is evaluated with a prototype on a Power8 machine with Tesla P100 cards, showing speedups of up to ≈1.30x compared to state-of-the-art strategies; the proposed algorithm achieves this result by allocating GPUs that satisfy workload requirements while preventing interference. Additionally, a large-scale simulation shows that the proposed strategy provides higher resource utilization and performance in cloud systems. This project is supported by the IBM/BSC Technology Center for Supercomputing collaboration agreement. It has also received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 639595). It is also partially supported by the Ministry of Economy of Spain under contract TIN2015-65316-P and Generalitat de Catalunya under contract 2014SGR1051, by the ICREA Academia program, and by the BSC-CNS Severo Ochoa program (SEV-2015-0493). We thank our IBM Research colleagues Alaa Youssef and Asser Tantawi for the valuable discussions. We also thank SC17 committee member Blair Bethwaite of Monash University for his constructive feedback on the earlier drafts of this paper. Peer Reviewed. Postprint (published version).

    Insights into WebAssembly: Compilation performance and shared code caching in node.js

    Alongside JavaScript, V8 and Node.js have become essential components of contemporary web and cloud applications. With the addition of WebAssembly to the web, developers finally have a fast platform for performance-critical code. However, this addition also introduces new challenges to client and server applications. New application architectures, such as serverless computing, require instantaneous performance without long startup times. In this paper, we investigate the performance of WebAssembly compilation in V8 and Node.js, and present the design and implementation of a multi-process shared code cache for Node.js applications. We demonstrate how such a cache can significantly increase application performance and reduce application startup time, CPU usage, and memory footprint.

    Energy-efficient Transitional Near-* Computing

    Studies have shown that communication networks, devices accessing the Internet, and data centers account for 4.6% of the worldwide electricity consumption. Although data centers, core network equipment, and mobile devices are getting more energy-efficient, the amount of data that is being processed, transferred, and stored is vastly increasing. Recent computing paradigms, such as fog and edge computing, try to improve this situation by processing data near the user, the network, the devices, and the data itself. In this thesis, these trends are summarized under the new term near-* or near-everything computing. Furthermore, a novel paradigm designed to increase the energy efficiency of near-* computing is proposed: transitional computing. It transfers multi-mechanism transitions, a recently developed paradigm for a highly adaptable future Internet, from the field of communication systems to computing systems. Moreover, three types of novel transitions are introduced to achieve gains in energy efficiency in near-* environments, spanning private Infrastructure-as-a-Service (IaaS) clouds, Software-defined Wireless Networks (SDWNs) at the edge of the network, and Disruption-Tolerant Information-Centric Networks (DTN-ICNs) involving mobile devices, sensors, and edge devices, as well as programmable components on a mobile System-on-a-Chip (SoC). Finally, the novel idea of transitional near-* computing for emergency response applications is presented to assist rescuers and affected persons during an emergency event or a disaster, even when connections to cloud services and social networks are disturbed by network outages and the network bandwidth and battery power of mobile devices are limited.

    Remediation by Design: New Linguistic Domains for Changing Organizational Practices

    The paper examines the impact of novel linguistic vocabularies on the remediation of practices using computer-reliant media. As linguistic vocabularies we consider social Web services (with certain material agency) allowing recurrent engagement of users in designated communication acts. Remediation, in turn, is conceived as an evolving state of affairs where new practices (as defined by computer-mediated linguistic conventions) are improvised on the basis of old practices that work differently in new technological settings. In this vein, remediation of organizational routines takes place when established human activities are retooled using digital materials to convey new possibilities for action. The paper advances a proposition and a scaffold for remediating by design, which is then ‘tested’ by reflecting upon an empirical case.

    Halos of Spiral Galaxies. III. Metallicity Distributions

    (Abridged) We report results of a campaign to image the stellar populations in the halos of highly inclined spiral galaxies, with the fields roughly 10 kpc (projected) from the nuclei. We use the F814W (I) and F606W (V) filters in the Wide Field Planetary Camera 2, on board the Hubble Space Telescope. Extended halo populations are detected in all galaxies. The color-magnitude diagrams appear to be completely dominated by giant-branch stars, with no evidence for the presence of young stellar populations in any of the fields. We find that the metallicity distribution functions are dominated by metal-rich populations, with a tail extending toward the metal-poor end. To first order, the overall shapes of the metallicity distribution functions are similar to what is predicted by a simple, single-component model of chemical evolution with the effective yields increasing with galaxy luminosity. However, metallicity distributions significantly narrower than the simple model are observed for a few of the most luminous galaxies in the sample. It appears clear that more luminous spiral galaxies also have more metal-rich stellar halos. The increasingly significant departures from the closed-box model for the more luminous galaxies indicate that a parameter in addition to a single yield is required to describe chemical evolution. This parameter, which could be related to gas infall or outflow either in situ or in progenitor dwarf galaxies that later merge to form the stellar halo, tends to act to make the metallicity distributions narrower at high metallicity. Comment: 20 pages, 8 figures (ApJ, in press)

    Halo Properties in Cosmological Simulations of Self-Interacting Cold Dark Matter

    We present a comparison of halo properties in cosmological simulations of collisionless cold dark matter (CDM) and self-interacting dark matter (SIDM) for a range of dark matter cross sections. We find, in agreement with various authors, that CDM yields cuspy halos that are too centrally concentrated as compared to observations. Conversely, SIDM simulations using a Monte Carlo N-body technique produce halos with significantly reduced central densities and flatter cores with increasing cross section. We introduce a concentration parameter based on enclosed mass that we expect will be straightforward to determine observationally, unlike that of Navarro, Frenk & White, and provide predictions for SIDM and CDM. SIDM also produces more spherical halos than CDM, providing possibly the strongest observational test of SIDM. We discuss our findings in relation to various relevant observations as well as SIDM simulations of other groups. Taking proper account of simulation limitations, we find that a dark matter cross section per unit mass of sigma_DM ~= 10^{-23}-10^{-24} cm^2/GeV is consistent with all current observational constraints. Comment: 14 pages, submitted to Ap

    Cloud enterprise resource planning development model based on software factory approach

    Literature reviews revealed that Cloud Enterprise Resource Planning (Cloud ERP) is growing significantly, yet from the software developers’ perspective it suffers from high management complexity, high workload, inconsistent software quality, and knowledge retention problems. Previous research lacks a solution that holistically addresses all of these problem components. The software factory approach was adapted, along with relevant theories, to develop a model referred to as the Cloud ERP Factory Model (CEF Model), which is intended to pave the way toward solving the above-mentioned problems. There are three specific objectives: (i) to develop the model by identifying the components with their elements and compiling them into the CEF Model, (ii) to verify the technical feasibility of deploying the model, and (iii) to validate the model’s field usability in real Cloud ERP production case studies. The research employed Design Science methodology with a mixed-method evaluation approach. The developed CEF Model consists of five components: Product Lines, Platform, Workflow, Product Control, and Knowledge Management. These can be used to set up a CEF environment that simulates a process-oriented software production environment with capacity and resource planning features. The model was validated through expert reviews, and the finalized model was verified to be technically feasible by a successful deployment into a selected commercial Cloud ERP production facility. Three Cloud ERP commercial deployment case studies were conducted using the prototype environment. Using the survey instruments developed, the results yielded a Likert score mean of 6.3 out of 7, thus reaffirming that the model is usable and that the research has met its objective in addressing the problem components. The model along with its deployment verification processes are the main research contributions. Both items can also be used by software industry practitioners and academics as references in developing a robust Cloud ERP production facility.