Topology-aware GPU scheduling for learning workloads in cloud environments
Recent advances in hardware, such as systems with multiple GPUs and their availability in the cloud, are enabling deep learning in various domains including health care, autonomous vehicles, and Internet of Things. Multi-GPU systems exhibit complex connectivity among GPUs and between GPUs and CPUs. Workload schedulers must consider hardware topology and workload communication requirements in order to allocate CPU and GPU resources for optimal execution time and improved utilization in shared cloud environments.
This paper presents a new topology-aware workload placement strategy to schedule deep learning jobs on multi-GPU systems. The placement strategy is evaluated with a prototype on a Power8 machine with Tesla P100 cards, showing speedups of up to ~1.30x compared to state-of-the-art strategies; the proposed algorithm achieves this result by allocating GPUs that satisfy workload requirements while preventing interference. Additionally, a large-scale simulation shows that the proposed strategy provides higher resource utilization and performance in cloud systems.
This project is supported by the IBM/BSC Technology Center for Supercomputing collaboration agreement. It has also received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement No 639595). It is also partially supported by the Ministry of Economy of Spain under contract TIN2015-65316-P and Generalitat de Catalunya under contract 2014SGR1051, by the ICREA Academia program, and by the BSC-CNS Severo Ochoa program (SEV-2015-0493). We thank our IBM Research colleagues Alaa Youssef and Asser Tantawi for the valuable discussions. We also thank SC17 committee member Blair Bethwaite of Monash University for his constructive feedback on earlier drafts of this paper.
Peer Reviewed. Postprint (published version).
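The topology-aware placement idea described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration (the bandwidth matrix, the scoring heuristic, and all names), not the paper's actual algorithm: candidate GPU sets are scored by their slowest internal link, with a penalty when a candidate shares fast links with already-busy GPUs (interference).

```python
from itertools import combinations

# Hypothetical link bandwidths (GB/s) between 4 GPUs:
# 40 models an NVLink pair, 16 models a PCIe hop.
BW = [
    [0, 40, 16, 16],
    [40, 0, 16, 16],
    [16, 16, 0, 40],
    [16, 16, 40, 0],
]

def placement_score(gpus, busy):
    """Score a candidate GPU set: prefer high intra-set bandwidth,
    penalize sets connected via fast links to already-busy GPUs."""
    pairs = list(combinations(gpus, 2))
    intra = min(BW[a][b] for a, b in pairs) if pairs else max(map(max, BW))
    interference = sum(1 for g in gpus for b in busy if BW[g][b] > 16)
    return intra - 10 * interference

def place(num_gpus, busy):
    """Pick the best-scoring set of free GPUs for a job."""
    free = [g for g in range(len(BW)) if g not in busy]
    return max(combinations(free, num_gpus),
               key=lambda c: placement_score(c, busy))

print(place(2, busy={1}))  # picks the NVLink pair away from the busy GPU: (2, 3)
```

With GPU 1 busy, the heuristic skips GPU 0 (it shares an NVLink with the busy GPU) and places the job on the undisturbed fast pair (2, 3), matching the abstract's goal of satisfying bandwidth requirements while preventing interference.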
Insights into WebAssembly: Compilation performance and shared code caching in node.js
Alongside JavaScript, V8 and Node.js have become essential components of contemporary web and cloud applications. With the addition of WebAssembly to the web, developers finally have a fast platform for performance-critical code. However, this addition also introduces new challenges to client and server applications. New application architectures, such as serverless computing, require instantaneous performance without long startup times. In this paper, we investigate the performance of WebAssembly compilation in V8 and Node.js, and present the design and implementation of a multi-process shared code cache for Node.js applications. We demonstrate how such a cache can significantly increase application performance, and reduce application startup time, CPU usage, and memory footprint.
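The shared-cache idea can be illustrated with a minimal sketch (written in Python for brevity; this is not the paper's Node.js/V8 implementation, and all names are illustrative): compile once, persist the compiled artifact in a directory visible to all processes, keyed by a hash of the source, and let subsequent processes load the cached artifact instead of recompiling.

```python
import hashlib, marshal, os, sys, tempfile

# A directory shared by all processes on the machine (illustrative location).
CACHE_DIR = os.path.join(tempfile.gettempdir(), "code-cache")

def compile_cached(source: str):
    """Compile `source` once; later calls (even from other processes)
    deserialize the cached code object instead of recompiling."""
    os.makedirs(CACHE_DIR, exist_ok=True)
    # Include the interpreter version in the key: serialized code objects
    # are not portable across versions (real code caches do the same).
    key = hashlib.sha256((sys.version + source).encode()).hexdigest()
    path = os.path.join(CACHE_DIR, key)
    if os.path.exists(path):
        with open(path, "rb") as f:
            return marshal.load(f)      # cache hit: skip compilation
    code = compile(source, "<cached>", "exec")
    with open(path, "wb") as f:
        marshal.dump(code, f)           # cache miss: compile and persist
    return code

ns = {}
exec(compile_cached("def add(a, b):\n    return a + b"), ns)
print(ns["add"](2, 3))  # → 5
```

The second process to request the same source pays only the cost of deserialization, which is the effect the paper measures for WebAssembly modules in Node.js.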
Energy-efficient Transitional Near-* Computing
Studies have shown that communication networks, devices accessing the Internet, and data centers account for 4.6% of the worldwide electricity consumption.
Although data centers, core network equipment, and mobile devices are getting more energy-efficient, the amount of data that is being processed, transferred, and stored is vastly increasing.
Recent computer paradigms, such as fog and edge computing, try to improve this situation by processing data near the user, the network, the devices, and the data itself.
In this thesis, these trends are summarized under the new term near-* or near-everything computing.
Furthermore, a novel paradigm designed to increase the energy efficiency of near-* computing is proposed: transitional computing.
It transfers multi-mechanism transitions, a recently developed paradigm for a highly adaptable future Internet, from the field of communication systems to computing systems.
Moreover, three types of novel transitions are introduced to achieve gains in energy efficiency in near-* environments, spanning private Infrastructure-as-a-Service (IaaS) clouds, Software-defined Wireless Networks (SDWNs) at the edge of the network, and Disruption-Tolerant Information-Centric Networks (DTN-ICNs) involving mobile devices, sensors, edge devices, and programmable components on a mobile System-on-a-Chip (SoC).
Finally, the novel idea of transitional near-* computing for emergency response applications is presented. It assists rescuers and affected persons during an emergency event or a disaster, even when connections to cloud services and social networks are disturbed by network outages and the network bandwidth and battery power of mobile devices are limited.
Remediation by Design: New Linguistic Domains for Changing Organizational Practices
The paper examines the impact of novel linguistic vocabularies on the remediation of practices using computer-reliant media. As linguistic vocabularies we consider social Web services (with certain material agency) that allow the recurrent engagement of users in designated communication acts. Remediation, on the other hand, is conceived as an evolving state of affairs in which new practices (as defined by computer-mediated linguistic conventions) are improvised on the basis of old practices that work differently in new technological settings. In this vein, remediation of organizational routines takes place when established human activities are retooled using digital materials to convey new possibilities for action. The paper advances a proposition and a scaffold for remediating by design, which is then "tested" by reflecting upon an empirical case.
Halos of Spiral Galaxies. III. Metallicity Distributions
(Abridged) We report results of a campaign to image the stellar populations in
the halos of highly inclined spiral galaxies, with the fields roughly 10 kpc
(projected) from the nuclei. We use the F814W (I) and F606W (V) filters in the
Wide Field Planetary Camera 2, on board the Hubble Space Telescope. Extended
halo populations are detected in all galaxies. The color-magnitude diagrams
appear to be completely dominated by giant-branch stars, with no evidence for
the presence of young stellar populations in any of the fields. We find that
the metallicity distribution functions are dominated by metal-rich populations,
with a tail extending toward the metal poor end. To first order, the overall
shapes of the metallicity distribution functions are similar to what is
predicted by a simple, single-component model of chemical evolution with the
effective yields increasing with galaxy luminosity. However, metallicity
distributions significantly narrower than the simple model are observed for a
few of the most luminous galaxies in the sample. It appears clear that more
luminous spiral galaxies also have more metal-rich stellar halos. The
increasingly significant departures from the closed-box model for the more
luminous galaxies indicate that a parameter in addition to a single yield is
required to describe chemical evolution. This parameter, which could be related
to gas infall or outflow either in situ or in progenitor dwarf galaxies that
later merge to form the stellar halo, tends to act to make the metallicity
distributions narrower at high metallicity.
Comment: 20 pages, 8 figures (ApJ, in press)
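For context, the simple single-component ("closed-box") model invoked above, in which stars form from well-mixed gas with no inflow or outflow, predicts a metallicity distribution set entirely by the effective yield p (a standard textbook result, stated here for reference, not taken from this abstract):

\[
\frac{dN}{dZ} \propto e^{-Z/p},
\qquad
\frac{dN}{d\ln Z} \propto \frac{Z}{p}\, e^{-Z/p} .
\]

The logarithmic distribution peaks at Z = p, so a larger effective yield shifts the whole distribution toward higher metallicity, consistent with the reported trend of effective yield increasing with galaxy luminosity; the narrow observed distributions then signal a departure from this one-parameter model.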
Halo Properties in Cosmological Simulations of Self-Interacting Cold Dark Matter
We present a comparison of halo properties in cosmological simulations of
collisionless cold dark matter (CDM) and self-interacting dark matter (SIDM)
for a range of dark matter cross sections. We find, in agreement with various
authors, that CDM yields cuspy halos that are too centrally concentrated as
compared to observations. Conversely, SIDM simulations using a Monte Carlo
N-body technique produce halos with significantly reduced central densities and
flatter cores with increasing cross section. We introduce a concentration
parameter based on enclosed mass that we expect will be straightforward to
determine observationally, unlike that of Navarro, Frenk & White, and provide
predictions for SIDM and CDM. SIDM also produces more spherical halos than CDM,
providing possibly the strongest observational test of SIDM. We discuss our
findings in relation to various relevant observations as well as SIDM
simulations of other groups. Taking proper account of simulation limitations,
we find that a dark matter cross section per unit mass of sigma_DM ~=
10^{-23}-10^{-24} cm^2/GeV is consistent with all current observational
constraints.
Comment: 14 pages, submitted to Ap
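The "Monte Carlo N-body technique" mentioned above is commonly implemented by giving each particle a scattering probability per timestep proportional to the local density, cross section per unit mass, and relative velocity. A schematic sketch of one pairwise scattering test, under those standard assumptions (illustrative only, not the authors' code; equal particle masses assumed):

```python
import math, random

def random_unit_vector():
    """Uniform random direction on the unit sphere."""
    z = random.uniform(-1.0, 1.0)
    phi = random.uniform(0.0, 2.0 * math.pi)
    r = math.sqrt(1.0 - z * z)
    return (r * math.cos(phi), r * math.sin(phi), z)

def maybe_scatter(v1, v2, rho, sigma_per_mass, dt):
    """One Monte Carlo SIDM test for a pair of equal-mass particles:
    scatter with probability P = rho * (sigma/m) * |v1 - v2| * dt.
    A scattering event redirects the relative velocity isotropically,
    conserving momentum and kinetic energy."""
    v_rel = math.dist(v1, v2)
    if random.random() >= min(rho * sigma_per_mass * v_rel * dt, 1.0):
        return v1, v2                         # no scattering this step
    vcm = tuple((a + b) / 2.0 for a, b in zip(v1, v2))  # centre-of-mass velocity
    n = random_unit_vector()
    half = v_rel / 2.0
    v1p = tuple(c + half * ni for c, ni in zip(vcm, n))
    v2p = tuple(c - half * ni for c, ni in zip(vcm, n))
    return v1p, v2p
```

Repeated elastic scatterings of this kind transfer heat into the halo centre, which is the mechanism behind the reduced central densities and flatter cores reported in the abstract.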
Cloud enterprise resource planning development model based on software factory approach
Literature reviews revealed that Cloud Enterprise Resource Planning (Cloud ERP) is growing significantly, yet from software developers' perspective it suffers from high management complexity, high workload, inconsistent software quality, and knowledge-retention problems. Previous research lacks a solution that holistically addresses all of these problem components. The software factory approach was adapted, along with relevant theories, to develop a model referred to as the Cloud ERP Factory Model (CEF Model), which is intended to solve the above-mentioned problems. There are three specific objectives: (i) to develop the model by identifying its components and their elements and compiling them into the CEF Model, (ii) to verify the technical feasibility of deploying the model, and (iii) to validate the model's usability in real Cloud ERP production case studies. The research employed the Design Science methodology with a mixed-method evaluation approach. The developed CEF Model consists of five components: Product Lines, Platform, Workflow, Product Control, and Knowledge Management. These can be used to set up a CEF environment that simulates a process-oriented software production environment with capacity and resource planning features. The model was validated through expert reviews, and the finalized model was verified to be technically feasible by a successful deployment into a selected commercial Cloud ERP production facility. Three commercial Cloud ERP deployment case studies were conducted using the prototype environment. Using the survey instruments developed, the results yielded a mean Likert score of 6.3 out of 7, reaffirming that the model is usable and that the research has met its objective of addressing the problem components. The model, together with its deployment verification processes, is the main research contribution. Both can also be used by software industry practitioners and academics as references in developing a robust Cloud ERP production facility.
- …