Topology-aware GPU scheduling for learning workloads in cloud environments
Recent advances in hardware, such as systems with multiple GPUs and their availability in the cloud, are enabling deep learning in various domains including health care, autonomous vehicles, and Internet of Things. Multi-GPU systems exhibit complex connectivity among GPUs and between GPUs and CPUs. Workload schedulers must consider hardware topology and workload communication requirements in order to allocate CPU and GPU resources for optimal execution time and improved utilization in shared cloud environments.
This paper presents a new topology-aware workload placement strategy to schedule deep learning jobs on multi-GPU systems. The placement strategy is evaluated with a prototype on a Power8 machine with Tesla P100 cards, showing speedups of up to ≈1.30x compared to state-of-the-art strategies; the proposed algorithm achieves this result by allocating GPUs that satisfy workload requirements while preventing interference. Additionally, a large-scale simulation shows that the proposed strategy provides higher resource utilization and performance in cloud systems.
This project is supported by the IBM/BSC Technology Center for Supercomputing collaboration agreement. It has also received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 639595). It is also partially supported by the Ministry of Economy of Spain under contract TIN2015-65316-P and Generalitat de Catalunya under contract 2014SGR1051, by the ICREA Academia program, and by the BSC-CNS Severo Ochoa program (SEV-2015-0493). We thank our IBM Research colleagues Alaa Youssef and Asser Tantawi for the valuable discussions. We also thank SC17 committee member Blair Bethwaite of Monash University for his constructive feedback on the earlier drafts of this paper. Peer reviewed. Postprint (published version).
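The idea of topology-aware placement can be illustrated with a minimal sketch. This is not the algorithm from the paper: it is a brute-force toy that assumes a hypothetical pairwise communication-cost matrix `cost` (lower means a faster link) and picks the group of free GPUs with the lowest total pairwise cost. The function name `pick_gpus` and the cost values are assumptions made for illustration only.

```python
from itertools import combinations


def pick_gpus(cost, free, k):
    """Return the k free GPUs with the lowest total pairwise link cost.

    cost -- square matrix (list of lists); cost[a][b] is an assumed
            communication cost between GPU a and GPU b
    free -- iterable of GPU indices that are currently unallocated
    k    -- number of GPUs the job requests
    Returns a tuple of GPU indices, or None if fewer than k GPUs are free.
    """
    free = sorted(free)
    if k > len(free):
        return None

    def total(group):
        # Sum the cost over every pair of GPUs inside the candidate group.
        return sum(cost[a][b] for a, b in combinations(group, 2))

    # Exhaustive search: fine for a single node, too slow for large clusters.
    return min(combinations(free, k), key=total)


# Hypothetical 4-GPU node: GPUs (0, 1) and (2, 3) share a fast link (cost 1),
# while cross-pair traffic goes over a slower path (cost 3).
COST = [
    [0, 1, 3, 3],
    [1, 0, 3, 3],
    [3, 3, 0, 1],
    [3, 3, 1, 0],
]

if __name__ == "__main__":
    print(pick_gpus(COST, free={1, 2, 3}, k=2))
```

With GPU 0 already taken, the sketch prefers the pair that still shares a fast link rather than splitting the job across slow links. A real scheduler would also have to weigh interference between co-located jobs, which this toy ignores.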
Coverage centralities for temporal networks
The structure of real networked systems, such as social relationships, can be
modeled as temporal networks in which each edge appears only at the prescribed
time. Understanding the structure of temporal networks requires quantifying the
importance of a temporal vertex, which is a pair of vertex index and time. In
this paper, we define two centrality measures of a temporal vertex based on the
fastest temporal paths which use the temporal vertex. The definition is free
from parameters and robust against the change in time scale on which we focus.
In addition, we can efficiently compute these centrality values for all
temporal vertices. Using the two centrality measures, we reveal that
distributions of these centrality values of real-world temporal networks are
heterogeneous. For various datasets, we also demonstrate that a majority of the
highly central temporal vertices are located within a narrow time window around
a particular time. In other words, there is a bottleneck time at which most
information sent in the temporal network passes through a small number of
temporal vertices, which suggests an important role of these temporal vertices
in spreading phenomena. Comment: 13 pages, 10 figures
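The notion of a fastest temporal path can be made concrete with a small sketch. It does not compute the centrality measures defined in the paper; it only finds the shortest duration of a time-respecting path between two vertices. The conventions are assumptions made for this example: edges are directed triples `(u, v, t)`, traversing an edge at time `t` delivers you to `v` at time `t + 1`, and the function names are hypothetical.

```python
from typing import Dict, Hashable, Iterable, Optional, Tuple

Edge = Tuple[Hashable, Hashable, int]  # (u, v, t): contact from u to v at time t


def earliest_arrival(edges: Iterable[Edge], source: Hashable,
                     start: int) -> Dict[Hashable, int]:
    """Earliest arrival time at each reachable vertex when leaving
    `source` no earlier than `start` (assumed unit traversal time)."""
    arrival = {source: start}
    # One pass in time order suffices: an edge at time t can only be enabled
    # by arrivals that happened at or before t.
    for u, v, t in sorted(edges, key=lambda e: e[2]):
        if u in arrival and arrival[u] <= t:
            if t + 1 < arrival.get(v, float("inf")):
                arrival[v] = t + 1
    return arrival


def fastest_duration(edges: Iterable[Edge], source: Hashable,
                     target: Hashable) -> Optional[int]:
    """Minimum (arrival time - departure time) over all temporal paths."""
    edges = list(edges)
    best = None
    # Try every time at which some edge leaves the source.
    for start in sorted({t for u, _, t in edges if u == source}):
        arrival = earliest_arrival(edges, source, start)
        if target in arrival:
            duration = arrival[target] - start
            if best is None or duration < best:
                best = duration
    return best


if __name__ == "__main__":
    contacts = [("a", "b", 1), ("b", "c", 2), ("a", "c", 5)]
    print(earliest_arrival(contacts, "a", 0)["c"])  # earliest arrival at c
    print(fastest_duration(contacts, "a", "c"))     # fastest duration a -> c
```

In the example, the earliest-arriving route to `c` goes through `b`, but the fastest route is the later direct contact, which shows why the two notions differ.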
Sampling Local Fungal Diversity in an Undergraduate Laboratory using DNA Barcoding
Traditional methods for fungal species identification require diagnostic morphological characters and are often limited by the availability of fresh fruiting bodies and local identification resources. DNA barcoding offers an additional method of species identification and is rapidly developing as a critical tool in fungal taxonomy. As an exercise in an undergraduate biology course, we identified 9 specimens collected from the Hendrix College campus in Conway, Arkansas, USA to the genus or species level using morphology. We report that DNA barcoding targeting the internal transcribed spacer (ITS) region supported several of our taxonomic determinations and we were able to contribute 5 ITS sequences to GenBank that were supported by vouchered collection information. We suggest that small-scale barcoding projects are possible and that they have value for documenting fungal diversity.
Combined burden and functional impact tests for cancer driver discovery using DriverPower
The discovery of driver mutations is one of the key motivations for cancer genome sequencing. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2658 cancers across 38 tumour types, we describe DriverPower, a software package that uses mutational burden and functional impact evidence to identify driver mutations in coding and non-coding sites within cancer whole genomes. Using a total of 1373 genomic features derived from public sources, DriverPower's background mutation model explains up to 93% of the regional variance in the mutation rate across multiple tumour types. By incorporating functional impact scores, we are able to further increase the accuracy of driver discovery. Testing across a collection of 2583 cancer genomes from the PCAWG project, DriverPower identifies 217 coding and 95 non-coding driver candidates. Compared with six published methods used by the PCAWG Drivers and Functional Interpretation Working Group, DriverPower has the highest F1 score for both coding and non-coding driver discovery. This demonstrates that DriverPower is an effective framework for computational driver discovery.
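For readers unfamiliar with the F1 score used in the comparison above, the following sketch shows how it is computed from a set of predicted drivers and a reference set. It is not part of DriverPower; the function name and the placeholder element names are assumptions made for illustration.

```python
def f1_score(predicted: set, truth: set) -> float:
    """Harmonic mean of precision and recall for two sets of identifiers."""
    true_positives = len(predicted & truth)
    if true_positives == 0:
        # Covers empty inputs as well as predictions with no overlap.
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(truth)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Placeholder names, not real genes or results from the paper.
    called = {"ELEMENT_A", "ELEMENT_B", "ELEMENT_C"}
    reference = {"ELEMENT_A", "ELEMENT_B", "ELEMENT_D", "ELEMENT_E"}
    print(f1_score(called, reference))
```

Because F1 is a harmonic mean, a method cannot score well by calling very few candidates (high precision, low recall) or very many (high recall, low precision).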
Communication in organizations: the heart of information systems
We propose a theory characterizing information systems (IS) as language communities which use and develop domain-specific languages for communication. Our theory is anchored in Language Critique, a branch of philosophy of language. In developing our theory, we draw on Systems Theory and Cybernetics as a theoretical framework. "Organization" of a system is directly related to communication of its sub-systems. "Big systems" are self-organizing and the control of this ability is disseminated throughout the system itself. Therefore, the influence on changes of the system from its outside is limited. Operations intended to change an organization are restricted to indirect approaches. The creation of domain-specific languages by the system itself leads to advantageous communication costs compared to colloquial communication at the price of set-up costs for language communities. Furthermore, we demonstrate how our theoretical constructs help to describe and predict the behavior of IS. Finally, we discuss implications of our theory for further research and IS in general. Keywords: Language Critique, language communities, communication, self-organization, IS research
Order-Revealing Encryption and the Hardness of Private Learning
An order-revealing encryption scheme gives a public procedure by which two
ciphertexts can be compared to reveal the ordering of their underlying
plaintexts. We show how to use order-revealing encryption to separate
computationally efficient PAC learning from efficient $(\epsilon, \delta)$-differentially private PAC learning. That is, we construct a concept
class that is efficiently PAC learnable, but for which every efficient learner
fails to be differentially private. This answers a question of Kasiviswanathan
et al. (FOCS '08, SIAM J. Comput. '11).
To prove our result, we give a generic transformation from an order-revealing
encryption scheme into one with strongly correct comparison, which enables the
consistent comparison of ciphertexts that are not obtained as the valid
encryption of any message. We believe this construction may be of independent
interest. Comment: 28 pages
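To illustrate what a public comparison procedure over ciphertexts looks like, here is a toy sketch in the style of the bitwise PRF-based scheme of Chenette, Lewi, Weis and Wu. It is not the construction from this paper, it leaks more than the ordering (the position of the first differing bit), and it is not meant for real use. HMAC-SHA256 stands in for the PRF, and all function names are assumptions.

```python
import hashlib
import hmac
from typing import List


def _prf(key: bytes, index: int, prefix: str) -> int:
    """Keyed pseudorandom value in {0, 1, 2} for a bit position and prefix."""
    message = f"{index}|{prefix}".encode()
    digest = hmac.new(key, message, hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % 3


def encrypt(key: bytes, value: int, bits: int = 16) -> List[int]:
    """Encrypt a non-negative integer one bit at a time (toy scheme)."""
    if not 0 <= value < 2 ** bits:
        raise ValueError("value out of range for the chosen bit length")
    b = format(value, f"0{bits}b")
    # Each symbol masks bit i with a PRF value derived from the higher bits.
    return [(_prf(key, i, b[:i]) + int(b[i])) % 3 for i in range(bits)]


def compare(c1: List[int], c2: List[int]) -> int:
    """Public comparison: needs no key. Returns -1, 0 or 1."""
    for u, v in zip(c1, c2):
        if u != v:
            # At the first differing bit both symbols share the same mask,
            # so the larger plaintext has symbol (smaller + 1) mod 3.
            return -1 if v == (u + 1) % 3 else 1
    return 0


if __name__ == "__main__":
    key = b"demo key, not secret"
    print(compare(encrypt(key, 5), encrypt(key, 9)))
```

Note that `compare` takes only ciphertexts: anyone can learn the order of two encrypted values without the key, which is the property the abstract describes.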
First assessment of the plant phenology index (PPI) for estimating gross primary productivity in African semi-arid ecosystems
The importance of semi-arid ecosystems in the global carbon cycle as sinks
for CO2 emissions has recently been highlighted. Africa is a carbon sink and
nearly half its area comprises arid and semi-arid ecosystems. However, there
are uncertainties regarding CO2 fluxes for semi-arid ecosystems in Africa,
particularly savannas and dry tropical woodlands. In order to improve on
existing remote-sensing based methods for estimating carbon uptake across
semi-arid Africa we applied and tested the recently developed plant phenology
index (PPI). We developed a PPI-based model estimating gross primary
productivity (GPP) that accounts for canopy water stress, and compared it
against three other Earth observation-based GPP models: the temperature and
greenness model, the greenness and radiation model and a light use efficiency
model. The models were evaluated against in situ data from four semi-arid sites
in Africa with varying tree canopy cover (3 to 65 percent). Evaluation results
from the four GPP models showed reasonable agreement with in situ GPP measured
from eddy covariance flux towers (EC GPP) based on coefficient of variation,
root-mean-square error, and Bayesian information criterion. The PPI-based GPP
model was able to capture the magnitude of EC GPP better than the other tested
models. The results of this study show that a PPI-based GPP model is a
promising tool for the estimation of GPP in the semi-arid ecosystems of Africa. Comment: Accepted manuscript; 12 pages, 4 tables, 9 figures