    Topology-aware GPU scheduling for learning workloads in cloud environments

    Recent advances in hardware, such as systems with multiple GPUs and their availability in the cloud, are enabling deep learning in various domains including health care, autonomous vehicles, and Internet of Things. Multi-GPU systems exhibit complex connectivity among GPUs and between GPUs and CPUs. Workload schedulers must consider hardware topology and workload communication requirements in order to allocate CPU and GPU resources for optimal execution time and improved utilization in shared cloud environments. This paper presents a new topology-aware workload placement strategy to schedule deep learning jobs on multi-GPU systems. The placement strategy is evaluated with a prototype on a Power8 machine with Tesla P100 cards, showing speedups of up to ≈1.30x compared to state-of-the-art strategies; the proposed algorithm achieves this result by allocating GPUs that satisfy workload requirements while preventing interference. Additionally, a large-scale simulation shows that the proposed strategy provides higher resource utilization and performance in cloud systems. This project is supported by the IBM/BSC Technology Center for Supercomputing collaboration agreement. It has also received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 639595). It is also partially supported by the Ministry of Economy of Spain under contract TIN2015-65316-P and Generalitat de Catalunya under contract 2014SGR1051, by the ICREA Academia program, and by the BSC-CNS Severo Ochoa program (SEV-2015-0493). We thank our IBM Research colleagues Alaa Youssef and Asser Tantawi for the valuable discussions. We also thank SC17 committee member Blair Bethwaite of Monash University for his constructive feedback on the earlier drafts of this paper. Peer Reviewed. Postprint (published version)
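To make the idea of topology-aware placement concrete, here is a minimal sketch that picks the set of free GPUs with the lowest total pairwise link cost. It is not the algorithm from the paper: the function name `pick_gpus`, the `link_cost` matrix, and its values are all hypothetical, and interference between co-located jobs is not modelled.

```python
from itertools import combinations


def pick_gpus(free_gpus, link_cost, k):
    """Return a tuple of k free GPU ids with the smallest total pairwise cost.

    link_cost[a][b] is an assumed symmetric cost (lower means a faster link).
    Brute force over all k-subsets, which is only reasonable for the small
    number of GPUs found in a single node.
    """
    if k > len(free_gpus):
        return None  # not enough free GPUs to satisfy the request

    def total(group):
        # Sum the cost of every GPU pair inside the candidate group.
        return sum(link_cost[a][b] for a, b in combinations(group, 2))

    return min(combinations(sorted(free_gpus), k), key=total)


# Hypothetical 4-GPU node: pairs (0, 1) and (2, 3) share a fast link (cost 1),
# while traffic between the two pairs is slower (cost 4).
LINK_COST = [
    [0, 1, 4, 4],
    [1, 0, 4, 4],
    [4, 4, 0, 1],
    [4, 4, 1, 0],
]

print(pick_gpus({0, 2, 3}, LINK_COST, 2))
```

With GPU 1 already taken, the sketch prefers the tightly linked pair (2, 3) over any pair that includes GPU 0.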

    Coverage centralities for temporal networks

    The structure of real networked systems, such as social relationships, can be modeled as temporal networks in which each edge appears only at a prescribed time. Understanding the structure of temporal networks requires quantifying the importance of a temporal vertex, which is a pair consisting of a vertex index and a time. In this paper, we define two centrality measures of a temporal vertex based on the fastest temporal paths that use the temporal vertex. The definition is free from parameters and robust to changes in the time scale on which we focus. In addition, we can efficiently compute these centrality values for all temporal vertices. Using the two centrality measures, we reveal that the distributions of these centrality values in real-world temporal networks are heterogeneous. For various datasets, we also demonstrate that a majority of the highly central temporal vertices are located within a narrow time window around a particular time. In other words, there is a bottleneck time at which most information sent in the temporal network passes through a small number of temporal vertices, which suggests an important role of these temporal vertices in spreading phenomena. Comment: 13 pages, 10 figures
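As a rough illustration of what a temporal path is, the sketch below computes earliest arrival times in a temporal network given as timed contacts. It is a related building block only, not the centrality measures defined in the paper. The function name, the `(u, v, t)` edge format, and the assumptions that traversal is instantaneous and that contact times must strictly increase along a path are all choices made for this example.

```python
def earliest_arrival(edges, source, t_start):
    """Earliest time each vertex is reachable from `source`.

    edges: iterable of (u, v, t), a directed contact u -> v usable only at time t.
    Assumptions (for this sketch): traversal is instantaneous, the first
    contact may occur at any time >= t_start, and later contacts on a path
    must have strictly increasing times.
    """
    arrival = {source: t_start}
    # One scan over the contacts in time order is enough under these assumptions.
    for u, v, t in sorted(edges, key=lambda e: e[2]):
        if u not in arrival:
            continue
        usable = t >= t_start if u == source else arrival[u] < t
        if usable and t < arrival.get(v, float("inf")):
            arrival[v] = t
    return arrival


contacts = [("a", "b", 1), ("b", "c", 3), ("a", "c", 5), ("b", "c", 1)]
print(earliest_arrival(contacts, "a", 0))
```

Here the contact b -> c at time 1 cannot be used, because b is only reached at time 1, so c is first reached at time 3 through b rather than at time 5 directly from a.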

    Sampling Local Fungal Diversity in an Undergraduate Laboratory using DNA Barcoding

    Traditional methods for fungal species identification require diagnostic morphological characters and are often limited by the availability of fresh fruiting bodies and local identification resources. DNA barcoding offers an additional method of species identification and is rapidly developing as a critical tool in fungal taxonomy. As an exercise in an undergraduate biology course, we identified 9 specimens collected from the Hendrix College campus in Conway, Arkansas, USA to the genus or species level using morphology. We report that DNA barcoding targeting the internal transcribed spacer (ITS) region supported several of our taxonomic determinations and we were able to contribute 5 ITS sequences to GenBank that were supported by vouchered collection information. We suggest that small-scale barcoding projects are possible and that they have value for documenting fungal diversity.

    Combined burden and functional impact tests for cancer driver discovery using DriverPower

    The discovery of driver mutations is one of the key motivations for cancer genome sequencing. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2658 cancers across 38 tumour types, we describe DriverPower, a software package that uses mutational burden and functional impact evidence to identify driver mutations in coding and non-coding sites within cancer whole genomes. Using a total of 1373 genomic features derived from public sources, DriverPower's background mutation model explains up to 93% of the regional variance in the mutation rate across multiple tumour types. By incorporating functional impact scores, we are able to further increase the accuracy of driver discovery. Testing across a collection of 2583 cancer genomes from the PCAWG project, DriverPower identifies 217 coding and 95 non-coding driver candidates. Compared with six published methods used by the PCAWG Drivers and Functional Interpretation Working Group, DriverPower has the highest F1 score for both coding and non-coding driver discovery. This demonstrates that DriverPower is an effective framework for computational driver discovery.
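For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall. The sketch below computes it for a predicted set of driver candidates against a reference set. The function name and the element labels are placeholders for illustration and are not taken from DriverPower or the PCAWG data.

```python
def f1_score(predicted, truth):
    """Harmonic mean of precision and recall for two collections of labels."""
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)
    if true_positives == 0:
        return 0.0  # also covers empty inputs, avoiding division by zero
    precision = true_positives / len(predicted)
    recall = true_positives / len(truth)
    return 2 * precision * recall / (precision + recall)


# Placeholder labels: 2 of 4 predictions are correct, 2 of 3 true drivers found.
score = f1_score({"e1", "e2", "e3", "e4"}, {"e1", "e2", "e5"})
print(round(score, 3))
```

In this toy case precision is 1/2 and recall is 2/3, giving an F1 score of 4/7.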

    Communication in organizations: the heart of information systems

    We propose a theory characterizing information systems (IS) as language communities which use and develop domain-specific languages for communication. Our theory is anchored in Language Critique, a branch of philosophy of language. In developing our theory, we draw on Systems Theory and Cybernetics as a theoretical framework. "Organization" of a system is directly related to communication of its sub-systems. "Big systems" are self-organizing and the control of this ability is disseminated throughout the system itself. Therefore, the influence on changes of the system from its outside is limited. Operations intended to change an organization are restricted to indirect approaches. The creation of domain-specific languages by the system itself leads to advantageous communication costs compared to colloquial communication at the price of set-up costs for language communities. Furthermore, we demonstrate how our theoretical constructs help to describe and predict the behavior of IS. Finally, we discuss implications of our theory for further research and IS in general. Keywords: Language Critique, language communities, communication, self-organization, IS research

    Order-Revealing Encryption and the Hardness of Private Learning

    An order-revealing encryption scheme gives a public procedure by which two ciphertexts can be compared to reveal the ordering of their underlying plaintexts. We show how to use order-revealing encryption to separate computationally efficient PAC learning from efficient (ϵ, δ)-differentially private PAC learning. That is, we construct a concept class that is efficiently PAC learnable, but for which every efficient learner fails to be differentially private. This answers a question of Kasiviswanathan et al. (FOCS '08, SIAM J. Comput. '11). To prove our result, we give a generic transformation from an order-revealing encryption scheme into one with strongly correct comparison, which enables the consistent comparison of ciphertexts that are not obtained as the valid encryption of any message. We believe this construction may be of independent interest. Comment: 28 pages

    First assessment of the plant phenology index (PPI) for estimating gross primary productivity in African semi-arid ecosystems

    The importance of semi-arid ecosystems in the global carbon cycle as sinks for CO2 emissions has recently been highlighted. Africa is a carbon sink and nearly half its area comprises arid and semi-arid ecosystems. However, there are uncertainties regarding CO2 fluxes for semi-arid ecosystems in Africa, particularly savannas and dry tropical woodlands. In order to improve on existing remote-sensing based methods for estimating carbon uptake across semi-arid Africa we applied and tested the recently developed plant phenology index (PPI). We developed a PPI-based model estimating gross primary productivity (GPP) that accounts for canopy water stress, and compared it against three other Earth observation-based GPP models: the temperature and greenness model, the greenness and radiation model and a light use efficiency model. The models were evaluated against in situ data from four semi-arid sites in Africa with varying tree canopy cover (3 to 65 percent). Evaluation results from the four GPP models showed reasonable agreement with in situ GPP measured from eddy covariance flux towers (EC GPP) based on coefficient of variation, root-mean-square error, and Bayesian information criterion. The PPI-based GPP model was able to capture the magnitude of EC GPP better than the other tested models. The results of this study show that a PPI-based GPP model is a promising tool for the estimation of GPP in the semi-arid ecosystems of Africa. Comment: Accepted manuscript; 12 pages, 4 tables, 9 figures