78 research outputs found

    Streaming Graph Challenge: Stochastic Block Partition

    An important objective for analyzing real-world graphs is to achieve scalable performance on large, streaming graphs. A challenging and relevant example is the graph partition problem. As a combinatorial problem, graph partition is NP-hard, but existing relaxation methods provide reasonable approximate solutions that can be scaled for large graphs. Competitive benchmarks and challenges have proven to be an effective means to advance state-of-the-art performance and foster community collaboration. This paper describes a graph partition challenge with a baseline partition algorithm of sub-quadratic complexity. The algorithm employs rigorous Bayesian inferential methods based on a statistical model that captures characteristics of real-world graphs. This strong foundation enables the algorithm to address limitations of well-known graph partition approaches such as modularity maximization. This paper describes various aspects of the challenge, including: (1) the data sets and streaming graph generator, (2) the baseline partition algorithm with pseudocode, (3) an argument for the correctness of parallelizing the Bayesian inference, (4) different parallel computation strategies such as node-based parallelism and matrix-based parallelism, (5) evaluation metrics for partition correctness and computational requirements, (6) preliminary timing of a Python-based demonstration code and the open source C++ code, and (7) considerations for partitioning the graph in streaming fashion. Data sets and source code for the algorithm, as well as metrics and detailed documentation, are available at GraphChallenge.org.
    Comment: To be published in 2017 IEEE High Performance Extreme Computing Conference (HPEC).
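
    Since the abstract centers on Bayesian stochastic block partition, a toy version of its inner loop may help make the idea concrete. The sketch below is a greedy, non-degree-corrected simplification for illustration, not the challenge's baseline algorithm: the baseline uses MCMC move proposals and an agglomerative block-merge schedule, and updates its block edge-count matrix incrementally rather than recomputing it from scratch.

```python
# Toy node-move loop for stochastic block partition. Greedy and
# non-degree-corrected; recomputes all counts on every proposal for
# clarity, where the real baseline updates them incrementally.
import numpy as np

def block_edge_counts(adj, blocks, B):
    """M[r, s] = total edge weight from block r to block s."""
    M = np.zeros((B, B))
    for i, j in zip(*adj.nonzero()):
        M[blocks[i], blocks[j]] += adj[i, j]
    return M

def log_likelihood(M, n):
    """Poisson SBM profile log-likelihood: sum of M_rs * log(M_rs / (n_r * n_s))."""
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = M * np.log(M / np.outer(n, n))
    return np.where(M > 0, terms, 0.0).sum()

def greedy_partition(adj, B, sweeps=10, seed=0):
    """Sweep over nodes, moving each to the block that maximizes the likelihood."""
    rng = np.random.default_rng(seed)
    N = adj.shape[0]
    blocks = rng.integers(0, B, size=N)
    for _ in range(sweeps):
        for v in rng.permutation(N):
            scores = []
            for r in range(B):  # try every candidate block for node v
                blocks[v] = r
                n = np.bincount(blocks, minlength=B)
                scores.append(log_likelihood(block_edge_counts(adj, blocks, B), n))
            blocks[v] = int(np.argmax(scores))
    return blocks
```

    Even this stripped-down version hints at why node-based parallelism is natural: in the baseline, each proposed move touches only the rows and columns of the block edge-count matrix belonging to the source and destination blocks.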

    THE RESPONSE OF YOUNG VALENCIA ORANGE TREES TO DIFFERENTIAL BORON SUPPLY IN SAND CULTURE


    Performance Measurements of Supercomputing and Cloud Storage Solutions

    Increasing amounts of data from varied sources, particularly in the fields of machine learning and graph analytics, are causing storage requirements to grow rapidly. A variety of technologies exist for storing and sharing these data, ranging from parallel file systems used by supercomputers to distributed block storage systems found in clouds. Relatively few comparative measurements exist to inform decisions about which storage systems are best suited for particular tasks. This work provides these measurements for two of the most popular storage technologies: Lustre and Amazon S3. Lustre is an open-source, high-performance parallel file system used by many of the largest supercomputers in the world. Amazon's Simple Storage Service, or S3, is part of the Amazon Web Services offering and provides a scalable, distributed option to store and retrieve data from anywhere on the Internet. Parallel processing is essential for achieving high performance on modern storage systems. The performance tests used span the gamut of parallel I/O scenarios, ranging from single-client, single-node Amazon S3 and Lustre performance to a large-scale, multi-client test designed to demonstrate the capabilities of a modern storage appliance under heavy load. These results show that, when parallel I/O is used correctly (i.e., many simultaneous read or write processes), full network bandwidth performance is achievable, ranging from 10 gigabits/s over a 10 GigE S3 connection to 0.35 terabits/s using Lustre on a 1200-port 10 GigE switch. These results demonstrate that S3 is well-suited to sharing vast quantities of data over the Internet, while Lustre is well-suited to processing large quantities of data locally.
    Comment: 5 pages, 4 figures, to appear in IEEE HPEC 2017.
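
    The central performance lesson, that bandwidth saturates only with many simultaneous readers or writers, is easy to reproduce in miniature. Below is a minimal sketch of parallel S3 reads using boto3; the bucket and object names are hypothetical placeholders, and a benchmark in the paper's style would spread clients across many processes and nodes rather than threads in a single process.

```python
# Minimal "many simultaneous readers" sketch against S3.
import time
from concurrent.futures import ThreadPoolExecutor

import boto3  # AWS SDK for Python

BUCKET = "example-benchmark-bucket"               # hypothetical bucket
KEYS = [f"shard-{i:04d}.bin" for i in range(64)]  # hypothetical objects

s3 = boto3.client("s3")

def fetch(key):
    """Read one object end to end and return the number of bytes moved."""
    return len(s3.get_object(Bucket=BUCKET, Key=key)["Body"].read())

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=16) as pool:  # 16 concurrent GETs
    total_bytes = sum(pool.map(fetch, KEYS))
elapsed = time.perf_counter() - start

print(f"{total_bytes * 8 / elapsed / 1e9:.2f} Gb/s aggregate")
```

    The same pattern applies on the Lustre side, where many processes on many nodes read or write separate files or file stripes concurrently.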

    THE INFLUENCE OF ROOTSTOCK ON THE MINERAL COMPOSITION OF VALENCIA ORANGE LEAVES


    GraphChallenge.org: Raising the Bar on Graph Analytic Performance

    The rise of graph analytic systems has created a need for new ways to measure and compare the capabilities of graph processing systems. The MIT/Amazon/IEEE Graph Challenge has been developed to provide a well-defined community venue for stimulating research and highlighting innovations in graph analysis software, hardware, algorithms, and systems. GraphChallenge.org provides a wide range of pre-parsed graph data sets, graph generators, mathematically defined graph algorithms, example serial implementations in a variety of languages, and specific metrics for measuring performance. Graph Challenge 2017 received 22 submissions by 111 authors from 36 organizations. The submissions highlighted graph analytic innovations in hardware, software, algorithms, systems, and visualization. These submissions produced many comparable performance measurements that can be used for assessing the current state of the art of the field. Numerous submissions implemented the triangle counting challenge, resulting in over 350 distinct measurements. Analysis of these submissions shows that their execution time is a strong function of the number of edges in the graph, $N_e$, and is typically proportional to $N_e^{4/3}$ for large values of $N_e$. Combining the model fits of the submissions presents a picture of the current state of the art of graph analysis, which is typically $10^8$ edges processed per second for graphs with $10^8$ edges. These results are 30 times faster than serial implementations commonly used by many graph analysts and underscore the importance of making these performance benefits available to the broader community. Graph Challenge provides a clear picture of current graph analysis systems and underscores the need for new innovations to achieve high performance on very large graphs.
    Comment: 7 pages, 6 figures; submitted to IEEE HPEC Graph Challenge. arXiv admin note: text overlap with arXiv:1708.0686
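
    The reported scaling law invites a quick back-of-the-envelope model. If execution time grows as $t = c \cdot N_e^{4/3}$, the processing rate $N_e / t$ falls off as $N_e^{-1/3}$. The snippet below calibrates the constant so the rate is $10^8$ edges/s at $N_e = 10^8$, matching the reported state of the art; the constant is an assumption chosen for illustration, not a value fitted in the paper.

```python
# Back-of-the-envelope model of the reported Ne**(4/3) scaling.
# c is chosen so that rate(1e8) = 1e8 edges/s (an illustrative
# calibration, not a fitted value from the paper).
c = 1.0 / (1e8 * 1e8 ** (1.0 / 3.0))

def rate(num_edges):
    """Edges processed per second under the Ne^(4/3) execution-time model."""
    return num_edges / (c * num_edges ** (4.0 / 3.0))

for ne in (1e6, 1e8, 1e10):
    print(f"Ne = {ne:.0e}: {rate(ne):.2e} edges/s")
# The rate drops ~4.6x for every 100x increase in edges, i.e. as Ne**(-1/3).
```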

    Steppe-tundra composition and deglacial floristic turnover in interior Alaska revealed by sedimentary ancient DNA (sedaDNA)

    When tracing vegetation dynamics over long timescales, obtaining enough floristic information to gain a detailed understanding of past communities and their transitions can be challenging. The first high-resolution sedimentary DNA (sedaDNA) metabarcoding record from lake sediments in Alaska—reported here—covers nearly 15,000 years of change. It shows in unprecedented detail the composition of late-Pleistocene “steppe-tundra” vegetation of ice-free Alaska, part of an intriguing late-Quaternary “no-analogue” biome, and it covers the subsequent changes that led to the development of the modern spruce-dominated boreal forest. The site (Chisholm Lake) lies close to key archaeological sites, and the record throws new light on the landscape and resources available to early humans. Initially, vegetation was dominated by forbs found in modern tundra and/or subarctic steppe vegetation (e.g., Potentilla, Draba, Eritrichium, Anemone patens) and graminoids (e.g., Bromus pumpellianus, Festuca, Calamagrostis, Puccinellia), with Salix the only prominent woody taxon. Predominantly xeric, warm-to-cold habitats are indicated, and we explain the mixed ecological preferences of the fossil assemblages as a topo-mosaic strongly affected by insolation load. At ca. 14,500 cal yr BP (calendar years before C.E. 1950), about the same time as well-documented human arrivals and coincident with an increase in effective moisture, Betula expanded. Graminoids became less abundant, but many open-ground forb taxa persisted. This woody-herbaceous mosaic is compatible with the observed persistence of Pleistocene megafaunal species (animals weighing ≥44 kg)—important resources for early humans. The greatest taxonomic turnover, marking a transition to regional woodland and a further moisture increase, began ca. 11,000 cal yr BP, when Populus expanded along with new shrub taxa (e.g., Shepherdia, Elaeagnus, Rubus, Viburnum). Picea then expanded ca. 9,500 cal yr BP, along with shrub and forb taxa typical of evergreen boreal woodland (e.g., Spiraea, Cornus, Linnaea). We found no evidence for Picea in the late Pleistocene, however. Most taxa present today were established by ca. 5,000 cal yr BP, after almost complete taxonomic turnover since the start of the record (though Larix appeared only at ca. 1,500 cal yr BP). Prominent fluctuations in aquatic communities ca. 14,000–9,500 cal yr BP are probably related to lake-level fluctuations prior to the lake reaching its high, near-modern depth ca. 8,000 cal yr BP.

    Feasibility study of small and micro wind turbines for residential use in New Zealand: an analysis of technical implementation, spatial planning processes and of economic viability of small and micro scale wind energy generation systems for residential use in New Zealand

    Even though there might not seem to be any similarity between a holiday lodge on the edge of New Zealand’s Banks Peninsula, a satellite earth station on the unmanned Black Island in the middle of the Ross Ice Shelf in Antarctica, and an American stargazer on his property in the middle of the Arizona desert, they all have something in common. They, among many other people across the globe, use wind, a free resource, to generate eco-friendly electricity with small and micro scale wind turbines. Japan, the USA and the UK, for example, have already installed thousands of domestic wind turbines. In New Zealand, small and micro scale wind energy generation has not yet established itself among other distributed energy generation methods on a domestic scale, even though the conditions for wind energy generation are perfect in many places. The aim of this study was to assess the potential of domestic wind turbines in New Zealand. It established an overview of small and micro scale wind energy generation planning and implementation processes to gain insight into the effectiveness, feasibility and straightforwardness of the processes involved. To this end, the economic, technical and planning aspects of domestic wind energy generation systems were analysed to investigate the benefits of small and micro scale wind energy generation.

    Using Transport Services instead of specific Transport Protocols

    For most applications, the transport service providers to be used are determined during development of the application. This makes it difficult to take the application’s communication requirements into account and to exploit specific features of the underlying network technology. Specialized protocols that are more efficient and offer a qualitatively better service are typically not supported by most applications because they are not commonly available. In this paper we propose a concept for realizing protocol-independent transport services: only a transport service is fixed during development of the application, and an appropriate transport service provider is selected dynamically at run time. This makes it possible to exploit specialized protocols where available, while still using standard protocols where necessary. The main focus of this paper is how a transport service can transparently supply a new transport service provider to existing applications. A prototype is presented that maps TCP/IP-based applications to an ATM-specific transport service provider offering reliable and unreliable transport services comparable to those of TCP/IP.
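
    The core mechanism, an application coded against an abstract transport service with the concrete provider bound at run time, can be sketched in a few lines. The class and method names below are illustrative rather than the paper’s API, and plain TCP stands in for the prototype’s ATM-specific provider.

```python
# Sketch of run-time selection of a transport service provider.
import socket
from abc import ABC, abstractmethod

class TransportService(ABC):
    """What the application sees: a connection-oriented byte-stream service."""

    @abstractmethod
    def available(self) -> bool: ...

    @abstractmethod
    def connect(self, host: str, port: int): ...

class SpecializedProvider(TransportService):
    """Stand-in for a network-specific provider (ATM in the prototype)."""

    def available(self) -> bool:
        return False  # a real provider would probe for its network here

    def connect(self, host, port):
        raise NotImplementedError("specialized network not present")

class TcpProvider(TransportService):
    """The commonly available default provider."""

    def available(self) -> bool:
        return True

    def connect(self, host, port):
        return socket.create_connection((host, port))

def open_transport(host, port, providers=(SpecializedProvider(), TcpProvider())):
    """Bind the first usable provider at run time, preferring specialized ones."""
    for provider in providers:
        if provider.available():
            return provider.connect(host, port)
    raise RuntimeError("no transport service provider available")
```

    An application would call open_transport(...) instead of creating a socket directly, so a specialized provider can be slotted in later without changing application code.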