
    Profiling and Improving I/O Performance of a Large-Scale Climate Scientific Application

    Exascale computing systems are set to emerge soon, and the widening gap between compute and I/O performance will pose great challenges. Many large-scale scientific applications play an important role in daily life, and the huge volumes of data they generate require highly parallel and efficient I/O management policies. In this paper, we take a mission-critical scientific application, GEOS-5, as a case study to profile and analyze the communication and I/O issues that prevent applications from fully utilizing the underlying parallel storage systems. Through detailed architectural and experimental characterization, we observe that the current legacy I/O schemes incur significant network communication overhead and cannot fully parallelize data access, degrading the application's I/O performance and scalability. To address these inefficiencies, we redesign its I/O framework and apply a set of parallel I/O techniques to achieve high scalability and performance. Evaluation results on the NASA Discover cluster show that our optimization of GEOS-5 with ADIOS yields significant performance improvements over the original GEOS-5 implementation.
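
    As a rough illustration of the kind of parallel data access such a redesign targets (this is not the paper's ADIOS-based framework, only a minimal sketch using mpi4py collective MPI-IO), each rank below writes its own slice of an array directly to a shared file instead of funneling everything through a single writer; the file name, array size, and data layout are assumptions made for the example.

```python
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

# Each rank owns a contiguous slice of the global field (size is illustrative).
local = np.full(1024, rank, dtype="f8")

# Collective, shared-file write: every rank writes its own byte range in parallel,
# avoiding the gather-to-root pattern that serializes I/O and adds network traffic.
fh = MPI.File.Open(comm, "field.dat", MPI.MODE_CREATE | MPI.MODE_WRONLY)
fh.Write_at_all(rank * local.nbytes, local)
fh.Close()
```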

    Leveraging Reinforcement Learning for Task Resource Allocation in Scientific Workflows

    Scientific workflows are designed as directed acyclic graphs (DAGs) and consist of multiple dependent task definitions. They are executed over large amounts of data, often resulting in thousands of tasks with heterogeneous compute requirements and long runtimes, even on cluster infrastructures. In order to optimize workflow performance, enough resources, e.g., CPU and memory, need to be provisioned for the respective tasks. Typically, workflow systems rely on user resource estimates, which are known to be highly error-prone and can result in over- or underprovisioning. While resource overprovisioning leads to high resource wastage, underprovisioning can result in long runtimes or even failed tasks. In this paper, we propose two different reinforcement learning approaches, based on gradient bandits and Q-learning respectively, to minimize resource wastage by selecting suitable CPU and memory allocations. We provide a prototypical implementation in the well-known scientific workflow management system Nextflow, evaluate our approaches with five workflows, and compare them against the default resource configurations and a state-of-the-art feedback loop baseline. The evaluation shows that our reinforcement learning approaches significantly reduce resource wastage compared to the default configuration. Further, our approaches also reduce the allocated CPU hours compared to the state-of-the-art feedback loop by 6.79% and 24.53%. (Comment: paper accepted at the BPOD workshop of the 2022 IEEE International Conference on Big Data.)
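
    A minimal sketch of the gradient-bandit idea applied to memory sizing, under assumed reward shaping: the agent repeatedly picks one of a few candidate allocations, observes the task's peak memory use, and is rewarded for low wastage, with under-provisioned (failed) runs penalized. The candidate grid, reward function, and simulated peak-memory distribution are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

actions = np.array([2.0, 4.0, 8.0, 16.0])   # candidate memory allocations in GB (illustrative)
prefs = np.zeros(len(actions))              # gradient-bandit action preferences
baseline, steps, lr = 0.0, 0, 0.1
rng = np.random.default_rng(0)

def reward(alloc_gb, peak_gb):
    """Hypothetical shaping: penalize wastage; a failed run wastes the whole allocation."""
    return -(alloc_gb - peak_gb) if alloc_gb >= peak_gb else -alloc_gb

for _ in range(500):
    probs = np.exp(prefs - prefs.max())
    probs /= probs.sum()                     # softmax over preferences
    a = rng.choice(len(actions), p=probs)    # sample an allocation
    peak = rng.normal(5.0, 1.0)              # simulated peak memory of this task run
    r = reward(actions[a], peak)
    steps += 1
    baseline += (r - baseline) / steps       # running mean reward as baseline
    onehot = np.zeros(len(actions))
    onehot[a] = 1.0
    prefs += lr * (r - baseline) * (onehot - probs)   # gradient-bandit update

print("allocation probabilities:", np.round(probs, 3))
```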

    Soft peer review: social software and distributed scientific evaluation

    The debate on the prospects of peer review in the Internet age and the increasing criticism leveled against the dominant role of impact factor indicators are calling for new measurable criteria to assess scientific quality. Usage-based metrics offer a new avenue to scientific quality assessment but face the same risks as first-generation search engines that used unreliable metrics (such as raw traffic data) to estimate content quality. In this article, I analyze the contribution that social bookmarking systems can make to the problem of usage-based metrics for scientific evaluation. I suggest that collaboratively aggregated metadata may help fill the gap between traditional citation-based criteria and raw usage factors. I submit that bottom-up, distributed evaluation models such as those afforded by social bookmarking will challenge more traditional quality assessment models in terms of coverage, efficiency, and scalability. Services aggregating user-related quality indicators for online scientific content will come to occupy a key function in the scholarly communication system.

    Development of a framework for the evaluation of the environmental benefits of controlled traffic farming

    Although controlled traffic farming (CTF) is an environmentally friendly soil management system, no quantitative evaluation of its environmental benefits is available. This paper aims to establish a framework for the quantitative evaluation of the environmental benefits of CTF, considering a list of benefits, namely reducing soil compaction, runoff/erosion, energy requirements and greenhouse gas (GHG) emissions, conserving organic matter, and enhancing soil biodiversity and fertiliser use efficiency. The choice and weighting of the impact of each environmental benefit were based on a comprehensive literature review and the European Commission Soil Framework Directive. The framework was validated using data from three selected farms. For Colworth farm (Unilever, UK), the framework predicted the largest overall environmental benefit, 59.3% of the theoretical maximum achievable benefit (100%), compared with the other two farms in Scotland (52%) and Australia (47.3%). This overall benefit breaks down into: reducing soil compaction (24%), tillage energy requirements (10%) and GHG emissions (3%), enhancing soil biodiversity (7%) and erosion control (6%), conserving organic matter (6%), and improving fertiliser use efficiency (3%). A similar evaluation can be performed for any farm worldwide, provided that data on soil properties, topography, machinery, and weather are available.
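
    A minimal sketch of how such a weighted-sum evaluation can be computed; the benefit weights and per-farm attainment scores below are illustrative placeholders, not the values used in the published framework.

```python
# Relative weight of each environmental benefit (fractions of the 100% theoretical maximum)
# and the fraction of that benefit realised on one hypothetical farm - both illustrative.
weights = {
    "soil_compaction": 0.30, "runoff_erosion": 0.15, "tillage_energy": 0.15,
    "ghg_emissions": 0.10, "organic_matter": 0.10, "soil_biodiversity": 0.10,
    "fertiliser_efficiency": 0.10,
}
attainment = {
    "soil_compaction": 0.8, "runoff_erosion": 0.4, "tillage_energy": 0.7,
    "ghg_emissions": 0.3, "organic_matter": 0.6, "soil_biodiversity": 0.7,
    "fertiliser_efficiency": 0.3,
}

# Overall benefit as a weighted sum, expressed as a percentage of the theoretical maximum.
overall = 100 * sum(weights[k] * attainment[k] for k in weights)
print(f"Overall environmental benefit: {overall:.1f}% of the theoretical maximum")
```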

    Energy rating of a water pumping station using multivariate analysis

    Among water management policies, preserving and saving energy in water supply and treatment systems play key roles. When focusing on energy, the customary metric for determining the performance of water supply systems is linked to the definition of component-based energy indicators. This approach is unfit to account for interactions occurring among system elements or between the system and its environment. On the other hand, the development of information technology has made increasingly large amounts of data available, typically gathered from distributed sensor networks in so-called smart grids. In this context, data-intensive methodologies open up the possibility of using complex network modeling approaches and address the issues related to interpreting and analyzing the large amounts of data produced by smart sensor networks. In this perspective, the present work aims to use data-intensive techniques in the energy analysis of a water management network. The purpose is to provide new metrics for the energy rating of the system and to gain insight into the dynamics of its operations. The study applies a neural network as a tool to predict energy demand, using flowrate and vibration data as predictor variables.
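
    A minimal sketch of the modeling step, using scikit-learn's MLPRegressor on synthetic stand-in data; the real study draws flowrate and vibration measurements from the station's sensor network, and the data-generating equation below is purely illustrative.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for pumping-station data: flowrate and a vibration indicator as
# predictors, energy demand as the target.
rng = np.random.default_rng(42)
n = 2000
flow = rng.uniform(50, 400, n)        # m^3/h
vibration = rng.uniform(0.1, 2.0, n)  # arbitrary vibration index
energy = 0.05 * flow + 3.0 * vibration + 2e-4 * flow**2 + rng.normal(0, 1.5, n)

X = np.column_stack([flow, vibration])
X_train, X_test, y_train, y_test = train_test_split(X, energy, random_state=0)

# Scale inputs, then fit a small feed-forward network to predict energy demand.
model = make_pipeline(StandardScaler(),
                      MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000, random_state=0))
model.fit(X_train, y_train)
print("R^2 on held-out data:", round(model.score(X_test, y_test), 3))
```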

    Large-Scale Analysis of the Accuracy of the Journal Classification Systems of Web of Science and Scopus

    Journal classification systems play an important role in bibliometric analyses. The two most important bibliographic databases, Web of Science and Scopus, each provide a journal classification system. However, no study has systematically investigated the accuracy of these classification systems. To examine and compare the accuracy of journal classification systems, we define two criteria on the basis of direct citation relations between journals and categories. We use Criterion I to select journals that have weak connections with their assigned categories, and we use Criterion II to identify journals that are not assigned to categories with which they have strong connections. If a journal satisfies either of the two criteria, we conclude that its assignment to categories may be questionable. Accordingly, we identify all journals with questionable classifications in Web of Science and Scopus. Furthermore, we perform a more in-depth analysis for the field of Library and Information Science to assess whether our proposed criteria are appropriate and whether they yield meaningful results. It turns out that, according to our citation-based criteria, Web of Science performs significantly better than Scopus in terms of the accuracy of its journal classification system.
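
    A minimal sketch of how the two criteria can be operationalized; the thresholds and the toy citation counts are illustrative assumptions, not the calibrated values used in the paper.

```python
# journal -> number of direct citation links with each category (toy data)
citations = {
    "Journal A": {"LIS": 120, "Computer Science": 40},
    "Journal B": {"LIS": 5, "Management": 90},
}
assigned = {"Journal A": ["LIS"], "Journal B": ["LIS"]}

WEAK_SHARE = 0.10    # Criterion I: an assigned category receives < 10% of the journal's links
STRONG_SHARE = 0.40  # Criterion II: an unassigned category receives > 40% of the links

for journal, links in citations.items():
    total = sum(links.values())
    shares = {cat: n / total for cat, n in links.items()}
    weak = [c for c in assigned[journal] if shares.get(c, 0.0) < WEAK_SHARE]
    strong_missing = [c for c, s in shares.items()
                      if s > STRONG_SHARE and c not in assigned[journal]]
    if weak or strong_missing:
        print(f"{journal}: questionable assignment "
              f"(weakly connected: {weak}; strongly connected but unassigned: {strong_missing})")
```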

    Holistic management – a critical review of Allan Savory’s grazing method

    Allan Savory is the man behind holistic grazing and the founder of the Savory Institute. Savory claims that holistic grazing can stop desertification and reduce atmospheric carbon dioxide levels to pre-industrial levels in a few decades. In this report, we review the literature on holistic grazing in order to evaluate the scientific support behind these statements.