
    Mining developer communication data streams

    This paper explores modelling a software development project as a process that produces a continuous stream of data. In the Jazz repository used in this research, one aspect of that stream is developer communication. Such data can be used to create an evolving social network characterized by a range of metrics. This paper applies data stream mining techniques to identify the most useful metrics for predicting build outcomes. Results are presented from applying the Hoeffding Tree classification method in conjunction with the Adaptive Sliding Window (ADWIN) method for detecting concept drift. The results indicate that only a small number of the metrics considered have any significance for predicting the outcome of a build.
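As a rough illustration of the statistical test behind Hoeffding Tree classifiers in general, the sketch below computes the Hoeffding bound and uses it to decide whether enough stream examples have been seen to split a node. The function names, the gain values, and the `delta` setting are illustrative assumptions; the abstract does not state the configuration used in the paper, and ADWIN drift detection is not shown.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Hoeffding bound: with probability 1 - delta, the true mean of a
    variable with range `value_range` lies within this epsilon of the
    mean observed over n independent samples."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain, second_gain, value_range, delta, n):
    """Split a leaf when the best attribute beats the runner-up by more
    than the bound (hypothetical helper, for illustration only)."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


if __name__ == "__main__":
    # Assumed values: gains measured on a [0, 1] scale after 1000 examples.
    print(hoeffding_bound(1.0, 1e-7, 1000))
    print(should_split(0.30, 0.15, 1.0, 1e-7, 1000))
```

The bound shrinks as more examples arrive, so a tree grown this way commits to a split only once the observed difference between attributes is unlikely to be noise.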

    ISER 2012 Working Paper No. 1

    Large resource development projects take years to plan. During that planning time, the public frequently debates the potential benefits and risks of a project, but with incomplete information. In these debates, some people might assert that a project would have great benefits, while others might assert that it would certainly harm the environment. At the same time, the developer will be assessing different designs, before finally submitting one to the government permitting agencies for evaluation and public scrutiny. For large mines in Alaska, the government permitting process takes years, and often includes an ecological risk assessment. This assessment is a data-intensive, scientific evaluation of the project’s potential ecological risks, based on the specific details of the project. Recently, some organizations have tried to bring scientific rigor to the pre-design public discussions, especially for mining projects, through a pre-design ecological risk assessment. This is a scientific assessment of the environmental risks a project might pose, before the details of project design, risk-prevention, and risk-mitigation measures are known. It is important to know whether pre-design risk assessment is a viable method for drawing conclusions about risks of projects. If valid risk predictions can be made at that stage, then people or governments would not have to wait either for a design or for the detailed evaluation that is done during the permitting process. Such an approach could be used to shortcut permitting. It could affect project financing; it could affect the schedule, priority, or even the resources that governments put toward evaluating a project. But perhaps most important: in an age where public perceptions are an important influence on a project’s viability and government permitting decisions, a realistic risk assessment can be used to focus public attention on the facts. But if the methodology is flawed and results in poor quality information and unsupportable conclusions, then a pre-design risk assessment could unjustifiably either inflame or calm the public, depending on what it predicts.
    Contents: Executive Summary / Section 1. Introduction / Section 2. Overview of Ecological Risk / Section 3. Ecological Risk Assessment Methodology / Section 4. Examples of Post-Design Ecological Risk Assessments / Section 5. Pre-Design Ecological Risk Assessment: Risks of Large Scale Mining in the Bristol Bay Watershed / Section 6. Conclusion / Bibliograph

    SOLUTIONS FOR OPTIMIZING THE DATA PARALLEL PREFIX SUM ALGORITHM USING THE COMPUTE UNIFIED DEVICE ARCHITECTURE

    In this paper, we analyze solutions for optimizing the data parallel prefix sum function using the Compute Unified Device Architecture (CUDA), which provides a viable solution for accelerating a broad class of applications. The parallel prefix sum function is an essential building block for many data mining algorithms, and therefore its optimization facilitates the whole data mining process. Finally, we benchmark and evaluate the performance of the optimized parallel prefix sum building block in CUDA.
    Keywords: CUDA, threads, GPGPU, parallel prefix sum, parallel processing, task synchronization, warp
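To make the prefix sum building block concrete, here is a minimal sequential sketch of a work-efficient exclusive scan (an up-sweep followed by a down-sweep), written in Python rather than CUDA. It is one common data-parallel formulation and not necessarily the scheme the paper optimizes; the function name is an assumption. Each inner `for` loop touches independent elements, so on a GPU its iterations could run as parallel threads with a synchronization between strides.

```python
def exclusive_scan(values):
    """Exclusive prefix sum using up-sweep and down-sweep passes.

    Returns a list whose element i is the sum of values[0:i].
    """
    n = len(values)
    if n == 0:
        return []

    # Pad to the next power of two so the implicit tree is balanced.
    size = 1
    while size < n:
        size *= 2
    data = list(values) + [0] * (size - n)

    # Up-sweep (reduce): build partial sums at increasing strides.
    stride = 1
    while stride < size:
        for i in range(2 * stride - 1, size, 2 * stride):
            data[i] += data[i - stride]
        stride *= 2

    # Down-sweep: clear the root, then push prefixes back down the tree.
    data[size - 1] = 0
    stride = size // 2
    while stride >= 1:
        for i in range(2 * stride - 1, size, 2 * stride):
            left = data[i - stride]
            data[i - stride] = data[i]
            data[i] += left
        stride //= 2

    return data[:n]


if __name__ == "__main__":
    print(exclusive_scan([1, 2, 3, 4]))  # prints [0, 1, 3, 6]
```

The two passes perform a number of additions proportional to the input length, which is why this formulation is usually preferred over the simpler scan that does work proportional to n log n.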