
    Towards Accountable AI: Hybrid Human-Machine Analyses for Characterizing System Failure

    As machine learning systems move from computer-science laboratories into the open world, their accountability becomes a high-priority problem. Accountability requires deep understanding of system behavior and its failures. Current evaluation methods such as single-score error metrics and confusion matrices provide aggregate views of system performance that hide important shortcomings. Understanding details about failures is important for identifying pathways for refinement, communicating the reliability of systems in different settings, and specifying appropriate human oversight and engagement. Characterization of failures and shortcomings is particularly complex for systems composed of multiple machine-learned components. For such systems, existing evaluation methods have limited expressiveness in describing and explaining the relationship among input content, the internal states of system components, and final output quality. We present Pandora, a set of hybrid human-machine methods and tools for describing and explaining system failures. Pandora leverages both human and system-generated observations to summarize conditions of system malfunction with respect to the input content and system architecture. We share results of a case study with a machine learning pipeline for image captioning that show how detailed performance views can be beneficial for analysis and debugging.
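
The abstract's point that a single aggregate score can hide where a system fails can be illustrated with a small sketch. This is not Pandora's method; it only groups evaluation records by one input condition and compares the per-group failure rate with the overall rate. The record fields (`scene`, `correct`) and the toy data are made up for illustration.

```python
from collections import defaultdict


def failure_rate(records):
    """Overall fraction of records whose output was judged incorrect."""
    return sum(1 for r in records if not r["correct"]) / len(records)


def failure_rate_by_condition(records, condition_key):
    """Failure rate computed separately for each value of one input condition."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for r in records:
        value = r[condition_key]
        totals[value] += 1
        if not r["correct"]:
            failures[value] += 1
    return {value: failures[value] / totals[value] for value in totals}


# Toy evaluation records (hypothetical): 8 outdoor images, 2 indoor images.
records = (
    [{"scene": "outdoor", "correct": True} for _ in range(8)]
    + [{"scene": "indoor", "correct": False} for _ in range(2)]
)

overall = failure_rate(records)
by_scene = failure_rate_by_condition(records, "scene")
```

In this toy data the overall failure rate looks modest, while the breakdown shows that every failure falls in one input condition, which is the kind of detail an aggregate metric does not expose.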

    Co-management: A Synthesis of the Lessons Learned from the DFID Fisheries Management Science Programme

    For the last eleven years, the UK Department for International Development (DFID) has funded research projects to support the sustainable management of fisheries resources (both inland and marine) in developing countries through the Fisheries Management Science Programme (FMSP). A number of the projects commissioned in this time have examined fisheries co-management. While these projects have, for the most part, been implemented separately, the FMSP has provided an opportunity to synthesise and draw together some of the information they generated. We feel that there is value in distilling some of the important lessons, describing some of the useful tools and examples, and making these available through a single, accessible resource. The wealth of information generated means that it is impossible to cover everything in detail. It is hoped, however, that this synthesis will at least provide an overview of the co-management process, together with some useful information on implementing co-management in a developing country context and links to the more detailed resources available. The synthesis draws in particular on resources covering information systems for co-managed fisheries, participatory fish stock assessment (ParFish) and adaptive learning. This synthesis is aimed at anyone interested in fisheries management in a developing country context.

    A Concurrency-Agnostic Protocol for Multi-Paradigm Concurrent Debugging Tools

    Today's complex software systems combine high-level concurrency models. Each model is used to solve a specific set of problems. Unfortunately, debuggers support only the low-level notions of threads and shared memory, forcing developers to reason about these notions instead of the high-level concurrency models they chose. This paper proposes a concurrency-agnostic debugger protocol that decouples the debugger from the concurrency models employed by the target application. As a result, the underlying language runtime can define custom breakpoints, stepping operations, and execution events for each concurrency model it supports, and a debugger can expose them without having to be specifically adapted. We evaluated the generality of the protocol by applying it to SOMns, a Newspeak implementation, which supports a diversity of concurrency models including communicating sequential processes, communicating event loops, threads and locks, fork/join parallelism, and software transactional memory. We implemented 21 breakpoints and 20 stepping operations for these concurrency models. The debugger did not need to be changed for any of them. Furthermore, we visualize all concurrent interactions independently of a specific concurrency model. To show that tooling for a specific concurrency model is possible, we visualize actor turns and message sends separately. Comment: International Symposium on Dynamic Language
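
The decoupling described in this abstract can be pictured with a minimal sketch. It is not the paper's protocol or the SOMns implementation: every class name, field, and message key below (`BreakpointType`, `GenericDebugger`, `setBreakpoint`, and so on) is hypothetical. The only assumption taken from the abstract is that the runtime advertises its breakpoint and stepping types as metadata, and that the debugger relays them without knowing which concurrency model they belong to.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class BreakpointType:
    name: str        # identifier understood by the runtime (hypothetical)
    label: str       # text the debugger shows to the user
    applies_to: str  # kind of source element, e.g. "message-send"


@dataclass(frozen=True)
class SteppingType:
    name: str
    label: str


@dataclass
class RuntimeCapabilities:
    """Metadata a runtime sends to the debugger when a session starts."""
    breakpoints: list = field(default_factory=list)
    steppings: list = field(default_factory=list)


class GenericDebugger:
    """Contains no concurrency-model-specific logic; it only relays metadata."""

    def __init__(self, capabilities):
        self._caps = capabilities
        self.sent = []  # messages that would be sent to the runtime

    def menu_for(self, element_kind):
        """Labels of the breakpoints offered for one kind of source element."""
        return [bp.label for bp in self._caps.breakpoints
                if bp.applies_to == element_kind]

    def set_breakpoint(self, name, location):
        """Forward a breakpoint request; the runtime decides what it means."""
        if name not in {bp.name for bp in self._caps.breakpoints}:
            raise ValueError(f"runtime did not advertise breakpoint {name!r}")
        message = {"action": "setBreakpoint", "type": name, "location": location}
        self.sent.append(message)
        return message


def actor_runtime_capabilities():
    """Example capabilities for an actor-style runtime (illustrative names only)."""
    return RuntimeCapabilities(
        breakpoints=[
            BreakpointType("msg-receiver", "Break before message is processed",
                           "message-send"),
            BreakpointType("promise-resolved", "Break when promise is resolved",
                           "promise"),
        ],
        steppings=[SteppingType("next-turn", "Step to next turn")],
    )


debugger = GenericDebugger(actor_runtime_capabilities())
menu = debugger.menu_for("message-send")
request = debugger.set_breakpoint("msg-receiver", "example.ns:10")
```

Under this design, supporting another concurrency model would mean the runtime advertising additional entries; the `GenericDebugger` class itself would stay unchanged, which mirrors the property the abstract reports.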

    Functional Skills Support Programme: Developing functional skills in design and technology

    This booklet is part of "... a series of 11 booklets which helps schools to implement functional skills across the curriculum. The booklets illustrate how functional skills can be applied and developed in different subjects and contexts, supporting achievement at Key Stage 3 and Key Stage 4. Each booklet contains an introduction to functional skills for subject teachers, three practical planning examples with links to related websites and resources, a process for planning and a list of additional resources to support the teaching and learning of functional skills." - The National Strategies website

    GCE AS and A level subject criteria for design and technology


    A Guide to Evaluating Marine Spatial Plans

    Marine spatial plans are being developed in over 40 countries around the world to distribute human activities in marine areas more sustainably and achieve ecological, social, and economic objectives. Monitoring and evaluation are often considered only after a plan has been developed. This guide will help marine planners and managers monitor and evaluate the success of marine plans in achieving real results and outcomes. This report emphasizes the importance of early integration of monitoring and evaluation in the planning process, the importance of measurable and specific objectives, clear management actions, relevant indicators and targets, and involvement of stakeholders throughout the planning process.