    Vision of a Visipedia

    The web is not perfect: while text is easily searched and organized, pictures (the vast majority of the bits that one can find online) are not. In order to see how one could improve the web and make pictures first-class citizens of the web, I explore the idea of Visipedia, a visual interface for Wikipedia that is able to answer visual queries and enables experts to contribute and organize visual knowledge. Five distinct groups of humans would interact through Visipedia: users, experts, editors, visual workers, and machine vision scientists. The latter would gradually build automata able to interpret images. I explore some of the technical challenges involved in making Visipedia happen. I argue that Visipedia will likely grow organically, combining state-of-the-art machine vision with human labor.

    Task analysis of discrete and continuous skills: a dual methodology approach to human skills capture for automation

    There is a growing requirement within the field of intelligent automation for a formal methodology to capture and classify explicit and tacit skills deployed by operators during complex task performance. This paper describes the development of a dual methodology approach which recognises the inherent differences between continuous tasks and discrete tasks and which proposes separate methodologies for each. Both methodologies emphasise capturing operators’ physical, perceptual, and cognitive skills; however, they differ fundamentally in their approach. The continuous task analysis recognises the non-arbitrary nature of operation ordering and that identifying suitable cues for subtasks is a vital component of the skill. Discrete task analysis is a more traditional, chronologically ordered methodology. It is intended to increase the resolution of skill classification and to be practical for assessing complex tasks involving multiple unique subtasks, through the use of a taxonomy of generic physical, perceptual, and cognitive actions.

    Multi-Modal Trip Hazard Affordance Detection On Construction Sites

    Trip hazards are a significant contributor to accidents on construction and manufacturing sites, where over a third of Australian workplace injuries occur [1]. Current safety inspections are labour intensive and limited by human fallibility, making automation of trip hazard detection appealing from both a safety and economic perspective. Trip hazards present an interesting challenge to modern learning techniques because they are defined as much by affordance as by object type; for example, wires on a table are not a trip hazard, but they can be if lying on the ground. To address these challenges, we conduct a comprehensive investigation into the performance characteristics of 11 different colour and depth fusion approaches, including 4 fusion and one non-fusion approach, using colour and two types of depth images. Trained and tested on over 600 labelled trip hazards over 4 floors and 2000 m² in an active construction site, this approach was able to differentiate between identical objects in different physical configurations (see Figure 1). Outperforming a colour-only detector, our multi-modal trip detector fuses colour and depth information to achieve a 4% absolute improvement in F1-score. These investigative results and the extensive publicly available dataset move us one step closer to assistive or fully automated safety inspection systems on construction sites. Comment: 9 Pages, 12 Figures, 2 Tables, Accepted to Robotics and Automation Letters (RA-L)