
    Crowdsourcing step-by-step information extraction to enhance existing how-to videos

    Millions of learners today use how-to videos to master new skills in a variety of domains. But browsing such videos is often tedious and inefficient because video player interfaces are not optimized for their unique step-by-step structure. This research aims to improve the learning experience of existing how-to videos with step-by-step annotations. We first performed a formative study to verify that annotations are actually useful to learners. We created ToolScape, an interactive video player that displays step descriptions and intermediate result thumbnails in the video timeline. Learners in our study performed better and gained more self-efficacy using ToolScape than with a traditional video player. To add the needed step annotations to existing how-to videos at scale, we introduce a novel crowdsourcing workflow. It extracts step-by-step structure from an existing video, including step times, descriptions, and before and after images. We introduce the Find-Verify-Expand design pattern for temporal and visual annotation, which applies clustering, text processing, and visual analysis algorithms to merge crowd output. The workflow does not rely on domain-specific customization, works on top of existing videos, and recruits untrained crowd workers. We evaluated the workflow on Mechanical Turk, using 75 cooking, makeup, and Photoshop videos from YouTube. Results show that our workflow can extract steps at a quality comparable to that of trained annotators across all three domains, with 77% precision and 81% recall.
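
    The Find-Verify-Expand pattern merges noisy, overlapping crowd annotations into a single step list. As a rough illustration only (not the authors' actual algorithm), the Python sketch below groups crowd-submitted step timestamps that fall close together and keeps clusters supported by several workers; the window size, support threshold, and function name are assumptions for the example.

        def merge_step_times(timestamps, window=5.0, min_support=2):
            """Group crowd-submitted step times (in seconds) that fall within
            `window` seconds of the previous one, and keep clusters that at
            least `min_support` workers agree on. The real workflow also merges
            text descriptions and before/after images."""
            clusters = []
            for t in sorted(timestamps):
                if clusters and t - clusters[-1][-1] <= window:
                    clusters[-1].append(t)
                else:
                    clusters.append([t])
            # Represent each well-supported cluster by its mean time.
            return [sum(c) / len(c) for c in clusters if len(c) >= min_support]

        # Example: three workers roughly agree on steps near 12s and 47s.
        print(merge_step_times([11.5, 12.0, 13.2, 30.0, 46.8, 47.5, 48.0]))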

    Understanding in-video dropouts and interaction peaks in online lecture videos

    With thousands of learners watching the same online lecture videos, analyzing video watching patterns provides a unique opportunity to understand how students learn with videos. This paper reports a large-scale analysis of in-video dropout and peaks in viewership and student activity, using second-by-second user interaction data from 862 videos in four Massive Open Online Courses (MOOCs) on edX. We find higher dropout rates in longer videos, re-watching sessions (vs. first-time viewing), and tutorials (vs. lectures). Peaks in re-watching sessions and play events indicate points of interest and confusion. Results show that tutorials (vs. lectures) and re-watching sessions (vs. first-time viewing) lead to more frequent and sharper peaks. To understand why peaks occur, we sampled 80 videos and observed that 61% of the peaks accompany visual transitions in the video, e.g., a switch from a slide view to a classroom view. Based on this observation, we identify five student activity patterns that can explain peaks: starting from the beginning of new material, returning to missed content, following a tutorial step, replaying a brief segment, and repeating a non-visual explanation. Our analysis has design implications for video authoring, editing, and interface design, providing a richer understanding of video learning in MOOCs.
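
    The analysis operates on per-second counts of viewer interaction events. As a hedged sketch of how such interaction peaks might be located (an assumed approach, not the paper's reported method), the example below smooths a per-second event-count series and flags local maxima that stand well above the mean; the smoothing window and threshold are arbitrary illustrative parameters.

        import numpy as np

        def find_interaction_peaks(event_counts, smooth_window=15, threshold_std=2.0):
            """Flag candidate interaction peaks in a per-second event-count series
            (e.g., play or re-watch events): smooth with a moving average, then keep
            local maxima more than `threshold_std` standard deviations above the mean."""
            counts = np.asarray(event_counts, dtype=float)
            kernel = np.ones(smooth_window) / smooth_window
            smoothed = np.convolve(counts, kernel, mode="same")
            cutoff = smoothed.mean() + threshold_std * smoothed.std()
            return [
                t for t in range(1, len(smoothed) - 1)
                if smoothed[t] > cutoff
                and smoothed[t] >= smoothed[t - 1]
                and smoothed[t] >= smoothed[t + 1]
            ]  # second offsets of candidate peaks within the video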

    Learnersourcing Subgoal Labels for How-to Videos

    Websites like YouTube host millions of how-to videos, but their interfaces are not optimized for learning. Previous research suggests that users learn more from how-to videos when the information from the video is presented in outline form, with individual steps and labels for groups of steps (subgoals) shown. We envision an alternative video player where the steps and subgoals are displayed alongside the video. To generate this information for existing videos, we propose a learnersourcing approach, where people actively learning from a video provide such information. To demonstrate this method, we created a workflow where learners contribute and refine subgoal labels for how-to videos. We deployed a live website with our workflow implemented on a set of introductory web programming videos. For the four videos with the highest participation, we found that a majority of learner-generated subgoals were comparable in quality to expert-generated ones. Learners commented that the system helped them grasp the material, suggesting that our workflow did not detract from the learning experience.
    Funding: Massachusetts Institute of Technology (Undergraduate Research Opportunities Program); Cisco Systems, Inc.; Quanta Computer (Qmulus Project); National Science Foundation (Award SOCS-1111124); Alfred P. Sloan Foundation (Sloan Research Fellowship); Samsung (Fellowship).
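
    In a learnersourcing workflow like this, many learners contribute candidate subgoal labels for the same stretch of video, and the system must settle on one label to display. As a minimal, hypothetical illustration of that aggregation step (the deployed workflow has learners evaluate and refine each other's labels rather than simply vote), the sketch below picks the most common learner-contributed label per segment; the segment boundaries and labels are invented for the example.

        from collections import Counter

        def pick_subgoal_labels(contributions):
            """Choose a representative subgoal label for each video segment by
            majority vote over learner-contributed labels.
            `contributions` maps a (start_sec, end_sec) segment to a list of labels."""
            return {
                segment: Counter(labels).most_common(1)[0][0]
                for segment, labels in contributions.items()
                if labels
            }

        # Example: two segments of an introductory web programming video.
        print(pick_subgoal_labels({
            (0, 120): ["set up the HTML skeleton", "create the HTML file",
                       "set up the HTML skeleton"],
            (120, 300): ["style the page with CSS", "add CSS styling"],
        }))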

    StateLens: A Reverse Engineering Solution for Making Existing Dynamic Touchscreens Accessible

    Blind people frequently encounter inaccessible dynamic touchscreens in their everyday lives that are difficult, frustrating, and often impossible to use independently. Touchscreens are often the only way to control everything from coffee machines and payment terminals to subway ticket machines and in-flight entertainment systems. Interacting with dynamic touchscreens is difficult non-visually because the visual user interfaces change, interactions often occur over multiple different screens, and it is easy to accidentally trigger interface actions while exploring the screen. To solve these problems, we introduce StateLens, a three-part reverse engineering solution that makes existing dynamic touchscreens accessible. First, StateLens reverse engineers the underlying state diagrams of existing interfaces from point-of-view videos found online or taken by users, using a hybrid crowd-computer vision pipeline. Second, using the state diagrams, StateLens automatically generates conversational agents to guide blind users through specifying the tasks that the interface can perform, allowing the StateLens iOS application to provide interactive guidance and feedback so that blind users can access the interface. Finally, a set of 3D-printed accessories enables blind people to explore capacitive touchscreens without the risk of triggering accidental touches on the interface. Our technical evaluation shows that StateLens can accurately reconstruct interfaces from stationary, hand-held, and web videos; and a user study of the complete system demonstrates that StateLens successfully enables blind users to access otherwise inaccessible dynamic touchscreens.
    Comment: ACM UIST 2019
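
    The core artifact StateLens recovers is a state diagram: screens are nodes and touchscreen actions are edges between them, which guidance can then follow toward the screen that completes a task. The Python sketch below shows one plausible way to represent such a graph and search it for the action sequence to announce to the user; the data structure, function name, and coffee machine screens are assumptions for illustration, not StateLens's actual implementation.

        from collections import deque

        def guidance_path(state_graph, start_screen, target_screen):
            """Breadth-first search over a reverse-engineered interface state diagram.
            `state_graph` maps a screen id to {action_description: next_screen_id}.
            Returns the list of actions to announce to the user, or None if the
            target screen is unreachable from the current one."""
            queue = deque([(start_screen, [])])
            visited = {start_screen}
            while queue:
                screen, actions = queue.popleft()
                if screen == target_screen:
                    return actions
                for action, next_screen in state_graph.get(screen, {}).items():
                    if next_screen not in visited:
                        visited.add(next_screen)
                        queue.append((next_screen, actions + [action]))
            return None

        # Example: a hypothetical coffee machine with three screens.
        graph = {
            "home": {"tap 'Espresso'": "size", "tap 'Latte'": "size"},
            "size": {"tap 'Large'": "brewing"},
        }
        print(guidance_path(graph, "home", "brewing"))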