
    Online computer vision toolkit

    Thesis (S.M.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2011. This electronic version was submitted by the student author; the certified thesis is available in the Institute Archives and Special Collections. Cataloged from the student-submitted PDF version of the thesis. Includes bibliographical references (p. 60-64).

    In this thesis, we present an online toolkit that combines a Scratch-based programming environment with computer vision libraries, exposed as blocks within the environment, and integrates them with a community platform for diffusing advances in computer vision to a general audience. We show that with these tools, non-developers are able to create and publish computer vision applications. The visual development environment includes a collection of algorithms that, although well known in the computer vision community, provide capabilities to commodity cameras that are not yet common knowledge. In support of this environment, we also present an online community that lets users share applications made in it, helping disseminate both knowledge of camera capabilities and the capabilities themselves to users who have not yet encountered them or become comfortable with their use. Initial evaluations consist of user studies that quantify the abilities the toolkit affords novice computer vision users, baselined against experienced computer vision users.

    by Kevin Chiu. S.M.
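
    The thesis's block interface is not published here; the following is a minimal Python sketch, assuming OpenCV, of how a well-known capability such as face detection might be wrapped as a single reusable block so a non-developer can use a commodity camera without touching computer vision internals. The function name detect_faces_block and its parameters are illustrative assumptions, not the toolkit's actual API.

    import cv2

    def detect_faces_block(frame):
        # Hide the computer vision details behind one call, in the spirit of a
        # Scratch-style block: take a BGR frame, return face bounding boxes.
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        cascade = cv2.CascadeClassifier(
            cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
        return cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

    # Example usage with a commodity webcam (device 0).
    cap = cv2.VideoCapture(0)
    ok, frame = cap.read()
    if ok:
        print(detect_faces_block(frame))
    cap.release()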

    Video anatomy : spatial-temporal video profile

    Indiana University-Purdue University Indianapolis (IUPUI)

    A massive number of videos are uploaded to video websites, and smooth video browsing, editing, retrieval, and summarization are in demand. Most videos employ several types of camera operations to expand the field of view, emphasize events, and express cinematic effects. To digest heterogeneous videos in video websites and databases, video clips are profiled into a 2D image scroll containing both spatial and temporal information for video preview. The video profile is visually continuous, compact, scalable, and indexed to each frame. This work analyzes camera kinematics, including zoom, translation, and rotation, and categorizes camera actions as their combinations. An automatic video summarization framework is proposed and developed. After conventional video clip segmentation and further segmentation by smooth camera operation, the global flow field under all camera actions is investigated for profiling various types of video. A new algorithm is designed to extract the major flow direction and a convergence factor using condensed images. This work then proposes a uniform scheme to segment video clips and sections, sample the video volume across the major flow, and compute the flow convergence factor, in order to obtain an intrinsic scene space less influenced by camera ego-motion. A motion blur technique is also used to render dynamic targets in the profile. The resulting video profile can be displayed in a video track to guide access to video frames, help video editing, and facilitate applications such as surveillance, visual archiving of environments, video retrieval, and online video preview.
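
    The condensed-image algorithm itself is not reproduced below; this is a minimal sketch, assuming OpenCV and NumPy, of the kind of global-flow statistics the abstract describes: a dominant flow direction and a simple convergence measure computed from dense optical flow between two frames. The function flow_statistics and its reading of the divergence sign are illustrative assumptions, not the thesis's method.

    import cv2
    import numpy as np

    def flow_statistics(prev_gray, curr_gray):
        # Dense optical flow between consecutive grayscale frames.
        flow = cv2.calcOpticalFlowFarneback(
            prev_gray, curr_gray, None,
            pyr_scale=0.5, levels=3, winsize=15,
            iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
        dx, dy = flow[..., 0], flow[..., 1]
        # Dominant translation direction (radians) from the mean flow vector.
        major_direction = np.arctan2(dy.mean(), dx.mean())
        # Mean divergence of the flow field as a crude convergence factor:
        # positive when the flow expands outward (e.g. zooming in), negative
        # when it contracts (e.g. zooming out).
        divergence = np.gradient(dx, axis=1) + np.gradient(dy, axis=0)
        return major_direction, float(divergence.mean())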