Real-time content-aware video retargeting on the Android platform for tunnel vision assistance
As mobile devices continue to rise in popularity, advances in their processing power further expand their capabilities. This, coupled with the fact that many people suffer from low vision, leaves substantial room for mobile development aimed at low vision assistance. Computer vision can assist and accommodate individuals with blind spots or tunnel vision by extracting the necessary information and presenting it to the user in a form they can perceive. Such a system would enable individuals with low vision to function with greater ease, and offering assistance on a mobile platform broadens access. The objective of this thesis is to develop a computer vision application for low vision assistance on the Android mobile platform. Specifically, the application aims to reduce the effects that tunnel vision inflicts on individuals. This is accomplished through an in-depth real-time video retargeting model that builds upon previous work and applications. Seam carving is a content-aware retargeting operator that defines 8-connected paths of pixels, or seams, whose optimality is determined by a specific energy function. Discrete removal of these seams changes the aspect ratio while preserving important regions. The video retargeting model incorporates spatial and temporal considerations to provide effective image and video retargeting, and data reduction techniques are employed to keep the model efficient. Additionally, a minimalistic multi-operator approach is constructed to diminish the disadvantages experienced by individual operators. In the event that automated techniques fail, interactive options allow for user intervention. The application and its video retargeting model are evaluated by comparison to existing standard algorithms and by their ability to extend to real-time use. Performance metrics are obtained for both PC environments and mobile device platforms for comparison.
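A minimal single-image sketch of the seam-carving step described above, assuming a gradient-magnitude energy function and NumPy (the thesis's exact energy term and its real-time and temporal extensions are not reproduced here):

```python
import numpy as np

def energy_map(gray):
    # Simple L1 gradient-magnitude energy; a common choice for seam
    # carving, though the thesis's specific energy function may differ.
    gy, gx = np.gradient(gray.astype(float))
    return np.abs(gx) + np.abs(gy)

def find_vertical_seam(energy):
    # Dynamic programming over the cumulative minimum energy of
    # 8-connected vertical paths (each row steps at most one column).
    h, w = energy.shape
    M = energy.copy()
    for i in range(1, h):
        left = np.r_[np.inf, M[i - 1, :-1]]
        up = M[i - 1]
        right = np.r_[M[i - 1, 1:], np.inf]
        M[i] += np.minimum(np.minimum(left, up), right)
    # Backtrack from the minimum of the bottom row.
    seam = np.empty(h, dtype=int)
    seam[-1] = int(np.argmin(M[-1]))
    for i in range(h - 2, -1, -1):
        j = seam[i + 1]
        lo, hi = max(j - 1, 0), min(j + 2, w)
        seam[i] = lo + int(np.argmin(M[i, lo:hi]))
    return seam

def remove_vertical_seam(img, seam):
    # Drop one pixel per row, narrowing the image by one column.
    h, w = img.shape[:2]
    mask = np.ones((h, w), dtype=bool)
    mask[np.arange(h), seam] = False
    return img[mask].reshape(h, w - 1, *img.shape[2:])
```

Repeated seam removal changes the aspect ratio one column at a time while low-energy (unimportant) regions absorb the reduction.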
Towards Data-Driven Large Scale Scientific Visualization and Exploration
Technological advances have enabled us to acquire extremely large
datasets, but it remains a challenge to store, process, and extract
information from them. This dissertation builds upon recent advances
in machine learning, visualization, and user interactions to
facilitate exploration of large-scale scientific datasets. First, we
use data-driven approaches to computationally identify regions of
interest in the datasets. Second, we use visual presentation for
effective user comprehension. Third, we provide interactions for
human users to integrate domain knowledge and semantic information
into this exploration process.
Our research shows how to extract, visualize, and explore informative
regions on very large 2D landscape images, 3D volumetric datasets,
high-dimensional volumetric mouse brain datasets with thousands of
spatially-mapped gene expression profiles, and geospatial trajectories
that evolve over time. The contributions of this dissertation include:
(1) We introduce a sliding-window saliency model that discovers
regions of user interest in very large images; (2) We develop visual
segmentation of intensity-gradient histograms to identify meaningful
components from volumetric datasets; (3) We extract boundary surfaces
from a wealth of volumetric gene expression mouse brain profiles to
personalize the reference brain atlas; (4) We show how to efficiently
cluster geospatial trajectories by mapping each sequence of locations
to a high-dimensional point with the kernel distance framework.
We aim to discover patterns, relationships, and anomalies that would
lead to new scientific, engineering, and medical advances. This work
represents one of the first steps toward better visual understanding
of large-scale scientific data by combining machine learning and human
intelligence.
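Contribution (4), mapping each variable-length trajectory to a single high-dimensional point for clustering, can be sketched with a kernel mean embedding over landmark locations; the landmark grid, Gaussian kernel width, and plain k-means below are illustrative choices, not the dissertation's exact kernel-distance construction:

```python
import numpy as np

def trajectory_embedding(traj, landmarks, sigma=1.0):
    # Map a trajectory (n x 2 array of locations) to a fixed-length
    # vector: the mean Gaussian-kernel similarity to each landmark.
    # Euclidean distance between two such embeddings approximates a
    # kernel distance between the underlying trajectories.
    d2 = ((traj[:, None, :] - landmarks[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2)).mean(axis=0)

def cluster_trajectories(trajs, landmarks, k=2, iters=20):
    X = np.stack([trajectory_embedding(t, landmarks) for t in trajs])
    # Greedy farthest-point initialization keeps the sketch deterministic.
    centers = [X[0]]
    for _ in range(k - 1):
        d = np.min([((X - c) ** 2).sum(-1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.stack(centers)
    # Plain k-means in the embedded space.
    for _ in range(iters):
        labels = ((X[:, None] - centers[None]) ** 2).sum(-1).argmin(1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(0)
    return labels
```

Because every trajectory becomes a point of the same dimension, standard vector clustering applies regardless of how many locations each trajectory contains.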
Understanding Visual Feedback in Large-Display Touchless Interactions: An Exploratory Study
Touchless interactions synthesize input and output from physically disconnected motor and display spaces without any haptic feedback. In the absence of haptic feedback, touchless interactions rely primarily on visual cues, but the properties of visual feedback remain unexplored. This paper systematically investigates how large-display touchless interactions are affected by (1) types of visual feedback—discrete, partial, and continuous; (2) alternative forms of touchless cursors; (3) approaches to visualizing target selection; and (4) persistent visual cues to support out-of-range and drag-and-drop gestures. Results suggest that continuous visual feedback was more effective than partial feedback; users disliked opaque cursors, and efficiency did not increase when cursors were larger than display artifacts' size. Semantic visual feedback located at the display border improved users' efficiency in returning to the display range; however, echoing the path of movement in drag-and-drop operations decreased efficiency. Our findings contribute key ingredients for designing suitable visual feedback for large-display touchless environments. This work was partially supported by an IUPUI Research Support Funds Grant (RSFG).
Physical Interaction Concepts for Knowledge Work Practices
The majority of workplaces in developed countries involve knowledge work. Accordingly, the IT industry and research community have made great efforts for many years to support knowledge workers -- and indeed, computer-based information workplaces have come of age. Nevertheless, knowledge work in the physical world still has quite a number of unique advantages, and the integration of physical and digital knowledge work leaves a lot to be desired. The present thesis aims at reducing these deficiencies; in doing so, it leverages recent technology trends, in particular interactive tabletops and resizable hand-held displays.
We start from the observation that knowledge workers develop highly efficient practices, skills, and dexterity in working with physical objects in the real world, whether content-unrelated (coffee mugs, stationery, etc.) or content-related (books, notepads, etc.). Among the latter, paper-based objects -- the notorious analog information bearers -- represent by far the most relevant (super-)category. We discern two kinds of practices: collective practices concern the arrangement of objects with respect to other objects and the desk, while specific practices operate on individual objects and usually alter them. The former are mainly employed for effective management of the physical desktop workspace -- e.g., everyday objects are frequently moved on tables to optimize the desk as a workplace -- or for effective organization of paper-based documents on the desktop -- e.g., stacking, fanning out, sorting, etc. The latter concern the specific manipulation of physical objects related to the task at hand, i.e., knowledge work. Widely assimilated practices include not only writing on, annotating, or spatially arranging paper documents but also sophisticated manipulations such as flipping, folding, and bending.
Compared to the wealth of such well-established practices in the real world, those for digital knowledge work are bound by the indirection imposed by mouse and keyboard input -- where the mouse provided such a great advancement that researchers were seduced into calling its use "direct manipulation". In this light, the goal of this thesis can be rephrased as exploring novel interaction concepts for knowledge workers that i) exploit the flexible and direct manipulation potential of physical objects (as present in the real world) for more intuitive and expressive interaction with digital content, and ii) improve the integration of the physical and digital knowledge workplace. Two directions of research are pursued. Firstly, the thesis investigates the collective practices executed on the desks of knowledge workers, discerning content-related objects (more precisely, paper-based documents) from content-unrelated ones -- this part is coined as table-centric approaches and leverages the technology of interactive tabletops. Secondly, the thesis looks at specific practices executed on paper, concentrating on knowledge-related tasks due to the specific role of paper -- this part is coined as paper-centric approaches and leverages the affordances of paper-like displays, more precisely of resizable, i.e., rollable and foldable, displays.
The table-centric approach leads to the challenge of blending interactive tabletop technology with the established use of physical desktop workspaces. We first conduct an exploratory user study to investigate behavioral and usage patterns of interaction with both physical and digital documents on tabletop surfaces while performing tasks such as grouping and browsing. Based on the results of the study, we contribute two sets of interaction and visualization concepts -- coined as PaperTop and ObjecTop -- that concern specific paper-based practices and collective practices, respectively. Their efficiency and effectiveness are evaluated in a series of user studies.
As mentioned, the paper-centric perspective leverages recent ultra-thin resizable display technology. We again contribute two sets of novel interaction concepts -- coined as FoldMe and Xpaaand -- that respond to the design spaces of dual-sided foldable and of rollout displays, respectively. In their design, we leverage the physical act of resizing not "just" for adjusting the screen real estate but also for interactively performing operations. Initial user studies show great potential for interaction with digital content, i.e., for knowledge work.
Learning human activities and poses with interconnected data sources
Understanding human actions and poses in images or videos is a challenging problem in computer vision. There are different topics related to this problem, such as action recognition, pose estimation, human-object interaction, and activity detection. Knowledge of actions and poses could benefit many applications, including video search, surveillance, auto-tagging, event detection, and human-computer interfaces. To understand humans' actions and poses, we need to address several challenges. First, humans can perform an enormous number of poses. For example, simply to move forward, we can crawl, walk, run, or sprint. These poses all look different, and many examples are required to cover the variations. Second, the appearance of a person's pose changes when viewed from different angles, so the learned action model needs to cover the variations across views. Third, many actions involve interactions between people and other objects, so we need to consider the appearance change corresponding to those objects as well. Fourth, collecting such data for learning is difficult and expensive. Last, even if we can learn a good model for an action, localizing when and where the action happens in a long video remains difficult due to the large search space. My key idea for alleviating these obstacles is to discover the underlying patterns that connect information from different data sources. Why should there be underlying patterns? The intuition is that all people share the same articulated physical structure. Though we can change our pose, common constraints limit what our poses can be and how they can move over time. Therefore, all types of human data follow these rules, which can serve as prior knowledge or regularization in our learning framework.
If we can exploit these tendencies, we can extract additional information from the data and use it to improve learning of humans' actions and poses. In particular, we can find patterns for how our pose varies over time, how our appearance looks from a specific view, what our pose is when we interact with objects with certain properties, and how parts of our body configuration are shared across different poses. Once learned, these patterns can be used to interconnect and extrapolate knowledge between different data sources. To this end, I propose several new ways to connect human activity data. First, I show how to connect snapshot images and videos by exploring the patterns of how our pose can change over time. Building on this idea, I explore how to connect humans' poses across multiple views by discovering the correlations between different poses and the latent factors that affect viewpoint variations. In addition, I consider whether there are also patterns connecting our poses and nearby objects when we interact with them. Furthermore, I explore how the predicted interaction can serve as a cue to better address existing recognition problems, including image retargeting and image description generation. Finally, after learning models that effectively incorporate these patterns, I propose a robust approach to efficiently localize when and where a complex action happens in a video sequence. The variants of my proposed approaches offer a good trade-off between computational cost and detection accuracy. My thesis exploits various types of underlying patterns in human data, and the discovered structure is used to enhance the understanding of humans' actions and poses.
With my proposed methods, we are able to 1) learn an action from very few snapshots by connecting them to a pool of label-free videos, 2) infer the pose for some views, even without any examples, by connecting the latent factors between different views, 3) predict the location of an object that a person is interacting with, independent of the type and appearance of that object, and then use the inferred interaction as a cue to improve recognition, and 4) localize an action in a complex long video. These approaches improve existing frameworks for understanding humans' actions and poses without extra data collection cost and broaden the range of problems that we can tackle.
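The temporal localization step, finding when an action occurs in a long video, can be illustrated with a linear-time maximum-sum interval search over per-frame detection scores; the fixed score threshold and Kadane-style search below are an illustrative simplification, not the thesis's exact method:

```python
def localize_action(frame_scores, prior=0.5):
    # Find the contiguous frame interval with the highest total
    # centered score (Kadane's maximum-subarray algorithm).
    # frame_scores are per-frame classifier confidences in [0, 1];
    # `prior` is an illustrative threshold subtracted so that weak
    # frames penalize an interval instead of extending it.
    best_sum, best = float("-inf"), (0, 0)
    cur_sum, cur_start = 0.0, 0
    for i, s in enumerate(frame_scores):
        if cur_sum <= 0:
            # Restarting here is what keeps the search linear time,
            # versus scoring all O(n^2) candidate intervals.
            cur_sum, cur_start = 0.0, i
        cur_sum += s - prior
        if cur_sum > best_sum:
            best_sum, best = cur_sum, (cur_start, i)
    return best  # (first_frame, last_frame) of the detected interval
```

A single pass over the scores thus replaces an exhaustive scan of all candidate start/end pairs, which is the kind of cost/accuracy trade-off the abstract alludes to.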