Direct interaction with large displays through monocular computer vision
Large displays are everywhere, and have been shown to provide higher productivity gains and user satisfaction than traditional desktop monitors. The computer mouse remains the most common input tool for interacting with these larger displays. Much effort has gone into making this interaction more natural and intuitive for the user. The use of computer vision for this purpose has been well researched, as it gives users freedom and mobility and allows them to interact at a distance. Interaction that relies on monocular computer vision, however, has not been well researched, particularly when used for depth information recovery. This thesis investigates the feasibility of using monocular computer vision to allow bare-hand interaction with large display systems from a distance. By taking into account the location of the user and the interaction area available, a dynamic virtual touchscreen can be estimated between the display and the user. In the process, theories and techniques that make interaction with a computer display as easy as pointing to real-world objects are explored. Studies were conducted to investigate the way humans naturally point at objects with their hands and to examine the inadequacies of existing pointing systems. Models that underpin the pointing strategy used in many previous interactive systems were formalized. A proof-of-concept prototype was built and evaluated in various user studies. Results from this thesis suggest that it is possible to allow natural user interaction with large displays using low-cost monocular computer vision. Furthermore, the models developed and lessons learnt in this research can help designers develop more accurate and natural interactive systems that make use of humans' natural pointing behaviours.
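As a rough illustration of the kind of geometry a pointing model involves, the sketch below intersects a pointing ray with a flat display. It assumes an eye-to-fingertip ray and a display described by a point and a normal. These choices, the coordinate frame, and the function name `point_on_display` are assumptions for illustration only, not the models formalized in the thesis.

```python
def point_on_display(eye, fingertip, plane_point, plane_normal):
    """Intersect the ray from eye through fingertip with a display plane.

    All arguments are (x, y, z) tuples in the same (hypothetical) world frame.
    Returns the 3D hit point, or None if the ray is parallel to the plane
    or the plane lies behind the user.
    """
    direction = tuple(f - e for f, e in zip(fingertip, eye))
    denom = sum(d * n for d, n in zip(direction, plane_normal))
    if abs(denom) < 1e-9:
        return None  # ray runs parallel to the display
    to_plane = tuple(p - e for p, e in zip(plane_point, eye))
    t = sum(v * n for v, n in zip(to_plane, plane_normal)) / denom
    if t <= 0:
        return None  # display is behind the pointing direction
    return tuple(e + t * d for e, d in zip(eye, direction))


# Display in the plane z = 0, user standing 2 m away.
hit = point_on_display(
    eye=(0.0, 0.0, 2.0),
    fingertip=(0.1, 0.0, 1.5),
    plane_point=(0.0, 0.0, 0.0),
    plane_normal=(0.0, 0.0, 1.0),
)
```

In this sketch, a small lateral offset of the fingertip is magnified on the display in proportion to the user's distance, which is one reason accurate depth recovery matters for interaction from afar.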
Supporting Multi-User Interaction in Co-Located and Remote Augmented Reality by Improving Reference Performance and Decreasing Physical Interference
One of the most fundamental components of our daily lives is social interaction, ranging from simple activities, such as purchasing a donut in a bakery on the way to work, to complex ones, such as instructing a remote colleague how to repair a broken automobile. While we interact with others, various challenges may arise, such as miscommunication or physical interference. In a bakery, a clerk may misunderstand the donut at which a customer was pointing due to the uncertainty of their finger direction. In a repair task, a technician may remove the wrong bolt and accidentally hit another user while replacing broken parts due to unclear instructions and lack of attention while communicating with a remote advisor.
This dissertation explores techniques for supporting multi-user 3D interaction in augmented reality in a way that addresses these challenges. Augmented Reality (AR) refers to interactively overlaying geometrically registered virtual media on the real world. In particular, we address how an AR system can use overlaid graphics to assist users in referencing local objects accurately and remote objects efficiently, and prevent co-located users from physically interfering with each other. My thesis is that our techniques can provide more accurate referencing for co-located and efficient referencing for remote users and lessen interference among users.
First, we present and evaluate an AR referencing technique for shared environments that is designed to improve the accuracy with which one user (the indicator) can point out a real physical object to another user (the recipient). Our technique is intended for use in otherwise unmodeled environments in which objects in the environment, and the hand of the indicator, are interactively observed by a depth camera, and both users wear tracked see-through displays. This technique allows the indicator to bring a copy of a portion of the physical environment closer and indicate a selection in the copy. At the same time, the recipient gets to see the indicator's live interaction represented virtually in another copy that is brought closer to the recipient, and is also shown the mapping between their copy and the actual portion of the physical environment. A formal user study confirms that our technique performs significantly more accurately than comparison techniques in situations in which the participating users have sufficiently different views of the scene.
Second, we extend the idea of using a copy (virtual replica) of a physical object to help a remote expert assist a local user in performing a task in the local user's environment. We develop an approach that uses Virtual Reality (VR) or AR for the remote expert, and AR for the local user. It allows the expert to create and manipulate virtual replicas of physical objects in the local environment to refer to parts of those physical objects and to indicate actions on them. The expert demonstrates actions in 3D by manipulating virtual replicas, supported by constraints and annotations. We performed a user study of a 6DOF alignment task, a key operation in many physical task domains. We compared our approach with another 3D approach that also uses virtual replicas, in which the remote expert identifies corresponding pairs of points to align on a pair of objects, and with a 2D approach in which the expert uses a tablet-based drawing system similar to sketching systems developed in prior work by others on remote assistance. The study shows the 3D demonstration approach to be faster than the others.
Third, we present an interference avoidance technique (Redirected Motion) intended to lessen the chance of physical interference among users with tracked hand-held displays, while minimizing their awareness that the technique is being applied. This interaction technique warps virtual space by shifting the virtual location of a user's hand-held display. We conducted a formal user study to evaluate Redirected Motion against other approaches that either modify what a user sees or hears, or restrict the interaction capabilities users have. Our study was performed using a game we developed, in which two players moved their hand-held displays rapidly in the space around a shared gameboard. Our analysis showed that Redirected Motion effectively and imperceptibly kept players further apart physically than the other techniques.
These interaction techniques were implemented using an extensible programming framework we developed for supporting a broad range of multi-user immersive AR applications. This framework, Goblin XNA, integrates a 3D scene graph with support for 6DOF tracking, rigid body physics simulation, networking, shaders, particle systems, and 2D user interface primitives.
In summary, we showed that our referencing approaches can enhance multi-user AR by improving accuracy for co-located users and increasing efficiency for remote users. In addition, we demonstrated that our interference-avoidance approach can lessen the chance of unwanted physical interference between co-located users, without their being aware of its use.
Enhanced Virtuality: Increasing the Usability and Productivity of Virtual Environments
With steadily increasing display resolution, more accurate tracking, and falling prices, Virtual Reality (VR) systems are on the verge of establishing themselves successfully in the market. Various tools help developers create complex multi-user interactions within adaptive virtual environments. However, the spread of VR systems also brings additional challenges: diverse input devices with unfamiliar shapes and button layouts prevent intuitive interaction. Moreover, the limited functionality of existing software forces users to fall back on conventional PC- or touch-based systems. In addition, collaborating with other users at the same location poses challenges regarding the calibration of different tracking systems and collision avoidance. In remote collaboration, interaction is further affected by latency and connection losses. Finally, users have different requirements for the visualization of content within virtual worlds, e.g., size, orientation, color, or contrast. A strict replication of real environments in VR wastes potential and will not make it possible to accommodate users' individual needs.
To address these problems, this thesis presents solutions in the areas of input, collaboration, and augmentation of virtual worlds and users, which aim to increase the usability and productivity of VR. First, PC-based hardware and software are transferred into the virtual world to preserve the familiarity and functionality of existing applications in VR. Virtual proxies of physical devices, e.g., keyboard and tablet, and a VR mode for applications enable users to transfer real-world skills into the virtual world. Furthermore, an algorithm is presented that enables the calibration of multiple co-located VR devices with high accuracy, low hardware requirements, and little effort. Since VR headsets block out the user's real environment, the relevance of a full-body avatar visualization for collision avoidance and remote collaboration is demonstrated. In addition, personalized spatial or temporal modifications are presented that make it possible to increase users' usability, work performance, and social presence. Discrepancies between the virtual worlds that arise from personal adaptations are compensated for by avatar redirection methods. Finally, some of the methods and findings are integrated into an exemplary application to illustrate their practical applicability.
This thesis shows that virtual environments can build on real-world skills and experiences to ensure familiar and easy interaction and collaboration among users. Furthermore, individual augmentations of virtual content and avatars make it possible to overcome limitations of the real world and to enhance the experience of VR environments.
Low-Cost Sensors and Biological Signals
Many sensors are currently available at prices lower than USD 100 and cover a wide range of biological signals: motion, muscle activity, heart rate, etc. Such low-cost sensors have metrological features that allow them to be used in everyday life and in clinical applications, where gold-standard equipment is both too expensive and too time-consuming to use. The selected papers present current applications of low-cost sensors in domains such as physiotherapy, rehabilitation, and affective technologies. The results cover various aspects of low-cost sensor technology, from hardware design to software optimization.
Learning Algorithm Design for Human-Robot Skill Transfer
In this research, we develop an intelligent learning scheme for performing human-robot skill transfer. Techniques adopted in the scheme include the Dynamic Movement Primitive (DMP) method with Dynamic Time Warping (DTW), the Gaussian Mixture Model (GMM) with Gaussian Mixture Regression (GMR), and Radial Basis Function Neural Networks (RBFNNs). A series of experiments is conducted on a Baxter robot, a NAO robot, and a KUKA iiwa robot to verify the effectiveness of the proposed design. During the design of the intelligent learning scheme, an online tracking system is developed to control the arm and head movement of the NAO robot using a Kinect sensor. The NAO robot is a humanoid robot with 5 degrees of freedom (DOF) for each arm. The joint motions of the operator's head and arm are captured by a Kinect V2 sensor, and this information is then transferred into the workspace via forward and inverse kinematics. In addition, to improve the tracking performance, a Kalman filter is employed to fuse motion signals from the operator sensed by the Kinect V2 sensor and a pair of MYO armbands, so as to teleoperate the Baxter robot. In this regard, a new strategy is developed using the vector approach to accomplish a specific motion capture task. For instance, the arm motion of the operator is captured by a Kinect sensor and programmed through processing software. Two MYO armbands with embedded inertial measurement units are worn by the operator to aid the robots in detecting and replicating the operator's arm movements. For this purpose, the armbands help to recognize and calculate the precise velocity of motion of the operator's arm. Additionally, a neural-network-based adaptive controller is designed and implemented on the Baxter robot to validate the teleoperation of the Baxter robot. Subsequently, an enhanced teaching interface is developed for the robot using DMP and GMR.
Motion signals are collected from a human demonstrator via the Kinect V2 sensor, and the data is sent to a remote PC for teleoperating the Baxter robot. At this stage, the DMP is used to model and generalize the movements. In order to learn from multiple demonstrations, DTW is used to preprocess the data recorded on the robot platform, and GMM is employed for the evaluation of DMP to generate multiple patterns after the completion of the teaching process. Next, we apply the GMR algorithm to generate a synthesized trajectory that minimizes position errors in three-dimensional (3D) space. This approach has been tested by performing tasks on a KUKA iiwa and a Baxter robot, respectively. Finally, an optimized DMP is added to the teaching interface. A character recombination technology based on DMP segmentation that uses verbal commands has also been developed and incorporated into a Baxter robot platform. To imitate the recorded motion signals produced by the demonstrator, the operator trains the Baxter robot by physically guiding it to complete the given task. This is repeated five times, and the generated training data set is used via the playback system. Subsequently, DTW is employed to preprocess the experimental data. DMP is chosen for modelling and overall movement control. The GMM is used to generate multiple patterns after the teaching process. Next, we employ the GMR algorithm to reduce position errors in 3D space after a synthesized trajectory has been generated. The Baxter robot, remotely controlled from a PC via the User Datagram Protocol (UDP), records and reproduces every trajectory. Additionally, Dragon NaturallySpeaking software is adopted to transcribe the voice data. The proposed approach has been verified by enabling the Baxter robot to perform a writing task even though the robot has been taught to write only one character.
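The role DTW plays when preprocessing multiple demonstrations can be sketched as follows. This is a minimal textbook version of dynamic time warping for two 1-D sequences, not the implementation used in the thesis. The function name `dtw_align` and the sample trajectories are hypothetical.

```python
import math


def dtw_align(a, b):
    """Dynamic time warping between two non-empty 1-D sequences.

    Returns (total_cost, path), where path is a list of index pairs (i, j)
    matching a[i] to b[j] along the minimum-cost alignment.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])

    # Backtrack from the end to recover which samples were matched.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


# Two demonstrations of the same motion, one performed more slowly.
demo_fast = [1.0, 2.0, 3.0]
demo_slow = [1.0, 2.0, 2.0, 3.0]
distance, path = dtw_align(demo_fast, demo_slow)
```

The recovered path shows which samples of one demonstration correspond to which samples of the other, so demonstrations of different durations can be brought onto a common time base before a model such as a GMM is fitted.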