Informal, desktop, audio-video communication
Audio-video systems have been developed to support many aspects and
modes of human communication, but there has been little support for the informal,
ongoing communication that occurs so often in real life. Most existing systems
implement a call metaphor. This presents a barrier to initiating conversation, which in
turn makes the resulting conversation more formal. By contrast, with
informal communication the channel is never explicitly opened or closed. This paper
examines the range of previous systems and seeks to build on them to develop plans for
supporting informal communication in a desktop environment.
FaceVR: Real-Time Facial Reenactment and Eye Gaze Control in Virtual Reality
We introduce FaceVR, a novel method for gaze-aware facial reenactment in the Virtual Reality (VR) context. The key component of FaceVR is a robust algorithm to perform real-time facial motion capture of an actor who is wearing a head-mounted display (HMD), as well as a new data-driven approach for eye tracking from monocular videos. In addition to these face reconstruction components, FaceVR incorporates photo-realistic re-rendering in real time, thus allowing artificial modifications of face and eye appearances. For instance, we can alter facial expressions, change gaze directions, or remove the VR goggles in realistic re-renderings. In a live setup with a source and a target actor, we apply these newly introduced algorithmic components. We assume that the source actor is wearing a VR device, and we capture his facial expressions and eye movement in real time. For the target video, we mimic a similar tracking process; however, we use the source input to drive the animations of the target video, thus enabling gaze-aware facial reenactment. To render the modified target video on a stereo display, we augment our capture and reconstruction process with stereo data. In the end, FaceVR produces compelling results for a variety of applications, such as gaze-aware facial reenactment, reenactment in virtual reality, removal of VR goggles, and re-targeting of somebody's gaze direction in a video conferencing call.
Mutual Gaze Support in Videoconferencing Reviewed
Videoconferencing allows geographically dispersed parties to communicate by simultaneous audio and video transmissions. It is used in a variety of application scenarios with a wide range of coordination needs and efforts, such as private chat, discussion meetings, and negotiation tasks. In particular, in scenarios requiring certain levels of trust and judgement, non-verbal communication cues are highly important for effective communication. Mutual gaze support plays a central role in those high-coordination-need scenarios but generally lacks adequate technical support from videoconferencing systems. In this paper, we review technical concepts and implementations for mutual gaze support in videoconferencing, classify them, evaluate them according to a defined set of criteria, and give recommendations for future developments. Our review gives decision makers, researchers, and developers a tool to systematically apply and further develop videoconferencing systems in serious settings requiring mutual gaze. This should lead to well-informed decisions regarding the use and development of this technology and to a more widespread exploitation of the benefits of videoconferencing in general. For example, if videoconferencing systems supported high-quality mutual gaze in an easy-to-set-up and easy-to-use way, we could hold more effective and efficient recruitment interviews, court hearings, or contract negotiations.
Pointing Gesture Recognition Using Stereo Vision for Video Conferencing
Tohoku University, 青木孝
Money & Trust in Digital Society, Bitcoin and Stablecoins in ML enabled Metaverse Telecollaboration
We present a state-of-the-art and positioning book about digital society
tools, namely Web3, Bitcoin, metaverse, AI/ML, accessibility, safeguarding, and
telecollaboration. A high-level overview of Web3 technologies leads to a
description of blockchain, and the Bitcoin network is specifically selected for
detailed examination. Suitable components of the extended Bitcoin ecosystem are
described in more depth. Other mechanisms for native digital value transfer are
described, with a focus on 'money'. Metaverse technology is overviewed,
primarily from the perspective of Bitcoin and extended reality. Bitcoin is
selected as the best contender for value transfer in metaverses because of its
free and open-source nature and its network effect. Challenges and risks of this
approach are identified. A cloud-deployable, virtual-machine-based technology
stack deployment guide with a focus on cybersecurity best practice can be
downloaded from GitHub to experiment with the technologies. This deployable lab
is designed to inform the development of secure value transactions for small and
medium-sized companies.
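Ein Beitrag zur Entwicklung von Methoden zur Stereoanalyse und Bildsynthese im Anwendungskontext der Videokommunikation [A Contribution to the Development of Methods for Stereo Analysis and View Synthesis in the Application Context of Video Communication]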
This thesis contributes to the research area of stereo vision and view
synthesis in the field of private video communication. During private video
communication eye contact between the participants is typically lost due to
the different placement of the camera and the video window. The goal of
this thesis is to re-establish the eye contact by synthesizing the view
of a virtual camera such that the virtual camera faces towards the
participant.
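
As a rough illustration of why camera placement breaks eye contact, the angular offset between looking at the on-screen face and looking into the camera can be estimated from simple geometry. The sketch below is not taken from the thesis: the function name `gaze_offset_deg` and its parameters are hypothetical, and it assumes the camera sits a known distance away from the point on the screen the participant looks at, at a known viewing distance.

```python
import math


def gaze_offset_deg(camera_offset_m, viewing_distance_m):
    """Angle in degrees between the line of sight to the on-screen face
    and the line of sight to the camera (hypothetical helper)."""
    # Right triangle: the opposite side is the camera offset on the screen
    # plane, the adjacent side is the distance from the eyes to the screen.
    return math.degrees(math.atan2(camera_offset_m, viewing_distance_m))
```

For example, a camera mounted 10 cm above the displayed face at a viewing distance of 60 cm yields an offset of roughly 9.5 degrees, which is the deviation a virtual camera facing the participant would have to remove.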
The thesis first outlines the positive effect of eye contact in video
communication. An in-depth review of mathematical foundations in the fields
of stereo vision and view synthesis follows. On this foundation the thesis
comprehensively covers the state of the art of image based rendering and
particularly of eye-gaze correction via 3D analysis and synthesis.
In the first step of the method development the thesis establishes a model of
quality factors which determines decisions about camera placement and
recording system. Measurements with respect to synchronization and data
storage are presented. Local and global algorithms for stereo vision are
analyzed and adapted. The thesis contributes to the field of stereo vision
algorithms by means of development and combination of different cost
functions, consistency based inpainting, spatial and temporal smoothing and
segmentation with respect to the use case of private video communication.
Using the extracted disparity map, two approaches for view synthesis -
trifocal transfer and 3D warping - are employed and extended. One important
contribution of the thesis is a contour-based inpainting algorithm as well
as point-based image smoothing techniques.
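
To make the idea of a local stereo algorithm with a cost function and a consistency check more concrete, here is a minimal sketch on a single scanline. It is not the method developed in the thesis: it assumes rectified views, uses a plain sum-of-absolute-differences (SAD) cost with winner-take-all selection, and marks pixels that fail a left-right check as `None` without inpainting them. All function names and parameters are hypothetical.

```python
def wta_disparity(ref, other, max_disp, direction, radius=1):
    """Winner-take-all disparity per pixel of the scanline `ref`.

    direction=-1: `ref` is the left view, so the match lies at x - d in `other`.
    direction=+1: `ref` is the right view, so the match lies at x + d in `other`.
    """
    def clamp(i, n):
        # Repeat the border pixel when the window leaves the scanline.
        return min(max(i, 0), n - 1)

    def sad(x, d):
        # Sum of absolute differences over a (2 * radius + 1) pixel window.
        return sum(
            abs(ref[clamp(x + k, len(ref))]
                - other[clamp(x + k + direction * d, len(other))])
            for k in range(-radius, radius + 1)
        )

    return [
        min(range(max_disp + 1), key=lambda d: sad(x, d))
        for x in range(len(ref))
    ]


def cross_check(disp_left, disp_right, tol=0):
    """Keep a left disparity only if the right view agrees; otherwise None."""
    checked = []
    for x, d in enumerate(disp_left):
        xr = x - d  # position of the matched pixel in the right view
        ok = 0 <= xr < len(disp_right) and abs(disp_right[xr] - d) <= tol
        checked.append(d if ok else None)
    return checked
```

A typical call computes `wta_disparity(left, right, max_disp, -1)` and `wta_disparity(right, left, max_disp, +1)` and passes both results to `cross_check`. The `None` entries are the pixels (usually occlusions or border regions) that a consistency-based inpainting step would then have to fill.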
Two comprehensive subjective studies confirm the assumption that eye contact
can be re-established by the proposed system. They demonstrate that eye contact
is perceived well and that acceptance and subjective quality perception improve
significantly with the developed methods compared to the initial situation. The
thesis closes with a discussion of the results, a qualitative comparison
to the state of the art, and an outlook on the future of the research area.