76 research outputs found
Computational Multimedia for Video Self Modeling
Video self modeling (VSM) is a behavioral intervention technique in which a learner models a target behavior by watching a video of himself or herself. This is the idea behind the psychological theory of self-efficacy: you can learn to perform certain tasks because you see yourself doing them, which provides the ideal form of behavior modeling. The effectiveness of VSM has been demonstrated for many different types of disabilities and behavioral problems, ranging from stuttering, inappropriate social behaviors, autism, and selective mutism to sports training. However, there is an inherent difficulty associated with the production of VSM material. Prolonged and persistent video recording is required to capture the rare, if not nonexistent, snippets that can be strung together to form novel video sequences of the target skill. To solve this problem, in this dissertation, we use computational multimedia techniques to facilitate the creation of synthetic visual content for self-modeling that can be used by a learner and his/her therapist with a minimum amount of training data. There are three major technical contributions in my research. First, I developed an Adaptive Video Re-sampling algorithm to synthesize realistic lip-synchronized video with minimal motion jitter. Second, to denoise and complete the depth map captured by structured-light sensing systems, I introduced a layer-based probabilistic model to account for various types of uncertainties in the depth measurement. Third, I developed a simple and robust bundle-adjustment-based framework for calibrating a network of multiple wide-baseline RGB and depth cameras.
MonoSLAM: Real-time single camera SLAM
Published version
Physical Adversarial Attack meets Computer Vision: A Decade Survey
Although Deep Neural Networks (DNNs) have achieved impressive results in
computer vision, their exposed vulnerability to adversarial attacks remains a
serious concern. A series of works has shown that adding elaborate
perturbations to images can cause catastrophic degradation in the
performance metrics of DNNs. This phenomenon exists not only in the digital
space but also in the physical space. Therefore, estimating the security of
these DNN-based systems is critical for safely deploying them in the real
world, especially for security-critical applications, e.g., autonomous cars,
video surveillance, and medical diagnosis. In this paper, we focus on physical
adversarial attacks and provide a comprehensive survey of over 150 existing
papers. We first clarify the concept of the physical adversarial attack and
analyze its characteristics. Then, we define the adversarial medium, essential
to perform attacks in the physical world. Next, we present the physical
adversarial attack methods in task order: classification, detection, and
re-identification, and introduce their performance in solving the trilemma:
effectiveness, stealthiness, and robustness. In the end, we discuss the current
challenges and potential future directions. Comment: 32 pages. Under Review
Design of a Multi-biometric Platform, based on physical traits and physiological measures: Face, Iris, Ear, ECG and EEG
Security and safety have been among the main concerns of both governments and private
companies in recent years, raising growing interest and investment in
the area of biometric recognition and video surveillance, especially after the tragic
events of September 2001. Outlay assessments of the U.S. government for
the years 2001-2005 estimate that the homeland security spending climbed from
100 billion of 2005. In this period,
new pattern recognition techniques have been developed and, even more
importantly, new biometric traits have been investigated and refined; besides
the well-known physical and behavioral characteristics, physiological measures
have also been studied, providing more features to enhance the ability to
discriminate between individuals. This dissertation proposes the design of a multimodal
biometric platform, FAIRY, based on the following biometric traits: ear,
face, iris, EEG and ECG signals. The thesis presents the modular architecture of the
platform, together with the results obtained for the solution
to the recognition problems related to the different biometrics and their possible
fusion. Finally, it analyzes the pattern recognition issues concerning the
area of video surveillance.
Human-Centric Machine Vision
Recently, the algorithms for processing visual information have greatly evolved, providing efficient and effective solutions to cope with the variability and complexity of real-world environments. These achievements have led to the development of Machine Vision systems that go beyond typical industrial applications, where the environments are controlled and the tasks are very specific, towards innovative solutions that address people's everyday needs. Human-Centric Machine Vision can help to solve the problems raised by the needs of our society, e.g. security and safety, health care, medical imaging, and human-machine interfaces. In such applications it is necessary to handle changing, unpredictable and complex situations, and to take into account the presence of humans.