4,603 research outputs found
Dynamic Deep Multi-modal Fusion for Image Privacy Prediction
With millions of images shared online on social networking sites, effective
methods for image privacy prediction are urgently needed. In this
paper, we propose an approach for fusing object, scene context, and image tags
modalities derived from convolutional neural networks for accurately predicting
the privacy of images shared online. Specifically, our approach identifies the
set of most competent modalities on the fly, according to each new target image
whose privacy has to be predicted. The approach considers three stages to
predict the privacy of a target image, wherein we first identify the
neighborhood images that are visually similar and/or have similar sensitive
content as the target image. Then, we estimate the competence of the modalities
based on the neighborhood images. Finally, we fuse the decisions of the most
competent modalities and predict the privacy label for the target image.
Experimental results show that our approach predicts the sensitive (or private)
content more accurately than the models trained on individual modalities
(object, scene, and tags) and prior privacy prediction works. Our approach also
outperforms strong baselines that train meta-classifiers to obtain an
optimal combination of modalities. Comment: Accepted by The Web Conference (WWW) 2019.
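For readers who want a concrete picture of the three-stage procedure described in the abstract, the following is a minimal, hedged sketch of neighborhood-based modality selection and decision fusion. It is not the authors' implementation: the `object`/`scene`/`tags` feature dictionaries, the cosine-similarity neighborhood, and the top-`m` fusion rule are illustrative assumptions, and the per-modality classifiers are assumed to be pre-fitted scikit-learn-style models.

```python
# Minimal sketch of dynamic modality selection for a single target image.
# Not the authors' implementation; the feature dictionaries, the reference
# modality, and the fusion rule are illustrative placeholders.
import numpy as np

def predict_privacy(target_feats, train_feats, train_labels,
                    modality_models, k=10, top_m=2):
    """Fuse the most competent modalities for one target image.

    target_feats / train_feats: dict modality -> feature vector(s)
    modality_models: dict modality -> fitted classifier with predict_proba
    """
    # 1) Neighborhood: the k training images most similar to the target
    #    (cosine similarity on one reference modality, e.g. object features).
    ref = "object"
    sims = train_feats[ref] @ target_feats[ref]
    sims /= (np.linalg.norm(train_feats[ref], axis=1)
             * np.linalg.norm(target_feats[ref]) + 1e-12)
    neighbors = np.argsort(-sims)[:k]

    # 2) Competence: accuracy of each modality's classifier on the neighborhood.
    competence = {}
    for m, clf in modality_models.items():
        preds = clf.predict(train_feats[m][neighbors])
        competence[m] = (preds == train_labels[neighbors]).mean()

    # 3) Fusion: average the posteriors of the top_m most competent modalities.
    best = sorted(competence, key=competence.get, reverse=True)[:top_m]
    probs = np.mean(
        [modality_models[m].predict_proba(target_feats[m].reshape(1, -1))[0]
         for m in best], axis=0)
    # Assumes class index 1 corresponds to the "private" label.
    return "private" if probs[1] >= 0.5 else "public"
```

Because the competent modalities are re-estimated per target image, the fused decision adapts on the fly rather than relying on one fixed combination learned offline.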
Proceedings of the 2009 Joint Workshop of Fraunhofer IOSB and Institute for Anthropomatics, Vision and Fusion Laboratory
The joint workshop of the Fraunhofer Institute of Optronics, System Technologies and Image Exploitation IOSB, Karlsruhe, and the Vision and Fusion Laboratory (Institute for Anthropomatics, Karlsruhe Institute of Technology (KIT)) has been organized annually since 2005 with the aim of reporting on the latest research and development findings of the doctoral students of both institutions. This book provides a collection of 16 technical reports on the research results presented at the 2009 workshop.
A Survey of Multimodal Information Fusion for Smart Healthcare: Mapping the Journey from Data to Wisdom
Multimodal medical data fusion has emerged as a transformative approach in
smart healthcare, enabling a comprehensive understanding of patient health and
personalized treatment plans. In this paper, a journey from data to information
to knowledge to wisdom (DIKW) is explored through multimodal fusion for smart
healthcare. We present a comprehensive review of multimodal medical data fusion
focused on the integration of various data modalities. The review explores
different approaches such as feature selection, rule-based systems, machine
learning, deep learning, and natural language processing for fusing and
analyzing multimodal data. This paper also highlights the challenges associated
with multimodal fusion in healthcare. By synthesizing the reviewed frameworks
and theories, it proposes a generic framework for multimodal medical data
fusion that aligns with the DIKW model. Moreover, it discusses future
directions related to the four pillars of healthcare: Predictive, Preventive,
Personalized, and Participatory approaches. The components of the comprehensive
survey presented in this paper form the foundation for more successful
implementation of multimodal fusion in smart healthcare. Our findings can guide
researchers and practitioners in leveraging the power of multimodal fusion with
state-of-the-art approaches to revolutionize healthcare and improve patient
outcomes. Comment: This work has been submitted to Elsevier for possible
publication. Copyright may be transferred without notice, after which this
version may no longer be accessible.
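As a point of reference for what "fusing multimodal data" can mean at the simplest level, here is a hedged, self-contained sketch of feature-level (early) fusion of two hypothetical modalities, imaging embeddings and clinical variables. The data are synthetic and the setup is purely illustrative, not the framework proposed in the survey.

```python
# Illustrative sketch of feature-level (early) fusion of two medical
# modalities; the data, shapes, and labels are synthetic placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 200
imaging_feats = rng.normal(size=(n, 32))   # e.g. embeddings from a scan
clinical_feats = rng.normal(size=(n, 8))   # e.g. labs and vitals
labels = rng.integers(0, 2, size=n)        # e.g. condition present / absent

# Early fusion: concatenate modality features, then train one classifier.
fused = np.concatenate([imaging_feats, clinical_feats], axis=1)
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(fused, labels)
print("training accuracy:", model.score(fused, labels))
```

Late fusion (training one model per modality and combining their decisions) and the rule-based or deep-learning approaches discussed in the survey build on the same basic idea of bringing heterogeneous inputs into one decision.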
Deep Learning for Face Anti-Spoofing: A Survey
Face anti-spoofing (FAS) has lately attracted increasing attention due to its
vital role in securing face recognition systems from presentation attacks
(PAs). As increasingly realistic PAs of novel types emerge, traditional
FAS methods based on handcrafted features become unreliable due to their
limited representation capacity. With the emergence of large-scale academic
datasets in the recent decade, deep learning based FAS achieves remarkable
performance and dominates this area. However, existing reviews in this field
focus mainly on handcrafted features, which are outdated and uninspiring
for the progress of the FAS community. In this paper, to stimulate future research,
we present the first comprehensive review of recent advances in deep learning
based FAS. It covers several novel and insightful components: 1) besides
supervision with a binary label (e.g., '0' for bonafide vs. '1' for PAs), we also
investigate recent methods with pixel-wise supervision (e.g., pseudo depth
map); 2) in addition to traditional intra-dataset evaluation, we collect and
analyze the latest methods specially designed for domain generalization and
open-set FAS; and 3) besides commercial RGB cameras, we summarize deep
learning applications under multi-modal (e.g., depth and infrared) or
specialized (e.g., light field and flash) sensors. We conclude this survey by
emphasizing current open issues and highlighting potential prospects. Comment: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI).
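To make the contrast between binary and pixel-wise supervision in point 1) concrete, the sketch below trains a tiny two-headed network with both a global binary logit and a pseudo-depth map output. The backbone, the 32x32 depth resolution, and the unweighted loss sum are illustrative assumptions, not a specific published FAS architecture.

```python
# Hedged sketch of combined binary and pixel-wise (pseudo-depth) supervision
# for face anti-spoofing. The tiny backbone and resolutions are illustrative.
import torch
import torch.nn as nn

class TinyFASNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Pixel-wise head: predicts a pseudo-depth map (bonafide faces get
        # face-like depth targets, print/replay attacks are pushed to zeros).
        self.depth_head = nn.Conv2d(64, 1, 1)
        # Binary head: bonafide ('0') vs presentation attack ('1').
        self.cls_head = nn.Linear(64, 1)

    def forward(self, x):
        f = self.features(x)                       # (B, 64, 32, 32) for 256x256 input
        depth = self.depth_head(f)                 # pixel-wise prediction
        logit = self.cls_head(f.mean(dim=(2, 3)))  # globally pooled binary logit
        return depth, logit

model = TinyFASNet()
images = torch.randn(4, 3, 256, 256)
pseudo_depth = torch.rand(4, 1, 32, 32)            # target maps (zeros for PAs)
is_attack = torch.tensor([[0.], [1.], [0.], [1.]])

depth_pred, logit = model(images)
loss = nn.functional.mse_loss(depth_pred, pseudo_depth) \
     + nn.functional.binary_cross_entropy_with_logits(logit, is_attack)
loss.backward()
```

The pixel-wise term gives the network a spatially dense training signal, which is one reason pseudo-depth supervision is reported to generalize better than a single binary label.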
First impressions: A survey on vision-based apparent personality trait analysis
Personality analysis has been widely studied in psychology, neuropsychology, and signal processing, among other fields. Over the past few years, it has also become an attractive research area in visual computing. From the computational point of view, speech and text have by far been the most considered cues of information for analyzing personality. Recently, however, there has been an increasing interest from the computer vision community in analyzing personality from visual data. Recent computer vision approaches are able to accurately analyze human faces, body postures, and behaviors, and use this information to infer apparent personality traits. Because of the overwhelming research interest in this topic, and of the potential impact that these methods could have on society, we present in this paper an up-to-date review of existing vision-based approaches for apparent personality trait recognition. We describe seminal and cutting-edge works on the subject, discussing and comparing their distinctive features and limitations. Future avenues of research in the field are identified and discussed. Furthermore, aspects of subjectivity in data labeling/evaluation, as well as current datasets and challenges organized to push research in the field, are reviewed. Peer reviewed. Postprint (author's final draft).
- …