An Overview of Deep Semi-Supervised Learning
Deep neural networks have demonstrated their ability to provide remarkable
performance on a wide range of supervised learning tasks (e.g., image
classification) when trained on extensive collections of labeled data (e.g.,
ImageNet). However, creating such large datasets requires a considerable amount
of resources, time, and effort. Such resources may not be available in many
practical cases, limiting the adoption and the application of many deep
learning methods. In a search for more data-efficient deep learning methods to
overcome the need for large annotated datasets, there is a rising research
interest in semi-supervised learning and its applications to deep neural
networks to reduce the amount of labeled data required, by either developing
novel methods or adopting existing semi-supervised learning frameworks for a
deep learning setting. In this paper, we provide a comprehensive overview of
deep semi-supervised learning, starting with an introduction to the field,
followed by a summarization of the dominant semi-supervised approaches in deep
learning.
Comment: Preprint
Conditional Invertible Generative Models for Supervised Problems
Invertible neural networks (INNs), in the setting of normalizing flows, are a type of unconditional generative likelihood model. Despite various attractive properties compared to other common generative model types, they are rarely useful for supervised tasks or real applications due to their unguided outputs. In this work, we therefore present three new methods that extend the standard INN setting, falling under a broader category we term generative invertible models. These new methods allow leveraging the theoretical and practical benefits of INNs to solve supervised problems in new ways, including real-world applications from different branches of science. The key finding is that our approaches enhance many aspects of trustworthiness in comparison to conventional feed-forward networks, such as uncertainty estimation and quantification, explainability, and proper handling of outlier data.
Understanding Text-driven Motion Synthesis with Keyframe Collaboration via Diffusion Models
The emergence of text-driven motion synthesis techniques gives animators
great potential to create efficiently. However, in most cases, textual
expressions contain only general, qualitative motion descriptions and lack
fine depiction and sufficient intensity, so the synthesized motions either
(a) are semantically compliant but uncontrollable over specific pose details,
or (b) even deviate from the provided descriptions, leaving animators with
undesired results. In this paper, we propose DiffKFC, a conditional diffusion
model for text-driven motion synthesis with keyframe collaboration. Unlike
plain text-driven designs, DiffKFC conducts full interaction among texts,
keyframes, and the remaining diffused frames during training, enabling
realistic generation under efficient, collaborative dual-level control:
coarse guidance at the semantic level, plus only a few keyframes for direct,
fine-grained depiction down to the body-posture level, satisfying animator
requirements without tedious labor. Specifically, we customize efficient
Dilated Mask Attention modules, in which only a subset of valid tokens,
indicated by the dilated keyframe mask, participates in local-to-global
attention. For user flexibility, DiffKFC supports adjusting the importance of
fine-grained keyframe control. Experimental results show that our model
achieves state-of-the-art performance on the text-to-motion datasets
HumanML3D and KIT.