900 research outputs found
MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation
In this work, we propose an ID-preserving talking head generation framework that advances previous methods in two respects. First, as opposed to interpolating from sparse flow, we claim that dense landmarks are crucial to achieving accurate geometry-aware flow fields. Second, inspired by face-swapping methods, we adaptively fuse the source identity during synthesis so that the network better preserves the key characteristics of the portrait image. Although the proposed model surpasses prior methods in generation fidelity on established benchmarks, personalized fine-tuning is usually still needed to make talking head generation ready for real-world use. However, this process is computationally demanding and unaffordable for standard users. To solve this, we propose a fast adaptation model based on a meta-learning approach: the learned model can be adapted into a high-quality personalized model in as little as 30 seconds. Last but not least, a spatial-temporal enhancement module is proposed to improve fine details while ensuring temporal coherency. Extensive experiments demonstrate the significant superiority of our approach over the state of the art in both one-shot and personalized settings.

Comment: CVPR 2023, project page: https://meta-portrait.github.i
Storytelling with salient stills
Thesis (M.S.)--Massachusetts Institute of Technology, Program in Media Arts & Sciences, 1996. Includes bibliographical references (p. 59-63). Michale J. Massey. M.S
Applications of Graphene Quantum Dots in Biomedical Sensors
Due to rising rates of cancer, cardiovascular disease, neurodegenerative disorders, autoimmune diseases, and a plethora of infections across the globe, it is essential to introduce strategies that can rapidly and specifically detect ultralow concentrations of relevant biomarkers, pathogens, toxins, and pharmaceuticals in biological matrices. In view of these pathophysiologies, considerable research has gone into fabricating biosensors for early diagnosis and treatment using nanomaterials such as quantum dots (QDs). These nanomaterials effectively improve sensor performance with respect to reproducibility, selectivity, and sensitivity. In particular, graphene quantum dots (GQDs), which are essentially graphene fragments of nanometer size, exhibit distinctive features, acting as attractive fluorophores and excellent electrocatalysts owing to their photostability, water solubility, biocompatibility, non-toxicity, and low cost, which make them favorable candidates for a wide range of novel biomedical applications. Herein, we review about 300 biomedical studies reported over the last five years, covering the state of the art as well as some pioneering ideas on the prominent role of GQDs, especially in the development of optical, electrochemical, and photoelectrochemical biosensors. Additionally, we outline the ideal properties of GQDs, their eclectic methods of synthesis, and the general principle behind several biosensing techniques.

DFG, 428780268, Biomimetic NanoMIP-based receptors for virus detection and removal using integrated approaches
SUR-adapter: Enhancing Text-to-Image Pre-trained Diffusion Models with Large Language Models
Diffusion models, which have emerged as popular text-to-image generation models, can produce high-quality, content-rich images guided by textual prompts. However, existing models have limited semantic understanding and commonsense reasoning when the input prompts are concise narratives, resulting in low-quality image generation. To improve their capability on narrative prompts, we propose a simple yet effective parameter-efficient fine-tuning approach called the Semantic Understanding and Reasoning adapter (SUR-adapter) for pre-trained diffusion models. To reach this goal, we first collect and annotate a new dataset, SURD, which consists of more than 57,000 semantically corrected multi-modal samples. Each sample contains a simple narrative prompt, a complex keyword-based prompt, and a high-quality image. We then align the semantic representation of narrative prompts to that of the complex prompts and transfer knowledge from large language models (LLMs) to our SUR-adapter via knowledge distillation, so that it acquires powerful semantic understanding and reasoning capabilities and builds a high-quality textual semantic representation for text-to-image generation. We conduct experiments integrating multiple LLMs and popular pre-trained diffusion models to show the effectiveness of our approach in enabling diffusion models to understand and reason over concise natural language without degrading image quality. Our approach makes text-to-image diffusion models easier to use, with a better user experience, and has the potential to further advance the development of user-friendly text-to-image generation models by bridging the semantic gap between simple narrative prompts and complex keyword-based prompts. The code is released at https://github.com/Qrange-group/SUR-adapter.

Comment: accepted by ACM MM 202
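The representation-alignment step this abstract describes can be sketched with a toy distillation loop: a small linear adapter is trained so that the "student" embedding of a simple prompt moves toward a richer "teacher" embedding. The adapter shape, the 2-D toy embeddings, and the MSE objective are all illustrative assumptions, not the released SUR-adapter code.

```python
def apply_adapter(W, v):
    # Linear adapter: out[i] = sum_j W[i][j] * v[j]
    return [sum(W[i][j] * v[j] for j in range(len(v))) for i in range(len(W))]

def distill(W, pairs, lr=0.1, steps=200):
    # Gradient descent on the MSE between the adapted student
    # embedding and the fixed teacher embedding.
    n = len(W)
    for _ in range(steps):
        for student, teacher in pairs:
            out = apply_adapter(W, student)
            for i in range(n):
                g = 2.0 * (out[i] - teacher[i]) / n
                for j in range(n):
                    W[i][j] -= lr * g * student[j]
    return W

# (student embedding, teacher embedding) pairs; here the teacher is a
# fixed linear map that the adapter should recover.
pairs = [([1.0, 0.0], [2.0, 1.0]),
         ([0.0, 1.0], [1.0, 3.0])]
W = distill([[0.0, 0.0], [0.0, 0.0]], pairs)
aligned = apply_adapter(W, [1.0, 0.0])  # close to the teacher [2.0, 1.0]
```

In the full method the teacher signals would come from an LLM over (narrative prompt, keyword prompt) pairs, and the adapter would sit inside the diffusion model's text-conditioning path; the loop above only shows the distillation mechanic in isolation.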