Efficient Region-Aware Neural Radiance Fields for High-Fidelity Talking Portrait Synthesis
This paper presents ER-NeRF, a novel conditional Neural Radiance Fields
(NeRF) based architecture for talking portrait synthesis that can concurrently
achieve fast convergence, real-time rendering, and state-of-the-art performance
with small model size. Our idea is to explicitly exploit the unequal
contribution of spatial regions to guide talking portrait modeling.
Specifically, to improve the accuracy of dynamic head reconstruction, a compact
and expressive NeRF-based Tri-Plane Hash Representation is introduced by
pruning empty spatial regions with three planar hash encoders. For speech
audio, we propose a Region Attention Module to generate region-aware condition
feature via an attention mechanism. Different from existing methods that
utilize an MLP-based encoder to learn the cross-modal relation implicitly, the
attention mechanism builds an explicit connection between audio features and
spatial regions to capture the priors of local motions. Moreover, a direct and
fast Adaptive Pose Encoding is introduced to optimize the head-torso separation
problem by mapping the complex transformation of the head pose into spatial
coordinates. Extensive experiments demonstrate that our method renders
higher-fidelity and better audio-lip-synchronized talking portrait videos, with
realistic details and high efficiency, compared to previous methods.
Comment: Accepted by ICCV 202
Fast Updating Truncated SVD for Representation Learning with Sparse Matrices
Updating a truncated Singular Value Decomposition (SVD) is crucial in
representation learning, especially when dealing with large-scale data matrices
that continuously evolve in practical scenarios. Aligning SVD-based models with
fast-paced updates becomes increasingly important. Existing methods for
updating truncated SVDs employ Rayleigh-Ritz projection procedures, where
projection matrices are augmented based on original singular vectors. However,
these methods suffer from inefficiency due to the densification of the update
matrix and the application of the projection to all singular vectors. To
address these limitations, we introduce a novel method for dynamically
approximating the truncated SVD of a sparse and temporally evolving matrix. Our
approach leverages sparsity in the orthogonalization process of augmented
matrices and utilizes an extended decomposition to independently store
projections in the column space of singular vectors. Numerical experiments
demonstrate an order-of-magnitude efficiency improvement compared to previous
methods, while maintaining precision comparable to that of existing
approaches.
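For orientation, the kind of Rayleigh-Ritz update that the abstract attributes to existing methods can be sketched as follows. This is a minimal dense NumPy illustration of the classical projection-based update for appended columns, not the sparse method proposed in the paper; the function name `update_truncated_svd_add_columns` and its argument layout are assumptions made for this example.

```python
import numpy as np

def update_truncated_svd_add_columns(U, s, Vt, E, k):
    """Update a rank-k SVD  A ~= U @ diag(s) @ Vt  after appending columns E.

    Dense baseline sketch: the projection basis is the old left singular
    vectors U augmented with an orthonormal basis Q of the part of E that
    lies outside span(U).
    """
    k0 = s.shape[0]
    n = Vt.shape[1]
    c = E.shape[1]

    UtE = U.T @ E                      # component of E inside span(U)
    residual = E - U @ UtE             # component of E orthogonal to span(U)
    Q, R = np.linalg.qr(residual)      # reduced QR; assumes residual has full column rank

    # Small core matrix K with [A, E] ~= [U, Q] @ K @ blockdiag(Vt, I)
    K = np.block([
        [np.diag(s), UtE],
        [np.zeros((R.shape[0], k0)), R],
    ])
    F, theta, Gt = np.linalg.svd(K, full_matrices=False)
    F, theta, Gt = F[:, :k], theta[:k], Gt[:k, :]

    # Rotate the augmented bases by the singular vectors of K
    U_new = np.hstack([U, Q]) @ F
    V_aug = np.block([
        [Vt.T, np.zeros((n, c))],
        [np.zeros((c, k0)), np.eye(c)],
    ])
    Vt_new = Gt @ V_aug.T
    return U_new, theta, Vt_new
```

Note that `residual` is generally dense even when `E` is sparse, and the final products touch every stored singular vector; these are the two costs the abstract identifies as the source of inefficiency.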
Robust Synthetic-to-Real Transfer for Stereo Matching
With advancements in domain generalized stereo matching networks, models
pre-trained on synthetic data demonstrate strong robustness to unseen domains.
However, few studies have investigated the robustness after fine-tuning them in
real-world scenarios, during which the domain generalization ability can be
seriously degraded. In this paper, we explore fine-tuning stereo matching
networks without compromising their robustness to unseen domains. Our
motivation stems from comparing Ground Truth (GT) versus Pseudo Label (PL) for
fine-tuning: GT degrades, but PL preserves the domain generalization ability.
Empirically, we find the difference between GT and PL implies valuable
information that can regularize networks during fine-tuning. We also propose a
framework to utilize this difference for fine-tuning, consisting of a frozen
Teacher, an exponential moving average (EMA) Teacher, and a Student network.
The core idea is to utilize the EMA Teacher to measure what the Student has
learned and dynamically improve GT and PL for fine-tuning. We integrate our
framework with state-of-the-art networks and evaluate its effectiveness on
several real-world datasets. Extensive experiments show that our method
effectively preserves the domain generalization ability during fine-tuning.
Comment: Accepted at CVPR 202
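The EMA Teacher mentioned in the abstract is, in the usual formulation, a copy of the Student whose weights follow an exponential moving average of the Student's weights. The sketch below shows only that generic update rule; the function name `ema_update`, the dictionary-of-arrays parameter layout, and the default momentum are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def ema_update(teacher, student, momentum=0.999):
    """Move each teacher parameter a small step toward the student's value.

    teacher, student: dicts mapping parameter names to NumPy arrays of
    matching shapes (a hypothetical layout chosen for this sketch).
    The teacher dict is updated in place and returned.
    """
    for name, value in student.items():
        teacher[name] = momentum * teacher[name] + (1.0 - momentum) * value
    return teacher
```

With a momentum close to 1, the teacher changes slowly and so smooths out step-to-step fluctuations in the Student.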
Nasopharyngeal carcinoma with non-squamous phenotype may be a variant of nasopharyngeal squamous cell carcinoma after inhibition of EGFR/PI3K/AKT/mTOR pathway
Nasopharyngeal carcinoma (NPC) is a malignant tumor that develops in the nasopharyngeal epithelium and typically shows squamous differentiation. The squamous phenotype is evident on immunohistochemistry (IHC), with diffuse nuclear positivity for p63 and p40. Nonetheless, a few NPCs identified by clinicopathological diagnosis do not exhibit the squamous phenotype; these are currently referred to as non-squamous immunophenotype nasopharyngeal carcinomas (NSNPCs). In previous work, we revealed similarities between NSNPC and conventional NPC in histological appearance, etiology, and gene alterations. According to ultrastructural findings, NSNPC still falls under the category of undifferentiated non-keratinized squamous cell carcinoma. According to a retrospective investigation, NSNPC has an excellent prognosis and a low level of malignancy. Building on this prior research, we investigated the molecular mechanism by which NSNPC fails to express the squamous phenotype, as well as its biological behavior. IHC was used to determine the expression of EGFR, PI3K, AKT, p-AKT, mTOR, p-mTOR, Notch, STAT3 and p-STAT3 in a total of 20 NSNPC tissue samples and 20 classical NPC tissue samples. We obtained human NPC cell lines (CNE-2, 5-8F) and transfected them with an EGFR overexpression plasmid and shRNAs. mRNA and protein expression in the cells was measured by qRT-PCR and Western blotting. Cell biological behavior was assessed using the CCK-8 assay, cell migration assay, and cell invasion assay. On immunohistochemistry, EGFR, PI3K, p-AKT and p-mTOR proteins showed low expression in NSNPC tissues compared with classical NPC. In the classical NPC cell lines CNE-2 and 5-8F, overexpression of EGFR up-regulated the expression of p63 through the PI3K/AKT/mTOR pathway and promoted the proliferation, migration, and invasion of nasopharyngeal carcinoma cells. Conversely, knockdown of EGFR down-regulated p63 expression through the PI3K/AKT/mTOR pathway and inhibited the proliferation, migration, and invasion of nasopharyngeal carcinoma cells. The lack of p63 expression in NSNPC was linked with inhibition of the EGFR/PI3K/AKT/mTOR pathway, and NSNPC may be a variant of classical NPC.
Regulations of the key mediators in inflammation and atherosclerosis by Aspirin in human macrophages
Although its role in preventing secondary cardiovascular complications is well established, how acetylsalicylic acid (ASA, aspirin) regulates certain key molecules in atherogenesis is still not known. Considering the role of matrix metalloproteinase-9 (MMP-9) in destabilizing atherosclerotic plaques, the roles of scavenger receptor class BI (SR-BI) and ATP-binding cassette transporter A1 (ABCA1) in promoting cholesterol efflux from foam cells in the plaques, and the role of NF-κB in the overall inflammation related to atherosclerosis, we asked whether these molecules are all related to a common mechanism that may be regulated by ASA. We investigated the effect of ASA on the expression and activity of these molecules in THP-1 macrophages. Our results showed that ASA inhibited MMP-9 mRNA expression and decreased MMP-9 activity in cell culture supernatants. In addition, it inhibited the nuclear translocation of the NF-κB p65 subunit, and thus the activity of this inflammatory molecule. In contrast, ASA induced the expression of ABCA1 and SR-BI, two molecules known to reduce the progression of atherosclerosis, at both the mRNA and protein levels. It also stimulated cholesterol efflux from macrophages. These data suggest that ASA may alleviate symptoms of atherosclerosis by two potential mechanisms: maintaining plaque stability by inhibiting the activities of the inflammatory molecules MMP-9 and NF-κB, and increasing cholesterol efflux by inducing the expression of ABCA1 and SR-BI.
Enhancing the electric charge output in LiNbO3-based piezoelectric pressure sensors.
Lithium niobate (LiNbO3) single crystals are a kind of ferroelectric material with a high piezoelectric coefficient and Curie temperature, which is suitable for the preparation of piezoelectric pressure sensors. However, there is little research reporting on the use of LiNbO3 single crystals to prepare piezoelectric pressure sensors. Therefore, in this paper, LiNbO3 was used to prepare piezoelectric pressure sensors to study the feasibility of using LiNbO3 single crystals as a sensitive material for piezoelectric pressure sensors. In addition, chemical mechanical polishing (CMP) technology was used to prepare LiNbO3 crystals with different thicknesses to study the influence of these LiNbO3 crystals on the electric charge output of the sensors. The results showed that the sensitivity of a 300 μm sample (0.218 mV kPa⁻¹) was about 1.23 times that of a 500 μm sample (0.160 mV kPa⁻¹). Low-temperature polymer heterogeneous integration and oxygen plasma activation technologies were used to realize the heterogeneous integration of LiNbO3 and silicon to prepare piezoelectric pressure sensors, which could significantly improve the sensitivity of the sensor by approximately 16.06 times (2.569 mV kPa⁻¹) that of the original sample (0.160 mV kPa⁻¹) due to an appropriate residual stress that did not shatter LiNbO3 or silicon, thus providing a possible method for integrating piezoelectric pressure sensors and integrated circuits.
STPrivacy: Spatio-Temporal Privacy-Preserving Action Recognition
Existing methods of privacy-preserving action recognition (PPAR) mainly focus
on frame-level (spatial) privacy removal through 2D CNNs. Unfortunately, they
have two major drawbacks. First, they may compromise temporal dynamics in input
videos, which are critical for accurate action recognition. Second, they are
vulnerable to practical attacking scenarios where attackers probe for privacy
from an entire video rather than individual frames. To address these issues, we
propose a novel framework STPrivacy to perform video-level PPAR. For the first
time, we introduce vision Transformers into PPAR by treating a video as a
tubelet sequence, and accordingly design two complementary mechanisms, i.e.,
sparsification and anonymization, to remove privacy from a spatio-temporal
perspective. Specifically, our privacy sparsification mechanism applies adaptive
token selection to abandon action-irrelevant tubelets. Then, our anonymization
mechanism implicitly manipulates the remaining action-tubelets to erase privacy
in the embedding space through adversarial learning. These mechanisms provide
significant advantages in terms of privacy preservation for human eyes and
action-privacy trade-off adjustment during deployment. We additionally
contribute the first two large-scale PPAR benchmarks, VP-HMDB51 and VP-UCF101,
to the community. Extensive evaluations on them, as well as two other tasks,
validate the effectiveness and generalization capability of our framework.
MangaGAN: Unpaired Photo-to-Manga Translation Based on The Methodology of Manga Drawing
Manga is a globally popular comic form that originated in Japan, which typically
employs black-and-white stroke lines and geometric exaggeration to describe
humans' appearances, poses, and actions. In this paper, we propose MangaGAN,
the first method based on Generative Adversarial Network (GAN) for unpaired
photo-to-manga translation. Inspired by how experienced manga artists draw
manga, MangaGAN generates the geometric features of manga face by a designed
GAN model and delicately translates each facial region into the manga domain by
a tailored multi-GANs architecture. For training MangaGAN, we construct a new
dataset collected from a popular manga work, containing manga facial features,
landmarks, bodies, and so on. Moreover, to produce high-quality manga faces, we
further propose a structural smoothing loss to smooth stroke-lines and avoid
noisy pixels, and a similarity preserving module to improve the similarity
between domains of photo and manga. Extensive experiments show that MangaGAN
can produce high-quality manga faces which preserve both the facial similarity
and a popular manga style, and outperforms other related state-of-the-art
methods.
Comment: 17 page