Enhancing the Authenticity of Rendered Portraits with Identity-Consistent Transfer Learning
Despite rapid advances in computer graphics, creating high-quality
photo-realistic virtual portraits is prohibitively expensive. Furthermore, the
well-known "uncanny valley" effect in rendered portraits has a significant
impact on the user experience, especially when the depiction closely resembles
a human likeness, where any minor artifacts can evoke feelings of eeriness and
repulsiveness. In this paper, we present a novel photo-realistic portrait
generation framework that can effectively mitigate the "uncanny valley"
effect and improve the overall authenticity of rendered portraits. Our key idea
is to employ transfer learning to learn an identity-consistent mapping from the
latent space of rendered portraits to that of real portraits. During the
inference stage, the input portrait of an avatar can be directly transferred to
a realistic portrait by changing its appearance style while maintaining the
facial identity. To this end, we collect a new dataset, Daz-Rendered-Faces-HQ
(DRFHQ), that is specifically designed for rendering-style portraits. We
leverage this dataset to fine-tune the StyleGAN2 generator, using our carefully
crafted framework, which helps to preserve the geometric and color features
relevant to facial identity. We evaluate our framework using portraits with
diverse gender, age, and race variations. Qualitative and quantitative
evaluations and ablation studies show the advantages of our method compared to
state-of-the-art approaches.
Comment: 10 pages, 8 figures, 2 tables
All in One: Exploring Unified Vision-Language Tracking with Multi-Modal Alignment
The current mainstream vision-language (VL) tracking framework consists of three
parts, i.e., a visual feature extractor, a language feature extractor, and a
fusion model. To pursue better performance, a natural modus operandi for VL
tracking is employing customized and heavier unimodal encoders, and multi-modal
fusion models. Albeit effective, existing VL trackers separate feature
extraction and feature integration, resulting in extracted features that lack
semantic guidance and have limited target-aware capability in complex
scenarios, e.g., similar distractors and extreme illumination. In this work,
inspired by the recent success of exploring foundation models with unified
architecture for both natural language and computer vision tasks, we propose an
All-in-One framework, which learns joint feature extraction and interaction by
adopting a unified transformer backbone. Specifically, we mix raw vision and
language signals to generate language-injected vision tokens, which we then
concatenate before feeding into the unified backbone architecture. This
approach achieves feature integration in a unified backbone, removing the need
for carefully-designed fusion modules and resulting in a more effective and
efficient VL tracking framework. To further improve the learning efficiency, we
introduce a multi-modal alignment module based on cross-modal and intra-modal
contrastive objectives, providing more reasonable representations for the
unified All-in-One transformer backbone. Extensive experiments on five
benchmarks, i.e., OTB99-L, TNL2K, LaSOT, LaSOT_Ext, and WebUAV-3M,
demonstrate the superiority of the proposed tracker over existing
state-of-the-art methods on VL tracking. Code will be made publicly available.
Comment: Work in progress
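The abstract mentions cross-modal and intra-modal contrastive objectives but does not give their form. As a minimal sketch, assuming an InfoNCE-style loss over cosine similarities (the function names, the temperature value, and the toy embeddings are all illustrative assumptions, not taken from the paper), a cross-modal alignment term could look like this:

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit length so that dot() gives cosine similarity."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def info_nce(anchors, positives, temperature=0.07):
    """Mean InfoNCE loss: anchors[i] should match positives[i] and no other.

    Assumed form of a contrastive objective; the paper may use a different one.
    """
    a = [normalize(v) for v in anchors]
    p = [normalize(v) for v in positives]
    total = 0.0
    for i, ai in enumerate(a):
        logits = [dot(ai, pj) / temperature for pj in p]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        m = max(logits)
        log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_sum - logits[i]
    return total / len(a)


# Hypothetical toy embeddings: two vision tokens and two language tokens.
vision = [[1.0, 0.0], [0.0, 1.0]]
language = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = info_nce(vision, language)
shuffled_loss = info_nce(vision, language[::-1])
```

Under this assumption, a cross-modal term pairs vision embeddings with language embeddings as above, while an intra-modal term would pass two views of the same modality to the same function. Correctly paired embeddings give a lower loss than shuffled ones.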
On Bringing Robots Home
Throughout history, we have successfully integrated various machines into our
homes. Dishwashers, laundry machines, stand mixers, and robot vacuums are a few
recent examples. However, these machines excel at performing only a single task
effectively. The concept of a "generalist machine" in homes - a domestic
assistant that can adapt and learn from our needs, all while remaining
cost-effective - has long been a goal in robotics that has been steadily
pursued for decades. In this work, we initiate a large-scale effort towards
this goal by introducing Dobb-E, an affordable yet versatile general-purpose
system for learning robotic manipulation within household settings. Dobb-E can
learn a new task with only five minutes of a user showing it how to do it,
thanks to a demonstration collection tool ("The Stick") we built out of cheap
parts and iPhones. We use the Stick to collect 13 hours of data in 22 homes of
New York City, and train Home Pretrained Representations (HPR). Then, in a
novel home environment, with five minutes of demonstrations and fifteen minutes
of adapting the HPR model, we show that Dobb-E can reliably solve the task on
the Stretch, a mobile robot readily available on the market. Across roughly 30
days of experimentation in homes of New York City and surrounding areas, we
test our system in 10 homes, with a total of 109 tasks in different
environments, and finally achieve a success rate of 81%. Beyond success
percentages, our experiments reveal a plethora of unique challenges absent or
ignored in lab robotics. These range from effects of strong shadows, to
variable demonstration quality by non-expert users. With the hope of
accelerating research on home robots, and eventually seeing robot butlers in
every home, we open-source the Dobb-E software stack and models, our data, and our
hardware designs at https://dobb-e.com
Comment: Project website and videos are available at https://dobb-e.com,
technical documentation for getting started is available at
https://docs.dobb-e.com, and code is released at
https://github.com/notmahi/dobb-
Grace-AKO: a novel and stable knockoff filter for variable selection incorporating gene network structures
Motivation
Variable selection is a common statistical approach to identifying genes associated with clinical outcomes of scientific interest. There are thousands of genes in genomic studies, while only a limited number of individual samples are available. Therefore, it is important to develop a method to identify genes associated with outcomes of interest that can control finite-sample false discovery rate (FDR) in high-dimensional data settings.
Results
This article proposes a novel method named Grace-AKO for graph-constrained estimation (Grace), which incorporates aggregation of multiple knockoffs (AKO) with the network-constrained penalty. Grace-AKO can control FDR in finite-sample settings and improve model stability simultaneously. Simulation studies show that Grace-AKO has better performance in finite-sample FDR control than the original Grace model. We apply Grace-AKO to the prostate cancer data in The Cancer Genome Atlas program by incorporating prostate-specific antigen (PSA) pathways in the Kyoto Encyclopedia of Genes and Genomes as the prior information. Grace-AKO finally identifies 47 candidate genes associated with PSA level, and more than 75% of the detected genes can be validated
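For readers unfamiliar with FDR control, the sketch below shows the standard Benjamini-Hochberg step-up rule. It is not the Grace-AKO procedure, whose details the abstract does not give; the function name, the target level, and the p-values are illustrative assumptions only.

```python
def benjamini_hochberg(p_values, alpha=0.1):
    """Return sorted indices of hypotheses rejected by the BH step-up rule."""
    m = len(p_values)
    # Indices ordered from the smallest p-value to the largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p-value is under its threshold.
        if p_values[idx] <= alpha * rank / m:
            cutoff = rank
    return sorted(order[:cutoff])


# Hypothetical p-values for five candidate genes.
selected = benjamini_hochberg([0.001, 0.8, 0.02, 0.5, 0.03], alpha=0.1)
```

With these made-up inputs the rule selects the genes at indices 0, 2, and 4, whose p-values fall under the rank-scaled thresholds 0.02, 0.04, and 0.06.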
Model-Driven Federated Learning for Channel Estimation in Millimeter-Wave Massive MIMO Systems
This paper investigates model-driven federated learning (FL) for channel estimation in multi-user millimeter-wave (mmWave) massive multiple-input multiple-output (MIMO) systems. First, we formulate channel estimation as a sparse signal recovery problem by exploiting the beamspace-domain sparsity of mmWave channels. Then, we propose an FL-based learned approximate message passing (LAMP) channel estimation scheme, namely FL-LAMP, in which the LAMP network is trained within an FL framework. Specifically, the base station (BS) and the users jointly train the LAMP network: the users update the local LAMP network parameters using local datasets consisting of measurement signals and beamspace channels, and the BS calculates the global LAMP network parameters by aggregating the local network parameters from all the users. The beamspace channel can thus be obtained in real time from the measurement signal based on the parameters of the trained LAMP network. Simulation results demonstrate that the proposed FL-LAMP scheme achieves better channel estimation accuracy than the existing orthogonal matching pursuit (OMP) and approximate message passing (AMP) schemes, and provides satisfactory prediction capability for multipath channels.
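The abstract says the BS aggregates the users' local parameters but does not state the aggregation rule. A minimal sketch, assuming a FedAvg-style average weighted by local dataset size (the function name, the flat parameter lists, and the sample counts are hypothetical):

```python
def federated_average(local_params, num_samples):
    """Average per-user parameter vectors, weighted by local dataset size.

    Assumed aggregation rule; the paper may weight users differently.
    """
    total = sum(num_samples)
    dim = len(local_params[0])
    return [
        sum(n * params[k] for params, n in zip(local_params, num_samples)) / total
        for k in range(dim)
    ]


# Two hypothetical users holding 1 and 3 local training samples.
global_params = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

In each training round under this assumption, every user would refine its copy of the parameters on local data, and the BS would call `federated_average` and broadcast the result. Here the second user's parameters dominate because it holds three times as much data.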
Protection against acute cerebral ischemia/reperfusion injury by QiShenYiQi via neuroinflammatory network mobilization
Cerebral ischemia/reperfusion injury (CI/RI) is a common feature of ischemic stroke, involving a period of impaired blood supply to the brain, followed by the restoration of cerebral perfusion through medical intervention. Although ischemia and reperfusion brain damage is a complex pathological process with an unclear physiological mechanism, more attention is currently focused on the neuroinflammatory response of ischemia/reperfusion origin, and anti-inflammatory treatment appears to be a potential therapeutic strategy following ischemic stroke. QiShenYiQi (QSYQ), a component-based Chinese medicine with Qi-tonifying and blood-activating properties, has anti-inflammatory, antioxidant, mitochondrial-protective, anti-apoptotic, and antiplatelet-aggregation pharmacological actions. We have previously reported that the cardioprotective effect of QSYQ against ischemia/reperfusion injury is via improvement of mitochondrial functional integrity. In this work, we aimed to investigate the possible mechanism involved in the neuroprotection of QSYQ in a mouse model of cerebral ischemia/reperfusion injury based on the inflammatory pathway. Cerebral protection was evaluated in the stroke mice after 24 h of reperfusion by assessing the neurological deficit, cerebral infarction, brain edema, blood-brain barrier (BBB) functionality, and histopathology. A TCM-based network pharmacology method was used to establish and analyze the compound-target-disease and function-pathway network so as to find the possible mechanism underlying the role of QSYQ in CI/RI. In addition, RT-qPCR was used to verify the accuracy of the predicted signaling gene expression. As a result, improvement of neurological outcome, reduction of infarct volume and brain edema, a decrease in BBB disruption, and amelioration of histopathological alteration were observed in mice pretreated with QSYQ after experimental stroke surgery. Network pharmacology analysis revealed that the neuroinflammatory response was associated with the action of QSYQ in CI/RI. RT-qPCR data showed that mice pretreated with QSYQ had significantly decreased IFNG-γ, IL-6, TNF-α, NF-κB p65, and TLR-4 mRNA levels and an increased TGF-β1 mRNA level in the brain compared to the untreated mice after CI/RI (p < 0.05). In conclusion, our study indicated a cerebral protective effect of pretreatment with QSYQ against CI/RI, which may be partly related to its potential to reduce the neuroinflammatory response in a stroke subject.