239 research outputs found

    Enhancing the Authenticity of Rendered Portraits with Identity-Consistent Transfer Learning

    Despite rapid advances in computer graphics, creating high-quality photo-realistic virtual portraits is prohibitively expensive. Furthermore, the well-known "uncanny valley" effect in rendered portraits has a significant impact on the user experience, especially when the depiction closely resembles a human likeness, where any minor artifacts can evoke feelings of eeriness and repulsiveness. In this paper, we present a novel photo-realistic portrait generation framework that can effectively mitigate the "uncanny valley" effect and improve the overall authenticity of rendered portraits. Our key idea is to employ transfer learning to learn an identity-consistent mapping from the latent space of rendered portraits to that of real portraits. During the inference stage, the input portrait of an avatar can be directly transferred to a realistic portrait by changing its appearance style while maintaining the facial identity. To this end, we collect a new dataset, Daz-Rendered-Faces-HQ (DRFHQ), that is specifically designed for rendering-style portraits. We leverage this dataset to fine-tune the StyleGAN2 generator, using our carefully crafted framework, which helps to preserve the geometric and color features relevant to facial identity. We evaluate our framework using portraits with diverse gender, age, and race variations. Qualitative and quantitative evaluations and ablation studies show the advantages of our method compared to state-of-the-art approaches. Comment: 10 pages, 8 figures, 2 tables

    All in One: Exploring Unified Vision-Language Tracking with Multi-Modal Alignment

    The current mainstream vision-language (VL) tracking framework consists of three parts, i.e., a visual feature extractor, a language feature extractor, and a fusion model. To pursue better performance, a natural modus operandi for VL tracking is employing customized and heavier unimodal encoders and multi-modal fusion models. Albeit effective, existing VL trackers separate feature extraction and feature integration, resulting in extracted features that lack semantic guidance and have limited target-aware capability in complex scenarios, e.g., similar distractors and extreme illumination. In this work, inspired by the recent success of exploring foundation models with unified architecture for both natural language and computer vision tasks, we propose an All-in-One framework, which learns joint feature extraction and interaction by adopting a unified transformer backbone. Specifically, we mix raw vision and language signals to generate language-injected vision tokens, which we then concatenate before feeding into the unified backbone architecture. This approach achieves feature integration in a unified backbone, removing the need for carefully designed fusion modules and resulting in a more effective and efficient VL tracking framework. To further improve the learning efficiency, we introduce a multi-modal alignment module based on cross-modal and intra-modal contrastive objectives, providing more reasonable representations for the unified All-in-One transformer backbone. Extensive experiments on five benchmarks, i.e., OTB99-L, TNL2K, LaSOT, LaSOT_Ext and WebUAV-3M, demonstrate the superiority of the proposed tracker over existing state-of-the-art methods on VL tracking. Codes will be made publicly available. Comment: Work in progress
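    As a rough illustration of what a cross-modal contrastive objective looks like, the sketch below computes a vision-to-language InfoNCE-style loss over paired embeddings. It is not the paper's alignment module: the function name `info_nce`, the temperature value, and the convention that matched vision/language pairs share an index are all assumptions made for this example, and the intra-modal term mentioned in the abstract is omitted.

```python
import math


def info_nce(vision, language, temperature=0.1):
    """Vision-to-language InfoNCE loss (hypothetical helper).

    vision[i] and language[i] are assumed to be a matched pair;
    every other language vector in the batch acts as a negative.
    """
    def normalize(vec):
        norm = math.sqrt(sum(x * x for x in vec))
        return [x / norm for x in vec]

    v = [normalize(x) for x in vision]
    t = [normalize(x) for x in language]

    loss = 0.0
    for i, vi in enumerate(v):
        # Cosine similarity to every language vector, scaled by temperature.
        logits = [sum(a * b for a, b in zip(vi, tj)) / temperature for tj in t]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(z - m) for z in logits))
        # Cross-entropy with the matched pair as the correct class.
        loss += log_denominator - logits[i]
    return loss / len(v)


# Toy embeddings: aligned pairs should score a much lower loss than swapped pairs.
vision_tokens = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = info_nce(vision_tokens, [[1.0, 0.0], [0.0, 1.0]])
swapped_loss = info_nce(vision_tokens, [[0.0, 1.0], [1.0, 0.0]])
```

    Minimizing such a loss pulls each vision embedding toward its paired language embedding and pushes it away from the others in the batch, which is the general idea behind contrastive alignment.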

    On Bringing Robots Home

    Throughout history, we have successfully integrated various machines into our homes. Dishwashers, laundry machines, stand mixers, and robot vacuums are a few recent examples. However, these machines excel at performing only a single task effectively. The concept of a "generalist machine" in homes - a domestic assistant that can adapt and learn from our needs, all while remaining cost-effective - has long been a goal in robotics that has been steadily pursued for decades. In this work, we initiate a large-scale effort towards this goal by introducing Dobb-E, an affordable yet versatile general-purpose system for learning robotic manipulation within household settings. Dobb-E can learn a new task with only five minutes of a user showing it how to do it, thanks to a demonstration collection tool ("The Stick") we built out of cheap parts and iPhones. We use the Stick to collect 13 hours of data in 22 homes of New York City, and train Home Pretrained Representations (HPR). Then, in a novel home environment, with five minutes of demonstrations and fifteen minutes of adapting the HPR model, we show that Dobb-E can reliably solve the task on the Stretch, a mobile robot readily available on the market. Across roughly 30 days of experimentation in homes of New York City and surrounding areas, we test our system in 10 homes, with a total of 109 tasks in different environments, and finally achieve a success rate of 81%. Beyond success percentages, our experiments reveal a plethora of unique challenges absent or ignored in lab robotics. These range from effects of strong shadows, to variable demonstration quality by non-expert users. 
    With the hope of accelerating research on home robots, and eventually seeing robot butlers in every home, we open-source the Dobb-E software stack and models, our data, and our hardware designs at https://dobb-e.com. Comment: Project website and videos are available at https://dobb-e.com, technical documentation for getting started is available at https://docs.dobb-e.com, and code is released at https://github.com/notmahi/dobb-

    Model-Driven Federated Learning for Channel Estimation in Millimeter-Wave Massive MIMO Systems

    This paper investigates model-driven federated learning (FL) for channel estimation in multi-user millimeter-wave (mmWave) massive multiple-input multiple-output (MIMO) systems. First, we formulate channel estimation as a sparse signal recovery problem by exploiting the beamspace domain sparsity of the mmWave channels. Then, we propose an FL-based learned approximate message passing (LAMP) channel estimation scheme, namely FL-LAMP, where the LAMP network is trained by an FL framework. Specifically, the base station (BS) and users jointly train the LAMP network, where the users update the local LAMP network parameters using local datasets consisting of measurement signals and beamspace channels, and the BS calculates the global LAMP network parameters by aggregating the local network parameters from all the users. The beamspace channel can thus be obtained in real time from the measurement signal based on the parameters of the trained LAMP network. Simulation results demonstrate that the proposed FL-LAMP scheme can achieve better channel estimation accuracy than the existing orthogonal matching pursuit (OMP) and approximate message passing (AMP) schemes, and provides satisfactory prediction capability for multipath channels.
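    The aggregation step at the base station can be sketched as a weighted average of the users' parameters. The abstract does not say how the local parameters are combined, so the sample-count weighting used here (FedAvg-style) is an assumption, as are the function name `federated_average` and the parameter names `step_size` and `threshold`, which are placeholders rather than the actual LAMP network parameters.

```python
def federated_average(local_params, num_samples):
    """Aggregate per-user parameters into global parameters (hypothetical helper).

    local_params: one dict per user, mapping a parameter name to a list of floats.
    num_samples:  local dataset size per user, used as the averaging weight
                  (an assumed weighting; the source only says "aggregating").
    """
    total = float(sum(num_samples))
    aggregated = {}
    for name in local_params[0]:
        length = len(local_params[0][name])
        aggregated[name] = [
            sum(n / total * params[name][i]
                for params, n in zip(local_params, num_samples))
            for i in range(length)
        ]
    return aggregated


# Two users with placeholder parameters; the second user holds three times more data.
user_params = [
    {"step_size": [1.0, 2.0], "threshold": [0.5]},
    {"step_size": [3.0, 4.0], "threshold": [1.5]},
]
global_params = federated_average(user_params, num_samples=[100, 300])
```

    In a full training loop, the BS would broadcast `global_params` back to the users, who would run further local updates before the next aggregation round.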

    GeSICA: Genome segmentation from intra-chromosomal associations


    Protection against acute cerebral ischemia/reperfusion injury by QiShenYiQi via neuroinflammatory network mobilization

    Cerebral ischemia/reperfusion injury (CI/RI) is a common feature of ischemic stroke, involving a period of impaired blood supply to the brain, followed by the restoration of cerebral perfusion through medical intervention. Although ischemia and reperfusion brain damage is a complex pathological process with an unclear physiological mechanism, more attention is currently focused on the neuroinflammatory response of an ischemia/reperfusion origin, and anti-inflammatory treatment appears to be a potential therapeutic strategy following ischemic stroke. QiShenYiQi (QSYQ), a component-based Chinese medicine with Qi-tonifying and blood-activating properties, has anti-inflammatory, antioxidant, mitochondrial-protective, anti-apoptotic, and antiplatelet-aggregation actions. We have previously reported that the cardioprotective effect of QSYQ against ischemia/reperfusion injury is via improvement of mitochondrial functional integrity. In this work, we aimed to investigate the possible mechanism involved in the neuroprotection of QSYQ in a mouse model of cerebral ischemia/reperfusion injury based on the inflammatory pathway. The cerebral protection was evaluated in the stroke mice after 24 h reperfusion by assessing the neurological deficit, cerebral infarction, brain edema, and BBB functionality, and via histopathological assessment. A TCM-based network pharmacology method was used to establish and analyze a compound-target-disease and function-pathway network so as to find the possible mechanism underlying the role of QSYQ in CI/RI. In addition, RT-qPCR was used to verify the accuracy of predicted signaling gene expression. As a result, improvement of neurological outcome, reduction of infarct volume and brain edema, a decrease in BBB disruption, and amelioration of histopathological alteration were observed in mice pretreated with QSYQ after experimental stroke surgery. Network pharmacology analysis revealed that the neuroinflammatory response was associated with the action of QSYQ in CI/RI. RT-qPCR data showed that pretreatment with QSYQ significantly decreased IFNG-γ, IL-6, TNF-α, NF-κB p65, and TLR-4 mRNA levels and increased the TGF-β1 mRNA level in the brain compared with untreated mice after CI/RI (p < 0.05). In conclusion, our study indicated the cerebral protective effect of pretreatment with QSYQ against CI/RI, which may be partly related to its potential to reduce the neuroinflammatory response in a stroke subject.