
    GANHead: Towards Generative Animatable Neural Head Avatars

    To bring digital avatars into people's lives, there is strong demand for efficiently generating complete, realistic, and animatable head avatars. This task is challenging, and it is difficult for existing methods to satisfy all the requirements at once. To achieve these goals, we propose GANHead (Generative Animatable Neural Head Avatar), a novel generative head model that takes advantage of both the fine-grained control offered by explicit expression parameters and the realistic rendering results of implicit representations. Specifically, GANHead represents coarse geometry, fine-grained details, and texture via three networks in canonical space, gaining the ability to generate complete and realistic head avatars. To achieve flexible animation, we define the deformation field by standard linear blend skinning (LBS), with learned continuous pose and expression bases and LBS weights. This allows the avatars to be directly animated by FLAME parameters and to generalize well to unseen poses and expressions. Compared to state-of-the-art (SOTA) methods, GANHead achieves superior performance on head avatar generation and raw scan fitting. Comment: Camera-ready for CVPR 2023. Project page: https://wsj-sjtu.github.io/GANHead
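    The deformation described above is standard linear blend skinning: each canonical point is rigidly transformed by every bone, and the results are blended with per-point weights. A minimal NumPy sketch of that operation; the function and variable names are illustrative, not GANHead's actual code:

```python
import numpy as np

def linear_blend_skinning(points, weights, rotations, translations):
    """Deform canonical points with standard LBS.

    points:       (N, 3) canonical-space points
    weights:      (N, K) skinning weights per point (rows sum to 1)
    rotations:    (K, 3, 3) per-bone rotation matrices
    translations: (K, 3) per-bone translations
    """
    # Rigidly transform every point by every bone: (K, N, 3)
    transformed = np.einsum('kij,nj->kni', rotations, points) + translations[:, None, :]
    # Blend the K candidate positions with the per-point weights
    return np.einsum('nk,kni->ni', weights, transformed)

# Toy usage: two bones, identity vs. a unit translation in x
pts = np.random.rand(5, 3)
w = np.full((5, 2), 0.5)
R = np.stack([np.eye(3), np.eye(3)])
t = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
deformed = linear_blend_skinning(pts, w, R, t)  # each point shifts by 0.5 in x
```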

    Q-Bench: A Benchmark for General-Purpose Foundation Models on Low-level Vision

    The rapid evolution of Multi-modality Large Language Models (MLLMs) has catalyzed a shift in computer vision from specialized models to general-purpose foundation models. Nevertheless, the abilities of MLLMs on low-level visual perception and understanding remain inadequately assessed. To address this gap, we present Q-Bench, a holistic benchmark crafted to systematically evaluate the potential abilities of MLLMs in three realms: low-level visual perception, low-level visual description, and overall visual quality assessment. a) To evaluate the low-level perception ability, we construct the LLVisionQA dataset, consisting of 2,990 diversely sourced images, each equipped with a human-asked question focusing on its low-level attributes. We then measure the correctness of MLLMs in answering these questions. b) To examine the description ability of MLLMs on low-level information, we propose the LLDescribe dataset, consisting of long, expert-labelled golden low-level text descriptions for 499 images, together with a GPT-involved comparison pipeline between MLLM outputs and the golden descriptions. c) Beyond these two tasks, we further measure their visual quality assessment ability to align with human opinion scores. Specifically, we design a softmax-based strategy that enables MLLMs to predict quantifiable quality scores, and evaluate them on various existing image quality assessment (IQA) datasets. Our evaluation across the three abilities confirms that MLLMs possess preliminary low-level visual skills. However, these skills are still unstable and relatively imprecise, indicating the need for specific enhancements to MLLMs in these abilities. We hope that our benchmark can encourage the research community to delve deeper to discover and enhance these untapped potentials of MLLMs. Project page: https://vqassessment.github.io/Q-Bench. Comment: 25 pages, 14 figures, 9 tables, preprint version
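    The softmax-based scoring strategy can be illustrated with a short sketch: compare the logits a model assigns to opposing rating words and take the softmax weight of the positive one as the quality score. The specific token pair and prompt below are assumptions for illustration, not necessarily Q-Bench's exact setup:

```python
import math

def softmax_quality_score(logit_good: float, logit_poor: float) -> float:
    """Map an MLLM's logits for the answer tokens 'good' vs. 'poor'
    (to a prompt like 'Rate the quality of this image.') into [0, 1]."""
    p_good = math.exp(logit_good)
    p_poor = math.exp(logit_poor)
    return p_good / (p_good + p_poor)

# A model leaning toward 'good' yields a score above 0.5
print(softmax_quality_score(2.1, 0.3))  # ~0.858
```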

    Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models

    Multi-modality foundation models, as represented by GPT-4V, have brought a new paradigm for low-level visual perception and understanding tasks: a single model can respond to a broad range of natural human instructions. While existing foundation models have shown exciting potential on low-level visual tasks, their related abilities are still preliminary and need to be improved. To enhance these models, we conduct a large-scale subjective experiment collecting a vast amount of real human feedback on low-level vision. Each feedback item follows a pathway that starts with a detailed description of the low-level visual appearance (*e.g.*, clarity, color, brightness of an image) and ends with an overall conclusion, with an average length of 45 words. The constructed **Q-Pathway** dataset includes 58K detailed human feedback items on 18,973 images with diverse low-level appearance. Moreover, to enable foundation models to robustly respond to diverse types of questions, we design a GPT-participated conversion to process these feedback items into 200K instruction-response pairs in diverse formats. Experimental results indicate that **Q-Instruct** consistently elevates low-level perception and understanding abilities across several foundation models. We anticipate that our datasets can pave the way for a future in which general intelligence can perceive and understand low-level visual appearance and evaluate visual quality like a human. Our dataset, model zoo, and demo are published at: https://q-future.github.io/Q-Instruct. Comment: 16 pages, 11 figures, pages 12-16 as appendix
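    The GPT-participated conversion can be pictured as a prompting loop over the collected feedback. A minimal sketch, in which the prompt template and the `ask_gpt` helper are hypothetical stand-ins rather than the authors' actual pipeline:

```python
import json

PROMPT = (
    "Here is a human description of an image's low-level appearance:\n"
    "{pathway}\n"
    "Rewrite it as a question-answer pair about the image's quality. "
    "Reply as JSON: {{\"instruction\": ..., \"response\": ...}}"
)

def ask_gpt(prompt: str) -> str:
    """Hypothetical wrapper around a GPT API call."""
    raise NotImplementedError

def convert_feedback(pathways: list[str]) -> list[dict]:
    """Turn free-form feedback pathways into instruction-response pairs."""
    pairs = []
    for pathway in pathways:
        reply = ask_gpt(PROMPT.format(pathway=pathway))
        pairs.append(json.loads(reply))
    return pairs
```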

    Retaining mTeSR1 Medium during Hepatic Differentiation Facilitates Hepatocyte-Like Cell Survival by Decreasing Apoptosis

    Background/Aims: Hepatocyte-like cells derived from human pluripotent stem cells could be an important cell source for hepatocyte transplantation. The present study investigated the effect of retaining mTeSR1 medium during hepatic differentiation on hepatocyte-like cells in vitro.
    Methods: Cells of the human embryonic stem cell line H1 were treated with activin A and bone morphogenetic protein 4 (BMP4) for definitive endoderm (DE) induction and subsequently with BMP2 and fibroblast growth factor 4 (FGF4) for early hepatic induction. Hepatocyte growth factor (HGF) and keratinocyte growth factor (KGF) were added for early hepatic cell expansion and then combined with oncostatin-M for maturation. mTeSR1 medium was retained at concentrations of 0%, 25%, 50% and 75% during DE induction, early hepatic induction and expansion. For optimization, the expression levels of SRY-related HMG-box 17 (SOX17) and forkhead box A2 (FOXA2) at day 4, alpha fetoprotein (AFP) and hepatocyte nuclear factor 4α (HNF4α) at day 15, and albumin (ALB) at day 25 were quantified in the differentiated cells by qRT-PCR. The proportion of ALB-positive cells was measured by flow cytometry. Functional tests were performed on the differentiated cells, including ALB secretion (by ELISA), indocyanine green (ICG) uptake and release, urea production (by urea assay kit), and glycogen storage ability (by periodic acid-Schiff (PAS) staining). Induced pluripotent stem (iPS) cells were used to examine whether the optimized method was also suitable for differentiating iPS cells; DE and hepatic markers were detected by immunostaining, and functional testing was performed as described above. Flow cytometry with an Annexin V-FITC apoptosis detection kit and fluorescence microscopy with Hoechst 33258 were used to analyze apoptosis in differentiated cells derived from H1 cells.
    Results: Differentiated cells in all retention groups (0%, 25%, 50% and 75% mTeSR1) expressed SOX17, FOXA2, AFP, HNF4α and ALB, with higher expression levels observed in the 0% and 25% groups. Flow cytometry showed that the proportion of ALB-positive differentiated cells derived from H1 cells was higher in the 25% mTeSR1 group than in the other groups. However, no significant differences in ALB secretion, urea production, ICG uptake and release, or glycogen storage ability were detected between the 25% and 0% groups. iPS cells could also differentiate into hepatocyte-like cells with 25% mTeSR1 retention. The apoptosis ratio of differentiated cells was lower in the 25% mTeSR1 group than in the 0% mTeSR1 group.
    Conclusion: Retaining 25% mTeSR1 medium during hepatic differentiation is proposed to increase the percentage of ALB-positive cells and to improve cell survival by decreasing apoptosis.

    Learning Local Neighboring Structure for Robust 3D Shape Representation

    Mesh is a powerful data structure for 3D shapes. Representation learning for 3D meshes is important in many computer vision and graphics applications. The recent success of convolutional neural networks (CNNs) on structured data (e.g., images) suggests the value of adapting their insights to 3D shapes. However, 3D shape data are irregular, since each node's neighbors are unordered. Various graph neural networks for 3D shapes have been developed with isotropic filters or predefined local coordinate systems to overcome this node inconsistency on graphs; however, both choices limit the representation power. In this paper, we propose a local structure-aware anisotropic convolutional operation (LSA-Conv) that learns an adaptive weighting matrix for each node according to its local neighboring structure and applies shared anisotropic filters. In fact, the learnable weighting matrix is similar to the attention matrix in the Random Synthesizer, a recent Transformer model for natural language processing (NLP). Comprehensive experiments demonstrate that our model produces significant improvements in 3D shape reconstruction compared to state-of-the-art methods.
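    The core operation can be sketched in PyTorch: each node owns a learnable weighting matrix that mixes its unordered neighbors before a shared filter is applied. The dimensions, initialization, and module names below are assumptions for illustration, not the paper's implementation:

```python
import torch
import torch.nn as nn

class LSAConvSketch(nn.Module):
    """Per-node learnable neighbor weighting + shared anisotropic filter."""

    def __init__(self, num_nodes: int, k: int, in_dim: int, out_dim: int):
        super().__init__()
        # One learnable (k x k) weighting matrix per node, akin to the
        # attention matrix in the Random Synthesizer
        self.node_weights = nn.Parameter(torch.randn(num_nodes, k, k) * 0.01)
        # Shared filter applied to the k reweighted neighbor features
        self.filter = nn.Linear(k * in_dim, out_dim)

    def forward(self, x, neighbor_idx):
        # x: (B, N, C) node features; neighbor_idx: (N, k) long tensor of
        # each node's fixed (but unordered) neighbor list
        B, N, C = x.shape
        neigh = x[:, neighbor_idx]                    # (B, N, k, C)
        W = torch.softmax(self.node_weights, dim=-1)  # (N, k, k)
        # Mix the unordered neighbors with the node-specific weights
        mixed = torch.einsum('nkj,bnjc->bnkc', W, neigh)
        return self.filter(mixed.reshape(B, N, -1))   # (B, N, out_dim)
```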

    A pilot and comparative study between pathological and serological levels of immunoglobulin and complement among three kinds of primary glomerulonephritis

    Abstract Background: Immunoglobulin A nephropathy (IgAN), membranous nephropathy (MN) and minimal-change disease (MCD) are three common types of glomerulonephritis in China. Pathological diagnosis based on renal biopsy is the gold standard for diagnosing the subtypes of primary or secondary glomerulonephritis. Immunoglobulins and complements might be used in the differential diagnosis of glomerulonephritis without renal biopsy. However, the relationship between the immunofluorescence (IF) intensities of immune proteins and the corresponding serum levels remains unclear, and few studies combine histopathological examination results and blood tests for predictive purposes. This pilot study integrated histopathological indicators with serum parameters to explore the relationship between IF intensity and serum values of immunoglobulin and complement, and to screen for effective indicators in IgAN, MN and MCD.
    Methods: Renal tissue IF intensity grades and serum levels of immunoglobulins and complements (IgG, IgA, IgM, C3 and C4) were retrospectively analyzed in 236 cases of IgAN, MN or MCD. IF grades were grouped as negative (−), positive (+) or strongly positive (++) under both high and low microscope magnification. Other serum indicators, such as blood urea nitrogen (BUN), creatinine (Crea) and estimated glomerular filtration rate (eGFR), were also compared among the groups.
    Results: IgA, IgG and C3 IF intensity grades differed among the IgAN, MN and MCD groups (p = 9.82E-43, 4.60E-39 and 7.45E-15, respectively). Serum values of BUN, Crea, eGFR, IgG, IgA, IgM and C4 also differed among the three groups (BUN: p = 0.045; Crea: p = 3.45E-5; eGFR: p = 0.005; IgG: p = 1.68E-14; IgA: p = 9.14E-9; IgM: p = 0.014; C4: p = 0.026). eGFR tended to decrease with higher IgA IF grades (p = 8.99E-4); Crea tended to decrease with higher IgA and IgG IF grades (p = 2.06E-6 and 2.94E-5, respectively). In all subjects, serum IgA levels were inversely correlated with eGFR (r = −0.216, p = 0.001) and correlated with Crea levels (r = 0.189, p = 0.004); serum IgG showed no correlation with Crea (r = 0.058, p = 0.379), which was discordant with the inverse relationship between IgG IF grade and Crea. Serum IgG levels were inversely correlated with IgG IF grades (p = 3.54E-5 and 7.08E-6, respectively); serum C3 levels differed significantly between the negative and positive (+) groups (p = 0.0003). Serum IgA levels were positively correlated with IgA IF grades (Neg-(+): p = 0.0001; (+)-(++): p = 0.022; Neg-(++): p = 2.01E-10). In pairwise comparisons among the C3 groups, the C3 Neg and C3 ++ groups differed (p = 0.017). C4 IF expression was negative in all pathological groups. In IgAN subjects, serum C3 levels differed between the pathological Neg and positive (+) groups (p = 0.026), and serum IgA levels differed between the IgA pathological positive (+) and (++) groups (p = 0.007). In MN subjects, serum IgG levels differed between IgG IF grades positive (+) and (++) (p = 0.044); serum C3 levels differed between C3 IF grades Neg and positive (+) (p = 0.005); and serum IgA levels differed between Neg and positive (+) (p = 0.040). In IgAN, serum IgA levels differed significantly among eGFR groups (p = 0.007) and showed an increasing trend with higher IF grades (Ptrend = 0.016); differences were also found between the IgG Neg and positive (+) groups (p = 0.005, Ptrend = 0.007). In MN, serum IgG levels differed significantly among IF groups (p = 0.034) and showed a decreasing trend with higher IF grades (Ptrend = 0.014); serum C3 concentrations also differed among IF groups (p = 0.016) and were inversely related to IF grades (Ptrend = 0.004).
    Discussion: Our study cross-compares several immunoprotein IF intensities with the corresponding serum levels in three kinds of primary glomerulonephritis, yielding helpful results for understanding the relationships between the pathological and serological presentation of immunoproteins in kidney disease. Furthermore, this pilot study offers a possible method for the combined analysis of pathology and serology.
    Conclusion: Different pathological types of nephritis presented different expression patterns of immunoglobulin and complement, especially IgA and IgG, suggesting that different pathogenic mechanisms are involved in the development of IgAN and MN. Furthermore, both in tissue and in serum, increased IgA level was closely related to renal function in all patients.
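    A sketch of the kind of analysis reported above, assuming one table row per patient; the column names are illustrative, and the choice of rank-based tests (Kruskal-Wallis, Spearman) is an assumption, since the abstract does not name the exact tests used:

```python
import pandas as pd
from scipy import stats

# df columns (illustrative): 'group' in {'IgAN', 'MN', 'MCD'},
# 'IgA_IF' graded 0/1/2 for -, +, ++, and serum 'IgA', 'Crea', 'eGFR'
def compare_groups(df: pd.DataFrame) -> None:
    # Difference in serum IgA across the three disease groups
    samples = [g['IgA'].dropna() for _, g in df.groupby('group')]
    h, p = stats.kruskal(*samples)
    print(f'Serum IgA across disease groups: p = {p:.3g}')

    # Correlation of serum IgA with eGFR, as in the reported r = -0.216
    r, p = stats.spearmanr(df['IgA'], df['eGFR'], nan_policy='omit')
    print(f'Serum IgA vs. eGFR: r = {r:.3f}, p = {p:.3g}')

    # Serum level vs. IF intensity grade (ordinal trend)
    r, p = stats.spearmanr(df['IgA'], df['IgA_IF'], nan_policy='omit')
    print(f'Serum IgA vs. IF grade: r = {r:.3f}, p = {p:.3g}')
```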

    DialogueNeRF: Towards Realistic Avatar Face-to-face Conversation Video Generation

    Conversation is an essential component of virtual avatar activities in the metaverse. With the development of natural language processing, textual and vocal conversation generation has achieved significant breakthroughs. Face-to-face conversations account for the vast majority of daily conversations; however, this task has not received enough attention. In this paper, we propose a novel task that aims to generate a realistic face-to-face conversation process between human avatars, and we present a new dataset to explore this target. To tackle this novel task, we propose a new framework that utilizes a series of conversation signals, e.g., audio, head pose, and expression, to synthesize face-to-face conversation videos between human avatars, with all interlocutors modeled within the same network. Our method is evaluated by quantitative and qualitative experiments on different aspects, e.g., image quality, pose sequence trends, and the naturalness of the rendered videos. All the code, data, and models will be made publicly available.