68 research outputs found

    VQ-NeRF: Vector Quantization Enhances Implicit Neural Representations

    Recent advancements in implicit neural representations have contributed to high-fidelity surface reconstruction and photorealistic novel view synthesis. However, the computational complexity inherent in these methodologies presents a substantial impediment, constraining the attainable frame rates and resolutions in practical applications. In response to this predicament, we propose VQ-NeRF, an effective and efficient pipeline for enhancing implicit neural representations via vector quantization. The essence of our method involves reducing the sampling space of NeRF to a lower resolution and subsequently reinstating it to the original size utilizing a pre-trained VAE decoder, thereby effectively mitigating the sampling time bottleneck encountered during rendering. Although the codebook furnishes representative features, reconstructing fine texture details of the scene remains challenging due to high compression rates. To overcome this constraint, we design an innovative multi-scale NeRF sampling scheme that concurrently optimizes the NeRF model at both compressed and original scales to enhance the network's ability to preserve fine details. Furthermore, we incorporate a semantic loss function to improve the geometric fidelity and semantic coherence of our 3D reconstructions. Extensive experiments demonstrate the effectiveness of our model in achieving the optimal trade-off between rendering quality and efficiency. Evaluation on the DTU, BlendMVS, and H3DS datasets confirms the superior performance of our approach. Comment: Submitted to the 38th Annual AAAI Conference on Artificial Intelligence
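    As a rough illustration of the vector quantization idea mentioned in this abstract, the sketch below maps feature vectors to their nearest codebook entries and back. It is a generic nearest-neighbour lookup, not the VQ-NeRF pipeline itself; the function names, the toy codebook, and the feature values are all assumptions made for the example.

```python
# Generic vector quantization sketch (hypothetical names and toy data,
# not taken from the VQ-NeRF paper).

def quantize(vectors, codebook):
    """Return, for each vector, the index of its nearest codebook entry
    under squared Euclidean distance."""
    indices = []
    for v in vectors:
        dists = [sum((a - b) ** 2 for a, b in zip(v, c)) for c in codebook]
        indices.append(dists.index(min(dists)))
    return indices


def dequantize(indices, codebook):
    """Replace each index with the codebook entry it refers to."""
    return [codebook[i] for i in indices]


# Toy 2-D codebook and features, invented for illustration.
codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
features = [(0.1, -0.1), (0.9, 0.8), (0.2, 0.9)]

indices = quantize(features, codebook)
reconstructed = dequantize(indices, codebook)
```

    Storing only the indices is what gives the compression; the reconstruction error is the distance between each feature and its chosen codebook entry.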

    PDF: Point Diffusion Implicit Function for Large-scale Scene Neural Representation

    Recent advances in implicit neural representations have achieved impressive results by sampling and fusing individual points along sampling rays in the sampling space. However, due to the explosively growing sampling space, finely representing and synthesizing detailed textures remains a challenge for unbounded large-scale outdoor scenes. To alleviate the dilemma of using individual points to perceive the entire colossal space, we explore learning the surface distribution of the scene to provide structural priors and reduce the samplable space, and propose a Point Diffusion implicit Function, PDF, for large-scale scene neural representation. The core of our method is a large-scale point cloud super-resolution diffusion module that enhances the sparse point cloud reconstructed from several training images into a dense point cloud as an explicit prior. Then, in the rendering stage, only sampling points with prior points within the sampling radius are retained. That is, the sampling space is reduced from the unbounded space to the scene surface. Meanwhile, to fill in the background of the scene that cannot be provided by point clouds, region sampling based on Mip-NeRF 360 is employed to model the background representation. Extensive experiments have demonstrated the effectiveness of our method for large-scale scene novel view synthesis, which outperforms relevant state-of-the-art baselines. Comment: Accepted to NeurIPS 202
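    The retention rule described in this abstract (keep a sampling point only if some prior point lies within the sampling radius) can be sketched as a brute-force filter. This is a minimal illustration under assumed names and made-up coordinates, not the authors' implementation; a real system would likely use a spatial index rather than a full scan.

```python
import math

# Brute-force radius filter (hypothetical function name and toy data).

def keep_samples_near_prior(samples, prior_points, radius):
    """Keep only the samples that have at least one prior point
    within `radius` (Euclidean distance)."""
    kept = []
    for s in samples:
        if any(math.dist(s, p) <= radius for p in prior_points):
            kept.append(s)
    return kept


# Invented 3-D points: two prior (surface) points, three ray samples.
prior_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
samples = [(0.05, 0.0, 0.0), (0.5, 0.5, 0.5), (1.0, 0.1, 0.0)]

kept = keep_samples_near_prior(samples, prior_points, radius=0.2)
```

    Here the middle sample is discarded because it is farther than 0.2 from both prior points.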

    LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning

    Recent advances in Large Multimodal Models (LMM) have enabled various applications in human-machine interaction. However, developing LMMs that can comprehend, reason, and plan in complex and diverse 3D environments remains a challenging topic, especially considering the demand for understanding permutation-invariant point cloud representations of the 3D scene. Existing works seek help from multi-view images and project 2D features to 3D space as 3D scene representations. This, however, leads to huge computational overhead and performance degradation. In this paper, we present LL3DA, a Large Language 3D Assistant that takes point clouds as direct input and responds to both textual instructions and visual prompts. This helps LMMs better comprehend human interactions and further helps to remove ambiguities in cluttered 3D scenes. Experiments show that LL3DA achieves remarkable results and surpasses various 3D vision-language models on both 3D Dense Captioning and 3D Question Answering. Comment: Project Page: https://ll3da.github.io

    ShapeGPT: 3D Shape Generation with A Unified Multi-modal Language Model

    The advent of large language models, enabling flexibility through instruction-driven approaches, has revolutionized many traditional generative tasks, but large models for 3D data, particularly ones that comprehensively handle 3D shapes together with other modalities, are still under-explored. By achieving instruction-based shape generation, versatile multimodal generative shape models can significantly benefit various fields such as 3D virtual construction and network-aided design. In this work, we present ShapeGPT, a shape-included multi-modal framework that leverages strong pre-trained language models to address multiple shape-relevant tasks. Specifically, ShapeGPT employs a word-sentence-paragraph framework that discretizes continuous shapes into shape words, assembles these words into shape sentences, and integrates shapes with instructional text into multi-modal paragraphs. To learn this shape-language model, we use a three-stage training scheme, including shape representation, multimodal alignment, and instruction-based generation, to align shape-language codebooks and learn the intricate correlations among these modalities. Extensive experiments demonstrate that ShapeGPT achieves comparable performance across shape-relevant tasks, including text-to-shape, shape-to-text, shape completion, and shape editing.

    Interleukin-41: a novel serum marker for the diagnosis of alpha-fetoprotein-negative hepatocellular carcinoma

    Background: Owing to the lack of effective serum markers for hepatocellular carcinoma (HCC) diagnosis, it is difficult to detect liver cancer and identify its recurrence early. Methods: Databases were used to analyze the genes potentially associated with alpha-fetoprotein (AFP). ELISA was used to measure serum IL-41 in patients with HCC, liver metastases, or hepatitis and in healthy people. Immunohistochemical staining was used for relative quantification of IL-41 in HCC and paracancerous tissues. Survival curves were plotted according to clinicopathological data, and an ROC curve was drawn for IL-41 in the diagnosis of HCC. Results: The serum expression of IL-41 was highest in AFP-negative HCC patients and significantly higher than that in AFP-positive HCC and metastatic cancer patients. There was a significant negative correlation between elevated serum IL-41 and AFP (<1500 ng/ml). The clinicopathological features suggested that the serum IL-41 level was significantly correlated with capsule invasion, low differentiation, and AFP. High serum expression of IL-41 suggests poorer survival and earlier recurrence after resection, and IL-41 was upregulated in patients with early recurrence and death. The expression of IL-41 was higher in HCC tissues of patients with multiple tumors or microvascular invasion. The ROC curve showed that serum IL-41 had a sensitivity of 90.17 for HCC and a sensitivity of 96.63 for AFP-negative HCC, while the specificity was higher than 61%. Conclusion: IL-41 in serum and tissue suggests poor prognosis and postoperative recurrence in HCC patients and could be a new serum diagnostic marker for AFP-negative patients.

    The impact of immunoglobulin G N-glycosylation level on COVID-19 outcome: evidence from a Mendelian randomization study

    Background: The coronavirus disease 2019 (COVID-19) pandemic has exerted a profound influence on humans. Increasing evidence shows that immune response is crucial in influencing the risk of infection and disease severity. Observational studies suggest an association between COVID-19 and immunoglobulin G (IgG) N-glycosylation traits, but the causal relevance of these traits in COVID-19 susceptibility and severity remains controversial. Methods: We conducted a two-sample Mendelian randomization (MR) analysis to explore the causal association between 77 IgG N-glycosylation traits and COVID-19 susceptibility, hospitalization, and severity using summary-level data from genome-wide association studies (GWAS) and applying multiple methods including inverse-variance weighting (IVW), MR Egger, and weighted median. We also used Cochran’s Q statistic and leave-one-out analysis to detect heterogeneity across each single nucleotide polymorphism (SNP). Additionally, we used the MR-Egger intercept test, MR-PRESSO global test, and PhenoScanner tool to detect and remove SNPs with horizontal pleiotropy and to ensure the reliability of our results. Results: We found significant causal associations between genetically predicted IgG N-glycosylation traits and COVID-19 susceptibility, hospitalization, and severity. Specifically, we observed reduced risk of COVID-19 with the genetically predicted increased IgG N-glycan trait IGP45 (OR = 0.95, 95% CI = 0.92–0.98; FDR = 0.019). IGP22 and IGP30 were associated with a higher risk of COVID-19 hospitalization and severity. Two (IGP2 and IGP77) and five (IGP10, IGP14, IGP34, IGP36, and IGP50) IgG N-glycosylation traits were causally associated with a decreased risk of COVID-19 hospitalization and severity, respectively. Sensitivity analyses did not identify any horizontal pleiotropy. Conclusions: Our study provides evidence that genetically elevated IgG N-glycosylation traits may have a causal effect on diverse COVID-19 outcomes. Our findings have potential implications for developing targeted interventions to improve COVID-19 outcomes by modulating IgG N-glycosylation levels.
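    For readers unfamiliar with the inverse-variance weighting (IVW) method named in this abstract, the following is a minimal sketch of the standard fixed-effect IVW estimator over per-SNP Wald ratios. The function name and all numbers are invented for illustration and are not data from the study; the study's actual analysis pipeline is not described in the abstract.

```python
import math

# Fixed-effect IVW estimate from summary statistics
# (hypothetical function name; toy numbers, not study data).

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Combine per-SNP Wald ratios (beta_outcome / beta_exposure) using
    weights beta_exposure^2 / se_outcome^2. Returns (estimate, std_error)."""
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    total = sum(weights)
    beta = sum(w * r for w, r in zip(weights, ratios)) / total
    return beta, math.sqrt(1.0 / total)


# Three made-up SNPs whose Wald ratios all equal -0.05.
beta_x = [0.1, 0.2, 0.4]         # SNP effect on the exposure
beta_y = [-0.005, -0.01, -0.02]  # SNP effect on the outcome (log odds)
se_y = [0.01, 0.01, 0.02]        # standard error of the outcome effect

beta_ivw, se_ivw = ivw_estimate(beta_x, beta_y, se_y)
odds_ratio = math.exp(beta_ivw)  # log-odds estimate expressed as an OR
```

    When the outcome effects are on the log-odds scale, exponentiating the pooled estimate gives an odds ratio of the kind reported in the abstract.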

    When guanxi meets structural holes: exploring the guanxi networks of Chinese entrepreneurs on digital platforms

    In this exploratory study, we investigate how Chinese entrepreneurs on digital platforms interact and leverage guanxi (a system of relationships and social networks) to buffer the negative impacts of structural holes on knowledge orchestration. We develop our research model and formulate ten hypotheses by drawing on the literature. We adopt a mixed-methods research approach in which we use quantitative surveys to test the hypotheses and qualitative interviews to explain why certain relationships are stronger in one stage of entrepreneurial development than in the other. The study contributes to the literature on digital entrepreneurship in two ways. First, this study offers an initial understanding of the dynamics of guanxi networks for knowledge mobilisation and knowledge coordination across the start-up and growth stages of Chinese entrepreneurs on digital platforms. Second, by drawing on the relevant literature, our findings extend the current understanding of knowledge orchestration of digital entrepreneurs and contribute to the literature on structural holes theory and guanxi.

    Image forensics on exchangeable image file format header

    In recent years, technologies related to digital photography, including both hardware and software, have progressed rapidly. As a result of the wide application of these technologies in digital still camera (DSC) systems, many researchers have become increasingly interested in using digital cameras to record real events as supporting evidence and historical records, for example in news reporting in journals and magazines, police investigation, and law enforcement. Digital images, as the outputs of digital cameras, can record these events. However, alongside these emerging digital photography technologies, various kinds of image editing tools have also been developed. With these tools, digital images can easily be altered without leaving any visual clues. As a result, the credibility of digital images becomes questionable. For example, when a forged photo is admitted as evidence in court, an incorrect verdict may result. In view of this, this thesis focuses on passive image forensics research, which uses both the Exchangeable Image File Format (EXIF) information and the image captured by a camera to verify the authenticity and integrity of the digital image. DOCTOR OF PHILOSOPHY (EEE