    360DVD: Controllable Panorama Video Generation with 360-Degree Video Diffusion Model

    Panoramic video has recently attracted growing interest in both research and application thanks to its immersive viewing experience. Because capturing 360-degree panoramic video is expensive, generating desirable panoramic videos from text prompts is in high demand. Recently, emerging text-to-video (T2V) diffusion methods have demonstrated notable effectiveness in standard video generation. However, due to the significant gap in content and motion patterns between panoramic and standard videos, these methods struggle to yield satisfactory 360-degree panoramic videos. In this paper, we propose a pipeline named 360-Degree Video Diffusion model (360DVD) for generating 360-degree panoramic videos from given prompts and motion conditions. Specifically, we introduce a lightweight 360-Adapter accompanied by 360 Enhancement Techniques to adapt pre-trained T2V models to panoramic video generation. We further propose a new panorama dataset named WEB360, consisting of panoramic video-text pairs, for training 360DVD, addressing the absence of captioned panoramic video datasets. Extensive experiments demonstrate the superiority and effectiveness of 360DVD for panoramic video generation. Our project page is at https://akaneqwq.github.io/360DVD/.
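
    The abstract does not detail the 360-Adapter's internals, but the general idea of steering a frozen text-to-video backbone with a small trainable module can be sketched as follows; the channel sizes, layer choices, and conditioning path here are illustrative assumptions, not the paper's actual architecture.

        # Minimal sketch of a lightweight adapter for a frozen T2V diffusion backbone.
        # Shapes and layers are illustrative assumptions, not the 360-Adapter design.
        import torch
        import torch.nn as nn

        class PanoramaAdapter(nn.Module):
            """Encodes a motion-condition map into per-scale residual features."""
            def __init__(self, cond_channels=3, feature_channels=(320, 640, 1280)):
                super().__init__()
                stages, in_ch = [], cond_channels
                for out_ch in feature_channels:
                    stages.append(nn.Sequential(
                        nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=2, padding=1),
                        nn.SiLU(),
                    ))
                    in_ch = out_ch
                self.stages = nn.ModuleList(stages)

            def forward(self, cond):
                # One feature map per scale, to be added to the frozen
                # backbone's intermediate activations; only this module trains.
                feats, x = [], cond
                for stage in self.stages:
                    x = stage(x)
                    feats.append(x)
                return feats

        adapter = PanoramaAdapter()
        cond = torch.randn(1, 3, 256, 512)   # e.g. an ERP-format motion map
        residuals = adapter(cond)
        print([tuple(f.shape) for f in residuals])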

    ResVR: Joint Rescaling and Viewport Rendering of Omnidirectional Images

    With the advent of virtual reality technology, omnidirectional image (ODI) rescaling techniques are increasingly embraced to reduce transmitted and stored file sizes while preserving high image quality. Despite this progress, current ODI rescaling methods predominantly focus on enhancing the quality of images in equirectangular projection (ERP) format, overlooking the fact that the content viewed on head-mounted displays (HMDs) is actually a rendered viewport rather than an ERP image. In this work, we show that focusing solely on ERP quality results in inferior viewport visual experiences for users. We therefore propose ResVR, the first comprehensive framework for the joint Rescaling and Viewport Rendering of ODIs. ResVR produces low-resolution (LR) ERP images for transmission while rendering high-quality viewports for users to watch on HMDs. In ResVR, a novel discrete pixel sampling strategy is developed to tackle the complex mapping between the viewport and the ERP image, enabling end-to-end training of the ResVR pipeline. Furthermore, a spherical pixel shape representation technique is derived from spherical differentiation to significantly improve the visual quality of rendered viewports. Extensive experiments demonstrate that ResVR outperforms existing methods in viewport rendering across different fields of view, resolutions, and view directions while keeping transmission overhead low.
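
    The core geometric problem the paper addresses, mapping each viewport pixel back to coordinates in the ERP image, can be illustrated with a standard pinhole-on-sphere derivation. The plain NumPy sketch below is a generic formulation and does not reproduce the paper's discrete pixel sampling strategy or spherical pixel shape representation.

        # Generic viewport-to-ERP coordinate mapping (not ResVR's sampling scheme).
        import numpy as np

        def viewport_to_erp(h, w, fov_deg, yaw, pitch, erp_h, erp_w):
            """For each viewport pixel, return (row, col) sampling coords in the ERP image."""
            f = 0.5 * w / np.tan(np.radians(fov_deg) / 2)     # pinhole focal length
            u, v = np.meshgrid(np.arange(w) - w / 2 + 0.5,
                               np.arange(h) - h / 2 + 0.5)
            dirs = np.stack([u, v, np.full_like(u, f)], axis=-1)
            dirs /= np.linalg.norm(dirs, axis=-1, keepdims=True)

            # Rotate rays into the viewing direction (yaw about y, pitch about x).
            cy, sy = np.cos(yaw), np.sin(yaw)
            cp, sp = np.cos(pitch), np.sin(pitch)
            Rx = np.array([[1, 0, 0], [0, cp, -sp], [0, sp, cp]])
            Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
            dirs = dirs @ (Ry @ Rx).T

            lon = np.arctan2(dirs[..., 0], dirs[..., 2])      # [-pi, pi]
            lat = np.arcsin(np.clip(dirs[..., 1], -1, 1))     # [-pi/2, pi/2]
            col = (lon / (2 * np.pi) + 0.5) * (erp_w - 1)
            row = (lat / np.pi + 0.5) * (erp_h - 1)
            return row, col

        row, col = viewport_to_erp(512, 512, 90, yaw=0.3, pitch=0.1,
                                   erp_h=1024, erp_w=2048)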

    Explainable exercise recommendation with knowledge graph

    Recommending suitable exercises and explaining why they are recommended is a highly valuable task, as it can significantly improve students’ learning efficiency. Nevertheless, the extensive range of exercise resources and the diverse learning capacities of students make exercise recommendation difficult. Collaborative filtering approaches frequently struggle to recommend suitable exercises, while deep learning methods lack explainability, which restricts their practical use. To address these issues, this paper proposes KG4EER, an explainable exercise recommendation framework built on a knowledge graph. KG4EER matches students with suitable exercises and offers explanations for its recommendations. More precisely, a feature extraction module is introduced to represent students’ learning features, and a knowledge graph comprising three primary entities — knowledge concepts, students, and exercises — and their interrelationships is constructed to recommend suitable exercises. Extensive experiments on three real-world datasets, coupled with expert interviews, establish the superiority of KG4EER over existing baseline methods and underscore its robust explainability.
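
    As a toy illustration of the three-entity knowledge graph described above (students, exercises, knowledge concepts), the sketch below stores hypothetical triples and produces a recommendation together with the relation path that explains it; the relation names and lookup logic are invented for illustration and are not KG4EER's algorithm.

        # Toy triple store over the abstract's three entity types; relation
        # names ("weak_at", "covers") are hypothetical.
        from collections import defaultdict

        triples = [
            ("student_1", "weak_at", "concept_fractions"),
            ("exercise_7", "covers", "concept_fractions"),
            ("exercise_9", "covers", "concept_geometry"),
        ]

        # Index: concept -> exercises that cover it.
        covers = defaultdict(list)
        for head, rel, tail in triples:
            if rel == "covers":
                covers[tail].append(head)

        def recommend(student):
            """Recommend exercises covering the student's weak concepts, with a reason."""
            recs = []
            for head, rel, tail in triples:
                if head == student and rel == "weak_at":
                    for ex in covers[tail]:
                        recs.append((ex, f"{student} is weak at {tail}, which {ex} covers"))
            return recs

        for exercise, reason in recommend("student_1"):
            print(exercise, "-", reason)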

    Adaptation and synthetic biology of the model cyanobacterium Synechococcus elongatus for sustainable development: a review

    Synechococcus elongatus is a model cyanobacterium with remarkable adaptability to diverse environmental stresses, making it a promising candidate for the photoautotrophic conversion of carbon dioxide into valuable chemicals. This review explores the adaptive mechanisms that allow S. elongatus to survive under various abiotic stresses, such as changes in CO2 levels, heavy metals, and light conditions. We also highlight recent advancements in synthetic biology that have enabled the engineering of S. elongatus to produce biofuels and other value-added compounds, including fatty acids, alcohols, and carotenoids. Additionally, we discuss the application of modern omics techniques to elucidate the genetic basis of stress tolerance and metabolic regulation. Despite the promising potential of S. elongatus for industrial applications, challenges remain in scaling up production, enhancing genetic stability, and optimizing bioreactor systems. Finally, we provide insights into future directions, including the integration of genome engineering, system-level modeling, and co-culture strategies, to improve the efficiency of cyanobacterial cell factories for sustainable biotechnology applications.

    Efflex: Efficient and Flexible Pipeline for Spatio-Temporal Trajectory Graph Modeling and Representation Learning

    In the landscape of spatio-temporal data analytics, effective trajectory representation learning is paramount. To bridge the gap between learning accurate representations and doing so with efficient, flexible mechanisms, we introduce Efflex, a comprehensive pipeline for graph modeling and representation learning of large-volume spatio-temporal trajectories. Efflex pioneers the incorporation of a multi-scale k-nearest neighbors (KNN) algorithm with feature fusion for graph construction, advancing dimensionality reduction while preserving essential data features. Moreover, the graph construction mechanism and a high-performance lightweight GCN together increase embedding extraction speed by up to 36 times. We offer Efflex in two versions: Efflex-L for scenarios demanding high accuracy, and Efflex-B for environments requiring swift data processing. Comprehensive experiments on the Porto and Geolife datasets validate our approach, positioning Efflex as the state of the art in the domain. These gains in speed and accuracy highlight the versatility of Efflex and underscore its potential for deployment in time-sensitive and computationally constrained applications.
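
    A multi-scale KNN graph of the kind the abstract names can be sketched by taking the union of edge sets built at several neighborhood sizes; the scales and the union-based fusion rule below are assumptions for illustration, not Efflex's exact construction.

        # Multi-scale KNN edge construction over trajectory feature vectors;
        # the scales (5, 10, 20) and union fusion are illustrative assumptions.
        import numpy as np
        from sklearn.neighbors import NearestNeighbors

        def multiscale_knn_edges(features, ks=(5, 10, 20)):
            """Union of kNN edge sets at several scales, returned as (src, dst) pairs."""
            edges = set()
            for k in ks:
                nn = NearestNeighbors(n_neighbors=k + 1).fit(features)
                _, idx = nn.kneighbors(features)
                for i, neighbors in enumerate(idx):
                    for j in neighbors[1:]:          # skip self-match at position 0
                        edges.add((i, int(j)))
            return np.array(sorted(edges))

        feats = np.random.rand(200, 32)              # e.g. fused trajectory features
        edges = multiscale_knn_edges(feats)
        print(edges.shape)                           # edge list for a downstream GCN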

    OPDN: Omnidirectional Position-aware Deformable Network for Omnidirectional Image Super-Resolution

    360° omnidirectional images have gained research attention due to their immersive and interactive experience, particularly in AR/VR applications. However, they suffer from lower angular resolution because they are captured by fisheye lenses with the same sensor size used for planar images. To address this, we propose a two-stage framework for 360° omnidirectional image super-resolution. The first stage employs two branches: model A, which incorporates omnidirectional position-aware deformable blocks (OPDB) and Fourier upsampling, and model B, which adds a spatial frequency fusion module (SFF) to model A. Model A aims to enhance the extraction of 360° positional information, while model B further focuses on the high-frequency information of 360° images. The second stage performs same-resolution enhancement based on the structure of model A with a pixel unshuffle operation. In addition, we collected data from YouTube to improve the fitting ability of the transformer and created pseudo low-resolution images using a degradation network. Our proposed method achieves superior performance and won the NTIRE 2023 challenge on 360° omnidirectional image super-resolution.
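
    Fourier upsampling, one component named in the abstract, is commonly realized as zero-padding of the centered frequency spectrum; the minimal PyTorch sketch below shows that generic formulation and makes no claim about this paper's exact variant.

        # Generic frequency-domain zero-padding upsampling (not OPDN's exact module).
        import torch

        def fourier_upsample(x, scale=2):
            """Upsample a (B, C, H, W) tensor by zero-padding its centered spectrum."""
            b, c, h, w = x.shape
            spec = torch.fft.fftshift(torch.fft.fft2(x), dim=(-2, -1))
            out_spec = torch.zeros(b, c, h * scale, w * scale,
                                   dtype=spec.dtype, device=spec.device)
            top, left = (h * scale - h) // 2, (w * scale - w) // 2
            out_spec[:, :, top:top + h, left:left + w] = spec
            out = torch.fft.ifft2(torch.fft.ifftshift(out_spec, dim=(-2, -1)))
            return out.real * scale ** 2             # compensate ifft normalization

        x = torch.randn(1, 3, 32, 64)
        print(fourier_upsample(x).shape)             # torch.Size([1, 3, 64, 128])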