77 research outputs found

    Text-Guided Texturing by Synchronized Multi-View Diffusion

    Full text link
    This paper introduces a novel approach to synthesizing a texture that dresses up a given 3D object according to a text prompt. Building on a pretrained text-to-image (T2I) diffusion model, existing methods usually employ a project-and-inpaint approach, in which a view of the given object is first generated and then warped to another view for inpainting. This tends to produce inconsistent textures due to the asynchronous diffusion of the multiple views. We believe such asynchronous diffusion and insufficient information sharing among views are the root causes of the inconsistency artifacts. In this paper, we propose a synchronized multi-view diffusion approach that allows the diffusion processes of different views to reach a consensus on the generated content early in the process, and hence ensures texture consistency. To synchronize the diffusion, we share the denoised content among views at each denoising step, specifically blending the latent content in the texture domain from views that overlap. Our method demonstrates superior performance in generating consistent, seamless, and highly detailed textures compared to state-of-the-art methods.
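    The synchronization step can be pictured as a scatter-average-gather cycle: at each denoising step, every view's latent is scattered into a shared texture map, averaged where views overlap, and gathered back, so all views keep denoising from the same consensus content. The PyTorch sketch below illustrates that idea only; the precomputed UV maps, visibility masks, and nearest-texel scatter are simplifying assumptions, not the paper's actual implementation.

```python
import torch

def synchronize_views(latents, uv_maps, masks, texture_res=512):
    """Blend per-view latents in a shared texture domain (illustrative).

    latents : list of (C, H, W) tensors, one per view, at the current step.
    uv_maps : list of (H, W, 2) tensors mapping view pixels to texture
              coordinates in [0, 1] (assumed precomputed by rasterization).
    masks   : list of (H, W) boolean visibility masks.
    """
    C = latents[0].shape[0]
    texture = torch.zeros(C, texture_res * texture_res)
    weight = torch.zeros(texture_res * texture_res)

    # Scatter: overlapping views accumulate into the same texels.
    for z, uv, m in zip(latents, uv_maps, masks):
        xy = (uv[m] * (texture_res - 1)).long()      # (N, 2) texel coords
        flat = xy[:, 1] * texture_res + xy[:, 0]     # flat texel index
        texture.index_add_(1, flat, z[:, m])
        weight.index_add_(0, flat, torch.ones(flat.numel()))

    texture = texture / weight.clamp(min=1.0)        # average over views

    # Gather: every view continues denoising from the consensus texture.
    synced = []
    for z, uv, m in zip(latents, uv_maps, masks):
        xy = (uv[m] * (texture_res - 1)).long()
        flat = xy[:, 1] * texture_res + xy[:, 0]
        z_new = z.clone()
        z_new[:, m] = texture[:, flat]
        synced.append(z_new)
    return synced
```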

    Redistributing the Precision and Content in 3D-LUT-based Inverse Tone-mapping for HDR/WCG Display

    Full text link
    Inverse tone-mapping (ITM) converts SDR (standard dynamic range) footage to HDR/WCG (high dynamic range / wide color gamut) for media production. It is needed not only when remastering legacy SDR footage at the front-end content provider, but also when adapting on-the-air SDR services on user-end HDR displays. The latter requires more efficiency, so the pre-calculated LUT (look-up table) has become a popular solution. Yet a conventional fixed LUT lacks adaptability, so we follow the research community and combine it with AI. Meanwhile, higher-bit-depth HDR/WCG requires a larger LUT than SDR, so we consult traditional ITM for an efficiency-performance trade-off: we use 3 smaller LUTs, each with a non-uniform packing (precision) that is denser in the dark, middle, and bright luma range, respectively. Since each LUT's result then has less error only in its own range, we use a contribution map to combine their best parts into the final result. Guided by this map, the elements (content) of the 3 LUTs are also redistributed during training. We conduct ablation studies to verify the method's effectiveness, and subjective and objective experiments to show its practicability. Code is available at: https://github.com/AndreGuo/ITMLUT. Comment: Accepted at CVMP 2023 (the 20th ACM SIGGRAPH European Conference on Visual Media Production).
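    To make the three-LUT idea concrete, here is a minimal NumPy sketch of the inference path: each range-specialized LUT with its own non-uniform grid is applied to the SDR input, and a per-pixel contribution map blends the three outputs. The grid layout, nearest-neighbor lookup, and function names are assumptions for illustration; the released ITMLUT code has its own interfaces and would interpolate trilinearly.

```python
import numpy as np

def apply_lut(img, lut, grid):
    """Look up an SDR image in a 3D LUT with a non-uniform grid.

    img  : (H, W, 3) floats in [0, 1].
    lut  : (S, S, S, 3) HDR/WCG outputs at the grid points.
    grid : (S,) increasing sample positions in [0, 1]; denser spacing
           means higher precision in that luma range.
    Nearest-neighbor lookup for brevity (a real LUT interpolates).
    """
    idx = np.abs(img[..., None] - grid).argmin(-1)      # (H, W, 3)
    return lut[idx[..., 0], idx[..., 1], idx[..., 2]]

def itm_three_luts(img, luts, grids, contribution):
    """Blend three range-specialized LUT outputs with a contribution map.

    contribution : (H, W, 3) per-pixel weights, one channel per LUT,
                   favoring each LUT inside its own luma range.
    """
    outs = [apply_lut(img, l, g) for l, g in zip(luts, grids)]
    weights = contribution.transpose(2, 0, 1)           # (3, H, W)
    return sum(w[..., None] * o for w, o in zip(weights, outs))
```

    For instance, a grid like np.linspace(0, 1, S) ** 2.2 places its samples densely near black, which is one way a "dark-range" LUT could obtain its extra precision.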

    Improved Diffusion-based Image Colorization via Piggybacked Models

    Full text link
    Image colorization has attracted the interest of the research community for decades. However, existing methods still struggle to provide satisfactory colorized results for grayscale images due to a lack of human-like global understanding of colors. Recently, large-scale Text-to-Image (T2I) models have been exploited to transfer semantic information from text prompts to the image domain, where text provides global control over the semantic objects in the image. In this work, we introduce a colorization model that piggybacks on an existing powerful T2I diffusion model. Our key idea is to exploit the color prior knowledge in the pre-trained T2I diffusion model for realistic and diverse colorization. A diffusion guider is designed to incorporate the pre-trained weights of the latent diffusion model and output a latent color prior that conforms to the visual semantics of the grayscale input. A lightness-aware VQVAE then generates the colorized result with pixel-perfect alignment to the given grayscale image. Our model can also achieve conditional colorization with additional inputs (e.g., user hints and text). Extensive experiments show that our method achieves state-of-the-art performance in terms of perceptual quality. Comment: project page: https://piggyback-color.github.io
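    The abstract describes a two-stage pipeline: a guider steers the frozen T2I diffusion model with grayscale features to produce a latent color prior, and a lightness-aware decoder maps that prior to pixels aligned with the input lightness. The skeleton below shows the control flow only; every module interface here (encode, timesteps, denoise_step, the decoder's lightness argument) is an invented placeholder, not the released model's API.

```python
import torch
import torch.nn as nn

class PiggybackColorizer(nn.Module):
    """Schematic two-stage colorization pipeline (names illustrative)."""

    def __init__(self, t2i, guider, lightness_vqvae):
        super().__init__()
        self.t2i = t2i                    # frozen pretrained T2I latent diffusion
        self.guider = guider              # trained to inject grayscale guidance
        self.decoder = lightness_vqvae    # lightness-aware VQVAE decoder

    @torch.no_grad()
    def forward(self, gray, prompt=None, steps=50):
        # Stage 1: denoise in latent space. At each step the guider feeds
        # grayscale features into the frozen T2I model, so the final
        # latent is a color prior matching the input's semantics.
        z = torch.randn_like(self.t2i.encode(gray.repeat(1, 3, 1, 1)))
        for t in self.t2i.timesteps(steps):
            cond = self.guider(gray, t)
            z = self.t2i.denoise_step(z, t, cond=cond, prompt=prompt)

        # Stage 2: decode to pixels; consuming the original lightness
        # channel enforces pixel-perfect alignment with the input.
        return self.decoder(z, lightness=gray)
```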

    Bacterial regulon modeling and prediction based on systematic cis regulatory motif analyses

    Get PDF
    Regulons are the basic units of the response system in a bacterial cell, each consisting of a set of transcriptionally co-regulated operons. Regulon elucidation is the basis for studying a bacterium's global transcriptional regulation network. In this study, we designed a novel co-regulation score between a pair of operons based on accurate operon identification and cis-regulatory motif analyses, which captures their co-regulation relationship much better than existing scores. Taking full advantage of this discovery, we developed a new computational framework and built a novel graph model for regulon prediction. This model integrates motif comparison and clustering, making the regulon prediction problem substantially more tractable and accurate. To evaluate our prediction, a regulon coverage score was designed based on the documented regulons and their overlap with our prediction, and a modified Fisher exact test was implemented to measure how well our predictions match the co-expressed modules derived from E. coli microarray gene-expression datasets collected under 466 conditions. The results indicate that our program consistently performed better than others in terms of prediction accuracy. This suggests that our algorithms substantially improve the state of the art, providing a computational capability to reliably predict regulons for any bacterium.
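    The framework's core objects are easy to picture: operons become nodes of a graph, the motif-based co-regulation score weights the edges, and clustering the graph yields candidate regulons. The sketch below (Python with networkx) follows that outline; the best-pair similarity score, the threshold, and the modularity clustering are stand-ins for the paper's actual formulation.

```python
import itertools
import networkx as nx

def coregulation_score(motifs_a, motifs_b, motif_similarity):
    """Stand-in co-regulation score for two operons: the best similarity
    between any pair of cis-regulatory motifs found in their promoters."""
    return max(motif_similarity(m, n) for m in motifs_a for n in motifs_b)

def predict_regulons(operons, motifs, motif_similarity, threshold=0.8):
    """Operons are nodes, co-regulation scores are edge weights; each
    cluster of the resulting graph is a candidate regulon."""
    g = nx.Graph()
    g.add_nodes_from(operons)
    for a, b in itertools.combinations(operons, 2):
        s = coregulation_score(motifs[a], motifs[b], motif_similarity)
        if s >= threshold:                  # keep only strong co-regulation
            g.add_edge(a, b, weight=s)
    clusters = nx.algorithms.community.greedy_modularity_communities(
        g, weight="weight")
    return [set(c) for c in clusters]
```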

    An Integrative and Applicable Phylogenetic Footprinting Framework for cis-regulatory Motifs Identification in Prokaryotic Genomes

    Get PDF
    Background: Phylogenetic footprinting is an important computational technique for identifying cis-regulatory motifs in orthologous regulatory regions from multiple genomes, as motifs tend to evolve more slowly than their surrounding non-functional sequences. Its application, however, faces several difficulties in optimizing the selection of orthologous data and reducing false positives in motif prediction. Results: Here we present an integrative phylogenetic footprinting framework for accurate motif prediction in prokaryotic genomes (MP3). The framework includes a new orthologous data preparation procedure, an additional promoter scoring and pruning method, and an integration of six existing motif-finding algorithms as basic motif search engines. Specifically, we collected orthologous genes from available prokaryotic genomes and built the orthologous regulatory regions based on sequence similarity of promoter regions. This procedure makes full use of large-scale genomic data and taxonomy information and filters out promoters with limited contribution, producing a high-quality orthologous promoter set. Promoter scoring and pruning are implemented through motif voting by a set of complementary prediction tools that mine as many motif candidates as possible while eliminating the effect of random noise, as sketched below. We applied the framework to the Escherichia coli K-12 genome and evaluated its prediction performance through comparison with seven existing programs. This evaluation was systematically carried out at the nucleotide and binding-site levels, and the results showed that MP3 consistently outperformed other popular motif-finding tools. We have integrated MP3 into our motif identification and analysis server DMINDA, allowing users to efficiently identify and analyze motifs in 2,072 completely sequenced prokaryotic genomes. Conclusion: The performance evaluation indicated that MP3 is effective for predicting regulatory motifs in prokaryotic genomes. Its application may enhance progress in elucidating transcription regulation mechanisms, thus benefiting the genomics research community and prokaryotic genome researchers in particular.
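    At its core, the promoter scoring and pruning step reduces to a voting rule: a candidate site survives only if enough of the complementary motif finders independently report it. A minimal sketch of that rule, assuming site coordinates are directly comparable across tools (a real pipeline would merge overlapping sites):

```python
from collections import Counter

def motif_vote(candidates, min_votes=3):
    """Keep motif sites supported by several independent finders.

    candidates : dict mapping a motif-finder name (e.g. one of the six
        integrated engines) to a set of predicted (start, end) sites on
        a promoter. Illustrative interface, not the MP3 code.
    """
    votes = Counter()
    for tool, sites in candidates.items():
        for site in sites:
            votes[site] += 1
    # Sites reported by multiple complementary tools are unlikely to be
    # random noise; single-tool hits are pruned.
    return {site for site, n in votes.items() if n >= min_votes}
```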

    CBLab: Supporting the Training of Large-scale Traffic Control Policies with Scalable Traffic Simulation

    Full text link
    Traffic simulation provides interactive data for the optimization of traffic control policies. However, existing traffic simulators are limited by their lack of scalability and shortage of input data, which prevents them from generating interactive data for the road networks of real large-scale cities. In this paper, we present City Brain Lab (CBLab), a toolkit for scalable traffic simulation. CBLab consists of three components: CBEngine, CBData, and CBScenario. CBEngine is a highly efficient simulator supporting large-scale traffic simulation. CBData includes a traffic dataset with road network data for 100 cities around the world. We also develop a pipeline that performs a one-click transformation from raw road networks to input data for our traffic simulation. Combining CBEngine and CBData allows researchers to run scalable traffic simulations on the road networks of real large-scale cities. On top of that, CBScenario implements an interactive environment and a benchmark for two scenarios of traffic control policies, with which traffic control policies adaptable to large-scale urban traffic can be trained and tuned. To the best of our knowledge, CBLab is the first infrastructure supporting traffic control policy optimization in large-scale urban scenarios. CBLab has supported the City Brain Challenge @ KDD CUP 2021. The project is available on GitHub: https://github.com/CityBrainLab/CityBrainLab.git. Comment: Accepted by KDD 2023 (Applied Data Science Track).
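    The "interactive environment" framing suggests the usual reset/step loop between a control policy and the simulator. Purely as an illustration of how such a toolkit is typically driven, and not CBLab's documented API, a training loop might look like:

```python
def train_policy(env, policy, episodes=100):
    """Generic loop for tuning a traffic-signal control policy against a
    simulator exposing a gym-style reset()/step() interface (assumed)."""
    for _ in range(episodes):
        obs = env.reset()                       # initial city-wide observation
        done = False
        while not done:
            actions = policy.act(obs)           # one signal phase per intersection
            obs, reward, done, info = env.step(actions)
            policy.learn(obs, reward)           # update from simulator feedback
```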