77 research outputs found
Text-Guided Texturing by Synchronized Multi-View Diffusion
This paper introduces a novel approach to synthesize texture to dress up a
given 3D object, given a text prompt. Based on the pretrained text-to-image
(T2I) diffusion model, existing methods usually employ a project-and-inpaint
approach, in which a view of the given object is first generated and warped to
another view for inpainting. However, this tends to produce inconsistent textures due
to the asynchronous diffusion of multiple views. We believe such asynchronous
diffusion and insufficient information sharing among views are the root causes
of the inconsistency artifacts. In this paper, we propose a synchronized
multi-view diffusion approach that allows the diffusion processes from
different views to reach a consensus of the generated content early in the
process, and hence ensures the texture consistency. To synchronize the
diffusion, we share the denoised content among different views in each
denoising step, specifically blending the latent content in the texture domain
from views with overlap. Our method demonstrates superior performance in
generating consistent, seamless, highly detailed textures compared to
state-of-the-art methods.
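
The blending step described in this abstract can be illustrated with a minimal sketch. It assumes each view's denoised latent has already been projected into a shared UV texture grid, and that each view comes with a per-texel visibility weight; the function name, the array layout, and the plain weighted average are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def blend_views_in_texture_space(view_latents, view_weights, eps=1e-8):
    """Weighted average of per-view latents on a shared texture grid.

    view_latents: array of shape (V, H, W, C), one latent texture per view
                  (hypothetical layout: already projected to UV space).
    view_weights: array of shape (V, H, W), non-negative; 0 where a texel
                  is not visible from that view.
    Returns the blended latent texture (H, W, C) and a boolean coverage
    mask (H, W) marking texels seen by at least one view.
    """
    latents = np.asarray(view_latents, dtype=np.float64)
    weights = np.asarray(view_weights, dtype=np.float64)[..., None]

    total = weights.sum(axis=0)  # (H, W, 1) summed weight per texel
    blended = (latents * weights).sum(axis=0) / np.maximum(total, eps)
    covered = total[..., 0] > 0
    return blended, covered
```

In a synchronized sampler, a blend of this kind would run once per denoising step, with the blended texture projected back to every view before the next step, so overlapping views start each step from the same content.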
Redistributing the Precision and Content in 3D-LUT-based Inverse Tone-mapping for HDR/WCG Display
ITM (inverse tone-mapping) converts SDR (standard dynamic range) footage to
HDR/WCG (high dynamic range / wide color gamut) for media production. It is
needed not only when content providers remaster legacy SDR footage at the front
end, but also when on-the-air SDR services are adapted to HDR displays at the
user end. The latter requires more efficiency, so the pre-calculated LUT
(look-up table) has become a popular solution. Yet a conventional fixed LUT
lacks adaptability, so we follow the research community and combine it with AI.
Meanwhile, higher-bit-depth HDR/WCG requires a larger LUT than SDR, so we consult
traditional ITM for an efficiency-performance trade-off: we use 3 smaller LUTs,
each with a non-uniform packing (precision) that is denser in the dark, middle,
or bright luma range, respectively. Each LUT therefore has lower error only in
its own range, so we use a contribution map to combine their best parts into the
final result. Guided by this map, the elements (content) of the 3 LUTs
are also redistributed during training. We conduct ablation studies to
verify the method's effectiveness, and subjective and objective experiments to
show its practicability. Code is available at: https://github.com/AndreGuo/ITMLUT.
Comment: Accepted in CVMP2023 (the 20th ACM SIGGRAPH European Conference on
Visual Media Production).
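
The idea of combining three non-uniformly packed LUTs through a contribution map can be sketched in a much simplified form. The sketch below uses 1D LUTs over luma instead of 3D LUTs, hand-written node spacings, and a fixed piecewise-linear contribution map; in the paper the map and LUT contents are learned, so every function name and formula here is an illustrative assumption.

```python
import numpy as np


def dark_nodes(n):
    # Node positions in [0, 1], packed more densely near 0 (dark range).
    return np.linspace(0.0, 1.0, n) ** 2


def middle_nodes(n):
    # Node positions in [0, 1], packed more densely near 0.5 (mid range).
    t = np.linspace(-1.0, 1.0, n)
    return 0.5 + 0.5 * np.sign(t) * np.abs(t) ** 2


def bright_nodes(n):
    # Node positions in [0, 1], packed more densely near 1 (bright range).
    u = np.linspace(0.0, 1.0, n)
    return 1.0 - (1.0 - u) ** 2


def contribution_weights(luma):
    # Hypothetical hand-made contribution map: three non-negative weights
    # per pixel (dark, middle, bright) that sum to 1.
    luma = np.clip(np.asarray(luma, dtype=np.float64), 0.0, 1.0)
    w_dark = np.maximum(0.0, 1.0 - 2.0 * luma)
    w_bright = np.maximum(0.0, 2.0 * luma - 1.0)
    w_mid = 1.0 - w_dark - w_bright
    return np.stack([w_dark, w_mid, w_bright])


def apply_three_luts(luma, luts):
    # luts: three (nodes, values) pairs in dark, middle, bright order.
    # Each LUT is evaluated by linear interpolation, then the three
    # results are merged per pixel using the contribution weights.
    luma = np.clip(np.asarray(luma, dtype=np.float64), 0.0, 1.0)
    weights = contribution_weights(luma)
    outputs = np.stack([np.interp(luma, nodes, values)
                        for nodes, values in luts])
    return (weights * outputs).sum(axis=0)
```

Each LUT samples the mapping finely only where its nodes are dense, and the weights favor that LUT in the same range, which is the sense in which the "best parts" are combined.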
Improved Diffusion-based Image Colorization via Piggybacked Models
Image colorization has been attracting the research interests of the
community for decades. However, existing methods still struggle to provide
satisfactory colorized results given grayscale images due to a lack of
human-like global understanding of colors. Recently, large-scale Text-to-Image
(T2I) models have been exploited to transfer the semantic information from the
text prompts to the image domain, where text provides a global control for
semantic objects in the image. In this work, we introduce a colorization model
piggybacking on the existing powerful T2I diffusion model. Our key idea is to
exploit the color prior knowledge in the pre-trained T2I diffusion model for
realistic and diverse colorization. A diffusion guider is designed to
incorporate the pre-trained weights of the latent diffusion model to output a
latent color prior that conforms to the visual semantics of the grayscale
input. A lightness-aware VQVAE will then generate the colorized result with
pixel-perfect alignment to the given grayscale image. Our model can also
achieve conditional colorization with additional inputs (e.g. user hints and
texts). Extensive experiments show that our method achieves state-of-the-art
performance in terms of perceptual quality.
Comment: project page: https://piggyback-color.github.io
Bacterial regulon modeling and prediction based on systematic cis regulatory motif analyses
Regulons are the basic units of the response system in a bacterial cell, and each consists of a set of transcriptionally co-regulated operons. Regulon elucidation is the basis for studying the bacterial global transcriptional regulation network. In this study, we designed a novel co-regulation score between a pair of operons based on accurate operon identification and cis regulatory motif analyses, which can capture their co-regulation relationship much better than other scores. Taking full advantage of this discovery, we developed a new computational framework and built a novel graph model for regulon prediction. This model integrates the motif comparison and clustering and makes the regulon prediction problem substantially more solvable and accurate. To evaluate our prediction, a regulon coverage score was designed based on the documented regulons and their overlap with our prediction; and a modified Fisher Exact test was implemented to measure how well our predictions match the co-expressed modules derived from E. coli microarray gene-expression datasets collected under 466 conditions. The results indicate that our program consistently performed better than others in terms of the prediction accuracy. This suggests that our algorithms substantially improve the state-of-the-art, leading to a computational capability to reliably predict regulons for any bacteria
The effects of timbre on neural responses to musical emotion
Timbre is an important factor that affects the perception of emotion in music. To date, little is known about the effects of timbre on neural responses to musical emotion. To address this issue, we used ERPs to investigate whether there are different neural responses to musical emotion when the same melodies are presented in different timbres. With a cross-modal affective priming paradigm, target faces were primed by affectively congruent or incongruent melodies without lyrics presented in violin, flute, and the voice. Results showed a larger P3 and a larger left anterior distributed LPC in response to affectively incongruent versus congruent trials in the voice version. For the flute version, however, only the LPC effect was found, which was distributed over centro-parietal electrodes. Unlike the voice and flute versions, an N400 effect was observed in the violin version. These findings revealed different patterns of neural responses to emotional processing of music when the same melodies were presented in different timbres, and provide evidence to confirm the hypothesis that there are specialized neural responses to the human voice
An Integrative and Applicable Phylogenetic Footprinting Framework for cis-regulatory Motifs Identification in Prokaryotic Genomes
Background: Phylogenetic footprinting is an important computational technique for identifying cis-regulatory motifs in orthologous regulatory regions from multiple genomes, as motifs tend to evolve slower than their surrounding non-functional sequences. Its application, however, faces several difficulties in optimizing the selection of orthologous data and reducing the false positives in motif prediction. Results: Here we present an integrative phylogenetic footprinting framework for accurate motif predictions in prokaryotic genomes (MP3). The framework includes a new orthologous data preparation procedure, an additional promoter scoring and pruning method, and an integration of six existing motif finding algorithms as basic motif search engines. Specifically, we collected orthologous genes from available prokaryotic genomes and built the orthologous regulatory regions based on sequence similarity of promoter regions. This procedure made full use of the large-scale genomic data and taxonomy information and filtered out the promoters with limited contribution to produce a high-quality orthologous promoter set. The promoter scoring and pruning is implemented through motif voting by a set of complementary predicting tools that mine as many motif candidates as possible and simultaneously eliminate the effect of random noise. We have applied the framework to the Escherichia coli K-12 genome and evaluated the prediction performance through comparison with seven existing programs. This evaluation was systematically carried out at the nucleotide and binding site level, and the results showed that MP3 consistently outperformed other popular motif finding tools. We have integrated MP3 into our motif identification and analysis server DMINDA, allowing users to efficiently identify and analyze motifs in 2,072 completely sequenced prokaryotic genomes. Conclusion: The performance evaluation indicated that MP3 is effective for predicting regulatory motifs in prokaryotic genomes. Its application may enhance progress in elucidating transcription regulation mechanisms, thus benefiting the genomic research community and prokaryotic genome researchers in particular.
CBLab: Supporting the Training of Large-scale Traffic Control Policies with Scalable Traffic Simulation
Traffic simulation provides interactive data for the optimization of traffic
control policies. However, existing traffic simulators are limited by their
lack of scalability and shortage in input data, which prevents them from
generating interactive data from traffic simulation in the scenarios of real
large-scale city road networks.
In this paper, we present City Brain Lab (CBLab), a
toolkit for scalable traffic simulation. CBLab consists of three components:
CBEngine, CBData, and CBScenario. CBEngine is a highly efficient simulator
supporting large-scale traffic simulation. CBData includes a traffic dataset
with road network data of 100 cities all around the world. We also develop a
pipeline to conduct a one-click transformation from raw road networks to input
data of our traffic simulation. Combining CBEngine and CBData allows
researchers to run scalable traffic simulations in the road network of real
large-scale cities. Based on that, CBScenario implements an interactive
environment and a benchmark for two scenarios of traffic control policies
respectively, with which traffic control policies adaptable for large-scale
urban traffic can be trained and tuned. To the best of our knowledge, CBLab is
the first infrastructure supporting traffic control policy optimization in
large-scale urban scenarios. CBLab has supported the City Brain Challenge @ KDD
CUP 2021. The project is available on
GitHub: https://github.com/CityBrainLab/CityBrainLab.git.
Comment: Accepted by KDD2023 (Applied Data Science Track).