CLIPVG: Text-Guided Image Manipulation Using Differentiable Vector Graphics
Considerable progress has recently been made in leveraging CLIP (Contrastive
Language-Image Pre-Training) models for text-guided image manipulation.
However, all existing works rely on additional generative models to ensure the
quality of results, because CLIP alone cannot provide enough guidance
information for fine-scale pixel-level changes. In this paper, we introduce
CLIPVG, a text-guided image manipulation framework using differentiable vector
graphics, which is also the first CLIP-based general image manipulation
framework that does not require any additional generative models. We
demonstrate that CLIPVG can not only achieve state-of-the-art performance in
both semantic correctness and synthesis quality, but is also flexible enough
to support various applications far beyond the capability of all existing
methods.
Comment: 8 pages, 10 figures, AAAI202
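The core loop the abstract describes, directly optimizing vector-graphic parameters under a CLIP-derived loss, can be sketched in a toy form. This is not CLIPVG's implementation: the real framework rasterizes SVG shapes with a differentiable renderer and scores them against a text prompt with CLIP, whereas here a simple quadratic stand-in loss and finite-difference gradients keep the sketch self-contained. All function names are ours.

```python
# Toy sketch of a CLIPVG-style optimization loop (hypothetical names).
# A quadratic stand-in replaces the CLIP loss so the loop runs on its own.

def stand_in_clip_loss(params, target):
    # Placeholder for -CLIP_similarity(render(params), text_prompt).
    return sum((p - t) ** 2 for p, t in zip(params, target))

def numerical_grad(f, params, eps=1e-5):
    # Forward-difference gradient; the real system uses autodiff via diffvg.
    grads = []
    for i in range(len(params)):
        bumped = list(params)
        bumped[i] += eps
        grads.append((f(bumped) - f(params)) / eps)
    return grads

def optimize_vector_params(params, target, lr=0.1, steps=200):
    # Gradient descent directly on vector-graphic parameters
    # (e.g. control-point coordinates, fill colors).
    for _ in range(steps):
        g = numerical_grad(lambda p: stand_in_clip_loss(p, target), params)
        params = [p - lr * gi for p, gi in zip(params, g)]
    return params

# Example: nudge a circle's (cx, cy, r) toward prompt-implied values.
initial = [10.0, 10.0, 5.0]
goal = [12.0, 8.0, 6.0]   # stands in for what the text prompt asks for
final = optimize_vector_params(initial, goal)
```

Because the parameters are vector-graphic primitives rather than pixels, every intermediate result stays a valid, resolution-independent image, which is what lets the method work without a generative prior.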
Hybrid Graph: A Unified Graph Representation with Datasets and Benchmarks for Complex Graphs
Graphs are widely used to encapsulate a variety of data formats, but
real-world networks often involve complex node relations that are more than
pairwise. While hypergraphs and hierarchical graphs have been developed and
employed to account for the complex node relations, they cannot fully represent
these complexities in practice. Additionally, though many Graph Neural Networks
(GNNs) have been proposed for representation learning on higher-order graphs,
they are usually only evaluated on simple graph datasets. Therefore, there is a
need for a unified modelling of higher-order graphs, and a collection of
comprehensive datasets with an accessible evaluation framework to fully
understand the performance of these algorithms on complex graphs. In this
paper, we introduce the concept of hybrid graphs, a unified definition for
higher-order graphs, and present the Hybrid Graph Benchmark (HGB). HGB contains
23 real-world hybrid graph datasets across various domains such as biology,
social media, and e-commerce. Furthermore, we provide an extensible evaluation
framework and a supporting codebase to facilitate the training and evaluation
of GNNs on HGB. Our empirical study of existing GNNs on HGB reveals various
research opportunities and gaps, including (1) evaluating the actual
performance improvement of hypergraph GNNs over simple graph GNNs; (2)
comparing the impact of different sampling strategies on hybrid graph learning
methods; and (3) exploring ways to integrate simple graph and hypergraph
information. We make our source code and full datasets publicly available at
https://zehui127.github.io/hybrid-graph-benchmark/.
Comment: Preprint. Under review. 16 pages, 5 figures, 11 tables
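The "unified definition" idea above can be illustrated with a minimal container that holds pairwise edges and hyperedges side by side. This sketch is ours, not HGB's actual data classes; the `clique_expansion` helper shows one common way to project hyperedges down so simple-graph GNNs can consume them.

```python
# Minimal illustrative sketch of a hybrid graph: pairwise edges and
# higher-order hyperedges coexist in one object (names are hypothetical).

class HybridGraph:
    def __init__(self):
        self.nodes = set()
        self.edges = set()        # pairwise relations, as frozenset pairs
        self.hyperedges = []      # higher-order relations: node subsets

    def add_edge(self, u, v):
        self.nodes.update((u, v))
        self.edges.add(frozenset((u, v)))

    def add_hyperedge(self, members):
        members = frozenset(members)
        self.nodes.update(members)
        self.hyperedges.append(members)

    def clique_expansion(self):
        # Project each hyperedge to all its pairwise edges, yielding a
        # simple graph that ordinary GNNs can process (at the cost of
        # losing the higher-order structure).
        g = HybridGraph()
        for e in self.edges:
            g.add_edge(*sorted(e))
        for h in self.hyperedges:
            m = sorted(h)
            for i in range(len(m)):
                for j in range(i + 1, len(m)):
                    g.add_edge(m[i], m[j])
        return g

hg = HybridGraph()
hg.add_edge("a", "b")
hg.add_hyperedge({"a", "c", "d"})   # e.g. a three-way group interaction
flat = hg.clique_expansion()
```

Comparing models on `hg` versus `flat` is exactly the kind of experiment gap (1) in the abstract points at: how much does keeping the hyperedge intact actually buy over its clique expansion?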
HumanMAC: Masked Motion Completion for Human Motion Prediction
Human motion prediction is a classical problem in computer vision and
computer graphics, with a wide range of practical applications. Previous
efforts achieve strong empirical performance based on an encoding-decoding
style: such methods first encode previous motions into latent representations
and then decode them into predicted motions. However, in practice, they remain
unsatisfactory due to several issues, including complicated loss constraints,
cumbersome training processes, and a limited ability to switch between
different categories of motions in prediction.
In this paper, to address the above issues, we jump out of the foregoing style
and propose a novel framework from a new perspective. Specifically, our
framework works in a masked completion fashion. In the training stage, we learn
a motion diffusion model that generates motions from random noise. In the
inference stage, with a denoising procedure, we make motion prediction
conditioning on observed motions to output more continuous and controllable
predictions. The proposed framework enjoys promising algorithmic properties:
it needs only one loss in optimization and is trained in an end-to-end manner.
Additionally, it switches between different categories of motions effectively,
which is significant in realistic tasks such as animation. Comprehensive
experiments on benchmarks confirm the superiority
of the proposed framework. The project page is available at
https://lhchen.top/Human-MAC
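The masked-completion inference described above, denoising from noise while conditioning on the observed motion, follows the general inpainting-style recipe for diffusion models: at every reverse step, the known frames are overwritten with a correspondingly-noised copy of the observations. The sketch below is schematic, not HumanMAC's code; the learned denoiser is replaced by a toy step that merely shrinks the noise scale.

```python
import numpy as np

# Schematic of masked-completion (inpainting-style) diffusion inference.
# Names are hypothetical; the toy "denoiser" stands in for a trained model.

def toy_denoise_step(x, t, T):
    # Stand-in for one learned reverse-diffusion step: noise magnitude
    # decays toward 0 as t -> 0.
    return x * (t / T)

def masked_completion(observed, mask, T=50, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(observed.shape)   # start from pure noise
    for t in range(T, 0, -1):
        x = toy_denoise_step(x, t - 1, T)
        # Condition on observations: overwrite known frames with a copy of
        # the observed motion noised to the current diffusion level.
        noisy_obs = observed + rng.standard_normal(observed.shape) * ((t - 1) / T)
        x = np.where(mask, noisy_obs, x)
    return x

obs = np.linspace(0.0, 1.0, 10)   # 10 "frames" of 1-D motion
mask = np.arange(10) < 5          # first 5 frames are observed
pred = masked_completion(obs, mask)
```

Because conditioning happens only at inference time, the same unconditional motion diffusion model serves prediction with any observed/unobserved split, which is why the framework needs one loss and no task-specific training.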
Improved field emission performance of carbon nanotube by introducing copper metallic particles
To improve the field emission performance of carbon nanotubes (CNTs), a simple and low-cost method was adopted in this article. We introduced copper particles to decorate the CNTs, forming copper particle-CNT composites. The composites were fabricated by an electrophoretic deposition technique, which localized copper metallic particles on the outer walls of the CNTs and deposited them onto an indium tin oxide (ITO) electrode. The results showed that the conductivity increased from 10⁻⁵ to 4 × 10⁻⁵ S, while the turn-on field was reduced from 3.4 to 2.2 V/μm. Moreover, the field emission current remained essentially undiminished after continuous emission for 24 h. We attribute these improvements to two effects: decorating the CNTs with copper metallic particles increases the surface roughness of the CNTs, which benefits field emission and restrains the field emission current from saturating when the applied electric field exceeds the critical field; in addition, it improves the electrical contact by increasing the contact area between the CNTs and the ITO electrode, which benefits electron transport and avoids unstable electron emission caused by thermal injury of the CNTs.
Genomic Interpreter: A Hierarchical Genomic Deep Neural Network with 1D Shifted Window Transformer
Given the increasing volume and quality of genomics data, extracting new
insights requires interpretable machine-learning models. This work presents
Genomic Interpreter: a novel architecture for genomic assay prediction that
outperforms state-of-the-art models on genomic assay prediction tasks and can
identify hierarchical dependencies in genomic sites. This is achieved through
the integration of 1D-Swin, a novel Transformer-based block we designed for
modelling long-range hierarchical data. Evaluated on a dataset containing
38,171 DNA segments of 17K base pairs each, Genomic Interpreter demonstrates
superior performance in chromatin accessibility and gene expression prediction
and unmasks the underlying `syntax' of gene regulation.
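The shifted-window mechanism behind a 1-D Swin-style block can be shown with a few lines of numpy. This is an illustration of the general idea, not the paper's 1D-Swin code: tokens are split into fixed-size windows for local attention, and alternating layers cyclically shift the sequence by half a window so information can cross window boundaries.

```python
import numpy as np

# Illustrative 1-D shifted-window partitioning (function names are ours).

def window_partition_1d(x, window):
    # x: (length, channels), with length divisible by window.
    L, C = x.shape
    return x.reshape(L // window, window, C)

def shift_1d(x, window):
    # Cyclic shift by half a window before re-partitioning, so the next
    # round of windowed attention mixes tokens across old boundaries.
    return np.roll(x, -window // 2, axis=0)

seq = np.arange(8, dtype=float).reshape(8, 1)      # 8 tokens, 1 channel
win = window_partition_1d(seq, 4)                  # windows [0..3], [4..7]
shifted = window_partition_1d(shift_1d(seq, 4), 4) # windows [2..5], [6,7,0,1]
```

Stacking plain and shifted window layers gives attention an effectively growing receptive field at linear cost in sequence length, which is what makes the approach workable on 17K-base-pair inputs.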
Latent Diffusion Model for DNA Sequence Generation
The harnessing of machine learning, especially deep generative models, has
opened up promising avenues in the field of synthetic DNA sequence generation.
Whilst Generative Adversarial Networks (GANs) have gained traction for this
application, they often face issues such as limited sample diversity and mode
collapse. On the other hand, Diffusion Models are a promising new class of
generative models that are not burdened with these problems, enabling them to
reach the state-of-the-art in domains such as image generation. In light of
this, we propose a novel latent diffusion model, DiscDiff, tailored for
discrete DNA sequence generation. By simply embedding discrete DNA sequences
into a continuous latent space using an autoencoder, we are able to leverage
the powerful generative abilities of continuous diffusion models for the
generation of discrete data. Additionally, we introduce Fréchet
Reconstruction Distance (FReD) as a new metric to measure the sample quality of
DNA sequence generations. Our DiscDiff model demonstrates an ability to
generate synthetic DNA sequences that align closely with real DNA in terms of
Motif Distribution, Latent Embedding Distribution (FReD), and Chromatin
Profiles. Additionally, we contribute a comprehensive cross-species dataset of
150K unique promoter-gene sequences from 15 species, enriching resources for
future generative modelling in genomics. We will make our code public upon
publication.
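The embed-then-diffuse idea is simple to sketch: map discrete bases into a continuous space, perturb or generate there, then snap back to discrete symbols. In the sketch below a plain one-hot encoding stands in for DiscDiff's learned autoencoder latent, which is an assumption for illustration only.

```python
import numpy as np

# Sketch: discrete DNA -> continuous representation -> discrete DNA.
# One-hot encoding stands in for DiscDiff's learned autoencoder latent.

BASES = "ACGT"

def encode(seq):
    # Each base becomes a 4-dim continuous vector.
    x = np.zeros((len(seq), 4))
    for i, b in enumerate(seq):
        x[i, BASES.index(b)] = 1.0
    return x

def decode(x):
    # Snap continuous vectors back to the nearest discrete base.
    return "".join(BASES[i] for i in x.argmax(axis=1))

z = encode("ACGTTG")
# A continuous model (here just additive noise standing in for a
# diffusion step) can operate freely in this space:
z_noisy = z + np.random.default_rng(0).normal(0.0, 0.2, z.shape)
recovered = decode(z_noisy)
```

The decode step is robust to small continuous perturbations, which is what lets a continuous diffusion model do the generative heavy lifting while the output remains a valid discrete sequence.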
Metabolic Profiling Reveals a Dependency of Human Metastatic Breast Cancer on Mitochondrial Serine and One-Carbon Unit Metabolism.
Breast cancer is the most common cancer among American women and a major cause of mortality. To identify metabolic pathways as potential targets to treat metastatic breast cancer, we performed metabolomics profiling on the breast cancer cell line MDA-MB-231 and its tissue-tropic metastatic subclones. Here, we report that these subclones with increased metastatic potential display an altered metabolic profile compared with the parental population. In particular, the mitochondrial serine and one-carbon (1C) unit pathway is upregulated in metastatic subclones. Mechanistically, the mitochondrial serine and 1C unit pathway drives the faster proliferation of subclones through enhanced de novo purine biosynthesis. Inhibition of the first rate-limiting enzyme of the mitochondrial serine and 1C unit pathway, serine hydroxymethyltransferase (SHMT2), potently suppresses proliferation of metastatic subclones in culture and impairs growth of lung metastatic subclones at both primary and metastatic sites in mice. Some human breast cancers exhibit a significant association between the expression of genes in the mitochondrial serine and 1C unit pathway and disease outcome, as well as higher expression of SHMT2 in metastatic tumor tissue compared with primary tumors. In addition to breast cancer, a few other cancer types, such as adrenocortical carcinoma and kidney chromophobe cell carcinoma, also display increased SHMT2 expression during disease progression. Together, these results suggest that mitochondrial serine and 1C unit metabolism plays an important role in promoting cancer progression, particularly in late-stage cancer. IMPLICATIONS: This study identifies mitochondrial serine and 1C unit metabolism as an important pathway during the progression of a subset of human breast cancers.
Stable-Makeup: When Real-World Makeup Transfer Meets Diffusion Model
Current makeup transfer methods are limited to simple makeup styles, making
them difficult to apply in real-world scenarios. In this paper, we introduce
Stable-Makeup, a novel diffusion-based makeup transfer method capable of
robustly transferring a wide range of real-world makeup onto user-provided
faces. Stable-Makeup is based on a pre-trained diffusion model and utilizes a
Detail-Preserving (D-P) makeup encoder to encode makeup details. It also
employs content and structural control modules to preserve the content and
structural information of the source image. With the aid of our newly added
makeup cross-attention layers in U-Net, we can accurately transfer the detailed
makeup to the corresponding position in the source image. After
content-structure decoupling training, Stable-Makeup can maintain the content
and facial structure of the source image. Moreover, our method has demonstrated
strong robustness and generalizability, making it applicable to various tasks
such as cross-domain makeup transfer and makeup-guided text-to-image
generation. Extensive experiments have demonstrated that our approach delivers
state-of-the-art (SOTA) results among existing makeup transfer methods and
exhibits highly promising potential for broad applications in various
related fields.
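The makeup cross-attention pattern the abstract describes can be illustrated with plain scaled dot-product attention: queries come from source-face features, while keys and values come from the makeup encoder, so each source location attends to makeup detail. Shapes, projection sizes, and names below are illustrative assumptions, not Stable-Makeup's actual layers.

```python
import numpy as np

# Minimal numpy sketch of cross-attention between source-face features
# (queries) and makeup-encoder features (keys/values). Hypothetical shapes.

def cross_attention(q_feats, kv_feats, d=8, seed=0):
    rng = np.random.default_rng(seed)
    # Random projections stand in for learned weight matrices.
    Wq = rng.standard_normal((q_feats.shape[1], d))
    Wk = rng.standard_normal((kv_feats.shape[1], d))
    Wv = rng.standard_normal((kv_feats.shape[1], d))
    Q, K, V = q_feats @ Wq, kv_feats @ Wk, kv_feats @ Wv
    scores = Q @ K.T / np.sqrt(d)
    # Softmax over makeup tokens: each source location distributes its
    # attention across the makeup representation.
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ V   # makeup-informed features, one per source token

src = np.random.default_rng(1).standard_normal((16, 32))  # 16 source tokens
mkp = np.random.default_rng(2).standard_normal((10, 32))  # 10 makeup tokens
out = cross_attention(src, mkp)
```

Inserting such layers into the U-Net lets makeup detail flow to spatially corresponding positions in the source image, while separate content and structure modules keep the source identity fixed.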