87 research outputs found
Fashion Matrix: Editing Photos by Just Talking
The utilization of Large Language Models (LLMs) for the construction of AI
systems has garnered significant attention across diverse fields. The extension
of LLMs to the domain of fashion holds substantial commercial potential but
also inherent challenges due to the intricate semantic interactions in
fashion-related generation. To address this issue, we developed a hierarchical
AI system called Fashion Matrix dedicated to editing photos by just talking.
This system facilitates diverse prompt-driven tasks, encompassing garment or
accessory replacement, recoloring, addition, and removal. Specifically, Fashion
Matrix employs an LLM as its foundational support and engages in iterative
interactions with users. It employs a range of Semantic Segmentation Models
(e.g., Grounded-SAM, MattingAnything, etc.) to delineate the specific editing
masks based on user instructions. Subsequently, Visual Foundation Models (e.g.,
Stable Diffusion, ControlNet, etc.) are leveraged to generate edited images
from text prompts and masks, thereby facilitating the automation of fashion
editing processes. Experiments demonstrate the outstanding ability of Fashion
Matrix to explore the collaborative potential of functionally diverse
pre-trained models in the domain of fashion editing.
Comment: 13 pages, 5 figures, 2 tables
Towards Scalable Unpaired Virtual Try-On via Patch-Routed Spatially-Adaptive GAN
Source at https://proceedings.neurips.cc/paper/2021/hash/151de84cca69258b17375e2f44239191-Abstract.html. Image-based virtual try-on is one of the most promising applications of human-centric image generation due to its tremendous real-world potential. Yet, as most try-on approaches fit in-shop garments onto a target person, they require the laborious and restrictive construction of a paired training dataset, severely limiting their scalability. While a few recent works attempt to transfer garments directly from one person to another, alleviating the need to collect paired datasets, their performance is impacted by the lack of paired (supervised) information. In particular, disentangling style and spatial information of the garment becomes a challenge, which existing methods address either by requiring auxiliary data or through extensive online optimization procedures, thereby still inhibiting their scalability. To achieve a scalable virtual try-on system that can transfer arbitrary garments between a source and a target person in an unsupervised manner, we thus propose a texture-preserving end-to-end network, the PAtch-routed SpaTially-Adaptive GAN (PASTA-GAN), that facilitates real-world unpaired virtual try-on. Specifically, to disentangle the style and spatial information of each garment, PASTA-GAN consists of an innovative patch-routed disentanglement module for successfully retaining garment texture and shape characteristics. Guided by the source person's keypoints, the patch-routed disentanglement module first decouples garments into normalized patches, thus eliminating the inherent spatial information of the garment, and then reconstructs the normalized patches to the warped garment complying with the target person pose. Given the warped garment, PASTA-GAN further introduces novel spatially-adaptive residual blocks that guide the generator to synthesize more realistic garment details. Extensive comparisons with paired and unpaired approaches demonstrate the superiority of PASTA-GAN, highlighting its ability to generate high-quality try-on images when faced with a large variety of garments (e.g., vests, shirts, pants), taking a crucial step towards real-world scalable try-on.
GP-VTON: Towards General Purpose Virtual Try-on via Collaborative Local-Flow Global-Parsing Learning
Image-based Virtual Try-ON aims to transfer an in-shop garment onto a
specific person. Existing methods employ a global warping module to model the
anisotropic deformation for different garment parts, which fails to preserve
the semantic information of different parts when receiving challenging inputs
(e.g., intricate human poses, difficult garments). Moreover, most of them
directly warp the input garment to align with the boundary of the preserved
region, which usually requires texture squeezing to meet the boundary shape
constraint and thus leads to texture distortion. The above inferior performance
hinders existing methods from real-world applications. To address these
problems and take a step towards real-world virtual try-on, we propose a
General-Purpose Virtual Try-ON framework, named GP-VTON, by developing an
innovative Local-Flow Global-Parsing (LFGP) warping module and a Dynamic
Gradient Truncation (DGT) training strategy. Specifically, compared with the
previous global warping mechanism, LFGP employs local flows to warp garment
parts individually, and assembles the local warped results via the global
garment parsing, resulting in reasonable warped parts and a semantically correct
intact garment even with challenging inputs. On the other hand, our DGT training
strategy dynamically truncates the gradient in the overlap area, so the warped
garment is no longer required to meet the boundary constraint, which effectively
avoids the texture squeezing problem. Furthermore, our GP-VTON can be easily
extended to multi-category scenarios and jointly trained by using data from
different garment categories. Extensive experiments on two high-resolution
benchmarks demonstrate our superiority over the existing state-of-the-art
methods.
Comment: 8 pages, 8 figures, The IEEE/CVF Computer Vision and Pattern
Recognition Conference (CVPR
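The idea of assembling individually warped garment parts through a global parsing map can be illustrated with a minimal sketch. This is not the GP-VTON implementation: the function name `assemble_by_parsing`, the label values, and the plain nested-list image representation are all assumptions made for illustration. It only shows per-pixel selection of the part indicated by a label map.

```python
def assemble_by_parsing(warped_parts, parsing, background=0):
    """Compose one image from separately warped parts.

    warped_parts : dict mapping an integer label to a 2-D list of pixel
                   values (hypothetical representation, same size as parsing)
    parsing      : 2-D list of integer labels, one per pixel
    background   : value used where the label has no corresponding part
    """
    assembled = []
    for y, label_row in enumerate(parsing):
        out_row = []
        for x, label in enumerate(label_row):
            part = warped_parts.get(label)
            # Take the pixel from the part that the parsing map assigns here.
            out_row.append(part[y][x] if part is not None else background)
        assembled.append(out_row)
    return assembled


# Hypothetical toy input: label 1 and label 2 stand for two garment parts.
parts = {
    1: [[10, 10], [10, 10]],
    2: [[20, 20], [20, 20]],
}
parsing_map = [[1, 2], [0, 2]]
result = assemble_by_parsing(parts, parsing_map)
```

Because each pixel is taken from exactly one part, overlapping warped parts cannot bleed into each other in this toy version.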
Quantitative Label-Free Proteomic Analysis of Milk Fat Globule Membrane in Donkey and Human Milk
Previous studies have found that donkey milk (DM) has a composition similar to that of human milk (HM) and could be used as a potential hypoallergenic replacement diet for babies suffering from cow's milk allergy. Milk fat globule membrane (MFGM) proteins are involved in many biological functions and serve as important indicators of the nutritional quality of milk. In this study, we used label-free proteomics to quantify the differentially expressed MFGM proteins (DEP) between DM (at 4–5 months of lactation) and HM (at 6–8 months of lactation). In total, 293 DEP were found between these two groups. Gene Ontology (GO) enrichment analysis revealed that the majority of DEP participated in regulation of immune system process, membrane invagination and lymphocyte activation. Several significant Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways were determined for the DEP, such as lysosome, galactose metabolism and peroxisome proliferator-activated receptor (PPAR) signaling pathway. Our study may provide valuable information on the composition of MFGM proteins in DM and HM, and expand our knowledge of the different biological functions of DM and HM.
Background, progress and prospect of traditional knowledge under the Convention on Biological Diversity
Mitochondrial genome of Salix cardiophylla and its implications for infrageneric division of the genus of Salix
Salix cardiophylla is a member of the genus Salix in the family Salicaceae with unique morphological traits, and was once recognized as a separate genus, Toisusu Kimura. Here, we sequenced and assembled the complete mitochondrial genome of S. cardiophylla, which was 735,173 bp in length and included 56 genes (28 protein-coding genes, 3 rRNA genes, and 25 tRNA genes) and one large inverted repeat region with a length of 13,603 bp. Phylogenetic analysis based on 26 mitochondrial CDS confirmed that S. cardiophylla is a member of Salix and supports its merger into Salix, based on our new insights from mitogenome phylogenomics.
Information Volume Threshold for Graphical Variable Message Signs Based on Drivers’ Visual Cognition Behavior
Variable message signs (VMS) are widely employed to offer drivers dynamic traffic information. However, practical guidance on the information volume displayed on a graphical VMS is still lacking. Building on the results of a subjective questionnaire survey, a static cognitive experiment was conducted to analyze the influence of the information volume (i.e., the elements and the number of roads displayed) of graphical VMS on drivers’ visual cognition characteristics and then determine the threshold number of roads displayed on VMS. Forty-five drivers participated in the static cognitive experiment. Five indicators, including visual cognition time, cognition accuracy, comprehension accuracy, general assessment, and information acceptance, were used to estimate the influences of graphical VMS. Descriptive statistics and statistical hypothesis testing indicated that drivers also preferred auxiliary elements (i.e., distance or time information) besides basic design elements (i.e., driving direction, current position, and road name) displayed on graphical VMS. With the increase in information volume, driver visual cognition time increased while the accompanying indicators (i.e., visual cognition accuracy and comprehension accuracy) generally worsened. Combining the data of drivers’ objective behavior and subjective scoring, the Technique for Order Preference by Similarity to Ideal Solution (TOPSIS) method revealed that the number of roads shown on the graphical VMS should be no greater than five. The study results were verified by dynamic simulation experiments. This finding provides a supplement for the design standards and usage specifications for VMS.
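The TOPSIS ranking step named in this abstract can be sketched as follows. This is a generic textbook version, not the study's analysis: vector normalization, the weights, the function name `topsis`, and every number in the example are assumptions chosen only to show the mechanics.

```python
import math


def topsis(matrix, weights, benefit):
    """Return the TOPSIS closeness coefficient of each alternative.

    matrix  : rows are alternatives, columns are criteria
    weights : one weight per criterion
    benefit : per criterion, True if larger is better, False if smaller is better
    A higher coefficient means the alternative is closer to the ideal solution.
    """
    n_crit = len(weights)
    # Vector-normalize every column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    weighted = [
        [weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix
    ]
    # Ideal and anti-ideal value for each criterion.
    columns = list(zip(*weighted))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]
    scores = []
    for row in weighted:
        d_pos = math.dist(row, ideal)  # distance to the ideal solution
        d_neg = math.dist(row, anti)   # distance to the anti-ideal solution
        scores.append(d_neg / (d_pos + d_neg))
    return scores


# Hypothetical placeholder data: each row is one sign layout, with columns
# (cognition time in seconds, cognition accuracy). Time is a cost criterion.
layouts = [[2.1, 0.95], [2.6, 0.93], [3.4, 0.88], [4.8, 0.74]]
closeness = topsis(layouts, weights=[0.5, 0.5], benefit=[False, True])
best_layout = max(range(len(layouts)), key=lambda i: closeness[i])
```

The sketch assumes no criterion column is all zeros and that the alternatives are not all identical; either case would cause a division by zero.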
The complete chloroplast genome of Eurya rubiginosa var. attenuata H. T. Chang (Pentaphylacaceae)
Eurya rubiginosa var. attenuata is a valuable multiuse tree with a long history of use in China. It has great economic and ecological importance and is used for landscape and urban planting, soil improvement, and as a raw material for food production. However, genomic studies of E. rubiginosa var. attenuata are limited, and the classification of this taxon is controversial. In this study, the complete plastome of E. rubiginosa var. attenuata was successfully sequenced and assembled. The chloroplast genome is 157,215 bp in length with a 37.3% GC content. It has a quadripartite structure comprising a pair of inverted repeat (IR) sequences of 25,872 bp each, a small single-copy (SSC) region of 18,216 bp, and a large single-copy (LSC) region of 87,255 bp. The genome contains 128 genes, including 83 protein-coding genes, 37 tRNA genes, and 8 rRNA genes. Phylogenetic inference based on complete plastome analysis showed that E. rubiginosa var. attenuata is closely related to E. alata and belongs to the family Pentaphylacaceae, which differs from the results of the traditional Engler system. The chloroplast genome sequence assembly and phylogenetic analysis enrich the genetic resources of Pentaphylacaceae and provide a molecular basis for further studies on the phylogeny of the family.
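The reported region lengths and gene counts are internally consistent, which a short check makes explicit. All figures are taken from the abstract above; the variable names are illustrative, and the check assumes the stated IR length applies to each copy of the pair.

```python
# Region lengths in base pairs, as reported in the abstract.
IR = 25_872    # one inverted repeat; a quadripartite plastome has two copies
SSC = 18_216   # small single-copy region
LSC = 87_255   # large single-copy region

total = LSC + SSC + 2 * IR
assert total == 157_215  # matches the reported genome length

# Gene counts by class, as reported in the abstract.
genes = {"protein-coding": 83, "tRNA": 37, "rRNA": 8}
gene_total = sum(genes.values())
assert gene_total == 128  # matches the reported number of genes
```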
HVS-based quality assessment metrics for 3-D images
Significant efforts from both academia and industry have been devoted to advancing three-dimensional (3-D) imaging technologies. However, accurate and easy-to-use visual quality assessment metrics for 3-D images are still lacking. In this paper, we propose three objective quality assessment metrics for 3-D images, taking into consideration a set of relevant visual characteristic factors, including contrast sensitivity, multichannel and binocular parallax characteristics. The human visual signal-to-noise ratio (HVSNR), parallax distortion ratio (PDR), and different peak signal-to-noise ratio (DPSNR) are the three independent objective quality assessment metrics proposed in this paper. Experimental results show that the quality assessment results based upon the proposed metrics are consistent with the quality grades obtained by subjective assessment.
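The abstract does not give formulas for HVSNR, PDR, or DPSNR. For orientation only, the conventional peak signal-to-noise ratio that such metrics are usually compared against can be sketched as below. This is the standard PSNR definition, not any of the proposed metrics; the flat-list image representation and the 8-bit peak value of 255 are assumptions.

```python
import math


def psnr(reference, distorted, max_value=255.0):
    """Conventional PSNR in dB between two equally sized flat pixel lists."""
    if not reference or len(reference) != len(distorted):
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 4-pixel example.
ref = [52, 55, 61, 59]
dist = [54, 55, 60, 57]
value_db = psnr(ref, dist)
```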