87 research outputs found

    Fashion Matrix: Editing Photos by Just Talking

    The utilization of Large Language Models (LLMs) for the construction of AI systems has garnered significant attention across diverse fields. The extension of LLMs to the domain of fashion holds substantial commercial potential but also poses inherent challenges due to the intricate semantic interactions in fashion-related generation. To address this issue, we developed a hierarchical AI system called Fashion Matrix dedicated to editing photos by just talking. This system facilitates diverse prompt-driven tasks, encompassing garment or accessory replacement, recoloring, addition, and removal. Specifically, Fashion Matrix employs an LLM as its foundational support and engages in iterative interactions with users. It employs a range of Semantic Segmentation Models (e.g., Grounded-SAM, MattingAnything, etc.) to delineate the specific editing masks based on user instructions. Subsequently, Visual Foundation Models (e.g., Stable Diffusion, ControlNet, etc.) are leveraged to generate edited images from text prompts and masks, thereby facilitating the automation of fashion editing processes. Experiments demonstrate the outstanding ability of Fashion Matrix to explore the collaborative potential of functionally diverse pre-trained models in the domain of fashion editing. Comment: 13 pages, 5 figures, 2 tables

    Towards Scalable Unpaired Virtual Try-On via Patch-Routed Spatially-Adaptive GAN

    Source at https://proceedings.neurips.cc/paper/2021/hash/151de84cca69258b17375e2f44239191-Abstract.html. Image-based virtual try-on is one of the most promising applications of human-centric image generation due to its tremendous real-world potential. Yet, as most try-on approaches fit in-shop garments onto a target person, they require the laborious and restrictive construction of a paired training dataset, severely limiting their scalability. While a few recent works attempt to transfer garments directly from one person to another, alleviating the need to collect paired datasets, their performance is impacted by the lack of paired (supervised) information. In particular, disentangling style and spatial information of the garment becomes a challenge, which existing methods either address by requiring auxiliary data or extensive online optimization procedures, thereby still inhibiting their scalability. To achieve a scalable virtual try-on system that can transfer arbitrary garments between a source and a target person in an unsupervised manner, we thus propose a texture-preserving end-to-end network, the PAtch-routed SpaTially-Adaptive GAN (PASTA-GAN), that facilitates real-world unpaired virtual try-on. Specifically, to disentangle the style and spatial information of each garment, PASTA-GAN consists of an innovative patch-routed disentanglement module for successfully retaining garment texture and shape characteristics. Guided by the source person's keypoints, the patch-routed disentanglement module first decouples garments into normalized patches, thus eliminating the inherent spatial information of the garment, and then reconstructs the normalized patches to the warped garment complying with the target person pose. Given the warped garment, PASTA-GAN further introduces novel spatially-adaptive residual blocks that guide the generator to synthesize more realistic garment details. Extensive comparisons with paired and unpaired approaches demonstrate the superiority of PASTA-GAN, highlighting its ability to generate high-quality try-on images when faced with a large variety of garments (e.g., vests, shirts, pants), taking a crucial step towards real-world scalable try-on.

    GP-VTON: Towards General Purpose Virtual Try-on via Collaborative Local-Flow Global-Parsing Learning

    Image-based Virtual Try-ON aims to transfer an in-shop garment onto a specific person. Existing methods employ a global warping module to model the anisotropic deformation for different garment parts, which fails to preserve the semantic information of different parts when receiving challenging inputs (e.g., intricate human poses, difficult garments). Moreover, most of them directly warp the input garment to align with the boundary of the preserved region, which usually requires texture squeezing to meet the boundary shape constraint and thus leads to texture distortion. This inferior performance keeps existing methods from real-world applications. To address these problems and take a step towards real-world virtual try-on, we propose a General-Purpose Virtual Try-ON framework, named GP-VTON, by developing an innovative Local-Flow Global-Parsing (LFGP) warping module and a Dynamic Gradient Truncation (DGT) training strategy. Specifically, compared with the previous global warping mechanism, LFGP employs local flows to warp garment parts individually, and assembles the local warped results via the global garment parsing, resulting in reasonable warped parts and a semantically correct intact garment even with challenging inputs. On the other hand, our DGT training strategy dynamically truncates the gradient in the overlap area, so the warped garment is no longer required to meet the boundary constraint, which effectively avoids the texture squeezing problem. Furthermore, our GP-VTON can be easily extended to a multi-category scenario and jointly trained by using data from different garment categories. Extensive experiments on two high-resolution benchmarks demonstrate our superiority over the existing state-of-the-art methods. Comment: 8 pages, 8 figures, The IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR

    Quantitative Label-Free Proteomic Analysis of Milk Fat Globule Membrane in Donkey and Human Milk

    Previous studies have found that donkey milk (DM) has a composition similar to that of human milk (HM) and could be used as a potential hypoallergenic replacement diet for babies suffering from cow's milk allergy. Milk fat globule membrane (MFGM) proteins are involved in many biological functions and serve as important indicators of the nutritional quality of milk. In this study, we used label-free proteomics to quantify the differentially expressed MFGM proteins (DEP) between DM (in 4–5 months of lactation) and HM (in 6–8 months of lactation). In total, 293 DEP were found between these two groups. Gene Ontology (GO) enrichment analysis revealed that the majority of DEP participated in regulation of immune system process, membrane invagination and lymphocyte activation. Several significant Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways were determined for the DEP, such as lysosome, galactose metabolism and peroxisome proliferator-activated receptor (PPAR) signaling pathway. Our study may provide valuable information on the composition of MFGM proteins in DM and HM, and expand our knowledge of the different biological functions of DM and HM.

    Mitochondrial genome of Salix cardiophylla and its implications for infrageneric division of the genus of Salix

    Salix cardiophylla is a member of the genus Salix in the family Salicaceae with unique morphological traits, and was once recognized as a separate genus, Toisusu Kimura. Here, we sequenced and assembled the complete mitochondrial genome of S. cardiophylla, which was 735,173 bp in length and included 56 genes (28 protein-coding genes, 3 rRNA genes, and 25 tRNA genes) and one large inverted repeat region 13,603 bp in length. Phylogenetic analysis based on 26 mitochondrial CDS confirmed that S. cardiophylla is a member of Salix, and our new insights from mitogenome phylogenomics support its merger into Salix.

    Information Volume Threshold for Graphical Variable Message Signs Based on Drivers’ Visual Cognition Behavior

    Variable message signs (VMS) are widely employed to offer drivers dynamic traffic information. However, practical guidance on the information volume displayed on a graphical VMS is still lacking. Building on the results of a subjective questionnaire survey, a static cognitive experiment was conducted to analyze the influence of the information volume (i.e., the design elements and the number of roads displayed) of graphical VMS on drivers’ visual cognition characteristics and then determine the threshold number of roads displayed on VMS. Forty-five drivers participated in the static cognitive experiment. Five indicators, including visual cognition time, cognition accuracy, comprehension accuracy, general assessment, and information acceptance, were used to estimate the influences of graphical VMS. Descriptive statistics and statistical hypothesis testing indicated that drivers preferred auxiliary elements (i.e., distance or time information) in addition to the basic design elements (i.e., driving direction, current position, and road name) displayed on graphical VMS. With the increase in information volume, driver visual cognition time increased while other companion indexes (i.e., visual cognition accuracy and comprehension accuracy) generally worsened. Combining the data of drivers’ objective behavior and subjective scoring, the Technique for Order Preference by Similarity to Ideal Solution (TOPSIS) method revealed that the number of roads shown on the graphical VMS should be no greater than five. The study results were verified by dynamic simulation experiments. This finding supplements the design standards and usage specifications for VMS.
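The abstract above names TOPSIS but does not give the decision matrix, weights, or normalization the authors used. As a minimal sketch of the general technique only, the following ranks alternatives by their closeness to an ideal solution. The function name `topsis`, the equal weights, the vector normalization, and every number in the toy matrix are assumptions for illustration and are not taken from the study.

```python
import math

def topsis(matrix, weights, benefit):
    """Return a closeness score in [0, 1] for each alternative (row).

    matrix  : list of rows, one row per alternative, one column per criterion
    weights : one weight per criterion
    benefit : True if larger is better for that criterion, False if smaller is better
    Assumes no column is all zeros and that the alternatives are not all identical.
    """
    n_crit = len(weights)
    # Vector-normalize each column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    columns = list(zip(*weighted))
    # Ideal and anti-ideal points, respecting the direction of each criterion.
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]
    scores = []
    for row in weighted:
        d_pos = math.dist(row, ideal)  # distance to the ideal solution
        d_neg = math.dist(row, anti)   # distance to the anti-ideal solution
        scores.append(d_neg / (d_pos + d_neg))
    return scores

# Toy, made-up values (NOT data from the study): three hypothetical sign designs,
# scored on cognition time in seconds (lower is better) and accuracy (higher is better).
toy_matrix = [
    [1.0, 0.9],
    [1.5, 0.8],
    [2.0, 0.5],
]
scores = topsis(toy_matrix, weights=[0.5, 0.5], benefit=[False, True])
best = max(range(len(scores)), key=scores.__getitem__)
```

In this toy example the first row is best on both criteria, so it coincides with the ideal point and receives the highest score; a threshold like the one reported in the abstract would come from comparing such scores across sign designs with different numbers of roads.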

    The complete chloroplast genome of Eurya rubiginosa var. attenuata H. T. Chang (Pentaphylacaceae)

    Eurya rubiginosa var. attenuata is a valuable multiuse tree with a long history of use in China. It has great economic and ecological importance and is used for landscape and urban planting, soil improvement, and raw materials for food production. However, genomic studies of E. rubiginosa var. attenuata are limited, and the classification of this taxon is controversial. In this study, the complete plastome of E. rubiginosa var. attenuata was successfully sequenced and assembled. The chloroplast genome is 157,215 bp in length with a 37.3% GC content. It has a quadripartite structure comprising a pair of inverted repeat (IR) sequences of 25,872 bp, a small single-copy (SSC) region of 18,216 bp, and a large single-copy (LSC) region of 87,255 bp. The genome contains 128 genes, including 83 protein-coding genes, 37 tRNA genes, and 8 rRNA genes. Phylogenetic inference based on complete plastome analysis showed that E. rubiginosa var. attenuata is closely related to E. alata and belongs to the family Pentaphylacaceae, which differs from the results of the traditional Engler system. The chloroplast genome sequence assembly and phylogenetic analysis enrich the genetic resources of Pentaphylacaceae and provide a molecular basis for further studies on the phylogeny of the family.
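The region lengths and gene counts quoted in the abstract above can be checked for internal consistency: in a quadripartite plastome the total length is the LSC plus the SSC plus two copies of the IR. The short sketch below does that arithmetic using only the figures from the abstract; the helper name `plastome_length` is introduced here for illustration.

```python
# Figures as reported in the abstract (lengths in bp).
LSC = 87_255      # large single-copy region
SSC = 18_216      # small single-copy region
IR = 25_872       # one inverted repeat; the genome carries a pair
REPORTED_TOTAL = 157_215

GENES = {"protein_coding": 83, "tRNA": 37, "rRNA": 8}
REPORTED_GENES = 128

def plastome_length(lsc, ssc, ir):
    """Total length of a quadripartite plastome: LSC + SSC + 2 * IR."""
    return lsc + ssc + 2 * ir

total = plastome_length(LSC, SSC, IR)
gene_total = sum(GENES.values())

# Both reported totals agree with the sum of their parts.
assert total == REPORTED_TOTAL
assert gene_total == REPORTED_GENES
```

Both assertions pass, so the reported genome length and gene count match their stated components.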

    HVS-based quality assessment metrics for 3-D images

    Significant efforts from both academia and industry have been devoted to advancing three-dimensional (3-D) imaging technologies. However, accurate and easy-to-use visual quality assessment metrics are still lacking for 3-D images. In this paper, we propose three objective quality assessment metrics for 3-D images that take into consideration a set of relevant visual characteristic factors, including contrast sensitivity, multichannel and binocular parallax characteristics. The human visual signal-to-noise ratio (HVSNR), parallax distortion ratio (PDR), and different peak signal-to-noise ratio (DPSNR) are the three independent objective quality assessment metrics proposed in this paper. Experimental results show that the quality assessment results based upon the proposed metrics are consistent with the quality grades obtained by subjective assessment.