46 research outputs found

    OpenFashionCLIP: Vision-and-Language Contrastive Learning with Open-Source Fashion Data

    The inexorable growth of online shopping and e-commerce demands scalable and robust machine learning-based solutions to accommodate customer requirements. In the context of automatic tagging, classification, and multimodal retrieval, prior works either defined poorly generalizable supervised learning approaches or more reusable CLIP-based techniques that were, however, trained on closed-source data. In this work, we propose OpenFashionCLIP, a vision-and-language contrastive learning method that adopts only open-source fashion data stemming from diverse domains and characterized by varying degrees of specificity. Our approach is extensively validated across several tasks and benchmarks, and experimental results highlight a significant out-of-domain generalization capability and consistent improvements over state-of-the-art methods in terms of both accuracy and recall. Source code and trained models are publicly available at: https://github.com/aimagelab/open-fashion-clip
    Comment: International Conference on Image Analysis and Processing (ICIAP) 202
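The symmetric contrastive objective behind CLIP-style training, which OpenFashionCLIP builds on, can be sketched in a few lines. This is a minimal illustrative sketch of the standard InfoNCE formulation, not the paper's implementation; the function names and the temperature value are assumptions.

```python
import numpy as np

def clip_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric InfoNCE loss used in CLIP-style training.

    image_embs, text_embs: (batch, dim) arrays; row i of each is a
    matched image-text pair.  Names and default temperature are
    illustrative, not taken from OpenFashionCLIP.
    """
    # L2-normalise so the dot product is a cosine similarity.
    image_embs = image_embs / np.linalg.norm(image_embs, axis=1, keepdims=True)
    text_embs = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)

    logits = image_embs @ text_embs.T / temperature  # (batch, batch)

    def cross_entropy(l):
        # Cross-entropy with the diagonal (matched pairs) as targets.
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))

    # Applied in both image->text and text->image directions.
    return 0.5 * (cross_entropy(logits) + cross_entropy(logits.T))
```

Training pushes matched image-text pairs together and mismatched ones apart, so the loss is lower when the two encoders agree on the pairing.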

    Multimodal Garment Designer: Human-Centric Latent Diffusion Models for Fashion Image Editing

    Fashion illustration is used by designers to communicate their vision and to bring a design idea from conceptualization to realization, showing how clothes interact with the human body. In this context, computer vision can be used to improve the fashion design process. Unlike previous works that mainly focused on the virtual try-on of garments, we propose the task of multimodal-conditioned fashion image editing, guiding the generation of human-centric fashion images by following multimodal prompts such as text, human body poses, and garment sketches. We tackle this problem by proposing a new architecture based on latent diffusion models, an approach that has not been used before in the fashion domain. Given the lack of existing datasets suitable for the task, we also extend two existing fashion datasets, namely Dress Code and VITON-HD, with multimodal annotations collected in a semi-automatic manner. Experimental results on these new datasets demonstrate the effectiveness of our proposal, both in terms of realism and of coherence with the given multimodal inputs. Source code and collected multimodal annotations will be publicly released at: https://github.com/aimagelab/multimodal-garment-designer
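One common way to condition a diffusion denoiser on spatially aligned inputs such as pose maps and garment sketches is to stack them onto the noisy latent along the channel axis. The sketch below illustrates only that general idea; the shapes and function name are hypothetical and not taken from the paper's architecture.

```python
import numpy as np

def prepare_denoiser_input(latent, pose_map, sketch_map):
    """Stack spatial conditions onto the noisy latent along the channel
    axis, a common way to condition a diffusion U-Net on spatially
    aligned inputs.  Shapes and names are illustrative assumptions.

    latent:     (C, H, W) noisy latent at the current timestep
    pose_map:   (P, H, W) encoded human-pose channels
    sketch_map: (1, H, W) garment-sketch channel
    """
    # All conditions must share the latent's spatial resolution.
    assert latent.shape[1:] == pose_map.shape[1:] == sketch_map.shape[1:]
    return np.concatenate([latent, pose_map, sketch_map], axis=0)
```

Text prompts, by contrast, are typically injected through cross-attention rather than channel concatenation, since they have no spatial alignment with the image.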

    Spectral signatures from super-Earths, warm and hot-Neptunes

    ESA's and NASA's planet-characterization missions will allow us to explore the diversity of planets around stars of different spectral types, and will expand the existing field of comparative planetology beyond our Solar System. In particular, terrestrial planets greater than one Earth mass are not represented in our Solar System but may occur in others (Beaulieu et al., 2006; Rivera et al., 2005). The next generation of space telescopes, beginning with the James Webb Space Telescope (2013), will have the capability of acquiring transmission and emission spectra of these extrasolar worlds in the infrared. Further into the future, the direct imaging of exoplanets, both in the optical and in the infrared, will extend our understanding to extrasolar bodies orbiting a few Astronomical Units from their parent star and expand our knowledge to smaller objects.

    LaDI-VTON: Latent Diffusion Textual-Inversion Enhanced Virtual Try-On

    The rapidly evolving fields of e-commerce and the metaverse continue to seek innovative approaches to enhance the consumer experience. At the same time, recent advancements in the development of diffusion models have enabled generative networks to create remarkably realistic images. In this context, image-based virtual try-on, which consists of generating a novel image of a target model wearing a given in-shop garment, has yet to capitalize on the potential of these powerful generative solutions. This work introduces LaDI-VTON, the first Latent Diffusion textual Inversion-enhanced model for the Virtual Try-ON task. The proposed architecture relies on a latent diffusion model extended with a novel additional autoencoder module that exploits learnable skip connections to enhance the generation process while preserving the model's characteristics. To effectively maintain the texture and details of the in-shop garment, we propose a textual-inversion component that maps the visual features of the garment to the CLIP token embedding space and thus generates a set of pseudo-word token embeddings capable of conditioning the generation process. Experimental results on the Dress Code and VITON-HD datasets demonstrate that our approach outperforms its competitors by a consistent margin, achieving a significant milestone for the task. Source code and trained models will be publicly released at: https://github.com/miccunifi/ladi-vton
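The textual-inversion idea described above, projecting visual garment features into the token-embedding space to obtain pseudo-word embeddings, can be sketched as follows. All names, shapes, and the single linear projection are illustrative assumptions, not the LaDI-VTON implementation.

```python
import numpy as np

def insert_pseudo_tokens(prompt_embs, garment_feats, proj):
    """Map visual garment features to pseudo-word embeddings in the
    text-embedding space and append them to the prompt sequence,
    mirroring the textual-inversion idea.  Names and the linear
    projection are hypothetical.

    prompt_embs:   (T, D) token embeddings of the text prompt
    garment_feats: (K, V) visual features of the in-shop garment
    proj:          (V, D) learned projection into the token space
    """
    pseudo_tokens = garment_feats @ proj  # (K, D) pseudo-word embeddings
    # The conditioning sequence now carries K extra "words" that
    # describe the garment's appearance to the diffusion model.
    return np.concatenate([prompt_embs, pseudo_tokens], axis=0)
```

Because the pseudo-tokens live in the same space as ordinary word embeddings, the diffusion model's text conditioning path can consume them unchanged.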

    Mössbauer spectroscopy of a monolayer of single molecule magnets

    The use of single molecule magnets (SMMs) as cornerstone elements in spintronics and quantum computing applications demands that magnetic bistability is retained when molecules are interfaced with solid conducting surfaces. Here, we employ synchrotron Mössbauer spectroscopy to investigate a monolayer of a tetrairon(III) (Fe4) SMM chemically grafted on a gold substrate. At low temperature and zero magnetic field, we observe the magnetic pattern of the Fe4 molecule, indicating slow spin fluctuations compared to the Mössbauer timescale. Significant structural deformations of the magnetic core, induced by the interaction with the substrate, as predicted by ab initio molecular dynamics, are also observed. However, the effects of the modifications occurring at the individual iron sites partially compensate each other, so that slow magnetic relaxation is retained on the surface. Interestingly, these deformations escaped detection by conventional synchrotron-based techniques, like X-ray magnetic circular dichroism, thus highlighting the power of synchrotron Mössbauer spectroscopy for the investigation of hybrid interfaces.

    High-Contrast Differential Image Processing for the Detection of Extrasolar Planets

    This thesis is dedicated to ground-based exoplanet detection and was developed in the context of the SPHERE project of ESO; the installation of this instrument is planned for 2011 on the VLT. To directly detect exoplanets, SPHERE will combine an extreme adaptive optics system, a coronagraph, and imaging at two or more wavelengths. To distinguish the planet signal from the speckles due to aberrations, a sophisticated data-processing method that also exploits temporal information is needed. I developed a method, called ANDROMEDA, which allows exoplanets to be detected and their flux estimated. It uses image differences that aim at eliminating the effects of unknown quasi-static aberrations and of the variability in the quality of the adaptive optics correction. The algorithm couples images taken at different instants that present a field rotation between them, and possibly images taken at two different wavelengths. I showed that ANDROMEDA meets the requirements of SPHERE for the detection of planets at a contrast of 10^6 at a separation of 0.5" from the star, and it can even do better, reaching a separation of 0.2" for the same contrast. I then carried out a parametric study of the method to analyze the robustness of the flux estimation against different sources of error. Finally, I tested ANDROMEDA on experimental data provided by the NACO instrument; this analysis allowed us to understand how to deal with the typical problems posed by data reduction and instrumental artifacts.
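The core differential-imaging idea, subtracting a quasi-static speckle reference and exploiting field rotation so the planet signal adds up while speckles cancel, can be illustrated with a toy example. This is not the ANDROMEDA algorithm itself (which performs maximum-likelihood flux estimation on image differences); it is a simplified angular-differential-imaging sketch, with rotation restricted to 90-degree steps to keep the code self-contained.

```python
import numpy as np

def adi_residuals(frames, angles_quarter_turns):
    """Toy angular-differential-imaging step: subtract the temporal
    median (the quasi-static speckle pattern) from each frame, then
    derotate and average so the planet signal stacks coherently.
    For simplicity the field rotation is restricted to multiples of
    90 degrees via np.rot90; a real pipeline interpolates arbitrary
    angles.
    """
    frames = np.asarray(frames, dtype=float)
    speckle_ref = np.median(frames, axis=0)   # static speckles survive the median
    residuals = frames - speckle_ref          # planet moves, speckles cancel
    # Undo each frame's field rotation before combining.
    derotated = [np.rot90(r, -k) for r, k in zip(residuals, angles_quarter_turns)]
    return np.mean(derotated, axis=0)
```

The key property is that anything fixed in the pupil frame (speckles) is removed by the reference subtraction, while a source fixed on the sky reappears at the same position after derotation and survives the averaging.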