
    Learning Robust Visual-Semantic Embedding for Generalizable Person Re-identification

    Generalizable person re-identification (Re-ID) is an active research topic in machine learning and computer vision, with important applications in public security and video surveillance. However, previous methods mainly focus on visual representation learning while neglecting the potential of semantic features during training, which easily leads to poor generalization when adapted to a new domain. In this paper, we propose a Multi-Modal Equivalent Transformer called MMET for more robust visual-semantic embedding learning on visual, textual and visual-textual tasks respectively. To further enhance robust feature learning in the transformer setting, a dynamic masking mechanism called the Masked Multimodal Modeling strategy (MMM) is introduced to mask both image patches and text tokens; it can work jointly on multimodal or unimodal data and significantly boosts the performance of generalizable person Re-ID. Extensive experiments on benchmark datasets demonstrate the competitive performance of our method over previous approaches. We hope this method could advance research on visual-semantic representation learning. Our source code is publicly available at https://github.com/JeremyXSC/MMET

    The Metaverse: Survey, Trends, Novel Pipeline Ecosystem & Future Directions

    The Metaverse offers a second world beyond reality, where boundaries are non-existent and possibilities are endless through engagement and immersive experiences using virtual reality (VR) technology. Many disciplines can benefit from the advancement of the Metaverse when it is accurately developed, including technology, gaming, education, art, and culture. Nevertheless, developing the Metaverse environment to its full potential is an ambiguous task that needs proper guidance and direction. Existing surveys on the Metaverse focus only on a specific aspect or discipline and lack a holistic view of the entire process. A more holistic, multi-disciplinary, in-depth, and academic and industry-oriented review is therefore required to provide a thorough study of the Metaverse development pipeline. To address these issues, we present in this survey a novel multi-layered pipeline ecosystem composed of (1) the Metaverse computing, networking, communications and hardware infrastructure, (2) environment digitization, and (3) user interactions. For every layer, we discuss the components that detail the steps of its development. For each of these components, we also examine the impact of a set of enabling technologies and empowering domains (e.g., Artificial Intelligence, Security & Privacy, Blockchain, Business, Ethics, and Social) on its advancement. In addition, we explain the importance of these technologies in supporting decentralization, interoperability, user experiences, interactions, and monetization. Our study highlights the existing challenges for each component, followed by research directions and potential solutions. To the best of our knowledge, this survey is the most comprehensive to date and allows users, scholars, and entrepreneurs to gain an in-depth understanding of the Metaverse ecosystem and to find their opportunities for contribution.

    Open Set Classification of GAN-based Image Manipulations via a ViT-based Hybrid Architecture

    Classification of AI-manipulated content is receiving great attention as a means of distinguishing different types of manipulations. Most of the methods developed so far fail in the open-set scenario, that is, when the algorithm used for the manipulation is not represented in the training set. In this paper, we focus on the classification of synthetic face generation and manipulation in open-set scenarios, and propose a method for classification with a rejection option. The proposed method combines Vision Transformers (ViT) with a hybrid approach for simultaneous classification and localization. Feature map correlation is exploited by the ViT module, while a localization branch is employed as an attention mechanism to force the model to learn per-class discriminative features associated with the forgery when the manipulation is performed locally in the image. Rejection is performed by considering several strategies and analyzing the model output layers. The effectiveness of the proposed method is assessed on the tasks of classifying facial attribute editing and GAN attribution.
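To make "classification with a rejection option" concrete, here is a minimal sketch of one common rejection strategy: thresholding the maximum softmax probability. This is an illustrative assumption, not the strategy used in the paper, which only states that several strategies based on the model output layers were considered. The function names and the `threshold` parameter are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify_with_rejection(logits, threshold=0.5):
    """Return the predicted class index, or None to reject the sample.

    Assumption: a sample is rejected (treated as coming from an unknown
    manipulation) when the top softmax probability is below `threshold`.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best if probs[best] >= threshold else None

# A confident prediction is accepted; a flat output is rejected.
print(classify_with_rejection([4.0, 0.0, 0.0], threshold=0.9))
print(classify_with_rejection([1.0, 1.0, 1.0], threshold=0.5))
```

In practice the threshold would be tuned on held-out data to trade off accuracy on known classes against the rate of rejected unknown samples.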

    Human-Art: A Versatile Human-Centric Dataset Bridging Natural and Artificial Scenes

    Humans have long been recorded in a variety of forms since antiquity. For example, sculptures and paintings were the primary media for depicting human beings before the invention of cameras. However, most current human-centric computer vision tasks like human pose estimation and human image generation focus exclusively on natural images in the real world. Artificial humans, such as those in sculptures, paintings, and cartoons, are commonly neglected, making existing models fail in these scenarios. As an abstraction of life, art incorporates humans in both natural and artificial scenes. We take advantage of it and introduce the Human-Art dataset to bridge related tasks in natural and artificial scenarios. Specifically, Human-Art contains 50k high-quality images with over 123k person instances from 5 natural and 15 artificial scenarios, which are annotated with bounding boxes, keypoints, self-contact points, and text information for humans represented in both 2D and 3D. It is, therefore, comprehensive and versatile for various downstream tasks. We also provide a rich set of baseline results and detailed analyses for related tasks, including human detection, 2D and 3D human pose estimation, image generation, and motion transfer. As a challenging dataset, we hope Human-Art can provide insights for relevant research and open up new research questions. Comment: CVPR202

    Anuário científico da Escola Superior de Tecnologia da Saúde de Lisboa - 2021

    It is with great pleasure that we present the latest edition (the 11th) of the Scientific Yearbook of the Escola Superior de Tecnologia da Saúde de Lisboa (ESTeSL). As a higher education institution, we are committed to promoting and encouraging scientific research in all areas of knowledge covered by our mission. This publication aims to disseminate all the scientific output produced by the teaching staff, researchers, students and non-teaching staff of ESTeSL during 2021. This Yearbook is thus a reflection of the hard and dedicated work of our community, which committed itself to producing scientific content of high quality and sharing it with society in the form of books, book chapters, articles published in national and international journals, abstracts of oral communications and posters, as well as the results of 1st and 2nd cycle work. Accordingly, the content of this publication covers a wide variety of topics, from more fundamental themes to studies of practical application in specific health contexts, reflecting the plurality and diversity of areas that define ESTeSL and make it unique. We believe that scientific research is a fundamental axis for the development of society, which is why we encourage our students to become involved in research and evidence-based practice from the start of their studies at ESTeSL. This publication is an example of the success of those efforts, being the largest ever, which makes us very proud to share the results and discoveries of our researchers with the scientific community and the general public. We hope that this Yearbook inspires and motivates other students, health professionals, teachers and other collaborators to continue exploring new ideas and to contribute to the advancement of science and technology within the body of knowledge of the areas that make up ESTeSL.
We thank everyone involved in producing this yearbook and wish you an inspiring and enjoyable read.

    Compressed-VFL: Communication-Efficient Learning with Vertically Partitioned Data

    We propose Compressed Vertical Federated Learning (C-VFL) for communication-efficient training on vertically partitioned data. In C-VFL, a server and multiple parties collaboratively train a model on their respective features utilizing several local iterations and sharing compressed intermediate results periodically. Our work provides the first theoretical analysis of the effect message compression has on distributed training over vertically partitioned data. We prove convergence of non-convex objectives at a rate of $O(\frac{1}{\sqrt{T}})$ when the compression error is bounded over the course of training. We provide specific requirements for convergence with common compression techniques, such as quantization and top-$k$ sparsification. Finally, we experimentally show compression can reduce communication by over 90% without a significant decrease in accuracy over VFL without compression.
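The two compression techniques named in the abstract above can be sketched in a few lines. This is a generic illustration of top-k sparsification and uniform quantization applied to a message vector, not the C-VFL implementation; the function names and the `num_levels` parameter are hypothetical.

```python
def top_k_sparsify(values, k):
    """Keep the k largest-magnitude entries and set the rest to zero."""
    if k <= 0:
        return [0.0] * len(values)
    # Indices sorted by absolute value, largest first; keep the first k.
    order = sorted(range(len(values)), key=lambda i: abs(values[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(values)]

def uniform_quantize(values, num_levels):
    """Round each entry to the nearest of num_levels evenly spaced levels
    spanning [min(values), max(values)]."""
    lo, hi = min(values), max(values)
    if hi == lo or num_levels < 2:
        return list(values)
    step = (hi - lo) / (num_levels - 1)
    return [lo + round((v - lo) / step) * step for v in values]

embedding = [0.1, -3.0, 2.0, 0.5]
print(top_k_sparsify(embedding, 2))
print(uniform_quantize([0.0, 0.3, 0.9, 1.0], 3))
```

With uniform quantization the per-entry error is at most half the step size, which is the kind of bounded compression error the convergence result assumes. Top-k sparsification reduces communication because only the k kept values and their indices need to be sent.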

    Machine Learning Research Trends in Africa: A 30 Years Overview with Bibliometric Analysis Review

    In this paper, a critical bibliometric analysis is conducted, coupled with an extensive literature survey on recent developments and associated applications in machine learning research with a perspective on Africa. The bibliometric analysis covers 2761 machine learning-related documents, of which 98% were articles with at least 482 citations published in 903 journals during the past 30 years. The collated documents were retrieved from the Science Citation Index EXPANDED and comprise research publications from 54 African countries between 1993 and 2021. The bibliometric study visualizes the current landscape and future trends in machine learning research and its applications, in order to facilitate future collaborative research and knowledge exchange among authors from different research institutions scattered across the African continent.

    OLIG2 neural progenitor cell development and fate in Down syndrome

    Down syndrome (DS) is caused by triplication of human chromosome 21 (HSA21) and is the most common genetic form of intellectual disability. It is unknown precisely how triplication of HSA21 results in the intellectual disability, but it is thought that the global transcriptional dysregulation caused by trisomy 21 perturbs multiple aspects of neurodevelopment that cumulatively contribute to its etiology. While the characteristics associated with DS can arise from any of the genes triplicated on HSA21, in this work we focus on oligodendrocyte transcription factor 2 (OLIG2). The progeny of neural progenitor cells (NPCs) expressing OLIG2 are likely to be involved in many of the cellular changes underlying the intellectual disability in DS. To explore the fate of OLIG2+ neural progenitors, we took advantage of two distinct models of DS: the Ts65Dn mouse model and induced pluripotent stem cells (iPSCs) derived from individuals with DS. Our results from these two systems identified multiple developmental perturbations in the cellular progeny of OLIG2+ NPCs. In Ts65Dn, we identified alterations in neurons and glia derived from the OLIG2-expressing progenitor domain in the ventral spinal cord. There were significant differences in the number of motor neurons and interneurons present in the trisomic lumbar spinal cord depending on the age of the animal, pointing to both a neurodevelopmental and a neurodegenerative phenotype in the Ts65Dn mice. Of particular note, we identified changes in oligodendrocyte (OL) maturation in the trisomic mice that are dependent on spatial location and developmental origin. In the dorsal corticospinal tract, there were significantly fewer mature OLs in the trisomic mice, whereas in the lateral funiculus we observed the opposite phenotype, with more mature OLs present in the trisomic animals.
We then transitioned our studies into iPSCs, where we were able to pattern OLIG2+ NPCs to either a spinal cord-like or a brain-like identity and study the OL lineage that differentiated from each progenitor pool. Similar to the region-specific dysregulation found in the Ts65Dn spinal cord, we identified perturbations in trisomic OLs that were dependent on whether the NPCs had been patterned to a brain-like or spinal cord-like fate. In the spinal cord-like NPCs, there was no difference in the proportion of cells expressing either OLIG2 or NKX2.2, the two transcription factors whose co-expression is essential for OL differentiation. Conversely, in the brain-like NPCs, there was a significant increase in OLIG2+ cells in the trisomic culture and a decrease in NKX2.2 mRNA expression. We identified a sonic hedgehog (SHH) signaling-based mechanism underlying these changes in OLIG2 and NKX2.2 expression in the brain-like NPCs and normalized the proportion of trisomic cells expressing the transcription factors to euploid levels by modulating the activity of the SHH pathway. Finally, we continued the differentiation of the brain-like and spinal cord-like NPCs to committed OL precursor cells (OPCs) and allowed them to mature. We identified an increase in OPC production in the spinal cord-like trisomic culture which was not present in the brain-like OPCs. Conversely, we identified a maturation deficit in the brain-like trisomic OLs that was not present in the spinal cord-like OPCs. These results underscore the importance of regional patterning in characterizing changes in cell differentiation and fate in DS. Together, the findings presented in this work contribute to the understanding of the cellular and molecular etiology of the intellectual disability in DS and in particular the contribution of cells differentiated from OLIG2+ progenitors.