14,529 research outputs found

    CLIP goes 3D: Leveraging Prompt Tuning for Language Grounded 3D Recognition

    Full text link
    Vision-Language models like CLIP have been widely adopted for various tasks due to their impressive zero-shot capabilities. However, CLIP is not suitable for extracting 3D geometric features as it was trained on only images and text by natural language supervision. We work on addressing this limitation and propose a new framework termed CG3D (CLIP Goes 3D) where a 3D encoder is learned to exhibit zero-shot capabilities. CG3D is trained using triplets of pointclouds, corresponding rendered 2D images, and texts using natural language supervision. To align the features in a multimodal embedding space, we utilize contrastive loss on 3D features obtained from the 3D encoder, as well as visual and text features extracted from CLIP. We note that the natural images used to train CLIP and the rendered 2D images in CG3D have a distribution shift. Attempting to train the visual and text encoder to account for this shift results in catastrophic forgetting and a notable decrease in performance. To solve this, we employ prompt tuning and introduce trainable parameters in the input space to shift CLIP towards the 3D pre-training dataset utilized in CG3D. We extensively test our pre-trained CG3D framework and demonstrate its impressive capabilities in zero-shot, open scene understanding, and retrieval tasks. Further, it also serves as strong starting weights for fine-tuning in downstream 3D recognition tasks.Comment: Website: https://jeya-maria-jose.github.io/cg3d-web

    GelSight360: An Omnidirectional Camera-Based Tactile Sensor for Dexterous Robotic Manipulation

    Full text link
    Camera-based tactile sensors have shown great promise in enhancing a robot's ability to perform a variety of dexterous manipulation tasks. Advantages of their use can be attributed to the high resolution tactile data and 3D depth map reconstructions they can provide. Unfortunately, many of these tactile sensors use either a flat sensing surface, sense on only one side of the sensor's body, or have a bulky form-factor, making it difficult to integrate the sensors with a variety of robotic grippers. Of the camera-based sensors that do have all-around, curved sensing surfaces, many cannot provide 3D depth maps; those that do often require optical designs specified to a particular sensor geometry. In this work, we introduce GelSight360, a fingertip-like, omnidirectional, camera-based tactile sensor capable of producing depth maps of objects deforming the sensor's surface. In addition, we introduce a novel cross-LED lighting scheme that can be implemented in different all-around sensor geometries and sizes, allowing the sensor to easily be reconfigured and attached to different grippers of varying DOFs. With this work, we enable roboticists to quickly and easily customize high resolution tactile sensors to fit their robotic system's needs

    Efectos de un nuevo nutracéutico basado en aceite de oliva virgen extra, aceite de algas y extracto de hojas de olivo sobre las alteraciones metabólicas y cardiovasculares asociadas al envejecimiento

    Full text link
    Tesis Doctoral inédita leída en la Universidad Autónoma de Madrid, Facultad de Medicina, Departamento de Fisiología. Fecha de Lectura: 23-07-2021Esta tesis tiene embargado el acceso al texto completo hasta el 23-01-2023Este trabajo de investigación ha sido financiado por la beca “Doctorados Industriales 2017” (IND2017/BIO7701) de la Comunidad de Madri

    The Metaverse: Survey, Trends, Novel Pipeline Ecosystem & Future Directions

    Full text link
    The Metaverse offers a second world beyond reality, where boundaries are non-existent, and possibilities are endless through engagement and immersive experiences using the virtual reality (VR) technology. Many disciplines can benefit from the advancement of the Metaverse when accurately developed, including the fields of technology, gaming, education, art, and culture. Nevertheless, developing the Metaverse environment to its full potential is an ambiguous task that needs proper guidance and directions. Existing surveys on the Metaverse focus only on a specific aspect and discipline of the Metaverse and lack a holistic view of the entire process. To this end, a more holistic, multi-disciplinary, in-depth, and academic and industry-oriented review is required to provide a thorough study of the Metaverse development pipeline. To address these issues, we present in this survey a novel multi-layered pipeline ecosystem composed of (1) the Metaverse computing, networking, communications and hardware infrastructure, (2) environment digitization, and (3) user interactions. For every layer, we discuss the components that detail the steps of its development. Also, for each of these components, we examine the impact of a set of enabling technologies and empowering domains (e.g., Artificial Intelligence, Security & Privacy, Blockchain, Business, Ethics, and Social) on its advancement. In addition, we explain the importance of these technologies to support decentralization, interoperability, user experiences, interactions, and monetization. Our presented study highlights the existing challenges for each component, followed by research directions and potential solutions. To the best of our knowledge, this survey is the most comprehensive and allows users, scholars, and entrepreneurs to get an in-depth understanding of the Metaverse ecosystem to find their opportunities and potentials for contribution

    A Design Science Research Approach to Smart and Collaborative Urban Supply Networks

    Get PDF
    Urban supply networks are facing increasing demands and challenges and thus constitute a relevant field for research and practical development. Supply chain management holds enormous potential and relevance for society and everyday life as the flow of goods and information are important economic functions. Being a heterogeneous field, the literature base of supply chain management research is difficult to manage and navigate. Disruptive digital technologies and the implementation of cross-network information analysis and sharing drive the need for new organisational and technological approaches. Practical issues are manifold and include mega trends such as digital transformation, urbanisation, and environmental awareness. A promising approach to solving these problems is the realisation of smart and collaborative supply networks. The growth of artificial intelligence applications in recent years has led to a wide range of applications in a variety of domains. However, the potential of artificial intelligence utilisation in supply chain management has not yet been fully exploited. Similarly, value creation increasingly takes place in networked value creation cycles that have become continuously more collaborative, complex, and dynamic as interactions in business processes involving information technologies have become more intense. Following a design science research approach this cumulative thesis comprises the development and discussion of four artefacts for the analysis and advancement of smart and collaborative urban supply networks. This thesis aims to highlight the potential of artificial intelligence-based supply networks, to advance data-driven inter-organisational collaboration, and to improve last mile supply network sustainability. Based on thorough machine learning and systematic literature reviews, reference and system dynamics modelling, simulation, and qualitative empirical research, the artefacts provide a valuable contribution to research and practice

    Corporate Social Responsibility: the institutionalization of ESG

    Get PDF
    Understanding the impact of Corporate Social Responsibility (CSR) on firm performance as it relates to industries reliant on technological innovation is a complex and perpetually evolving challenge. To thoroughly investigate this topic, this dissertation will adopt an economics-based structure to address three primary hypotheses. This structure allows for each hypothesis to essentially be a standalone empirical paper, unified by an overall analysis of the nature of impact that ESG has on firm performance. The first hypothesis explores the evolution of CSR to the modern quantified iteration of ESG has led to the institutionalization and standardization of the CSR concept. The second hypothesis fills gaps in existing literature testing the relationship between firm performance and ESG by finding that the relationship is significantly positive in long-term, strategic metrics (ROA and ROIC) and that there is no correlation in short-term metrics (ROE and ROS). Finally, the third hypothesis states that if a firm has a long-term strategic ESG plan, as proxied by the publication of CSR reports, then it is more resilience to damage from controversies. This is supported by the finding that pro-ESG firms consistently fared better than their counterparts in both financial and ESG performance, even in the event of a controversy. However, firms with consistent reporting are also held to a higher standard than their nonreporting peers, suggesting a higher risk and higher reward dynamic. These findings support the theory of good management, in that long-term strategic planning is both immediately economically beneficial and serves as a means of risk management and social impact mitigation. Overall, this contributes to the literature by fillings gaps in the nature of impact that ESG has on firm performance, particularly from a management perspective

    A Reinforcement Learning-assisted Genetic Programming Algorithm for Team Formation Problem Considering Person-Job Matching

    Full text link
    An efficient team is essential for the company to successfully complete new projects. To solve the team formation problem considering person-job matching (TFP-PJM), a 0-1 integer programming model is constructed, which considers both person-job matching and team members' willingness to communicate on team efficiency, with the person-job matching score calculated using intuitionistic fuzzy numbers. Then, a reinforcement learning-assisted genetic programming algorithm (RL-GP) is proposed to enhance the quality of solutions. The RL-GP adopts the ensemble population strategies. Before the population evolution at each generation, the agent selects one from four population search modes according to the information obtained, thus realizing a sound balance of exploration and exploitation. In addition, surrogate models are used in the algorithm to evaluate the formation plans generated by individuals, which speeds up the algorithm learning process. Afterward, a series of comparison experiments are conducted to verify the overall performance of RL-GP and the effectiveness of the improved strategies within the algorithm. The hyper-heuristic rules obtained through efficient learning can be utilized as decision-making aids when forming project teams. This study reveals the advantages of reinforcement learning methods, ensemble strategies, and the surrogate model applied to the GP framework. The diversity and intelligent selection of search patterns along with fast adaptation evaluation, are distinct features that enable RL-GP to be deployed in real-world enterprise environments.Comment: 16 page

    Artificial Minds

    Get PDF
    This paper explores the artistic possibilities of artificial intelligence, as well as its ability to act as a creative being through its learned knowledge from the collective consciousness of human beings, whether this learned knowledge can be used by the AI to represent reality, and whether this can be problematic regarding learned biases from the preexisting ones of our own. Looking at the history of how far artificial intelligence has come within the creative artistic realm, examining the technical aspects of how exactly an AI is able to generate original art, and examining four artists that all collaborate with artificially intelligent computer system in very diverse and unique ways, whether through video art, physical pencil drawings, or GAN generated imagery to create original works of art, the paperinvestigates whether the resulting artworks can be considered creative productions, whether AI can be taught artistic skills, whether these artistic skills can be implemented in representations of reality, and whether the AI can potentially inherit human biases in the process

    Autonomous Navigation in Rows of Trees and High Crops with Deep Semantic Segmentation

    Full text link
    Segmentation-based autonomous navigation has recently been proposed as a promising methodology to guide robotic platforms through crop rows without requiring precise GPS localization. However, existing methods are limited to scenarios where the centre of the row can be identified thanks to the sharp distinction between the plants and the sky. However, GPS signal obstruction mainly occurs in the case of tall, dense vegetation, such as high tree rows and orchards. In this work, we extend the segmentation-based robotic guidance to those scenarios where canopies and branches occlude the sky and hinder the usage of GPS and previous methods, increasing the overall robustness and adaptability of the control algorithm. Extensive experimentation on several realistic simulated tree fields and vineyards demonstrates the competitive advantages of the proposed solution

    Développement d’un système intelligent de reconnaissance automatisée pour la caractérisation des états de surface de la chaussée en temps réel par une approche multicapteurs

    Get PDF
    Le rôle d’un service dédié à l’analyse de la météo routière est d’émettre des prévisions et des avertissements aux usagers quant à l’état de la chaussée, permettant ainsi d’anticiper les conditions de circulations dangereuses, notamment en période hivernale. Il est donc important de définir l’état de chaussée en tout temps. L’objectif de ce projet est donc de développer un système de détection multicapteurs automatisée pour la caractérisation en temps réel des états de surface de la chaussée (neige, glace, humide, sec). Ce mémoire se focalise donc sur le développement d’une méthode de fusion de données images et sons par apprentissage profond basée sur la théorie de Dempster-Shafer. Les mesures directes pour l’acquisition des données qui ont servi à l’entrainement du modèle de fusion ont été effectuées à l’aide de deux capteurs à faible coût disponibles dans le commerce. Le premier capteur est une caméra pour enregistrer des vidéos de la surface de la route. Le second capteur est un microphone pour enregistrer le bruit de l’interaction pneu-chaussée qui caractérise chaque état de surface. La finalité de ce système est de pouvoir fonctionner sur un nano-ordinateur pour l’acquisition, le traitement et la diffusion de l’information en temps réel afin d’avertir les services d’entretien routier ainsi que les usagers de la route. De façon précise, le système se présente comme suit :1) une architecture d’apprentissage profond classifiant chaque état de surface à partir des images issues de la vidéo sous forme de probabilités ; 2) une architecture d’apprentissage profond classifiant chaque état de surface à partir du son sous forme de probabilités ; 3) les probabilités issues de chaque architecture ont été ensuite introduites dans le modèle de fusion pour obtenir la décision finale. Afin que le système soit léger et moins coûteux, il a été développé à partir d’architectures alliant légèreté et précision à savoir Squeeznet pour les images et M5 pour le son. Lors de la validation, le système a démontré une bonne performance pour la détection des états surface avec notamment 87,9 % pour la glace noire et 97 % pour la neige fondante
    • …
    corecore