5,342 research outputs found

    Grounding semantics in robots for Visual Question Answering

    In this thesis I describe an operational implementation of an object detection and description system that is incorporated into an end-to-end Visual Question Answering system, and I evaluate it on two visual question answering datasets for compositional language and elementary visual reasoning.

    Examining trade-offs between social, psychological, and energy potential of urban form

    Urban planners are often challenged with the task of developing design solutions that must meet multiple, and often contradictory, criteria. In this paper, we investigated the trade-offs between the social, psychological, and energy potential of the fundamental elements of urban form: the street network and the building massing. Since formal methods to evaluate urban form from the psychological and social point of view are not readily available, we developed a methodological framework to quantify these criteria as the first contribution of this paper. To evaluate the psychological potential, we conducted a three-tiered empirical study starting from real-world environments and then abstracting them to virtual environments. In each context, the implicit (physiological) and explicit (subjective) responses of pedestrians were measured. To quantify the social potential, we developed a street network centrality-based measure of social accessibility. For the energy potential, we created an energy model to analyze the impact of pure geometric form on the energy demand of the building stock. The second contribution of this work is a method to identify distinct clusters of urban form and, for each, explore the trade-offs between the selected design criteria. We applied this method to two case studies, identifying nine types of urban form and their respective potential trade-offs, which are directly applicable to the assessment of strategic decisions regarding urban form during the early planning stages.

    3D-PreMise: Can Large Language Models Generate 3D Shapes with Sharp Features and Parametric Control?

    Recent advancements in implicit 3D representations and generative models have markedly propelled the field of 3D object generation forward. However, it remains a significant challenge to accurately model geometries with defined sharp features under parametric controls, which is crucial in fields like industrial design and manufacturing. To bridge this gap, we introduce a framework that employs Large Language Models (LLMs) to generate text-driven 3D shapes, manipulating 3D software via program synthesis. We present 3D-PreMise, a dataset specifically tailored for 3D parametric modeling of industrial shapes, designed to explore state-of-the-art LLMs within our proposed pipeline. Our work reveals effective generation strategies and delves into the self-correction capabilities of LLMs using a visual interface. It also highlights both the potential and limitations of LLMs in 3D parametric modeling for industrial applications. Comment: 10 pages, 6 figures

    Toward knowledge-based automatic 3D spatial topological modeling from LiDAR point clouds for urban areas

    The processing of very large LiDAR datasets is costly and calls for automatic 3D modeling approaches. In addition, incomplete point clouds caused by occlusion and uneven density, together with the uncertainties in processing LiDAR data, make the automatic creation of semantically enriched 3D models difficult. This research aims to develop new solutions for the automatic creation of complete 3D geometric models with semantic labels from incomplete point clouds. A framework that integrates knowledge about objects in urban scenes into 3D modeling is proposed to improve the completeness of 3D geometric models, using qualitative reasoning based on semantic information about objects and their components and on their geometric and spatial relations. Moreover, we aim to take advantage of qualitative knowledge of objects in automatic feature recognition and, further, in the creation of complete 3D geometric models from incomplete point clouds. To achieve this goal, several algorithms are proposed for automatic segmentation, identification of the topological relations between object components, feature recognition, and the creation of complete 3D geometric models. (1) Machine learning solutions are proposed for automatic semantic segmentation and CAD-like segmentation in order to segment objects with complex structures. (2) An algorithm is proposed to efficiently identify topological relationships between object components extracted from point clouds in order to assemble a Boundary Representation model. (3) Object knowledge and feature recognition are integrated to automatically obtain semantic labels for objects and their components. To deal with uncertain information, a rule-based automatic uncertain reasoning solution was developed to recognize building components from uncertain information extracted from point clouds. (4) A heuristic method for creating complete 3D geometric models was designed using building knowledge, geometric and topological relations of building components, and semantic information obtained from feature recognition. Finally, the proposed framework for improving automatic 3D modeling from point clouds of urban areas was validated by a case study aimed at creating a complete 3D building model. Experiments demonstrate that integrating knowledge into the steps of 3D modeling is effective in creating a complete building model from incomplete point clouds.

    Task planning for table clearing of cluttered objects

    Manipulation planning is a field of study attracting increasing interest. It combines manipulation skills with an artificial intelligence system that is able to find the optimal sequence of actions to solve manipulation problems. It is a complex problem because it involves a mixture of symbolic planning and geometric planning: to complete the task, the sequence of actions has to satisfy a set of geometric restrictions. In this thesis we present a planning system for clearing a table of cluttered objects, which handles geometric restrictions within symbolic planning using a backtracking approach. The main contribution of this thesis is a planning system able to solve a wider variety of scenarios for clearing a table of cluttered objects. Grasping actions alone are not enough, and pushing actions may be needed to move an object to a pose in which it can be grasped. The planning system presented here can reason about sequences of pushing and grasping actions that allow a robot to grasp an object that was not initially graspable. This work shows that some geometric problems can be handled efficiently by reasoning at an abstract level through symbolic predicates, provided such predicates are chosen correctly. The advantages of this system are a reduction in execution time and ease of implementation. This master's thesis was developed at the Institut de Robòtica i Informàtica Industrial (IRI), in the Perception and Manipulation laboratory, under the supervision of David Martínez Martínez as director and Guillem Alenyà Ribas as co-director.

    On Space Syntax as a Configurational Theory of Architecture from a Situated Observer's Viewpoint

    A configurational theory of architecture (CTA) from a situated observer’s viewpoint (SOV) is discussed. It includes the levels of description-proper, representation, and interpretation. It takes a bottom-up approach because a situated observer, who is on the ground with a building, typically builds her understanding of the building using immediately available elements, called perceptual primitives. Evidence from geometry, psychology/cognition, and spatial reasoning suggests that the level of description-proper of a CTA from a SOV must include unambiguously defined perceptual primitives and their perceivable elementary topological and projective relations. Subsequently, in the levels of representation and interpretation, any complex relational properties of buildings must be constructed and their meanings must be explained using these perceptual primitives. Early space syntax (SS), with its foundations defined using such perceptual primitives as convex space and axial lines, helps capture the structure of visual experience of buildings but has limitations regarding a CTA from a SOV. More recently, SS theorists have revised the foundations of SS using much simpler perceptual primitives in an attempt to integrate the apparently disparate techniques of SS into a coherent mathematical system. As a result, they have eliminated many limitations of early SS regarding a CTA from a SOV. However, in order to become a CTA from a SOV, SS will still need to explain the importance of these newly defined perceptual primitives and provide a framework for configurational studies using the mathematical system developed based on these primitives.

    Description Logic for Scene Understanding at the Example of Urban Road Intersections

    Understanding a natural scene on the basis of external sensors is a task yet to be solved by computer algorithms. The present thesis investigates the suitability for this task of a particular family of explicit, formal representation and reasoning formalisms, which are subsumed under the term Description Logic.