192 research outputs found

    Automatic Reconstruction of Parametric, Volumetric Building Models from 3D Point Clouds

    Planning, construction, modification, and analysis of buildings require means of representing a building's physical structure and related semantics in a meaningful way. With the rise of novel technologies and increasing requirements in the architecture, engineering and construction (AEC) domain, two general concepts for representing buildings have gained particular attention in recent years. First, Building Information Modeling (BIM) is increasingly used as a modern means of digitally representing and managing a building's as-planned state, including not only a geometric model but also various additional semantic properties. Second, point cloud measurements obtained by laser scanning are now widely used for capturing a building's as-built condition. A particular challenge and topic of current research is the development of methods that combine the strengths of point cloud measurements and BIM concepts to quickly obtain accurate building models from measured data. In this thesis, we present our recent approaches to the intertwined challenges of automated indoor point cloud interpretation using targeted segmentation methods, and of automatic reconstruction of high-level, parametric and volumetric building models as the basis for further use in BIM scenarios. In contrast to most reconstruction methods available at the time, we fundamentally base our approaches on BIM principles and standards, and overcome critical limitations of previous approaches in order to reconstruct globally plausible, volumetric, and parametric models.

    From pixels to gestures: learning visual representations for human analysis in color and depth data sequences

    The visual analysis of humans from images is an important research topic because of its relevance to many computer vision applications, such as pedestrian detection, monitoring and surveillance, human-computer interaction, e-health, and content-based image retrieval. In this dissertation, we are interested in learning different visual representations of the human body that are helpful for the visual analysis of humans in images and video sequences. To that end, we analyze both RGB and depth image modalities and address the problem along three research lines at different levels of abstraction, from pixels to gestures: human segmentation, human pose estimation, and gesture recognition. First, we show how binary segmentation (object vs. background) of the human body in image sequences helps to remove the background clutter present in the scene. The presented method, based on "Graph cuts" optimization, enforces spatio-temporal consistency of the produced segmentation masks across consecutive frames. Second, we present a framework for multi-label segmentation that yields much more detailed segmentation masks: instead of a binary representation separating the human body from the background, finer masks can be obtained that separate and categorize the different body parts. At a higher level of abstraction, we aim for a simpler yet still descriptive representation of the human body. Human pose estimation methods usually rely on skeletal models of the human body, formed by segments (or rectangles) that represent the body limbs, connected according to the kinematic constraints of the human body. In practice, such skeletal models must fulfill certain constraints in order to allow for efficient inference of the optimal solution, but these same constraints limit the expressiveness of the model. To cope with this, we introduce a top-down approach for predicting the position of the body parts in the model, using a mid-level part representation based on Poselets. Finally, we propose a framework for gesture recognition based on the bag of visual words model. We leverage the benefits of the RGB and depth image modalities by combining modality-specific visual vocabularies in a late fusion fashion. A new rotation-invariant depth descriptor is presented, yielding better results than other state-of-the-art descriptors. Moreover, spatio-temporal pyramids are used to encode rough spatial and temporal structure. In addition, we present a probabilistic reformulation of Dynamic Time Warping for gesture segmentation in video sequences. A Gaussian-based probabilistic model of a gesture is learnt, implicitly encoding possible deformations in both the spatial and time domains.
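
As background for the last point, the following is a minimal sketch of classic (deterministic) Dynamic Time Warping, the technique that the abstract says is reformulated probabilistically. It is not the thesis's method: the function name `dtw_distance`, the use of Euclidean distance as the local cost, and the representation of a gesture as a list of per-frame feature tuples are all assumptions made for illustration.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Classic DTW cost between two sequences of equal-dimension feature tuples.

    Hypothetical helper for illustration only; the local cost is plain
    Euclidean distance, not a learnt Gaussian model.
    """
    n, m = len(seq_a), len(seq_b)
    # cost[i][j] = minimal accumulated cost of aligning seq_a[:i] with seq_b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(seq_a[i - 1], seq_b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # seq_a advances, seq_b frame is repeated
                cost[i][j - 1],      # seq_b advances, seq_a frame is repeated
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# The second sequence is the first with one frame held longer,
# so the warping absorbs the difference in timing.
slow = [(0.0,), (1.0,), (1.0,), (2.0,)]
fast = [(0.0,), (1.0,), (2.0,)]
print(dtw_distance(fast, slow))
```

The dynamic-programming table is what lets two performances of the same gesture at different speeds be compared frame by frame.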

    Heterogeneous volumetric data mapping and its medical applications

    With the advance of data acquisition techniques, massive solid geometries are now collected routinely in scientific tasks. These complex and unstructured data need to be effectively correlated for various kinds of processing and analysis. Volumetric mapping computes a bijective, low-distortion correspondence between 3D geometric data sets, and can serve as an important preprocessing step in many tasks in computer-aided design and analysis, industrial manufacturing, and medical image analysis, to name a few. This dissertation studied two important volumetric mapping problems: the mapping of heterogeneous volumes (with nonuniform inner structures or layers) and the mapping of sequential dynamic volumes. To handle heterogeneous volumes effectively, we first studied feature-aligned harmonic volumetric mapping. Compared to previous harmonic mapping, it supports the alignment of points, curves, and iso-surfaces, which are important low-dimensional structures in heterogeneous volumetric data. Second, we proposed a biharmonic model for volumetric mapping. Unlike conventional harmonic volumetric mapping, which only supports positional continuity on the boundary, this new model allows higher-order ($C^1$) continuity along the boundary surface. This suggests a potential model for solving the volumetric mapping of large, complex geometries through divide-and-conquer. We also studied the medical applications of our volumetric mapping in lung tumor respiratory motion modeling. We have been building a digital platform for lung tumor radiotherapy based on effective volumetric CT/MRI image matching and analysis. We developed and integrated into this platform a set of geometric and image processing techniques, including advanced image segmentation, finite element meshing, volumetric registration, and interpolation. The lung, the tumor, and the surrounding tissues are treated as a heterogeneous region, and a dynamic 4D registration framework is developed for lung tumor motion modeling and tracking. Compared to the previous 3D pairwise registration, our new 4D parameterization model leads to significantly improved registration accuracy. The constructed deforming model can hence approximate the deformation of the tissues and tumor.
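
For context, the distinction between the harmonic and biharmonic models can be written in its textbook form. The notation below is assumed here for illustration and is not taken from the dissertation: $\Omega$ is the volume, $\phi$ is one coordinate of the map, and $g$ and $h$ are prescribed boundary data. The dissertation's actual formulation may differ, for example by adding feature-alignment constraints.

```latex
% Harmonic model: only boundary positions can be prescribed.
\Delta \phi = 0 \quad \text{in } \Omega,
\qquad \phi = g \quad \text{on } \partial\Omega

% Biharmonic model: boundary positions and normal derivatives can both be prescribed.
\Delta^2 \phi = 0 \quad \text{in } \Omega,
\qquad \phi = g, \quad \frac{\partial \phi}{\partial n} = h \quad \text{on } \partial\Omega
```

Because the biharmonic problem accepts a normal-derivative condition alongside the boundary positions, sub-volumes solved separately can be made to agree in both value and first derivative across a shared interface, which is what makes a divide-and-conquer strategy plausible.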