
    Modelado jerárquico de objetos 3D con superficies de subdivisión (Hierarchical modelling of 3D objects with subdivision surfaces)

    Subdivision surfaces (SSs) are a powerful paradigm for modelling 3D objects that bridges the two traditional approaches to surface approximation, polygonal meshes and curved patches, each of which has its own problems. Subdivision schemes make it possible to define a (piecewise) smooth surface, of the kind most common in practice, as the limit of a recursive refinement process applied to a coarse control mesh, which can be described very compactly. Moreover, the recursion inherent in SSs naturally establishes a pyramidal nesting relationship between the successively generated meshes, or levels of detail (LODs), so SSs lend themselves extremely well to wavelet-based multiresolution analysis (MRA) of surfaces, which has immediate and very interesting practical applications such as the hierarchical coding and editing of 3D models. We begin by describing the links between the three areas on which our work is based (SSs, automatic LOD extraction and wavelet-based MRA) to explain how these three pieces of the puzzle of hierarchical 3D object modelling with SSs fit together. Wavelet-based MRA consists of decomposing a function into a coarse version of itself and a set of hierarchically nested additive refinements called "wavelet coefficients". Classical wavelet theory studies classical nD signals: those defined on parametric domains homeomorphic to R^n or (0,1)^n, such as audio (n=1), images (n=2) or video (n=3). On less trivial topologies, such as 2D manifolds (surfaces in 3D space), wavelet-based MRA is not as straightforward, but it remains possible when approached from the perspective of SSs.
It suffices to start from a coarse mesh that approximates the surface at a low LOD, subdivide it recursively and, in doing so, add the wavelet coefficients, which are the 3D details needed to obtain finer and finer approximations to the original surface. We then turn to the practical applications that constitute our main original development and, in particular, present a hierarchical coding technique for 3D models based on SSs that operates on the 3D details mentioned above: it expresses them in a local normal frame; organizes them in a facet-based hierarchical structure; quantizes them, devoting fewer bits to their less energetic tangential components, and "scalarizes" them; and finally codes them with a technique similar to the SPIHT (Set Partitioning In Hierarchical Trees) of Said and Pearlman. The result is a fully embedded code that is, for mostly smooth surfaces, at least twice as compact as those obtained with previously published progressive coding techniques for 3D meshes, in which, moreover, the LODs are not pyramidally nested. Finally, we describe several auxiliary methods that we have developed, improving previous techniques and creating our own, since a complete solution to 3D object modelling with SSs requires solving two further problems. The first is the extraction of a base mesh (triangular, in our case) from the original surface, usually given as a fine triangular mesh with arbitrary connectivity. The second is the generation of a recursive remeshing with subdivision connectivity of the original/target mesh by recursively refining the base mesh, thereby computing the 3D details needed to correct the positions predicted by subdivision for the new vertices.
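
The coarse-plus-details decomposition described in this abstract can be illustrated with a deliberately simplified 1D analogue. The sketch below is an assumption for illustration only, not the scheme used in the thesis: it works on a closed polyline instead of a triangle mesh, uses midpoint (linear) subdivision as the predictor, and the function names `analyze` and `synthesize` are hypothetical.

```python
import numpy as np

def analyze(fine):
    """Split a closed polyline (even number of points) into a coarse
    polyline and the detail vectors that subdivision fails to predict."""
    coarse = fine[0::2]
    # Linear subdivision predicts each odd point as the midpoint of its
    # two even neighbours (np.roll wraps around for the closed curve).
    predicted = 0.5 * (coarse + np.roll(coarse, -1, axis=0))
    details = fine[1::2] - predicted
    return coarse, details

def synthesize(coarse, details):
    """Subdivide the coarse polyline and add the details back."""
    predicted = 0.5 * (coarse + np.roll(coarse, -1, axis=0))
    fine = np.empty((2 * len(coarse), coarse.shape[1]))
    fine[0::2] = coarse
    fine[1::2] = predicted + details
    return fine

# Usage: an 8-point circle is split into a 4-point square plus 4 details.
t = np.linspace(0.0, 2.0 * np.pi, 8, endpoint=False)
circle = np.stack([np.cos(t), np.sin(t)], axis=1)
coarse, details = analyze(circle)
rebuilt = synthesize(coarse, details)
```

Applying `analyze` repeatedly to the coarse output would give the nested pyramid of levels of detail; the detail vectors play the role that the abstract assigns to the wavelet coefficients.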

    Compression of 3D models with NURBS

    With recent progress in computing, algorithmics and telecommunications, 3D models are increasingly used in various multimedia applications. Examples include visualization, gaming, entertainment and virtual reality. In the multimedia domain 3D models have traditionally been represented as polygonal meshes. This piecewise planar representation can be thought of as the analogue of bitmap images for 3D surfaces. Like bitmap images, they enjoy great flexibility and are particularly well suited to describing information captured from the real world, through, for instance, scanning processes. They suffer, however, from the same shortcomings, namely limited resolution and large storage size. The compression of polygonal meshes has been a very active field of research in the last decade and rather efficient compression algorithms have been proposed in the literature that greatly mitigate the high storage costs. However, such a low level description of a 3D shape has a bounded performance. More efficient compression should be reachable through the use of higher level primitives. This idea has been explored to a great extent in the context of model based coding of visual information. In such an approach, when compressing the visual information a higher level representation (e.g., 3D model of a talking head) is obtained through analysis methods. This can be seen as an inverse projection problem. Once this task is fulfilled, the resulting parameters of the model are coded instead of the original information. It is believed that if the analysis module is efficient enough, the total cost of coding (in a rate distortion sense) will be greatly reduced. The relatively poor performance and high complexity of currently available analysis methods (except for specific cases where a priori knowledge about the nature of the objects is available) have held back wide deployment of coding techniques based on such an approach. 
Progress in computer graphics has however changed this situation. In fact, nowadays, an increasing number of pictures, video and 3D content are generated by synthesis processing rather than coming from a capture device such as a camera or a scanner. This means that the underlying model in the synthesis stage can be used for their efficient coding without the need for a complex analysis module. In other words it would be a mistake to attempt to compress a low level description (e.g., a polygonal mesh) when a higher level one is available from the synthesis process (e.g., a parametric surface). This is, however, what is usually done in the multimedia domain, where higher level 3D model descriptions are converted to polygonal meshes, if only for lack of standard coded formats for the former. On a parallel but related path, the way we consume audio-visual information is changing. In contrast to the recent past and a large part of today's applications, interactivity is becoming a key element in the way we consume information. In the context of interest in this dissertation, this means that when coding visual information (an image or a video for instance), previously obvious considerations such as the choice of sampling parameters are not so obvious anymore. In fact, since in an interactive environment the effective display resolution can be controlled by the user through zooming, there is no clear optimal setting for the sampling period. This means that because of interactivity, the representation used to code the scene should allow the display of objects in a variety of resolutions, and ideally up to infinity. One way to resolve this problem would be by extensive over-sampling. But this approach is unrealistic and too expensive to implement in many situations. The alternative would be to use a resolution independent representation. In the realm of 3D modeling, such representations are usually available when the models are created by an artist on a computer. 
The scope of this dissertation is precisely the compression of 3D models in higher level forms. The direct coding in such a form should yield improved rate-distortion performance while providing a large degree of resolution independence. There has not been, so far, any major attempt to efficiently compress these representations, such as parametric surfaces. This thesis proposes a solution to fill this gap. A variety of higher level 3D representations exist, of which parametric surfaces are a popular choice among designers. Within parametric surfaces, Non-Uniform Rational B-Splines (NURBS) enjoy great popularity as a wide range of NURBS based modeling tools are readily available. Recently, NURBS have been included in the Virtual Reality Modeling Language (VRML) and its next generation descendant eXtensible 3D (X3D). The nice properties of NURBS and their widespread use have led us to choose them as the form we use for the coded representation. The primary goal of this dissertation is the definition of a system for coding 3D NURBS models with guaranteed distortion. The basis of the system is entropy coded differential pulse code modulation (DPCM). In the case of NURBS, guaranteeing the distortion is not trivial, as some of its parameters (e.g., knots) have a complicated influence on the overall surface distortion. To this end, a detailed distortion analysis is performed. In particular, previously unknown relations between the distortion of knots and the resulting surface distortion are demonstrated. Compression efficiency is pursued at every stage and simple yet efficient entropy coder realizations are defined. The special case of degenerate and closed surfaces with duplicate control points is addressed and an efficient yet simple coding is proposed to compress the duplicate relationships. Encoder aspects are also analyzed. Optimal predictors are found that perform well across a wide class of models. 
Simplification techniques are also considered for improved compression efficiency at negligible distortion cost. Transmission over error prone channels is also considered and an error resilient extension is defined. The data stream is partitioned by independently coding small groups of surfaces and inserting the necessary resynchronization markers. Simple strategies for achieving the desired level of protection are proposed. The same extension also serves the purpose of random access and on-the-fly reordering of the data stream.
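
The DPCM stage mentioned in this abstract can be sketched on a scalar sequence, for instance one coordinate of successive control points. This is a minimal sketch under stated assumptions, not the dissertation's coder: the predictor is simply the previously reconstructed value, the quantizer is uniform with a hypothetical step `step`, and the entropy coder that would follow is omitted. The names `dpcm_encode` and `dpcm_decode` are illustrative.

```python
import numpy as np

def dpcm_encode(values, step):
    """Quantize prediction residuals with a uniform step.
    Predictor (an assumption): the previously reconstructed value."""
    symbols = []
    prev = 0.0
    for v in values:
        q = int(round((float(v) - prev) / step))
        symbols.append(q)
        # Track the decoder's reconstruction, not the original value,
        # so that quantization errors do not accumulate.
        prev = prev + q * step
    return symbols

def dpcm_decode(symbols, step):
    """Rebuild the sequence by accumulating dequantized residuals."""
    out = []
    prev = 0.0
    for q in symbols:
        prev = prev + q * step
        out.append(prev)
    return np.array(out)

# Usage: each reconstructed sample stays within step / 2 of the original.
coords = np.array([0.00, 0.13, 0.31, 0.52, 0.49, 0.47])
symbols = dpcm_encode(coords, step=0.05)
decoded = dpcm_decode(symbols, step=0.05)
```

Because the encoder predicts from reconstructed values, the per-sample error of this sketch is bounded by half the quantization step. The abstract's point is that such a bound on a parameter (e.g., a knot) does not translate trivially into a bound on surface distortion, which is why a dedicated analysis is needed.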

    A Panorama on Multiscale Geometric Representations, Intertwining Spatial, Directional and Frequency Selectivity

    The richness of natural images makes the quest for optimal representations in image processing and computer vision challenging. The latter observation has not prevented the design of image representations, which trade off between efficiency and complexity, while achieving accurate rendering of smooth regions as well as reproducing faithful contours and textures. The most recent ones, proposed in the past decade, share a hybrid heritage highlighting the multiscale and oriented nature of edges and patterns in images. This paper presents a panorama of the aforementioned literature on decompositions in multiscale, multi-orientation bases or dictionaries. They typically exhibit redundancy to improve sparsity in the transformed domain and sometimes its invariance with respect to simple geometric deformations (translation, rotation). Oriented multiscale dictionaries extend traditional wavelet processing and may offer rotation invariance. Highly redundant dictionaries require specific algorithms to simplify the search for an efficient (sparse) representation. We also discuss the extension of multiscale geometric decompositions to non-Euclidean domains such as the sphere or arbitrary meshed surfaces. The etymology of panorama suggests an overview, based on a choice of partially overlapping "pictures". We hope that this paper will contribute to the appreciation and apprehension of a stream of current research directions in image understanding. Comment: 65 pages, 33 figures, 303 references

    The 1st Conference of PhD Students in Computer Science


    Exploiting Spatio-Temporal Coherence for Video Object Detection in Robotics

    This paper proposes a method to enhance video object detection for indoor environments in robotics. Concretely, it exploits knowledge about the camera motion between frames to propagate previously detected objects to successive frames. The proposal is rooted in the concepts of planar homography, to propose regions of interest in which to find objects, and recursive Bayesian filtering, to integrate observations over time. The proposal is evaluated on six virtual, indoor environments, accounting for the detection of nine object classes over a total of ∼7k frames. Results show that our proposal improves the recall and the F1-score by a factor of 1.41 and 1.27, respectively, and achieves a significant reduction of the object categorization entropy (58.8%) when compared to a two-stage video object detection method used as baseline, at the cost of small time overheads (120 ms) and precision loss (0.92).
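
The recursive Bayesian filtering step named in this abstract can be sketched as a repeated update of a belief over object categories. This is a generic sketch, assuming per-frame detector scores can be treated as likelihoods and that frames are conditionally independent; it is not the paper's exact formulation, and the names `bayes_update` and `entropy_bits` are hypothetical.

```python
import numpy as np

def bayes_update(prior, likelihood):
    """One recursive Bayes step: posterior is proportional to
    prior times likelihood, renormalized to sum to one."""
    posterior = prior * likelihood
    return posterior / posterior.sum()

def entropy_bits(p):
    """Shannon entropy (in bits) of a categorical distribution."""
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

# Usage: start from a uniform belief over three hypothetical classes and
# fuse two frames in which the detector favours the first class.
belief = np.full(3, 1.0 / 3.0)
frame_scores = np.array([0.6, 0.3, 0.1])  # assumed detector likelihoods
for _ in range(2):
    belief = bayes_update(belief, frame_scores)
```

Consistent observations concentrate the belief on one class, so its entropy drops; this is the kind of categorization-entropy reduction the abstract reports.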

    Robust feature-based 3D mesh segmentation and visual mask with application to QIM 3D watermarking

    The last decade has seen the emergence of 3D meshes in industrial, medical and entertainment applications. Many researchers, from both the academic and the industrial sectors, have become aware of the intellectual property protection issues arising with their increasing use. The context of this master thesis is related to digital rights management (DRM) issues and more particularly to 3D digital watermarking, which is a technical tool that, by means of hiding secret information, can offer copyright protection, content authentication, content tracking (fingerprinting), steganography (secret communication inside another media), content enrichment etc. Up to now, non-blind 3D watermarking schemes have reached good levels in terms of robustness against a large set of attacks which 3D models can undergo (such as noise addition, decimation, reordering, remeshing, etc.). Unfortunately, so far blind 3D watermarking schemes do not present a good resistance to de-synchronization attacks (such as cropping or resampling). This work focuses on improving the application to 3D watermarking of Spread Transform Dither Modulation (STDM), which is an extension of Quantization Index Modulation (QIM), through both the use of the perceptual model presented, which provides good robustness against noising and smoothing attacks, and the application of an algorithm, based on robust feature detection, which provides robustness against reordering and cropping attacks. As with other watermarking techniques, the imperceptibility constraint is very important for 3D object watermarking. For this reason, this thesis also explores the perception of the distortions related to the watermark embedding process as well as to the alterations produced by the attacks that a mesh can undergo.
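
The quantization-based embedding that this abstract builds on can be sketched with plain scalar QIM. This is a minimal illustration under stated assumptions: it embeds one bit in one scalar (for example a vertex-derived feature), uses two lattices shifted by half a step, and does not include the spread transform, the perceptual mask or the feature-based synchronization of the thesis. The names `qim_embed`, `qim_detect` and the step `delta` are hypothetical.

```python
import numpy as np

def qim_embed(x, bit, delta):
    """Quantize x onto the lattice associated with `bit`:
    multiples of delta for bit 0, shifted by delta / 2 for bit 1."""
    offset = bit * delta / 2.0
    return float(np.round((x - offset) / delta) * delta + offset)

def qim_detect(y, delta):
    """Blind detection: return the bit whose lattice is closest to y."""
    d0 = abs(y - qim_embed(y, 0, delta))
    d1 = abs(y - qim_embed(y, 1, delta))
    return 0 if d0 <= d1 else 1

# Usage: embed a bit, perturb the marked value, and recover the bit.
marked = qim_embed(3.37, 1, delta=1.0)
recovered = qim_detect(marked + 0.2, delta=1.0)
```

In this sketch the embedding moves the host value by at most `delta / 2`, and detection tolerates perturbations smaller than `delta / 4`. The step size therefore trades imperceptibility against robustness, which is the trade-off a perceptual mask is meant to manage.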