521 research outputs found

    3D Scene Geometry Estimation from 360° Imagery: A Survey

    This paper provides a comprehensive survey of pioneering and state-of-the-art 3D scene geometry estimation methodologies based on single, two, or multiple images captured with omnidirectional optics. We first revisit the basic concepts of the spherical camera model, and review the most common acquisition technologies and representation formats suitable for omnidirectional (also called 360°, spherical or panoramic) images and videos. We then survey monocular layout and depth inference approaches, highlighting the recent advances in learning-based solutions suited for spherical data. Classical stereo matching is then revisited in the spherical domain, where methodologies for detecting and describing sparse and dense features become crucial. The stereo matching concepts are then extended to multiple-view camera setups, categorized into light fields, multi-view stereo, and structure from motion (or visual simultaneous localization and mapping). We also compile and discuss commonly adopted datasets and figures of merit for each purpose, and list recent results for completeness. We conclude this paper by pointing out current and future trends. Comment: Published in ACM Computing Survey
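
    The spherical camera model and the equirectangular representation mentioned in the abstract above can be illustrated with a small sketch. The axis convention below (image centre looking along +z, +x to the right, +y up) and the function names are assumptions made for this example; they are not taken from the survey.

```python
import math

def equirect_pixel_to_ray(u, v, width, height):
    """Map pixel (u, v) of an equirectangular image to a unit ray direction.

    Assumed convention: longitude spans [-pi, pi) across the image width,
    latitude runs from +pi/2 (top row) to -pi/2 (bottom row).
    """
    lon = (u / width - 0.5) * 2.0 * math.pi   # horizontal angle
    lat = (0.5 - v / height) * math.pi        # vertical angle
    return (math.cos(lat) * math.sin(lon),    # x: right
            math.sin(lat),                    # y: up
            math.cos(lat) * math.cos(lon))    # z: forward

def ray_to_equirect_pixel(ray, width, height):
    """Inverse mapping: project a 3D direction back to pixel coordinates."""
    x, y, z = ray
    norm = math.sqrt(x * x + y * y + z * z)
    lon = math.atan2(x, z)
    lat = math.asin(max(-1.0, min(1.0, y / norm)))
    u = (lon / (2.0 * math.pi) + 0.5) * width
    v = (0.5 - lat / math.pi) * height
    return u, v
```

    Under this convention the image centre maps to the forward direction (0, 0, 1), and the two functions are inverses of each other away from the poles.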

    Surface Deformation Potentials on Meshes for Computer Graphics and Visualization

    Shape deformation models have been used in computer graphics primarily to describe the dynamics of physical deformations like cloth draping, collisions of elastic bodies, fracture, or animation of hair. Less frequent is their application to problems not directly related to a physical process. In this thesis we apply deformations to three problems in computer graphics that do not correspond to physical deformations. To this end, we generalize the physical model by modifying the energy potential. Originally, the energy potential amounts to the physical work needed to deform a body from its rest state into a given configuration and relates material strain to internal restoring forces that act to restore the original shape. For each of the three problems considered, this potential is adapted to reflect an application-specific notion of shape. Under the influence of further constraints, our generalized deformation results in shapes that balance preservation of certain shape properties and application-specific objectives, similar to physical equilibrium states. The applications discussed in this thesis are surface parameterization, interactive shape editing and automatic design of panorama maps. For surface parameterization, we interpret parameterizations over a planar domain as deformations from a flat initial configuration onto a given surface. In this setting, we review existing parameterization methods by analyzing properties of their potential functions and derive potentials accounting for distortion of geometric properties. Interactive shape editing allows an untrained user to modify complex surfaces by simply grabbing and moving parts of interest. A deformation model interactively extrapolates the transformation from those parts to the rest of the surface. This thesis proposes a differential shape representation for triangle meshes leading to a potential that can be optimized interactively with a simple, tailored algorithm. 
Although the potential is not physically accurate, it results in intuitive deformation behavior and can be parameterized to account for different material properties. Panorama maps are blends between landscape illustrations and geographic maps that are traditionally painted by an artist to convey geographic survey knowledge of public places like ski resorts or national parks. While panorama maps are not drawn to scale, the depicted landscape remains recognizable and the observer can easily recover details necessary for self-location and orientation. At the same time, important features such as trails or ski slopes appear unoccluded and clearly visible. This thesis proposes the first automatic panorama generation method. Its basis is again a surface deformation that establishes the necessary compromise between shape preservation and feature visibility.
Potentials for Surface Deformation on Triangle Meshes for Applications in Computer Graphics and Visualization. Deformation models have so far been used in computer graphics mainly to model the dynamics of physical deformation processes. Common examples are clothing simulations, collisions of elastic bodies, or the animation of hair and hairstyles. Their application to problems that do not directly correspond to physical processes is considerably rarer. In this thesis, deformation models are applied to three problems in computer graphics that do not directly correspond to a physical deformation process. To this end, the physical model is generalized by a suitable modification of the potential energy. The potential energy normally corresponds to the physical work that must be expended to deform a body from its rest state into a given configuration. In addition, it relates the current deformation to internal stress forces that act to restore the original shape. 
In this thesis, we adapt the potential energy for each of the three problem areas considered so that it reflects an application-specific definition of shape. Under the influence of further boundary conditions, the deformation generalized in this way leads to a surface that strikes a balance between preserving certain shape properties and the objectives of the application. This balance corresponds to the equilibrium of a physical deformation. The three applications discussed in this thesis are surface parameterization, interactive editing of surfaces, and the fully automatic generation of panorama maps in the style of Heinrich Berann. For surface parameterization, we interpret parameterizations over a flat parameter domain as deformations that turn an originally planar surface patch into a given surface. Within this scenario, we then compare existing methods for planar parameterization by analyzing the resulting potential energies, and derive further potentials that capture the distortion of geometric properties such as area and angle. Methods for interactive surface editing enable fast and intuitive changes to a complex surface. To this end, the user selects parts of the surface and moves them through space. A deformation model interactively extrapolates the transformation of the selected parts to the rest of the surface. This thesis presents a new differential surface representation for discrete surfaces that leads to a potential that can be optimized simply and interactively. Although the proposed potential is not physically correct, the resulting deformations are intuitive. In addition, certain material properties can be adjusted by means of a parameter. Panorama maps in the style of Heinrich Berann are a fusion of landscape illustration and geographic map. 
Traditionally, they are drawn by hand in such a way that certain features, such as ski slopes or hiking trails in an area, remain unoccluded and clearly visible, which requires great artistry. Although this kind of depiction is not true to scale, deviations are usually not noticeable at first glance. As a result, the viewer can quickly find distinctive details and thus orient themselves within the area. This thesis presents the first fully automatic method for generating panorama maps. It is again based on a generalized surface deformation that aims at both shape preservation and the visibility of given geographic features.

    Visual Dynamics: Stochastic Future Generation via Layered Cross Convolutional Networks

    We study the problem of synthesizing a number of likely future frames from a single input image. In contrast to traditional methods that have tackled this problem in a deterministic or non-parametric way, we propose to model future frames in a probabilistic manner. Our probabilistic model makes it possible for us to sample and synthesize many possible future frames from a single input image. To synthesize realistic movement of objects, we propose a novel network structure, namely a Cross Convolutional Network; this network encodes image and motion information as feature maps and convolutional kernels, respectively. In experiments, our model performs well on synthetic data, such as 2D shapes and animated game sprites, and on real-world video frames. We present analyses of the learned network representations, showing that it implicitly learns a compact encoding of object appearance and motion. We also demonstrate a few of its applications, including visual analogy-making and video extrapolation. Comment: Journal preprint of arXiv:1607.02586 (IEEE TPAMI, 2019). The first two authors contributed equally to this work. Project page: http://visualdynamics.csail.mit.ed
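
    The idea of treating motion as convolutional kernels applied to image feature maps can be sketched in a few lines. This is only an illustration of the core operation for a single 2D map and a single kernel, written in plain Python; the function name, the zero padding, and the hand-written kernel are assumptions for this example, whereas in the paper the kernels would be produced by a network rather than given by hand.

```python
def cross_convolve(feature_map, kernel):
    """Correlate a 2D feature map with a 2D kernel, keeping the input size.

    Unlike an ordinary convolutional layer, the kernel here is treated as
    data (for instance, predicted from motion) rather than a fixed weight.
    Positions outside the feature map are treated as zero (assumption).
    """
    h, w = len(feature_map), len(feature_map[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2                     # offsets to the kernel centre
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for a in range(kh):
                for b in range(kw):
                    y, x = i + a - ph, j + b - pw
                    if 0 <= y < h and 0 <= x < w:
                        acc += feature_map[y][x] * kernel[a][b]
            out[i][j] = acc
    return out

# A kernel with a single 1 left of centre moves the map one pixel to the right.
shift_right = [[0, 0, 0],
               [1, 0, 0],
               [0, 0, 0]]
shifted = cross_convolve([[1, 2, 3], [4, 5, 6]], shift_right)
```

    The example kernel shows why this encoding is natural for motion: different kernels applied to the same appearance features yield differently displaced outputs.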

    The Data Big Bang and the Expanding Digital Universe: High-Dimensional, Complex and Massive Data Sets in an Inflationary Epoch

    Recent and forthcoming advances in instrumentation, and giant new surveys, are creating astronomical data sets that are not amenable to the methods of analysis familiar to astronomers. Traditional methods are often inadequate not merely because of the size in bytes of the data sets, but also because of the complexity of modern data sets. Mathematical limitations of familiar algorithms and techniques in dealing with such data sets create a critical need for new paradigms for the representation, analysis and scientific visualization (as opposed to illustrative visualization) of heterogeneous, multiresolution data across application domains. Some of the problems presented by the new data sets have been addressed by other disciplines such as applied mathematics, statistics and machine learning and have been utilized by other sciences such as space-based geosciences. Unfortunately, valuable results pertaining to these problems are mostly to be found only in publications outside of astronomy. Here we offer brief overviews of a number of concepts, techniques and developments, some "old" and some new. These are generally unknown to most of the astronomical community, but are vital to the analysis and visualization of complex datasets and images. In order for astronomers to take advantage of the richness and complexity of the new era of data, and to be able to identify, adopt, and apply new solutions, the astronomical community needs a certain degree of awareness and understanding of the new concepts. One of the goals of this paper is to help bridge the gap between applied mathematics, artificial intelligence and computer science on the one side and astronomy on the other. Comment: 24 pages, 8 Figures, 1 Table. Accepted for publication: "Advances in Astronomy, special issue "Robotic Astronomy

    Skeleton-aided Articulated Motion Generation

    This work makes the first attempt to generate articulated human motion sequences from a single image. On the one hand, we use paired inputs, comprising human skeleton information as the motion embedding and a single human image as the appearance reference, to generate novel motion frames based on the conditional GAN infrastructure. On the other hand, a triplet loss is employed to encourage appearance smoothness between consecutive frames. As the proposed framework is capable of jointly exploiting the image appearance space and the articulated/kinematic motion space, it generates realistic articulated motion sequences, in contrast to most previous video generation methods, which yield blurred motion effects. We test our model on two human action datasets, KTH and Human3.6M, and the proposed framework generates very promising results on both datasets. Comment: ACM MM 201
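
    For readers unfamiliar with the triplet loss mentioned above, a minimal sketch of its standard form follows. How the paper selects anchor, positive and negative frames, which features it compares, and which margin it uses are not stated in the abstract; the squared Euclidean distance and the margin of 1.0 below are assumptions for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard hinge-style triplet loss.

    The loss is zero once the anchor is closer to the positive than to the
    negative by at least `margin`; otherwise it grows with the violation.
    """
    return max(0.0,
               squared_distance(anchor, positive)
               - squared_distance(anchor, negative)
               + margin)
```

    In a smoothness setting one would plausibly take a frame as the anchor, a temporally adjacent frame as the positive, and a distant or unrelated frame as the negative, but that pairing is a guess rather than the authors' stated scheme.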

    Automated Teeth Extraction and Dental Caries Detection in Panoramic X-ray

    Dental caries is one of the most common chronic diseases, affecting the majority of people at least once during their lifetime. This expensive disease accounts for 5-10% of the healthcare budget in developing countries. Caries lesions appear as the result of dental biofilm metabolic activity, caused by bacteria (most prominently Streptococcus mutans) feeding on uncleaned sugars and starches in the oral cavity. Also known as tooth decay, they are primarily diagnosed by general dentists solely on the basis of clinical assessments. Since in many cases dental problems cannot be detected by simple observation, dental x-ray imaging was introduced as a standard tool for domain experts, i.e. dentists and radiologists, to identify dental diseases such as proximal caries. Among the different dental radiography methods, Panoramic or Orthopantomogram (OPG) images are commonly taken as the initial step toward assessment. OPG images are captured with a small dose of radiation and can depict the patient's entire dentition in a single image. Dental caries can sometimes be hard to identify for general dentists relying only on visual inspection of dental radiographs. Tooth decay can easily be misinterpreted as shadows for various reasons, such as low image quality. Besides, OPG images have poor quality, and structures are not presented with strong edges due to low contrast, uneven exposure, etc. Thus, disease detection using Panoramic radiography is a very challenging task. With the recent development of Artificial Intelligence (AI) in dentistry, and with the introduction of Convolutional Neural Networks (CNNs) for image classification, developing medical decision support systems is becoming a topic of interest in both academia and industry. More accurate CNN-based decision support systems can enhance dentists' diagnostic performance, resulting in improved dental care for patients. 
In this thesis, the first automated teeth extraction system for Panoramic images, using evolutionary algorithms, is proposed. In contrast to intraoral radiography methods, Panoramic images are captured with the x-ray film outside the patient's mouth. Therefore, Panoramic x-rays contain regions outside the jaw, which make teeth segmentation extremely difficult. Since a separate image of each tooth is needed to build a caries detection model, segmentation of the teeth from the OPG image is essential. Due to the absence of a significant pixel intensity difference between different regions in OPG radiography, teeth segmentation is very hard to implement. Consequently, an automated system is introduced that takes an OPG as input and gives images of single teeth as output. Since only a few research studies address a similar task for Panoramic radiography, there is room for improvement. A genetic algorithm is applied along with different image processing methods to perform teeth extraction by jaw extraction, jaw separation, and teeth-gap valley detection, in that order. The proposed system is compared to the state of the art in teeth extraction on other image types. After the teeth are segmented from each image, a model based on various untrained and pretrained CNN-based architectures is proposed to detect dental caries for each tooth. An autoencoder-based model along with well-known CNN architectures is used for feature extraction, followed by capsule networks to perform classification. The dataset of Panoramic x-rays was prepared by the authors, with help from an expert radiologist to provide labels. The proposed model has demonstrated an acceptable detection rate of 86.05%, and an increase in caries detection speed. Considering the challenges of performing such a task on low-quality OPG images, this work is a step towards developing a fully automated, efficient caries detection model to assist domain experts

    Image-based rendering and synthesis

    Multiview imaging (MVI) is currently the focus of some research as it has a wide range of applications and opens up research in other topics and applications, including virtual view synthesis for three-dimensional (3D) television (3DTV) and entertainment. However, multiview systems need a large amount of storage and are difficult to construct. Image-based rendering (IBR) is the concept that allows 3D scenes and objects to be visualized in a realistic way without full 3D model reconstruction. Using images as the primary substrate, IBR has many potential applications, including video games, virtual travel and others. The technique creates new views of scenes, which are reconstructed from a collection of densely sampled images or videos. IBR methods fall into different classes. In one, the 3D models and the lighting conditions are known, and the scene can be rendered using conventional graphics techniques. Another is lightfield or lumigraph rendering, which depends on dense sampling and uses no or very little geometry, rendering without recovering the exact 3D models.

    Three-dimensional modeling of the human jaw/teeth using optics and statistics.

    Object modeling is a fundamental problem in engineering, involving talents from computer-aided design, computational geometry, computer vision and advanced manufacturing. The process of object modeling takes three stages: sensing, representation, and analysis. Various sensors may be used to capture information about objects; optical cameras and laser scanners are common with rigid objects, while X-ray, CT and MRI are common with biological organs. These sensors may provide a direct or an indirect inference about the object, requiring a geometric representation in the computer that is suitable for subsequent usage. Geometric representations that are compact, i.e., capture the main features of the objects with a minimal number of data points or vertices, fall into the domain of computational geometry. Once a compact object representation is in the computer, various analysis steps can be conducted, including recognition, coding, transmission, etc. The subject matter of this dissertation is object reconstruction from a sequence of optical images using shape from shading (SFS) and SFS with shape priors. The application domain is dentistry. Most SFS approaches focus on the computational part of the SFS problem, i.e. the numerical solution. As a result, the imaging model in most conventional SFS algorithms has been simplified under three simple but restrictive assumptions: (1) the camera performs an orthographic projection of the scene, (2) the surface has a Lambertian reflectance and (3) the light source is a single point source at infinity. Unfortunately, such assumptions no longer hold when reconstructing real objects, as in the intra-oral imaging environment for human teeth. In this work, we introduce a more realistic formulation of the SFS problem by considering the image formation components: the camera, the light source, and the surface reflectance. 
This dissertation proposes a non-Lambertian SFS algorithm under perspective projection which benefits from camera calibration parameters. The attenuation of illumination due to near-field imaging is taken into account. The surface reflectance is modeled using the Oren-Nayar-Wolff model, which accounts for the retro-reflection case. In this context, a new variational formulation is proposed that relates an evolving surface model to image information, taking into consideration that the image is taken by a perspective camera with known parameters. A new energy functional is formulated to incorporate brightness, smoothness and integrability constraints. In addition, to further improve the accuracy and practicality of the results, 3D shape priors are incorporated in the proposed SFS formulation. This strategy is motivated by the fact that humans rely on strong prior information about the 3D world around us in order to perceive 3D shape information. Such information is statistically extracted from training 3D models of human teeth. The proposed SFS algorithms have been used in two different frameworks in this dissertation: a) holistic, which stitches a sequence of images in order to cover the entire jaw and then applies SFS, and b) piece-wise, which focuses on a specific tooth or a segment of the human jaw and applies SFS using physical teeth illumination characteristics. To augment the visible portion, and in order to have the entire jaw reconstructed without the use of CT, MRI or even X-rays, prior information gathered from a database of human jaws was added. This database has been constructed from an adult population with variations in teeth size, degradation and alignment. The database contains both shape and albedo information for the population. Using this database, a novel statistical shape from shading (SSFS) approach has been created. 
Extending the work on human teeth analysis, Finite Element Analysis (FEA) is adapted for analyzing and calculating stresses and strains of dental structures. Previous Finite Element (FE) studies used approximate 2D models. In this dissertation, an accurate three-dimensional CAD model is proposed. 3D stress and displacement analyses of different tooth types are successfully carried out. A newly developed open-source finite element solver, Finite Elements for Biomechanics (FEBio), has been used. The limitations of the experimental and analytical approaches used for stress and displacement analysis are overcome by using the benefits of FEA tools, such as dealing with complex geometry and complex loading conditions
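
    The three simplifying assumptions of conventional SFS listed in the abstract above (orthographic projection, Lambertian reflectance, a single distant point light) lead to the classical image irradiance model, sketched below. This is the simplified baseline that the dissertation argues against, not its proposed perspective, non-Lambertian formulation; the function name and the gradient-based parameterization of the normal are assumptions for this example.

```python
import math

def lambertian_intensity(p, q, light, albedo=1.0):
    """Image intensity of a surface patch under the classical SFS assumptions.

    p, q   : surface gradients dz/dx and dz/dy of the height map z(x, y)
    light  : direction towards a point light source at infinity (any length)
    albedo : diffuse reflectance of the surface
    """
    # Unit normal of the surface z(x, y) under orthographic projection.
    norm = math.sqrt(p * p + q * q + 1.0)
    n = (-p / norm, -q / norm, 1.0 / norm)
    # Normalize the light direction.
    lx, ly, lz = light
    ln = math.sqrt(lx * lx + ly * ly + lz * lz)
    cos_theta = (n[0] * lx + n[1] * ly + n[2] * lz) / ln
    # Lambert's cosine law; patches facing away from the light are black.
    return albedo * max(0.0, cos_theta)
```

    SFS inverts this relation: given observed intensities, it searches for gradients (p, q) that reproduce them. Near-field lighting and retro-reflective materials break the model, which is the motivation the abstract gives for a more realistic formulation.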