
    S³FD: Single Shot Scale-invariant Face Detector

    This paper presents a real-time face detector, named Single Shot Scale-invariant Face Detector (S³FD), which performs superiorly on various scales of faces with a single deep neural network, especially for small faces. Specifically, we try to solve the common problem that anchor-based detectors deteriorate dramatically as the objects become smaller. We make contributions in the following three aspects: 1) proposing a scale-equitable face detection framework to handle different scales of faces well. We tile anchors on a wide range of layers to ensure that all scales of faces have enough features for detection. Besides, we design anchor scales based on the effective receptive field and a proposed equal proportion interval principle; 2) improving the recall rate of small faces by a scale compensation anchor matching strategy; 3) reducing the false positive rate of small faces via a max-out background label. As a consequence, our method achieves state-of-the-art detection performance on all the common face detection benchmarks, including the AFW, PASCAL face, FDDB and WIDER FACE datasets, and can run at 36 FPS on an Nvidia Titan X (Pascal) for VGA-resolution images.
    Comment: Accepted by ICCV 2017 + its supplementary materials; updated the latest results on WIDER FACE.
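    The anchor design and matching strategy summarized above can be made concrete with a short sketch. The NumPy snippet below is an illustrative reading of the abstract, not the authors' released code: the layer names, the scale = 4 × stride rule, and the matching thresholds (0.35 / 0.1) and top-N value are assumptions drawn from the description of the equal proportion interval principle and the scale compensation matching.

```python
# A minimal sketch (not the authors' code) of the "equal proportion interval"
# anchor design and a two-stage scale-compensation matching. Layer names,
# strides and thresholds are assumptions based on the abstract's wording.
import numpy as np

# Detection layers and their strides; anchor scale = 4 x stride, so anchors on
# adjacent layers keep an equal-proportion (x2) interval.
STRIDES = {"conv3_3": 4, "conv4_3": 8, "conv5_3": 16,
           "conv_fc7": 32, "conv6_2": 64, "conv7_2": 128}
ANCHOR_SCALES = {layer: 4 * s for layer, s in STRIDES.items()}

def iou(anchors, box):
    """IoU between an (N,4) array of anchors and one ground-truth box (x1,y1,x2,y2)."""
    x1 = np.maximum(anchors[:, 0], box[0]); y1 = np.maximum(anchors[:, 1], box[1])
    x2 = np.minimum(anchors[:, 2], box[2]); y2 = np.minimum(anchors[:, 3], box[3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (anchors[:, 2] - anchors[:, 0]) * (anchors[:, 3] - anchors[:, 1])
    area_b = (box[2] - box[0]) * (box[3] - box[1])
    return inter / (area_a + area_b - inter + 1e-9)

def match_with_scale_compensation(anchors, face, thr=0.35, low_thr=0.1, top_n=6):
    """Stage 1: anchors with IoU >= thr become positives.
    Stage 2: if a (small) face matched too few anchors, also take its top-N
    anchors with IoU > low_thr so that small faces get enough positive anchors."""
    overlaps = iou(anchors, face)
    positives = np.flatnonzero(overlaps >= thr)
    if len(positives) < top_n:
        candidates = np.flatnonzero(overlaps > low_thr)
        extra = candidates[np.argsort(-overlaps[candidates])][:top_n]
        positives = np.union1d(positives, extra)
    return positives
```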

    Processor-In-Memory (PIM) Based Architectures for PetaFlops Potential Massively Parallel Processing

    The report summarizes the work performed at the University of Notre Dame under a NASA grant from July 15, 1995 through July 14, 1996. Researchers involved in the work included the PI, Dr. Peter M. Kogge, and three graduate students under his direction in the Computer Science and Engineering Department: Stephen Dartt, Costin Iancu, and Lakshmi Narayanaswany. The organization of this report is as follows. Section 2 is a summary of the problem addressed by this work. Section 3 is a summary of the project's objectives and approach. Section 4 summarizes PIM technology briefly. Section 5 overviews the main results of the work. Section 6 then discusses the importance of the results and future directions. Also attached to this report are copies of several technical reports and publications whose contents directly reflect results developed during this study.

    Mo.Se.: Mosaic image segmentation based on deep cascading learning

    Mosaic is an ancient type of art used to create decorative images or patterns by combining small components. A digital version of a mosaic can be useful for archaeologists, scholars and restorers who are interested in studying, comparing and preserving mosaics. Nowadays, archaeologists base their studies mainly on manual operation and visual observation, which, although still fundamental, should be supported by an automated procedure for information extraction. In this context, this research describes improvements that can replace the manual and time-consuming procedure of drawing mosaic tesserae. More specifically, this paper analyses the advantages of using Mo.Se. (Mosaic Segmentation), an algorithm that exploits deep learning and image segmentation techniques; the methodology combines the U-Net 3 network with the Watershed algorithm. The final purpose is to define a workflow which establishes the steps to perform a robust segmentation and obtain a digital (vector) representation of a mosaic. The detailed approach is presented and theoretical justifications are provided, building various connections with other models, thus making the workflow both theoretically valuable and practically scalable for medium or large datasets. The automatic segmentation process was tested on the high-resolution orthoimage of an ancient mosaic obtained by a close-range photogrammetry procedure. Our approach has been tested on the pavement of St. Stephen's Church in Umm ar-Rasas, a Jordanian archaeological site located 30 km southeast of the city of Madaba (Jordan). Experimental results show that this generalized framework yields good performance, obtaining higher accuracy compared with other state-of-the-art approaches. Mo.Se. has been validated using publicly available datasets as a benchmark, demonstrating that the combination of learning-based methods with procedural ones enhances segmentation performance in terms of overall accuracy, which is almost 10% higher. This study's ambitious aim is to provide archaeologists with a tool which accelerates their work of automatically extracting ancient geometric mosaics.
    Highlights:
    - A Mo.Se. (Mosaic Segmentation) algorithm is described, with the purpose of performing robust image segmentation to automatically detect tesserae in ancient mosaics.
    - This research aims to overcome the manual and time-consuming procedure of tessera segmentation by proposing an approach that uses deep learning and image processing techniques, obtaining a digital replica of a mosaic.
    - Extensive experiments show that the proposed framework outperforms state-of-the-art methods with higher accuracy, even when compared on publicly available datasets.
    This work was partially funded within the framework of the project "Innovative technologies and training activities for the conservation and enhancement of the archaeological site of Umm er-Rasas (Jordan)" funded by the Ministero degli Affari Esteri e della Cooperazione Internazionale. The authors would like to express their gratitude to the ISPC CNR, and in particular to Dott. Roberto Gabrielli (project leader) and Alessandra Albiero, for providing the dataset.
    Felicetti, A., Paolanti, M., Zingaretti, P., Pierdicca, R., & Malinverni, E. S. (2021). Mo.Se.: Mosaic image segmentation based on deep cascading learning. Virtual Archaeology Review, 12(24), 25-38. https://doi.org/10.4995/var.2021.14179
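    The abstract above describes combining a U-Net probability map with the Watershed algorithm to separate individual tesserae. The snippet below is a minimal sketch of that generic combination using SciPy and scikit-image; it is not the published Mo.Se. pipeline, and the threshold, the marker-seeding heuristic, and the assumed prob_map input (a per-pixel tessera probability produced by an already-trained U-Net) are illustrative assumptions.

```python
# A minimal sketch of a generic "CNN probability map + watershed" post-processing,
# not the published Mo.Se. pipeline. `prob_map` is assumed to come from an
# already-trained U-Net and to contain per-pixel tessera probabilities in [0, 1].
import numpy as np
from scipy import ndimage as ndi
from skimage.segmentation import watershed
from skimage.feature import peak_local_max

def segment_tesserae(prob_map: np.ndarray, prob_thr: float = 0.5,
                     min_distance: int = 5) -> np.ndarray:
    """Turn a U-Net probability map into a labeled tessera image via watershed."""
    mask = prob_map > prob_thr                     # binary tessera mask
    distance = ndi.distance_transform_edt(mask)    # distance to background
    # Local maxima of the distance map seed (roughly) one marker per tessera.
    peaks = peak_local_max(distance, min_distance=min_distance, labels=mask)
    markers = np.zeros_like(mask, dtype=int)
    markers[tuple(peaks.T)] = np.arange(1, len(peaks) + 1)
    # Flood the inverted distance map from the markers, constrained to the mask.
    labels = watershed(-distance, markers, mask=mask)
    return labels  # integer label image, one id per detected tessera
```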

    Fourier ptychography: current applications and future promises

    Traditional imaging systems exhibit a well-known trade-off between the resolution and the field of view of their captured images. Typical cameras and microscopes can either “zoom in” and image at high resolution, or they can “zoom out” to see a larger area at lower resolution, but can rarely achieve both effects simultaneously. In this review, we present details about a relatively new procedure termed Fourier ptychography (FP), which addresses the above trade-off to produce gigapixel-scale images without requiring any moving parts. To accomplish this, FP captures multiple low-resolution, large field-of-view images and computationally combines them in the Fourier domain into a high-resolution, large field-of-view result. Here, we present details about the various implementations of FP and highlight its demonstrated advantages to date, such as aberration recovery, phase imaging, and 3D tomographic reconstruction, to name a few. After providing some basics about FP, we list important details for successful experimental implementation, discuss its relationship with other computational imaging techniques, and point to the latest advances in the field while highlighting persisting challenges.
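    As a rough illustration of how FP "computationally combines" low-resolution captures in the Fourier domain, the sketch below implements a heavily simplified alternating-projection update: each measured image constrains the amplitude of one pupil-limited patch of the high-resolution spectrum. The ideal circular pupil, known in-bounds spectrum-shift positions, and the absence of aberration recovery or intensity scaling are simplifying assumptions, so this is a toy version of the technique, not a faithful reference implementation.

```python
# A highly simplified sketch of the core Fourier ptychographic update
# (alternating projections). Assumes an ideal circular pupil, known shift
# centers well inside the high-res spectrum, and no aberration recovery.
import numpy as np

def fp_reconstruct(low_res_imgs, shifts, hr_shape, pupil_radius, n_iters=10):
    """low_res_imgs: list of measured low-res intensity images (all m x m).
    shifts: list of (row, col) patch centers in the high-res spectrum.
    Returns the recovered high-resolution complex field."""
    m = low_res_imgs[0].shape[0]
    # Initialize the high-res spectrum from a flat-phase, constant-amplitude guess.
    hr_spectrum = np.fft.fftshift(np.fft.fft2(
        np.ones(hr_shape, dtype=complex) * np.mean(low_res_imgs[0])))
    yy, xx = np.mgrid[:m, :m]
    pupil = ((yy - m / 2) ** 2 + (xx - m / 2) ** 2) <= pupil_radius ** 2

    for _ in range(n_iters):
        for img, (cy, cx) in zip(low_res_imgs, shifts):
            r0, c0 = cy - m // 2, cx - m // 2          # top-left of the patch
            patch = hr_spectrum[r0:r0 + m, c0:c0 + m]
            sub = patch * pupil                        # apply the aperture
            # Low-res field predicted by the current spectrum estimate.
            lr_field = np.fft.ifft2(np.fft.ifftshift(sub))
            # Enforce the measured amplitude, keep the estimated phase.
            lr_field = np.sqrt(img) * np.exp(1j * np.angle(lr_field))
            new_sub = np.fft.fftshift(np.fft.fft2(lr_field))
            # Write the corrected values back inside the pupil support only.
            hr_spectrum[r0:r0 + m, c0:c0 + m] = np.where(pupil, new_sub, patch)
    return np.fft.ifft2(np.fft.ifftshift(hr_spectrum))
```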

    Spatial Pyramid Context-Aware Moving Object Detection and Tracking for Full Motion Video and Wide Aerial Motion Imagery

    A robust and fast automatic moving object detection and tracking system is essential to characterize target objects and extract spatial and temporal information for different applications, including video surveillance systems, urban traffic monitoring and navigation, and robotics. In this dissertation, I present a collaborative Spatial Pyramid Context-aware moving object detection and Tracking (SPCT) system. The proposed visual tracker is composed of one master tracker, which usually relies on visual object features, and two auxiliary trackers, based on object temporal motion information, that are called dynamically to assist the master tracker. SPCT utilizes image spatial context at different levels to make the video tracking system resistant to occlusion and background noise and to improve target localization accuracy and robustness. We chose a pre-selected set of seven complementary feature channels, including RGB color, intensity and a spatial pyramid of HoG, to encode object color, shape and spatial layout information. We exploit the integral histogram as a building block to meet the demands of real-time performance. A novel fast algorithm is presented to accurately evaluate spatially weighted local histograms in constant time complexity using an extension of the integral histogram method. Different techniques are explored to efficiently compute the integral histogram on GPU architectures and are applied to fast spatio-temporal median computations and 3D face reconstruction texturing. We propose a multi-component framework based on semantic fusion of motion information with a projected building footprint map to significantly reduce the false alarm rate in urban scenes with many tall structures. Experiments on the extensive VOTC2016 benchmark dataset and aerial video confirm that combining complementary tracking cues in an intelligent fusion framework enables persistent tracking for Full Motion Video and Wide Aerial Motion Imagery.
    Comment: PhD Dissertation (162 pages)
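    The constant-time local histogram evaluation mentioned above builds on the integral histogram. The snippet below is a minimal, unweighted sketch of that building block (precompute a cumulative per-bin volume once, then answer any rectangular histogram query with four lookups per bin); the spatial weighting extension and the GPU mapping described in the dissertation are not reproduced here.

```python
# A minimal sketch of the integral-histogram idea: build a cumulative per-bin
# sum image once, then read any rectangle's histogram in constant time per bin
# with four lookups, independent of the region size.
import numpy as np

def build_integral_histogram(image: np.ndarray, n_bins: int = 16) -> np.ndarray:
    """image: 2D array of grayscale values in [0, 255].
    Returns an (H+1, W+1, n_bins) cumulative histogram volume."""
    h, w = image.shape
    bins = (image.astype(np.int64) * n_bins) // 256       # per-pixel bin index
    one_hot = np.eye(n_bins, dtype=np.int64)[bins]        # (H, W, n_bins)
    integral = np.zeros((h + 1, w + 1, n_bins), dtype=np.int64)
    integral[1:, 1:] = one_hot.cumsum(axis=0).cumsum(axis=1)
    return integral

def region_histogram(integral, top, left, bottom, right):
    """Histogram of image[top:bottom, left:right] via inclusion-exclusion."""
    return (integral[bottom, right] - integral[top, right]
            - integral[bottom, left] + integral[top, left])
```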

    Toward Robust Video Event Detection and Retrieval Under Adversarial Constraints

    The continuous stream of videos that are uploaded and shared on the Internet has been leveraged by computer vision researchers for a myriad of detection and retrieval tasks, including gesture detection, copy detection, face authentication, etc. However, existing state-of-the-art event detection and retrieval techniques fail to deal with several real-world challenges (e.g., low resolution, low brightness and noise) under adversarial constraints. This dissertation focuses on these challenges in realistic scenarios and demonstrates practical methods to address the problem of robustness and efficiency within video event detection and retrieval systems in five application settings (namely, CAPTCHA decoding, face liveness detection, reconstructing typed input on mobile devices, video confirmation attack, and content-based copy detection). Specifically, for CAPTCHA decoding, I propose an automated approach which can decode moving-image object recognition (MIOR) CAPTCHAs faster than humans. I show that not only are there inherent weaknesses in current MIOR CAPTCHA designs, but that several obvious countermeasures (e.g., extending the length of the codeword) are not viable. More importantly, my work highlights the fact that the underlying hard problem selected by the designers of a leading commercial solution falls into a solvable subclass of computer vision problems. For face liveness detection, I introduce a novel approach to bypass modern face authentication systems. More specifically, by leveraging a handful of pictures of the target user taken from social media, I show how to create realistic, textured, 3D facial models that undermine the security of widely used face authentication solutions. My framework makes use of virtual reality (VR) systems, incorporating along the way the ability to perform animations (e.g., raising an eyebrow or smiling) of the facial model, in order to trick liveness detectors into believing that the 3D model is a real human face. I demonstrate that such VR-based spoofing attacks constitute a fundamentally new class of attacks that point to serious weaknesses in camera-based authentication systems. For reconstructing typed input on mobile devices, I propose a method that successfully transcribes the text typed on a keyboard by exploiting video of the user typing, even from significant distances and from repeated reflections. This feat allows us to reconstruct typed input from the image of a mobile phone's screen on a user's eyeball as reflected through a nearby mirror, extending the privacy threat to include situations where the adversary is located around a corner from the user. To assess the viability of a video confirmation attack, I explore a technique that exploits the emanations of changes in light to reveal the programs being watched. I leverage the key insight that the observable emanations of a display (e.g., a TV or monitor) during presentation of the viewing content induce a distinctive flicker pattern that can be exploited by an adversary. My proposed approach works successfully in a number of practical scenarios, including (but not limited to) observations of light effusions through windows, on the back wall, or off the victim's face. My empirical results show that I can successfully confirm hypotheses while capturing short recordings (typically less than 4 minutes long) of the changes in brightness of the victim's display from a distance of 70 meters.
Lastly, for content-based copy detection, I take advantage of a new temporal feature to index a reference library in a manner that is robust to the popular spatial and temporal transformations in pirated videos. My technique narrows the detection gap in the important area of temporal transformations applied by would-be pirates. My large-scale evaluation on real-world data shows that I can successfully detect infringing content from movies and sports clips with 90.0% precision at a 71.1% recall rate, and can achieve that accuracy at an average time expense of merely 5.3 seconds, outperforming the state of the art by an order of magnitude.
Doctor of Philosophy
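    To illustrate the general principle behind the video confirmation attack (matching an observed brightness trace against the average-luminance signature of a candidate video), here is a toy sketch using normalized cross-correlation. It only illustrates the stated insight; the dissertation's actual features, detector, and decision procedure are not reproduced.

```python
# A toy sketch of the matching idea behind the video confirmation attack:
# compare the brightness time series observed from a distance with a candidate
# video's average-luminance signature, tolerating an unknown time offset.
# This is an illustration of the general principle, not the dissertation's detector.
import numpy as np

def luminance_signature(frames):
    """frames: iterable of grayscale frames; returns per-frame mean brightness."""
    return np.array([float(np.mean(f)) for f in frames])

def best_match_score(observed, candidate):
    """Max normalized cross-correlation between the observed brightness trace
    and a candidate video's signature over all temporal offsets."""
    a = (observed - observed.mean()) / (observed.std() + 1e-9)
    b = (candidate - candidate.mean()) / (candidate.std() + 1e-9)
    corr = np.correlate(a, b, mode="valid") / min(len(a), len(b))
    return float(corr.max())
```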

    Visual Quality Assessment and Blur Detection Based on the Transform of Gradient Magnitudes

    Digital imaging and image processing technologies have revolutionized the way in which we capture, store, receive, view, utilize, and share images. In image-based applications, through different processing stages (e.g., acquisition, compression, and transmission), images are subjected to different types of distortions which degrade their visual quality. Image Quality Assessment (IQA) attempts to use computational models to automatically evaluate and estimate image quality in accordance with subjective evaluations. Moreover, with the fast development of computer vision techniques, it is important in practice to extract and understand the information contained in blurred images or regions. The work in this dissertation focuses on reduced-reference visual quality assessment of images and textures, as well as perceptual-based spatially-varying blur detection. A training-free, low-cost Reduced-Reference IQA (RRIQA) method is proposed. The proposed method requires a very small number of reduced-reference (RR) features. Extensive experiments performed on different benchmark databases demonstrate that the proposed RRIQA method delivers highly competitive performance compared with state-of-the-art RRIQA models for both natural and texture images. In the context of texture, the effect of texture granularity on the quality of synthesized textures is studied. Moreover, two RR objective visual quality assessment methods that quantify the perceived quality of synthesized textures are proposed. Performance evaluations on two synthesized texture databases demonstrate that the proposed RR metrics outperform full-reference (FR), no-reference (NR), and RR state-of-the-art quality metrics in predicting the perceived visual quality of the synthesized textures. Last but not least, an effective approach is proposed to address the spatially-varying blur detection problem from a single image without requiring any knowledge about the blur type, level, or camera settings. Evaluations of the proposed approach on diverse sets of blurry images with different blur types, levels, and content demonstrate that the proposed algorithm performs favorably against the state-of-the-art methods, both qualitatively and quantitatively.
    Dissertation/Thesis: Doctoral Dissertation, Electrical Engineering, 201
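    In the spirit of the reduced-reference approach described above, the sketch below extracts a handful of statistics from a transform of the gradient-magnitude map on the reference side and compares them with the same statistics of a distorted image. The Sobel gradients, the DCT, the diagonal-block band summary, and the Euclidean comparison are illustrative choices, not the dissertation's exact features or pooling.

```python
# A generic, hedged sketch of a reduced-reference pipeline: summarize the
# transform of gradient magnitudes with a few statistics on the sender side,
# then compare them against the same statistics of the received image.
# Feature choices here are illustrative, not the dissertation's exact features.
import numpy as np
from scipy import ndimage as ndi
from scipy.fft import dctn

def rr_features(image: np.ndarray, n_bands: int = 4) -> np.ndarray:
    """A few band-energy statistics of the DCT of the gradient-magnitude map."""
    gx = ndi.sobel(image.astype(float), axis=1)
    gy = ndi.sobel(image.astype(float), axis=0)
    gm = np.hypot(gx, gy)                          # gradient magnitude map
    spectrum = np.abs(dctn(gm, norm="ortho"))
    h, w = spectrum.shape
    feats = []
    for k in range(n_bands):                       # diagonal blocks, low to high frequency
        r0, r1 = k * h // n_bands, (k + 1) * h // n_bands
        c0, c1 = k * w // n_bands, (k + 1) * w // n_bands
        feats.append(np.log1p(spectrum[r0:r1, c0:c1].mean()))
    return np.array(feats)

def rr_quality_score(reference_feats, distorted_img):
    """Smaller distance = quality closer to the reference."""
    return float(np.linalg.norm(reference_feats - rr_features(distorted_img)))
```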

    Example-Based Urban Modeling

    The manual modeling of virtual cities or suburban regions is an extremely time-consuming task that requires expert knowledge from different fields. Existing modeling tool-sets have a steep learning curve and may require special training to be used productively. Existing automatic methods rely on rule sets and grammars to generate urban structures; however, their expressiveness is limited by the rule sets. Expert skills are necessary to author rule sets successfully and, in many cases, new rule sets need to be defined for every new building style or street network style. To give non-expert users the possibility to construct urban structures for individual experiments, this work proposes a portfolio of novel example-based synthesis algorithms and applications for the controlled generation of virtual urban environments. The notion "example-based" denotes here that new virtual urban environments are created by computer programs that re-use existing digitized real-world data serving as templates. The data necessary to realize the envisioned task, i.e., street networks, topography, layouts of building footprints, or even 3D building models, are already publicly available via online services. To enable the reuse of existing urban datasets, novel algorithms need to be developed that encapsulate expert knowledge and thus allow the controlled generation of virtual urban structures from sparse user input. The focus of this work is the automatic generation of three fundamental structures that are common in urban environments: road networks, city blocks, and individual buildings. In order to achieve this goal, the thesis proposes a portfolio of algorithms that are briefly summarized next. In a theoretical chapter, we propose a general optimization technique that allows formulating example-based synthesis as a general resource-constrained k-shortest path (RCKSP) problem. From an abstract problem specification and a database of exemplars carrying resource attributes, we construct an intermediate graph and employ a path-search optimization technique. This allows determining either the best or the k best solutions. The resulting algorithm has a reduced complexity for the single-constraint case when compared to other graph-search-based techniques. For the generation of road networks, two different techniques are proposed. The first algorithm synthesizes a novel road network from user input, i.e., a desired arterial street skeleton, a topography map, and a collection of hierarchical fragments extracted from real-world road networks. The algorithm recursively constructs a novel road network by reusing these fragments. Candidate fragments are inserted into the current state of the road network, while shape differences are compensated by warping. The second algorithm synthesizes road networks using generative adversarial networks (GANs), a recently introduced deep learning technique. A pre- and postprocessing pipeline allows using GANs for the generation of road networks. An in-depth evaluation shows that GANs faithfully learn the road structure present in the example network and that graph measures such as area, aspect ratio, and compactness are maintained within the virtual road networks. To fill empty city blocks in road networks, we propose two novel techniques. The first algorithm re-uses real-world city blocks and synthesizes building footprint layouts into empty city blocks by retrieving viable candidate blocks from a database.
We evaluate the algorithm and synthesize a multitude of city block layouts reusing real-world building footprint arrangements from European and US cities. In addition, we increase the realism of the synthesized layouts by performing example-based placement of 3D building models. This technique is evaluated by placing buildings onto challenging footprint layouts using different example building databases. The second algorithm computes a city block layout resembling the style of a real-world city block. The original footprint layout is deformed to construct a guidance map, i.e., the original layout is transferred to a target city block using warping. This guidance map and the original footprints are used by an optimization technique that computes a novel footprint layout along the city block edges. We perform a detailed evaluation and show that using the guidance map allows transferring the original layout, locally as well as globally, even when the source and target shapes differ drastically. To synthesize individual buildings, we use the general optimization technique described first and formulate the building generation process as a resource-constrained optimization problem. From an input database of annotated building parts, an abstract description of the building shape, and a specification of resource constraints such as length, area, or the number of architectural elements, a novel building is synthesized. We evaluate the technique by synthesizing a multitude of challenging buildings fulfilling several global and local resource constraints. Finally, we show how this technique can even be used to synthesize buildings having the shape of city blocks and might also be used to fill empty city blocks in virtual street networks. All algorithms presented in this work were developed to work with a small amount of user input. In most cases, simple sketches and the definition of constraints are enough to produce plausible results. Manual work is necessary to set up the building part databases and to download example data from mapping services available on the Internet.
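    The resource-constrained shortest-path formulation mentioned in this entry can be sketched with a simple label-setting search: labels carry accumulated cost and resource use, labels exceeding the budget are pruned, and dominated labels are discarded. The graph encoding, attribute names, and single-resource, single-best-path simplification below are assumptions for illustration; the thesis' RCKSP technique also returns the k best solutions.

```python
# A minimal sketch of a resource-constrained shortest-path search of the kind
# the thesis builds on. Single resource, single best path; graph layout and
# attribute names are illustrative.
import heapq

def rc_shortest_path(graph, source, target, budget):
    """graph: {node: [(neighbor, cost, resource), ...]}.
    Returns (cost, resource, path) of the cheapest path whose total resource
    consumption stays within `budget`, or None if no feasible path exists."""
    queue = [(0.0, 0.0, source, [source])]  # labels: (cost, resource, node, path)
    best = {}                               # node -> non-dominated (cost, resource) labels

    while queue:
        cost, res, node, path = heapq.heappop(queue)
        if node == target:
            return cost, res, path
        # Dominance check: skip if an existing label is no worse on both criteria.
        if any(c <= cost and r <= res for c, r in best.get(node, [])):
            continue
        best.setdefault(node, []).append((cost, res))
        for nxt, edge_cost, edge_res in graph.get(node, []):
            new_res = res + edge_res
            if new_res <= budget and nxt not in path:  # stay feasible, avoid cycles
                heapq.heappush(queue, (cost + edge_cost, new_res, nxt, path + [nxt]))
    return None

# Example: pick exemplar parts (edges) so the total resource stays within a budget.
parts_graph = {
    "start": [("a", 2.0, 3.0), ("b", 1.0, 5.0)],
    "a": [("end", 2.0, 2.0)],
    "b": [("end", 1.0, 1.0)],
}
print(rc_shortest_path(parts_graph, "start", "end", budget=5.0))
# -> (4.0, 5.0, ['start', 'a', 'end']): the cheaper path via "b" exceeds the budget.
```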