
    Inferring Implicit 3D Representations from Human Figures on Pictorial Maps

    In this work, we present an automated workflow to bring human figures, one of the most frequently appearing entities on pictorial maps, to the third dimension. Our workflow is based on training data and neural networks for single-view 3D reconstruction of real humans from photos. We first let a network consisting of fully connected layers estimate the depth coordinate of 2D pose points. The resulting 3D pose points are fed together with 2D masks of body parts into a deep implicit surface network to infer 3D signed distance fields (SDFs). By assembling all body parts, we derive 2D depth images and body part masks of the whole figure for different views, which are fed into a fully convolutional network to predict UV images. These UV images and the texture for the given perspective are inserted into a generative network to inpaint the textures for the other views. The textures are enhanced by a cartoonization network, and facial details are resynthesized by an autoencoder. Finally, the generated textures are assigned to the inferred body parts in a ray marcher. We test our workflow with 12 pictorial human figures after having validated several network configurations. The created 3D models look generally promising, especially considering the challenges of silhouette-based 3D recovery and real-time rendering of the implicit SDFs. Further improvement is needed to reduce gaps between the body parts and to add pictorial details to the textures. Overall, the constructed figures may be used for animation and storytelling in digital 3D maps. (Comment: to be published in 'Cartography and Geographic Information Science'.)
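    The abstract above mentions rendering implicit SDFs with a ray marcher. The following minimal sketch (not the authors' implementation; all names are illustrative) shows the core sphere-tracing loop on a hard-coded sphere SDF: the ray advances by the signed distance at each step, which is always a safe step size.

```python
import math

def sdf_sphere(p, center=(0.0, 0.0, 3.0), radius=1.0):
    """Signed distance from point p to a sphere: negative inside, positive outside."""
    return math.dist(p, center) - radius

def sphere_trace(origin, direction, sdf, max_steps=64, eps=1e-4, max_dist=100.0):
    """March a ray from origin along a unit direction, stepping by the SDF value.
    Returns the hit distance along the ray, or None if the ray escapes."""
    t = 0.0
    for _ in range(max_steps):
        p = tuple(o + t * d for o, d in zip(origin, direction))
        d = sdf(p)
        if d < eps:
            return t  # surface reached
        t += d        # safe step: no surface is closer than d
        if t > max_dist:
            break
    return None

# A ray along +z from the origin hits the unit sphere centered at z=3 at depth ~2.
print(sphere_trace((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), sdf_sphere))
```

In a real renderer, the analytic SDF would be replaced by a neural network evaluation per sample point, which is why real-time rendering of implicit SDFs is called out as a challenge.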

    The next generation of atlas user interfaces – A user study with “Digital Natives”

    Atlases are among the most complex geovisualization environments, as they are very information-rich. Within these environments, a well-designed user interface is essential to explore the variety of atlas maps and media. Involving technology-affine digital natives in the interface design process seems self-evident in order to provide appealing and intuitively usable atlases in the future. In our study, we presented secondary school students (n=110, age 14-15 years) with five graphical user interface (GUI) mock-ups varying in layout density and tool arrangement. Each alternative design embodies a GUI concept inspired by an existing Web atlas or a popular website. The students completed five tasks in these atlas interfaces that represent typical use cases for thematic navigation, spatial orientation, and information queries. We collected performance and preference metrics for each layout, i.e., the time to solve a task (efficiency), whether students found the correct answers (effectiveness), and their ratings of each layout for “attractiveness”. To complete the analysis, we also conducted a mouse click analysis. Results indicate that atlas interfaces with a medium layout density are strongly preferred by the tested participants and, through inferential statistics, by digital natives in general. These medium-density layouts also perform significantly better, i.e., they have lower average times, fewer clicks, and a higher percentage of successfully completed tasks. Based on the interpretation of the results of this study, general and practical guidelines for future atlas user interfaces are derived.
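    The efficiency and effectiveness metrics defined above can be aggregated per layout as sketched below. The records are hypothetical stand-ins, not the study's data; the layout names and field layout are illustrative only.

```python
from statistics import mean

# Hypothetical task records: (layout, seconds_to_solve, solved_correctly)
records = [
    ("dense", 42.0, True), ("dense", 55.0, False),
    ("medium", 30.0, True), ("medium", 28.0, True),
    ("sparse", 47.0, True), ("sparse", 60.0, False),
]

def summarize(records):
    """Per-layout efficiency (mean solving time) and effectiveness (success rate)."""
    by_layout = {}
    for layout, secs, ok in records:
        by_layout.setdefault(layout, []).append((secs, ok))
    return {
        layout: {
            "mean_time_s": mean(s for s, _ in rows),
            "success_rate": sum(ok for _, ok in rows) / len(rows),
        }
        for layout, rows in by_layout.items()
    }

print(summarize(records))
```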

    Towards Storytelling with Animated Pictorial Map Objects – An Experiment with Convolutional Neural Networks

    Storytelling is a popular technique applied in many fields, including cartography. On the one hand, stories can be told intrinsically by map elements per se. An often quoted example in this regard is Minard’s map of Napoleon’s Russian Campaign (e.g. Denil 2017), which depicts the loss of troops in a spatio-temporally aligned Sankey diagram. On the other hand, stories can be conveyed extrinsically by multimedia elements alongside the map. For instance, the travel route of a soldier during the First World War can be shown on a temporally navigable map and accompanied by photos, videos, diary entries, and military forms (Cartwright & Field 2015). In this experiment, we follow a mixed approach where human figures on the map are animated and address the map reader via speech bubbles. As source data, we consider pictorial maps from digital map libraries (e.g. the David Rumsey Map Collection) and social media websites (e.g. Pinterest). These maps contain realistically drawn representations which are, in our opinion, very suitable for communicating personal narratives. We present a workflow with convolutional neural networks (CNNs), a type of artificial neural network primarily used for image recognition, to detect human figures in pictorial maps. In particular, we use Mask R-CNN (He et al. 2017) for identifying bounding boxes and silhouettes of figures. For the segmentation of body parts (i.e. head, torso, arms, hands, legs, feet) and the detection of joints (i.e. nose, thorax, shoulders, elbows, wrists, hips, knees, ankles), we combine the U-Net architecture (Ronneberger et al. 2015) with a ResNet (He et al. 2015). In a final step, we implement a simple 2D animation of waving and walking characters and add speech bubbles near head positions. As a first training dataset, we created parametric SVG character models with different postures originating from the MPII Human Pose Dataset. The second training dataset contains human body parts from real images in the PASCAL-Part Dataset. Humans from both datasets are placed randomly on pictorial maps without any other figures. Preliminary results show that the validation accuracy is highest when synthetic and real training datasets are combined. We implemented the CNNs with TensorFlow’s Keras API, whereas training data and animations are generated in the web browser. Our approach enables giving storytellers a physical presence and anchoring them spatially within the map. By animating characters, we can gain the map reader’s attention and guide them to special and possibly hidden places (e.g. in touristic maps). By telling personal stories, we may raise the interest of people to explore the maps (e.g. in museums) and give a better understanding of the often abstractly encoded information in maps (e.g. in atlases). When a certain aesthetic value has been reached, pictorial objects may also generate positive emotions so that anxieties about the complexity of data become secondary (e.g. in education). Overall, the goal of our work is to engage map readers, give them valuable support while studying a map, and create long-lasting memories of the map content.
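    The final step above places speech bubbles near detected head positions. A minimal sketch of that placement logic follows; the joint names and pixel coordinates are hypothetical, and the fallback rule is an assumption, not the authors' method.

```python
# Hypothetical joint detections for one figure: joint name -> (x, y) in image pixels.
joints = {
    "nose": (120, 80),
    "thorax": (118, 130),
    "l_shoulder": (100, 120),
    "r_shoulder": (138, 122),
}

def speech_bubble_anchor(joints, offset=30):
    """Anchor a speech bubble just above the detected head.
    Falls back to the thorax keypoint if no nose was detected."""
    x, y = joints.get("nose", joints["thorax"])
    return (x, y - offset)  # image y grows downward, so subtract to move up

print(speech_bubble_anchor(joints))  # (120, 50)
```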

    Unsupervised historical map registration by a deformation neural network

    Image registration, which aligns multi-temporal or multi-source images, is vital for tasks like change detection and image fusion. Thanks to the advance and large-scale practice of modern surveying methods, multi-temporal historical maps can be unlocked and combined to trace object changes in the past, potentially supporting research in environmental science, ecology, urban planning, etc. Even when maps are geo-referenced, the contained geographical features can be misaligned due to surveying, painting, map generalization, and production bias. In our work, we adapt an end-to-end unsupervised deformation network that couples rigid and non-rigid transformations to align scanned historical map sheets from different time stamps. To the best of our knowledge, we are the first to use unsupervised deep learning to register map images. We address the sparsity of map features by introducing a loss based on distance fields. When aligning displaced landmark locations with our proposed method, the results are promising both quantitatively and qualitatively. The generated smooth deformation grid can be applied directly to vector features to align them from the source map sheet to the target map sheet.
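    The distance-field loss mentioned above can be illustrated with a tiny stand-alone sketch (a brute-force toy, not the paper's implementation): the target's line work is turned into a distance field, so even a moving feature that misses the target strokes entirely still receives an informative penalty, unlike a plain pixel-overlap loss.

```python
import math

def distance_field(mask):
    """Brute-force Euclidean distance transform of a small binary raster:
    each cell gets the distance to the nearest foreground (1) cell."""
    fg = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    return [[min(math.hypot(r - fr, c - fc) for fr, fc in fg)
             for c in range(len(mask[0]))]
            for r in range(len(mask))]

def df_loss(moving_mask, target_df):
    """Mean distance of the moving map's feature pixels to the target features."""
    vals = [target_df[r][c]
            for r, row in enumerate(moving_mask) for c, v in enumerate(row) if v]
    return sum(vals) / len(vals)

target = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]  # one target feature at (1, 1)
moving = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]  # displaced feature at (0, 0)
print(df_loss(moving, distance_field(target)))  # sqrt(2): diagonal displacement
```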

    Storytelling in Interactive 3D Geographic Visualization Systems

    The objective of interactive geographic maps is to provide geographic information to a large audience in a captivating and intuitive way. Storytelling helps to create exciting experiences and to explain complex or otherwise hidden relationships in geospatial data. Furthermore, interactive 3D applications offer a wide range of attractive elements for advanced visual story creation and make it possible to convey the same story in many different ways. In this paper, we discuss and analyze storytelling techniques in 3D geographic visualizations so that authors and developers working with geospatial data can use these techniques to conceptualize their visualization and interaction design. Finally, we outline two examples that apply the given concepts.

    Augmenting printed school atlases with thematic 3D maps

    Digitalization in schools requires a rethinking of teaching materials and methods in all subjects. This upheaval also concerns traditional print media, like the school atlases used in geography classes. In this work, we examine the cartographic and technological feasibility of extending a printed school atlas with digital content via augmented reality (AR). While previous research focused rather on topographic three-dimensional (3D) maps, our prototypical application for Android tablets complements map sheets of the Swiss World Atlas with thematically related data. We follow a natural marker approach using the AR engine Vuforia and the game engine Unity. We compare two workflows to insert geo-data, correctly aligned with the map images, into the game engine. Next, the imported data are transformed into partly animated 3D visualizations, such as a dot distribution map, curved lines, pie chart billboards, stacked cuboids, extruded bars, and polygons. Additionally, we implemented legends, elements for temporal and thematic navigation, a screen capture function, and a touch-based feature query for the user interface. We evaluated our prototype in a usability experiment, which showed that, when solving geographic tasks, secondary school students are as effective and interested, and retain knowledge as well, with printed maps as with augmented ones.
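    Aligning imported geo-data with the map image, as described above, amounts to mapping geographic coordinates into the sheet's local scene coordinates. The sketch below assumes a simple equirectangular sheet with known bounds; it is an illustration of the alignment idea, not the workflow actually used with Unity.

```python
def geo_to_scene(lon, lat, bounds, sheet_size):
    """Map a lon/lat pair into local scene coordinates on a map sheet.
    bounds = (min_lon, min_lat, max_lon, max_lat);
    sheet_size = (width, height) in scene units, origin at the lower-left corner."""
    min_lon, min_lat, max_lon, max_lat = bounds
    w, h = sheet_size
    x = (lon - min_lon) / (max_lon - min_lon) * w
    y = (lat - min_lat) / (max_lat - min_lat) * h
    return (x, y)

# The center of the geographic bounds lands at the center of the sheet.
print(geo_to_scene(0.0, 0.0, (-180, -90, 180, 90), (2.0, 1.0)))  # (1.0, 0.5)
```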

    Inferring Implicit 3D Representations from Human Figures on Pictorial Maps

    Human figures frequently occur on pictorial maps alongside other illustrative entities. In this work, we present how to automatically derive 3D depictions from these 2D human figures. Previous research has shown that silhouettes, body parts, and joints of 2D human figures in common poses can be detected on pictorial maps by artificial neural networks (Schnürer et al., 2019). Architectures have also been developed to reconstruct 3D models of real persons from photos with good accuracy (Varol et al., 2018). Single-view methods are particularly suited for our use case, since pictorial figures are usually drawn from one perspective only. Furthermore, a trend can be observed to represent the recovered 3D models by implicit surfaces, expressed by level sets of functions (Saito et al., 2019) or signed distance functions (Wang et al., 2019). Compared to other 3D structures, implicit geometries are memory-efficient, but they require special rendering algorithms, such as marching cubes or sphere tracing, to be displayed. We examine two approaches: (1) A convolutional neural network, consisting of a feature extractor and a head network, learns to directly predict body parts and joints of a 3D model from a 2D image. For this approach, a large amount of training data is essential, for instance, body scans of real persons (e.g. Human3.6M) or synthetically created persons (e.g. SURREAL). For our case, these 3D models may additionally be distorted or enriched by rigged human characters from computer games. After converting the geometries from explicit into implicit form (e.g. with mesh-to-sdf), the network is trained to estimate the resulting values at sample points. (2) Implicit function parameters are optimized stepwise, for example by stochastic gradient descent, to reduce the difference between the target image and its approximation. The latter is a projection of 3D primitives which are combined, transformed, morphed, or deformed by mathematical operations (Pasko et al., 1995). This approach makes it straightforward to formulate constraints, such as the connectivity of body parts or rotation angles of joints, but it requires more iterations and may end in a local minimum. The following challenges exist for both approaches: Due to occlusions, multiple reconstruction outputs are plausible. A generative model, such as a variational autoencoder or a generative adversarial network, may need to be introduced to reflect the variety of poses by latent codes. Moreover, a sampling strategy may be pursued that places points evenly near the surface, within the body, and in the surrounding space so that local details and thin parts (e.g. fingers) are preserved (Paschalidou et al., 2020). To speed up the training or optimization process, a meta-learning algorithm may help to find good initialization parameters (Sitzmann et al., 2020). Since human figures on maps are mostly hand-drawn or manually created with graphics software, the camera perspective or lighting conditions may not be fully consistent. It is not yet clear whether this has an impact on differentiable rendering methods (Niemeyer et al., 2020), which may be applied in our networks. Lastly, the texture needs to be mapped to the 3D model and estimated for the hidden parts, which can be achieved by a subnetwork (Saito et al., 2019). We will evaluate the two approaches according to their effectiveness and efficiency. Based on the outcomes of related work and the proposed methods to overcome the challenges, we are optimistic that meaningful representations can be created. If successful, the inferred 3D figures could emerge from the original map through augmented reality devices. The figures could then be animated and act as guides on touristic maps or as storytellers on historic maps in museums. Due to their attractiveness, the generated 3D figures may raise the interest of people, especially children, in maps and may also serve educational purposes.
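    Approach (2) above builds the approximation from 3D primitives combined by mathematical operations in the spirit of Pasko et al. (1995). The following toy sketch (illustrative only; not the proposed method) shows the standard implicit-modelling operations on sphere SDFs, where union is a pointwise minimum and intersection a pointwise maximum:

```python
import math

def sdf_sphere(p, center, radius):
    """Signed distance to a sphere: negative inside, positive outside."""
    return math.dist(p, center) - radius

def union(d1, d2):
    """Boolean union of two implicit shapes."""
    return min(d1, d2)

def intersection(d1, d2):
    """Boolean intersection of two implicit shapes."""
    return max(d1, d2)

def body_sdf(p):
    """Two overlapping spheres as a stand-in for assembled body parts."""
    head = sdf_sphere(p, (0.0, 1.5, 0.0), 0.5)
    torso = sdf_sphere(p, (0.0, 0.0, 0.0), 1.0)
    return union(head, torso)

print(body_sdf((0.0, 0.0, 0.0)))  # -1.0: deep inside the torso
```

Because such compositions are differentiable almost everywhere, the primitive parameters (centers, radii, joint angles) can in principle be adjusted by gradient descent against a target silhouette, which is the optimization loop the abstract describes.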

    Instance Segmentation, Body Part Parsing, and Pose Estimation of Human Figures in Pictorial Maps

    In recent years, convolutional neural networks (CNNs) have been applied successfully to recognise persons, their body parts, and pose keypoints in photos and videos. The transfer of these techniques to artificially created images is rather unexplored, though challenging, since these images are drawn in different styles, body proportions, and levels of abstraction. In this work, we study these problems on the basis of pictorial maps, where we identify the included human figures with two consecutive CNNs: we first segment individual figures with Mask R-CNN and then parse their body parts and estimate their poses simultaneously with four different UNet++ versions. We train the CNNs with a mixture of real persons and synthetic figures and compare the results with manually annotated test datasets consisting of pictorial figures. By varying the training datasets and the CNN configurations, we were able to improve the original Mask R-CNN model, and we achieved moderately satisfying results with the UNet++ versions. The extracted figures may be used for animation and storytelling and may be relevant for the analysis of historic and contemporary maps.
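    Pose-estimation networks like the UNet++ versions above typically emit one heatmap per joint; the joint's pixel position is then decoded from the heatmap. A minimal argmax decoder is sketched below (a common convention, not necessarily the exact decoding used in this work):

```python
def decode_keypoint(heatmap):
    """Return the (row, col) of the highest-scoring cell of a per-joint heatmap,
    as commonly done to read a joint position off a pose network's output."""
    best, pos = float("-inf"), (0, 0)
    for r, row in enumerate(heatmap):
        for c, v in enumerate(row):
            if v > best:
                best, pos = v, (r, c)
    return pos

heatmap = [
    [0.0, 0.1, 0.0],
    [0.2, 0.9, 0.3],
    [0.0, 0.1, 0.0],
]
print(decode_keypoint(heatmap))  # (1, 1)
```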

    Detection of Pictorial Map Objects with Convolutional Neural Networks

    In this work, realistically drawn objects are identified on digital maps by convolutional neural networks. For the first two experiments, 6200 images were retrieved from Pinterest. While alternating image input options, two binary classifiers based on Xception and InceptionResNetV2 were trained to separate maps and pictorial maps. Results showed an accuracy of 95-97% for distinguishing maps from other images, whereas maps with pictorial objects are correctly classified at rates of 87-92%. For a third experiment, bounding boxes of 3200 sailing ships were annotated in historic maps from different digital libraries. Faster R-CNN and RetinaNet were compared to determine the box coordinates, while adjusting anchor scales and examining configurations for small objects. A resulting average precision of 32% was obtained for Faster R-CNN and of 36% for RetinaNet. The research outcomes are relevant for trawling map images on the Internet and for enhancing the advanced search of digital map catalogues.
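    The average precision reported above for the ship detectors rests on intersection over union (IoU) between predicted and annotated boxes: a prediction usually counts as correct when its IoU with a ground-truth box exceeds a threshold. A minimal IoU sketch (illustrative, with corner-format boxes assumed):

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

# Two 2x2 boxes overlapping in a 1x2 strip: IoU = 2 / (4 + 4 - 2) = 1/3.
print(iou((0, 0, 2, 2), (1, 0, 3, 2)))
```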