231 research outputs found

    Computational Re-Photography

    Rephotographers aim to recapture an existing photograph from the same viewpoint. A historical photograph paired with a well-aligned modern rephotograph can serve as a remarkable visualization of the passage of time. However, the task of rephotography is tedious and often imprecise, because reproducing the viewpoint of the original photograph is challenging. The rephotographer must disambiguate between the six degrees of freedom of 3D translation and rotation, and the confounding similarity between the effects of camera zoom and dolly. We present a real-time estimation and visualization technique for rephotography that helps users reach a desired viewpoint during capture. The input to our technique is a reference image taken from the desired viewpoint. The user moves through the scene with a camera and follows our visualization to reach the desired viewpoint. We employ computer vision techniques to compute the relative viewpoint difference. We guide 3D movement using two 2D arrows. We demonstrate the success of our technique by rephotographing historical images and conducting user studies.
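
The zoom/dolly ambiguity mentioned in this abstract can be illustrated with a minimal sketch under a simple pinhole-camera assumption. It is not the authors' method: the function name, scene depths and focal lengths below are hypothetical values chosen only to show why the two effects are easy to confuse for one object yet distinguishable across depths.

```python
def projected_size(focal_length, object_size, depth):
    """Pinhole projection: image-plane size of an object at the given depth."""
    return focal_length * object_size / depth

# Hypothetical scene: a subject 4 m away and a background object 20 m away,
# both 2 m tall (all values are illustrative assumptions).
subject_depth, background_depth, height = 4.0, 20.0, 2.0

# Original shot with focal length 35 (arbitrary units).
orig_subject = projected_size(35.0, height, subject_depth)
orig_background = projected_size(35.0, height, background_depth)

# Rephotograph attempt: step back 4 m (dolly) and double the focal length (zoom).
dolly = 4.0
new_subject = projected_size(70.0, height, subject_depth + dolly)
new_background = projected_size(70.0, height, background_depth + dolly)

# The subject appears the same size in both shots, but the background does not.
print(orig_subject, new_subject)
print(orig_background, new_background)
```

In this sketch the subject's projected size is unchanged, so zoom appears to compensate for the dolly. The background's projected size differs between the two shots, which is the perspective cue a rephotographer has to match.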

    Evaluating humanoid embodied conversational agents in mobile guide applications

    Evolution in the area of mobile computing has been phenomenal in the last few years. The explosive growth in hardware power has enabled multimodal mobile interfaces to be developed. These interfaces differ from the traditional graphical user interface (GUI), in that they enable a more “natural” communication with mobile devices, through the use of multiple communication channels (e.g., multi-touch, speech recognition, etc.). As a result, a new generation of applications has emerged that provide human-like assistance in the user interface (e.g., the Siri conversational assistant (Siri Inc., visited 2010)). These conversational agents are currently designed to automate a number of tedious mobile tasks (e.g., to call a taxi), but the possible applications are endless. A domain of particular interest is that of Cultural Heritage, where conversational agents can act as personalized tour guides in, for example, archaeological attractions. The visitors to historical places have a diverse range of information needs. For example, casual visitors have different information needs from those with a deeper interest in an attraction (e.g., holiday learners versus students). A personalized conversational agent can access a cultural heritage database, and effectively translate data into a natural language form that is adapted to the visitor’s personal needs and interests. The present research aims to investigate the information needs of a specific type of visitor, those for whom retention of cultural content is important (e.g., students of history, cultural experts, history hobbyists, educators, etc.). Embodying a conversational agent enables the agent to use additional modalities to communicate this content (e.g., through facial expressions, deictic gestures, etc.) to the user.
Simulating the social norms that guide the real-world human-to-human interaction (e.g., adapting the story based on the reactions of the users), should at least theoretically optimize the cognitive accessibility of the content. Although a number of projects have attempted to build embodied conversational agents (ECAs) for cultural heritage, little is known about their impact on the users’ perceived cognitive accessibility of the cultural heritage content, and the usability of the interfaces they support. In particular, there is a general disagreement on the advantages of multimodal ECAs in terms of users’ task performance and satisfaction over nonanthropomorphised interfaces. Further, little is known about what features influence what aspects of the cognitive accessibility of the content and/or usability of the interface. To address these questions I studied the user experiences with ECA interfaces in six user studies across three countries (Greece, UK and USA). To support these studies, I introduced: a) a conceptual framework based on well-established theoretical models of human cognition, and previous frameworks from the literature, which offers a holistic view of the design space of ECA systems; and b) a research technique for evaluating the cognitive accessibility of ECA-based information presentation systems that combines data from eye tracking and facial expression recognition. In addition, I designed a toolkit, of which I partially developed the natural language processing component, to facilitate rapid development of mobile guide applications using ECAs. Results from these studies provide evidence that an ECA, capable of displaying some of the communication strategies (e.g., non-verbal behaviours to accompany linguistic information etc.) found in the real-world human guidance scenario, is not affecting and effective in enhancing the user’s ability to retain cultural content.
The findings from the first two studies suggest that an ECA has no negative/positive impact on users experiencing content that is similar (but not the same) across different locations (see experiment one, in Chapter 7), and content of variable difficulty (see experiment two, in Chapter 7). However, my results also suggest that improving the degree of content personalization and the quality of the modalities used by the ECA can result in both effective and affecting human-ECA interactions. Effectiveness is the degree to which an ECA facilitates a user in accomplishing the navigation and information tasks. Similarly, affecting is the degree to which the ECA changes the quality of the user’s experience while accomplishing the navigation and information tasks. By adhering to the above rules, I gradually improved my designs and built ECAs that are affecting. In particular, I found that an ECA can affect the quality of the user’s navigation experience (see experiment three in Chapter 7), as well as how a user experiences narrations of cultural value (see experiment five, in Chapter 8). In terms of navigation, I found sound evidence that the strongest impact of the ECA’s nonverbal behaviours is on the ability of users to correctly disambiguate the navigation instructions provided by a tour guide system. However, my ECAs failed to become effective, and to elicit enhanced navigation or retention performances. Given the positive impact of ECAs on the disambiguation of navigation instructions, the lack of ECA-effectiveness in navigation could be attributed to the simulated mobile conditions. In a real outdoor environment, where users would have to actually walk around the castle, an ECA could have elicited better navigation performance than a system without it.
With regard to retention performance, my results suggest that a designer should not solely consider the impact of an ECA, but also the style and effectiveness of the question-answering (Q&A) with the ECA, and the type of user interacting with the ECA (see experiments four and six, in Chapter 8). I found that there is a correlation between how many questions participants asked per location for a tour, and the information they retained after the completion of the tour. When participants were requested to ask the systems a specific number of questions per location, they could retain more information than when they were allowed to freely ask questions. However, the constrained style of interaction decreased their overall satisfaction with the systems. Therefore, when enhanced retention performance is needed, a designer should consider strategies that direct users to ask a specific number of questions per location for a tour. On the other hand, when maintaining the positive levels of user experiences is the desired outcome of an interaction, users should be allowed to freely ask questions. Then, the effectiveness of the Q&A session is of importance to the success/failure of the user’s interaction with the ECA. In a natural-language question-answering system, the system often fails to understand the user’s question and, by default, it asks the user to rephrase. A problem arises when the system fails to understand a question repeatedly. I found that a repetitive request to rephrase the same question annoys participants and affects their retention performance. Therefore, in order to ensure effective human-ECA Q&A, the repeat messages should be built in a way that allows users to figure out how to ask the system questions to avoid improper responses. Then, I found strong evidence that an ECA may be effective for some types of users, while for others it may not be.
I found that an ECA with an attention-grabbing mechanism (see experiment six, in Chapter 8) had an inverse effect on the retention performance of participants of different genders. In particular, it enhanced the retention performance of the male participants, while it degraded the retention performance of the female participants. Finally, a series of tentative design recommendations for the design of both affecting and effective ECAs in mobile guide applications is derived from the work undertaken. These are aimed at ECA researchers and mobile guide designers.

    Virtual Heritage: new technologies for edutainment

    Cultural heritage represents an enormous amount of information and knowledge. Accessing this treasure chest allows us not only to discover the legacy of physical and intangible attributes of the past but also to better understand the present. Museums and cultural institutions have to face the problem of providing access to and communicating these cultural contents to a wide and assorted audience, meeting the expectations and interests of the reference end-users and relying on the most appropriate tools available. Given the large amount of existing tangible and intangible heritage, artistic, historical and cultural contents, what can be done to preserve and properly disseminate their heritage significance? How can these items be disseminated in the proper way to the public, taking into account their enormous heterogeneity? Answering these questions also requires dealing with another aspect of the problem: the evolution of culture, literacy and society during the last decades of the 20th century. To reflect such transformations, this period witnessed a shift in the museum’s focus from the aesthetic value of museum artifacts to the historical and artistic information they encompass, and a change in the museum’s role from a mere "container" of cultural objects to a "narrative space" able to explain, describe, and revive the historical material in order to attract and entertain visitors. These developments require creating novel exhibits, able to tell stories about the objects and enabling visitors to construct semantic meanings around them. The objective that museums presently pursue is reflected by the concept of Edutainment, Education + Entertainment. Nowadays, visitors are not satisfied with ‘learning something’, but would rather engage in an ‘experience of learning’, or ‘learning for fun’, being active actors and players in their own cultural experience.
As a result, institutions are faced with several new problems, like the need to communicate with people from different age groups and different cultural backgrounds, the change in people’s attitudes due to the massive and unexpected diffusion of technology into everyday life, and the need to design the visit from a personal point of view, leading to a high level of customization that allows visitors to shape their path according to their characteristics and interests. In order to cope with these issues, I investigated several approaches. In particular, I focused on Virtual Learning Environments (VLE): real-time interactive virtual environments where visitors can experience a journey through time and space, being immersed into the original historical, cultural and artistic context of the works of art on display. VLE can strongly help archivists and exhibit designers, allowing them to create new, interesting and captivating ways to present cultural materials. In this dissertation I will tackle many of the different dimensions related to the creation of a cultural virtual experience. During my research project, the entire pipeline involved in the development and deployment of VLE has been investigated. The approach followed was to analyze in detail the main sub-problems to be faced, in order to better focus on specific issues. Therefore, I first analyzed different approaches to an effective recreation of the historical and cultural context of heritage contents, which is ultimately aimed at an effective transfer of knowledge to the end-users. In particular, I identified the enhancement of the users’ sense of presence in VLE as one of the main tools to reach this objective. Presence is generally expressed as the perception of 'being there', i.e. the subjective belief of users that they are in a certain place, even if they know that the experience is mediated by the computer. Presence is related to the number of senses involved by the VLE and to the quality of the sensorial stimuli.
But in a cultural scenario, this is not sufficient as the cultural presence plays a relevant role. Cultural presence is not just a feeling of 'being there' but of being - not only physically, but also socially, culturally - 'there and then'. In other words, the VLE must be able to transfer not only the appearance, but also all the significance and characteristics of the context that makes it a place, and both the environment and the context become tools capable of transferring the cultural significance of a historic place. The attention that users pay to the mediated environment is another aspect that contributes to presence. Attention is related to users’ focalization and concentration and to their interests. Thus, in order to improve the involvement and capture the attention of users, I investigated in my work the adoption of narratives and storytelling experiences, which can help people make sense of history and culture, and of gamification approaches, which explore the use of game thinking and game mechanics in cultural contexts, thus engaging users while disseminating cultural contents and, why not, letting them have fun during this process. Another dimension related to the effectiveness of any VLE is the quality of the user experience (UX). User interaction, with both the virtual environment and its digital contents, is one of the main elements affecting UX. With respect to this I focused on one of the most recent and promising approaches: natural interaction, which is based on the idea that people need to interact with technology in the same way they are used to interacting with the real world in everyday life. Then, I focused on the problem of presenting, displaying and communicating contents. VLE represent an ideal presentation layer, being multiplatform hypermedia applications where users are free to interact with the virtual reconstructions by choosing their own visiting path.
Cultural items, embedded into the environment, can be accessed by users according to their own curiosity and interests, with the support of narrative structures, which can guide them through the exploration of the virtual spaces, and conceptual maps, which help build meaningful connections between cultural items. Thus, VLE can even be seen as visual interfaces to DBs of cultural contents. Users can navigate the VE as if they were browsing the DB contents, exploiting both text-based queries and visual-based queries, provided by the re-contextualization of the objects into their original spaces, whose virtual exploration can provide new insights on specific elements and improve the awareness of relationships between objects in the database. Finally, I have explored the mobile dimension, which has become highly relevant in recent years. Nowadays, off-the-shelf consumer devices such as smartphones and tablets guarantee amazing computing capabilities, support for rich multimedia contents, geo-localization and high network bandwidth. Thus, mobile devices can support users in mobility and detect the user context, allowing the development of a plethora of location-based services, from way-finding to the contextualized communication of cultural contents, aimed at providing a meaningful exploration of exhibits and cultural or tourist sites according to visitors’ personal interest and curiosity.

    Doctor of Philosophy

    Interactive editing and manipulation of digital media is a fundamental component in digital content creation. One medium in particular, digital imagery, has seen a recent increase in popularity of its large or even massive image formats. Unfortunately, current systems and techniques are rarely concerned with scalability or usability with these large images. Moreover, processing massive (or even large) imagery is assumed to be an off-line, automatic process, although many problems associated with these datasets require human intervention for high quality results. This dissertation details how to design interactive image techniques that scale. In particular, massive imagery is typically constructed as a seamless mosaic of many smaller images. The focus of this work is the creation of new technologies to enable user interaction in the formation of these large mosaics. While an interactive system for all stages of the mosaic creation pipeline is a long-term research goal, this dissertation concentrates on the last phase of the mosaic creation pipeline - the composition of registered images into a seamless composite. The work detailed in this dissertation provides the technologies to fully realize interactive editing in mosaic composition on image collections ranging from the very small to massive in scale.

    Urban tourism in Athens: tourist myths and images

    This thesis explores and analyses the mythical quality of modern Athens as experienced by tourists. It is an exploration of the tourist gaze upon the Athenian landscape, as well as an account of how tourists narrate its urban mythology. This research is largely concerned with the relationship of time and space through memory, exploring the interplay between the spatial arrangement of urban elements, temporality and the experience of the city. Athens is viewed as a city marked by a temporal collage where different historical periods are juxtaposed. This juxtaposition gives Athens the character of a deconstructed city. The city is made present through spatialised remainders, her genius loci. This thesis thus analyses the relationship between Athens' past and present, the strangely familiar and the stereotypically exotic, as interwoven within an urban landscape imagined, gazed upon and, finally, narrated by foreign tourists. The core argument of this work is that the Athenian landscape embodies an urban mythology constructed by the nineteenth century romantic travellers: these, through their writings, fashioned the stereotypical imagery of Athens. Modern tourists are the consumers of these myths. Like their nineteenth century predecessors, tourists stroll around the city following the traces of their memory - key landmarks and symbols - recognising what they have already known; feeling nostalgic for the past - their past; fragmenting the landscape into different historic layers, depopulating it of its present inhabitants, orientalising it. In this work I explore the transmission and reinvention of the myths of Athens through guidebooks, travel brochures, guided tours and tourist photographs. The exploration of the different images of Athens as visualised by tourists leads to a discussion of gendered, orientalised, literary, photographic and cartographic aspects of the Athenian urban landscape.
The theoretical framework of the thesis is based on post-modernism, post-structuralism and semiotics. My research methods have been qualitative, including both in-depth interviews and participant observation, following tourists around the city and participating in their activities. I also analysed the ways tourists 'gaze' at and photograph the city. My intention is to draw, metaphorically speaking, a mental map including the sites visited, consumed and experienced by tourists.

    Mechanisms of place recognition and path integration based on the insect visual system

    Animals are often able to solve complex navigational tasks in very challenging terrain, despite using low resolution sensors and minimal computational power, providing inspiration for robots. In particular, many species of insect are known to solve complex navigation problems, often combining an array of different behaviours (Wehner et al., 1996; Collett, 1996). Their nervous system is also comparatively simple, relative to that of mammals and other vertebrates. In the first part of this thesis, the visual input of a navigating desert ant, Cataglyphis velox, was mimicked by capturing images in ultraviolet (UV) at similar wavelengths to the ant’s compound eye. The natural segmentation of ground and sky led to the hypothesis that skyline contours could be used by ants as features for navigation. As proof of concept, sky-segmented binary images were used as input for an established localisation algorithm, SeqSLAM (Milford and Wyeth, 2012), validating the plausibility of this claim (Stone et al., 2014). A follow-up investigation sought to determine whether using the sky as a feature would help overcome image matching problems that the ant often faced, such as variance in tilt and yaw rotation. A robotic localisation study showed that using spherical harmonics (SH), a representation in the frequency domain, combined with extracted sky can greatly help robots localise on uneven terrain. Results showed improved performance over state-of-the-art point feature localisation methods on fast bumpy tracks (Stone et al., 2016a). In the second part, an approach to understand how insects perform a navigational task called path integration was attempted by modelling part of the brain of the sweat bee Megalopta genalis. A recent discovery that two populations of cells act as a celestial compass and visual odometer, respectively, led to the hypothesis that circuitry at their point of convergence in the central complex (CX) could give rise to path integration.
A firing rate-based model was developed with connectivity derived from the overlap of observed neural arborisations of individual cells and successfully used to build up a home vector and steer an agent back to the nest (Stone et al., 2016b). This approach has the appeal that neural circuitry is highly conserved across insects, so findings here could have wide implications for insect navigation in general. The developed model is the first functioning path integrator that is based on individual cellular connections.
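
The idea of path integration described above, combining a compass heading with an odometer reading to maintain a home vector, can be sketched geometrically. This is only an illustrative vector-sum version, not the firing rate-based neural model of the thesis; the function names and the outbound trip are hypothetical.

```python
import math

def integrate_path(steps):
    """Accumulate (heading_radians, distance) steps into a displacement from the nest."""
    x = y = 0.0
    for heading, distance in steps:
        x += distance * math.cos(heading)  # compass heading combined with odometry
        y += distance * math.sin(heading)
    return x, y

def home_vector(position):
    """Return (heading_radians, distance) pointing from the position back to the nest at the origin."""
    x, y = position
    return math.atan2(-y, -x), math.hypot(x, y)

# Hypothetical outbound trip: 3 units along heading 0, then 4 units along heading pi/2.
outbound = [(0.0, 3.0), (math.pi / 2, 4.0)]
position = integrate_path(outbound)
home_heading, home_distance = home_vector(position)

# Following the home vector should bring the agent back to the nest.
final_position = integrate_path(outbound + [(home_heading, home_distance)])
print(home_heading, home_distance, final_position)
```

The sketch keeps the running displacement in Cartesian coordinates for simplicity. A biologically grounded model would instead encode it in the activity of a population of cells, as the abstract describes.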

    Advances in Computer Recognition, Image Processing and Communications, Selected Papers from CORES 2021 and IP&C 2021

    As almost all human activities have been moved online due to the pandemic, novel robust and efficient approaches and further research have been in higher demand in the field of computer science and telecommunication. Therefore, this (reprint) book contains 13 high-quality papers presenting advancements in theoretical and practical aspects of computer recognition, pattern recognition, image processing and machine learning (shallow and deep), including, in particular, novel implementations of these techniques in the areas of modern telecommunications and cybersecurity.