
    SenseCam image localisation using hierarchical SURF trees

    The SenseCam is a wearable camera that automatically takes photos of the wearer's activities, generating thousands of images per day. Automatically organising these images for efficient search and retrieval is a challenging task, but it can be simplified by attaching semantic information to each photo, such as the wearer's location at capture time. We propose a method for automatically determining the wearer's location using an annotated image database described with SURF interest point descriptors. We show that SURF outperforms SIFT in matching SenseCam images and that matching can be done efficiently using hierarchical trees of SURF descriptors. Additionally, re-ranking the top images using bi-directional SURF matches improves location matching performance further.
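    A minimal sketch of the bi-directional re-ranking step with OpenCV (the hierarchical-tree search itself is omitted; the file names and the hessianThreshold value are placeholders, not values from the paper):

```python
import cv2

# SURF is patented and ships only with opencv-contrib (cv2.xfeatures2d);
# SIFT or ORB could be substituted if it is unavailable.
surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)

query = cv2.imread("query.jpg", cv2.IMREAD_GRAYSCALE)
candidate = cv2.imread("candidate.jpg", cv2.IMREAD_GRAYSCALE)
_, des_q = surf.detectAndCompute(query, None)
_, des_c = surf.detectAndCompute(candidate, None)

# crossCheck=True keeps only descriptor pairs that are each other's
# nearest neighbour in both directions -- the bi-directional criterion.
matcher = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True)
mutual = matcher.match(des_q, des_c)

# Candidates returned by the tree search can then be re-ranked by their
# mutual-match count: more mutual matches, stronger location evidence.
score = len(mutual)
```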

    Increasing the Efficiency of 6-DoF Visual Localization Using Multi-Modal Sensory Data

    Localization is a key requirement for mobile robot autonomy and human-robot interaction. Vision-based localization is accurate and flexible; however, it incurs a high computational burden that limits its application on many resource-constrained platforms. In this paper, we address the problem of performing real-time localization in large-scale 3D point cloud maps of ever-growing size. While most systems using multi-modal information reduce localization time by employing side-channel information in a coarse manner (e.g. WiFi for a rough prior position estimate), we propose to interweave the map with rich sensory data. This multi-modal approach achieves two key goals simultaneously. First, it enables us to harness additional sensory data to localize against a map covering a vast area in real time; second, it allows us to roughly localize devices that are not equipped with a camera. The key to our approach is a localization policy based on a sequential Monte Carlo estimator. The localizer uses this policy to attempt point-matching only in nodes where it is likely to succeed, significantly increasing the efficiency of the localization process. The proposed multi-modal localization system is evaluated extensively in a large museum building. The results show that our multi-modal approach not only increases localization accuracy but also significantly reduces computational time.
    Comment: Presented at IEEE-RAS International Conference on Humanoid Robots (Humanoids) 201
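    The abstract names the policy's machinery, a sequential Monte Carlo estimator, without detail; the sketch below is an illustrative reconstruction of that idea only. The node counts, threshold, and function names are our own assumptions, not the authors' code:

```python
import numpy as np

# Particle filter over map nodes: expensive point-matching is attempted
# only where enough probability mass has accumulated.
rng = np.random.default_rng(0)
N_NODES, N_PARTICLES = 200, 1000     # sizes are made up for the example
particles = rng.integers(0, N_NODES, N_PARTICLES)  # each particle = node id
weights = np.full(N_PARTICLES, 1.0 / N_PARTICLES)

def node_posterior():
    """Approximate the posterior over map nodes from the particle set."""
    post = np.zeros(N_NODES)
    np.add.at(post, particles, weights)
    return post

def nodes_to_match(threshold=0.05):
    """Policy: attempt point-matching only in nodes where the sequential
    Monte Carlo estimate says localization is likely to succeed."""
    return np.nonzero(node_posterior() > threshold)[0]

# After each cheap side-channel observation (e.g. a WiFi reading), the
# weights would be updated and the particles resampled; point-matching
# then runs only on nodes_to_match() instead of the whole map.
```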

    Mo Músaem Fíorúil: a web-based search and information service for museum visitors

    We describe the prototype of an interactive, web-based museum artifact search and information service. Mo Músaem Fíorúil clusters and indexes images of museum artifacts taken by visitors to the museum, where the images are captured using a passive capture device such as Microsoft's SenseCam [1]. The system also matches clustered artifacts to images of the same artifact from the museum's official photo collection and allows the user to view images of the same artifact taken by other visitors. This matching process potentially allows the system to provide more detailed information about a particular artifact based on the user's inferred preferences, thereby greatly enhancing the overall museum experience. In this work, we introduce the system and describe, in broad terms, its overall functionality and use. Using different image sets of artificial museum objects, we also describe experiments and results relating to the artifact matching component of the system.
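    One plausible shape for the clustering component, sketched under our own assumptions (the paper's actual method may differ, and the networkx dependency is our choice): link images whose pairwise descriptor matches clear a threshold, then take connected components as artifact clusters.

```python
import itertools
import cv2
import networkx as nx

def match_count(des_a, des_b, ratio=0.75):
    """Lowe ratio-test match count between two SURF/SIFT descriptor sets."""
    knn = cv2.BFMatcher(cv2.NORM_L2).knnMatch(des_a, des_b, k=2)
    return sum(1 for p in knn
               if len(p) == 2 and p[0].distance < ratio * p[1].distance)

def cluster_by_artifact(descriptor_sets, min_matches=25):
    """Link image pairs with enough matches; connected components then
    group visitor photos that show the same artifact."""
    g = nx.Graph()
    g.add_nodes_from(range(len(descriptor_sets)))
    for i, j in itertools.combinations(range(len(descriptor_sets)), 2):
        if match_count(descriptor_sets[i], descriptor_sets[j]) >= min_matches:
            g.add_edge(i, j)
    return list(nx.connected_components(g))
```

    Matching a visitor cluster to the official photo collection could reuse the same match_count score against the official images.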

    Visual Landmark Recognition from Internet Photo Collections: A Large-Scale Evaluation

    The task of a visual landmark recognition system is to identify photographed buildings or objects in query photos and to provide the user with relevant information about them. With their increasing coverage of the world's landmark buildings and objects, Internet photo collections are now being used as a source for building such systems in a fully automatic fashion. This process typically consists of three steps: clustering large amounts of images by the objects they depict; determining object names from user-provided tags; and building a robust, compact, and efficient recognition index. To date, however, there is little empirical information on how well current approaches to these steps perform in a large-scale open-set mining and recognition task, on how recognition performance varies for different types of landmark objects, and on where there is still potential for improvement. With this paper, we intend to fill these gaps. Using a dataset of 500k images from Paris, we analyze each component of the landmark recognition pipeline in order to answer the following questions: How many and what kinds of objects can be discovered automatically? How can we best use the resulting image clusters to recognize the object in a query? How can the object be efficiently represented in memory for recognition? How reliably can semantic information be extracted? And finally: what are the limiting factors in the pipeline from query to semantics? We evaluate how different choices of methods and parameters for the individual pipeline steps affect overall system performance and examine their effects for different query categories such as buildings, paintings, or sculptures.
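    For the third step, the recognition index, a common and compact choice is a visual vocabulary with an inverted index; the sketch below uses assumed names and parameters rather than any configuration evaluated in the paper:

```python
from collections import defaultdict
from sklearn.cluster import MiniBatchKMeans

def build_index(training_descriptors, object_descriptor_sets, n_words=10_000):
    """Quantise local descriptors into a visual vocabulary and build an
    inverted index from visual word -> ids of object clusters using it."""
    vocab = MiniBatchKMeans(n_clusters=n_words).fit(training_descriptors)
    inverted = defaultdict(set)
    for obj_id, des in enumerate(object_descriptor_sets):
        for word in vocab.predict(des):
            inverted[word].add(obj_id)
    return vocab, inverted

def candidate_objects(vocab, inverted, query_descriptors):
    """Vote for the object clusters that share visual words with a query."""
    votes = defaultdict(int)
    for word in vocab.predict(query_descriptors):
        for obj_id in inverted[word]:
            votes[obj_id] += 1
    return sorted(votes, key=votes.get, reverse=True)
```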

    Subobject Detection through Spatial Relationships on Mobile Phones

    We present a novel image classification technique for detecting multiple objects (called subobjects) in a single image. In addition to image classifiers, we apply spatial relationships among the subobjects to verify the locations of detected subobjects and to predict the locations of undetected ones. By continuously refining the spatial relationships throughout the detection process, even the locations of completely occluded exhibits can be determined. Finally, all detected subobjects are labeled, and the user can select the object of interest to retrieve the corresponding multimedia information. This approach is applied in the context of PhoneGuide, an adaptive museum guidance system for camera-equipped mobile phones. We show that the recognition of subobjects using spatial relationships is up to 68% faster than related approaches without spatial relationships. Results of a field experiment in a local museum show that inexperienced users achieve an average subobject recognition rate of 85.6% under realistic conditions.
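    A simplified illustration of the predict-and-verify use of spatial relationships; the relation table, object names, and tolerance are invented for the example and do not reproduce PhoneGuide's actual representation:

```python
import numpy as np

# Invented example relation: the "plaque" subobject sits below the
# "statue", at an offset expressed in units of the statue's size.
RELATIONS = {("statue", "plaque"): np.array([0.0, 1.4])}

def predict_location(ref_name, ref_center, ref_size, target_name):
    """Predict where a (possibly occluded) subobject should appear,
    given a detected reference subobject."""
    return ref_center + RELATIONS[(ref_name, target_name)] * ref_size

def verify(detected_center, predicted_center, ref_size, tol=0.5):
    """Accept a detection only if it lies near its predicted location."""
    return np.linalg.norm(detected_center - predicted_center) < tol * ref_size

# A statue detected at pixel (320, 180) with size 120 predicts the
# plaque near (320, 348), even if the plaque itself is occluded.
plaque_at = predict_location("statue", np.array([320.0, 180.0]), 120.0, "plaque")
```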

    Conceptual spatial representations for indoor mobile robots

    We present an approach for creating conceptual representations of human-made indoor environments using mobile robots. The concepts refer to spatial and functional properties of typical indoor environments. Following findings in cognitive psychology, our model is composed of layers representing maps at different levels of abstraction. The complete system is integrated in a mobile robot endowed with laser and vision sensors for place and object recognition. The system also incorporates a linguistic framework that actively supports the map acquisition process and that is used for situated dialogue. Finally, we discuss the capabilities of the integrated system.
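    A hedged sketch of what such a layered representation might look like as a data structure; the layer names follow common usage in this line of work, not necessarily the paper's exact terminology:

```python
from dataclasses import dataclass, field

@dataclass
class ConceptualMap:
    metric_layer: list = field(default_factory=list)       # laser scans / occupancy grid
    navigation_layer: list = field(default_factory=list)   # free-space nodes for path planning
    topological_layer: dict = field(default_factory=dict)  # areas and their connectivity
    conceptual_layer: dict = field(default_factory=dict)   # e.g. {"area_1": "kitchen"}

    def label_area(self, area_id, concept):
        """Attach a spatial/functional concept to a topological area,
        e.g. from object recognition or from situated dialogue."""
        self.conceptual_layer[area_id] = concept
```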

    Adaptive Training of Video Sets for Image Recognition on Mobile Phones

    We present an enhancement towards adaptive video training for PhoneGuide, a digital museum guidance system for ordinary camera-equipped mobile phones. It enables museum visitors to identify exhibits by capturing photos of them. In this article, a combined solution of object recognition and pervasive tracking is extended to a client-server system for improving data acquisition and for supporting scale-invariant object recognition.

    Augmented reality experience: from high-resolution acquisition to real time augmented contents

    This paper presents results of the research project "dUcale", which experiments with ICT solutions for the museum of Palazzo Ducale (Urbino). In this project, the famed painting "Città Ideale" becomes a case study exemplifying a specific approach to the digital mediation of cultural heritage. An augmented reality (AR) mobile application, able to enhance the museum visit experience, is presented. The computing technologies involved in the project (websites, desktop and social applications, mobile software, and AR) constitute a persuasive environment for conveying knowledge about the artwork. The overall goal of our research is to provide cultural institutions with best practices that work efficiently on low budgets. We therefore present a low-cost method for high-resolution acquisition of paintings; the resulting image is used as the base for the AR approach. The proposed methodology relies on an improved SIFT extractor for real-time image matching. The other novelty of this work is the multipoint probabilistic layer. Experimental results demonstrated the robustness of the proposed approach through extensive use of the AR application in front of the "Città Ideale" painting. To prove the usability of the application and to ensure a good user experience, we also carried out several user tests in the real scenario.
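    A minimal sketch of the SIFT-based registration this implies, using plain OpenCV SIFT rather than the paper's improved extractor or its multipoint probabilistic layer; the image path and thresholds are placeholders:

```python
import cv2
import numpy as np

sift = cv2.SIFT_create(nfeatures=1000)   # cap feature count for real-time use

# Placeholder path standing in for the high-resolution acquisition.
painting = cv2.imread("citta_ideale.jpg", cv2.IMREAD_GRAYSCALE)
kp_p, des_p = sift.detectAndCompute(painting, None)

def register(frame):
    """Estimate the homography mapping the painting into a camera frame,
    which is what anchors AR content on the artwork."""
    kp_f, des_f = sift.detectAndCompute(frame, None)
    knn = cv2.BFMatcher(cv2.NORM_L2).knnMatch(des_p, des_f, k=2)
    good = [p[0] for p in knn
            if len(p) == 2 and p[0].distance < 0.75 * p[1].distance]
    if len(good) < 10:
        return None   # not enough evidence that the painting is in view
    src = np.float32([kp_p[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp_f[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return H
```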

    Mobile Augmented Reality in Museums: Towards Enhancing Visitor's Learning Experience

    This article presents the design and implementation of a handheld Augmented Reality (AR) system called the Mobile Augmented Reality Touring System (M.A.R.T.S), along with the results of experiments conducted during museum visits using this system. These experiments study how such a tool can transform the visitor's learning experience by comparing it to two widely used museum systems. First, we present the museum learning experience and a related model which emerged from the state of the art. This model consists of two types of activity experienced by the observer of a work of art: sensitive and analytical. Then, we detail the M.A.R.T.S architecture and implementation. Our empirical study highlights the fact that AR can direct visitors' attention by emphasizing and superimposing content. Its magnifying and sensitive effects are well perceived and appreciated by visitors. The obtained results reveal that M.A.R.T.S contributes to a worthwhile learning experience.

    Comparing Feature Detectors: A bias in the repeatability criteria, and how to correct it

    Most computer vision applications rely on algorithms that find local correspondences between different images. These algorithms detect and compare stable local invariant descriptors centered at scale-invariant keypoints. Because of the importance of the problem, new keypoint detectors and descriptors are constantly being proposed, each claiming to perform better than (or to be complementary to) the preceding ones. This raises the question of a fair comparison between very diverse methods. Such evaluation has mainly been based on a repeatability criterion for the keypoints under a series of image perturbations (blur, illumination, noise, rotations, homotheties, homographies, etc.). In this paper, we argue that the classic repeatability criterion is biased towards algorithms producing redundant, overlapped detections. To compensate for this bias, we propose a variant of the repeatability rate that takes descriptor overlap into account. We apply this variant to revisit the popular benchmark by Mikolajczyk et al. on classic and new feature detectors. Experimental evidence shows that the hierarchy of these feature detectors is severely disrupted by the amended comparator.
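    A simplified illustration of the bias, treating keypoints as circles (x, y, r) with a toy correspondence test; this is not the authors' exact criterion, only a demonstration of why redundant overlapped detections inflate a many-to-one count:

```python
import numpy as np

def corresponds(a, b, tol=0.4):
    """Toy correspondence test between circular keypoints (x, y, r)."""
    return np.hypot(a[0] - b[0], a[1] - b[1]) < tol * min(a[2], b[2])

def repeatability(kps_a, kps_b, one_to_one=False):
    used, repeated = set(), 0
    for a in kps_a:
        for j, b in enumerate(kps_b):
            if one_to_one and j in used:
                continue          # a matched keypoint cannot be reused
            if corresponds(a, b):
                repeated += 1
                used.add(j)
                break
    return repeated / min(len(kps_a), len(kps_b))

# Three redundant, overlapping detections of one structure in image A,
# a single detection in image B:
A = [(10.0, 10.0, 5.0), (10.5, 10.0, 5.0), (10.0, 10.5, 5.0)]
B = [(10.2, 10.1, 5.0)]
print(repeatability(A, B))                   # 3.0 -- inflated by redundancy
print(repeatability(A, B, one_to_one=True))  # 1.0 -- redundancy not rewarded
```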