1,303 research outputs found

    Indexing, browsing and searching of digital video

    Video is a communications medium that normally brings together moving pictures with a synchronised audio track into a discrete piece or pieces of information. The size of a “piece” of video can variously be referred to as a frame, a shot, a scene, a clip, a programme or an episode, and these are distinguished by their lengths and by their composition. We shall return to the definition of each of these in section 4 of this chapter. In modern society, video is ver

    IDeixis : image-based deixis for recognizing locations

    Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2004. Includes bibliographical references (p. 31-32). In this thesis, we describe an approach to recognizing location from camera-equipped mobile devices using image-based web search. This is an image-based deixis capable of pointing at a distant location away from the user's current location. We demonstrate our approach on an application allowing users to browse web pages matching the image of a nearby location. Common image search metrics can match images captured with a camera-equipped mobile device to images found on the World Wide Web. Users can recognize the location if those pages contain information about it (e.g. name, facts, stories, etc.). Since the amount of information displayable on the device is limited, automatic keyword extraction methods can be applied to help efficiently identify relevant pieces of location information. Searching the entire web can be computationally overwhelming, so we devise a hybrid image-and-keyword searching technique. First, image search is performed over images and links to their source web pages in a database that indexes only a small fraction of the web. Then, relevant keywords on these web pages are automatically identified and submitted to an existing text-based search engine (e.g. Google) that indexes a much larger portion of the web. Finally, the resulting image set is filtered to retain images close to the original query in terms of visual similarity. It is thus possible to efficiently search hundreds of millions of images that are not only textually related but also visually relevant. By Pei-Hsiu Yeh. S.M.
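
    The three-step hybrid search described in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the thesis: images are reduced to plain feature vectors compared by cosine similarity, keywords are chosen by simple term frequency, and `text_search` is a hypothetical callable standing in for an external text search engine. The names `cosine`, `top_keywords` and `hybrid_search` are likewise invented here.

```python
import math
from collections import Counter

STOPWORDS = frozenset({"the", "a", "an", "of", "in", "and", "to"})

def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def top_keywords(page_texts, k=2):
    """Pick the k most frequent non-stopword terms (a stand-in for keyword extraction)."""
    counts = Counter(
        word
        for text in page_texts
        for word in text.lower().split()
        if word not in STOPWORDS
    )
    return [word for word, _ in counts.most_common(k)]

def hybrid_search(query_vec, small_index, text_search, seed_count=2, min_sim=0.8):
    """Image search on a small index, keyword expansion, then visual filtering."""
    # Step 1: find the visually closest entries in the small image index.
    seeds = sorted(
        small_index, key=lambda e: cosine(query_vec, e["vec"]), reverse=True
    )[:seed_count]
    # Step 2: extract keywords from the seed pages and query the larger text engine.
    keywords = top_keywords([e["page_text"] for e in seeds])
    candidates = text_search(keywords)
    # Step 3: keep only candidates that are visually similar to the original query.
    return [c for c in candidates if cosine(query_vec, c["vec"]) >= min_sim]
```

    In this sketch the expensive text engine is only consulted once per query, and the final filter is what restores visual relevance to the keyword-driven candidates; the threshold `min_sim` and the seed count are arbitrary illustrative values.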

    Appearance-based indoor localization: a comparison of patch descriptor performance

    Vision is one of the most important of the senses, and humans use it extensively during navigation. We evaluated different types of image and video frame descriptors that could be used to determine distinctive visual landmarks for localizing a person based on what is seen by a camera that they carry. To do this, we created a database containing over 3 km of video sequences with ground truth in the form of distance travelled along different corridors. Using this database, the accuracy of localization can be evaluated both in terms of knowing which route a user is on and in terms of position along a certain route. For each type of descriptor, we also tested different techniques to encode visual structure and to search between journeys to estimate a user's position. The techniques include single-frame descriptors, those using sequences of frames, and both colour and achromatic descriptors. We found that single-frame indexing worked better within this particular dataset. This might be because the motion of the person holding the camera makes the video too dependent on individual steps and motions of one particular journey. Our results suggest that appearance-based information could be an additional source of navigational data indoors, augmenting that provided by, say, radio signal strength indicators (RSSIs). Such visual information could be collected by crowdsourcing low-resolution video feeds, allowing journeys made by different users to be associated with each other, and location to be inferred without requiring explicit mapping. This offers a complementary approach to methods based on simultaneous localization and mapping (SLAM) algorithms. Comment: Accepted for publication in Pattern Recognition Letters

    A Location-Aware Middleware Framework for Collaborative Visual Information Discovery and Retrieval

    This work addresses the problem of scalable location-aware distributed indexing to enable collaborative construction and maintenance of world-scale visual maps and models, which could support numerous activities including navigation, visual localization, persistent surveillance, structure from motion, and hazard or disaster detection. Current distributed approaches to mapping and modeling fail to incorporate global geospatial addressing and are limited in their ability to customize search. Our solution is a peer-to-peer middleware framework based on XOR distance routing which employs a Hilbert space-filling curve addressing scheme in a novel distributed geographic index. This allows for a universal addressing scheme supporting publish and search in dynamic environments while ensuring global availability of the model and scalability with respect to geographic size and number of users. The framework is evaluated using large-scale network simulations and a search application that supports visual navigation in real-world experiments.
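
    The abstract does not say how the Hilbert curve addresses and XOR distance routing are combined, so the following is only a sketch under stated assumptions: a location is quantized to a cell on a 2^order by 2^order grid, the cell's Hilbert curve index is used as the lookup key, and the responsible peer is the one whose identifier has the smallest XOR distance to that key, as in Kademlia-style overlays. The function names `hilbert_index` and `closest_peer` are hypothetical.

```python
def hilbert_index(order, x, y):
    """Map grid cell (x, y) on a 2**order x 2**order grid to its Hilbert curve index."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        # Add the offset contributed by the quadrant at this scale.
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the curve orientation is consistent.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d

def closest_peer(key, peer_ids):
    """Return the peer identifier with the smallest XOR distance to key."""
    return min(peer_ids, key=lambda peer: peer ^ key)

# Example: publish a record for cell (5, 2) on an 8 x 8 grid.
key = hilbert_index(3, 5, 2)
responsible = closest_peer(key, [3, 17, 42, 60])
```

    The Hilbert mapping is used here because nearby cells tend to receive nearby indices, which is one plausible reason to pair it with a one-dimensional key space; whether the framework uses it in exactly this way is not stated in the abstract.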

    Image recognition-based architecture to enhance inclusive mobility of visually impaired people in smart and urban environments

    The demographic growth that we have witnessed in recent years, which is expected to continue in the years to come, raises emerging challenges worldwide regarding urban mobility, both in transport and pedestrian movement. The sustainable development of cities is also intrinsically linked to urban planning and mobility strategies. Navigation and orientation in cities are tasks we perform very frequently today, especially in unfamiliar cities and places. Precision remains a major challenge for current navigation solutions, especially between buildings in city centers. In this paper, we focus on visually impaired people and how they can obtain information about where they are when, for some reason, they have lost their orientation. The challenges are different and considerably greater in this situation and for this population segment. GPS, a technique widely used for navigation in outdoor environments, does not have the precision we need or the most beneficial type of content, because the information that a visually impaired person needs when lost is not the name of the street or the coordinates but a reference point. Therefore, this paper proposes a conceptual architecture for outdoor positioning of visually impaired people using the Landmark Positioning approach. 5311-8814-F0ED | Sara Maria da Cruz Maia de Oliveira Paiva

    “AccessBIM” - A Model of Environmental Characteristics for Vision Impaired Indoor Navigation and Way Finding

    The complexity of modern indoor environments has made navigation difficult for individuals with vision impairment. Hence, this thesis presents the AccessBIM framework, which is an optimized database that facilitates generation of a real-time floor plan with path determination. The AccessBIM framework has the potential to play an integral role in improving the independence and quality of life for people with vision impairment whilst also decreasing the cost to the community related to caretakers.