Indexing, browsing and searching of digital video
Video is a communications medium that normally brings together moving pictures with a synchronised audio track into a discrete piece or pieces of information. The size of a "piece" of video can variously be referred to as a frame, a shot, a scene, a clip, a programme or an episode, and these are distinguished by their lengths and by their composition. We shall return to the definition of each of these in section 4 of this chapter. In modern society, video is ver
IDeixis : image-based deixis for recognizing locations
Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2004. Includes bibliographical references (p. 31-32). In this thesis, we describe an approach to recognizing location from camera-equipped mobile devices using image-based web search. This is an image-based deixis capable of pointing at a distant location away from the user's current location. We demonstrate our approach on an application allowing users to browse web pages matching the image of a nearby location. Common image search metrics can match images captured with a camera-equipped mobile device to images found on the World Wide Web. The users can recognize the location if those pages contain information about this location (e.g. name, facts, stories). Since the amount of information displayable on the device is limited, automatic keyword extraction methods can be applied to help efficiently identify relevant pieces of location information. Searching the entire web can be computationally overwhelming, so we devise a hybrid image-and-keyword searching technique. First, image search is performed over images and links to their source web pages in a database that indexes only a small fraction of the web. Then, relevant keywords on these web pages are automatically identified and submitted to an existing text-based search engine (e.g. Google) that indexes a much larger portion of the web. Finally, the resulting image set is filtered to retain images close to the original query in terms of visual similarity. It is thus possible to efficiently search hundreds of millions of images that are not only textually related but also visually relevant. by Pei-Hsiu Yeh. S.M
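The three stages of the hybrid image-and-keyword search described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the thesis implementation: the feature vectors, the cosine-similarity metric, the frequency-based keyword picker, and every name here (`hybrid_search`, `top_keywords`, `text_search`, `min_sim`) are assumptions made for the example, and the text search engine is passed in as a caller-supplied function.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not from the thesis).
STOPWORDS = {"the", "and", "for", "with", "this", "that", "from"}

def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def top_keywords(texts, k=3):
    """Pick the k most frequent non-stopword terms (stand-in for keyword extraction)."""
    counts = Counter()
    for text in texts:
        counts.update(
            w for w in re.findall(r"[a-z]+", text.lower())
            if len(w) > 2 and w not in STOPWORDS
        )
    return [word for word, _ in counts.most_common(k)]

def hybrid_search(query_vec, small_index, text_search, n_seed=2, k=3, min_sim=0.8):
    """small_index: list of (feature_vec, page_text) pairs from a small image database.
    text_search: hypothetical callable, keywords -> list of (image_id, feature_vec)."""
    # Stage 1: image search over the small index.
    seeds = sorted(small_index, key=lambda e: cosine(query_vec, e[0]), reverse=True)[:n_seed]
    # Stage 2: extract keywords from the matched pages and query the text engine.
    keywords = top_keywords([page for _, page in seeds], k)
    candidates = text_search(keywords)
    # Stage 3: keep only candidates that are visually close to the query.
    return [img for img, vec in candidates if cosine(query_vec, vec) >= min_sim]
```

Passing the text engine in as a function keeps the sketch independent of any particular search API; the similarity threshold in the last stage is an arbitrary illustrative value.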
Appearance-based indoor localization: a comparison of patch descriptor performance
Vision is one of the most important of the senses, and humans use it
extensively during navigation. We evaluated different types of image and video
frame descriptors that could be used to determine distinctive visual landmarks
for localizing a person based on what is seen by a camera that they carry. To
do this, we created a database containing over 3 km of video-sequences with
ground-truth in the form of distance travelled along different corridors. Using
this database, the accuracy of localization can be evaluated, both in terms of
knowing which route a user is on and in terms of position along that route.
For each type of descriptor, we also tested different techniques to
encode visual structure and to search between journeys to estimate a user's
position. The techniques include single-frame descriptors, those using
sequences of frames, and both colour and achromatic descriptors. We found that
single-frame indexing worked better within this particular dataset. This might
be because the motion of the person holding the camera makes the video too
dependent on individual steps and motions of one particular journey. Our
results suggest that appearance-based information could be an additional source
of navigational data indoors, augmenting that provided by, say, radio signal
strength indicators (RSSIs). Such visual information could be collected by
crowdsourcing low-resolution video feeds, allowing journeys made by different
users to be associated with each other, and location to be inferred without
requiring explicit mapping. This offers a complementary approach to methods
based on simultaneous localization and mapping (SLAM) algorithms. Comment: Accepted for publication in Pattern Recognition Letter
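A single-frame lookup of the kind this abstract describes might look like the sketch below: each database frame carries a descriptor plus ground-truth route and distance, and a query frame is assigned the position of its nearest database descriptor. The grey-level histogram descriptor, the Euclidean nearest-neighbour search, and the names `grey_histogram` and `localize` are illustrative assumptions; the paper compares patch descriptors that are not reproduced here.

```python
import math

def grey_histogram(pixels, bins=4):
    """Normalized intensity histogram of a flat list of 0-255 grey values.
    A deliberately simple achromatic descriptor, assumed for illustration."""
    width = 256 / bins
    hist = [0.0] * bins
    for p in pixels:
        hist[min(bins - 1, int(p / width))] += 1.0
    total = len(pixels)
    return [h / total for h in hist] if total else hist

def localize(query_descriptor, database):
    """database: list of (descriptor, route_name, metres_along_route).
    Returns the route and position of the nearest stored frame."""
    best = min(database, key=lambda entry: math.dist(query_descriptor, entry[0]))
    return best[1], best[2]

# Usage with toy frames (values are made up for the example).
database = [
    (grey_histogram([0, 10, 20, 30]), "corridor_A", 1.5),
    (grey_histogram([200, 220, 240, 250]), "corridor_B", 7.0),
]
route, position = localize(grey_histogram([5, 15, 25, 100]), database)
```

The lookup returns both which route the frame belongs to and the position along it, which are the two aspects of accuracy the abstract evaluates.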
A Location-Aware Middleware Framework for Collaborative Visual Information Discovery and Retrieval
This work addresses the problem of scalable location-aware distributed indexing to enable the leveraging of collaborative effort for the construction and maintenance of world-scale visual maps and models which could support numerous activities including navigation, visual localization, persistent surveillance, structure from motion, and hazard or disaster detection. Current distributed approaches to mapping and modeling fail to incorporate global geospatial addressing and offer limited ability to customize search. Our solution is a peer-to-peer middleware framework based on XOR distance routing which employs a Hilbert space-filling curve addressing scheme in a novel distributed geographic index. This allows for a universal addressing scheme supporting publish and search in dynamic environments while ensuring global availability of the model and scalability with respect to geographic size and number of users. The framework is evaluated using large-scale network simulations and a search application that supports visual navigation in real-world experiments
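To make the addressing idea in this abstract concrete, the sketch below maps a latitude/longitude pair to a position along a Hilbert curve and then selects the peer whose identifier is closest under the XOR metric. The Hilbert conversion is the standard iterative algorithm; how the framework actually combines the curve with XOR routing is not stated in the abstract, so `geo_key`, `closest_node`, the grid resolution, and the use of the Hilbert index directly as a routing key are all assumptions.

```python
def xy2d(n, x, y):
    """Map cell (x, y) of an n-by-n grid (n a power of two) to its Hilbert curve index."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate or flip the quadrant so the sub-curve is oriented consistently.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d

def geo_key(lat, lon, order=16):
    """Quantize latitude and longitude to a 2**order grid and return the Hilbert index."""
    n = 1 << order
    x = min(n - 1, int((lon + 180.0) / 360.0 * n))
    y = min(n - 1, int((lat + 90.0) / 180.0 * n))
    return xy2d(n, x, y)

def closest_node(key, node_ids):
    """Pick the node whose integer ID has the smallest XOR distance to the key."""
    return min(node_ids, key=lambda node_id: node_id ^ key)
```

Because consecutive Hilbert indices always belong to neighbouring grid cells, nearby locations often receive numerically close keys, which is one common motivation for using the curve in a geographic index.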
Image recognition-based architecture to enhance inclusive mobility of visually impaired people in smart and urban environments
The demographic growth that we have witnessed in recent years, which is expected to increase in the years to come, raises emerging challenges worldwide regarding urban mobility, both in transport and pedestrian movement. The sustainable development of cities is also intrinsically linked to urban planning and mobility strategies. Navigation and orientation in cities are tasks that we perform very frequently today, especially in unfamiliar cities and places. Current navigation solutions cite precision as a major challenge, especially between buildings in city centers. In this paper, we focus on visually impaired people and how they can obtain information about where they are when, for some reason, they have lost their orientation. The challenges in this situation, and for this population segment, are different and considerably greater. GPS, a technique widely used for navigation in outdoor environments, offers neither the precision we need nor the most useful type of content, because the information that a visually impaired person needs when lost is not the name of the street or the coordinates but a reference point. Therefore, this paper proposes a conceptual architecture for outdoor positioning of visually impaired people using the Landmark Positioning approach. 5311-8814-F0ED | Sara Maria da Cruz Maia de Oliveira Paiva N/
"AccessBIM" - A Model of Environmental Characteristics for Vision Impaired Indoor Navigation and Way Finding
The complexity of modern indoor environments has made navigation difficult for individuals with vision impairment. Hence, this thesis presents the AccessBIM framework, which is an optimized database that facilitates generation of a real-time floor plan with path determination. The AccessBIM framework has the potential to play an integral role in improving the independence and quality of life for people with vision impairment whilst also decreasing the cost to the community related to caretakers