    Correlating Pedestrian Flows and Search Engine Queries

    An important challenge for ubiquitous computing is the development of techniques that can characterize a location vis-à-vis the richness and diversity of urban settings. In this paper we report our work on correlating urban pedestrian flows with Google search queries. Using longitudinal data, we show that pedestrian flows at particular locations can be correlated with the frequency of Google search terms that are semantically relevant to those locations. Our approach can identify relevant content, media, and advertisements for particular locations. (Comment: 4 pages, 1 figure, 1 table)
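
    As a concrete illustration of the correlation step, the sketch below ranks candidate search terms by the Pearson correlation between their query frequency and pedestrian counts at one location. The data, function name and the choice of Pearson correlation are assumptions for illustration only; the abstract does not prescribe this exact procedure.

        import numpy as np
        from scipy.stats import pearsonr

        def rank_terms_by_flow_correlation(flow_counts, term_frequencies):
            # flow_counts: pedestrian counts per time bin (e.g. per week)
            # term_frequencies: dict of term -> query frequencies, same bins
            flow = np.asarray(flow_counts, dtype=float)
            ranked = []
            for term, freqs in term_frequencies.items():
                r, p = pearsonr(flow, np.asarray(freqs, dtype=float))
                ranked.append((term, r, p))
            return sorted(ranked, key=lambda t: t[1], reverse=True)  # best first

        # Hypothetical weekly counts for one location and two candidate terms
        flows = [120, 340, 560, 980, 640, 300, 150]
        terms = {"restaurant": [10, 30, 55, 95, 60, 28, 12],
                 "parking":    [40, 38, 42, 41, 39, 40, 43]}
        for term, r, p in rank_terms_by_flow_correlation(flows, terms):
            print(f"{term}: r={r:.2f}  p={p:.3f}")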

    Information extraction from multimedia web documents: an open-source platform and testbed

    The LivingKnowledge project aimed to advance the state of the art in search, retrieval and knowledge management on the web by extending the use of sentiment and opinion analysis within multimedia applications. To achieve this aim, a diverse set of novel and complementary analysis techniques has been integrated into a single, extensible software platform on which such applications can be built. The platform combines state-of-the-art techniques for extracting facts, opinions and sentiment from multimedia documents and, unlike earlier platforms, exploits both visual and textual techniques to support multimedia information retrieval. Foreseeing the usefulness of this software to the wider community, the platform has been made generally available as an open-source project. This paper describes the platform design, gives an overview of the analysis algorithms integrated into the system, and describes two applications that utilise the system for multimedia information retrieval.
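
    The platform's actual API is not shown in the abstract; the following is a minimal sketch of the general idea of an extensible pipeline in which independent textual and visual annotators enrich a shared document. All names, the toy sentiment lexicon and the placeholder visual analyser are hypothetical.

        from dataclasses import dataclass, field

        @dataclass
        class Document:
            text: str
            image_paths: list
            annotations: dict = field(default_factory=dict)

        class Pipeline:
            # Independent annotators (textual or visual) enrich a shared document
            def __init__(self):
                self.annotators = []

            def register(self, annotator):
                self.annotators.append(annotator)
                return annotator

            def run(self, doc):
                for annotator in self.annotators:
                    annotator(doc)
                return doc

        pipeline = Pipeline()

        @pipeline.register
        def naive_sentiment(doc):
            # Toy lexicon; a real system would use a trained opinion model
            positive, negative = {"good", "great", "excellent"}, {"bad", "poor"}
            tokens = doc.text.lower().split()
            doc.annotations["sentiment"] = (sum(t in positive for t in tokens)
                                            - sum(t in negative for t in tokens))

        @pipeline.register
        def image_presence(doc):
            # Stand-in for a visual analyser (e.g. face or logo detection)
            doc.annotations["num_images"] = len(doc.image_paths)

        doc = pipeline.run(Document("a great result", ["fig1.jpg"]))
        print(doc.annotations)  # {'sentiment': 1, 'num_images': 1}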

    Semantic analysis of field sports video using a petri-net of audio-visual concepts

    The most common approach to automatic summarisation and highlight detection in sports video is to train an automatic classifier to detect semantic highlights based on occurrences of low-level features such as action replays, excited commentators or changes in a scoreboard. We propose an alternative approach based on the detection of perception concepts (PCs) and the construction of Petri-Nets, which can be used both for semantic description and for event detection within sports videos. Low-level algorithms for the detection of perception concepts using visual, aural and motion characteristics are proposed, and a series of Petri-Nets composed of perception concepts is formally defined to describe video content. We call this the Perception Concept Network-Petri Net (PCN-PN) model. Using PCN-PNs, personalised high-level semantic descriptions of video highlights can be produced and queries on high-level semantics can be answered. A particular strength of this framework is that semantic detectors based on PCN-PNs can easily be built to search within sports videos and locate interesting events. Experimental results on recorded sports video data across three types of sports (soccer, basketball and rugby), each from multiple broadcasters, illustrate the potential of this framework.
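
    To make the Petri-Net idea concrete, here is a minimal toy net (not the paper's full PCN-PN formalism): places hold tokens for detected perception concepts, and a transition fires once all of its input concepts have been observed, signalling a candidate highlight. All concept and transition names are hypothetical.

        class PetriNet:
            # Toy net: places hold token counts; a transition fires when
            # every one of its input places holds at least one token.
            def __init__(self, transitions):
                self.transitions = transitions  # name -> (inputs, outputs)
                self.marking = {}

            def add_token(self, place):
                self.marking[place] = self.marking.get(place, 0) + 1

            def fire_enabled(self):
                fired = []
                for name, (inputs, outputs) in self.transitions.items():
                    if all(self.marking.get(p, 0) > 0 for p in inputs):
                        for p in inputs:
                            self.marking[p] -= 1
                        for p in outputs:
                            self.add_token(p)
                        fired.append(name)
                return fired

        # Hypothetical perception concepts feeding a "goal" highlight detector
        net = PetriNet({"goal_event": (["excited_commentary", "scoreboard_change"],
                                       ["goal_highlight"])})
        net.add_token("excited_commentary")   # detected in the audio track
        net.add_token("scoreboard_change")    # detected in the video track
        print(net.fire_enabled())                    # ['goal_event']
        print(net.marking.get("goal_highlight", 0))  # 1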

    Cross-View Image Matching for Geo-localization in Urban Environments

    In this paper, we address the problem of cross-view image geo-localization. Specifically, we aim to estimate the GPS location of a query street-view image by finding the matching images in a reference database of geo-tagged bird's-eye-view images, or vice versa. To this end, we present a new framework for cross-view image geo-localization that takes advantage of the tremendous success of deep convolutional neural networks (CNNs) in image classification and object detection. First, we employ Faster R-CNN to detect buildings in the query and reference images. Next, for each building in the query image, we retrieve the k nearest neighbors from the reference buildings using a Siamese network trained on both positive (matching) and negative (non-matching) image pairs. To find the correct nearest neighbor (NN) for each query building, we develop an efficient multiple-nearest-neighbors matching method based on dominant sets. We evaluate the proposed framework on a new dataset that consists of pairs of street-view and bird's-eye-view images. Experimental results show that the proposed method achieves better geo-localization accuracy than other approaches and is able to generalize to images at unseen locations.
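
    A minimal sketch of the k-nearest-neighbor retrieval step, assuming building embeddings have already been produced by a Siamese network; random vectors stand in for real embeddings here, and the paper's dominant-sets matching stage is only noted in a comment.

        import numpy as np

        def knn_matches(query_emb, ref_embs, k=5):
            # k nearest reference buildings by L2 distance in embedding space
            dists = np.linalg.norm(ref_embs - query_emb, axis=1)
            idx = np.argsort(dists)[:k]
            return idx, dists[idx]

        rng = np.random.default_rng(0)
        ref_embs = rng.normal(size=(1000, 128))  # 1000 reference buildings
        query_emb = rng.normal(size=128)         # one detected query building
        idx, d = knn_matches(query_emb, ref_embs)
        print(idx, d)
        # The dominant-sets stage would then pick, across all query
        # buildings, a mutually consistent subset of these candidates.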

    Contextual queries and situated information needs for mobile users

    The users of mobile devices increasingly use networked services to address their information needs. Questions asked by mobile users are strongly influenced by contextual factors such as location, conversation and activity. We report on a diary study performed to better understand mobile information needs. Participants' diary entries are used as a basis for discussing the geographical and situational context in which mobile information behaviour occurs. The suitability of user queries for being answered by a portable knowledge collection or by web search is also considered. We find that the type of question recorded by participants varies across locations, with differences between home, shopping and in-car contexts. These variations occur both in the query terms and in the form of the desired answers. Both the location of queries and the participants' activities affected participants' questions; when information needs were affected by both location and activity, they tended to be strongly affected by both factors. The overall picture that emerges is one of multiple contextual influences interacting to shape mobile information needs. Mobile devices that attempt to adapt to users' context will need to account for a rich variety of situational factors.

    Automatic semantic video annotation in wide domain videos based on similarity and commonsense knowledgebases

    In this paper, we introduce a novel framework for automatic semantic video annotation. Because this framework detects possible events occurring in video clips, it can form the annotation basis of a video search engine. To achieve this purpose, the system has to be able to operate on uncontrolled wide-domain videos, so all layers have to be based on generic features. The framework aims to bridge the "semantic gap", the difference between low-level visual features and human perception, by finding videos with similar visual events, analyzing their free-text annotations to find a common theme, and then deciding on the best description for the new video using commonsense knowledge bases. Experiments were performed on wide-domain video clips from the TRECVID 2005 BBC rushes standard database. Results from these experiments show promising interplay between these two layers in finding expressive annotations for the input video. These results were evaluated based on retrieval performance.
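
    A minimal sketch of the similarity layer, under the assumption that each annotated video is represented by a feature vector: the free-text annotations of the visually nearest videos are pooled and the most frequent terms are proposed as a description. The commonsense-knowledgebase filtering described above is omitted, and all data below is made up for illustration.

        from collections import Counter
        import numpy as np

        def annotate_by_similarity(query_feat, video_feats, video_texts,
                                   k=2, top_terms=4):
            # Pool free-text annotations of the k visually nearest videos
            dists = np.linalg.norm(video_feats - query_feat, axis=1)
            counts = Counter()
            for i in np.argsort(dists)[:k]:
                counts.update(video_texts[i].lower().split())
            return [term for term, _ in counts.most_common(top_terms)]

        feats = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]])
        texts = ["goal scored crowd cheering",
                 "goal replay crowd",
                 "studio interview panel"]
        print(annotate_by_similarity(np.array([0.85, 0.15]), feats, texts))
        # ['goal', 'crowd', 'scored', 'cheering'] (order of ties may vary)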