91 research outputs found

    Context-aware person identification in personal photo collections

    Identifying the people in photos is an important need for users of photo management systems. We present MediAssist, one such system which facilitates browsing, searching and semi-automatic annotation of personal photos, using analysis of both image content and the context in which the photo is captured. This semi-automatic annotation includes annotation of the identity of people in photos. In this paper, we focus on such person annotation, and propose person identification techniques based on a combination of context and content. We propose language modelling and nearest neighbor approaches to context-based person identification, in addition to novel face color and image color content-based features (used alongside face recognition and body patch features). We conduct a comprehensive empirical study of these techniques using the real private photo collections of a number of users, and show that combining context- and content-based analysis improves performance over content or context alone

    Coping with noise in a real-world weblog crawler and retrieval system

    In this paper we examine the effects of noise when creating a real-world weblog corpus for information retrieval. We focus on the DiffPost (Lee et al. 2008) approach to noise removal from blog pages, examining the difficulties encountered when crawling the blogosphere during the creation of a real-world corpus of blog pages. We introduce and evaluate a number of enhancements to the original DiffPost approach in order to increase the robustness of the algorithm. We then extend DiffPost by looking at the anchor-text to text ratio, and discover that the time-interval between crawls is more important to the successful application of noise-removal algorithms within the blog context than any additional improvements to the removal algorithm itself

    Efficient storage and decoding of SURF feature points

    Practical use of SURF feature points in large-scale indexing and retrieval engines requires an efficient means for storing and decoding these features. This paper investigates several methods for compression and storage of SURF feature points, considering both storage consumption and disk-read efficiency. We compare each scheme with a baseline plain-text encoding scheme as used by many existing SURF implementations. Our final proposed scheme significantly reduces both the time required to load and decode feature points, and the space required to store them on disk
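
    The gap between a plain-text baseline and a packed binary layout can be illustrated with a minimal sketch. The abstract does not describe the proposed scheme, so the fixed-width float32 layout below is only an assumed stand-in, and the function names are hypothetical. Only the 64-value descriptor is stored; keypoint position, scale and orientation are left out.

```python
import struct
from typing import List

DIM = 64  # length of a standard SURF descriptor

def encode_text(desc: List[float]) -> bytes:
    """Plain-text baseline: space-separated decimal values."""
    return " ".join(repr(v) for v in desc).encode("ascii")

def encode_binary(desc: List[float]) -> bytes:
    """Assumed alternative: little-endian float32 values, 4 bytes each."""
    return struct.pack(f"<{DIM}f", *desc)

def decode_binary(blob: bytes) -> List[float]:
    """Inverse of encode_binary; no text parsing is needed."""
    return list(struct.unpack(f"<{DIM}f", blob))
```

    In this layout every descriptor occupies exactly 256 bytes, so a reader can seek directly to the n-th feature instead of scanning and parsing text. Values are rounded to float32 precision on encoding.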

    Combination of content analysis and context features for digital photograph retrieval

    In recent years digital cameras have seen an enormous rise in popularity, leading to a huge increase in the quantity of digital photos being taken. This brings with it the challenge of organising these large collections. The MediAssist project uses date/time and GPS location for the organisation of personal collections. However, this context information is not always sufficient to support retrieval when faced with a large, shared, archive made up of photos from a number of users. We present work in this paper which retrieves photos of known objects (buildings, monuments) using both location information and content-based retrieval tools from the AceToolbox. We show that for this retrieval scenario, where a user is searching for photos of a known building or monument in a large shared collection, content-based techniques can offer a significant improvement over ranking based on context (specifically location) alone

    An investigation of term weighting approaches for microblog retrieval

    The use of effective term frequency weighting and document length normalisation strategies has been shown over a number of decades to have a significant positive effect on document retrieval. When dealing with much shorter documents, such as those obtained from microblogs, it would seem intuitive that these would have less benefit. In this paper we investigate their effect on microblog retrieval performance using the Tweets2011 collection from the TREC 2011 Microblog Track
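
    To make the two factors concrete, the sketch below uses the term-frequency component of BM25 as a familiar example. This is an assumption for illustration: the abstract does not name the weighting schemes that were evaluated. The parameter `k1` controls term-frequency saturation and `b` controls the strength of document length normalisation.

```python
def bm25_tf(tf: float, doc_len: float, avg_doc_len: float,
            k1: float = 1.2, b: float = 0.75) -> float:
    """Term-frequency component of BM25 (the IDF factor is omitted)."""
    # Length normalisation: b = 0 ignores document length, b = 1 applies it fully.
    norm = 1.0 - b + b * (doc_len / avg_doc_len)
    # Saturating term-frequency weight: larger k1 lets repeated terms count for more.
    return tf * (k1 + 1.0) / (tf + k1 * norm)
```

    Setting `b = 0` switches length normalisation off, and setting `k1 = 0` reduces the weight to a binary presence indicator, which is one way to test whether either factor helps on very short documents.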

    Combining social network analysis and sentiment analysis to explore the potential for online radicalisation

    The increased online presence of jihadists has raised the possibility of individuals being radicalised via the Internet. To date, the study of violent radicalisation has focused on dedicated jihadist websites and forums. This may not be the ideal starting point for such research, as participants in these venues may be described as “already made-up minds”. Crawling a global social networking platform, such as YouTube, on the other hand, has the potential to unearth content and interaction aimed at radicalisation of those with little or no apparent prior interest in violent jihadism. This research explores whether such an approach is indeed fruitful. We collected a large dataset from a group within YouTube that we identified as potentially having a radicalising agenda. We analysed this data using social network analysis and sentiment analysis tools, examining the topics discussed and the sentiment polarity (positive or negative) expressed towards these topics. In particular, we focus on gender differences in this group of users, suggesting more extreme and less tolerant views among female users

    A generic news story segmentation system and its evaluation

    The paper presents an approach to segmenting broadcast TV news programmes automatically into individual news stories. We first segment the programme into individual shots, and then a number of analysis tools are run on the programme to extract features to represent each shot. The results of these feature extraction tools are then combined using a support vector machine trained to detect anchorperson shots. A news broadcast can then be segmented into individual stories based on the location of the anchorperson shots within the programme. We use one generic system to segment programmes from two different broadcasters, illustrating the robustness of our feature extraction process to the production styles of different broadcasters
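
    The final step, turning per-shot anchorperson decisions into story boundaries, can be sketched as follows. The per-shot labels are assumed to come from an already trained classifier, and the boundary rule (a story starts at each anchorperson shot that follows a non-anchorperson shot) is an assumption, since the abstract does not state the exact rule. The function name is hypothetical.

```python
from typing import List, Tuple

def segment_stories(is_anchor: List[bool]) -> List[Tuple[int, int]]:
    """Return (first_shot, last_shot) index pairs, one per story."""
    stories: List[Tuple[int, int]] = []
    start = 0
    for i in range(1, len(is_anchor)):
        # A run of consecutive anchorperson shots opens only one story.
        if is_anchor[i] and not is_anchor[i - 1]:
            stories.append((start, i - 1))
            start = i
    if is_anchor:
        # Close the last story at the final shot of the programme.
        stories.append((start, len(is_anchor) - 1))
    return stories
```

    For example, `segment_stories([True, False, False, True, True, False, True])` returns `[(0, 2), (3, 5), (6, 6)]`.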

    "I’m Eating a Sandwich in Glasgow": Modeling locations with tweets

    Social media such as Twitter generate large quantities of data about what a person is thinking and doing in a particular location. We leverage this data to build models of locations to improve our understanding of a user’s geographic context. Understanding the user’s geographic context can in turn enable a variety of services that allow us to present information, recommend businesses and services, and place advertisements that are relevant at a hyper-local level. In this paper we create language models of locations using coordinates extracted from geotagged Twitter data. We model locations at varying levels of granularity, from the zip code to the country level. We measure the accuracy of these models by the degree to which we can predict the location of an individual tweet, and further by the accuracy with which we can predict the location of a user. We find that we can meet the performance of the industry standard tool for predicting both the tweet and the user at the country, state and city levels, and far exceed its performance at the hyper-local level, achieving a three- to ten-fold increase in accuracy at the zip code level
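
    A location language model of this kind can be sketched in a few lines. The version below is an illustration under stated assumptions, not the paper's model: it uses unigram counts per location, add-one smoothing, whitespace tokenisation and a uniform prior over locations, none of which are specified in the abstract. The function names are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Iterable, Tuple

def train(tweets: Iterable[Tuple[str, str]]) -> Dict[str, Counter]:
    """Build one unigram count table per location from (location, text) pairs."""
    models: Dict[str, Counter] = {}
    for location, text in tweets:
        models.setdefault(location, Counter()).update(text.lower().split())
    return models

def predict(models: Dict[str, Counter], text: str) -> str:
    """Return the location whose smoothed model gives the text the highest likelihood."""
    vocab_size = len({word for counts in models.values() for word in counts})
    tokens = text.lower().split()

    def log_likelihood(counts: Counter) -> float:
        total = sum(counts.values())
        # Add-one smoothing keeps unseen words from zeroing out a location.
        return sum(math.log((counts[t] + 1) / (total + vocab_size)) for t in tokens)

    return max(models, key=lambda loc: log_likelihood(models[loc]))
```

    The same code applies at any granularity: the location key can be a zip code, a city or a country, depending on how the geotagged coordinates are binned before training.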

    A study of the imaging of contrast agents for use in computerised tomography

    A computed tomography (CT) scanner is a device which is capable of mapping the variation in linear attenuation coefficient in a slice through an object. This is achieved by the multiple measurement of the attenuation of an X-ray beam at various positions and angles through the body. In medical diagnostic imaging using CT, contrast agents are administered to patients resulting in increased attenuation of the beam in the areas where the contrast agent resides. The increased contrast results in the easier and more accurate visualisation of abnormalities. In contrast-enhanced CT, iodine is almost universally used as the contrast agent when imaging the heart and associated arteries / veins. This is due to its low toxicity and high enhancement. It has been used extensively in traditional diagnostic radiology prior to the introduction of CT. A study was performed to determine whether iodine was the optimum element, in terms of the minimum concentration needed for visualisation, to use in contrast-enhanced CT scanning of the myocardium / heart wall. The results of this study show that gadolinium, and not iodine, is the optimum element to use as a CT contrast agent. Gadolinium, chelated to DTPA, is presently used as a contrast agent in MRI. The above study concentrated only on the particular case of imaging the myocardium. A theoretical study was undertaken to determine the minimum concentration of any element when scanned using two different imaging methods. The situation studied was that of administering the contrast agent / analyte to a cylinder, which is itself contained inside another cylinder, the space between filled with some matrix. By varying the size of the inner cylinder, administration of a contrast agent to various organs or arteries can be simulated. By varying the size of the outer cylinder, various object / patient sizes can be studied. In the first imaging method, two scans are performed at any energy, one with and one without the analyte present. These scans are subtracted to yield an image of the analyte alone. In the second method two scans are performed; one on the high side and one on the low side of the K absorption edge of the analyte. Again these are subtracted to yield an image of the analyte since the variation in the attenuation of the matrix across the K-edge is minor compared to that of the analyte. The equations were verified by both computer simulations and experimental scans. Two important results were obtained. As the relative size of the inner cylinder decreases, firstly the optimum element shifts towards higher atomic number transition elements and secondly, the ratio of the minimum concentration of the optimum elements to the minimum concentration of iodine needed decreases making the case for using the transition elements as contrast agents stronger when imaging low relative size objects
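
    The two subtraction methods can be written out using the standard exponential attenuation law. This is a sketch in assumed notation, not the thesis's own equations: $\mu_m$ and $\mu_a$ denote the linear attenuation contributions of the matrix and the analyte, $s$ is position along a ray, and $E_-$ and $E_+$ are energies just below and just above the K-edge. The analyte is assumed to add to the matrix attenuation without changing it.

```latex
% Transmitted intensity and log-projection along one ray
I(E) = I_0(E)\,\exp\!\left(-\int \mu(s,E)\,ds\right),
\qquad
p(E) = -\ln\frac{I(E)}{I_0(E)} = \int \mu(s,E)\,ds

% Method 1: scans with and without the analyte at the same energy E
p_{\text{with}}(E) - p_{\text{without}}(E) = \int \mu_a(s,E)\,ds

% Method 2: scans on either side of the analyte's K-edge
p(E_+) - p(E_-)
  = \int \bigl[\mu_a(s,E_+) - \mu_a(s,E_-)\bigr]\,ds
  + \int \bigl[\mu_m(s,E_+) - \mu_m(s,E_-)\bigr]\,ds
  \approx \int \bigl[\mu_a(s,E_+) - \mu_a(s,E_-)\bigr]\,ds
```

    The approximation in the second method holds because the matrix attenuation changes little across the narrow energy interval around the K-edge, while the analyte attenuation jumps sharply there.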