Context-aware person identification in personal photo collections
Identifying the people in photos is an important need for users of photo management systems. We present MediAssist, one such system which facilitates browsing, searching and semi-automatic annotation of personal photos, using analysis of both image content and the context in which the photo is captured. This semi-automatic annotation includes annotation of the identity of people in photos. In this paper, we focus on such person annotation, and propose person identification techniques based on a combination of context and content. We propose language modelling and nearest neighbor approaches to context-based person identification, in addition to novel face color and image color content-based features (used alongside face recognition and body patch features). We conduct a comprehensive empirical study of these techniques using the real private photo collections of a number of users, and show that combining context- and content-based analysis improves performance over content or context alone.
Coping with noise in a real-world weblog crawler and retrieval system
In this paper we examine the effects of noise when creating a real-world weblog corpus for information retrieval. We focus on the DiffPost (Lee et al. 2008) approach to noise removal from blog pages, examining the difficulties encountered when crawling the blogosphere during the creation of a real-world corpus of blog pages. We introduce and evaluate a number of enhancements to the original DiffPost approach in order to increase the robustness of the algorithm. We then extend DiffPost by looking at the anchor-text to text ratio, and discover that the time-interval between crawls is more important to the successful application of noise-removal algorithms within the blog context than any additional improvements to the removal algorithm itself.
Combination of content analysis and context features for digital photograph retrieval
In recent years digital cameras have seen an enormous rise in popularity, leading to a huge increase in the quantity of digital photos being taken. This brings with it the challenge of organising these large collections. The MediAssist project uses date/time and GPS location for the organisation of personal collections. However, this context information is not always sufficient to support retrieval when faced with a large, shared archive made up of photos from a number of users. We present work in this paper which retrieves photos of known objects (buildings, monuments) using both location information and content-based retrieval tools from the AceToolbox. We show that for this retrieval scenario, where a user is searching for photos of a known building or monument in a large shared collection, content-based techniques can offer a significant improvement over ranking based on context (specifically location) alone.
Combining social network analysis and sentiment analysis to explore the potential for online radicalisation
The increased online presence of jihadists has raised the possibility of individuals being radicalised via the Internet. To date, the study of violent radicalisation has focused on dedicated jihadist websites and forums. This may not be the ideal starting point for such research, as participants in these venues may be described as "already made-up minds". Crawling a global social networking platform, such as YouTube, on the other hand, has the potential to unearth content and interaction aimed at radicalisation of those with little or no apparent prior interest in violent jihadism. This research explores whether such an approach is indeed fruitful. We collected a large dataset from a group within YouTube that we identified as potentially having a radicalising agenda. We analysed this data using social network analysis and sentiment analysis tools, examining the topics discussed and the sentiment polarity (positive or negative) towards these topics. In particular, we focus on gender differences in this group of users, suggesting more extreme and less tolerant views among female users.
A generic news story segmentation system and its evaluation
The paper presents an approach to segmenting broadcast TV news programmes automatically into individual news stories. We first segment the programme into individual shots, and then a number of analysis tools are run on the programme to extract features to represent each shot. The results of these feature extraction tools are then combined using a support vector machine trained to detect anchorperson shots. A news broadcast can then be segmented into individual stories based on the location of the anchorperson shots within the programme. We use one generic system to segment programmes from two different broadcasters, illustrating the robustness of our feature extraction process to the production styles of different broadcasters.
Mobile access to personal digital photograph archives
Handheld computing devices are becoming highly connected devices with high capacity storage. This has resulted in their being able to support storage of, and access to, personal photo archives. However, the only means for mobile device users to browse such archives is typically a simple one-by-one scroll through image thumbnails in the order that they were taken, or by manually organising them based on folders. In this paper we describe a system for context-based browsing of personal digital photo archives. Photos are labeled with the GPS location and time they are taken and this is used to derive other context-based metadata such as weather conditions and daylight conditions. We present our prototype system for mobile digital photo retrieval, and an experimental evaluation illustrating the utility of location information for effective personal photo retrieval.
My digital photos: where and when?
In recent years digital cameras have seen an enormous rise in popularity, leading to a huge increase in the quantity of digital photos being taken. This brings with it the challenge of organising these large collections. We present work which organises personal digital photo collections based on date/time and GPS location, which we believe will become a key organisational methodology over the next few years as consumer digital cameras evolve to incorporate GPS and as cameras in mobile phones spread further. The accompanying video illustrates the results of our research into digital photo management tools. It contains a series of screens and user interactions highlighting how a user utilises the tools we are developing to manage a personal archive of digital photos.
Topic-dependent sentiment analysis of financial blogs
While most work in sentiment analysis in the financial domain has focused on the use of content from traditional finance news, in this work we concentrate on a more subjective source of information: blogs. We aim to automatically determine the sentiment of financial bloggers towards companies and their stocks. To do this we develop a corpus of financial blogs, annotated with polarity of sentiment with respect to a number of companies. We conduct an analysis of the annotated corpus, from which we show there is a significant level of topic shift within this collection, and also illustrate the difficulty that human annotators have when annotating certain sentiment categories. To deal with the problem of topic shift within blog articles, we propose text extraction techniques to create topic-specific sub-documents, which we use to train a sentiment classifier. We show that such approaches provide a substantial improvement over full document classification and that word-based approaches perform better than sentence-based or paragraph-based approaches.
An examination of a large visual lifelog
With lifelogging gaining in popularity, we examine the differences between visual lifelog photos and explicitly captured digital photos. We do this based on an examination of over a year of continuous visual lifelog capture and a collection of over ten thousand personal digital photos.
Exploring the use of paragraph-level annotations for sentiment analysis of financial blogs
In this paper we describe our work in the area of topic-based sentiment analysis in the domain of financial blogs. We explore the use of paragraph-level and document-level annotations, examining how additional information from paragraph-level annotations can be used to increase the accuracy of document-level sentiment classification. We acknowledge the additional effort required to provide these paragraph-level annotations, and so we compare these findings against an automatic means of generating topic-specific sub-documents.