276 research outputs found
A Review on Personalized Tag based Image based Search Engines
The development of social media based on Web 2.0, amounts of images and videos spring up everywhere on the Internet. This phenomenon has brought great challenges to multimedia storage, indexing and retrieval. Generally speaking, tag-based image search is more commonly used in social media than content based image retrieval and content understanding. Thanks to the low relevance and diversity performance of initial retrieval results, the ranking problem in the tag-based image retrieval has gained researchersďż˝ wide attention. We will review some of techniques proposed by different authors for image retrieval in this paper
Recommended from our members
Text-based document geolocation and its application to the digital humanities
This dissertation investigates automatic geolocation of documents (i.e. identification of their location, expressed as latitude/longitude coordinates), based on the text of those documents rather than metadata. I assert that such geolocation can be performed using text alone, at a sufficient accuracy for use in real-world applications. Although in some corpora metadata is found in abundance (e.g. home location, time zone, friends, followers, etc. in Twitter), it is lacking in others, such as many corpora of primary-source documents in the digital humanities, an area to which document geolocation has hardly been applied. To this end, I first develop methods for accurate text-based geolocation and then apply them to newly-annotated corpora in the digital humanities. The geolocation methods I develop use both uniform and adaptive (k-d tree) grids over the Earth’s surface, culminating in a hierarchical logistic-regression-based technique that achieves state of the art results on well-known corpora (Twitter user feeds, Wikipedia articles and Flickr image tags). In the second part of the dissertation I develop a new NLP task, text-based geolocation of historical corpora. Because there are no existing corpora to test on, I create and annotate two new corpora of significantly different natures (a 19th-century travel log and a large set of Civil War archives). I show how my methods produce good geolocation accuracy even given the relatively small amount of annotated data available, which can be further improved using domain adaptation. I then use the predictions on the much larger unannotated portion of the Civil War archives to generate and analyze geographic topic models, showing how they can be mined to produce interesting revelations concerning various Civil War-related subjects. Finally, I develop a new geolocation technique for text-only corpora involving co-training between document-geolocation and toponym- resolution models, using a gazetteer to inject additional information into the training process. To evaluate this technique I develop a new metric, the closest toponym error distance, on which I show improvements compared with a baseline geolocator.Linguistic
Spatial and Temporal Sentiment Analysis of Twitter data
The public have used Twitter world wide for expressing opinions. This study focuses on spatio-temporal variation of georeferenced Tweets’ sentiment polarity, with a view to understanding how opinions evolve on Twitter over space and time and across communities of users. More specifically, the question this study tested is whether sentiment polarity on Twitter exhibits specific time-location patterns. The aim of the study is to investigate the spatial and temporal distribution of georeferenced Twitter sentiment polarity within the area of 1 km buffer around the Curtin Bentley campus boundary in Perth, Western Australia. Tweets posted in campus were assigned into six spatial zones and four time zones. A sentiment analysis was then conducted for each zone using the sentiment analyser tool in the Starlight Visual Information System software. The Feature Manipulation Engine was employed to convert non-spatial files into spatial and temporal feature class. The spatial and temporal distribution of Twitter sentiment polarity patterns over space and time was mapped using Geographic Information Systems (GIS). Some interesting results were identified. For example, the highest percentage of positive Tweets occurred in the social science area, while science and engineering and dormitory areas had the highest percentage of negative postings. The number of negative Tweets increases in the library and science and engineering areas as the end of the semester approaches, reaching a peak around an exam period, while the percentage of negative Tweets drops at the end of the semester in the entertainment and sport and dormitory area. This study will provide some insights into understanding students and staff ’s sentiment variation on Twitter, which could be useful for university teaching and learning management
European Handbook of Crowdsourced Geographic Information
"This book focuses on the study of the remarkable new source of geographic information that has become available in the form of user-generated content accessible over the Internet through mobile and Web applications. The exploitation, integration and application of these sources, termed volunteered geographic information (VGI) or crowdsourced geographic information (CGI), offer scientists an unprecedented opportunity to conduct research on a variety of topics at multiple scales and for diversified objectives. The Handbook is organized in five parts, addressing the fundamental questions: What motivates citizens to provide such information in the public domain, and what factors govern/predict its validity?What methods might be used to validate such information? Can VGI be framed within the larger domain of sensor networks, in which inert and static sensors are replaced or combined by intelligent and mobile humans equipped with sensing devices? What limitations are imposed on VGI by differential access to broadband Internet, mobile phones, and other communication technologies, and by concerns over privacy? How do VGI and crowdsourcing enable innovation applications to benefit human society?
Chapters examine how crowdsourcing techniques and methods, and the VGI phenomenon, have motivated a multidisciplinary research community to identify both fields of applications and quality criteria depending on the use of VGI. Besides harvesting tools and storage of these data, research has paid remarkable attention to these information resources, in an age when information and participation is one of the most important drivers of development.
The collection opens questions and points to new research directions in addition to the findings that each of the authors demonstrates. Despite rapid progress in VGI research, this Handbook also shows that there are technical, social, political and methodological challenges that require further studies and research.
Georeferencing flickr resources based on textual meta-data
The task of automatically estimating the location of web resources is of central importance in location-based services on the Web. Much attention has been focused on Flickr photos and videos, for which it was found that language modeling approaches are particularly suitable. In particular, state-of-the art systems for georeferencing Flickr photos tend to cluster the locations on Earth in a relatively small set of disjoint regions, apply feature selection to identify location-relevant tags, then use a form of text classification to identify which area is most likely to contain the true location of the resource, and finally attempt to find an appropriate location within the identified area. In this paper, we present a systematic discussion of each of the aforementioned components, based on the lessons we have learned from participating in the 2010 and 2011 editions of MediaEval’s Placing Task. Extensive experimental results allow us to analyze why certain methods work well on this task and show that a median error of just over 1 km can be achieved on a standard benchmark test set
- …