420 research outputs found
Using Flickr for characterizing the environment: An exploratory analysis
© Shelan S. Jeawak, Christopher B. Jones, and Steven Schockaert. The photo-sharing website Flickr has become a valuable informal information source in disciplines such as geography and ecology. Some ecologists, for instance, have been manually analysing Flickr to obtain information that is more up-to-date than what is found in traditional sources. While several previous works have shown the potential of Flickr tags for characterizing places, it remains unclear to what extent such tags can be used to derive scientifically useful information for ecologists in an automated way. To obtain a clearer picture about the kinds of environmental features that can be modelled using Flickr tags, we consider the problem of predicting scenicness, species distribution, land cover, and several climate related features. Our focus is on comparing the predictive power of Flickr tags with that of structured data from more traditional sources. We find that, broadly speaking, Flickr tags perform comparably to the considered structured data sources, being sometimes better and sometimes worse. Most importantly, we find that combining Flickr tags with structured data sources consistently, and sometimes substantially, improves the results. This suggests that Flickr indeed provides information that is complementary to traditional sources
Exploiting Flickr meta-data for predicting environmental features
The photo-sharing website Flickr has come to be used as an informal information source in disciplines such as geography and ecology. Many recent studies have highlighted the fact that Flickr tags capture valuable ecological information, which can complement more traditional sources. A shortcoming of most of these existing methods is that they rely on manual interpretation of Flickr content, with little automated exploitation of the associated tags. Therefore, they fail to exploit the full potential of the data. Automatically extracting and analysing information from unstructured and noisy data remains a hard task. This research aims to investigate the use of Flickr meta-data for predicting a wide variety of environmental phenomena. In particular, we consider the problem of predicting scenicness, species distribution, land cover, and climate-related features. To this end, we developed several novel machine learning methods that can efficiently utilise Flickr tags as a supplementary source to the structured information that is available from traditional scientific resources.
The first proposed method aims at modelling locations, and hence inferring environmental phenomena, using georeferenced Flickr tags. Our focus was on comparing the predictive power of Flickr tags with that of structured environmental data. This method represents each location as a concatenation of two feature vectors: a bag-of-words representation derived from Flickr and a feature vector encoding the numerical and categorical features obtained from the structured dataset. We found that Flickr was generally competitive with the structured environmental data for prediction, being sometimes better and sometimes worse. However, combining Flickr tags with existing ecological data sources consistently improved the results, which suggests that Flickr can indeed be regarded as complementary to traditional sources. The second method that we propose is based on a collective prediction model, which crucially relies on Flickr tags to define the neighbourhood structure. The use of a collective prediction formulation is motivated by the fact that most environmental features are strongly spatially autocorrelated. While this suggests that geographic distance should play a key role in determining neighbourhoods, we show that considerable gains can be made by additionally taking Flickr tags and traditional data into consideration.
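As an illustration only, the concatenated location representation used by the first method could be sketched as follows. This is a minimal sketch under stated assumptions, not the thesis implementation: the function names, the toy vocabulary, the land cover classes, and the use of raw tag counts with one-hot encoding of categorical features are all hypothetical choices made for the example, and the abstract does not specify the weighting or encoding actually used.

```python
from collections import Counter


def bag_of_words(tags, vocabulary):
    """Count how often each vocabulary term occurs among a location's tags."""
    counts = Counter(tags)
    return [float(counts[term]) for term in vocabulary]


def one_hot(value, categories):
    """Encode a categorical value as a 0/1 indicator vector (assumed encoding)."""
    return [1.0 if value == c else 0.0 for c in categories]


def location_vector(tags, numeric, category, vocabulary, categories):
    """Concatenate the Flickr-derived vector with the structured features."""
    flickr_part = bag_of_words(tags, vocabulary)
    structured_part = list(numeric) + one_hot(category, categories)
    return flickr_part + structured_part


# Hypothetical toy data: three vocabulary terms, two numerical features
# (e.g. a temperature and an elevation), and one categorical feature.
vocabulary = ["beach", "forest", "snow"]
land_cover_classes = ["coastal", "woodland"]

vec = location_vector(
    tags=["forest", "forest", "snow"],
    numeric=[7.5, 820.0],
    category="woodland",
    vocabulary=vocabulary,
    categories=land_cover_classes,
)
print(vec)
```

A vector built this way can be passed to any standard classifier or regressor; dropping either part of the concatenation gives the Flickr-only or structured-only baselines that the abstract compares against.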
The thesis considers two further novel methods which are based on a low dimensional vector space representation. The first model, called EGEL (Embedding Geographic Locations), learns vector space embeddings of geographic locations by integrating the textual information derived from Flickr with the numerical and categorical information derived from environmental datasets. We experimentally show that this method improves on bag-of-words representation approaches, especially in cases where structured data are available. This model has been extended by considering a spatiotemporal representation of regions. In particular, we propose a spatiotemporal embeddings model, called SPATE (Spatiotemporal Embeddings), which learns a vector space embedding for each geographic region and each month of the year. This allows the model to capture environmental phenomena that may depend on monthly or seasonal variation. Apart from extending our primary model, SPATE also includes a new smoothing method to deal with the sparsity of Flickr tags over the considered spatiotemporal setup.
The experimental results presented in this thesis confirm our hypothesis that Flickr tags contain valuable information that can be used to predict environmental features.
Recommended from our members
A conceptual framework for studying collective reactions to events in location-based social media
Events are a core concept of spatial information, but location-based social media (LBSM) provide information on reactions to events. Individuals have varied degrees of agency in initiating, reacting to or modifying the course of events, and reactions include observations of occurrence, expressions containing sentiment or emotions, or a call to action. Key characteristics of reactions include referent events and information about who reacted, when, where and how. Collective reactions are composed of multiple individual reactions sharing common referents. They can be characterized according to the following dimensions: spatial, temporal, social, thematic and interlinkage. We present a conceptual framework, which allows characterization and comparison of collective reactions. For a thematically well-defined class of event such as storms, we can explore differences and similarities in collective attribution of meaning across space and time. Other events may have very complex spatio-temporal signatures (e.g. political processes such as Brexit or elections), which can be decomposed into series of individual events (e.g. a temporal window around the result of a vote). The purpose of our framework is to explore ways in which collective reactions to events in LBSM can be described and underpin the development of methods for analysing and understanding collective reactions to events
Geo-Information Harvesting from Social Media Data
As unconventional sources of geo-information, massive imagery and text
messages from open platforms and social media form a temporally quasi-seamless,
spatially multi-perspective stream, but with unknown and diverse quality. Due
to its complementarity to remote sensing data, geo-information from these
sources offers promising perspectives, but harvesting is not trivial due to its
data characteristics. In this article, we address key aspects in the field,
including data availability, analysis-ready data preparation and data
management, geo-information extraction from social media text messages and
images, and the fusion of social media and remote sensing data. We then
showcase some exemplary geographic applications. In addition, we present the
first extensive discussion of ethical considerations of social media data in
the context of geo-information harvesting and geographic applications. With
this effort, we wish to stimulate curiosity and lay the groundwork for
researchers who intend to explore social media data for geo-applications. We
encourage the community to join forces by sharing their code and data.
Comment: Accepted for publication in IEEE Geoscience and Remote Sensing Magazine
The Digital Classicist 2013
This edited volume collects together peer-reviewed papers that initially emanated from presentations at Digital Classicist seminars and conference panels. This wide-ranging volume showcases exemplary applications of digital scholarship to the ancient world and critically examines the many challenges and opportunities afforded by such research. The chapters included here demonstrate innovative approaches that drive forward the research interests of both humanists and technologists while showing that rigorous scholarship is as central to digital research as it is to mainstream classical studies. As with the earlier Digital Classicist publications, our aim is not to give a broad overview of the field of digital classics; rather, we present here a snapshot of some of the varied research of our members in order to engage with and contribute to the development of scholarship both in the fields of classical antiquity and Digital Humanities more broadly
Web Data Extraction, Applications and Techniques: A Survey
Web Data Extraction is an important problem that has been studied by means of
different scientific tools and in a broad range of applications. Many
approaches to extracting data from the Web have been designed to solve specific
problems and operate in ad-hoc domains. Other approaches, instead, heavily
reuse techniques and algorithms developed in the field of Information
Extraction.
This survey aims at providing a structured and comprehensive overview of the
literature in the field of Web Data Extraction. We provide a simple
classification framework in which existing Web Data Extraction applications are
grouped into two main classes, namely applications at the Enterprise level and
at the Social Web level. At the Enterprise level, Web Data Extraction
techniques emerge as a key tool to perform data analysis in Business and
Competitive Intelligence systems as well as for business process
re-engineering. At the Social Web level, Web Data Extraction techniques make it
possible to gather the large amounts of structured data continuously generated
and disseminated by Web 2.0, Social Media and Online Social Network users, which
offers unprecedented opportunities to analyze human behavior at a very large
scale. We also discuss the potential of cross-fertilization, i.e., the
possibility of re-using Web Data Extraction techniques originally designed to
work in a given domain in other domains.
Comment: Knowledge-Based Systems