Web Data Extraction, Applications and Techniques: A Survey
Web Data Extraction is an important problem that has been studied by means of
different scientific tools and in a broad range of applications. Many
approaches to extracting data from the Web have been designed to solve specific
problems and operate in ad-hoc domains. Other approaches, instead, heavily
reuse techniques and algorithms developed in the field of Information
Extraction.
This survey aims at providing a structured and comprehensive overview of the
literature in the field of Web Data Extraction. We provide a simple
classification framework in which existing Web Data Extraction applications are
grouped into two main classes, namely applications at the Enterprise level and
at the Social Web level. At the Enterprise level, Web Data Extraction
techniques emerge as a key tool to perform data analysis in Business and
Competitive Intelligence systems as well as for business process
re-engineering. At the Social Web level, Web Data Extraction techniques make it
possible to gather large amounts of structured data continuously generated and
disseminated by Web 2.0, Social Media and Online Social Network users, which
offers unprecedented opportunities to analyze human behavior at a very large
scale. We also discuss the potential for cross-fertilization, i.e., the
possibility of reusing Web Data Extraction techniques originally designed for
one domain in other domains.
Comment: Knowledge-based System
Early aspects: aspect-oriented requirements engineering and architecture design
This paper reports on the third Early Aspects: Aspect-Oriented Requirements Engineering and Architecture Design Workshop, which was held in Lancaster, UK, on March 21, 2004. The workshop included a presentation session and working sessions in which particular topics on early aspects were discussed. The primary goal of the workshop was to focus on the challenges of defining methodical software development processes for aspects from early on in the software life cycle, and to explore the potential of proposed methods and techniques to scale up to industrial applications.
Beautiful and damned. Combined effect of content quality and social ties on user engagement
User participation in online communities is driven by the intertwinement of
the social network structure with the crowd-generated content that flows along
its links. These aspects are rarely explored jointly and at scale. By looking
at how users generate and access pictures of varying beauty on Flickr, we
investigate how the production of quality impacts the dynamics of online social
systems. We develop a deep learning computer vision model to score images
according to their aesthetic value and we validate its output through
crowdsourcing. By applying it to over 15B Flickr photos, we study for the first
time how image beauty is distributed over a large-scale social system.
Beautiful images are evenly distributed in the network, although only a small
core of people get social recognition for them. To study the impact of exposure
to quality on user engagement, we set up matching experiments aimed at
detecting causality from observational data. Exposure to beauty is
double-edged: following people who produce high-quality content increases one's
probability of uploading better photos; however, an excessive imbalance between
the quality generated by a user and the user's neighbors leads to a decline in
engagement. Our analysis has practical implications for improving link
recommender systems.
Comment: 13 pages, 12 figures, final version published in IEEE Transactions on
Knowledge and Data Engineering (Volume: PP, Issue: 99)
- …