
3,110 research outputs found

Web Data Extraction, Applications and Techniques: A Survey

Author: Abel
Amalfitano
Balduzzi
Baumgartner
Berger
Berthold
Bettencourt
Califf
Catanese
Chang
Chen
Collins
Conover
Crandall
Crescenzi
Dalvi
De Meo
Doan
Emilio Ferrara
Ferrara
Flesca
Freitag
Furche
Gatterbauer
Giacomo Fiumara
Gjoka
Gkotsis
Gottlob
Hammersley
Han
Hecht
Hsu
Irmak
Khare
Kim
Kinsella
Kleinberg
Kohlschütter
Kokkoras
Krüpl
Kushmerick
Kwak
Laender
Liu
Manning
Masanès
Mathes
Meng
Mislove
Monge
Muslea
Oro
Pan
Pasquale De Meo
Perito
Phan
Plake
Rahm
Reis
Robert Baumgartner
Sahuguet
Sarawagi
Schifanella
Selkow
Shi
Soderland
Szomszor
Turmo
Vosecky
Wang
Weikum
Wilson
Winograd
Yang
Ye
Zafarani
Zanasi
Zhai
Zhang
Publication venue: Elsevier BV
Publication date: 09/06/2014

Web Data Extraction is an important problem that has been studied by means of different scientific tools and in a broad range of applications. Many approaches to extracting data from the Web have been designed to solve specific problems and operate in ad-hoc domains. Other approaches, instead, heavily reuse techniques and algorithms developed in the field of Information Extraction. This survey aims to provide a structured and comprehensive overview of the literature in the field of Web Data Extraction. We provide a simple classification framework in which existing Web Data Extraction applications are grouped into two main classes, namely applications at the Enterprise level and at the Social Web level. At the Enterprise level, Web Data Extraction techniques emerge as a key tool for data analysis in Business and Competitive Intelligence systems as well as for business process re-engineering. At the Social Web level, Web Data Extraction techniques make it possible to gather the large amounts of structured data continuously generated and disseminated by Web 2.0, Social Media and Online Social Network users, which offers unprecedented opportunities to analyze human behavior at a very large scale. We also discuss the potential for cross-fertilization, i.e., the possibility of reusing Web Data Extraction techniques originally designed for one domain in other domains. Comment: Knowledge-based System

arXiv.org e-Print Archive

Crossref

Mobile Cloud Support for Semantic-Enriched Speech Recognition in Social Care

Author: Corradi A.
Destro M.
Foschini L.
Kotoulas S.
Lopez V.
Montanari R.
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2019

Nowadays, most users carry mobile devices with high computing power, and speech recognition is one of the main technologies available on every modern smartphone, although battery drain and application performance (resource shortage) have a big impact on the experienced quality. Shifting applications and services to the cloud may help to improve mobile user satisfaction, as demonstrated by several ongoing efforts in the mobile cloud area. However, the quality of speech recognition is still not sufficient in many complex cases to replace common handwritten text, especially when prompt reaction to short-term provisioning requests is required. To address this new scenario, this paper proposes a mobile cloud infrastructure to support the extraction of semantic information from speech recognition in the Social Care domain, where carers have to speak about their patients' conditions in order to have reliable notes that are used afterward to plan the best support. We present not only an architecture proposal but also a real prototype that we have deployed and thoroughly assessed with different queries and accents, and in the presence of load peaks, on our experimental mobile cloud Platform as a Service (PaaS) testbed based on Cloud Foundry.

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Elearning, Communication and Open-data: Massive Mobile, Ubiquitous and Open Learning

Author: Arranz Pilar
Bohuschke Felix
Brouns Francis
Cobo Ortega Ángel
Collantes Viaña Marta
De Lima Silva Mariana
Gabelas Barroso José Antonio
García Luis
Jordano de la Torre María
Marta-Lazo Carmen
Rocha Blanco Eliana Rocío
Rodríguez Hoyos Carlos
Silva Alejandro
Solana-González Pedro
Sáez López José Manuel
Ventura Expósito Patricia
Viñuales Javier
Zorrilla Pantaleón Marta Elena
Álvarez Saiz Elena Esperanza
Publication date: 01/01/2015

In MOOCs, learning analytics have to be addressed to the various types of learners that participate. This deliverable describes indicators that enable both teachers and learners to monitor progress and performance, as well as to identify whether there are learners at risk of dropping out. How these indicators should be computed and displayed to end users by means of dashboards is also explained. Furthermore, a proposal based on xAPI statements for storing relevant data and events is provided.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UCrea

ECO D2.5 Learning analytics requirements and metrics report

Author: Arranz Pilar
Bohuschke Felix
Brouns Francis
Cobo Ortega Ángel
Collantes Viaña Marta
De Lima Silva Mariana
Esperanza Álvarez Saiz Elena
Gabelas Barroso José Antonio
García Luis
Jordano de la Torre María
Marta-Lazo Carmen
Rocío Rocha Blanco Eliana
Rodríguez Hoyo Carlos
Silva Alejandro
Solana González Pedro
Sáez López José Manuel
Ventura Expósito Patricia
Viñuales Javier
Zorrilla Pantaleón Marta Elena
Publication date: 27/01/2015

In MOOCs, learning analytics have to be addressed to the various types of learners that participate. This deliverable describes indicators that enable both teachers and learners to monitor progress and performance, as well as to identify whether there are learners at risk of dropping out. How these indicators should be computed and displayed to end users by means of dashboards is also explained. Furthermore, a proposal based on xAPI statements for storing relevant data and events is provided. Part of the work carried out has been funded with support from the European Commission, under the ICT Policy Support Programme, as part of the Competitiveness and Innovation Framework Programme (CIP) in the ECO project under grant agreement n° 21127.

Open University of the Netherlands Research Portal

Herding cats in a FOSS ecosystem: a tale of communication and coordination for release management

Author: A Bachmann
A Bohn
A Guzzi
A Lamkanfi
A Mockus
A Sarma
AR Dennis
B Flyvbjerg
B Luijten
B Vasilescu
C Bird
C de Souza
C Walters
D Pagano
DL Parnas
DM German
E Linstead
E Rocco
E Shihab
FP Brooks Jr
G Robles
GM Olson
H Schackmann
J Howison
J Ljungberg
JD Herbsleb
JM Corbin
JP Scott
JR Carlson
JR Casebolt
JW Creswell
K Fogel
LP Robert
M Jacomy
M Lungu
M Michlmayr
ML Markus
P Runeson
RK Yin
RL Daft
S Easterbrook
S Jansen
S Koch
W Scacchi
Publication venue: Springer Science and Business Media LLC

Crossref


Exploiting Social Media Sources for Search, Fusion and Evaluation

Author: Lee Chia-Jung
Publication venue: ScholarWorks@UMass Amherst
Publication date: 09/11/2015

The web contains heterogeneous information that is generated with different characteristics and is presented via different media. Social media, as one of the largest content carriers, has generated information from millions of users worldwide, creating material rapidly in all types of forms such as comments, images, tags, videos and ratings. In social applications, the formation of online communities contributes to conversations covering a substantially broader range of aspects, as well as unfiltered opinions about subjects that are rarely covered in public media. Information accrued on social platforms, therefore, presents a unique opportunity to augment web sources such as Wikipedia or news pages, which are usually characterized as being more formal. The goal of this dissertation is to investigate in depth how social data can be exploited and applied in the context of three fundamental information retrieval (IR) tasks: search, fusion, and evaluation. Improving search performance has consistently been a major focus in the IR community. Given the in-depth discussions and active interactions contained in social media, we present approaches to incorporating this type of data to improve search on general web corpora. In particular, we propose two graph-based frameworks, social anchor and information network, to associate related web and social content, where information sources of diverse characteristics can be used to complement each other in a unified manner. We investigate how the enriched representation can potentially reduce vocabulary mismatch and improve retrieval effectiveness. Presenting social media content to users is valuable particularly for queries intended for time-sensitive events or community opinions. Current major search engines commonly blend results from different search services (or verticals) into core web results. Motivated by this real-world need, we explore ways to merge results from different web and social services into a single ranked list.
We present an optimization framework for fusion, where the impact of documents, ranked lists, and verticals can be modeled simultaneously to maximize performance. Evaluating search system performance has largely relied on creating reusable test collections in IR. Traditional ways of creating evaluation sets can require substantial manual effort. To reduce such effort, we explore an approach to automating the process of collecting pairs of queries and relevance judgments, using a high-quality social media source, Community Question Answering (CQA). Our approach is based on the idea that CQA services provide platforms for users to raise questions and to share answers, thereby encoding the associations between real user information needs and real user assessments. To demonstrate the effectiveness of our approaches, we conduct extensive retrieval and fusion experiments, and we verify the reliability of the new, CQA-based evaluation test sets.

ScholarWorks@UMass Amherst

Learning Social Links and Communities from Interaction, Topical, and Spatio-Temporal Information

Author: TRAN VAN CANH
Publication date: 01/01/2014

The immense popularity of today's social networks has led to the availability and accessibility of vast amounts of data created by users on a daily basis. Various types of information can be extracted from such data, for example, interactions among users, topics of user postings, and geographic locations of users. While most of the existing works on social network analysis, in particular those focusing on social links and communities, rely on explicit and static link structures among users, extracting knowledge by exploiting more features embedded in user-generated data is another important direction that has only recently gained more attention. Initial studies employing this approach show good results in terms of a better understanding of latent interactions among users. In the context of this dissertation, multiple features embedded in user-generated data are investigated to develop new models and algorithms for (1) revealing hidden social links between users and (2) extracting and analyzing dynamic feature-based communities in social networks. We introduce two approaches for extracting and measuring interpretable and meaningful social links between users. One is based on the participation of users in threads of discussions. The other relies on the social characteristics of users as reflected in their postings. A novel probabilistic model called rLinkTopic is developed to address the problem of extracting a new type of feature-based community called regional LinkTopic: a community of users that are geographically close to each other over time, have common interests indicated by the topical similarity of their postings, and are contextually linked to each other. Based on the rLinkTopic model, a comprehensive framework called ErLinkTopic is developed that makes it possible to extract and capture complex changes in the features describing regional LinkTopic communities, for example, the community membership of users and topics of communities.
Our framework provides a novel basis for important studies such as exploring social characteristics of users in geographic regions and predicting the evolution of user communities. For each approach developed in this dissertation, extensive comparative experiments are conducted using data from real-world social networks to validate the proposed models and algorithms in terms of effectiveness and efficiency. The experimental results are further discussed in detail to show improvements over existing approaches and the applicability and advantages of our models in terms of learning social links and communities from user-generated data.

Heidelberger Dokumentenserver

Contributions to the cornerstones of interaction in visualization: strengthening the interaction of visualization

Author: Tominski Christian (gnd: 13336089X)
Publication venue: Universität Rostock Rostock

Visualization has become an accepted means for data exploration and analysis. Although interaction is an important component of visualization approaches, current visualization research pays less attention to interaction than to aspects of the graphical representation. Therefore, the goal of this work is to strengthen the interaction side of visualization. To this end, we establish a unified view on interaction in visualization. This unified view covers four cornerstones: the data, the tasks, the technology, and the human.

Rostocker Dokumentenserver