Development of a machine learning-based model to autonomously estimate web page credibility
There is a broad range of information available on the Internet, some of which is considered more credible than the rest. People weigh different credibility aspects when evaluating the credibility of a web page; however, many web users find it difficult to determine the credibility of all types of web pages. An autonomous system that analyses credibility factors extracted from a web page to estimate the page's credibility could help users make better decisions about the perceived credibility of web information.
This research investigated the applicability of several machine learning approaches to the evaluation of web page credibility. First, six credibility categories were identified from peer-reviewed literature. Then, their related credibility features were investigated and automatically extracted from the web page content, metadata, or external resources.
Three sets of features (i.e., automatically extracted credibility features, bag-of-words features, and a combination of both) were used in classification experiments to compare their impact on the performance of the autonomous credibility estimation model. The Content Credibility Corpus (C3) dataset was used to develop and test the model developed in this research. XGBoost achieved the best weighted average F1 score for the extracted features. In comparison, the Logistic Regression classifier performed best both when bag-of-words features were used and when all features together were used as a single feature vector.
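The three feature representations compared in these experiments can be illustrated with a minimal pure-Python sketch. The feature names and values below are illustrative placeholders, not the actual extraction pipeline described in the thesis:

```python
# Sketch of building the three feature vectors compared in the experiments:
# extracted credibility features, bag-of-words counts, and their combination.

def bag_of_words(text, vocabulary):
    """Count occurrences of each vocabulary term in the text."""
    tokens = text.lower().split()
    return [tokens.count(term) for term in vocabulary]

def combine(extracted, bow):
    """Concatenate extracted credibility features with bag-of-words counts."""
    return extracted + bow

# Hypothetical example: domain-authority score, has-author flag, link count.
extracted = [0.8, 1, 12]
vocabulary = ["research", "buy", "study"]
page_text = "A peer-reviewed study of this research and that research"

bow = bag_of_words(page_text, vocabulary)   # [2, 0, 1]
features = combine(extracted, bow)          # [0.8, 1, 12, 2, 0, 1]
```

Each of the three vectors (`extracted`, `bow`, `features`) would then be fed to a classifier such as XGBoost or Logistic Regression for comparison.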
To begin to explore the legitimacy of this approach, a crowdsourcing task was conducted to evaluate how the output of the proposed model aligns with the credibility ratings given by human annotators. Thirty web pages were selected from the C3 dataset to examine how current users' ratings correlate with the ratings that were used as ground truth to train the model. In addition, 30 new web pages were selected to explore how well the algorithm generalizes to classifying new web pages.
Participants were asked to rate the credibility of each web page based on a 5-point Likert scale. Sixty-nine crowdsourced participants evaluated the credibility of the 60 web pages, for a total of 600 ratings (10 per page). Spearman's correlation between the average credibility scores given by participants and the original scores in the C3 dataset indicates a moderate positive correlation: r = 0.44, p < 0.02. A contingency table was created to compare the scores predicted by the model with the scores rated by participants. Overall, the model achieved an accuracy of 80%, which indicates that the proposed model can generalize to new web pages.
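Spearman's correlation, used above to compare participant ratings with the C3 ground-truth scores, is the Pearson correlation of the two rank vectors. A minimal pure-Python sketch (with ties given average ranks, as is standard):

```python
# Spearman's rank correlation: rank both score lists, then compute the
# Pearson correlation of the ranks. Ties share the mean of their ranks.

def ranks(values):
    """Average 1-based ranks, with tied values sharing the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # mean of 1-based positions i..j
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Pearson correlation of the rank vectors of x and y."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```

Applied to two lists of 30 averaged page scores, this yields the coefficient reported above (the p-value requires an additional significance test, e.g. via `scipy.stats.spearmanr`).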
The model outlined in this thesis outperformed previous work by using a promising set of features, some of which were presented for the first time in this research
Mining Scholarly Publications for Research Evaluation
Scientific research can lead to breakthroughs that revolutionise society by solving long-standing problems. However, investment of public funds into research requires the ability to clearly demonstrate beneficial returns, accountability, and good management. At the same time, with the amount of scholarly literature rapidly expanding, recognising key research that presents the most important contributions to science is becoming increasingly difficult and time-consuming. This creates a need for effective and appropriate research evaluation methods. However, the question of how to evaluate the quality of research outcomes is very difficult to answer and despite decades of research, there is still no standard solution to this problem.
Given this growing need for research evaluation, it is increasingly important to understand how research should be evaluated, and whether the existing methods meet this need. However, the current solutions, which are predominantly based on counting the number of interactions in the scholarly communication network, are insufficient for a number of reasons. In particular, they struggle to capture many aspects of academic culture and often significantly lag behind current developments.
This work focuses on the evaluation of research publications and aims at creating new methods which utilise publication content. It studies the concept of research publication quality, methods assessing the performance of new research publication evaluation methods, analyses and extends the existing methods, and, most importantly, presents a new class of metrics which are based on publication manuscripts. By bridging the fields of research evaluation and text- and data-mining, this work provides tools for analysing the outcomes of research, and for relieving information overload in scholarly publishing
Designing for quality in real-world mobile crowdsourcing systems
PhD Thesis
Crowdsourcing has emerged as a popular means to collect and analyse data at scale for problems that require human intelligence to resolve. Its prompt response and low cost have made it attractive to businesses and academic institutions. In response, various online crowdsourcing platforms, such as Amazon MTurk, Figure Eight and Prolific, have successfully emerged to facilitate the entire crowdsourcing process. However, the quality of results has been a major concern in the crowdsourcing literature. Previous work has identified various key factors that contribute to quality issues and need to be addressed in order to produce high-quality results. Crowd task design, in particular, is a major factor that impacts the efficiency and effectiveness of crowd workers as well as the entire crowdsourcing process.
This research investigates crowdsourcing task designs for collecting and analysing two distinct types of data, and examines the value of creating high-quality crowdwork activities on new crowdsource-enabled systems for end-users. The main contributions of this research are 1) a set of guidelines for designing crowdsourcing tasks that support quality collection, analysis and translation of speech and eye-tracking data in real-world scenarios; and 2) crowdsourcing applications that capture real-world data and coordinate the entire crowdsourcing process to analyse and feed quality results back. Furthermore, this research proposes a new quality-control method based on workers' trust and self-verification.
To achieve this, the research follows a case-study approach with a focus on two real-world data collection and analysis case studies. The first case study, Speeching, explores real-world speech data collection, analysis, and feedback for people with speech disorders, particularly those with Parkinson's. The second case study, CrowdEyes, examines the development and use of a hybrid system combining crowdsourcing and low-cost DIY mobile eye trackers for real-world visual data collection, analysis, and feedback.
Both case studies established the capability of crowdsourcing to obtain high-quality responses comparable to those of an expert. The Speeching app, and the provision of feedback in particular, was well received by participants, which opens up new opportunities in digital health and wellbeing. In addition, the proposed crowd-powered eye tracker is fully functional under real-world settings. The results showed how this approach outperforms current state-of-the-art algorithms under all conditions, opening up the technology for a wide variety of eye-tracking applications in real-world settings
Index ordering by query-independent measures
There is an ever-increasing amount of data being produced from various data sources, and this data must be organised effectively if we hope to search through it. Traditional information retrieval approaches search through all available data in a particular collection in order to find the most suitable results; however, for particularly large collections this may be extremely time-consuming.
Our proposed solution to this problem is to search only a limited portion of the collection at query time, in order to speed up the retrieval process. In doing so, we aim to limit the loss in retrieval efficacy (in terms of accuracy of results). We do this by first identifying the most "important" documents within the collection, and then sorting the documents in order of their "importance" in the collection. In this way we can choose to limit the amount of information to search through, by eliminating the documents of lesser importance, which should not only make the search more efficient, but should also limit any loss in retrieval accuracy.
In this thesis we investigate various query-independent methods that may indicate the importance of a document in a collection. The more accurately a measure identifies important documents, the more effectively we can eliminate documents from the retrieval process, improving the query throughput of the system while providing a high level of accuracy in the returned results. The effectiveness of these approaches is evaluated using the datasets provided by the Terabyte track at the Text REtrieval Conference (TREC)
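The pruning idea above can be sketched in a few lines: order the collection once by a query-independent importance score, then answer queries against only a prefix of that ordering. The importance measure here (raw document length) and the substring matching are deliberate stand-ins for the measures and retrieval models studied in the thesis:

```python
# Sketch of index ordering by a query-independent measure: sort documents
# by descending importance once, then search only the top fraction of the
# ordered collection at query time.

def order_by_importance(docs, importance):
    """Sort documents by descending query-independent importance."""
    return sorted(docs, key=importance, reverse=True)

def pruned_search(docs, query, importance, keep_fraction=0.5):
    """Search only the most important fraction of the ordered collection."""
    ordered = order_by_importance(docs, importance)
    cutoff = max(1, int(len(ordered) * keep_fraction))
    prefix = ordered[:cutoff]
    return [d for d in prefix if query.lower() in d.lower()]

docs = [
    "a short note",
    "an extensive survey of information retrieval methods",
    "retrieval tricks",
    "a very long and detailed treatise on efficient retrieval at scale",
]
# With keep_fraction=0.5, only the two "most important" documents are
# searched; the two shorter documents are never examined.
hits = pruned_search(docs, "retrieval", importance=len, keep_fraction=0.5)
```

The trade-off the thesis evaluates is exactly the one visible here: a smaller `keep_fraction` means faster queries, but a poor importance measure risks pruning away relevant documents (such as "retrieval tricks" in this toy example).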
Proceedings of the 11th Toulon-Verona International Conference on Quality in Services
The Toulon-Verona Conference was founded in 1998 by Prof. Claudio Baccarani of the University of Verona, Italy, and Prof. Michel Weill of the University of Toulon, France. It has been organized each year in a different place in Europe in cooperation with a host university (Toulon 1998, Verona 1999, Derby 2000, Mons 2001, Lisbon 2002, Oviedo 2003, Toulon 2004, Palermo 2005, Paisley 2006, Thessaloniki 2007, Florence 2008). Originally focused on higher education institutions, the research themes have over the years been extended to the health sector, local government, tourism, logistics, and banking services. Around a hundred delegates from about twenty different countries participate each year, and nearly one thousand research papers have been published over the last ten years, making the conference one of the major events in the field of quality in services