An Online Version of the Hyperlink-Induced Topic Search (HITS) Algorithm
Generally, search engines rank web pages in an offline mode, that is, after the web pages have been retrieved and stored in a database. The existing HITS algorithm (Kleinberg, 1999) likewise computes page ranks offline. In this project, we have implemented an online mode of page ranking for this algorithm, which improves the overall performance of the search engine. This report describes the approach used to implement and test the algorithm, and presents comparison results against other existing search engines, which help characterize the efficiency of the implemented algorithm.
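For context, the standard offline HITS computation that this project moves online can be sketched as follows. The adjacency-list representation, iteration count, and function names are illustrative assumptions, not details taken from the report:

```python
# Minimal sketch of the classic (offline) HITS iteration.
# `pages` is a list of page ids; `links` maps a page to the pages it links to.
def hits(pages, links, iterations=50):
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority score: sum of hub scores of the pages linking to a page.
        new_auth = {p: 0.0 for p in pages}
        for p in pages:
            for q in links.get(p, []):
                new_auth[q] += hub[p]
        # Hub score: sum of authority scores of the pages a page links to.
        new_hub = {p: sum(new_auth[q] for q in links.get(p, []))
                   for p in pages}
        # Normalize (L2) so the scores stay bounded across iterations.
        na = sum(v * v for v in new_auth.values()) ** 0.5 or 1.0
        nh = sum(v * v for v in new_hub.values()) ** 0.5 or 1.0
        auth = {p: v / na for p, v in new_auth.items()}
        hub = {p: v / nh for p, v in new_hub.items()}
    return hub, auth
```

An online variant would have to update these scores incrementally as pages are crawled, rather than recomputing them over the stored graph.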
Fake News Detection with Deep Diffusive Network Model
In recent years, owing to the booming development of online social networks,
fake news created for various commercial and political purposes has been
appearing in large numbers and spreading widely in the online world. Lured by
deceptive wording, online social network users are easily misled by such fake
news, which has already had a tremendous effect on offline society. An
important goal in improving the trustworthiness of information in online
social networks is to identify fake news in a timely manner. This paper aims at investigating
the principles, methodologies and algorithms for detecting fake news articles,
creators and subjects from online social networks and evaluating the
corresponding performance. This paper addresses the challenges introduced by
the unknown characteristics of fake news and diverse connections among news
articles, creators and subjects. Based on a detailed data analysis, this paper
introduces a novel automatic fake news credibility inference model, namely
FakeDetector. Based on a set of explicit and latent features extracted from the
textual information, FakeDetector builds a deep diffusive network model to
learn the representations of news articles, creators and subjects
simultaneously. Extensive experiments have been done on a real-world fake news
dataset to compare FakeDetector with several state-of-the-art models, and the
experimental results have demonstrated the effectiveness of the proposed model.
Horse racing prediction using graph-based features
This thesis presents an applied horse racing prediction system that uses graph-based features on a set of horse race data. We trained and tested artificial neural network and logistic regression models, first without graph-based features and then with them. The work consists of four main parts: (1) collecting data from a horse racing website covering races held from 2015 to 2017; (2) training the predictive models on the data and making predictions; (3) building a global directed graph of horses and extracting graph-based features (the core part); and (4) adding the graph-based features to the basic features, retraining the same predictive models, and measuring the improvement in prediction accuracy. For evaluation, two random horses competing in the same races were picked from the data and run through the system. With graph-based features, prediction accuracy was better than without them. We also tested the system on the 2016 and 2017 Kentucky Derby: although we did not predict the top three finishers of the 2017 Kentucky Derby, we predicted the top four positions in the 2016 Kentucky Derby.
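One way to picture the kind of graph-based feature the thesis describes is a directed "beat graph" over horses, in which an edge A -> B records that horse A finished ahead of horse B in some past race. The specific feature below (how many current opponents a horse has previously beaten) is a hypothetical illustration, not the thesis's exact feature set:

```python
from collections import defaultdict

def build_beat_graph(races):
    """races: list of finishing orders, e.g. [['A', 'B', 'C'], ...]."""
    beats = defaultdict(set)
    for order in races:
        for i, winner in enumerate(order):
            # Every horse "beats" all horses that finished behind it.
            for loser in order[i + 1:]:
                beats[winner].add(loser)
    return beats

def beaten_opponents(beats, horse, field):
    """Count the current-race opponents this horse has beaten before."""
    return len(beats[horse] & (set(field) - {horse}))
```

Features like this one can then be appended to each horse's basic feature vector before training.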
Website visualizer: a tool for the visual analysis of website usage
Master's in Electronics, Telecommunications and Informatics Engineering.
ABSTRACT: Websites are incorporated in organizations to support their mission and
guarantee effective information delivery within an efficient information workflow
framework. In this context, content managers have to constantly monitor the
business needs and reflect them on the structure, contents and interaction
paradigm of the institutional websites. This task is neither trivial nor
automated, and it is difficult to guarantee that these websites stay
synchronized with the actual business requirements.
The overall goal of this work is the conceptualization, development and
evaluation of an application able to assist usability experts in the analysis and
visualization of interaction patterns of organizational web based systems. It
should be able to highlight the most critical website areas, based on the
analysis of website structure, contents and interconnections. For this purpose,
a conceptual model and an architecture have been proposed, as well as a set of
visualization methods designed to facilitate that analysis.
In order to validate the proposed conceptual models, the architecture,
information structures and several visualization methods, a prototype was
developed, evaluated and refined. It can be considered as an experimental
research platform, capable of integrating and testing specific visualization
schemes and visual correlation procedures, and is part of an ongoing research
program at the University of Aveiro.
Specifically, this work introduces a layered architecture that supports
multiple synchronized views, as well as novel visualization,
inspection and interaction mechanisms. The prototype integrates these
visualization methods in an application able to capture, compile and analyze
the information related to the structure, contents and usage patterns of a
website.
This work is mainly meant to help usability experts or content managers
organize the informational space of an institutional website. However, the
application is not supposed to directly provide solutions for the usability
problems of the site, but to offer the means to help its users make decisions
based on the interpretation of the usability problems identified and
highlighted during the analysis process.
Analysis and Improvement of HITS Algorithm for Detecting Web Communities
In this paper, we discuss problems with the HITS (Hyperlink-Induced Topic Search) algorithm, which capitalizes on hyperlinks to extract topic-bound communities of web pages. Despite its theoretically sound foundations, we observed that the HITS algorithm fails in real applications. In order to understand this problem, we developed a visualization tool, LinkViewer, which graphically presents the extraction process. This tool helped reveal that a large and densely linked set of unrelated web pages in the base set impeded the extraction. These pages were obtained when the root set was expanded into the base set. As remedies for this topic-drift problem, prior studies applied textual analysis methods. In contrast, we propose two methods that utilize only the structural information of the Web: 1) the projection method, which projects eigenvectors onto the root subspace, so that most elements in the root set remain relevant to the original topic, and 2) the base-set downsizing method, which filters out pages without links to multiple pages in the root set. These methods are shown to be robust across broader types of topics and low in computation cost.
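The base-set downsizing method lends itself to a short sketch. The threshold `k` and the data layout below are assumptions for illustration, not the paper's exact specification:

```python
# Sketch of base-set downsizing: keep only base-set pages that link to at
# least `k` distinct root-set pages; root-set pages are always retained.
def downsize_base_set(base_set, root_set, links, k=2):
    root = set(root_set)
    kept = set(root)
    for page in base_set:
        if page in root:
            continue
        # Count how many distinct root-set pages this page links to.
        if len(set(links.get(page, ())) & root) >= k:
            kept.add(page)
    return kept
```

Pages that reach only a single root page, a common signature of the densely linked but off-topic clusters described above, are filtered out before the HITS iteration runs.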