
    An Online Version of the Hyperlink-Induced Topic Search (HITS) Algorithm

    Search engines generally rank web pages in an offline mode, that is, after the pages have been retrieved and stored in the database. The existing HITS algorithm (Kleinberg, 1999) likewise computes its page ranking offline. In this project, we have implemented an online mode of page ranking for this algorithm, which improves the overall performance of the search engine. This report describes the approach used to implement and test the algorithm. Comparison results against other existing search engines are also presented, which helps characterize the efficiency of the implemented algorithm.
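    As a refresher on what the offline computation involves, here is a minimal Python sketch of the classic HITS hub/authority iteration. The graph representation, iteration cap, and convergence threshold are illustrative assumptions for this sketch, not details taken from the report.

```python
import math

def hits(out_links, max_iter=50, tol=1e-8):
    """Classic offline HITS: iterate hub/authority scores to convergence.

    out_links maps each page to the list of pages it links to
    (an assumed representation for this sketch).
    """
    pages = list(out_links)
    hubs = {p: 1.0 for p in pages}
    auths = {p: 1.0 for p in pages}
    for _ in range(max_iter):
        # A page's authority is the sum of the hub scores of its in-links.
        new_auths = {p: 0.0 for p in pages}
        for p, links in out_links.items():
            for q in links:
                if q in new_auths:
                    new_auths[q] += hubs[p]
        # A page's hub score is the sum of the authorities it points to.
        new_hubs = {p: sum(new_auths.get(q, 0.0) for q in out_links[p])
                    for p in pages}
        # L2-normalize both score vectors, as in Kleinberg's formulation.
        for vec in (new_auths, new_hubs):
            norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
            for p in vec:
                vec[p] /= norm
        done = max(abs(new_auths[p] - auths[p]) for p in pages) < tol
        auths, hubs = new_auths, new_hubs
        if done:
            break
    return hubs, auths

# Tiny example: "a" links to "b" and "c"; "b" links to "c".
print(hits({"a": ["b", "c"], "b": ["c"], "c": []})[1])  # authority scores
```

    An online variant would have to update these scores incrementally as pages are crawled, rather than rerunning the full iteration over the stored graph.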

    Fake News Detection with Deep Diffusive Network Model

    In recent years, due to the booming development of online social networks, fake news created for various commercial and political purposes has appeared in large numbers and spread widely in the online world. Through deceptive wording, online social network users can be easily infected by such fake news, which has already had tremendous effects on offline society. An important goal in improving the trustworthiness of information in online social networks is to identify fake news in a timely manner. This paper investigates the principles, methodologies, and algorithms for detecting fake news articles, creators, and subjects in online social networks, and evaluates the corresponding performance. It addresses the challenges introduced by the unknown characteristics of fake news and the diverse connections among news articles, creators, and subjects. Based on a detailed data analysis, the paper introduces a novel automatic fake news credibility inference model, FakeDetector. Using a set of explicit and latent features extracted from the textual information, FakeDetector builds a deep diffusive network model to learn the representations of news articles, creators, and subjects simultaneously. Extensive experiments on a real-world fake news dataset compare FakeDetector with several state-of-the-art models, and the results demonstrate the effectiveness of the proposed model.
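    To make the idea of a "deep diffusive" unit concrete, here is a highly simplified PyTorch sketch of a gated unit that fuses a node's own textual features with representations diffused from connected nodes (articles, creators, subjects). The layer shapes and gating scheme are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class DiffusiveUnit(nn.Module):
    """Toy diffusive unit: gates between a node's own features and the
    mean of its neighbors' states in the article/creator/subject graph."""

    def __init__(self, in_dim: int, hidden_dim: int):
        super().__init__()
        self.self_proj = nn.Linear(in_dim, hidden_dim)       # own features
        self.neigh_proj = nn.Linear(hidden_dim, hidden_dim)  # diffused state
        self.gate = nn.Linear(in_dim + hidden_dim, hidden_dim)

    def forward(self, x: torch.Tensor, neighbor_states: torch.Tensor) -> torch.Tensor:
        # Mean-pool the hidden states diffused from connected nodes.
        diffused = neighbor_states.mean(dim=0)
        # A learned gate decides how much to trust self vs. neighborhood.
        g = torch.sigmoid(self.gate(torch.cat([x, diffused])))
        return torch.tanh(g * self.self_proj(x) + (1 - g) * self.neigh_proj(diffused))

unit = DiffusiveUnit(in_dim=300, hidden_dim=64)
article_features = torch.randn(300)   # explicit + latent text features
neighbor_states = torch.randn(3, 64)  # e.g. one creator and two subjects
representation = unit(article_features, neighbor_states)  # shape (64,)
```

    A credibility label would then be predicted from such representations with an ordinary classification head.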

    Horse racing prediction using graph-based features

    This thesis presents an application of graph-based features to horse racing prediction. We trained and tested artificial neural network and logistic regression models both without and with graph-based features. The work has four main parts: collecting data from a horse racing website for races held from 2015 to 2017; training the predictive models on the basic features and making predictions; creating a global directed graph of horses and extracting graph-based features (the core part, sketched below); and adding the graph-based features to the basic features, retraining the same models, and measuring the improvement in prediction accuracy. To evaluate the system, two random horses that appeared in the same races were picked from the data and their outcomes predicted. With graph-based features, prediction accuracy was better than without them. We also tested the system on the 2016 and 2017 Kentucky Derby: although we did not predict the top three finishers of the 2017 Kentucky Derby, we predicted the top four positions in the 2016 Kentucky Derby.
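    As a hedged illustration of the core part, the sketch below builds a global directed graph from race finishing orders and derives per-horse features with networkx. The edge direction (loser to winner) and the particular features are assumptions made for illustration; the thesis's own choices may differ.

```python
import networkx as nx

def build_horse_graph(races):
    """Build a global directed graph over horses from race results.

    races is a list of finishing orders, best first, e.g.
    [["A", "B", "C"], ...]; an edge u -> v records that u finished
    behind v in some race.
    """
    g = nx.DiGraph()
    for order in races:
        for i, winner in enumerate(order):
            for loser in order[i + 1:]:
                g.add_edge(loser, winner)
    return g

def graph_features(g):
    """Per-horse features to append to the basic feature vector."""
    pagerank = nx.pagerank(g)  # rank flows from losers to winners
    return {
        h: {
            "pagerank": pagerank[h],
            "beat": g.in_degree(h),        # distinct horses this one beat
            "beaten_by": g.out_degree(h),  # distinct horses that beat it
        }
        for h in g.nodes
    }

races = [["A", "B", "C"], ["B", "A", "D"]]
print(graph_features(build_horse_graph(races))["A"])
```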

    Website visualizer: a tool for the visual analysis of website usage

    Websites are incorporated in organizations to support their mission and to guarantee effective information delivery within an efficient information workflow framework. In this context, content managers have to constantly monitor the business needs and reflect them in the structure, contents, and interaction paradigm of the institutional websites. This task is neither trivial nor automated, and it is difficult to guarantee that these websites stay synchronized with the actual business requirements. The overall goal of this work is the conceptualization, development, and evaluation of an application able to assist usability experts in the analysis and visualization of interaction patterns in organizational web-based systems. It should be able to highlight the most critical website areas, based on the analysis of website structure, contents, and interconnections. For this purpose, a conceptual model and architecture have been proposed, as well as a set of visualization methods designed to facilitate that analysis.
In order to validate the proposed conceptual model, architecture, information structures, and visualization methods, a prototype was developed, evaluated, and refined. It can be considered an experimental research platform, capable of integrating and testing specific visualization schemes and visual correlation procedures, and it is part of an ongoing research program at the University of Aveiro. Specifically, this work introduces a layered architecture that supports synchronized multiple views, as well as novel visualization, inspection, and interaction mechanisms. The prototype integrates these visualization methods in an application able to capture, compile, and analyze information about the structure, contents, and usage patterns of a website. The work is mainly meant to help usability experts and content managers organize the informational space of an institutional website. However, the application is not supposed to directly provide solutions for the site's usability problems, but rather to help its users make decisions based on the interpretation of the usability problems identified and highlighted during the analysis process.
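As a rough illustration of the kind of pipeline such a tool needs, the sketch below counts page requests from a web-server access log and overlays them on a site link graph so that a renderer can scale nodes by traffic. The log format, the input shapes, and the networkx representation are assumptions of this sketch, not details of the prototype.

```python
import re
from collections import Counter
import networkx as nx

# Matches the request path in a Common-Log-Format line (an assumed input).
LOG_REQUEST = re.compile(r'"(?:GET|POST) (\S+)')

def usage_counts(log_path):
    """Count requests per URL path from a web-server access log."""
    hits = Counter()
    with open(log_path) as log:
        for line in log:
            match = LOG_REQUEST.search(line)
            if match:
                hits[match.group(1)] += 1
    return hits

def annotate_structure(links, hits):
    """Overlay usage on structure: links is an iterable of
    (source_page, target_page) pairs from a crawl of the site."""
    graph = nx.DiGraph(links)
    for page in graph.nodes:
        graph.nodes[page]["hits"] = hits.get(page, 0)  # node size ~ traffic
    return graph

site = [("/", "/docs"), ("/", "/contact"), ("/docs", "/docs/api")]
graph = annotate_structure(site, Counter({"/": 120, "/docs": 40}))
print(graph.nodes["/docs"])  # {'hits': 40}
```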

    Analysis and Improvement of HITS Algorithm for Detecting Web Communities

    In this paper, we discuss problems with the HITS (Hyperlink-Induced Topic Search) algorithm, which capitalizes on hyperlinks to extract topic-bound communities of web pages. Despite its theoretically sound foundations, we observed that the HITS algorithm failed in real applications. To understand this problem, we developed a visualization tool, LinkViewer, which graphically presents the extraction process. This tool helped reveal that a large and densely linked set of unrelated web pages in the base set impeded the extraction; these pages were obtained when the root set was expanded into the base set. As remedies for this topic-drift problem, prior studies applied textual analysis methods. In contrast, we propose two methods that use only the structural information of the Web: 1) the projection method, which projects eigenvectors onto the root subspace so that most elements in the root set will be relevant to the original topic, and 2) the base-set downsizing method, which filters out pages without links to multiple pages in the root set (sketched below). These methods are shown to be robust for broader types of topics and low in computation cost.
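    A minimal sketch of the second remedy, assuming the link structure is available as an adjacency mapping; the threshold k and the counting of both link directions are illustrative choices, not the paper's exact criterion.

```python
def downsize_base_set(root_set, base_set, out_links, k=2):
    """Keep base-set pages connected to at least k root-set pages.

    out_links maps each page to the set of pages it links to.
    Pages already in the root set are always kept.
    """
    root = set(root_set)

    def root_contacts(page):
        to_root = len(out_links.get(page, set()) & root)
        from_root = sum(1 for r in root if page in out_links.get(r, set()))
        return to_root + from_root

    return {p for p in base_set if p in root or root_contacts(p) >= k}

# Tiny example: "spam" touches only one root page and is filtered out.
out_links = {"r1": {"x", "spam"}, "r2": {"x"}, "x": set(), "spam": set()}
print(sorted(downsize_base_set({"r1", "r2"}, {"r1", "r2", "x", "spam"}, out_links)))
# -> ['r1', 'r2', 'x']
```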