Ranking and aggregation of Slovenian online news

Abstract

Many news websites publish similar stories, and the quality of the articles varies greatly between sources. Several web applications already aggregate similar news, but they often show the user the most recent article even though it is not necessarily the most informative. The purpose of this thesis is to improve on such a basic news aggregator. The thesis covers an analysis of the presentation and content of news websites and the development of a web application that collects news articles, groups each with similar ones, and ranks them so that the better articles are highlighted on the basis of algorithmic evaluation. The application consists of three components, written in JavaScript, TypeScript and Python. The first component collects content and provides access to it through a REST API; it is implemented with Node.js, Express and MongoDB. The second component evaluates and groups the collected texts using machine learning libraries and is implemented in Python. The third component is implemented with the Angular framework and displays the results of the analysis of the collected texts.
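The grouping and ranking step described in the abstract can be illustrated with a minimal sketch. It is not the method used in the thesis, which relies on machine learning libraries that the abstract does not name. The sketch assumes plain bag-of-words cosine similarity for grouping and a hypothetical quality score (the number of distinct words) for ranking. The function names, the `threshold` value and the article fields `id` and `text` are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def group_articles(articles, threshold=0.5):
    """Greedily put each article into the first group whose first
    (seed) article is at least `threshold` similar; otherwise start
    a new group. The threshold is an assumed value."""
    groups = []  # each group is a list of (article, vector) pairs
    for article in articles:
        vec = Counter(tokenize(article["text"]))
        for group in groups:
            if cosine(vec, group[0][1]) >= threshold:
                group.append((article, vec))
                break
        else:
            groups.append([(article, vec)])
    return [[a for a, _ in g] for g in groups]


def rank_group(group):
    """Sort a group so the highest-scoring article comes first.
    The score (distinct word count) is a hypothetical stand-in for
    the algorithmic evaluation mentioned in the abstract."""
    return sorted(
        group,
        key=lambda a: len(set(tokenize(a["text"]))),
        reverse=True,
    )
```

Calling `group_articles` on a list of dictionaries with `id` and `text` keys returns a list of groups, and `rank_group` can then be applied to each group to pick the article shown first. A real system would likely replace both the similarity measure and the score with learned models.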

Repository of the University of Ljubljana

Last time updated on 30/11/2019

This paper was published in Repository of the University of Ljubljana.

Licence: info:eu-repo/semantics/openAccess