Agregacja danych tekstowych na przykładzie systemu informacji prasowej (Aggregation of textual data, using a press information system as an example)

Authors: Dubel Bartosz, Kasprowski Paweł
Publication venue: Silesian University of Technology Press

The vast amount of textual information available on the Internet is becoming an increasingly serious problem, because such data is difficult to analyse automatically. Typical examples of large text collections are web services that publish current press information. Because there are so many of these services, the same or very similar information is repeated across them. This is why so-called "aggregators", which gather and preprocess information from different services, are becoming more and more popular. This paper presents one such aggregator: it collects information from multiple services in one place, parses and analyses it, attempts to classify it, and then generates various statistics from it.

Studia Informatica (E-Journal)