Efficient classification of multi-label data streams with label prioritization

Abstract

Cataloged from PDF version of article. Includes bibliographical references (leaves 31-35).

Real-time data processing systems generate huge amounts of data that need to be classified. The volume, variety, velocity, and veracity (uncertainty) of this data necessitate new approaches and the adaptation of existing classification methods. Moreover, the arriving data can belong to more than one class at the same time. As the number of labels grows, a significant portion of multi-label data stream classification methods become computationally inefficient. We propose a novel online approach, the Prioritized Binary Transformation (PBT) method, which can classify data with large numbers of labels by ordering the labels using Principal Component Analysis (PCA) within a fixed-size window. This order is then used to transform the label vectors for classification. We perform an empirical analysis on 12 datasets and compare PBT to four prominent baselines using four evaluation metrics. PBT achieves the best average ranking in three of the four evaluation metrics. Moreover, we investigate efficiency in terms of average execution time per data item and memory consumption, where PBT achieves the second and first average rankings, respectively.

by Onur Yıldırı
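The abstract does not spell out how the PCA-based label order is computed or how the label vectors are transformed, so the following is only a minimal sketch of one plausible reading, not the thesis's actual procedure. It assumes that labels are ranked by the magnitude of their loading on the first principal component of the label vectors in the current window, and that the "binary transformation" packs the reordered bits into a single integer. All function names (`first_principal_component`, `label_order`, `transform`, `inverse_transform`) are hypothetical.

```python
from collections import deque


def first_principal_component(rows, iters=200):
    """Leading eigenvector of the label covariance matrix, by power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(d)]
           for i in range(d)]
    v = [1.0 + 0.01 * j for j in range(d)]  # fixed start vector (assumption)
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:  # no variance in the window: no preference among labels
            return [0.0] * d
        v = [x / norm for x in w]
    return v


def label_order(window):
    """Label indices sorted by |loading| on the first component, largest first."""
    pc = first_principal_component(list(window))
    return sorted(range(len(pc)), key=lambda j: -abs(pc[j]))


def transform(label_vector, order):
    """Assumed encoding: read the bits in priority order into one integer."""
    code = 0
    for j in order:
        code = (code << 1) | label_vector[j]
    return code


def inverse_transform(code, order):
    """Undo transform(): recover the original 0/1 label vector."""
    labels = [0] * len(order)
    for j in reversed(order):
        labels[j] = code & 1
        code >>= 1
    return labels


# Fixed-size window over the most recent label vectors (size 4 is arbitrary).
window = deque(maxlen=4)
for y in ([0, 0, 1], [0, 0, 0], [0, 0, 1], [0, 0, 0]):
    window.append(y)

order = label_order(window)
code = transform([1, 0, 1], order)
```

In this toy window only the third label varies, so it receives the highest priority and ends up in the most significant bit of the code. A real implementation would also have to decide when to recompute the order as the window slides, which the abstract does not describe.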

Bilkent University Institutional Repository

Last updated on 03/11/2025

This paper was published in Bilkent University Institutional Repository.

Licence: info:eu-repo/semantics/openAccess