
    Not-So-Linked Solution to the Linked Data Mining Challenge 2016

    Abstract. We present a solution for the Linked Data Mining Challenge 2016 that achieved 92.5% accuracy according to the submission system. The solution uses a hand-crafted dataset that was created by scraping various websites for reviews. We use logistic regression to learn a classification model, and we publish all our results on GitHub.
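The abstract names logistic regression over scraped review text but gives no implementation details. The following is only a minimal sketch of that general idea, assuming bag-of-words counts as features and plain batch gradient descent as the optimizer. The toy reviews, the labels, and every function name are hypothetical illustrations, not the authors' published code or data.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization is enough for a toy example.
    return text.lower().split()


def build_vocab(docs):
    # Map every distinct token to a column index.
    tokens = sorted({tok for doc in docs for tok in tokenize(doc)})
    return {tok: i for i, tok in enumerate(tokens)}


def vectorize(doc, vocab):
    # Bag-of-words count vector; tokens outside the vocabulary are ignored.
    vec = [0.0] * len(vocab)
    for tok, count in Counter(tokenize(doc)).items():
        if tok in vocab:
            vec[vocab[tok]] = float(count)
    return vec


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logreg(X, y, lr=0.5, epochs=200):
    # Batch gradient descent on the mean log-loss (no regularization).
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(doc, vocab, w, b):
    # Label 1 means "good album", 0 means "bad album".
    x = vectorize(doc, vocab)
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if sigmoid(z) >= 0.5 else 0


# Hypothetical toy reviews standing in for the scraped dataset.
reviews = [
    "a brilliant and inventive record",
    "dull and forgettable songs",
    "inventive songs brilliant production",
    "forgettable dull production",
]
labels = [1, 0, 1, 0]

vocab = build_vocab(reviews)
w, b = train_logreg([vectorize(r, vocab) for r in reviews], labels)
```

Calling `predict("brilliant inventive", vocab, w, b)` then classifies a new snippet with the learned weights. A real pipeline would more likely rely on an established library and regularization; the hand-written optimizer here only keeps the sketch dependency-free.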

    A hybrid method for rating prediction using linked data features and text reviews

    This paper describes our entry for the Linked Data Mining Challenge 2016, which poses the problem of classifying music albums as good or bad by mining Linked Data. The original labels are assigned according to aggregated critic scores published on the Metacritic website. To this end, the challenge provides datasets that contain the DBpedia references for music albums. Our approach uses Linked Data (LD) and free text to extract features that help separate these two classes of music albums. Our features fall into three groups: (1) direct object LD features, (2) aggregated count LD features, and (3) textual review features. To build unbiased models, we filtered out properties related to scores and Metacritic. Using these feature sets, we trained seven models and estimated their performance with 10-fold cross-validation. The best average accuracy on the training data was 87.81%, obtained with a Linear SVM model and all our features, while we reached 90% on the test data. This research is partly supported by The Scientific and Technological Research Council of Turkey (Ref.No: B.14.2. TBT.0.06.01-21514107-020-155998
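The evaluation protocol in this abstract, averaging accuracy over 10 folds, can be sketched as follows. This is an illustrative harness under stated assumptions, not the authors' setup: the fold splitter is contiguous and unshuffled, the dataset is assumed to have at least `k` samples, and a majority-class baseline stands in for the Linear SVM purely to keep the example self-contained. All names and the toy data are hypothetical.

```python
def k_fold_indices(n_samples, k=10):
    # Split range(n_samples) into k contiguous folds of near-equal size.
    # Assumption: n_samples >= k, so that no fold is empty.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_val_accuracy(X, y, fit, predict, k=10):
    # Train on k-1 folds, score on the held-out fold, and average the k scores.
    scores = []
    for test_idx in k_fold_indices(len(X), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct = sum(predict(model, X[i]) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)


def fit_majority(X_train, y_train):
    # Stand-in "model": the most frequent training label (not an SVM).
    return max(set(y_train), key=y_train.count)


def predict_majority(model, x):
    # Always predicts the stored majority label, ignoring the features.
    return model


# Hypothetical toy data: 12 "good" albums followed by 8 "bad" ones.
X = [[float(i)] for i in range(20)]
y = [1] * 12 + [0] * 8

mean_accuracy = cross_val_accuracy(X, y, fit_majority, predict_majority, k=10)
```

Any classifier exposing the same `fit` and `predict` callables could be dropped in, with each sample's vector formed by concatenating the three feature groups. In practice the samples should be shuffled, or the folds stratified by label, before splitting, since class-sorted data like the toy set above distorts per-fold scores.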