
    문장단위의 이종 그래프에 기반한 뉴스 중복 제거 (News Deduplication Based on a Sentence-Level Heterogeneous Graph)

    Thesis (Master's) -- Seoul National University Graduate School: Department of Electrical and Computer Engineering, February 2015. 이상구.

    With the flourishing development of online media, dealing with abusing news is becoming an essential requirement for portal news websites. However, previous research has focused only on improving the efficiency or accuracy of near-duplicate news detection, and has rarely considered which news articles should be deleted and which retained. We therefore propose a heterogeneous graph-based news filtering framework that uses a novel sentence-level graph model for a new generation of duplicate news filtering. The framework is composed of two basic algorithms. First, a sentence-level near-duplicate news detection algorithm extracts and identifies more duplicate news pairs. Second, a graph-ranking-based representative news selection algorithm calculates an accurate representative score. The proposed framework was tested on a real-world dataset, and the experimental results show that the proposed algorithms effectively improve the accuracy of descriptive news selection.

    Table of contents:
    Chapter 1 Introduction
        1.1 Background
        1.2 Motivation
        1.3 Outline
    Chapter 2 Related Work
        2.1 Near-duplicate detection
        2.2 Graph-based representative selection
            2.2.1 TextRank
            2.2.2 CoRank
            2.2.3 FutureRank
            2.2.4 MutualRank
            2.2.5 Other Approach
    Chapter 3 Preliminaries
        3.1 Problem Definition
    Chapter 4 Framework
        4.1 Near-Duplicate Detection
        4.2 Representative Selection
            4.2.1 Graph Model
            4.2.2 Algorithm
    Chapter 5 Experiment
        5.1 Data Preparation
        5.2 Evaluation
            5.2.1 Near-duplicate detection
            5.2.2 Representative Selection
    Chapter 6 Conclusion
    Bibliography
    Abstract (Korean)
    Acknowledgements
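The two stages named in the abstract (sentence-level near-duplicate detection, then graph-ranking-based representative selection) can be illustrated with a minimal sketch. This is not the thesis's algorithm: the record does not give its heterogeneous graph model or scoring formula, so the sketch substitutes Jaccard overlap of sentence sets for detection and a weighted PageRank-style iteration over a plain article graph for ranking. All function names, the `threshold` and `damping` values, and the sample articles are assumptions made for illustration.

```python
import re
from itertools import combinations


def split_sentences(text):
    """Naive splitter (assumption): break after '.', '!' or '?' plus whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return {p.lower() for p in parts if p}


def sentence_overlap(a, b):
    """Jaccard similarity between two sets of sentences."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def duplicate_pairs(articles, threshold=0.5):
    """Return (id_i, id_j, similarity) for pairs at or above the threshold.

    The threshold must be positive so that every edge weight is positive.
    """
    sents = {key: split_sentences(text) for key, text in articles.items()}
    edges = []
    for i, j in combinations(sorted(sents), 2):
        sim = sentence_overlap(sents[i], sents[j])
        if sim >= threshold:
            edges.append((i, j, sim))
    return edges


def rank_articles(nodes, edges, damping=0.85, iterations=50):
    """Weighted PageRank-style power iteration on an undirected graph."""
    neighbours = {n: {} for n in nodes}
    for i, j, w in edges:
        neighbours[i][j] = w
        neighbours[j][i] = w
    score = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_score = {}
        for n in nodes:
            # Each neighbour passes on its score in proportion to edge weight.
            incoming = sum(
                score[m] * w / sum(neighbours[m].values())
                for m, w in neighbours[n].items()
            )
            new_score[n] = (1 - damping) / len(nodes) + damping * incoming
        score = new_score
    return score


# Hypothetical sample data, not taken from the thesis's dataset.
articles = {
    "a": "Fire hits downtown. Three people hurt. Police investigate.",
    "b": "Fire hits downtown. Three people hurt. Mayor responds.",
    "c": "Fire hits downtown. Three people hurt. Police investigate. Mayor responds.",
    "d": "Stocks rally on earnings. Tech leads gains.",
}
edges = duplicate_pairs(articles)
cluster = sorted({n for i, j, _ in edges for n in (i, j)})
scores = rank_articles(cluster, edges)
representative = max(scores, key=scores.get)
print(representative)
```

In this toy cluster, article "c" shares the most sentences with the other two and receives the highest score, so it would be the one retained while "a" and "b" are filtered out. Article "d" shares no sentences with the others and never enters the duplicate graph.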