3 research outputs found

    A Review of Methods and Applications of the Author-Topic Model and Its Improvements

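    [Purpose/Significance] The author-topic model, a probabilistic model that has attracted considerable attention in computer science in recent years, has been widely applied in text mining, natural language processing, and related areas. This paper analyzes the approaches to and applications of the author-topic model and its improvements, both in China and abroad, to give a clearer picture of the current state of research and to provide a reference for researchers in computer science, library and information science, and related fields. [Method/Process] The paper draws its literature from the Web of Science Core Collection, DBLP, and the China National Knowledge Infrastructure (CNKI) database. Through defined search rules, deduplication, and manual screening, it extracts a set of publications on the author-topic model and its improved methods, and then summarizes existing research from the perspective of the model application process using literature analysis. [Results/Conclusions] The analysis finds that existing research has formed a fairly complete analysis workflow, and that the angles of model improvement and the fields of application are increasingly diverse. However, performance optimization, the standardization and refinement of model evaluation metrics, and further application in library and information science still require deeper exploration.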

    Term-community-based topic detection with variable resolution

    Network-based procedures for topic detection in huge text collections offer an intuitive alternative to probabilistic topic models. We present in detail a method that is especially designed with the requirements of domain experts in mind. Like similar methods, it employs community detection in term co-occurrence graphs, but it is enhanced by including a resolution parameter that can be used for changing the targeted topic granularity. We also establish a term ranking and use semantic word-embedding for presenting term communities in a way that facilitates their interpretation. We demonstrate the application of our method with a widely used corpus of general news articles and show the results of detailed social-sciences expert evaluations of detected topics at various resolutions. A comparison with topics detected by Latent Dirichlet Allocation is also included. Finally, we discuss factors that influence topic interpretation. Comment: 31 pages, 6 figures

    Mining and Analyzing the Academic Network

    Social network research has attracted the interest of many researchers, not only in analyzing online social networking applications such as Facebook and Twitter, but also in providing comprehensive services in the scientific research domain. We define an Academic Network as a social network that integrates scientific factors, such as authors, papers, affiliations, and publishing venues, and their relationships, such as co-authorship among authors and citations among papers. By mining and analyzing the academic network, we can provide users with comprehensive services such as searching for research experts, published papers, and conferences, as well as detecting research communities or the evolution of hot research topics. We can also provide recommendations to users on with whom to collaborate, whom to cite, and where to submit. In this dissertation, we investigate two main tasks that have fundamental applications in academic network research. In the first, we address the problem of expertise retrieval, also known as expert finding or ranking, in which we identify and return a ranked list of researchers, based upon their estimated expertise or reputation, in response to user-specified queries. In the second, we address the problem of research action recommendation (prediction), specifically, the tasks of publishing venue recommendation, citation recommendation, and coauthor recommendation. For both tasks, our principal goal is to effectively mine and integrate heterogeneous information and thereby develop well-functioning ranking or recommender systems.
For the task of expertise retrieval, we first proposed or applied three modified versions of PageRank-like algorithms to citation network analysis; we then proposed an enhanced author-topic model by simultaneously modeling citation and publishing venue information; we finally incorporated the pair-wise learning-to-rank algorithm into the traditional topic modeling process, and further improved the model by integrating groups of author-specific features. For the task of research action recommendation, we first proposed an improved neighborhood-based collaborative filtering approach for publishing venue recommendation; we then applied our proposed enhanced author-topic model and demonstrated its effectiveness in both cited author prediction and publishing venue prediction; finally, we proposed an extended latent factor model that can jointly model several relations in an academic environment in a unified way and verified its performance in four recommendation tasks: the recommendation of author co-authorship, author-paper citation, paper-paper citation, and paper-venue submission. Extensive experiments conducted on large-scale real-world data sets demonstrated the superiority of our proposed models over other existing state-of-the-art methods.