    Exploratory topic modeling with distributional semantics

    As we continue to collect and store textual data in a multitude of domains, we are regularly confronted with material whose largely unknown thematic structure we want to uncover. With unsupervised, exploratory analysis, no prior knowledge about the content is required and highly open-ended tasks can be supported. In the past few years, probabilistic topic modeling has emerged as a popular approach to this problem. Nevertheless, the representation of the latent topics as aggregations of semi-coherent terms limits their interpretability and level of detail. This paper presents an alternative approach to topic modeling that maps topics as a network for exploration, based on distributional semantics using learned word vectors. From the granular level of terms and their semantic similarity relations, global topic structures emerge as clustered regions and gradients of concepts. Moreover, the paper discusses the visual interactive representation of the topic map, which plays an important role in supporting its exploration. Comment: Conference: The Fourteenth International Symposium on Intelligent Data Analysis (IDA 2015)
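The idea of topics emerging as clustered regions of a term-similarity network can be illustrated with a minimal sketch. This is not the paper's method: the toy 3-dimensional vectors, the 0.9 cosine threshold, and the use of connected components as "topics" are all assumptions chosen only to make the general idea concrete.

```python
import math
from itertools import combinations

# Hypothetical toy "word vectors"; real ones would be learned from a corpus.
VECTORS = {
    "patent": (0.9, 0.1, 0.0),
    "claim": (0.8, 0.2, 0.1),
    "invention": (0.85, 0.15, 0.05),
    "budget": (0.1, 0.9, 0.1),
    "cost": (0.05, 0.95, 0.0),
    "forecast": (0.2, 0.8, 0.2),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_graph(vectors, threshold):
    """Link two terms when their cosine similarity reaches the threshold."""
    graph = {word: set() for word in vectors}
    for a, b in combinations(vectors, 2):
        if cosine(vectors[a], vectors[b]) >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def connected_components(graph):
    """Group terms into components via depth-first search (assumed stand-in for clustering)."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(graph[node] - seen)
        components.append(component)
    return components


# The threshold of 0.9 is an arbitrary assumption for this toy data.
topics = connected_components(similarity_graph(VECTORS, threshold=0.9))
for topic in topics:
    print(sorted(topic))
```

With these toy vectors the terms separate into two groups, one around patents and one around costs. A real topic map would presumably use softer clustering and a layout algorithm rather than a hard threshold.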

    Discover semantic topics in patents within a specific domain

    © Rinton Press. Patent topic discovery is critical for innovation-oriented enterprises to hedge patent application risks and raise the success rate of patent applications. Topic models are commonly recognized as an efficient tool for this task by researchers from both academia and industry. However, many well-known existing topic models, e.g., Latent Dirichlet Allocation (LDA), which are designed for documents represented as word vectors, exhibit low accuracy and poor interpretability on the patent topic discovery task. The reasons are that (1) the semantics of documents in a specific domain are still under-explored and (2) domain background knowledge is not successfully utilized to guide the process of topic discovery. To improve accuracy and interpretability, we propose a new patent representation and organization with additional inter-word relationships mined from the title, abstract, and claims of patents. The representation can endow each patent with more semantics than a word vector. Meanwhile, we build a Backbone Association Link Network (Backbone ALN) to incorporate domain background semantics and further enhance the semantics of patents. With the new semantic-rich patent representations, we propose a Semantic LDA model to discover semantic topics from patents within a specific domain. It can discover semantic topics with association relations between words rather than a single word vector. Finally, the accuracy and interpretability of the proposed model are verified on real-world patent datasets from the United States Patent and Trademark Office. The experimental results show that the Semantic LDA model yields better performance than conventional models (e.g., LDA). Furthermore, our proposed model can be easily generalized to other related text mining corpora.

    Machine Learning in Management Accounting Research: Literature Review and Pathways for the Future

    This paper explores the possibilities of employing machine learning (ML) methods and new data sources in management accounting (MA) research. A review of current accounting and related research reveals that ML methods in MA are still in their infancy. However, a review of recently published ML research from related fields reveals several new opportunities to utilize ML in MA research. We suggest that the most promising areas to employ ML methods in MA research lie in (1) the exploitation of the rich potential of various textual data sources; (2) the quantification of qualitative and unstructured data to create new measures; (3) the creation of better estimates and predictions; and (4) the use of explainable AI to interpret ML models in detail. ML methods can play a crucial role in MA research by creating, developing, and refining theories through induction and abduction, as well as by providing tools for interventionist studies. © 2022 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group. This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives License (http://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited, and is not altered, transformed, or built upon in any way. Peer reviewed.