14 research outputs found

    Opracowanie w chmurze czy chmury nad opracowaniem? Automatyczne indeksowanie dokumentów a biblioteki [Cataloging in the cloud or clouds over cataloging? Automatic indexing of documents and libraries]

    The paper presents recent research on the automatic indexing of text documents, including in libraries, and the attitudes of Polish academic librarians towards the computerization of subject cataloging. A literature review and a survey were used, along with an analysis of Polish academic curricula in library and information science. The article demonstrates with several examples that similarities in document layout and topical diversity or homogeneity are the key factors in the computerization of cataloging. The survey conducted among Polish subject indexing specialists from academic libraries shows that they have very limited knowledge of automatic indexing. The results are then compared with the findings of the study on German- and English-speaking librarians’ opinions about automatic subject indexing. They are similar to the outcomes of the previous research by Alice Keller into the attitudes of, among others, English-speaking subjects.

    Increasing trend of scientists to switch between topics

    We analyze the publication records of individual scientists, aiming to quantify the topic switching dynamics of scientists and its influence. For each scientist, the relations among her publications are characterized via shared references. We find that the co-citing network of a scientist's papers exhibits a clear community structure in which each major community represents a research topic. Our analysis suggests that the number of topics per scientist has a narrow distribution. However, researchers nowadays switch between topics more frequently than those in the early days. We also find that a high switching probability in the early career (the first 12 years) is associated with low overall productivity, while in the later career it is correlated with high overall productivity. Interestingly, the average number of citations per paper is negatively correlated with the switching probability in all career stages. We propose a model with exploitation and exploration mechanisms that can explain the main observed features. Comment: 37 pages, 21 figures

    Research complexity of Australian universities

    Strategic research direction and prioritisation are crucial for decision making in universities. Analysis of research diversification and sophistication helps differentiate universities according to their research attributes. Based on the Microsoft Academic Graph data set, this paper conducts a research complexity analysis for all Australian universities and examines the ubiquity and diversity of their research output. The paper also investigates the research complexity indices of Australian universities, with further discussion of universities with research leadership, universities with technological and practical focuses, and young research universities.

    Citation Analysis with Microsoft Academic

    We explore if and how Microsoft Academic (MA) could be used for bibliometric analyses. First, we examine the Academic Knowledge API (AK API), an interface to access MA data, and compare it to Google Scholar (GS). Second, we perform a comparative citation analysis of researchers by normalizing data from MA and Scopus. We find that MA offers structured and rich metadata, which facilitates data retrieval, handling and processing. In addition, the AK API allows retrieving frequency distributions of citations. We consider these features to be a major advantage of MA over GS. However, we identify four main limitations regarding the available metadata. First, MA does not provide the document type of a publication. Second, the 'fields of study' are dynamic, too specific and field hierarchies are incoherent. Third, some publications are assigned to incorrect years. Fourth, the metadata of some publications did not include all authors. Nevertheless, we show that an average-based indicator (i.e. the journal normalized citation score; JNCS) as well as a distribution-based indicator (i.e. percentile rank classes; PR classes) can be calculated with relative ease using MA. Hence, normalization of citation counts is feasible with MA. The citation analyses in MA and Scopus yield uniform results. The JNCS and the PR classes are similar in both databases, and, as a consequence, the evaluation of the researchers' publication impact is congruent in MA and Scopus. Given the fast development in the last year, we postulate that MA has the potential to be used for full-fledged bibliometric analyses. Comment: preprint
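
The abstract above names the journal normalized citation score (JNCS) but does not define it. As a rough sketch only, assuming the common reading of JNCS as a paper's citation count divided by the mean citation count of papers from the same journal and year, the calculation could look like the following. The record fields (`id`, `journal`, `year`, `citations`) and the function name are hypothetical and are not taken from the Academic Knowledge API.

```python
from collections import defaultdict
from statistics import mean


def journal_normalized_scores(papers):
    """Return {paper id: JNCS}, assuming JNCS = citations / mean citations
    of all supplied papers from the same journal and publication year.

    `papers` is a list of dicts with hypothetical keys:
    "id", "journal", "year", "citations".
    """
    # Collect citation counts per (journal, year) reference set.
    groups = defaultdict(list)
    for paper in papers:
        groups[(paper["journal"], paper["year"])].append(paper["citations"])

    # Expected citation rate for each reference set.
    expected = {key: mean(counts) for key, counts in groups.items()}

    scores = {}
    for paper in papers:
        baseline = expected[(paper["journal"], paper["year"])]
        # The score is undefined when the reference set has no citations.
        scores[paper["id"]] = paper["citations"] / baseline if baseline > 0 else None
    return scores
```

A score above 1 would then mean the paper is cited more often than the average paper in its journal and year. In practice the reference set would come from the database itself rather than from the researcher's own list.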

    The coverage of Microsoft Academic: Analyzing the publication output of a university

    This is the first detailed study on the coverage of Microsoft Academic (MA). Based on the complete and verified publication list of a university, the coverage of MA was assessed and compared with two benchmark databases, Scopus and Web of Science (WoS), on the level of individual publications. Citation counts were analyzed, and issues related to data retrieval and data quality were examined. A Perl script was written to retrieve metadata from MA based on publication titles. The script is freely available on GitHub. We find that MA covers journal articles, working papers, and conference items to a substantial extent and indexes more document types than the benchmark databases (e.g., working papers, dissertations). MA clearly surpasses Scopus and WoS in covering book-related document types and conference items but falls slightly behind Scopus in journal articles. The coverage of MA is favorable for evaluative bibliometrics in most research fields, including economics/business, computer/information sciences, and mathematics. However, MA shows biases similar to Scopus and WoS with regard to the coverage of the humanities, non-English publications, and open-access publications. Rank correlations of citation counts are high between MA and the benchmark databases. We find that the publication year is correct for 89.5% of all publications and the number of authors is correct for 95.1% of the journal articles. Given the fast and ongoing development of MA, we conclude that MA is on the verge of becoming a bibliometric superpower. However, comprehensive studies on the quality of MA metadata are still lacking.

    Eigenvector-Based Centrality Measures for Temporal Networks

    Numerous centrality measures have been developed to quantify the importances of nodes in time-independent networks, and many of them can be expressed as the leading eigenvector of some matrix. With the increasing availability of network data that changes in time, it is important to extend such eigenvector-based centrality measures to time-dependent networks. In this paper, we introduce a principled generalization of network centrality measures that is valid for any eigenvector-based centrality. We consider a temporal network with N nodes as a sequence of T layers that describe the network during different time windows, and we couple centrality matrices for the layers into a supra-centrality matrix of size NT × NT whose dominant eigenvector gives the centrality of each node i at each time t. We refer to this eigenvector and its components as a joint centrality, as it reflects the importances of both the node i and the time layer t. We also introduce the concepts of marginal and conditional centralities, which facilitate the study of centrality trajectories over time. We find that the strength of coupling between layers is important for determining multiscale properties of centrality, such as localization phenomena and the time scale of centrality changes. In the strong-coupling regime, we derive expressions for time-averaged centralities, which are given by the zeroth-order terms of a singular perturbation expansion. We also study first-order terms to obtain first-order-mover scores, which concisely describe the magnitude of nodes' centrality changes over time. As examples, we apply our method to three empirical temporal networks: the United States Ph.D. exchange in mathematics, costarring relationships among top-billed actors during the Golden Age of Hollywood, and citations of decisions from the United States Supreme Court. Comment: 38 pages, 7 figures, and 5 tables
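
To make the supra-centrality construction in the abstract above more concrete, here is a minimal sketch using NumPy. It assumes one plausible coupling scheme: the layers' centrality matrices sit on the block diagonal, and each node is linked to its own copy in the adjacent time layers with weight `omega`. The abstract does not spell out the coupling, so the function names, the `omega` parameter, and the normalization are illustrative assumptions rather than the authors' exact formulation.

```python
import numpy as np


def supra_centrality_matrix(layers, omega):
    """Build an NT x NT matrix from T centrality matrices of size N x N.

    Assumption: consecutive layers are coupled node-to-node with weight omega.
    """
    T = len(layers)
    N = layers[0].shape[0]
    supra = np.zeros((N * T, N * T))
    # Place each layer's centrality matrix on the block diagonal.
    for t, C in enumerate(layers):
        supra[t * N:(t + 1) * N, t * N:(t + 1) * N] = C
    # Link every node to itself in the previous and next time layer.
    chain = np.diag(np.ones(T - 1), 1) + np.diag(np.ones(T - 1), -1)
    supra += omega * np.kron(chain, np.eye(N))
    return supra


def joint_centrality(layers, omega):
    """Return a T x N array W with W[t, i] = joint centrality of node i at time t."""
    T = len(layers)
    N = layers[0].shape[0]
    supra = supra_centrality_matrix(layers, omega)
    eigenvalues, eigenvectors = np.linalg.eig(supra)
    # The dominant eigenvector belongs to the eigenvalue with the largest real part.
    dominant = np.real(eigenvectors[:, np.argmax(np.real(eigenvalues))])
    dominant = np.abs(dominant)  # fix the arbitrary overall sign
    dominant /= dominant.sum()   # normalize to sum to 1 (a choice made here)
    return dominant.reshape(T, N)
```

With `W = joint_centrality(layers, omega)`, summing over time (`W.sum(axis=0)`) gives one marginal score per node, and summing over nodes (`W.sum(axis=1)`) gives one per time layer. Using adjacency matrices as the `layers` corresponds to eigenvector centrality; other eigenvector-based centralities would supply different matrices.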

    Lexicons of Key Terms in Scholarly Texts and Their Disciplinary Differences: From Quantum Semantics Construction to Relative-Entropy-Based Comparisons

    Complex networks are often used to analyze written text and reports by rendering texts in the form of a semantic network, forming a lexicon of words or key terms. Many existing methods to construct lexicons are based on counting word co-occurrences, having the advantage of simplicity and ease of applicability. Here, we use a quantum semantics approach to generalize such methods, allowing us to model the entanglement of terms and words. We show how quantum semantics can be applied to reveal disciplinary differences in the use of key terms by analyzing 12 scholarly texts that represent the different positions of various disciplinary schools (of conceptual change research) on the same topic (conceptual change). In addition, attention is paid to how closely the lexicons corresponding to different positions can be brought into agreement by suitable tuning of the entanglement factors. In comparing the lexicons, we invoke complex network-based analysis based on exponential matrix transformation and use information theoretic relative entropy (Jensen–Shannon divergence) as the operationalization of differences between lexicons. The results suggest that quantum semantics is a viable way to model the disciplinary differences of lexicons and how they can be tuned for a better agreement.
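
The abstract above uses the Jensen–Shannon divergence to measure differences between lexicons. The sketch below shows only that final comparison step, on the simplifying assumption that each lexicon has already been reduced to a mapping from key term to a non-negative weight. The quantum-semantics construction and the exponential matrix transformation are not reproduced here, and the function names are hypothetical.

```python
import math


def _normalize(weights, terms):
    """Turn term weights into a probability distribution over `terms`."""
    total = sum(weights.values())
    return {term: weights.get(term, 0.0) / total for term in terms}


def _kl_bits(a, b, terms):
    """Kullback-Leibler divergence KL(a || b) in bits; zero-probability terms in a contribute nothing."""
    return sum(a[t] * math.log2(a[t] / b[t]) for t in terms if a[t] > 0)


def jensen_shannon_divergence(lexicon_p, lexicon_q):
    """Jensen-Shannon divergence (in bits) between two term-weight lexicons."""
    terms = set(lexicon_p) | set(lexicon_q)
    p = _normalize(lexicon_p, terms)
    q = _normalize(lexicon_q, terms)
    # The mixture is positive wherever p or q is, so the logarithms stay finite.
    mixture = {t: 0.5 * (p[t] + q[t]) for t in terms}
    return 0.5 * _kl_bits(p, mixture, terms) + 0.5 * _kl_bits(q, mixture, terms)
```

With base-2 logarithms the value lies between 0 (identical distributions) and 1 (no shared terms), which makes scores comparable across pairs of lexicons.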
