    Using citation-context to reduce topic drifting on pure citation-based recommendation

    Recent works in the area of academic recommender systems have demonstrated the effectiveness of co-citation and citation closeness in related-document recommendations. However, documents recommended from such systems may drift away from the main theme of the query document. In this work, we investigate whether incorporating the textual information in close proximity to a citation as well as the citation position could reduce such drifting and further increase the performance of the recommender system. To investigate this, we run experiments with several recommendation methods on a newly created and now publicly available dataset containing 53 million unique citation-based records. We then conduct a user-based evaluation with domain-knowledgeable participants. Our results show that a new method based on the combination of Citation Proximity Analysis (CPA), topic modelling and word embeddings achieves more than 20% improvement in Normalised Discounted Cumulative Gain (nDCG) compared to CPA
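    The nDCG metric named in the abstract above can be illustrated with a minimal sketch. It assumes the common linear-gain formulation with a log2 rank discount; the abstract does not state which nDCG variant the authors used, and the function names `dcg` and `ndcg` are illustrative only.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain: each graded relevance is divided by log2(rank + 1)."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))


def ndcg(relevances):
    """Normalise DCG by the DCG of the ideal (descending) ordering of the same grades."""
    ideal = dcg(sorted(relevances, reverse=True))
    # Assumption: a list with no relevant items scores 0.0 rather than raising.
    return dcg(relevances) / ideal if ideal > 0 else 0.0
```

    A recommendation list whose relevance grades are already in descending order scores 1.0; any other ordering of the same grades scores lower.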

    The Closer the Better: Similarity of Publication Pairs at Different Co-Citation Levels

    We investigate the similarities of pairs of articles which are co-cited at the different co-citation levels of the journal, article, section, paragraph, sentence and bracket. Our results indicate that textual similarity, intellectual overlap (shared references), author overlap (shared authors) and proximity in publication time all rise monotonically as the co-citation level gets lower (from journal to bracket). While the main gain in similarity happens when moving from journal to article co-citation, all level changes entail an increase in similarity, especially section to paragraph and paragraph to sentence/bracket levels. We compare results from four journals over the years 2010-2015: Cell, the European Journal of Operational Research, Physics Letters B and Research Policy, with consistent general outcomes and some interesting differences. Our findings motivate the use of granular co-citation information as defined by meaningful units of text, with implications for, among others, the elaboration of maps of science and the retrieval of scholarly literature.

    Birds of a Feather - Better Together? Exploring the Optimal Spatial Distribution of Ethnic Inventors

    We examine how the spatial and social proximity of inventors affects knowledge flows, focusing especially on how the two forms of proximity interact. We develop a knowledge flow production function (KFPF) as a flexible tool for modeling access to knowledge and show that the optimal spatial concentration of socially proximate inventors in a city or nation depends on whether spatial and social proximity are complements or substitutes in facilitating knowledge flows. We employ patent citation data, using same-MSA and co-ethnicity as proxies for spatial and social proximity, respectively, to estimate the key KFPF parameters. Although co-location and co-ethnicity both predict knowledge flows, the marginal benefit of co-location is significantly less for co-ethnic inventors. These results imply that dispersion of socially proximate individuals is optimal from the perspectives of the city and the economy. In contrast, for socially proximate individuals themselves, spatial concentration is preferred - and the only stable equilibrium.

    Open Access Scientometrics and the UK Research Assessment Exercise

    Scientometric predictors of research performance need to be validated by showing that they have a high correlation with the external criterion they are trying to predict. The UK Research Assessment Exercise (RAE), together with the growing movement toward making the full texts of research articles freely available on the web, offers a unique opportunity to test and validate a wealth of old and new scientometric predictors through multiple regression analysis: publications, journal impact factors, citations, co-citations, citation chronometrics (age, growth, latency to peak, decay rate), hub/authority scores, h-index, prior funding, student counts, co-authorship scores, endogamy/exogamy, textual proximity, download/co-downloads and their chronometrics, etc. can all be tested and validated jointly, discipline by discipline, against their RAE panel rankings in the forthcoming parallel panel-based and metric RAE in 2008. The weights of each predictor can be calibrated to maximize the joint correlation with the rankings. Open Access Scientometrics will provide powerful new means of navigating, evaluating, predicting and analyzing the growing Open Access database, as well as powerful incentives for making it grow faster.

    A Correlation Study of Co-opinion and Co-citation Similarity Measures

    Co-citation forms a relational document network. Co-citation-based measures are found to be effective in retrieving relevant documents. However, they are far from ideal and need further enhancements. The co-opinion concept was proposed and tested in previous research and found to be effective in retrieving relevant documents. The present study endeavors to explore the correlation between opinion (dis)similarity measures and the traditional co-citation-based ones, including Citation Proximity Index (CPI), co-citedness and co-citation context similarity. The results show significant, though weak to medium, correlations between the variables. The correlations are direct for the co-opinion measure and inverse for the opinion distance. Accordingly, the two groups of measures are revealed to represent some similar aspects of the document relation. Moreover, the weakness of the correlations implies that there are different dimensions represented by the two groups

    Citation Analysis of North American Symposium on Knowledge Organization (NASKO) proceedings (2007-2015)

    Knowledge Organization (KO) theoretical foundations are still being developed in a continuous process of epistemological, theoretical and methodological consolidation. The remarkable growth of scientific records has stimulated the analysis of this production, and the creation of instruments to evaluate the behavior of science has become indispensable. We propose the Domain Analysis of KO in North America through the citation analysis of North American Symposium on Knowledge Organization (NASKO) proceedings (2007-2015). We present the citation, co-citation and bibliographic coupling analysis to visualize and recognize the researchers that influence the scholarly communication in this domain. The most prolific authors through NASKO conferences are Smiraglia, Tennis, Green, Dousa, Grant Campbell, Pimentel, Beak, La Barre, Kipp and Fox. Regarding their theoretical references, Hjørland, Olson, Smiraglia, and Ranganathan are the authors who most inspired the event's studies. The co-citation network shows the highest frequency is between Olson and Mai, followed by Hjørland and Mai and Beghtol and Mai, consolidating Mai and Hjørland as the central authors of the theoretical references in NASKO. The strongest theoretical proximity in the author bibliographic coupling network occurs between Fox and Tennis, Dousa and Tennis, Tennis and Smiraglia, Dousa and Beak, and Pimentel and Tennis, highlighting Tennis as the central author who interconnects the others in relation to KO theoretical references in NASKO. The North American chapter has demonstrated a strong scientific production as well as a high level of concern with theoretical and epistemological questions, gathering researchers from different countries, universities and knowledge areas.
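    The two network measures named in the abstract above, co-citation and bibliographic coupling, can be contrasted with a small sketch using their standard definitions. It is not the authors' procedure: the function names and the input shape (a dict mapping each citing item to the set of items it cites) are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def co_citation_counts(references):
    """Count how often each pair of cited items appears together in one reference list.

    `references` maps a citing item to the set of items it cites (hypothetical input shape).
    """
    counts = Counter()
    for cited in references.values():
        for pair in combinations(sorted(cited), 2):
            counts[pair] += 1
    return counts


def bibliographic_coupling(references):
    """Coupling strength of two citing items = number of references they share."""
    strengths = {}
    for a, b in combinations(sorted(references), 2):
        shared = len(references[a] & references[b])
        if shared:
            strengths[(a, b)] = shared
    return strengths
```

    Co-citation links the items being cited, so it can keep growing as new citing works appear. Bibliographic coupling links the citing items themselves and is fixed once their reference lists are published.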