Dynamic Studies of the Scientific Strengths of Nations Using a Highly Detailed Model of Science
Atlanta Conference on Science and Innovation Policy 2009. This presentation was part of the session: Methods, Measures, and Dat
The Closer the Better: Similarity of Publication Pairs at Different Co-Citation Levels
We investigate the similarities of pairs of articles which are co-cited at
the different co-citation levels of the journal, article, section, paragraph,
sentence and bracket. Our results indicate that textual similarity,
intellectual overlap (shared references), author overlap (shared authors),
and proximity in publication time all rise monotonically as the co-citation level
gets lower (from journal to bracket). While the main gain in similarity happens
when moving from journal to article co-citation, all level changes entail an
increase in similarity, especially section to paragraph and paragraph to
sentence/bracket levels. We compare results from four journals over the years
2010-2015: Cell, the European Journal of Operational Research, Physics Letters
B and Research Policy, with consistent general outcomes and some interesting
differences. Our findings motivate the use of granular co-citation information
as defined by meaningful units of text, with implications for, among others,
the elaboration of maps of science and the retrieval of scholarly literature
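
The nesting of co-citation levels described above can be illustrated with a minimal sketch. It assumes each in-text citation in one citing article is recorded with hierarchical position indices (section, paragraph within the section, sentence within the paragraph, bracket within the sentence). The function name, the tuple layout, and the toy data are hypothetical and are not taken from the paper; the journal level is omitted because it needs data beyond a single citing article.

```python
from itertools import combinations

# Levels from coarsest to finest within one citing article.
LEVELS = ["article", "section", "paragraph", "sentence", "bracket"]

def finest_cocitation_level(citations):
    """Return the finest shared level for every pair of co-cited references.

    citations: list of (ref_id, section, paragraph, sentence, bracket)
    tuples for a single citing article (hypothetical layout).
    """
    depth_by_pair = {}
    for first, second in combinations(citations, 2):
        ref_a, pos_a = first[0], first[1:]
        ref_b, pos_b = second[0], second[1:]
        if ref_a == ref_b:
            continue  # the same reference cited twice is not a co-citation
        # Count how many leading position components the two citations share.
        depth = 0
        for x, y in zip(pos_a, pos_b):
            if x != y:
                break
            depth += 1
        pair = frozenset((ref_a, ref_b))
        depth_by_pair[pair] = max(depth_by_pair.get(pair, 0), depth)
    return {pair: LEVELS[depth] for pair, depth in depth_by_pair.items()}

# Toy data: A and B share a bracket, C sits in a later sentence of the
# same paragraph, D is in a different section.
toy = [("A", 1, 1, 1, 1), ("B", 1, 1, 1, 1), ("C", 1, 1, 2, 1), ("D", 2, 1, 1, 1)]
levels = finest_cocitation_level(toy)
```

Pairs can then be grouped by level and compared on any similarity measure of interest, such as the share of references they have in common.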
Mapping the Structure and Evolution of Chemistry Research
How does our collective scholarly knowledge grow over time? What major areas of science exist and how are they interlinked? Which areas are major knowledge producers; which ones are consumers? Computational scientometrics – the application of bibliometric/scientometric methods to large-scale scholarly datasets – and the communication of results via maps of science might help us answer these questions. This paper presents the results of a prototype study that aims to map the structure and evolution of chemistry research over a 30-year time frame. Information from the combined Science (SCIE) and Social Science (SSCI) Citation Indexes from 2002 was used to generate a disciplinary map of 7,227 journals and 671 journal clusters. Clusters relevant to studying the structure and evolution of chemistry were identified using JCR categories and were further clustered into 14 disciplines. The changing scientific composition of these 14 disciplines and their knowledge exchange via citation linkages were computed. Major changes in the dominance, influence, and role of Chemistry, Biology, Biochemistry, and Bioengineering over these 30 years are discussed. The paper concludes with suggestions for future work
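
One simple way to read "knowledge producers" versus "consumers" from citation linkages is a net citation balance per discipline. The sketch below is an assumption for illustration only: the function name, the input layout, and the counts are hypothetical, and the study's actual computation may differ.

```python
from collections import Counter

def citation_balance(links):
    """Net citations received minus citations given, per discipline.

    links: iterable of (citing_discipline, cited_discipline, count).
    A positive balance suggests a net knowledge producer, a negative
    balance a net consumer. Self-citations within a discipline are ignored.
    """
    given = Counter()
    received = Counter()
    for citing, cited, count in links:
        if citing == cited:
            continue
        given[citing] += count
        received[cited] += count
    disciplines = set(given) | set(received)
    return {d: received[d] - given[d] for d in disciplines}

# Hypothetical counts, not data from the study.
toy_links = [
    ("Biology", "Chemistry", 30),
    ("Chemistry", "Biology", 10),
    ("Chemistry", "Chemistry", 100),
]
balance = citation_balance(toy_links)
```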
Design and update of a classification system : the UCSD map of science
Global maps of science can be used as a reference system to chart career trajectories, the location of emerging research
frontiers, or the expertise profiles of institutes or nations. This paper details data preparation, analysis, and layout performed
when designing and subsequently updating the UCSD map of science and classification system. The original classification
and map use 7.2 million papers and their references from Elsevier’s Scopus (about 15,000 source titles, 2001–2005) and
Thomson Reuters’ Web of Science (WoS) Science, Social Science, Arts & Humanities Citation Indexes (about 9,000 source
titles, 2001–2004)–about 16,000 unique source titles. The updated map and classification adds six years (2005–2010) of WoS
data and three years (2006–2008) from Scopus to the existing category structure–increasing the number of source titles to
about 25,000. To our knowledge, this is the first time that a widely used map of science was updated. A comparison of the
original 5-year and the new 10-year maps and classification system shows (i) an increase in the total number of journals that
can be mapped by 9,409 journals (social sciences had an 80% increase, humanities a 119% increase, medical sciences a 32% increase, and
natural sciences a 74% increase), (ii) a simplification of the map by assigning all but five highly interdisciplinary journals to exactly one
discipline, (iii) a more even distribution of journals over the 554 subdisciplines and 13 disciplines when calculating the
coefficient of variation, and (iv) a better reflection of journal clusters when compared with paper-level citation data. When
evaluating the map with a listing of desirable features for maps of science, the updated map is shown to have higher
mapping accuracy, easier understandability as fewer journals are multiply classified, and higher usability for the generation
of data overlays, among others
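
The coefficient of variation mentioned in point (iii) is the standard deviation divided by the mean; a lower value indicates a more even spread of journals over categories. The sketch below assumes the population standard deviation, since the abstract does not say which variant was used, and the journal counts are made up.

```python
from statistics import mean, pstdev

def coefficient_of_variation(counts):
    """Population standard deviation divided by the mean of the counts."""
    return pstdev(counts) / mean(counts)

# Hypothetical journals-per-subdiscipline counts, not data from the paper.
uneven = [2, 4, 4, 4, 5, 5, 7, 9]
even = [5, 5, 5, 5, 5, 5, 5, 5]
cv_uneven = coefficient_of_variation(uneven)
cv_even = coefficient_of_variation(even)
```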
Approaches to Understanding and Measuring Interdisciplinary Scientific Research (IDR): A Review of the Literature
Interdisciplinary scientific research (IDR) challenges the study of science on a number of fronts, including that of creating output science and engineering (S&E) indicators. This literature review began with a narrow focus on quantitative measures of the output of IDR, but expanded the scope as it became clear that differing definitions, assessment tools, evaluation processes, and measures all shed light on aspects of IDR. Key among the broader aspects are (a) characterizing the concept of knowledge integration, and (b) recognizing that it can occur within a single mind or as the result of team dynamics. Output measures alone cannot adequately capture this process. Among the quantitative measures considered, bibliometrics (co-authorships, collaborations, references, citations and co-citations) are the most developed, but leave considerable gaps in understanding. Emerging measures in diversity, entropy, and network dynamics are promising, but require sophisticated interpretations and thus would not serve well as S&E indicators. Combinations of quantitative and qualitative assessments coming from evaluation studies appear to reveal S&E processes but carry burdens of expense, intrusion, and lack of reproducibility. This review is a first step toward providing a more holistic view of measuring IDR; several avenues for future research highlight the need for metrics to reflect the actual practice of IDR
International trends in solid-state lighting : analyses of the article and patent literature.
We present an analysis of the literature of solid-state lighting, based on a comprehensive dataset of 35,851 English-language articles and 12,420 U.S. patents published or issued during the years 1977-2004 in the foundational knowledge domain of electroluminescent materials and phenomena. The dataset was created using a complex, iteratively developed search string. The records in the dataset were then partitioned according to: whether they are articles or patents, their publication or issue date, their national or continental origin, whether the active electroluminescent material was inorganic or organic, and which of a number of emergent knowledge sub-domains they aggregate into on the basis of bibliographic coupling. From these partitionings, we performed a number of analyses, including: identification of knowledge sub-domains of historical and recent importance, and trends over time of the contributions of various nations and continents to the knowledge domain and its sub-domains. Among the key results: (1) The knowledge domain as a whole has been growing quickly: the average growth rates of the inorganic and organic knowledge sub-domains have been 8%/yr and 25%/yr, respectively, compared to average growth rates less than 5%/yr for English-language articles and U.S. patents in other knowledge domains. The growth rate of the organic knowledge sub-domain is so high that its historical dominance by the inorganic knowledge sub-domain will, at current trajectories, be reversed in the coming decade. (2) Amongst nations, the U.S. is the largest contributor to the overall knowledge domain, but Japan is on a trajectory to become the largest contributor within the coming half-decade. Amongst continents, Asia became the largest contributor during the past half-decade, overwhelmingly so for the organic knowledge sub-domain. 
(3) The relative contributions to the article and patent datasets differ for the major continents: North America contributing relatively more patents, Europe contributing relatively more articles, and Asia contributing in a more balanced fashion. (4) For the article dataset, the nations that contribute most in quantity also contribute most in breadth, while the nations that contribute less in quantity concentrate their contributions in particular knowledge sub-domains. For the patent dataset, North America and Europe tend to contribute improvements in end-use applications (e.g., in sensing, phototherapy and communications), while Asia tends to contribute improvements at the materials and chip levels. (5) The knowledge sub-domains that emerge from aggregations based on bibliographic coupling are roughly organized, for articles, by the degree of localization of electrons and holes in the material or phenomenon of interest, and for patents, according to both their emphasis on chips, systems or applications, and their emphasis on organic or inorganic materials. (6) The six 'hottest' topics in the article dataset are: spintronics, AlGaN UV LEDs, nanowires, nanophosphors, polyfluorenes and electrophosphorescence. The nine 'hottest' topics in the patent dataset are: OLED encapsulation, active-matrix displays, multicolor OLEDs, thermal transfer for OLED fabrication, ink-jet printed OLEDs, phosphor-converted LEDs, ornamental LED packages, photocuring and phototherapy, and LED retrofitting lamps. A significant caution in interpreting these results is that they are based on English-language articles and U.S. patents, and hence will tend to over-represent the strength of English-speaking nations (particularly the U.S.), and under-represent the strength of non-English-speaking nations (particularly China)
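
Bibliographic coupling, used above to aggregate records into knowledge sub-domains, links two documents by the number of references they share. The following is a minimal sketch of that count using an inverted index from references to citing documents. The function name and the toy records are hypothetical; the report's actual aggregation procedure is not reproduced here.

```python
from collections import defaultdict
from itertools import combinations

def coupling_strengths(references):
    """Count shared references for every pair of documents.

    references: dict mapping a document id to an iterable of cited ids.
    Returns {(doc_a, doc_b): shared_count} with doc_a < doc_b; pairs
    with no shared reference are omitted.
    """
    cited_by = defaultdict(set)
    for doc, refs in references.items():
        for ref in set(refs):  # ignore duplicate entries in a reference list
            cited_by[ref].add(doc)
    strengths = defaultdict(int)
    for docs in cited_by.values():
        for a, b in combinations(sorted(docs), 2):
            strengths[(a, b)] += 1
    return dict(strengths)

# Toy records, not data from the report.
toy_refs = {"p1": ["r1", "r2", "r3"], "p2": ["r2", "r3"], "p3": ["r4"]}
strengths = coupling_strengths(toy_refs)
```

The resulting pair weights can serve as the similarity input to a clustering step of the reader's choice.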
Clustering More than Two Million Biomedical Publications: Comparing the Accuracies of Nine Text-Based Similarity Approaches
We investigate the accuracy of different similarity approaches for clustering over two million biomedical documents. Clustering large sets of text documents is important for a variety of information needs and applications such as collection management and navigation, summary and analysis. The few comparisons of clustering results from different similarity approaches have focused on small literature sets and have given conflicting results. Our study was designed to seek a robust answer to the question of which similarity approach would generate the most coherent clusters of a biomedical literature set of over two million documents. We used a corpus of 2.15 million recent (2004-2008) records from MEDLINE, and generated nine different document-document similarity matrices from information extracted from their bibliographic records, including titles, abstracts and subject headings. The nine approaches comprised five different analytical techniques with two data sources. The five analytical techniques are cosine similarity using term frequency-inverse document frequency vectors (tf-idf cosine), latent semantic analysis (LSA), topic modeling, and two Poisson-based language models--BM25 and PMRA (PubMed Related Articles). The two data sources were a) MeSH subject headings, and b) words from titles and abstracts. Each similarity matrix was filtered to keep the top-n highest similarities per document and then clustered using a combination of graph layout and average-link clustering. Cluster results from the nine similarity approaches were compared using (1) within-cluster textual coherence based on the Jensen-Shannon divergence, and (2) two concentration measures based on grant-to-article linkages indexed in MEDLINE. PubMed's own related article approach (PMRA) generated the most coherent and most concentrated cluster solution of the nine text-based similarity approaches tested, followed closely by the BM25 approach using titles and abstracts. 
Approaches using only MeSH subject headings were not competitive with those based on titles and abstracts
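
The Jensen-Shannon divergence underlying the coherence measure compares two probability distributions through their average. The sketch below computes it for two bags of words with base-2 logarithms, so the value ranges from 0 (identical distributions) to 1 (no shared words). How the study turns this divergence into a within-cluster coherence score is not shown; the function name and token lists are hypothetical.

```python
from collections import Counter
from math import log2

def jensen_shannon_divergence(tokens_p, tokens_q):
    """Jensen-Shannon divergence (base 2) between two token lists."""
    counts_p, counts_q = Counter(tokens_p), Counter(tokens_q)
    total_p, total_q = sum(counts_p.values()), sum(counts_q.values())
    divergence = 0.0
    for word in set(counts_p) | set(counts_q):
        p = counts_p[word] / total_p
        q = counts_q[word] / total_q
        m = 0.5 * (p + q)  # mixture distribution
        # Terms with zero probability contribute nothing to the KL sums.
        if p > 0:
            divergence += 0.5 * p * log2(p / m)
        if q > 0:
            divergence += 0.5 * q * log2(q / m)
    return divergence

same = jensen_shannon_divergence(["gene", "protein"], ["protein", "gene"])
disjoint = jensen_shannon_divergence(["gene"], ["lattice"])
```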