Bringing the Fuzzy Front End into Focus
Technology planning is relatively straightforward for well-established research and development (R&D) areas--those areas in which an organization has a history, the competitors are well understood, and the organization clearly knows where it is going with that technology. What we are calling the fuzzy front end in this paper is that condition in which these factors are not well understood--such as for new corporate thrusts or emerging areas where the applications are embryonic. While strategic business planning exercises are generally good at identifying technology areas that are key to future success, they often lack substance in answering questions like: (1) Where are we now with respect to these key technologies? ... with respect to our competitors? (2) Where do we want or need to be? ... by when? (3) What is the best way to get there? In response to its own needs in answering such questions, Sandia National Laboratories is developing and implementing several planning tools. These tools include knowledge mapping (or visualization), PROSPERITY GAMES and technology roadmapping--all three of which are the subject of this paper. Knowledge mapping utilizes computer-based tools to help answer Question 1 by graphically representing the knowledge landscape that we populate as compared with other corporate and government entities. The knowledge landscape explored in this way can be based on any one of a number of information sets such as citation or patent databases. PROSPERITY GAMES are high-level interactive simulations, similar to seminar war games, which help address Question 2 by allowing us to explore consequences of various optional goals and strategies with all of the relevant stakeholders in a risk-free environment. Technology roadmapping is a strategic planning process that helps answer Question 3 by collaboratively identifying product and process performance targets and obstacles, and the technology alternatives available to reach those targets.
Clustering More than Two Million Biomedical Publications: Comparing the Accuracies of Nine Text-Based Similarity Approaches
We investigate the accuracy of different similarity approaches for clustering over two million biomedical documents. Clustering large sets of text documents is important for a variety of information needs and applications such as collection management and navigation, summary and analysis. The few comparisons of clustering results from different similarity approaches have focused on small literature sets and have given conflicting results. Our study was designed to seek a robust answer to the question of which similarity approach would generate the most coherent clusters of a biomedical literature set of over two million documents. We used a corpus of 2.15 million recent (2004-2008) records from MEDLINE, and generated nine different document-document similarity matrices from information extracted from their bibliographic records, including titles, abstracts and subject headings. The nine approaches comprised five different analytical techniques with two data sources. The five analytical techniques are cosine similarity using term frequency-inverse document frequency vectors (tf-idf cosine), latent semantic analysis (LSA), topic modeling, and two Poisson-based language models--BM25 and PMRA (PubMed Related Articles). The two data sources were a) MeSH subject headings, and b) words from titles and abstracts. Each similarity matrix was filtered to keep the top-n highest similarities per document and then clustered using a combination of graph layout and average-link clustering. Cluster results from the nine similarity approaches were compared using (1) within-cluster textual coherence based on the Jensen-Shannon divergence, and (2) two concentration measures based on grant-to-article linkages indexed in MEDLINE. PubMed's own related article approach (PMRA) generated the most coherent and most concentrated cluster solution of the nine text-based similarity approaches tested, followed closely by the BM25 approach using titles and abstracts. Approaches using only MeSH subject headings were not competitive with those based on titles and abstracts.
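As a rough illustration of the first technique named in the abstract above (tf-idf cosine) and of the top-n filtering step, here is a minimal sketch in plain Python. It is not the authors' pipeline: the function names, the `log(N/df)` weighting variant, the tie-breaking rule, and the three toy documents are all assumptions made for this example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse tf-idf vector (dict: term -> weight) per tokenized document.

    Assumption: idf = log(N / df), one common variant among several.
    """
    n_docs = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: c * math.log(n_docs / df[t]) for t, c in tf.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either vector is all zeros."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def top_n_similar(docs, n_keep):
    """For each document, keep only its n_keep most similar other documents."""
    vecs = tfidf_vectors(docs)
    result = {}
    for i, u in enumerate(vecs):
        scores = [(cosine(u, v), j) for j, v in enumerate(vecs) if j != i]
        # Highest similarity first; ties broken by lower document index (an assumption).
        scores.sort(key=lambda s: (-s[0], s[1]))
        result[i] = [(j, s) for s, j in scores[:n_keep]]
    return result


# Hypothetical toy corpus, already tokenized.
docs = [
    "protein kinase signaling pathway".split(),
    "kinase signaling in tumor cells".split(),
    "graph layout clustering algorithm".split(),
]
neighbors = top_n_similar(docs, n_keep=1)
```

The all-pairs loop is quadratic in the number of documents, so it only suits toy inputs; a corpus of the size described in the abstract would need a sparse, indexed approach.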
Topic identification challenge
Merit, Expertise and Measurement
The Citation Field of Evolutionary Economics
Evolutionary economics has developed into an academic field of its own,
institutionalized around, amongst others, the Journal of Evolutionary Economics
(JEE). This paper analyzes the way and extent to which evolutionary economics
has become an interdisciplinary journal, as its aim was: a journal that is
indispensable in the exchange of expert knowledge on topics and using
approaches that relate naturally with it. Analyzing citation data for the
relevant academic field for the Journal of Evolutionary Economics, we use
insights from scientometrics and social network analysis to find that, indeed,
the JEE is a central player in this interdisciplinary field aiming mostly at
understanding technological and regional dynamics. It does not, however, link
firmly with the natural sciences (including biology) nor to management
sciences, entrepreneurship, and organization studies. Another journal that
could be perceived to have evolutionary acumen, the Journal of Economic Issues,
does relate to heterodox economics journals and is relatively more involved in
discussing issues of firm and industry organization. The JEE seems most keen to
develop theoretical insights.
Science Models as Value-Added Services for Scholarly Information Systems
The paper introduces scholarly Information Retrieval (IR) as a further
dimension that should be considered in the science modeling debate. The IR use
case is seen as a validation model of the adequacy of science models in
representing and predicting structure and dynamics in science. Particular
conceptualizations of scholarly activity and structures in science are used as
value-added search services to improve retrieval quality: a co-word model
depicting the cognitive structure of a field (used for query expansion), the
Bradford law of information concentration, and a model of co-authorship
networks (both used for re-ranking search results). An evaluation of
retrieval quality with science-model-driven services showed that the
proposed models do improve retrieval quality.
From an IR perspective, the models studied are therefore verified as expressive
conceptualizations of central phenomena in science. Thus, it could be shown
that the IR perspective can significantly contribute to a better understanding
of scholarly structures and activities. Comment: 26 pages, to appear in Scientometrics
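One way to picture re-ranking by the Bradford law of information concentration, as mentioned in the abstract above, is to move documents from the journals that contribute the most hits to a result set to the top of the list. The sketch below is an assumption about how such a service could work, not the paper's implementation; `bradford_rerank` and the sample records are hypothetical.

```python
from collections import Counter


def bradford_rerank(results):
    """Re-rank (doc_id, journal) pairs so that documents from the journals
    with the most hits in this result set come first.

    sorted() is stable, so documents from equally productive journals keep
    their original relative order.
    """
    hits_per_journal = Counter(journal for _, journal in results)
    return sorted(results, key=lambda r: -hits_per_journal[r[1]])


# Hypothetical result list in its original rank order.
results = [("d1", "A"), ("d2", "B"), ("d3", "B"), ("d4", "C"), ("d5", "B")]
reranked = bradford_rerank(results)
```

Here journal "B" contributes three of the five hits, so its documents move ahead of those from "A" and "C".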
An Analysis of the Abstracts Presented at the Annual Meetings of the Society for Neuroscience from 2001 to 2006
Annual meeting abstracts published by scientific societies often contain rich arrays of information that can be computationally mined and distilled to elucidate the state and dynamics of the subject field. We extracted and processed abstract data from the Society for Neuroscience (SFN) annual meeting abstracts during the period 2001–2006 in order to gain an objective view of contemporary neuroscience. An important first step in the process was the application of data cleaning and disambiguation methods to construct a unified database, since the data were too noisy to be of full utility in the raw form initially available. Using natural language processing, text mining, and other data analysis techniques, we then examined the demographics and structure of the scientific collaboration network, the dynamics of the field over time, major research trends, and the structure of the sources of research funding. Some interesting findings include a high geographical concentration of neuroscience research in the northeastern United States, a surprisingly large transient population (66% of the authors appear in only one out of the six studied years), the central role played by the study of neurodegenerative disorders in the neuroscience community, and an apparent growth of behavioral/systems neuroscience with a corresponding shrinkage of cellular/molecular neuroscience over the six-year period. The results from this work will prove useful for scientists, policy makers, and funding agencies seeking to gain a complete and unbiased picture of the community structure and body of knowledge encapsulated by a specific scientific domain.
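The "transient population" figure in the abstract above is the share of authors who appear in only one of the studied years. A minimal sketch of that calculation follows, assuming author names have already been disambiguated; the function name and the sample records are hypothetical and are not the SFN data.

```python
from collections import defaultdict


def transient_share(records):
    """Fraction of distinct authors who appear in exactly one year.

    records: iterable of (author, year) pairs, one per authored abstract.
    """
    years_by_author = defaultdict(set)
    for author, year in records:
        years_by_author[author].add(year)
    if not years_by_author:
        return 0.0
    single_year = sum(1 for years in years_by_author.values() if len(years) == 1)
    return single_year / len(years_by_author)


# Hypothetical records: author "a" spans two years, "b" and "c" one year each.
records = [("a", 2001), ("a", 2002), ("b", 2003), ("c", 2004), ("c", 2004)]
share = transient_share(records)
```

Two abstracts by the same author in the same year count as a single year, which is why "c" is still treated as transient.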
Communities and patterns of scientific collaboration in Business and Management
This is the author's accepted version of this article deposited at arXiv (arXiv:1006.1788v2 [physics.soc-ph]) and subsequently published in Scientometrics October 2011, Volume 89, Issue 1, pp 381-396. The final publication is available at link.springer.com http://link.springer.com/article/10.1007%2Fs11192-011-0439-1 Author's note: 17 pages. To appear in special edition of Scientometrics. The abstract in the arXiv metadata is a shorter version of the abstract in the actual paper (both in the journal and in the arXiv full paper).
Community structure and patterns of scientific collaboration in Business and Management
This is the author's accepted version of this article deposited at arXiv (arXiv:1006.1788v2 [physics.soc-ph]) and subsequently published in Scientometrics October 2011, Volume 89, Issue 1, pp 381-396. The final publication is available at link.springer.com http://link.springer.com/article/10.1007%2Fs11192-011-0439-1 Author's note: 17 pages. To appear in special edition of Scientometrics. The abstract in the arXiv metadata is a shorter version of the abstract in the actual paper (both in the journal and in the arXiv full paper).