Using distributional similarity to organise biomedical terminology

Author: Dowdall James
Keller Bill
Schneider Gerold
Weeds Julie
Weir David
Publication venue: 'John Benjamins Publishing Company'
Publication date: 01/01/2005

We investigate an application of distributional similarity techniques to the problem of structural organisation of biomedical terminology. Our application domain is the relatively small GENIA corpus. Using terms that have been accurately marked-up by hand within the corpus, we consider the problem of automatically determining semantic proximity. Terminological units are defined for our purposes as normalised classes of individual terms. Syntactic analysis of the corpus data is carried out using the Pro3Gres parser and provides the data required to calculate distributional similarity using a variety of different measures. Evaluation is performed against a hand-crafted gold standard for this domain in the form of the GENIA ontology. We show that distributional similarity can be used to predict semantic type with a good degree of accuracy.
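As an illustration of the general idea in this abstract, the sketch below compares terms by the syntactic contexts they occur in. It is a minimal example under stated assumptions: the abstract does not say which similarity measures were used, so cosine similarity is only one plausible choice, and the (term, context) pairs, relation labels, and function names are hypothetical rather than taken from the GENIA corpus or the Pro3Gres parser.

```python
from collections import Counter
from math import sqrt


def context_vector(term, observations):
    """Count the (relation, word) contexts observed with a term."""
    return Counter(ctx for t, ctx in observations if t == term)


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


# Hypothetical (term, context) pairs; a real run would derive them from parser output.
observations = [
    ("IL-2", ("subj_of", "activate")),
    ("IL-2", ("obj_of", "express")),
    ("IL-4", ("subj_of", "activate")),
    ("IL-4", ("obj_of", "express")),
    ("T cell", ("obj_of", "activate")),
]

# Terms sharing all contexts score near 1; terms sharing none score 0.
sim_close = cosine(context_vector("IL-2", observations), context_vector("IL-4", observations))
sim_far = cosine(context_vector("IL-2", observations), context_vector("T cell", observations))
```

In this toy data the two terms with identical contexts are maximally similar, while the term that appears only in a different grammatical role has no overlap with them.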

ZORA

Sussex Research Online

Natural language processing and cognitive science : proceedings 2018

Author: Lubaszewski Wiesław
Sedes Florence
Sharp Bernadette
Publication venue: Jagiellonian Library
Publication date: 01/01/2018

Jagiellonian University Repository

How journal rankings can suppress interdisciplinary research. A comparison between Innovation Studies and Business & Management

Author: Alice O’Hare
Andy Stirling
Ismael Rafols
Loet Leydesdorff
Paul Nightingale
Publication venue: 'Elsevier BV'
Publication date: 01/01/2011

This study provides quantitative evidence on how the use of journal rankings can disadvantage interdisciplinary research in research evaluations. Using publication and citation data, it compares the degree of interdisciplinarity and the research performance of a number of Innovation Studies units with that of leading Business & Management schools in the UK. On the basis of various mappings and metrics, this study shows that: (i) Innovation Studies units are consistently more interdisciplinary in their research than Business & Management schools; (ii) the top journals in the Association of Business Schools' rankings span a less diverse set of disciplines than lower-ranked journals; (iii) this results in a more favourable assessment of the performance of Business & Management schools, which are more disciplinary-focused. This citation-based analysis challenges the journal ranking-based assessment. In short, the investigation illustrates how ostensibly 'excellence-based' journal rankings exhibit a systematic bias in favour of mono-disciplinary research. The paper concludes with a discussion of implications of these phenomena, in particular how the bias is likely to affect negatively the evaluation and associated financial resourcing of interdisciplinary research organisations, and may result in researchers becoming more compliant with disciplinary authority over time. Comment: 41 pages, 10 figures

arXiv.org e-Print Archive

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Archivo Digital para la Docencia y la Investigación

Digital.CSIC

IDS OpenDocs

Sussex Research Online

International Migration, Integration and Social Cohesion online publications

The Most Influential Paper Gerard Salton Never Wrote

Author: Dubin David
Publication venue: Graduate School of Library and Information Science. University of Illinois at Urbana-Champaign.
Publication date: 01/01/2004

Gerard Salton is often credited with developing the vector space model (VSM) for information retrieval (IR). Citations to Salton give the impression that the VSM must have been articulated as an IR model sometime between 1970 and 1975. However, the VSM as it is understood today evolved over a longer time period than is usually acknowledged, and an articulation of the model and its assumptions did not appear in print until several years after those assumptions had been criticized and alternative models proposed. An often cited overview paper titled "A Vector Space Model for Information Retrieval" (alleged to have been published in 1975) does not exist, and citations to it represent a confusion of two 1975 articles, neither of which was an overview of the VSM as a model of information retrieval. Until the late 1970s, Salton did not present vector spaces as models of IR generally but rather as models of specific computations. Citations to the phantom paper reflect an apparently widely held misconception that the operational features and explanatory devices now associated with the VSM must have been introduced at the same time it was first proposed as an IR model.

Illinois Digital Environment for Access to Learning and Scholarship Repository

Improving the quality of the personalized electronic program guide

Author: McDonald Kieran
O'Sullivan Dermot
Smeaton Alan F.
Smyth Barry
Wilson David C.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2003

As Digital TV subscribers are offered more and more channels, it is becoming increasingly difficult for them to locate the right programme information at the right time. The personalized Electronic Programme Guide (pEPG) is one solution to this problem; it leverages artificial intelligence and user profiling techniques to learn about the viewing preferences of individual users in order to compile personalized viewing guides that fit their individual preferences. Very often the limited availability of profiling information is a key limiting factor in such personalized recommender systems. For example, it is well known that collaborative filtering approaches suffer significantly from the sparsity problem, which exists because the expected item-overlap between profiles is usually very low. In this article we address the sparsity problem in the Digital TV domain. We propose the use of data mining techniques as a way of supplementing meagre ratings-based profile knowledge with additional item-similarity knowledge that can be automatically discovered by mining user profiles. We argue that this new similarity knowledge can significantly enhance the performance of a recommender system in even the sparsest of profile spaces. Moreover, we provide an extensive evaluation of our approach using two large-scale, state-of-the-art online systems: PTVPlus, a personalized TV listings portal, and Físchlár, an online digital video library system.

Crossref

DCU Online Research Access Service

Human assessments of document similarity

Author: Belkin
Belz
Cavnar
Damashek
Flesch
Fox
Furnas
Gardenfors
Haenggi
Harman
Hjørland
Johnson-Laird
Järvelin
Landauer
Lee
Lin
Lund
Miller
Morris
Resnik
Salton
Saracevic
Skupin
Vorhees
Westerman
Publication venue: 'Wiley'
Publication date: 01/01/2010

Two studies are reported that examined the reliability of human assessments of document similarity and the association between human ratings and the results of n-gram automatic text analysis (ATA). Human interassessor reliability (IAR) was moderate to poor. However, correlations between average human ratings and n-gram solutions were strong. The average correlation between ATA and individual human solutions was greater than IAR. N-gram length influenced the strength of association, but optimum string length depended on the nature of the text (technical vs. nontechnical). We conclude that the methodology applied in previous studies may have led to overoptimistic views on human reliability, but that an optimal n-gram solution can provide a good approximation of the average human assessment of document similarity, a result that has important implications for future development of document visualization systems.
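To make the notion of n-gram text analysis concrete, the sketch below scores two documents by the overlap of their character n-grams. It is an illustrative assumption, not the procedure used in the study: the abstract does not specify how n-grams were extracted or compared, so the whitespace normalisation, the cosine measure, and the function names here are hypothetical choices. The parameter n corresponds to the string length that the abstract reports as influencing the strength of association.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n):
    """Count overlapping character n-grams in lower-cased, whitespace-normalised text."""
    text = " ".join(text.lower().split())
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def ngram_similarity(a, b, n=3):
    """Cosine similarity between the character n-gram profiles of two texts."""
    u, v = char_ngrams(a, n), char_ngrams(b, n)
    dot = sum(u[g] * v[g] for g in u.keys() & v.keys())
    norm = sqrt(sum(c * c for c in u.values()) * sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0
```

Calling `ngram_similarity(doc_a, doc_b, n=4)` with different values of n gives a simple way to explore how string length changes the score for a given pair of texts.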

Crossref

University of Gloucestershire Research Repository

Brunel University Research Archive

Vertex similarity in networks

Author: E. A. Leicht
M. E. J. Newman
Petter Holme
Publication venue: 'American Physical Society (APS)'
Publication date: 14/10/2005

We consider methods for quantifying the similarity of vertices in networks. We propose a measure of similarity based on the concept that two vertices are similar if their immediate neighbors in the network are themselves similar. This leads to a self-consistent matrix formulation of similarity that can be evaluated iteratively using only a knowledge of the adjacency matrix of the network. We test our similarity measure on computer-generated networks for which the expected results are known, and on a number of real-world networks.
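The self-consistent idea in this abstract can be sketched with a simple fixed-point iteration. The update rule below, S = alpha * A * S + I, is an assumed generic form chosen for illustration; the abstract does not state the exact equation, and the published measure may include normalisation terms that are not shown here. The function name, the damping parameter alpha, and the example graph are hypothetical. The iteration converges only when alpha is smaller than the reciprocal of the largest eigenvalue of the adjacency matrix.

```python
def iterate_similarity(adj, alpha=0.2, steps=100):
    """Iterate S <- alpha * A * S + I, starting from S = I.

    adj is a square adjacency matrix given as a list of lists.
    alpha is an assumed damping factor, not a value from the source.
    """
    n = len(adj)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    sim = [row[:] for row in identity]
    for _ in range(steps):
        sim = [
            [
                alpha * sum(adj[i][k] * sim[k][j] for k in range(n)) + identity[i][j]
                for j in range(n)
            ]
            for i in range(n)
        ]
    return sim


# Path graph 0 - 1 - 2: vertices 0 and 2 are not adjacent but share neighbour 1.
path = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
sim = iterate_similarity(path)
```

In the path example, vertices 0 and 2 receive a positive similarity even though no edge joins them, because each one's neighbour is similar to the other's neighbour (in this case the same vertex).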

arXiv.org e-Print Archive

Crossref

Oxford University Research Archive

CERN Document Server