39,520 research outputs found

Using distributional similarity to organise biomedical terminology

Author: Dowdall James
Keller Bill
Schneider Gerold
Weeds Julie
Weir David
Publication venue: 'John Benjamins Publishing Company'
Publication date: 01/01/2005

We investigate an application of distributional similarity techniques to the problem of structural organisation of biomedical terminology. Our application domain is the relatively small GENIA corpus. Using terms that have been accurately marked-up by hand within the corpus, we consider the problem of automatically determining semantic proximity. Terminological units are defined for our purposes as normalised classes of individual terms. Syntactic analysis of the corpus data is carried out using the Pro3Gres parser and provides the data required to calculate distributional similarity using a variety of different measures. Evaluation is performed against a hand-crafted gold standard for this domain in the form of the GENIA ontology. We show that distributional similarity can be used to predict semantic type with a good degree of accuracy.
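As a rough illustration of what a distributional similarity measure computes, the sketch below compares terms by the cosine of their context-count vectors. Cosine is only one common choice and is not confirmed as one of the measures the authors used; the terms, grammatical relations, and counts are hypothetical and are not taken from the GENIA corpus or from Pro3Gres output.

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse context-count vectors."""
    # Only contexts shared by both terms contribute to the dot product.
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical (grammatical relation, head word) counts per term.
contexts = {
    "IL-2": Counter({("subj_of", "activate"): 3, ("obj_of", "express"): 2}),
    "IL-4": Counter({("subj_of", "activate"): 2, ("obj_of", "express"): 1}),
    "T cell": Counter({("obj_of", "activate"): 4, ("subj_of", "express"): 2}),
}

sim_close = cosine_similarity(contexts["IL-2"], contexts["IL-4"])
sim_far = cosine_similarity(contexts["IL-2"], contexts["T cell"])
print(sim_close, sim_far)
```

Terms that occur in the same syntactic contexts score higher, which is the intuition behind using such scores to predict a shared semantic type.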

ZORA

Sussex Research Online

Adapting a relation extraction pipeline for the BioCreAtIvE II task

Author: Grover Claire
Haddow Barry
Klein Ewan
Matthews Michael
Nielsen Leif Arda
Tobin Richard
Wang Xinglong
Publication date: 01/01/2007

Edinburgh Research Explorer

Fat-tailed fluctuations in the size of organizations: the role of social influence

Author: Holme Petter
Liljeros Fredrik
Mondani Hernan
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2014

Organizational growth processes have consistently been shown to exhibit a fatter-than-Gaussian growth-rate distribution in a variety of settings. Long periods of relatively small changes are interrupted by sudden changes in all size scales. These kinds of extreme events can have important consequences for the development of biological and socio-economic systems. Existing models do not derive this aggregated pattern from agent actions at the micro level. We develop an agent-based simulation model on a social network. We take our departure in a model by Schwarzkopf et al. on a scale-free network. We reproduce the fat-tailed pattern out of internal dynamics alone, and also find that it is robust with respect to network topology. Thus, the social network and the local interactions are a prerequisite for generating the pattern, but not the network topology itself. We further extend the model with a parameter $\delta$ that weights the relative fraction of an individual's neighbours belonging to a given organization, representing a contextual aspect of social influence. In the lower limit of this parameter, the fraction is irrelevant and choice of organization is random. In the upper limit of the parameter, the largest fraction quickly dominates, leading to a winner-takes-all situation. We recover the real pattern as an intermediate case between these two extremes. Comment: 15 pages, 4 figures
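The two limits of the parameter can be illustrated with a small sketch. It assumes, purely for illustration, that an agent picks organization i with probability proportional to f_i raised to the power delta, where f_i is the fraction of its neighbours in organization i. The abstract does not state the functional form, so the power-law weighting and the function name are assumptions.

```python
def choice_probabilities(fractions, delta):
    """Probability of choosing each organization, assuming weights f_i ** delta.

    The power-law form is an illustrative assumption, not the paper's
    confirmed model.
    """
    # In Python, 0.0 ** 0 == 1.0, so delta = 0 gives every organization equal weight.
    weights = [f ** delta for f in fractions]
    total = sum(weights)
    if total == 0:
        # No neighbour belongs to any organization: fall back to a uniform choice.
        return [1.0 / len(fractions)] * len(fractions)
    return [w / total for w in weights]


neighbour_fractions = [0.5, 0.3, 0.2]  # hypothetical neighbourhood composition
random_limit = choice_probabilities(neighbour_fractions, delta=0)
winner_limit = choice_probabilities(neighbour_fractions, delta=50)
print(random_limit, winner_limit)
```

With delta at zero the fractions are ignored and the choice is uniform; with a large delta almost all probability goes to the organization holding the largest fraction.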

arXiv.org e-Print Archive

CiteSeerX

Crossref

Publikationer från Umeå universitet

Directory of Open Access Journals

PubMed Central

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Recommended from our members

Effects of classification context on categorization in natural categories

Author: C. E. Weatherburn
C. W. Kalish
D. L. Medin
D. Osherson
Danièle Dubois
E. E. Smith
E. M. Roth
E. Rosch
G. L. Murphy
G. Rey
J. A. Hampton
James A. Hampton
L. J. Rips
L. W. Barsalou
M. E. McCloskey
N. Braisby
P. Bloom
R. A. Barr
R. Keefe
W.-K. Ahn
Wenchi Yeh
Z. Estes
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/10/2006

The patterns of classification of borderline instances of eight common taxonomic categories were examined under three different instructional conditions to test two predictions: first, that lack of a specified context contributes to vagueness in categorization, and second, that altering the purpose of classification can lead to greater or lesser dependence on similarity in classification. The instructional conditions contrasted purely pragmatic with more technical/quasi-legal contexts as purposes for classification, and these were compared with a no-context control. The measures of category vagueness were between-subjects disagreement and within-subjects consistency, and the measures of similarity-based categorization were category breadth and the correlation of instance categorization probability with mean rated typicality, independently measured in a neutral context. Contrary to predictions, none of the measures of vagueness, reliability, category breadth, or correlation with typicality were generally affected by the instructional setting as a function of pragmatic versus technical purposes. Only one subcondition, in which a situational context was implied in addition to a purposive context, produced a significant change in categorization. Further experiments demonstrated that the effect of context was not increased when participants talked their way through the task, and that a technical context did not elicit more all-or-none categorization than did a pragmatic context. These findings place an important boundary condition on the effects of instructional context on conceptual categorization.

City Research Online

Crossref

Taxonomy for Humans or Computers? Cognitive Pragmatics for Big Data

Author: Franz Nico M.
Sterner Beckett
Publication date: 01/01/2017

Criticism of big data has focused on showing that more is not necessarily better, in the sense that data may lose their value when taken out of context and aggregated together. The next step is to incorporate an awareness of pitfalls for aggregation into the design of data infrastructure and institutions. A common strategy minimizes aggregation errors by increasing the precision of our conventions for identifying and classifying data. As a counterpoint, we argue that there are pragmatic trade-offs between precision and ambiguity that are key to designing effective solutions for generating big data about biodiversity. We focus on the importance of theory-dependence as a source of ambiguity in taxonomic nomenclature and hence a persistent challenge for implementing a single, long-term solution to storing and accessing meaningful sets of biological specimens. We argue that ambiguity does have a positive role to play in scientific progress as a tool for efficiently symbolizing multiple aspects of taxa and mediating between conflicting hypotheses about their nature. Pursuing a deeper understanding of the trade-offs and synthesis of precision and ambiguity as virtues of scientific language and communication systems then offers a productive next step for realizing sound, big biodiversity data services.

PhilPapers

Recommended from our members

Abstraction and context in concept representation

Author: Hampton J. A.
Publication venue: 'The Royal Society'
Publication date: 29/07/2003

This paper develops the notion of abstraction in the context of the psychology of concepts, and discusses its relation to context dependence in knowledge representation. Three general approaches to modelling conceptual knowledge from the domain of cognitive psychology are discussed, which serve to illustrate a theoretical dimension of increasing levels of abstraction.

City Research Online

Crossref

PubMed Central

Ontologies and Information Extraction

Author: Nazarenko Adeline
Nédellec Claire
Publication date: 01/01/2005

This report argues that, even in the simplest cases, IE is an ontology-driven process. It is not a mere text filtering method based on simple pattern matching and keywords, because the extracted pieces of text are interpreted with respect to a predefined partial domain model. This report shows that, depending on the nature and the depth of the interpretation needed to extract the information, more or less knowledge must be involved. The report is mainly illustrated with examples from biology, a domain in which there are critical needs for content-based exploration of the scientific literature and which is becoming a major application domain for IE.

arXiv.org e-Print Archive

HAL Descartes

HAL-Paris 13

Mean-field methods in evolutionary duplication-innovation-loss models for the genome-level repertoire of protein domains

Author: A. Amato
A. Angelini
B. Bassetti
C. Branden
D. Aldous
G. Bianconi
J. Kingman
J. Pitman
M. A. Huynen
M. Cosentino Lagomarsino
M. Kamal
V. A. Kuznetsov
Publication venue: 'American Physical Society (APS)'
Publication date: 22/01/2010

We present a combined mean-field and simulation approach to different models describing the dynamics of classes formed by elements that can appear, disappear or copy themselves. These models, related to a paradigm duplication-innovation model known as the Chinese Restaurant Process, are devised to reproduce the scaling behavior observed in the genome-wide repertoire of protein domains of all known species. In view of these data, we discuss the qualitative and quantitative differences of the alternative model formulations, focusing in particular on the roles of element loss and of the specificity of empirical domain classes. Comment: 10 figures, 2 tables
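For readers unfamiliar with the Chinese Restaurant Process, a minimal simulation sketch follows. It implements only the standard one-parameter process (innovation with probability alpha / (n + alpha), otherwise duplication into a class chosen in proportion to its size). It does not include element loss or the class-specific variants the abstract refers to, and the function and parameter names are illustrative.

```python
import random


def simulate_crp(n_elements, alpha, seed=0):
    """Simulate the standard one-parameter Chinese Restaurant Process.

    Returns the list of class sizes after n_elements additions.
    Requires alpha > 0.
    """
    rng = random.Random(seed)
    class_sizes = []
    for n in range(n_elements):
        # Innovation: open a new class with probability alpha / (n + alpha).
        # For n == 0 this probability is 1, so the first element always
        # founds the first class.
        if rng.random() < alpha / (n + alpha):
            class_sizes.append(1)
        else:
            # Duplication: join an existing class with probability
            # proportional to its current size.
            idx = rng.choices(range(len(class_sizes)), weights=class_sizes)[0]
            class_sizes[idx] += 1
    return class_sizes


sizes = simulate_crp(n_elements=1000, alpha=2.0)
print(len(sizes), sorted(sizes, reverse=True)[:5])
```

The number of classes grows slowly with the number of elements while a few classes become very large, which is the kind of scaling behavior such models are compared against.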

arXiv.org e-Print Archive

Crossref

The methodology of analysing semantic change in historical perspective

Author: Grygiel Marcin
Publication date: 01/01/2005

Hochschulschriftenserver - Universität Frankfurt am Main