Search CORE

8,540 research outputs found

Landmark detection in 2D bioimages for geometric morphometrics: a multi-resolution tree-based approach

Author: A Criminisi
A Kaur
A Rosas
B Ibragimov
BS Reddy
C Lindner
C Wang
CP Klingenberg
F Pedregosa
FL Bookstein
J Aceto
JL Fearon
L Breiman
M Ristin
N Chazot
P Geurts
P Viola
R Donner
R Marée
T Niet van der
V Grau
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018
Field of study

The detection of anatomical landmarks in bioimages is a necessary but tedious step for geometric morphometrics studies in many research domains. We propose variants of a multi-resolution tree-based approach to speed-up the detection of landmarks in bioimages. We extensively evaluate our method variants on three different datasets (cephalometric, zebrafish, and drosophila images). We identify the key method parameters (notably the multi-resolution) and report results with respect to human ground truths and existing methods. Our method achieves recognition performances competitive with current existing approaches while being generic and fast. The algorithms are integrated in the open-source Cytomine software and we provide parameter configuration guidelines so that they can be easily exploited by end-users. Finally, datasets are readily available through a Cytomine server to foster future research

Central Archive at the University of Reading

Crossref

Directory of Open Access Journals

HAL Descartes

Open Repository and Bibliography - Liège

DIAL UCLouvain

Automatic categorization of diverse experimental information in the bioscience literature

Author: Brown Nick
Chen Wen
Davis Paul
Fang Ruihua
Fernandes Jolene
Gelbart William M.
Marygold Steven J.
Matthews Beverley
Millburn Gillian
Schindelman Gary
Sternberg Paul W.
Tuli Mary Ann
Van Auken Kimberly
Wang Xiaodong
Zhang Haiyan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2012
Field of study

Background: Curation of information from bioscience literature into biological knowledge databases is a crucial way of capturing experimental information in a computable form. During the biocuration process, a critical first step is to identify from all published literature the papers that contain results for a specific data type the curator is interested in annotating. This step normally requires curators to manually examine many papers to ascertain which few contain information of interest and thus, is usually time consuming. We developed an automatic method for identifying papers containing these curation data types among a large pool of published scientific papers based on the machine learning method Support Vector Machine (SVM). This classification system is completely automatic and can be readily applied to diverse experimental data types. It has been in use in production for automatic categorization of 10 different experimental datatypes in the biocuration process at WormBase for the past two years and it is in the process of being adopted in the biocuration process at FlyBase and the Saccharomyces Genome Database (SGD). We anticipate that this method can be readily adopted by various databases in the biocuration community and thereby greatly reducing time spent on an otherwise laborious and demanding task. We also developed a simple, readily automated procedure to utilize training papers of similar data types from different bodies of literature such as C. elegans and D. melanogaster to identify papers with any of these data types for a single database. This approach has great significance because for some data types, especially those of low occurrence, a single corpus often does not have enough training papers to achieve satisfactory performance. Results: We successfully tested the method on ten data types from WormBase, fifteen data types from FlyBase and three data types from Mouse Genomics Informatics (MGI). It is being used in the curation work flow at WormBase for automatic association of newly published papers with ten data types including RNAi, antibody, phenotype, gene regulation, mutant allele sequence, gene expression, gene product interaction, overexpression phenotype, gene interaction, and gene structure correction. Conclusions: Our methods are applicable to a variety of data types with training set containing several hundreds to a few thousand documents. It is completely automatic and, thus can be readily incorporated to different workflow at different literature-based databases. We believe that the work presented here can contribute greatly to the tremendous task of automating the important yet labor-intensive biocuration effort

Crossref

Springer - Publisher Connector

Harvard University - DASH

Caltech Authors

Cascaded multi-view canonical correlation (CaMCCo) for early diagnosis of Alzheimer\u27s disease via fusion of clinical, imaging and omic Features

Author: Ances Beau
Carroll Maria
et al
Franklin Erin
Mintun Mark
Morris John
Oliver Angela
Schneider Stacy
Shaw Leslie
Publication venue: Digital Commons@Becker
Publication date: 01/01/2017
Field of study

Digital Commons@Becker

Annotating patient clinical records with syntactic chunks and named entities: the Harvey corpus

Author: A Roberts
A Shah
Aleksandar Savkov
B Efron
G Hripcsak
G Savova
J Cohen
J Foster
J-W Fan
Jackie Cassell
John Carroll
K Verspoor
KH Krippendorff
LK Tanabe
M Bada
MP Marcus
Rob Koeling
S Abney
W Sun
Ö Uzuner
Ö Uzuner
Ö Uzuner
Ö Uzuner
Ö Uzuner
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016
Field of study

The free text notes typed by physicians during patient consultations contain valuable information for the study of disease and treatment. These notes are difficult to process by existing natural language analysis tools since they are highly telegraphic (omitting many words), and contain many spelling mistakes, inconsistencies in punctuation, and non-standard word order. To support information extraction and classification tasks over such text, we describe a de-identified corpus of free text notes, a shallow syntactic and named entity annotation scheme for this kind of text, and an approach to training domain specialists with no linguistic background to annotate the text. Finally, we present a statistical chunking system for such clinical text with a stable learning rate and good accuracy, indicating that the manual annotation is consistent and that the annotation scheme is tractable for machine learning

Crossref

Springer - Publisher Connector

PubMed Central

Sussex Research Online