
48 research outputs found

SORTA: a system for ontology-based re-coding and technical annotation of biomedical phenotype data

Author: Charbon Bart
de Boer Tommy
Haan Mark de
Hendriksen Dennis
Hillege Hans
Jetten Jonathan
Kelpin Fleur
Pang Chao
Sijmons Rolf
Sijtsma Anna
Smidt Nynke
Sollie Annet
Swertz Morris A.
van der Velde Joeri K.
Publication venue: 'Oxford University Press (OUP)'
Publication date: 17/09/2015

There is an urgent need to standardize the semantics of biomedical data values, such as phenotypes, to enable comparative and integrative analyses. However, it is unlikely that all studies will use the same data collection protocols. As a result, retrospective standardization is often required, which involves matching original (unstructured or locally coded) data to widely used coding or ontology systems such as SNOMED CT (clinical terms), ICD-10 (International Classification of Diseases) and HPO (Human Phenotype Ontology). This data curation is usually time-consuming and performed by a human expert. To help mechanize this process, we have developed SORTA, a computer-aided system for rapidly encoding free text or locally coded values to a formal coding system or ontology. SORTA matches original data values, uploaded in semicolon-delimited format, to a target coding system, uploaded as an Excel spreadsheet or in OWL (Web Ontology Language) or OBO (Open Biomedical Ontologies) format. It then semi-automatically shortlists candidate codes for each data value using Lucene and n-gram based matching algorithms, and can also learn from matches chosen by human experts. We evaluated SORTA's applicability in two use cases. For the LifeLines biobank, we used SORTA to recode 90 000 free text values (including 5211 unique values) about physical exercise to MET (Metabolic Equivalent of Task) codes. For the CINEAS clinical symptom coding system, we used SORTA to map to HPO, enriching HPO when necessary (315 terms matched so far). Out of the shortlists at rank 1, we found a precision/recall of 0.97/0.98 in LifeLines and of 0.58/0.45 in CINEAS. More importantly, users found the tool both a major time saver and a quality improvement because SORTA reduced the chances of human mistakes. Thus, SORTA can dramatically ease data (re)coding tasks and we believe it will prove useful for many more projects.
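
The n-gram shortlisting step mentioned in the abstract can be illustrated with a minimal sketch. This is not SORTA's implementation (which combines Lucene with n-gram matching); it assumes a simple character-bigram Dice similarity, and the function names and the example codes `A1` to `A3` are hypothetical.

```python
from collections import Counter


def bigrams(text):
    """Return a multiset of character bigrams for a lower-cased, padded string."""
    padded = f"^{text.lower().strip()}$"  # pad so word boundaries count as grams
    return Counter(padded[i:i + 2] for i in range(len(padded) - 1))


def dice_similarity(a, b):
    """Dice coefficient over character bigrams, in the range 0.0 to 1.0."""
    grams_a, grams_b = bigrams(a), bigrams(b)
    overlap = sum((grams_a & grams_b).values())  # multiset intersection
    total = sum(grams_a.values()) + sum(grams_b.values())
    return 2 * overlap / total if total else 0.0


def shortlist(value, terms, top_k=3):
    """Rank candidate codes for one free-text value, best match first.

    `terms` maps a code to its label; both are illustrative placeholders.
    """
    scored = [(dice_similarity(value, label), code, label)
              for code, label in terms.items()]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    # Hypothetical target codes; a real run would load an ontology or code list.
    codes = {"A1": "running", "A2": "swimming", "A3": "cycling"}
    for score, code, label in shortlist("runing", codes):
        print(f"{code}\t{label}\t{score:.2f}")
```

In this sketch a human expert would still pick the final code from the shortlist, in line with the semi-automatic workflow the abstract describes.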

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

PubMed Central

Dissertations of the University of Groningen

Accuracy of aortic pulse wave velocity assessment with velocity-encoded MRI: validation in patients with Marfan syndrome

Author: Bax Jeroen J
de Roos Albert
Groenink Maarten
Hendriksen Dennis
Kroft Lucia J
Kröner Eleanore S
Radonic Teodora
Reiber Johan H
Scholte Arthur J
van den Boogaard Pieter J
van der Geest Rob J
Westenberg Jos J
Publication venue: BioMed Central
Publication date: 01/01/2011

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Genotype harmonizer: automatic strand alignment and format conversion for genotype data integration

Author: Bonder Marc Jan
Deelen Patrick
Franke Lude
Hendriksen Dennis
Swertz Morris A
van der Velde K Joeri
Westra Harm-Jan
Winder Erwin
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

BACKGROUND: To gain statistical power or to allow fine mapping, researchers typically want to pool data before meta-analyses or genotype imputation. However, the necessary harmonization of genetic datasets is currently error-prone because of many different file formats and lack of clarity about which genomic strand is used as reference. FINDINGS: Genotype Harmonizer (GH) is a command-line tool to harmonize genetic datasets by automatically solving issues concerning genomic strand and file format. GH solves the unknown strand issue by aligning ambiguous A/T and G/C SNPs to a specified reference, using linkage disequilibrium patterns without prior knowledge of the strands used. GH supports many common GWAS/NGS genotype formats including PLINK, binary PLINK, VCF, SHAPEIT2 & Oxford GEN. GH is implemented in Java and a large part of the functionality can also be used as the Java 'Genotype-IO' API. All software is open source under license LGPLv3 and available from http://www.molgenis.org/systemsgenetics. CONCLUSIONS: GH can be used to harmonize genetic datasets across different file formats and can be easily integrated as a step in routine meta-analysis and imputation pipelines.
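
Why A/T and G/C SNPs are singled out as ambiguous can be shown with a short sketch. It is not Genotype Harmonizer's code or its Genotype-IO API: it only classifies a SNP by comparing alleles with a reference, and it leaves the linkage-disequilibrium step for ambiguous SNPs unimplemented. All names and return labels are hypothetical.

```python
# Watson-Crick complements, used to express alleles on the opposite strand.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def is_strand_ambiguous(allele1, allele2):
    """True for A/T and G/C SNPs, whose alleles are each other's complement.

    Flipping the strand of such a SNP yields the same allele pair, so the
    alleles alone cannot reveal which strand was used.
    """
    return COMPLEMENT[allele1] == allele2


def classify_against_reference(study_alleles, ref_alleles):
    """Classify one biallelic SNP as same_strand, flip, ambiguous or mismatch."""
    if is_strand_ambiguous(*study_alleles):
        # Alleles are uninformative here; GH resolves these cases with
        # linkage disequilibrium patterns, which this sketch does not do.
        return "ambiguous"
    study, ref = frozenset(study_alleles), frozenset(ref_alleles)
    if study == ref:
        return "same_strand"
    if frozenset(COMPLEMENT[a] for a in study_alleles) == ref:
        return "flip"  # complementing the study alleles matches the reference
    return "mismatch"  # alleles disagree even after a strand flip


if __name__ == "__main__":
    print(classify_against_reference(("A", "G"), ("T", "C")))
    print(classify_against_reference(("A", "T"), ("A", "T")))
```

Only the "ambiguous" class needs extra information such as linkage disequilibrium with neighbouring SNPs; the other classes can be settled from the alleles themselves.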

Crossref

Proceedings - University of Groningen

University of Groningen

Springer - Publisher Connector

ARTS repository - University of Groningen

PubMed Central

Dissertations of the University of Groningen

CAPICE: a computational method for Consequence-Agnostic Pathogenicity Interpretation of Clinical Exome variations

Author: Abbott Kristin
Charbon Bart
de Ridder Dick
Deelen Patrick
Hendriksen Dennis
Kerstjens-Frederikse Wilhelmina S
Li Shuang
Sikkema-Raddatz Birgit
Sinke Richard J
Soudis Dimitrios
Swertz Morris A
van der Velde K Joeri
van Diemen Cleo C
van Dijk Aalt D J
van Gijn Marielle E
Zwerwer Leslie R
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 24/08/2020

Exome sequencing is now mainstream in clinical practice. However, identification of pathogenic Mendelian variants remains time-consuming, in part because the limited accuracy of current computational prediction methods requires manual classification by experts. Here we introduce CAPICE, a new machine-learning-based method for prioritizing pathogenic variants, including SNVs and short InDels. CAPICE outperforms the best general (CADD, GAVIN) and consequence-type-specific (REVEL, ClinPred) computational prediction methods, for both rare and ultra-rare variants. CAPICE is easily added to diagnostic pipelines as a pre-computed score file or command-line software, or via the online MOLGENIS web service with API. CAPICE is free and open source (LGPLv3) and can be downloaded at https://github.com/molgenis/capice.

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

Proficiency testing of virus diagnostics based on bioinformatics analysis of simulated in silico high-throughput sequencing data sets

Author: Aarestrup F.M. (Frank)
Andrusch A. (Andreas)
Baumgärtner V. (Volkmar)
Beer M. (Martin)
Belka A. (Ariane)
Blanchard Y. (Yannick)
Borges V. (Vítor)
Brinkmann A. (Annika)
Camma C. (Cesare)
Corman V.M. (Victor)
Deboutte W. (Ward)
Drosten C. (Christian)
Ellis R.J. (Richard J.)
Hansmann F. (Florian)
Hendriksen R.S. (Rene S.)
Höper D. (Dirk)
Jones T. (Terry)
Koopmans D.V.M. M.P.G. (Marion)
Kroneman A. (Annelies)
Lorusso A. (Alessio)
Lucas P. (Pierrick)
Mangone I. (Iolanda)
Marcacci M. (Maurilia)
Matthijnssens J. (Jelle)
Melidou A. (Angeliki)
Nitsche A.
Nunes A. (Alexandra)
Osterhaus A. (Albert)
Oude Munnink B.B. (Bas B.)
Papa A. (Anna)
Petersen T.N. (Thomas Nordahl)
Pinto M. (Miguel)
Pohlmann A. (Anne)
Schmitz D. (Dennis)
Vries E. (Erhard) van der
Wylezich C. (Claudia)
Publication venue: 'American Society for Microbiology'
Publication date: 01/01/2019

Quality management and independent assessment of high-throughput sequencing-based virus diagnostics have not yet been established as a mandatory approach for ensuring comparable results. The sensitivity and specificity of viral high-throughput sequence data analysis are highly affected by bioinformatics processing using publicly available and custom tools and databases and thus differ widely between individuals and institutions. Here we present the results of the COMPARE [Collaborative Management Platform for Detection and Analyses of (Re-) emerging and Foodborne Outbreaks in Europe] in silico virus proficiency test. An artificial, simulated in silico data set of Illumina HiSeq sequences was provided to 13 different European institutes for bioinformatics analysis to identify viral pathogens in high-throughput sequence data. Comparison of the participants’ analyses shows that the use of different tools, programs, and databases for bioinformatics analyses can impact the correct identification of viral sequences from a simple data set. The identification of slightly mutated and highly divergent virus genomes has been shown to be most challenging. Furthermore, the interpretation of the results, together with a fictitious case report, by the participants showed that in addition to the bioinformatics analysis, the virological evaluation of the results can be important in clinical settings. External quality assessment and proficiency testing should become an important part of validating high-throughput sequencing-based virus diagnostics and could improve the harmonization, comparability, and reproducibility of results. There is a need for the establishment of international proficiency testing, like that established for conventional laboratory tests such as PCR, for bioinformatics pipelines and the interpretation of such results.

Erasmus University Digital Repository

BiobankUniverse: Automatic matchmaking between datasets for biobank data discovery and integration

Author: Bart Charbon
Chao Pang
David van Enckevort
Dennis Hendriksen
Fleur Kelpin
Fortier
Hans Hillege
Holub
Jonathan Jetten
Jonathan Wren
Kaisa Silander
Maelstrom Research
Mark de Haan
Merino-Martinez
Miles
Morris A Swertz
Niina Eklund
Norlin
Pang
Pennington
Petr Holub
Scholtens
Shima
Swertz
The Apache Software Foundation
Tommy de Boer
Wolffenbuttel
Wu
Publication date: 15/11/2017

Motivation: Biobanks are indispensable for large-scale genetic/epidemiological studies, yet it remains difficult for researchers to determine which biobanks contain data matching their research questions. Results: To overcome this, we developed a new matching algorithm that identifies pairs of related data elements between biobanks and research variables with high precision and recall. It integrates lexical comparison, Unified Medical Language System ontology tagging and semantic query expansion. The result is BiobankUniverse, a fast matchmaking service for biobanks and researchers. Biobankers upload their data elements and researchers their desired study variables; BiobankUniverse automatically shortlists matching attributes between them. Users can quickly explore matching potential and search for biobanks/data elements matching their research. They can also curate matches and define personalized data-universes.

Proceedings - University of Groningen

Crossref

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

Modulation of Androgen Receptor Signaling in Hormonal Therapy-Resistant Prostate Cancer Cell Lines

Author: A Veldhoven
A Waghray
B Han
BS Carver
CA Maher
CD Chen
CJ Best
D Compagno
D Singh
DG Bostwick
ED Crawford
G Attard
G Dennis Jr
G Jenster
GK Smyth
GN Brooke
GS Prins
Guido Jenster
H Cheng
HL Devlin
J Ferlay
J Kokontis
J Lapointe
J Veldscholte
JA Ruizeveld de Winter
JC King
JM Kokontis
JR Sterbis
K Ida
K Shimizu
K Tamura
KK Waltering
L van der Heul-Nieuwenhuijsen
Laszlo Tora
LR Baugh
M Frydenberg
MA Eisenberger
MA Rubin
MJ Linja
Natasja F. Dits
ND Tararova
NG Nickols
P Haag
P Koivisto
P Mendiratta
PJ Hendriksen
PS Nelson
Q Wang
R Chmelar
R Snoek
RB Marques
RS Brown
Rute B. Marques
S Simpson
S Varambally
SA Tomlins
SE DePrimo
Sigrun Erkens-Schulze
SM Dehm
SM Dhanasekaran
T Furutani
TH van der Kwast
TP York
UR Chandran
W Huang da
Wilfred F. J. van IJcken
Wytske M. van Weerden
X Liao
YP Yu
Z Culig
Publication venue: Public Library of Science
Publication date: 01/01/2011

Background: Prostate epithelial cells depend on androgens for survival and function. In (early) prostate cancer (PCa) androgens also regulate tumor growth, which is exploited by hormonal therapies in metastatic disease. The aim of the present study was to characterize the androgen receptor (AR) response in hormonal therapy-resistant PC346 cells and identify potential disease markers. Methodology/Principal Findings: Human 19K oligoarrays were used to establish the androgen-regulated expression profile of androgen-responsive PC346C cells and its derivative therapy-resistant sublines: PC346DCC (vestigial AR levels), PC346Flu1 (AR overexpression) and PC346Flu2 (T877A AR mutation). In total, 107 transcripts were differentially-expressed in PC346C and derivatives after R1881 or hydroxyflutamide stimulations. The AR-regulated expression profiles reflected the AR modifications of respective therapy-resistant sublines: AR overexpression resulted in stronger and broader transcriptional response to R1881 stimulation, AR down-regulation correlated with deficient response of AR-target genes and the T877A mutation resulted in transcriptional response to both R1881 and hydroxyflutamide. This AR-target signature was linked to multiple publicly available cell line and tumor derived PCa databases, revealing that distinct functional clusters were differentially modulated during PCa progression. Differentiation and secretory functions were up-regulated in primary PCa but repressed i

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

EUR Research Repository

Erasmus University Digital Repository

Bypass Mechanisms of the Androgen Receptor Pathway in Therapy-Resistant Prostate Cancer Cell Models

Author: A Colomba
A Puisieux
A Veldhoven
AF Fribourg
B Cinar
C Niehrs
C Zenzmaier
CE Petre-Draviam
Chad Creighton
CJ Best
CW Gregory
D Lodygin
EC Lee
ED Crawford
G Dennis Jr
G Jenster
GK Smyth
Guido Jenster
GZ Cheng
HF Yuen
HL Devlin
HW Lo
J Dupont
J Lapointe
J Veldscholte
J Yang
JA Ruizeveld de Winter
JT Dong
K Edamura
K Fizazi
K Saeb-Parsy
K Tamura
L Hu
L van der Heul-Nieuwenhuijsen
L Zeng
LR Baugh
LS Lyons
M Asim
MA Eisenberger
ME Grossmann
ME Taplin
MJ Linja
ML Zhu
N Movilla
Natasja F. Dits
ND Tararova
O Tatarov
P Koivisto
P Mendiratta
PA Berry
PJ Hendriksen
R Maestro
RB Marques
RS Bridges
RS Brown
RS Verma
Rute B. Marques
S Gurumurthy
S Lu
S Varambally
SA Tomlins
Sigrun Erkens-Schulze
SL Moores
SP Balk
SS Taneja
SY Hsieh
T Trenkle
TH van der Kwast
UR Chandran
W Huang da
WK Kwok
Wytske M. van Weerden
Y Chen
Y Kawano
Y Liu
Y Wang
Y Xu
YP Yu
Z Culig
Z Dong
Z Yan-Qi
Publication venue: Public Library of Science
Publication date: 01/01/2010

Background: Prostate cancer is initially dependent on androgens for survival and growth, making hormonal therapy the cornerstone treatment for late-stage tumors. However, despite initial remission, the cancer will inevitably recur. The present study was designed to investigate how androgen-dependent prostate cancer cells eventually survive and resume growth under androgen-deprived and antiandrogen-supplemented conditions. As a model system, we used the androgen-responsive PC346C cell line and its therapy-resistant sublines: PC346DCC, PC346Flu1 and PC346Flu2. Methodology/Principal Findings: Microarray technology was used to analyze differences in gene expression between the androgen-responsive and therapy-resistant PC346 cell lines. Microarray analysis revealed 487 transcripts differentially expressed between the androgen-responsive and the therapy-resistant cell lines. Most of these genes were common to all three therapy-resistant sublines and only a minority (<5%) was androgen-regulated. Pathway analysis revealed enrichment in functions involving cellular movement, cell growth and cell death, as well as association with cancer and reproductive system disease. PC346DCC expressed residual levels of androgen receptor (AR) and showed significant down-regulation of androgen-regulated genes (p-value = 10⁻⁷). Up-regulation of VAV3 and TWIST1 oncogenes and repression of the DKK3 tumor-suppressor was observed in PC346DCC, suggesting a potential AR bypass mechanism. Subsequent validation of these three genes in patient samples confirmed that expression was deregulated during prostate cancer progression.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

EUR Research Repository

Erasmus University Digital Repository