Variant Ranker: a web-tool to rank genomic data according to functional significance

Author: Alexander J.
Drineas P.
Georgitsi M.
Mantzaris D.
Paschou P.
Publication venue: BioMed Central
Publication date: 01/07/2017

BACKGROUND: The increasing volume and complexity of high-throughput genomic data make analysis and prioritization of variants difficult for researchers with limited bioinformatics skills. Variant Ranker allows researchers to rank identified variants and determine the highest-confidence variants for experimental validation. RESULTS: We describe Variant Ranker, a simple, user-friendly web-based tool for ranking, filtering and annotation of coding and non-coding variants. Variant Ranker facilitates the identification of causal variants based on novelty, effect and annotation information. The algorithm implements and aggregates multiple prediction algorithm scores, conservation scores, allelic frequencies, clinical information and additional open-source annotations using accessible databases via ANNOVAR. The available information for a variant is transformed into user-specified weights, which are in turn encoded into the ranking algorithm. Through its different modules, users can (i) rank a list of variants, (ii) perform genotype filtering for case-control samples, (iii) filter large amounts of high-throughput data based on custom filter requirements and apply different models of inheritance, and (iv) perform downstream functional enrichment analysis through network visualization. Using networks, users can identify clusters of genes that belong to multiple ontology categories (such as pathways, gene ontology and disease categories) and therefore expedite scientific discoveries. We demonstrate the utility of Variant Ranker to identify causal genes using real and synthetic datasets. Our results indicate that Variant Ranker exhibits excellent performance by correctly identifying and ranking the candidate genes. CONCLUSIONS: Variant Ranker is a freely available web server at http://paschou-lab.mbg.duth.gr/Software.html. This tool will enable users to prioritise potentially causal variants and is applicable to a wide range of sequencing data.
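The idea of turning a variant's annotations into user-specified weights and ranking by the aggregate can be illustrated with a minimal sketch. The abstract does not give Variant Ranker's actual scoring formula, so the additive score, the annotation names and the weight values below are all assumptions made for illustration only.

```python
# Hypothetical weights: the abstract does not state the tool's real
# annotation categories or default values.
WEIGHTS = {
    "novel": 2.0,
    "damaging_prediction": 3.0,
    "conserved": 1.0,
    "clinical": 2.5,
}


def score(annotations, weights=WEIGHTS):
    """Sum the weights of every annotation flag that is set for a variant."""
    return sum(w for key, w in weights.items() if annotations.get(key))


def rank_variants(variants, weights=WEIGHTS):
    """Return variant IDs sorted by descending score (ties broken by ID)."""
    return sorted(variants, key=lambda v: (-score(variants[v], weights), v))


# Toy input: variant ID -> annotation flags (made-up data).
variants = {
    "var_A": {"novel": True, "damaging_prediction": True},
    "var_B": {"conserved": True},
    "var_C": {"novel": True, "damaging_prediction": True, "clinical": True},
}

ranking = rank_variants(variants)
print(ranking)
```

Changing the values in `WEIGHTS` changes the ordering, which is the sense in which a user-specified weighting drives the ranking in this sketch.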

ZENODO

White Rose Research Online

Letter to the editor: Genetics and Vitamin D supplementation in pregnancy

Author: Anagnostis P. G.
Goulis D. G.
Muscogiuri G.
Paschou S. A.
Vryonidou A.
Publication venue: The Endocrine Society
Publication date: 01/01/2017

Archivio della ricerca - Università degli studi di Napoli Federico II

Hypokalemia: A clinical update

Author: Anagnostis P.
Kardalas E.
Muscogiuri G.
Paschou S. A.
Siasos G.
Vryonidou A.
Publication venue: Bioscientifica
Publication date: 01/01/2018

Hypokalemia is a common electrolyte disturbance, especially in hospitalized patients. It can have various causes, including endocrine ones. Sometimes, hypokalemia requires urgent medical attention. The aim of this review is to present updated information regarding: (1) the definition and prevalence of hypokalemia, (2) the physiology of potassium homeostasis, (3) the various causes leading to hypokalemia, (4) the diagnostic steps for the assessment of hypokalemia and (5) the appropriate treatment of hypokalemia depending on the cause. Practical algorithms for the optimal diagnostic, treatment and follow-up strategy are presented, while an individualized approach is emphasized.

Archivio della ricerca - Università degli studi di Napoli Federico II

Inferring Geographic Coordinates of Origin for Europeans Using Small Panels of Ancestry Informative Markers

Author: A Price
A Torroni
BP McEvoy
C Campbell
C Tian
E Belle
E Parra
H Collins-Schramm
J Bach
J Marchini
J Novembre
Jamey Lewis
John Relethford
JZ Li
L Chikhi
M Bauchet
M Dean
M Seldin
MR Nelson
N Rosenberg
O Lao
P McKeigue
P Paschou
PC Sabeti
Peristera Paschou
Petros Drineas
S Biswas
S Wright
SC Heath
Publication venue: Public Library of Science
Publication date: 18/08/2010

Recent large-scale studies of European populations have demonstrated the existence of population genetic structure within Europe and the potential to accurately infer individual ancestry when information from hundreds of thousands of genetic markers is used. In fact, when genomewide genetic variation of European populations is projected down to a two-dimensional Principal Components Analysis plot, a surprising correlation with actual geographic coordinates of self-reported ancestry has been reported. This substructure can hamper the search for susceptibility genes for common complex disorders, leading to spurious correlations. The identification of genetic markers that can correct for population stratification therefore becomes of paramount importance. Analyzing 1,200 individuals from 11 populations genotyped for more than 500,000 SNPs (Population Reference Sample), we present a systematic exploration of the extent to which geographic coordinates of origin within Europe can be predicted with small panels of SNPs. Markers are selected to correlate with the top principal components of the dataset, as we have previously demonstrated. Performing thorough cross-validation experiments, we show that it is indeed possible to predict individual ancestry within Europe down to a few hundred kilometers from actual individual origin, using information from carefully selected panels of 500 or 1,000 SNPs. Furthermore, we show that these panels can be used to correctly assign the HapMap Phase 3 European populations to their geographic origin. The SNPs that we propose can prove extremely useful in a variety of different settings, such as stratification correction or genetic ancestry testing, and the study of the history of European populations.
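Selecting markers that correlate with the top principal components can be sketched as follows. This is an illustrative approximation, not the authors' published procedure: it assumes each SNP is scored by the sum of its squared loadings on the top components, and it uses synthetic genotypes. The function name, panel size and allele frequencies are hypothetical.

```python
import numpy as np


def pca_informative_snps(genotypes, n_components, panel_size):
    """Rank SNPs by their squared loadings on the top principal components.

    genotypes: (individuals x SNPs) array of 0/1/2 allele counts.
    Returns the column indices of the `panel_size` highest-scoring SNPs.
    """
    centered = genotypes - genotypes.mean(axis=0)
    # Rows of vt are the principal axes in SNP space.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    scores = (vt[:n_components] ** 2).sum(axis=0)
    return np.argsort(scores)[::-1][:panel_size]


# Synthetic data (assumption): two groups of 100 individuals, 10 SNPs whose
# allele frequency differs strongly between the groups, and 50 SNPs that
# carry no group information.
rng = np.random.default_rng(0)
n_per_group, n_informative, n_noise = 100, 10, 50
group_a = np.hstack([
    rng.binomial(2, 0.05, size=(n_per_group, n_informative)),
    rng.binomial(2, 0.5, size=(n_per_group, n_noise)),
])
group_b = np.hstack([
    rng.binomial(2, 0.95, size=(n_per_group, n_informative)),
    rng.binomial(2, 0.5, size=(n_per_group, n_noise)),
])
genotypes = np.vstack([group_a, group_b]).astype(float)

panel = pca_informative_snps(genotypes, n_components=1, panel_size=10)
print(sorted(panel.tolist()))
```

On this toy dataset the selected panel consists of the columns that were simulated with group-specific allele frequencies. Real data would require choosing the number of components and validating the panel, for example by cross-validation as the abstract describes.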

Public Library of Science (PLOS)

Hofstra Northwell Academic Works (Hofstra Northwell School of Medicine)

Genetic Association Signal Near NTN4 in Tourette Syndrome

Author: Budman Cathy
Davis L. K.
Evans P.
Gerber G.
Mathews C. A.
Paschou P.
Scharf J. M.
Tsetsos F.
Yu D. M.
+29 additional authors
Publication venue: Wiley
Publication date: 01/01/2014

Tourette syndrome (TS) is a neurodevelopmental disorder with a complex genetic etiology. Through an international collaboration, we genotyped 42 single nucleotide polymorphisms (p < 10⁻³) from the recent TS genomewide association study (GWAS) in 609 independent cases and 610 ancestry-matched controls. Only rs2060546 on chromosome 12q22 (p = 3.3 × 10⁻⁴) remained significant after Bonferroni correction. Meta-analysis with the original GWAS yielded the strongest association to date (p = 5.8 × 10⁻⁷). Although its functional significance is unclear, rs2060546 lies closest to NTN4, an axon guidance molecule expressed in developing striatum. Risk score analysis significantly predicted case-control status (p = 0.042), suggesting that many of these variants are true TS risk alleles.
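The Bonferroni step can be checked with a short calculation. The sketch assumes a family-wise significance level of 0.05, which the abstract does not state. The number of tests (42 SNPs) and the p-value for rs2060546 are taken from the abstract.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


alpha = 0.05          # assumed family-wise error rate (not given in the abstract)
n_snps = 42           # SNPs genotyped in the replication sample
p_rs2060546 = 3.3e-4  # reported replication p-value

threshold = bonferroni_threshold(alpha, n_snps)
survives = p_rs2060546 < threshold
print(f"threshold = {threshold:.2e}, rs2060546 survives: {survives}")
```

Under that assumed level, the per-SNP threshold is 0.05 / 42, roughly 1.2 × 10⁻³, and the reported p-value falls below it.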

Tracing Cattle Breeds with Principal Components Analysis Ancestry Informative SNPs

Author: A Beja-Pereira
A Price
AV Zimin
B Ilbery
B Weir
C Pfaff
CG Elsik
Christos Dadousis
CJ Edwards
CS Troy
D Goodman
DE MacHugh
Dimitrios Lykidis
E Parra
H Collins-Schramm
Henry Harpending
J Canon
Jamey Lewis
JZ Li
M Dean
M Gautier
MP Heaton
MS Khatkar
N Parrott
N Rosenberg
P McKeigue
P Paschou
Peristera Paschou
Petros Drineas
R Capoferri
R Negrini
R Villa-Angulo
R Willham
RA Gibbs
RL Tellam
RT Loftus
S Biswas
S Wright
SD McKay
SH Eck
Zafiris Abas
Publication venue: Public Library of Science
Publication date: 01/01/2011

The recent release of the Bovine HapMap dataset represents the most detailed survey of bovine genetic diversity to date, providing an important resource for the design and development of livestock production. We studied this dataset, comprising more than 30,000 Single Nucleotide Polymorphisms (SNPs) for 19 breeds (13 taurine, three zebu, and three hybrid breeds), seeking to identify small panels of genetic markers that can be used to trace the breed of unknown cattle samples. Taking advantage of the power of Principal Components Analysis and algorithms that we have recently described for the selection of Ancestry Informative Markers from genomewide datasets, we present a decision tree that can be used to accurately infer the origin of individual cattle. In doing so, we present a thorough examination of population genetic structure in modern bovine breeds. Performing extensive cross-validation experiments, we demonstrate that 250-500 carefully selected SNPs suffice to achieve close to 100% prediction accuracy of individual ancestry when this particular set of 19 breeds is considered. Our methods, coupled with the dense genotypic data that are becoming increasingly available, have the potential to become a valuable tool and have considerable impact on worldwide livestock production. They can be used to inform the design of studies of the genetic basis of economically important traits in cattle, as well as breeding programs and efforts to conserve biodiversity. Furthermore, the SNPs that we have identified can provide a reliable solution for the traceability of breed-specific branded products.

Public Library of Science (PLOS)

Archivio istituzionale della Ricerca - Università degli Studi di Parma

PCA-based population structure inference with generic clustering algorithms

Author: Ali Abdool
C Fraley
C Tracy
Chih Lee
Chun-Hsi Huang
DA Konovalov
G Schwarz
HM Cann
I Johnstone
JA Hartigan
JK Pritchard
L Liang
N Patterson
NA Rosenberg
P Paschou
R Tibshirani
WJ Ewens
X Zhu
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Handling genotype data typed at hundreds of thousands of loci is very time-consuming, and population structure inference is no exception. We therefore propose to apply PCA to the genotype data of a population, select the significant principal components using the Tracy-Widom distribution, and assign the individuals to one or more subpopulations using generic clustering algorithms. Results: We investigated K-means, soft K-means and spectral clustering and compared them with STRUCTURE, a model-based algorithm specifically designed for population structure inference. Moreover, we investigated methods for predicting the number of subpopulations in a population. The results on four simulated datasets and two real datasets indicate that our approach performs comparably to STRUCTURE. For the simulated datasets, STRUCTURE and soft K-means with BIC produced identical predictions of the number of subpopulations. We also showed that, for the real datasets, BIC is a better index than likelihood for predicting the number of subpopulations. Conclusion: Our approach has the advantage of being fast and scalable, while STRUCTURE is very time-consuming because of the nature of MCMC in parameter estimation. Therefore, we suggest choosing the proper algorithm based on the application of population structure inference.
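The pipeline described here (PCA on genotypes, then a generic clustering algorithm on the leading components) can be sketched in a few lines of NumPy. This is a simplified illustration on synthetic data, not the authors' implementation: the number of components is fixed by hand instead of being chosen with the Tracy-Widom test, only hard K-means is shown, and the number of clusters is given rather than estimated with BIC.

```python
import numpy as np


def pca_scores(genotypes, n_components):
    """Project individuals onto the top principal components via SVD."""
    centered = genotypes - genotypes.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per row."""
    rng = np.random.default_rng(seed)
    # Initialise the centres at k distinct data points.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels


# Synthetic genotypes (assumption): two subpopulations of 30 individuals
# with very different allele frequencies at 200 SNPs.
rng = np.random.default_rng(1)
n_per_pop, n_snps = 30, 200
pop_a = rng.binomial(2, np.full(n_snps, 0.1), size=(n_per_pop, n_snps))
pop_b = rng.binomial(2, np.full(n_snps, 0.9), size=(n_per_pop, n_snps))
genotypes = np.vstack([pop_a, pop_b]).astype(float)

labels = kmeans(pca_scores(genotypes, n_components=2), k=2)
print(labels)
```

Because clustering runs on a handful of principal components instead of all loci, the cost of the clustering step no longer depends on the number of SNPs, which is consistent with the speed advantage the abstract claims over MCMC-based inference.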

Springer - Publisher Connector

Public Library of Science (PLOS)

Ancestral Informative Marker Selection and Population Structure Visualization Using Sparse Laplacian Eigenfunctions

Author: A Lee
AL Price
B Shameek
C Tian
CC Chang
CM Carvalho
DL Donoho
EJ Candes
EJ Parra
FRK Chung
G Coop
H Chen
H Tang
H Zou
HE Collins-Schramm
J Novembre
J Pritchard
J Shawe-Taylor
J Zhang
JK Pritchard
Jun Zhang
L Cavalli-Sforza
L Sun
M Bauchet
M Belkin
Manfred Kayser
MS McPeek
N Mantel
N Rosenberg
NA Rosenberg
O Lao
P Menozzi
P Paschou
R Tibshirani
RR Hudson
U von Luxburg
V Vapnik
X Zhu
Publication venue: Public Library of Science
Publication date: 04/11/2010

Identification of a small panel of population structure informative markers can reduce genotyping cost and is useful in various applications, such as ancestry inference in association mapping, forensics and evolutionary theory in population genetics. Traditional methods to ascertain ancestral informative markers usually require prior knowledge of individual ancestry and have difficulty with admixed populations. Recently, Principal Components Analysis (PCA) has been employed with success to select SNPs that are highly correlated with the top significant principal components (PCs) without the use of individual ancestral information. The approach is also applicable to admixed populations. Here we propose a novel approach based on our recent result on summarizing population structure by graph Laplacian eigenfunctions, which differs from PCA in that it is geometric and robust to outliers. Our approach also takes advantage of the a priori sparseness of informative markers in the genome. Through simulation of a ring population and the real global population sample HGDP of 650K SNPs genotyped in 940 unrelated individuals, we validate the proposed algorithm at selecting the most informative markers, a small fraction of which can efficiently recover a similar underlying population structure. Employing a standard Support Vector Machine (SVM) to predict individuals' continental memberships on the HGDP dataset of seven continents, we demonstrate that the SNPs selected by our method are more informative but less redundant than those selected by PCA. Our algorithm is a promising tool in genome-wide association studies and population genetics, facilitating the selection of structure informative markers, efficient detection of population substructure and ancestral inference.

Effect of population stratification analysis on false-positive rates for common and rare variants

Author: AL Price
AP Morris
B Devlin
Brad G Kurowski
C Dering
CJ Hoggart
DC Thomas
ET Cirulli
Hua He
LA Almasy
Lili Ding
Lisa J Martin
NA Rosenberg
NJ Schork
P Paschou
SP Dickson
Tesfaye M Baye
TM Baye
Xue Zhang
Publication venue: BioMed Central
Publication date: 01/01/2011

Principal components analysis (PCA) has been successfully used to correct for population stratification in genome-wide association studies of common variants. However, rare variants also have a role in common disease etiology. Whether PCA successfully controls population stratification for rare variants has not been addressed. Thus, we evaluate the effect of population stratification analysis on false-positive rates for common and rare variants at the single-nucleotide polymorphism (SNP) and gene level. We use the simulation data from Genetic Analysis Workshop 17 and compare false-positive rates with and without PCA at the SNP and gene level. We found that SNPs' minor allele frequency (MAF) influenced the ability of PCA to effectively control false discovery. Specifically, PCA reduced false-positive rates more effectively in common SNPs (MAF > 0.05) than in rare SNPs (MAF < 0.01). Furthermore, at the gene level, although false-positive rates were reduced, power to detect true associations was also reduced using PCA. Taken together, these results suggest that sequence-level data should be interpreted with caution, because extremely rare SNPs may exhibit sporadic association that is not controlled using PCA.

Springer - Publisher Connector