272 research outputs found

Variant calling: Considerations, practices, and developments

Author: Guryev Victor
Zverinova Stepanka
Publication venue: Wiley
Publication date: 01/08/2022

The success of many clinical, association, or population genetics studies relies critically on a properly performed variant-calling step. The variety of modern genomics protocols, techniques, and platforms makes the choice of methods and algorithms difficult, and there is no "one size fits all" solution for study design and data analysis. In this review, we discuss considerations that need to be taken into account while designing the study and preparing for the experiments. We outline the variety of variant types that can be detected using sequencing approaches and highlight some specific requirements and basic principles of their detection. Finally, we cover interesting developments that enable variant calling for a broad range of applications in the genomics field. We conclude by discussing technological and algorithmic advances that have the potential to change the ways of calling DNA variants in the near future.

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

PubMed Central

Dissertations of the University of Groningen

CONREAL web server: identification and visualization of conserved transcription factor binding sites

Author: Berezikov Eugene
Cuppen Edwin
Guryev Victor
Publication venue: Oxford University Press
Publication date: 01/01/2005

The use of orthologous sequences and phylogenetic footprinting approaches has become popular for the recognition of conserved and potentially functional sequences. Several algorithms have been developed for the identification of conserved transcription factor binding sites (TFBSs), which are characterized by their relatively short and degenerate recognition sequences. The CONREAL (conserved regulatory elements anchored alignment) web server provides a versatile interface to CONREAL-, LAGAN-, BLASTZ- and AVID-based predictions of conserved TFBSs in orthologous promoters. Comparative analysis using different algorithms can be started by keyword without any prior sequence retrieval. The interface is available at

Crossref

PubMed Central

CASCAD: a database of annotated candidate single nucleotide polymorphisms associated with expressed sequences

Author: Berezikov Eugene
Cuppen Edwin
Guryev Victor
Publication venue: BioMed Central
Publication date: 01/01/2005

BACKGROUND: With the recent progress made in large-scale genome sequencing projects, a vast amount of novel data is becoming available. A comparative sequence analysis, exploiting sequence information from various resources, can be used to uncover hidden information, such as genetic variation. Although enormous numbers of SNPs for a wide variety of organisms have been submitted to NCBI dbSNP and annotated in most genome assembly viewers, such as Ensembl and the UCSC Genome Browser, these platforms do not easily allow for extensive annotation and incorporation of experimental data supporting the polymorphism. However, such information is very important for selecting the most promising and useful candidate polymorphisms for use in experimental setups. DESCRIPTION: The CASCAD database is designed for presentation and query of candidate SNPs that are retrieved by in silico mining of high-throughput sequencing data. Currently, the database provides collections of laboratory rat (Rattus norvegicus) and zebrafish (Danio rerio) candidate SNPs. The database stores detailed information about raw data supporting the candidate, extensive annotation and links to external databases (e.g. GenBank, Ensembl, UniGene, and LocusLink), verification information, and predictions of a potential effect for non-synonymous polymorphisms in coding regions. The CASCAD website allows searches based on an arbitrary combination of 27 different parameters related to characteristics such as candidate SNP quality, genomic localization, and sequence data source or strain. In addition, the database can be queried with any custom nucleotide sequence of interest. The interface is crosslinked to other public databases and tightly coupled with primer design and local genome assembly interfaces in order to facilitate experimental verification of candidates. CONCLUSIONS: The CASCAD database discloses detailed information on rat and zebrafish candidate SNPs, including the raw data underlying their discovery. An advanced web-based search interface allows universal access to the database content and supports a variety of queries for many types of research utilizing single nucleotide polymorphisms.

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Significance Tests for Gaussian Graphical Models Based on Shrunken Densities

Author: Bernal Arzola Victor
Bischoff Rainer
Grzegorczyk Marco
Guryev Victor
Horvatovich Peter
Publication venue: University of Bristol
Publication date: 20/07/2018

Proceedings - University of Groningen

Multi-platform discovery of haplotype-resolved structural variation in human genomes

Author: Guryev Victor
Lansdorp Peter
Porubský David
Spierings Diana
Publication date: 23/09/2017

The incomplete identification of structural variants from whole-genome sequencing data limits studies of human genetic diversity and disease association. Here, we apply a suite of long- and short-read, strand-specific sequencing technologies, optical mapping, and variant discovery algorithms to comprehensively analyze three human parent-child trios to define the full spectrum of human genetic variation in a haplotype-resolved manner. We identify 818,181 indel variants (<50 bp) and 31,599 structural variants (≥50 bp) per human genome, a sevenfold increase in structural variation compared to previous reports, including from the 1000 Genomes Project. We also discover 156 inversions per genome, most of which previously escaped detection, as well as large unbalanced chromosomal rearrangements. We provide near-complete, haplotype-resolved structural variation for three genomes that can now be used as a gold standard for the scientific community, and we make specific recommendations for maximizing structural variation sensitivity for future large-scale genome sequencing studies.

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

Significance Tests for Gaussian Graphical Models Based on Shrunken Densities

Author: Bernal Arzola Victor
Bischoff Rainer
Grzegorczyk Marco
Guryev Victor
Horvatovich Peter
Publication venue: University of Bristol
Publication date: 20/07/2018

Dissertations of the University of Groningen

The 'un-shrunk' partial correlation in Gaussian graphical models

Author: Bernal Victor
Bischoff Rainer
Grzegorczyk Marco
Guryev Victor
Horvatovich Peter
Publication venue: Springer Science and Business Media LLC
Publication date: 01/09/2021

BACKGROUND: In systems biology, it is important to reconstruct regulatory networks from quantitative molecular profiles. Gaussian graphical models (GGMs) are one of the most popular methods to this end. A GGM consists of nodes (representing the transcripts, metabolites or proteins) inter-connected by edges (reflecting their partial correlations). Learning the edges from quantitative molecular profiles is statistically challenging, as there are usually fewer samples than nodes (‘high-dimensional problem’). Shrinkage methods address this issue by learning a regularized GGM. However, how the shrinkage affects the final result and its interpretation remains an open question. RESULTS: We show that the shrinkage biases the partial correlation in a non-linear way. This bias not only changes the magnitudes of the partial correlations but also affects their order. Furthermore, it makes networks obtained from different experiments incomparable and hinders their biological interpretation. We propose a method, referred to as ‘un-shrinking’ the partial correlation, which corrects for this non-linear bias. Unlike traditional methods, which use a fixed shrinkage value, the new approach provides partial correlations that are closer to the actual (population) values and that are easier to interpret. This is demonstrated on two gene expression datasets from Escherichia coli and Mus musculus. CONCLUSIONS: GGMs are popular undirected graphical models based on partial correlations. The application of GGMs to reconstruct regulatory networks is commonly performed using shrinkage to overcome the ‘high-dimensional problem’. Besides its advantages, we have identified that the shrinkage introduces a non-linear bias in the partial correlations. Ignoring this type of effect caused by the shrinkage can obscure the interpretation of the network and impede the validation of earlier reported results.
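The dependence of partial correlations on the shrinkage value described in the abstract above can be illustrated with a minimal sketch. It assumes a linear shrinkage of the correlation matrix toward the identity, (1 - lam) * R + lam * I, which is a common choice but is not stated in the abstract. The function names, the example matrix R, and the lam values are hypothetical and chosen only for illustration. The sketch does not implement the authors' 'un-shrinking' method.

```python
import numpy as np


def partial_correlations(corr):
    """Partial correlations from a correlation matrix via its inverse.

    For precision matrix P = inv(corr), the partial correlation between
    variables i and j is -P[i, j] / sqrt(P[i, i] * P[j, j]).
    """
    precision = np.linalg.inv(corr)
    scale = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcor, 1.0)
    return pcor


def shrink_correlation(corr, lam):
    """Assumed shrinkage form: pull the matrix toward the identity by lam in [0, 1]."""
    return (1.0 - lam) * corr + lam * np.eye(corr.shape[0])


# Hypothetical positive-definite correlation matrix for three variables.
R = np.array([
    [1.0, 0.8, 0.6],
    [0.8, 1.0, 0.7],
    [0.6, 0.7, 1.0],
])

# Print the off-diagonal partial correlations for a few shrinkage values.
upper = np.triu_indices(R.shape[0], k=1)
for lam in (0.0, 0.2, 0.5):
    pcor = partial_correlations(shrink_correlation(R, lam))
    print(lam, np.round(pcor[upper], 3))
```

Comparing the printed rows shows how the same data yield different partial correlations for different shrinkage values, which is the comparability issue the abstract raises.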

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Directory of Open Access Journals

Dissertations of the University of Groningen

Cytochrome P450scc spin state transitions in the thin solid films

Author: Claudio Nicolini
Oleg Guryev
Sergei A. Usanov
Victor Erokhin
Publication date: 01/05/1996

Langmuir-Blodgett films of cytochrome P450scc were prepared on solid supports and their spectral properties were investigated. Upon immobilization, the hemoprotein changes its spin state from initially high spin to low spin. This transition is reversible, since after solubilization of the hemoprotein the spin state equilibrium shifts back towards the high-spin state. Anaerobic reduction of film-incorporated cytochrome P450scc by the electron transfer chain (NADPH → adrenodoxin reductase → adrenodoxin) revealed a low reaction rate that agrees well with the content of the low-spin form of the hemoprotein. We suggest that the particularly regular orientation of cytochrome P450scc in the solid film is of crucial importance for this phenomenon.

Crossref

Open Access Repository

Significance Tests for Gaussian Graphical Models Based on Shrunken Densities

Author: Bernal Arzola Victor
Bischoff Rainer
Grzegorczyk Marco
Guryev Victor
Horvatovich Peter
Publication venue: University of Bristol
Publication date: 20/07/2018

ARTS repository - University of Groningen