34 research outputs found
A broad overview of genotype imputation: Standard guidelines, approaches, and future investigations in genomic association studies
The advent of genomic big data and the statistical need for reaching significant results have made genome-wide association studies hungry for a huge number of genetic markers scattered along the whole genome. Since its very beginning, so-called genotype imputation has served this purpose; this statistical and inferential procedure, based on a known reference panel, opened the theoretical possibility of extending association analyses to a greater number of polymorphic sites that have not been assayed by the genotyping technology in use. In this review, we present a broad overview of the genotype imputation process, showing the best-known methods and presenting the main areas of interest, with a closer look at the most up-to-date approaches and a deeper understanding of its usage in the present-day genomic landscape, shedding light on its future developments and investigation areas.
Resistome, mobilome and virulome analysis of Shewanella algae and Vibrio spp. strains isolated in Italian aquaculture centers
Antimicrobial resistance is a major public health concern that is not restricted to healthcare settings but also extends to veterinary and environmental ones. In this study, we analyzed, by whole genome sequencing (WGS), the resistome, mobilome and virulome of 12 multidrug-resistant (MDR) marine strains belonging to the Shewanellaceae and Vibrionaceae families collected at aquaculture centers in Italy. The results showed the presence of several resistance mechanisms, including enzyme and efflux pump systems conferring resistance to beta-lactams, quinolones, tetracyclines, macrolides, polymyxins, chloramphenicol, fosfomycin, erythromycin, detergents and heavy metals. Mobilome analysis did not find circular elements but did find class I integrons, integrative and conjugative element (ICE) associated modules, prophages and different insertion sequence (IS) family transposases. These mobile genetic elements (MGEs) are usually present in other aquatic bacteria but also in Enterobacteriaceae, suggesting their transferability among autochthonous and allochthonous bacteria of the resilient microbiota. Regarding the presence of virulence factors, hemolytic activity was detected both in the Shewanella algae and in the Vibrio spp. strains. To conclude, these data indicate that the aquatic microbiota present in the examined Italian fish farms acts as an environmental reservoir of resistance and virulence genes that might potentially be transferred to bacteria of medical interest.
Genomic characterization of Achromobacter species isolates from chronic and occasional lung infection in cystic fibrosis patients
Achromobacter species are increasingly being detected in cystic fibrosis (CF) patients, where they can establish chronic infections by adapting to the lower airway environment. To better understand the mechanisms contributing to a successful colonization by Achromobacter species, we sequenced the whole genome of 54 isolates from 26 patients with occasional and early/late chronic lung infection. We performed a phylogenetic analysis and compared virulence and resistance genes, genetic variants and mutations, and hypermutability mechanisms between chronic and occasional isolates. We identified five Achromobacter species as well as two non-affiliated genogroups (NGs). Among them were the frequently isolated Achromobacter xylosoxidans and four other species whose clinical importance is not yet clear: Achromobacter insuavis, Achromobacter dolens, Achromobacter insolitus and Achromobacter aegrifaciens. While A. insuavis and A. dolens were isolated only from chronically infected patients and A. aegrifaciens only from occasionally infected patients, the other species were found in both groups. Most of the occasional isolates lacked functional genes involved in invasiveness, chemotaxis, type 3 secretion system and anaerobic growth, whereas the great majority (>60%) of chronic isolates had these genomic features. Interestingly, almost all (n=22/23) late chronic isolates lacked functional genes involved in lipopolysaccharide production. Regarding antibiotic resistance, we observed a species-specific distribution of blaOXA genes, confirming what has been reported in the literature and additionally identifying blaOXA-2 in some A. insolitus isolates and observing no blaOXA genes in A. aegrifaciens or NGs. No significant difference in resistance genes was found between chronic and occasional isolates. 
The results of the mutator genes analysis showed that no occasional isolate had hypermutator characteristics, while 60% of early chronic (<1 year from first colonization) and 78% of late chronic (>1 year from first colonization) isolates were classified as hypermutators. Although all A. dolens, A. insuavis and NG isolates presented two different mutS genes, these seem to have a complementary rather than compensatory function. In conclusion, our results show that Achromobacter species can exhibit different adaptive mechanisms and some of these mechanisms might be more useful than others in establishing a chronic infection in CF patients, highlighting their importance for the clinical setting and the need for further studies on the less clinically characterized Achromobacter species
Mobilome analysis of Achromobacter spp. isolates from chronic and occasional lung infection in cystic fibrosis patients
Achromobacter spp. are opportunistic pathogens that can cause lung infections in patients with cystic fibrosis (CF). Although a variety of mobile genetic elements (MGEs) carrying antimicrobial resistance genes have been identified in clinical isolates, little is known about the contribution of the Achromobacter spp. mobilome to their pathogenicity. To provide new insights, we performed bioinformatic analyses of 54 whole genome sequences and investigated the presence of phages, insertion sequences (ISs), and integrative and conjugative elements (ICEs). Most of the detected phages were previously described in other pathogens and carried type II toxin-antitoxin systems as well as other pathogenic genes. Interestingly, the partial sequence of phage Bcep176 was found in all the analyzed Achromobacter xylosoxidans genome sequences, suggesting the integration of this phage in an ancestor strain. A wide variety of ISs was also identified either inside or in proximity to pathogenicity islands. Finally, ICEs carrying pathogenic genes were found to be widespread among our isolates and seemed to be involved in transfer events within the CF lung. These results highlight the contribution of MGEs to the pathogenicity of Achromobacter species, their potential to become antimicrobial targets, and the need for further studies to better elucidate their clinical impact.
16S rRNA gene amplicons and taxonomic classification of oral microbiome
The term microbiota refers to a set of microorganisms, considered as a living ecosystem, undergoing continuous changes in the growth and survival of all its members. The microbiome consists of the set of microorganism genomes. The human microbiota is estimated to contain about 10^14 commensal bacterial cells. Present-day high-throughput sequencing technology has led to the development of genome-based methods for bacterial classification and for understanding the functional role of the microbiota and its interaction with the host. In this study we explore the capability of a gene-based sequencing method to classify bacteria of the oral microbiome, the second largest microbial community in the human body after the gut. The method depends on the detection of sequence variants in the bacterial 16S rRNA gene (length ~1500bp), present in all bacterial genomes. This gene includes nine hypervariable regions (V1-V9) that exhibit sequence diversity among different bacterial species. Therefore, the sequence variability of this gene is used to classify bacteria into proper taxonomic groups. The sequencing of one single hypervariable region cannot summarize the entire gene variability of the bacteria. Therefore, at least 2 hypervariable regions are generally studied. In gut studies the V3 and V4 regions are the most commonly analyzed. This may not be the case for oral microbiome studies. Here, we propose a study that investigates all 9 hypervariable regions (6 amplicons) and how their characterization affects the overall taxonomic classification at different taxonomic levels. This will also reveal the specificity of each hypervariable region (or combination of regions) in identifying bacterial species. We collected 4 buccal swab samples from healthy individuals, and the extracted DNA was sequenced according to the QIAseq 16S/ITS panel handbook on an Illumina MiSeq NGS platform, producing ~200,000 paired-end reads (276PE) per sample.
We carried out the study in two different ways: 1) by combining data from all amplicons of the 16S regions together, 2) by combining data from each amplicon region that was processed individually, in each sample. Amplicon analyses were performed using the Divisive Amplicon Denoising Algorithm (DADA2), which counts the number of amplicon sequence variants (ASVs) in each analyzed sample, reporting their abundance. ASVs were then classified using a pre-trained set for oral bacterial genome sequences (Human Oral Microbiome Database, version 15.1), slightly modified according to DADA2 requirements. The classification efficiency and accuracy (at genus or species level) of every ASV belonging to the different hypervariable regions was then ascertained. This analysis highlights the hypervariable regions able to capture the greatest gene variability for the oral microbiome. Moreover, the ten most common species of each of the 6 amplicons were reported for comparison purposes. We identified about 90 genera and more than 200 species; out of 9 identified phyla, Proteobacteria was the most abundant phylum (~56%). Of all the 2600 unique observed ASVs (4 samples), 1147 were successfully classified at the species level (overall classification rate: 44.1%). Overall, 204 different species were recognized with the entire set of combined amplicons, whereas 206 different species were identified by the combined results of single amplicons. The V1-V2 and V2-V3 amplicons recognized the highest number of species compared to the others, about 134 and 135 different species, respectively, of which 101 species were in common. All the single regions showed almost the same ten most recurrent species. Moreover, each region was able to detect specific bacterial species that were not detectable by the other 16S regions.
In conclusion, studying all 9 regions of the 16S gene is ~1.7 times more informative than studying just one or 2 regions, and some species can be recognized only when studying specific regions. It remains unclear, however, how to combine data from different regions to estimate the relative abundances of bacterial species within each sample.
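The counts reported in the abstract above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' pipeline: the function names are hypothetical, and the union figure assumes the approximate species counts quoted for the V1-V2 and V2-V3 amplicons are exact.

```python
def classification_rate(classified: int, total: int) -> float:
    """Percentage of ASVs that received an assignment at a given taxonomic level."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * classified / total


def union_size(count_a: int, count_b: int, shared: int) -> int:
    """Inclusion-exclusion: distinct items covered by two overlapping sets."""
    return count_a + count_b - shared


# Figures quoted in the abstract: 1147 of 2600 unique ASVs classified to species.
rate = classification_rate(1147, 2600)

# Assumption: "about 134 and 135" species are taken as exact, with 101 shared.
v1v2_v2v3_species = union_size(134, 135, 101)

print(f"species-level classification rate: {rate:.1f}%")
print(f"distinct species across V1-V2 and V2-V3: {v1v2_v2v3_species}")
```

Under that assumption, the two best-performing amplicons together would cover fewer distinct species than the 204 to 206 reported for all amplicons, which is in line with the abstract's point that other regions detect species these two miss.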
Reconstruction and functional analysis of altered molecular pathways in human atherosclerotic arteries
Background: Atherosclerosis affects the aorta and the coronary, carotid, and iliac arteries more frequently than any other body vessel. There may be common molecular pathways sustaining this process. Plaque presence and diffusion are revealed by circulating factors that can mediate a systemic reaction leading to plaque rupture and thrombosis. Results: We used DNA microarrays and meta-analysis to study how the presence of calcified plaque modifies human coronary and carotid gene expression. We identified a series of potential human atherogenic genes that are integrated in functional networks involved in atherosclerosis. Caveolae and JAK/STAT pathways, and S100A9/S100A8 interacting proteins are certainly involved in the development of vascular disease. We found that the system of caveolae is directly connected with genes that respond to hormone receptors, and indirectly with the apoptosis pathway. Cytokines, chemokines and growth factors released in the bloodstream were investigated in parallel. High levels of RANTES, IL-1ra, MIP-1alpha, MIP-1beta, IL-2, IL-4, IL-5, IL-6, IL-7, IL-17, PDGF-BB, VEGF and IFN-gamma were found in the plasma of atherosclerotic patients and might also be integrated in the molecular networks underlying atherosclerotic modifications of these vessels. Conclusion: The pattern of cytokine and S100A9/S100A8 up-regulation characterizes atherosclerosis as a proinflammatory disorder. Activation of the JAK/STAT pathway is confirmed by the up-regulation of IL-6, STAT1, ISGF3G and IL10RA genes in coronary and carotid plaques. The functional network constructed in our research provides evidence of the central role of the STAT protein and the caveolae system in contributing to plaque preservation. Moreover, Cav-1 is involved in SMC differentiation and dyslipidemia, confirming the importance of lipid homeostasis in the atherosclerotic phenotype.
Testing the performance of the imputation of MHC region in large datasets when using different reference panels
The major histocompatibility complex (MHC) contains a group of genes (~260 genes in ~4Mb), including the HLA-C gene, that are involved in several inflammatory disorders and in the immune response. So far, the IPD-IMGT/HLA database reports more than 4000 different HLA-C alleles. Given the highly polymorphic nature of the gene, GWAS generally do not study the region, or study only a small subset of its polymorphic sites. Imputation procedures may help in gaining additional information on this region. However, the successful imputation of the MHC region would require a reference panel with detailed information. The main goal of this study is to investigate whether imputation procedures using appropriate reference panels may effectively increase the number of polymorphic sites of the MHC region for association with complex traits. We studied the MHC region imputation performances using 3 different reference panels (Michigan and TOPMed imputation servers): TOPMed-r2, 1000 Genomes (Phase3, v5), and the novel four-digit multi-ethnic HLA panel (v1, 2021). Here, 5 datasets with more than 1000 individuals each underwent imputation. We then focused on the imputation results of the MHC region that surrounds the HLA-C gene (hg19: 31234948-31241032). Imputation reported a different number of markers for the different reference panels: 482 in 1000G, 365 in TOPMed, and 1272 in the HLA panel. Of note, the HLA panel gave a higher number of imputed markers than the others. We then selected the 104 common markers imputed by all 3 reference panels. Moreover, 162 markers were found only by the 1000G panel, 194 by TOPMed, and 998 by the HLA panel. The first preliminary comparisons showed a high concordance value for the genotype calling by the 3 different reference sets. The efficiency of the imputation was measured by the R-squared (R2) values, stratifying the markers into 3 groups according to the minor allele frequency (MAF). The 104 common markers showed high R2 values (>0.96).
As expected, in the other marker groups, the R2 mean values were lower for markers with MAF<0.1 (>0.65 in 1000G, 0.15-0.20 in TOPMed, >0.40 in HLA panel). In conclusion, imputation-based procedures with dedicated HLA panels can produce much more high-quality information than other general purpose reference panels for the MHC region
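The stratification of imputation quality by allele frequency described above can be sketched as follows. This is an illustrative sketch only: the abstract names just the MAF<0.1 group, so the other two bin boundaries, the function names, and the toy (MAF, R2) pairs are assumptions and not values from the study.

```python
from collections import defaultdict
from statistics import mean


def maf_bin(maf: float) -> str:
    """Assign a marker to one of three MAF groups (boundaries are hypothetical)."""
    if not 0.0 <= maf <= 0.5:
        raise ValueError("minor allele frequency must lie in [0, 0.5]")
    if maf < 0.1:
        return "MAF<0.1"
    if maf < 0.3:
        return "0.1<=MAF<0.3"
    return "MAF>=0.3"


def mean_r2_by_bin(markers):
    """Average the imputation R2 of (maf, r2) pairs within each MAF group."""
    groups = defaultdict(list)
    for maf, r2 in markers:
        groups[maf_bin(maf)].append(r2)
    return {label: mean(values) for label, values in groups.items()}


# Toy markers invented for illustration: (MAF, R2).
toy_markers = [(0.02, 0.40), (0.08, 0.60), (0.15, 0.90), (0.45, 0.98)]
r2_summary = mean_r2_by_bin(toy_markers)

for label, value in sorted(r2_summary.items()):
    print(f"{label}: mean R2 = {value:.2f}")
```

Running the same summary once per reference panel would give the kind of per-panel, per-MAF-group comparison that the abstract reports.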
Pocket-sized genomics and transcriptomics analyses: a look at the newborn BioVRPi project
BioVRPi is a newborn project, started in January 2021, that focuses on the use of Raspberry Pi (RPi) in bioinformatics, with particular regard to genomics. In previous years, some research groups have already reported several examples of applications for RPi, including basic bioinformatics training and proteomics. Our project aims to develop and offer a low-cost, stable, and tested bioinformatic environment for students and researchers involved in the genomics and transcriptomics fields. Raspberry Pi is a small, low-cost, single-board computer that has been developed by the Raspberry Pi Foundation since 2012. Its original purpose was to facilitate basic computer science teaching in developing countries, but growing worldwide interest has permitted its constant progress and development. Thanks to its features, RPi can suit several disciplines in need of computational support and reach almost every research group in the world, if not all. We tested RPi capabilities on real case studies relating to Genome-Wide Association Studies (GWAS) for complex traits in Homo sapiens data and to transcriptomic analyses (RNA-seq) on samples of the human parasite Strongyloides stercoralis, using two RPi-4 devices equipped with different amounts of RAM (8 GB for genomics and 2 GB for transcriptome analyses, respectively) and running a 64-bit operating system. The analyses leveraged a state-of-the-art bioinformatic toolset, such as Plink and Plink1.9, SAMtools, Bowtie 2, R, and different R packages, all compiled from source code. Moreover, the GWAS was run according to gold-standard protocols, and results from the different platforms were compared. The results showed that RPi devices are effective and can efficiently handle whole GWAS and RNA-seq analyses. Benchmarking showed that the computational time taken by the RPi was of the same order of magnitude as that of a commonly used bioinformatics computer.
Finally, the BioVRPi project shows how to implement new strategies for bioinformatic analyses, providing an enjoyable environment in which to learn and explore new alternatives in bioinformatic data analysis.
Analyzing BioRad-Illumina Single Cell RNA-Seq data with open source tools
Single cell RNA-Seq is a powerful technique that is becoming more popular because it makes it possible to sequence the transcriptome of each cell within a population of different cell types in a single experiment. Currently, there are a few different technologies, like BioRad-Illumina ddSeq and 10X Chromium
Methylation profile study of CD14+ monocytes of multiple sclerosis-affected individuals.
Methylation is one of the most studied epigenetic mechanisms known to affect gene expression. It refers to the covalent binding of a methyl group to the fifth position of cytosine residues in the CpG dinucleotide context in mammals. In our study we analysed 26 CD14+ monocyte samples from relapsing-remitting multiple sclerosis (MS) patients and controls. DNA libraries were prepared with the SeqCap Epi Enrichment System (Roche), with enzymatic fragmentation and bisulfite conversion of 26 DNAs (pool 1), and then sequenced on an Illumina next-generation sequencing platform. The aim was to estimate the epigenetic profile and investigate differentially methylated regions between cases and controls. Our preliminary results showed an unexpected epigenetic pattern (~2.5 million CpGs after QC steps) lacking many methylation signals, suggesting that the enzymatic fragmentation somehow disrupted most methylated cytosines. To evaluate whether the method of DNA fragmentation had an impact on the observed results, eight samples (pool 2) belonging to pool 1 were then analysed using mechanical fragmentation of DNA as a second and independent method. Pool 2 samples showed the expected methylation profile, with many loci either fully methylated or non-methylated. Methylation profiles from samples common to pool 1 and pool 2 were then compared to one another. Through bioinformatic and statistical tools, the data were processed to infer any correlations between the methylation signals (β values) of the two pools and then to recover as many lost methylation signals as possible from the pool 1 samples, using the pool 2 samples as reference. Preliminary results showed that most fully methylated loci in pool 2 showed a lower β value in pool 1 samples, while for hypomethylated loci the two pools showed a concordance of ~99%.
Moreover, the comparison between MS cases and controls showed a signal of differential methylation (nominal p-value threshold 1%) for 1,359 CpG loci, some of which map to the DIP2C gene. Further analyses are needed to investigate the impact of enzymatic fragmentation on methylation estimation and to obtain the epigenetic profiles for the dataset of 26 MS samples. In addition, miRNA expression from this dataset will be integrated with the methylation signals.
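The β values and the between-pool concordance mentioned above can be illustrated with a short sketch. It assumes the common definition of β for bisulfite sequencing, the fraction of reads supporting a methylated cytosine at a CpG; the abstract does not state how β or concordance were computed, so the tolerance, the function names, and the read counts below are hypothetical.

```python
def beta_value(methylated_reads: int, unmethylated_reads: int) -> float:
    """Methylation level at one CpG: methylated reads over total reads."""
    total = methylated_reads + unmethylated_reads
    if total == 0:
        raise ValueError("no read coverage at this CpG")
    return methylated_reads / total


def concordance(betas_a, betas_b, tolerance: float = 0.1) -> float:
    """Fraction of loci whose beta values differ by at most `tolerance`.

    The tolerance is a hypothetical choice made for this sketch.
    """
    if len(betas_a) != len(betas_b) or not betas_a:
        raise ValueError("need two non-empty, equally long lists of beta values")
    agreeing = sum(abs(a - b) <= tolerance for a, b in zip(betas_a, betas_b))
    return agreeing / len(betas_a)


# Invented read counts for three CpGs in the same sample, once per pool:
# two hypomethylated loci that agree and one locus whose signal drops in pool 1.
pool1 = [beta_value(1, 19), beta_value(0, 20), beta_value(8, 12)]
pool2 = [beta_value(2, 18), beta_value(0, 25), beta_value(19, 1)]

share = concordance(pool1, pool2)
print(f"concordant loci: {share:.2f}")
```

In this toy example the third locus is near fully methylated in pool 2 but shows a much lower β in pool 1, the same qualitative pattern the abstract describes for enzymatically fragmented samples.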