23 research outputs found

    Strategies for improving the standardization and robustness of toxicogenomics data analyses

    Get PDF
    Toxicology is the scientific pursuit of identifying and classifying the toxic effects of substances, as well as exploring and understanding the adverse effects of toxic exposure. Modern toxicological efforts have been driven by the industrial production of engineered substances, supported by advanced interdisciplinary scientific collaborations. These engineered substances must be carefully tested to ensure public safety. This task is now more challenging than ever with the introduction of new classes of chemical compounds, such as engineered nanomaterials. Toxicological paradigms have been redefined over the decades to be more agile, versatile, and sensitive. At the same time, the design of toxicological studies has become more complex, and the interpretation of the results more challenging. Toxicogenomics offers a wealth of data for estimating gene regulation by inspecting alterations in many biomolecules (such as DNA, RNA, proteins, and metabolites). The responses of functional genes can be used to infer the toxic effects on the biological system that result in acute or chronic adverse effects. However, the dense data from toxicogenomics studies are difficult to analyze, and the results are difficult to interpret. Because of these drawbacks, toxicogenomic evidence is still not completely integrated into the regulatory framework. Nanomaterial properties such as particle size, shape, and structure add complexity and pose unique challenges to nanotoxicology. This thesis presents efforts towards the standardization of toxicogenomics data by showcasing the potential of omics in nanotoxicology and by providing easy-to-use tools for the analysis and interpretation of omics data. The work explores two main themes: i) omics experimentation in nanotoxicology and the investigation of nanomaterial effects through analysis of omics data, and ii) the development of analysis pipelines as easy-to-use tools that bring advanced analytical methods to general users. In this work, I explored a potential solution to ensure effective interpretability and reproducibility of omics data and the related experimentation, such that an independent researcher can interpret them thoroughly. DNA microarray technology is a well-established research tool for estimating the dynamics of biological molecules with high throughput. The analysis of data from these assays presents many challenges, as the study designs are quite complex. I explored the challenges of omics data processing and provided bioinformatics solutions to standardize this process. The responses of individual molecules to a given exposure are only partially informative, and more sophisticated models that disentangle the complex networks of dynamic molecular interactions need to be explored. An analytical solution is presented in this thesis to tackle the challenge of producing robust interpretations of molecular dynamics in biological systems. It allows exploring the substructures in molecular networks that underlie mechanisms of molecular adaptation to exposures. I also present a multi-omics approach to defining the mechanism of action in human cell lines exposed to nanomaterials. All the methodologies developed in this project for omics data processing and network analysis are implemented as software solutions designed to be easily accessible also to users with no expertise in bioinformatics. Our strategies are also developed in an effort to standardize omics data processing and analysis and to promote the use of omics-based evidence in chemical risk assessment.

    INfORM : Inference of NetwOrk Response Modules

    Get PDF
    Summary: Detecting and interpreting responsive modules from gene expression data by using network-based approaches is a common but laborious task. It often requires the application of several computational methods implemented in different software packages, forcing biologists to compile complex analytical pipelines. Here we introduce INfORM (Inference of NetwOrk Response Modules), an R Shiny application that enables non-expert users to detect, evaluate, and select gene modules with high statistical and biological significance. INfORM is a comprehensive tool for the identification of biologically meaningful response modules from consensus gene networks inferred by using multiple algorithms. It is accessible through an intuitive graphical user interface, allowing for a level of abstraction from the computational steps. Peer reviewed
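    As an illustration of the underlying idea, the sketch below builds a consensus co-expression network from two simple stand-in inference methods (Pearson and Spearman correlation) and extracts candidate response modules with igraph community detection. This is a minimal sketch, not INfORM's actual implementation, which aggregates the outputs of several dedicated inference algorithms; 'expr' is an assumed genes-by-samples matrix with gene names as row names.

        # Minimal consensus-module sketch (illustrative, not INfORM's code)
        library(igraph)

        consensus_modules <- function(expr, cutoff = 0.8) {
          # Two stand-in "inference algorithms": Pearson and Spearman correlation
          adj_p <- abs(cor(t(expr), method = "pearson"))  > cutoff
          adj_s <- abs(cor(t(expr), method = "spearman")) > cutoff
          # Consensus network: keep only edges supported by both methods
          adj <- (adj_p & adj_s) * 1
          diag(adj) <- 0
          g <- graph_from_adjacency_matrix(adj, mode = "undirected")
          # Candidate response modules as communities of the consensus network
          comm <- cluster_louvain(g)
          split(V(g)$name, membership(comm))
        }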

    Allelic Expression Imbalance in the Human Retinal Transcriptome and Potential Impact on Inherited Retinal Diseases

    Get PDF
    Inherited retinal diseases (IRDs) are often associated with variable clinical expressivity (VE) and incomplete penetrance (IP). The underlying mechanisms may include environmental, epigenetic, and genetic factors. Cis-acting expression quantitative trait loci (cis-eQTLs) can be implicated in the regulation of genes by favoring or hampering the expression of one allele over the other. Thus, the presence of such loci elicits allelic expression imbalance (AEI) that can be traced by massively parallel sequencing techniques. In this study, we performed an AEI analysis on RNA-sequencing (RNA-seq) data from 52 healthy retina donors, which identified 194 imbalanced single nucleotide polymorphisms (SNPs) in 67 IRD genes. Focusing on SNPs displaying AEI at a frequency higher than 10%, we found evidence of AEI in several IRD genes regularly associated with IP and VE (BEST1, RP1, PROM1, and PRPH2). Based on these SNPs commonly undergoing AEI, we performed pyrosequencing in an independent sample set of 17 healthy retina donors in order to confirm our findings. Indeed, we were able to confirm that CDHR1, BEST1, and PROM1 are subject to cis-acting regulation. With this work, we aim to shed light on differentially expressed alleles in the human retina transcriptome that, in the context of autosomal dominant IRD cases, could help to explain IP or VE. Peer reviewed
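    The core statistical idea behind such AEI detection, a binomial test of allele-specific read counts at a heterozygous SNP against the balanced 1:1 expectation, can be sketched as follows (a generic illustration, not the study's exact pipeline):

        # Test a heterozygous SNP for allelic expression imbalance:
        # under balanced expression, reference reads follow Binomial(n, 0.5)
        test_aei <- function(ref_count, alt_count) {
          bt <- binom.test(ref_count, ref_count + alt_count, p = 0.5)
          c(ref_fraction = ref_count / (ref_count + alt_count),
            p_value = bt$p.value)
        }

        # Example: 80 reads carry the reference allele, 20 the alternative
        test_aei(80, 20)  # ref_fraction = 0.8, p-value well below 0.05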

    Nextcast : A software suite to analyse and model toxicogenomics data

    Get PDF
    The recent advancements in toxicogenomics have led to the availability of large omics data sets, representing the starting point for studying the exposure mechanism of action and identifying candidate biomarkers for toxicity prediction. The current lack of standard methods in data generation and analysis hampers the full exploitation of toxicogenomics-based evidence in regulatory risk assessment. Moreover, the pipelines for the preprocessing and downstream analyses of toxicogenomic data sets can be quite challenging to implement. Over the years, we have developed a number of software packages to address specific questions related to multiple steps of toxicogenomics data analysis and modelling. In this review we present the Nextcast software collection and discuss how its individual tools can be combined into efficient pipelines to answer specific biological questions. The Nextcast components are of great support to the scientific community for analysing and interpreting large data sets for the toxicity evaluation of compounds in an unbiased, straightforward, and reliable manner. The Nextcast software suite is available at https://github.com/fhaive/nextcast. Peer reviewed

    FunMappOne : a tool to hierarchically organize and visually navigate functional gene annotations in multiple experiments

    Get PDF
    Background: Functional annotation of genes is an essential step in omics data analysis. Multiple databases and methods are currently available to summarize the functions of sets of genes into higher-level representations, such as ontologies and molecular pathways. Annotating results from omics experiments into functional categories is essential not only to understand the underlying regulatory dynamics but also to compare multiple experimental conditions at a higher level of abstraction. Several tools are already available to the community for representing and comparing the functional profiles of omics experiments. However, when the number of experiments and/or enriched functional terms is high, the results become difficult to interpret even when graphically represented. There is therefore a need for interactive and user-friendly tools that graphically navigate and further summarize annotations in order to facilitate the interpretation of results even when the dimensionality is high. Results: We developed an approach that exploits the intrinsic hierarchical structure of several functional annotations to summarize the results obtained through enrichment analyses to higher levels of interpretation and to map gene-related information at each summarized level. We built a user-friendly graphical interface that allows users to visualize the functional annotations of one or multiple experiments at once. The tool is implemented as an R Shiny application called FunMappOne and is available at https://github.com/grecolab/FunMappOne. Conclusion: FunMappOne is an R Shiny graphical tool that takes as input multiple lists of human or mouse genes, optionally along with their related modification magnitudes, computes the enriched annotations from the Gene Ontology, Kyoto Encyclopedia of Genes and Genomes, or Reactome databases, and reports interactive maps of functional terms and pathways organized in rational groups. FunMappOne allows fast and convenient comparison of multiple experiments and an easy way to interpret results. Peer reviewed
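    The enrichment step that FunMappOne automates for each gene list can be approximated with standard Bioconductor tooling; the sketch below runs a Gene Ontology enrichment on a small hypothetical human gene list with clusterProfiler (FunMappOne itself adds the multi-experiment maps and hierarchical grouping on top of such results).

        # Illustrative GO enrichment for one gene list (not FunMappOne's code)
        library(clusterProfiler)
        library(org.Hs.eg.db)

        genes  <- c("TP53", "CDKN1A", "MDM2", "BAX")   # hypothetical input list
        entrez <- bitr(genes, fromType = "SYMBOL",
                       toType = "ENTREZID", OrgDb = org.Hs.eg.db)$ENTREZID

        # Enriched GO Biological Process terms, BH-adjusted
        ego <- enrichGO(gene = entrez, OrgDb = org.Hs.eg.db,
                        ont = "BP", pAdjustMethod = "BH")
        head(as.data.frame(ego)[, c("ID", "Description", "p.adjust")])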

    eUTOPIA : solUTion for Omics data PreprocessIng and Analysis

    Get PDF
    Background: The application of microarrays in omics technologies enables the quantification of many biomolecules simultaneously. It is widely applied to observe positive or negative effects on biomolecule activity by quantitatively comparing perturbed and steady states. Community resources, such as Bioconductor and CRAN, host tools based on the R language that have become standard for high-throughput analytics. However, applying these tools is technically challenging for generic users and requires specific computational skills. There is a need for an intuitive and easy-to-use platform for processing omics data, visualizing results, and interpreting them. Results: We propose an integrated software solution, eUTOPIA, that implements a set of essential processing steps as a guided workflow presented to the user as an R Shiny application. Conclusions: eUTOPIA allows researchers to perform preprocessing and analysis of microarray data via a simple and intuitive graphical interface while using state-of-the-art methods. Peer reviewed
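    The differential-analysis core of such a guided workflow is typically a standard limma fit; the following minimal sketch assumes a normalized log2 probe-by-sample matrix 'exprs' and a condition factor 'group' with levels "control" and "treated" (illustrative only, not eUTOPIA's actual code).

        # Standard limma differential expression on preprocessed microarray data
        library(limma)

        design <- model.matrix(~ 0 + group)             # one column per condition
        colnames(design) <- levels(group)

        fit  <- lmFit(exprs, design)                    # per-probe linear models
        cm   <- makeContrasts(treated - control, levels = design)
        fit2 <- eBayes(contrasts.fit(fit, cm))          # moderated t-statistics
        tab  <- topTable(fit2, coef = 1, number = Inf)  # ranked result table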

    Identification of optimum sequencing depth especially for de novo genome assembly of small genomes using next generation sequencing data.

    Get PDF
    Next Generation Sequencing (NGS) is a disruptive technology that has found widespread acceptance in the life sciences research community. The high throughput and low cost of sequencing have encouraged researchers to undertake ambitious genomic projects, especially in de novo genome sequencing. Currently, NGS systems generate sequence data as short reads, and de novo genome assembly using these short reads is computationally very intensive. Due to the lower cost of sequencing and higher throughput, NGS systems now provide the ability to sequence genomes at high depth. However, no report is currently available highlighting the impact of high sequence depth on genome assembly using real data sets and multiple assembly algorithms. Recently, some studies have evaluated the impact of sequence coverage, error rate, and average read length on genome assembly using multiple assembly algorithms; however, these evaluations were performed using simulated datasets. One limitation of using simulated datasets is that variables such as error rate, read length, and coverage, which are known to impact genome assembly, are carefully controlled. Hence, this study was undertaken to identify the minimum depth of sequencing required for de novo assembly of different-sized genomes using graph-based assembly algorithms and real datasets. Illumina reads for E. coli (4.6 Mb), S. kudriavzevii (11.18 Mb), and C. elegans (100 Mb) were assembled using SOAPdenovo, Velvet, ABySS, Meraculous, and IDBA-UD. Our analysis shows that 50X is the optimum read depth for assembling these genomes using all assemblers except Meraculous, which requires 100X read depth. Moreover, our analysis shows that de novo assembly from 50X read data requires only 6-40 GB of RAM, depending on the genome size and the assembly algorithm used. We believe that this information can be extremely valuable for researchers in designing experiments and multiplexing, enabling optimum utilization of both sequencing and analysis resources.
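    The depth arithmetic behind such experiment planning follows directly from coverage = (number of reads × read length) / genome size; assuming 100 bp reads for illustration, the read counts implied by the reported 50X optimum are easy to compute.

        # Reads needed for a target coverage; 100 bp read length is an assumption
        reads_needed <- function(coverage, genome_size_bp, read_length = 100) {
          ceiling(coverage * genome_size_bp / read_length)
        }

        reads_needed(50, 4.6e6)    # E. coli:           2.3 million reads
        reads_needed(50, 11.18e6)  # S. kudriavzevii:  ~5.6 million reads
        reads_needed(50, 100e6)    # C. elegans:        50 million reads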

    Memory requirement for genome assembly.

    No full text
    Memory required to assemble E. coli (A), S. kudriavzevii (B) and C. elegans (C) genomes increased, although not proportionately, with increasing depth of sequencing.