27 research outputs found

Visualizing genome and systems biology: technologies, tools, implementation techniques and trends, past, present and future.

Author: Enright Anton J
Iliopoulos Ioannis
Malliarakis Dimitris
Papanikolaou Nikolas
Pavlopoulos Georgios A
Theodosiou Theodosis
Publication venue: Gigascience
Publication date: 01/01/2015

"Α picture is worth a thousand words." This widely used adage sums up in a few words the notion that a successful visual representation of a concept should enable easy and rapid absorption of large amounts of information. Although, in general, the notion of capturing complex ideas using images is very appealing, would 1000 words be enough to describe the unknown in a research field such as the life sciences? Life sciences is one of the biggest generators of enormous datasets, mainly as a result of recent and rapid technological advances; their complexity can make these datasets incomprehensible without effective visualization methods. Here we discuss the past, present and future of genomic and systems biology visualization. We briefly comment on many visualization and analysis tools and the purposes that they serve. We focus on the latest libraries and programming languages that enable more effective, efficient and faster approaches for visualizing biological concepts, and also comment on the future human-computer interaction trends that would enable for enhancing visualization further

Crossref

Springer - Publisher Connector

PubMed Central

Apollo (Cambridge)

Metagenomics: tools and insights for analyzing next-generation sequencing data derived from biodiversity studies

Author: Arvanitidis Christos
Iliopoulos Ioannis
Kotoulas Georgios
Oulas Anastasis
Papanikolaou Nikolas
Pavlopoulos Georgios A
Pavloudi Christina
Polymenakou Paraskevi
Publication venue: SAGE Publications
Publication date: 01/01/2015

Advances in next-generation sequencing (NGS) have allowed significant breakthroughs in microbial ecology studies. This has led to the rapid expansion of research in the field and the establishment of “metagenomics”, often defined as the analysis of DNA from microbial communities in environmental samples without prior need for culturing. Many metagenomics statistical/computational tools and databases have been developed in order to allow the exploitation of the huge influx of data. In this review article, we provide an overview of the sequencing technologies and how they are uniquely suited to various types of metagenomic studies. We focus on the currently available bioinformatics techniques, tools, and methodologies for performing each individual step of a typical metagenomic dataset analysis. We also provide future trends in the field with respect to tools and technologies currently under development. Moreover, we discuss data management, distribution, and integration tools that are capable of performing comparative metagenomic analyses of multiple datasets using well-established databases, as well as commonly used annotation standards

Ghent University Academic Bibliography

Directory of Open Access Journals

PubMed Central

BioTextQuest v2.0: An evolved tool for biomedical literature mining and concept discovery.

Author: Andreakos Evangelos
Antonakis Andreas
Baltoumas Fotis
Baltsavia Ismini
Brandau Sven
Chatzaki Ekaterini
Iliopoulos Ioannis
Karaglani Makrina
Mossialos Dimitrios
Ouzounis Christos
Papanikolaou Nikolas
Pavlopoulos Georgios
Promponas Vasilis
Theodosiou Theodosios
Vrettos Konstantinos
Publication venue: eScholarship, University of California
Publication date: 01/12/2024

The process of navigating through the landscape of biomedical literature and performing searches or combining them with bioinformatics analyses can be daunting, considering the exponential growth of scientific corpora and the plethora of tools designed to mine PubMed® and related repositories. Herein, we present BioTextQuest v2.0, a tool for biomedical literature mining. BioTextQuest v2.0 is an open-source online web portal for document clustering based on sets of selected biomedical terms, offering efficient management of information derived from PubMed abstracts. Employing established machine learning algorithms, the tool facilitates document clustering while allowing users to customize the analysis by selecting terms of interest. BioTextQuest v2.0 streamlines the process of uncovering valuable insights from biomedical research articles, serving as an agent that connects the identification of key terms like genes/proteins, diseases, chemicals, Gene Ontology (GO) terms, functions, and others through named entity recognition, and their application in biological research. Instead of manually sifting through articles, researchers can enter their PubMed-like query and receive extracted information in two user-friendly formats, tables and word clouds, simplifying the comprehension of key findings. The latest update of BioTextQuest leverages the EXTRACT named entity recognition tagger, enhancing its ability to pinpoint various biological entities within text. BioTextQuest v2.0 acts as a research assistant, significantly reducing the time and effort required for researchers to identify and present relevant information from the biomedical literature.

eScholarship - University of California

DrugQuest - a text mining workflow for drug association discovery.

Author: Papanikolaou Nikolas
Publication date: 16/01/2018

Ezid

Genome urbanization: clusters of topologically co-regulated genes delineate functional compartments in the genome of Saccharomyces cerevisiae

Author: Malliarou Maria
Nikolaou Christoforos
Papanikolaou Nikolas
Roca Joaquim
Tsochatzidou Maria
Publication venue: Oxford University Press (OUP)
Publication date: 23/03/2017

The eukaryotic genome evolves under the dual constraint of maintaining coordinated gene transcription and performing effective DNA replication and cell division, the coupling of which brings about inevitable DNA topological tension. DNA supercoiling is resolved and, in some cases, even harnessed by the genome through the function of DNA topoisomerases, as has been shown in the concurrent transcriptional activation and suppression of genes upon transient deactivation of topoisomerase II (topoII). By analyzing a genome-wide transcription run-on experiment upon thermal inactivation of topoII in Saccharomyces cerevisiae, we were able to define 116 gene clusters of consistent response (either positive or negative) to topological stress. A comprehensive analysis of these topologically co-regulated gene clusters reveals pronounced preferences regarding their functional, regulatory and structural attributes. Genes that respond negatively to topological stress are positioned in gene-dense pericentromeric regions, are more conserved and are associated with essential functions, while upregulated gene clusters are preferentially located in the gene-sparse nuclear periphery, associated with secondary functions and under complex regulatory control. We propose that genome architecture evolves with a core of essential genes occupying a compact genomic ‘old town’, whereas more recently acquired, condition-specific genes tend to be located in a more spacious ‘suburban’ genomic periphery.

University of Crete Small-Scale Research Grant [4274 to C.N.]. Funding for open access charge: Plan Nacional de I+D+I of Spain, Grant Number BFU2015-67007-P to J.R. Peer reviewed.

Crossref

Digital.CSIC

Gene socialization: gene order, GC content and gene silencing in Salmonella

Author: Ioannis Iliopoulos
Kalliopi Trachana
Nikolas Papanikolaou
Theodosios Theodosiou
Vasilis J Promponas
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2009

BACKGROUND: Genes of conserved order in bacterial genomes tend to evolve slower than genes whose order is not conserved. In addition, genes with a GC content lower than the GC content of the resident genome are known to be selectively silenced by the histone-like nucleoid structuring protein (H-NS) in Salmonella. RESULTS: In this study, we use a comparative genomics approach to demonstrate that in Salmonella, genes whose order is not conserved (or genes without homologs) in closely related bacteria possess a significantly lower average GC content in comparison to genes that preserve their relative position in the genome. Moreover, these genes are more frequently targeted by H-NS than genes that have conserved their genomic neighborhood. We also observed that duplicated genes that do not preserve their genomic neighborhood are, on average, under less selective pressure. CONCLUSIONS: We establish a strong association between gene order, GC content and gene silencing in a model bacterial species. This analysis suggests that genes that are not under strong selective pressure (evolve faster than others) in Salmonella tend to accumulate more AT-rich mutations and are eventually silenced by H-NS. Our findings may establish new approaches for a better understanding of bacterial genome evolution and function, using information from functional and comparative genomics

Crossref

PubMed Central

Genome urbanization: clusters of topologically co-regulated genes delineate functional compartments in the genome of Saccharomyces cerevisiae

Author: Christoforos Nikolaou
Joaquim Roca
Maria Malliarou
Maria Tsochatzidou
Nikolas Papanikolaou
Publication venue: Oxford University Press (OUP)

Crossref

DrugQuest - a text mining workflow for drug association discovery.

Author: Iliopoulos Ioannis
Papanikolaou Nikolas
Pavlopoulos Georgios A
Theodosiou Theodosios
Vizirianakis Ioannis S
Publication venue: eScholarship, University of California
Publication date: 01/06/2016

Background: Text mining and data integration methods are gaining ground in the field of health sciences due to the exponential growth of biomedical literature and information stored in biological databases. While such methods mostly try to extract bioentity associations from PubMed, very few of them are dedicated to mining other types of repositories such as chemical databases. Results: Herein, we apply a text mining approach to the DrugBank database in order to explore drug associations based on the DrugBank "Description", "Indication", "Pharmacodynamics" and "Mechanism of Action" text fields. We apply Named Entity Recognition (NER) techniques to these fields to identify chemicals, proteins, genes, pathways and diseases, and we utilize the TextQuest algorithm to find additional biologically significant words. Using a plethora of similarity and partitional clustering techniques, we group the DrugBank records based on their common terms and investigate possible scenarios why these records are clustered together. Different views, such as clustered chemicals based on their textual information and tag clouds consisting of Significant Terms along with the terms that were used for clustering, are delivered to the user through a user-friendly web interface. Conclusions: DrugQuest is a text mining tool for knowledge discovery: it is designed to cluster DrugBank records based on text attributes in order to find new associations between drugs. The service is freely available at http://bioinformatics.med.uoc.gr/drugquest

PubMed Central

eScholarship - University of California

Supplement (01-10)

Author: Iliopoulos Ioannis
Ouzounis Christos A.
Papanikolaou Nikolas
Psomopoulos Fotis E.
Siarkou Victoria I.
Tsaftaris Athanasios S.
Publication date: 22/05/2012

Supplements described in manuscript: S1 is sequence input (fasta format); S2 is time-ordered list of species; S3 is CAST output; S4 is parsed BLAST output; S5 is MCL output; S6 is BLAST output for unique genes; S7 is genome distance matrix; S8 is full sequence dataset from Figure 4 (fasta format); S9 is sequence dataset from Figure 5 (fasta format); S10 is count of new family contributions

Dryad Digital Repository (Duke University)