Maximal Extraction of Biological Information from Genetic Interaction Data

Author: AH Tong
AM Dudley
BL Drees
D Segre
David J. Galas
DJ Galas
DR Shook
Gregory W. Carter
GW Carter
H Sinha
HD Madhani
JM Gancedo
Joel S. Bader
KB Lengeler
KD Entian
L Avery
LM Steinmetz
LV Zhang
M Ashburner
M Li
M Schuldiner
O Carlborg
PD Grunwald
R Kelley
R Milo
RJ Taylor
RP Onge
S Jana
SR Collins
T Ideker
Timothy Galitski
W Zhong
Publication venue: Public Library of Science
Publication date: 01/01/2009

Targeted genetic perturbation is a powerful tool for inferring gene function in model organisms. Functional relationships between genes can be inferred by observing the effects of multiple genetic perturbations in a single strain. The study of these relationships, generally referred to as genetic interactions, is a classic technique for ordering genes in pathways, thereby revealing genetic organization and gene-to-gene information flow. Genetic interaction screens are now being carried out in high-throughput experiments involving tens or hundreds of genes. These data sets have the potential to reveal genetic organization on a large scale, and require computational techniques that best reveal this organization. In this paper, we use a complexity metric based in information theory to determine the maximally informative network given a set of genetic interaction data. We find that networks with high complexity scores yield the most biological information in terms of (i) specific associations between genes and biological functions, and (ii) mapping modules of co-functional genes. This information-based approach is an automated, unsupervised classification of the biological rules underlying observed genetic interactions. It might have particular potential in genetic studies in which interactions are complex and prior gene annotation data are sparse

Crossref

The Jackson Laboratory: The Mouseion at the JAXlibrary

Directory of Open Access Journals

PubMed Central

An agent-based simulation framework for complex systems

Author: Benso Alfredo
Di Carlo Stefano
Politano Gianfranco Michele Maria
Savino Alessandro
Ur Rehman Hafeez
Publication date: 01/01/2012

In this abstract we present a new approach to the simulation of complex systems such as biological interaction networks, chemical reactions, and ecosystems. It aims to overcome previously proposed analytical approaches that, because of several computational challenges, could not handle systems of realistic complexity. The proposed model is based on a set of agents interacting through a shared environment. Each agent functions independently of the others, and its behavior is driven only by its current status and the "content" of the surrounding environment. The environment is the only "data repository" and does not store the values of variables, only their presence and concentration. Each agent performs three main functions: (1) it samples the environment at random locations; (2) based on the distribution of the sampled data and a proper Transfer Function, it computes the rate at which the output values are generated; (3) it writes the output "products" at random locations. The environment is modeled as a Really Random Access Memory (R2AM). Data is written and sampled at random memory locations. Each memory location represents an atomic sample (a molecule, a chemical compound, a protein, an ion, ...). The presence and concentration of these samples constitute the environment data set. The environment can be sensitive to external stimuli (e.g., pH, temperature) and can include topological information to allow its partitioning (e.g., between nucleus and cytoplasm in a cell) and the modeling of sample "movements" within the environment. The proposed approach is easily scalable in both complexity and computational cost. Each module could implement a very simple object, such as a single chemical reaction, or a very complex process, such as the translation of a gene into a protein. At the same time, from the hardware point of view, the complexity of the objects implementing a single agent can range from a single software process to a dedicated computer or hardware platform
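The three agent functions described in the abstract (sample at random locations, apply a transfer function, write products at random locations) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the class names `Environment` and `Agent`, the linear transfer function, the sample size, and the rule that a write overwrites whatever occupies the slot are all assumptions made for the example.

```python
import random
from collections import Counter


class Environment:
    """Shared memory: each slot holds one atomic sample (a species name) or None."""

    def __init__(self, size, rng):
        self.slots = [None] * size
        self.rng = rng

    def sample(self, n):
        """Read n random locations and count the species found there."""
        picks = (self.slots[self.rng.randrange(len(self.slots))] for _ in range(n))
        return Counter(p for p in picks if p is not None)

    def write(self, species):
        """Write one sample at a random location (assumption: overwrites the slot)."""
        self.slots[self.rng.randrange(len(self.slots))] = species

    def concentration(self, species):
        """Fraction of slots currently holding the given species."""
        return self.slots.count(species) / len(self.slots)


class Agent:
    """Hypothetical agent: turns sampled inputs into output products."""

    def __init__(self, inputs, output, transfer, n_samples=20):
        self.inputs = inputs          # species the agent looks for
        self.output = output          # species the agent produces
        self.transfer = transfer      # maps sampled input fraction to output rate
        self.n_samples = n_samples

    def step(self, env):
        # 1. sample the environment at random locations
        counts = env.sample(self.n_samples)
        # 2. compute the output rate from the sampled distribution
        fraction = min(counts[s] for s in self.inputs) / self.n_samples
        n_out = round(self.transfer(fraction) * self.n_samples)
        # 3. write the output products at random locations
        for _ in range(n_out):
            env.write(self.output)
        return n_out


if __name__ == "__main__":
    rng = random.Random(0)
    env = Environment(1000, rng)
    for i in range(500):              # half of the memory starts as species "A"
        env.slots[i] = "A"
    agent = Agent(inputs=["A"], output="B", transfer=lambda f: f)
    for _ in range(50):
        agent.step(env)
    print("A:", env.concentration("A"), "B:", env.concentration("B"))
```

Note that the environment stores no variable values, only which samples are present; concentrations are derived by counting, which matches the abstract's description of the shared memory.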

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Semantic Clustering of Genomic Documents using GO Terms as Feature Set

Author: Dr. B.L.Shivakumar
V.Bhuvaneswari
Publication venue: Global Journals Inc. (US)
Publication date: 15/01/2012

Biological databases generate huge volumes of genomics and proteomics data. The sequence information is used by researchers to find similarities between genes and proteins and to find other related information. A genomic sequence database consists of a large number of attributes, stored as annotations that describe the sequences in XML format. A proper mechanism is needed to group the documents for information retrieval. Data mining techniques such as clustering and classification can be used to group the documents. The objective of the paper is to analyze the set of keywords that can serve as features for grouping the documents semantically. This paper focuses on clustering genomic documents based on both structural and content similarity. The structural similarity is found using the structural path between the documents. The semantic similarity is then found for the structurally similar documents. We propose a methodology to cluster genomic documents using sequence attributes without using the sequence data. The sequence attributes of genomic documents are analyzed using filter-based feature selection methods to find the relevant feature set for grouping similar documents. Based on the attribute ranking, we clustered similar documents using an All Keyword approach (KBA) and a GO Terms based approach (GOTA). The experimental clusters from the two approaches are validated by inferring biological meaning using Gene Ontology. From the results it was inferred that the all-keywords approach grouped documents based on the semantic meaning of Gene Ontology terms. The GO terms based approach grouped a larger number of documents without considering any other keywords, which is semantically relevant and reduces the complexity of the attributes considered. We claim that GO terms alone can be used as the feature set to group genomic documents with high similarity
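As a rough illustration of grouping documents using only GO terms as features, the sketch below treats each document as a set of GO identifiers and links documents whose sets overlap strongly. The abstract does not name a similarity measure or clustering algorithm, so the Jaccard index, the single-link grouping rule, the 0.5 threshold, and the function names are all assumptions chosen for the example; the document IDs are made up.

```python
def jaccard(a, b):
    """Jaccard index of two sets: shared terms divided by all distinct terms."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_by_go_terms(docs, threshold=0.5):
    """Group documents whose GO-term sets have Jaccard similarity >= threshold.

    docs maps a document ID to a set of GO term IDs. Grouping is single-link:
    two documents share a group if a chain of similar pairs connects them.
    """
    ids = list(docs)
    parent = {d: d for d in ids}      # union-find forest over document IDs

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if jaccard(docs[a], docs[b]) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for d in ids:
        groups.setdefault(find(d), set()).add(d)
    return sorted(sorted(g) for g in groups.values())


if __name__ == "__main__":
    # Illustrative documents; the GO IDs are used only as opaque feature labels.
    docs = {
        "d1": {"GO:0006412", "GO:0005840"},
        "d2": {"GO:0006412", "GO:0005840", "GO:0003735"},
        "d3": {"GO:0006281"},
    }
    print(group_by_go_terms(docs))
```

Here `d1` and `d2` share two of three distinct terms (similarity 2/3) and end up in one group, while `d3` shares none and stays alone.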

Global Journal of Computer Science and Technology (GJCST)

Exponential Random Graph Modeling for Complex Brain Networks

Author: A Rinaldo
AF Alexander-Bloch
AM Peiffer
BCM van Wijk
CJ Stam
D Meunier
DJ Watts
DR Hunter
DS Bassett
E Bullmore
G Gong
G Robins
GL Robins
H Akaike
KE Joyce
KE Muller
M Morris
M Rubinov
MAJ van Duijn
ME Lynall
MP van den Heuvel
MS Handcock
N Tzourio-Mazoyer
O Frank
Olaf Sporns
Paul J. Laurienti
PE Pattison
S Hayasaka
S Wasserman
Satoru Hayasaka
Sean L. Simpson
TAB Snijders
Y Iturria-Medina
ZA Gaal
ZM Saul
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2011

Exponential random graph models (ERGMs), also known as p* models, have been utilized extensively in the social science literature to study complex networks and how their global structure depends on underlying structural components. However, the literature on their use in biological networks (especially brain networks) has remained sparse. Descriptive models based on a specific feature of the graph (clustering coefficient, degree distribution, etc.) have dominated connectivity research in neuroscience. Corresponding generative models have been developed to reproduce one of these features. However, the complexity inherent in whole-brain network data necessitates the development and use of tools that allow the systematic exploration of several features simultaneously and how they interact to form the global network architecture. ERGMs provide a statistically principled approach to the assessment of how a set of interacting local brain network features gives rise to the global structure. We illustrate the utility of ERGMs for modeling, analyzing, and simulating complex whole-brain networks with network data from normal subjects. We also provide a foundation for the selection of important local features through the implementation and assessment of three selection approaches: a traditional p-value based backward selection approach, an information criterion approach (AIC), and a graphical goodness of fit (GOF) approach. The graphical GOF approach serves as the best method given the scientific interest in being able to capture and reproduce the structure of fitted brain networks
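For reference, the general form of an exponential random graph model and of the Akaike information criterion mentioned above can be written as follows. The notation is the conventional one and is an assumption here, since the abstract itself gives no formulas: $Y$ is the random network, $y$ an observed network, $g(y)$ a vector of network statistics (the local features), $\theta$ the corresponding parameter vector, $\mathcal{Y}$ the set of possible networks, $k$ the number of parameters, and $\hat{L}$ the maximized likelihood.

```latex
P_{\theta}(Y = y) = \frac{\exp\{\theta^{\top} g(y)\}}{\kappa(\theta)},
\qquad
\kappa(\theta) = \sum_{y' \in \mathcal{Y}} \exp\{\theta^{\top} g(y')\}

\mathrm{AIC} = 2k - 2 \ln \hat{L}
```

Each component of $g(y)$ counts or summarizes one local feature, so selecting features amounts to choosing which components to include in $g$; a lower AIC favors a model that fits well with fewer parameters.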

arXiv.org e-Print Archive

CiteSeerX

Public Library of Science (PLOS)

Crossref

PubMed Central

graphite - a Bioconductor package to convert pathway topology to gene network

Author: Calura Enrica
Cavalieri Duccio
Romualdi C.
Sales G
Publication date: 01/01/2012

BACKGROUND: Gene set analysis is moving towards considering pathway topology as a crucial feature. Pathway elements are complex entities such as protein complexes, gene family members and chemical compounds. The conversion of pathway topology to gene/protein networks (where each node is a simple element such as a gene or protein) is a critical and challenging task that enables topology-based gene set analyses. Unfortunately, currently available R/Bioconductor packages provide pathway networks only from single databases. They do not propagate signals through chemical compounds and do not differentiate between complexes and gene families. RESULTS: Here we present graphite, a Bioconductor package addressing these issues. Pathway information from four different databases is interpreted following specific biologically-driven rules that allow the reconstruction of gene-gene networks, taking into account protein complexes and gene families and sensibly removing chemical compounds from the final graphs. The resulting networks represent a uniform resource for pathway analyses. Indeed, graphite provides easy access to three recently proposed topological methods. The graphite package is available as part of the Bioconductor software suite. CONCLUSIONS: graphite is an innovative package able to gather and make easily available the contents of the four major pathway databases. In the field of topological analysis, graphite acts as a provider of biological information by reducing pathway complexity while considering the biological meaning of the pathway elements

Archivio istituzionale della ricerca - Fondazione Edmund Mach

Florence Research

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Archivio istituzionale della ricerca - Università di Padova

Single-molecule real-time sequencing combined with optical mapping yields completely finished fungal genome

Author: Datema Erwin
Faino Luigi
Janssen Antoine
Seidl Michael F.
Thomma Bart P. H. J.
Van Den Berg Grardy C. M.
Wittenberg Alexander H. J.
Publication venue: American Society for Microbiology
Publication date: 01/01/2015

Next-generation sequencing (NGS) technologies have increased the scalability, speed, and resolution of genomic sequencing and, thus, have revolutionized genomic studies. However, eukaryotic genome sequencing initiatives typically yield considerably fragmented genome assemblies. Here, we assessed various state-of-the-art sequencing and assembly strategies in order to produce a contiguous and complete eukaryotic genome assembly, focusing on the filamentous fungus Verticillium dahliae. Compared with Illumina-based assemblies of the V. dahliae genome, hybrid assemblies that also include PacBio-generated long reads establish superior contiguity. Intriguingly, provided that sufficient sequence depth is reached, assemblies solely based on PacBio reads outperform hybrid assemblies and even result in fully assembled chromosomes. Furthermore, the addition of optical map data allowed us to produce a gapless and complete V. dahliae genome assembly of the expected eight chromosomes from telomere to telomere. Consequently, we can now study genomic regions that were previously not assembled or poorly assembled, including regions that are populated by repetitive sequences, such as transposons, allowing us to fully appreciate an organism’s biological complexity. Our data show that a combination of PacBio-generated long reads and optical mapping can be used to generate complete and gapless assemblies of fungal genomes. IMPORTANCE Studying whole-genome sequences has become an important aspect of biological research. The advent of next-generation sequencing (NGS) technologies has nowadays brought genomic science within reach of most research laboratories, including those that study nonmodel organisms. However, most genome sequencing initiatives typically yield (highly) fragmented genome assemblies. Nevertheless, considerable relevant information related to genome structure and evolution is likely hidden in those nonassembled regions.
Here, we investigated a diverse set of strategies to obtain gapless genome assemblies, using the genome of a typical ascomycete fungus as the template. Eventually, we were able to show that a combination of PacBio-generated long reads and optical mapping yields a gapless telomere-to-telomere genome assembly, allowing in-depth genome analyses to facilitate functional studies into an organism’s biology

Directory of Open Access Journals

PubMed Central

Archivio della ricerca- Università di Roma La Sapienza

Entourage: Visualizing Relationships between Biological Pathways using Contextual Subsets

Author: Gratzl Samuel
Kalkofen Denis
Lex Alexander
Partl Christian
Pfister Hanspeter
Schmalstieg Dieter
Streit Marc
Wasserman Anne Mai
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2013

Biological pathway maps are highly relevant tools for many tasks in molecular biology. They reduce the complexity of the overall biological network by partitioning it into smaller manageable parts. While this reduction of complexity is their biggest strength, it is, at the same time, their biggest weakness. By removing what is deemed not important for the primary function of the pathway, biologists lose the ability to follow and understand cross-talks between pathways. Considering these cross-talks is, however, critical in many analysis scenarios, such as judging effects of drugs. In this paper we introduce Entourage, a novel visualization technique that provides contextual information lost due to the artificial partitioning of the biological network, but at the same time limits the presented information to what is relevant to the analyst’s task. We use one pathway map as the focus of an analysis and allow a larger set of contextual pathways. For these context pathways we only show the contextual subsets, i.e., the parts of the graph that are relevant to a selection. Entourage suggests related pathways based on similarities and highlights parts of a pathway that are interesting in terms of mapped experimental data. We visualize interdependencies between pathways using stubs of visual links, which we found effective yet not obtrusive. By combining this approach with visualization of experimental data, we can provide domain experts with a highly valuable tool. We demonstrate the utility of Entourage with case studies conducted with a biochemist who researches the effects of drugs on pathways. We show that the technique is well suited to investigate interdependencies between pathways and to analyze, understand, and predict the effect that drugs have on different cell types.

Harvard University - DASH

PubMed Central

The Novartis Repository