
59,649 research outputs found

BiofilmGeneSet: Leveraging Multi-Omics Data Mining and ICA To Discover Biofilm Stage Genes of Interest from Condition-Specific Expression Dataset

Author: Alaba Mathew Olakunle
Publication venue: USD RED
Publication date: 01/01/2022

Biofilm formation proceeds through attachment, colony, maturation, and dispersion stages. Understanding the molecular basis of each stage is essential to developing efficient diagnostic devices and effective antibiofilm agents. Gene expression data provide molecular insight into both static and temporal biofilm development. The most widely used analytic techniques for biofilm gene expression data are clustering and network inference algorithms, which group genes with similar expression across the samples. However, these methods have two inherent limitations. First, they do not capture genes expressed in only a subset of the samples, and such subsets might be unique to a developmental stage, for example. Second, they assign each gene to exactly one class. This non-overlapping assignment loses information because gene expression is combinatorial, and a gene product can participate, to varying degrees, in several pathways simultaneously. In this study, I developed an analysis framework, referred to as BiofilmGeneSet, to classify genes that contribute significantly to biofilm developmental stages. I applied the JADE algorithm to expression data (X) to extract statistically independent expression modules (S) and their module activity (A). Next, Pearson correlation coefficients between the module activity and expression profile were computed to determine significant modules. BioNERO, an all-in-one Bioconductor package for comprehensive and easy biological network reconstruction, was applied to the same data to evaluate the performance of this workflow. Of the 15 independent expression modules, modules 14, 11, and 4 were significantly associated with the attachment, colony, and maturation stages.
The significance of this work can be summarized as follows: (i) a new data mining and gene expression classification framework with high accuracy compared to weighted gene co-expression network methods for problem-based gene set identification; (ii) a new gene set as a potential biomarker for each biofilm development stage; (iii) the generalization of our framework allows us to find gene sets relevant to several other related biological events such as quorum sensing, EPS, antibiotic resistance, etc.; (iv) a relevant functional annotation that will guide scientists in designing experiments to validate our newly discovered marker gene sets.
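
The correlation step described in this abstract can be sketched as follows. This is a minimal illustration, not the author's implementation: the module names, the toy activity values, and the use of a binary stage indicator as the profile to correlate against are all assumptions made for the example, and the ICA step itself (JADE in the study) is not reproduced here.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant vector carries no correlation information
    return cov / (sd_x * sd_y)


def best_module_per_stage(activity, stage_labels):
    """For each stage, pick the module whose activity correlates most
    strongly (in absolute value) with a 0/1 indicator of that stage.

    activity: dict mapping a (hypothetical) module name to its activity
              across samples; stage_labels: stage of each sample.
    """
    result = {}
    for stage in sorted(set(stage_labels)):
        indicator = [1.0 if s == stage else 0.0 for s in stage_labels]
        scores = {m: pearson(a, indicator) for m, a in activity.items()}
        best = max(scores, key=lambda m: abs(scores[m]))
        result[stage] = (best, scores[best])
    return result


# Toy data (assumed, not from the study): two samples per stage.
stages = ["attachment"] * 2 + ["colony"] * 2 + ["maturation"] * 2
activity = {
    "module_a": [2.0, 1.8, 0.1, 0.0, -0.1, 0.2],
    "module_b": [0.0, 0.1, 1.9, 2.1, 0.2, 0.0],
    "module_c": [0.1, -0.1, 0.0, 0.2, 2.2, 1.7],
}
assignments = best_module_per_stage(activity, stages)
```

In practice a significance test on each coefficient would be needed before calling a module stage-associated; the sketch only ranks modules by correlation strength.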

USD RED (University of South Dakota)

Integration of molecular network data reconstructs Gene Ontology.

Author: Gligorijević V
Janjić V
Pržulj N
Publication venue: 'Oxford University Press (OUP)'
Publication date: 22/08/2014

Motivation: Recently, a shift was made from using Gene Ontology (GO) to evaluate molecular network data to using these data to construct and evaluate GO. Dutkowski et al. provide the first evidence that a large part of GO can be reconstructed solely from topologies of molecular networks. Motivated by this work, we develop a novel data integration framework that integrates multiple types of molecular network data to reconstruct and update GO. We ask how much of GO can be recovered by integrating various molecular interaction data. Results: We introduce a computational framework for integration of various biological networks using penalized non-negative matrix tri-factorization (PNMTF). It takes all network data in a matrix form and performs simultaneous clustering of genes and GO terms, inducing new relations between genes and GO terms (annotations) and between GO terms themselves. To improve the accuracy of our predicted relations, we extend the integration methodology to include additional topological information represented as the similarity in wiring around non-interacting genes. Surprisingly, by integrating topologies of baker's yeast protein–protein interaction, genetic interaction (GI) and co-expression networks, our method reports as related 96% of GO terms that are directly related in GO. The inclusion of the wiring similarity of non-interacting genes contributes 6% to this large GO term association capture. Furthermore, we use our method to infer new relationships between GO terms solely from the topologies of these networks and validate 44% of our predictions in the literature. In addition, our integration method reproduces 48% of cellular component, 41% of molecular function and 41% of biological process GO terms, outperforming the previous method in the former two domains of GO. Finally, we predict new GO annotations of yeast genes and validate our predictions through GI profiling.
Availability and implementation: Supplementary Tables of new GO term associations and predicted gene annotations are available at http://bio-nets.doc.ic.ac.uk/GO-Reconstruction/. Contact: [email protected]. Supplementary information: Supplementary data are available at Bioinformatics online.
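
For orientation, a generic penalized non-negative matrix tri-factorization objective is written out below. It is a textbook-style form under stated assumptions, not necessarily the exact formulation used in the paper: $R$ stands for a relation matrix between genes and GO terms, $G_1$ and $G_2$ for non-negative cluster indicator matrices, $S$ for a compressed relation matrix, $L_1$ and $L_2$ for graph Laplacians built from the network constraints, and $\gamma$ for a penalty weight. All of these symbols are introduced here for illustration.

```latex
\min_{G_1 \ge 0,\; G_2 \ge 0,\; S}\;
  \left\lVert R - G_1 S G_2^{\top} \right\rVert_F^{2}
  + \gamma \left[ \operatorname{tr}\!\left(G_1^{\top} L_1 G_1\right)
  + \operatorname{tr}\!\left(G_2^{\top} L_2 G_2\right) \right]
```

The first term asks the factors to reconstruct the observed relations, and the trace terms penalize solutions that place connected nodes in different clusters.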

PubMed Central

Spiral - Imperial College Digital Repository

Information criterion-based clustering with order-restricted candidate profiles in short time-course microarray experiments

Author: Lin Nan
Liu Tianqing
Shi Ningzhong
Zhang Baoxue
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Time-course microarray experiments produce vector gene expression profiles across a series of time points. Clustering genes based on these profiles is important for discovering functionally related and co-regulated genes. Early clustering algorithms do not take advantage of the ordering in a time-course study, explicit use of which should allow more sensitive detection of genes that display a consistent pattern over time. Peddada et al. [1] proposed a clustering algorithm that can incorporate the temporal ordering using order-restricted statistical inference. This algorithm is, however, very time-consuming and hence inapplicable to most microarray experiments that contain a large number of genes. Its computational burden also makes it difficult to assess the clustering reliability, which is a very important measure when clustering noisy microarray data. Results: We propose a computationally efficient information criterion-based clustering algorithm, called ORICC, that also takes account of the ordering in time-course microarray experiments by embedding the order-restricted inference into a model selection framework. Each gene is assigned to the profile it best matches, as determined by a newly proposed information criterion for order-restricted inference. In addition, we developed a bootstrap procedure to assess ORICC's clustering reliability for every gene. Simulation studies show that the ORICC method is robust, always gives better clustering accuracy than Peddada's method and reduces computation time by a factor of hundreds. Under some scenarios, its accuracy is also better than that of some other existing clustering methods for short time-course microarray data, such as STEM [2] and Wang et al. [3]. It is also computationally much faster than Wang et al. [3].
Conclusion: Our ORICC algorithm, which takes advantage of the temporal ordering in time-course microarray experiments, provides good clustering accuracy while being much faster than Peddada's method. Moreover, the clustering reliability for each gene can also be assessed, which is unavailable in Peddada's method. In a real data example, the ORICC algorithm identifies new and interesting genes that previous analyses failed to reveal.
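
The idea of assigning each gene to the order-restricted profile it best matches can be illustrated with a much simplified sketch. It is not the ORICC algorithm: only three candidate profiles (flat, increasing, decreasing) are considered, the monotone fits use the pool-adjacent-violators algorithm, and a BIC-like penalty stands in for the paper's information criterion. The function names and toy profiles are hypothetical.

```python
import math


def pava_increasing(y):
    """Least-squares non-decreasing fit (pool-adjacent-violators).

    Returns the fitted values and the number of pooled blocks."""
    blocks = []  # each block holds [sum of values, number of values]
    for v in y:
        blocks.append([v, 1])
        # Merge backwards while the previous block mean exceeds the last one.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fit = []
    for s, c in blocks:
        fit.extend([s / c] * c)
    return fit, len(blocks)


def pava_decreasing(y):
    """Non-increasing fit, obtained by negating the data."""
    fit, k = pava_increasing([-v for v in y])
    return [-v for v in fit], k


def criterion(y, fit, k):
    """BIC-like score (an assumption, not the ORICC criterion): lower is better."""
    n = len(y)
    rss = sum((a - b) ** 2 for a, b in zip(y, fit))
    return n * math.log((rss + 1e-12) / n) + k * math.log(n)


def assign_profile(y):
    """Return the name of the candidate profile with the lowest score."""
    n = len(y)
    mean = sum(y) / n
    inc_fit, inc_k = pava_increasing(y)
    dec_fit, dec_k = pava_decreasing(y)
    scores = {
        "flat": criterion(y, [mean] * n, 1),
        "increasing": criterion(y, inc_fit, inc_k),
        "decreasing": criterion(y, dec_fit, dec_k),
    }
    return min(scores, key=scores.get)


# Toy six-point time courses (assumed values).
rising = [0.1, 0.5, 0.4, 1.2, 1.9, 2.5]
falling = list(reversed(rising))
noisy_flat = [1.0, 1.2, 0.9, 1.1, 0.95, 1.05]
labels = [assign_profile(g) for g in (rising, falling, noisy_flat)]
```

The penalty on the number of fitted levels is what keeps a noisy flat gene from being labeled monotone just because a monotone fit has slightly smaller residuals.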

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Digital Commons@Becker

Preparation and characterization of magnetite (Fe3O4) nanoparticles By Sol-Gel method

Author: A. Sekak Khairunnadim
Asman Saliza
Mustafa Mohd K.
Takai Zakiyyu I.
Publication venue: Universiti Malaysia Perlis (UniMAP)
Publication date: 01/01/2019

Magnetite (Fe3O4) nanoparticles were successfully synthesized and annealed under vacuum at different temperatures. The Fe3O4 nanoparticles, prepared via a sol-gel assisted method and annealed at 200-400ºC, were characterized by Fourier Transform Infrared Spectroscopy (FTIR), X-ray Diffraction (XRD), Field Emission Scanning Electron Microscopy (FESEM) and Atomic Force Microscopy (AFM). The XRD results indicate the presence of Fe3O4 nanoparticles, and the mean particle size calculated with Scherrer's formula was in the range of 2-25 nm. The FESEM results show that the particles annealed at 400ºC are more spherical and partially agglomerated, while the EDS results indicate the presence of Fe3O4 by showing the Fe-O group of elements. AFM was used to analyze the 3D topography and roughness of the sample; the Fe3O4 nanoparticles have a minimum diameter of 79.04 nm, which is in agreement with the FESEM result. According to some literature, the synthesis of Fe3O4 nanoparticles using FeCl3 and FeCl2 has in many cases not been achieved, but this research obtained Fe3O4 nanoparticles, based on the characterization results.
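
The particle-size estimate from XRD mentioned in this abstract relies on Scherrer's formula, D = Kλ / (β cos θ), where β is the peak width (FWHM) in radians and θ is half the diffraction angle 2θ. A minimal sketch follows. The default wavelength (Cu Kα, 0.15406 nm), the shape factor K = 0.9, and the example peak position and width are illustrative assumptions and are not taken from the paper.

```python
import math


def scherrer_size_nm(two_theta_deg, fwhm_deg,
                     wavelength_nm=0.15406, shape_factor=0.9):
    """Crystallite size in nm from one XRD peak via Scherrer's formula.

    two_theta_deg: peak position 2-theta in degrees
    fwhm_deg:      full width at half maximum of the peak in degrees
    wavelength_nm: X-ray wavelength (default assumes Cu K-alpha)
    shape_factor:  dimensionless constant K (0.9 is a common choice)
    """
    theta = math.radians(two_theta_deg / 2.0)  # Bragg angle
    beta = math.radians(fwhm_deg)              # peak width in radians
    return shape_factor * wavelength_nm / (beta * math.cos(theta))


# Hypothetical peak: 2-theta = 35.5 degrees with a 0.5 degree FWHM.
size_nm = scherrer_size_nm(35.5, 0.5)
```

Broader peaks give smaller sizes, so instrumental broadening should be subtracted from the measured width before applying the formula to real data.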

UTHM Institutional Repository

SMART: Unique splitting-while-merging framework for gene clustering

Author: A Thalamuthu
AD Lanterman
AE Teschendorff
AK Jain
Asoke K. Nandi
B Abu-Jamous
B Fritzke
CR Lin
CS Wallace
D Dembele
D Jiang
David J. Roberts
G Celeux
H Akaike
J Qin
J Rissanen
KY Yeung
L Hubert
L Mavridis
L Zhao
MAT Figueiredo
P Tamayo
PT Spellman
R Xu
RJ Cho
Rui Fa
S Bandyopadhyay
S Monti
S Wu
Sergio Gómez
T Kohonen
T Pramila
TR Golub
WM Rand
YJ Zhang
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 08/04/2014

Copyright @ 2014 Fa et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Successful clustering algorithms are highly dependent on parameter settings. The clustering performance degrades significantly unless parameters are properly set, and yet it is difficult to set these parameters a priori. To address this issue, in this paper, we propose a unique splitting-while-merging clustering framework, named “splitting merging awareness tactics” (SMART), which does not require any a priori knowledge of either the number of clusters or even the possible range of this number. Unlike existing self-splitting algorithms, which over-cluster the dataset into a large number of clusters and then merge some similar clusters, our framework has the ability to split and merge clusters automatically during the process and produces the most reliable clustering results, by intrinsically integrating many clustering techniques and tasks. The SMART framework is implemented with two distinct clustering paradigms in two algorithms: competitive learning and finite mixture model. Nevertheless, within the proposed SMART framework, many other algorithms can be derived for different clustering paradigms. The minimum message length algorithm is integrated into the framework as the clustering selection criterion. The usefulness of the SMART framework and its algorithms is tested on demonstration datasets and simulated gene expression datasets. Moreover, two real microarray gene expression datasets are studied using this approach. Across many performance metrics, all numerical results show that SMART is superior to the existing self-splitting algorithms and traditional algorithms it was compared with.
Three main properties of the proposed SMART framework are summarized as: (1) needing no parameters dependent on the respective dataset or a priori knowledge about the datasets, (2) extendible to many different applications, (3) offering superior performance compared with counterpart algorithms. National Institute for Health Researc

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Brunel University Research Archive


ManiNetCluster: a novel manifold learning approach to reveal the functional links between gene networks.

Author: Blaby Ian K
Nguyen Nam D
Wang Daifeng
Publication venue: eScholarship, University of California
Publication date: 01/12/2019

BACKGROUND: The coordination of genomic functions is a critical and complex process across biological systems such as phenotypes or states (e.g., time, disease, organism, environmental perturbation). Understanding how the complexity of genomic function relates to these states remains a challenge. To address this, we have developed a novel computational method, ManiNetCluster, which simultaneously aligns and clusters gene networks (e.g., co-expression) to systematically reveal the links of genomic function between different conditions. Specifically, ManiNetCluster employs manifold learning to uncover and match local and non-linear structures among networks, and identifies cross-network functional links. RESULTS: We demonstrated that ManiNetCluster better aligns the orthologous genes from their developmental expression profiles across model organisms than state-of-the-art methods (p-value < 2.2×10^-16). This indicates the potential non-linear interactions of evolutionarily conserved genes across species in development. Furthermore, we applied ManiNetCluster to time series transcriptome data measured in the green alga Chlamydomonas reinhardtii to discover the genomic functions linking various metabolic processes between the light and dark periods of a diurnally cycling culture. We identified a number of genes putatively regulating processes across each lighting regime. CONCLUSIONS: ManiNetCluster provides a novel computational tool to uncover the genes linking various functions from different networks, providing new insight on how gene functions coordinate across different conditions. ManiNetCluster is publicly available as an R package at https://github.com/daifengwanglab/ManiNetCluster

eScholarship - University of California

Joint co-clustering: co-clustering of genomic and clinical bioimaging data

Author: Aho
Amann
Aurenhammer
Brabender
Brey
Bullinger
Chen
Dacic
Draghici
Elisa Ficarra
Elmoataz
Enrico Macii
Ficarra
Gebhardt
Giovanni De Micheli
Hengerer
Jacob
Kersting
Kittler
Luca Benini
Malpica
McInerney
Mukherjee
Otsu
Ridler
Ruifrok
Saviozzi
Sungroh Yoon
Suzuki
Taneja
Tidow
Troyanskaya
Tusher
Yang
Yoon
Zheng
Zhou
Publication venue: Elsevier
Publication date: 01/01/2007

Abstract: To better understand the genetic mechanisms underlying clinical observations, and to better define a group of potential candidates for protein-family-inhibiting therapy, it is of interest to determine the correlations between genomic data, clinical data and data coming from high-resolution and fluorescence microscopy. We introduce a computational method, called joint co-clustering, that can find co-clusters or groups of genes, bioimaging parameters and clinical traits that are believed to be closely related to each other based on the given empirical information. As bioimaging parameters, we quantify the expression of the growth factor receptor EGFR/erb-B family in non-small cell lung carcinoma (NSCLC) through a fully automated computer-aided analysis approach. This immunohistochemical analysis is usually performed by pathologists via visual inspection of tissue sample images. Our fully automated technique streamlines this error-prone and time-consuming process, thereby facilitating analysis and diagnosis. Experimental results for several real-life datasets demonstrate the high quantitative precision of our approach. The joint co-clustering method was tested with the receptor EGFR/erb-B family data on NSCLC tissue and identified statistically significant co-clusters of genes, receptor protein expression and clinical traits. The validation of our results against the literature suggests that the proposed method can provide biologically meaningful co-clusters of genes and traits and that it is a very promising approach to analyse large-scale biological data and to study multi-factorial genetic pathologies through their genetic alterations.

Infoscience - École polytechnique fédérale de Lausanne

Elsevier - Publisher Connector

Crossref

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

PORTO Publications Open Repository TOrino