41 research outputs found

CODC: A Copula-based model to identify differential coexpression

Author: Bandyopadhyay S. (Sanghamitra)
Lall S. (Snehalika)
Ray S. (Sumanta)
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 20/06/2020

Differential coexpression has recently emerged as a new way to establish a fundamental difference in expression pattern among a group of genes between two populations. Earlier methods used scoring techniques to detect changes in the correlation pattern of a gene pair across two conditions. However, modeling differential coexpression by finding differences in the dependence structure of the gene pair has not previously been carried out. We exploit a copula-based framework to model differential coexpression between gene pairs in two different conditions. The copula is used to model the dependency between the expression profiles of a gene pair. For a gene pair, the distance between the two joint distributions produced by the copula serves as the measure of differential coexpression. We used five pan-cancer TCGA RNA-Seq datasets to evaluate the model, which outperforms the existing state of the art. Moreover, the proposed model can detect a mild change in the coexpression pattern across two conditions. For noisy expression data, the proposed method perf

CWI's Institutional Repository

Predicting potential drug targets and repurposable drugs for COVID-19 via a deep generative model for graphs

Author: Bandyopadhyay S. (Sanghamitra)
Lall S. (Snehalika)
Mukhopadhyay A. (Anirban)
Ray S. (Sumanta)
Schönhuth A. (Alexander)
Publication date: 05/07/2020

Coronavirus Disease 2019 (COVID-19) has created a worldwide pandemic. Repurposing drugs, already shown to be free of harmful side effects, for the treatment of COVID-19 patients is an important option in launching novel therapeutic strategies. Therefore, reliable molecule interaction data are a crucial basis, where drug-/protein-protein interaction networks establish invaluable, year-long carefully curated data resources. However, these resources have not yet been systematically exploited using high-performance artificial intelligence approaches. Here, we combine three networks, two of which are year-long curated, and one of which, on SARS-CoV-2-human host-virus protein interactions, was published only recently (30th of April 2020), yielding a novel network that puts drugs, human and virus proteins into mutual context. We apply Variational Graph AutoEncoders (VGAEs), which represent the most advanced deep learning based methodology for the analysis of data that are subject to network constraints. Reliable simulations confirm that we operate at utmost accuracy in terms of predicting missing links. We then predict hitherto unknown links between drugs and human proteins against which virus proteins preferably bind. The corresponding therapeutic agents present splendid starting points for exploring novel host-directed therapy (HDT) option

arXiv.org e-Print Archive

CWI's Institutional Repository

Combining Pareto-optimal clusters using supervised learning for identifying co-expressed genes

Author: A Horzyk
AA Alizadeh
AK Jain
Anirban Mukhopadhyay
AV Lukashin
C Xiang
CA Coello Coello
CW Hsu
D Dembele
DE Goldberg
DJ Lockhart
E Zitzler
I Davidson
J Handl
J Herrero
JC Bezdek
JT Tou
K Crammer
K Deb
M Hollander
MB Eisen
P Reymonda
P Rousseeuw
P Tamayo
R Sharan
RJ Cho
S Bandyopadhyay
S Bandyopadhyay
S Bandyopadhyay
S Bandyopadhyay
S Bandyopadhyay
S Chu
S Tavazoie
Sanghamitra Bandyopadhyay
SY Kim
SZ Selim
U Maulik
U Maulik
Ujjwal Maulik
V Vapnik
VR Iyer
X Wen
XL Xie
Y Xu
ZS Qin
Publication venue: BioMed Central
Publication date: 01/01/2009

Crossref

Springer - Publisher Connector

PubMed Central

MultiMiTar: A Novel Multi Objective Optimization based miRNA-Target Prediction Method

Author: A Grimson
A Krek
CC Chang
D Betel
F Xiao
GL Papadopoulos
K Deb
M Kertesz
M Maragkakis
M Peter
M Selbach
M Sturm
M Yousef
P Alexiou
R Friedman
Ramkrishna Mitra
S Bandyopadhyay
S Bandyopadhyay
S Wu
Sanghamitra Bandyopadhyay
SD Hsu
SM Johnson
T Ishida
Timothy Ravasi
V Rusinov
VN Vapnik
VN Vapnik
X Wang
Publication venue: Public Library of Science
Publication date: 15/09/2011

BACKGROUND: Machine learning based miRNA-target prediction algorithms often fail to obtain a balanced prediction accuracy in terms of both sensitivity and specificity, owing to the lack of a gold standard of negative examples, of miRNA-targeting site context specific relevant features, and of an efficient feature selection process. Moreover, all the sequence, structure and machine learning based algorithms are unable to distribute the true positive predictions preferentially at the top of the ranked list; hence the algorithms become unreliable to biologists. In addition, these algorithms fail to obtain a considerable combination of precision and recall for the target transcripts that are translationally repressed at the protein level. METHODOLOGY/PRINCIPAL FINDING: In this article, we introduce an efficient miRNA-target prediction system, MultiMiTar, a Support Vector Machine (SVM) based classifier integrated with a multiobjective metaheuristic based feature selection technique. The robust performance of the proposed method is mainly the result of using high quality negative examples and selecting biologically relevant miRNA-targeting site context specific features. The features are selected using a novel feature selection technique, AMOSA-SVM, that integrates the multi-objective optimization technique Archived Multi-Objective Simulated Annealing (AMOSA) and SVM. CONCLUSIONS/SIGNIFICANCE: MultiMiTar is found to achieve a much higher Matthews correlation coefficient (MCC) of 0.583 and average class-wise accuracy (ACA) of 0.8 compared to the other target prediction methods on a completely independent test data set. The MCC and ACA values obtained by the other algorithms range from -0.269 to 0.155 and from 0.321 to 0.582, respectively. Moreover, MultiMiTar shows a more balanced result in terms of precision and sensitivity (recall) for the translationally repressed data set than all the other existing methods. An important aspect is that the true positive predictions are distributed preferentially at the top of the ranked list, which makes MultiMiTar reliable for biologists. MultiMiTar is now available as an online tool at www.isical.ac.in/~bioinfo_miu/multimitar.htm. MultiMiTar software can be downloaded from www.isical.ac.in/~bioinfo_miu/multimitar-download.htm
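
As a hedged illustration of the two metrics quoted in this abstract, the sketch below computes the Matthews correlation coefficient and an average class-wise accuracy from a binary confusion matrix. It assumes that ACA means the mean of sensitivity and specificity, which the abstract does not spell out. The function names and the toy counts are hypothetical and are not taken from MultiMiTar.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Convention: return 0 when any marginal total is zero.
        return 0.0
    return (tp * tn - fp * fn) / denom


def average_classwise_accuracy(tp, tn, fp, fn):
    """Mean of per-class accuracies (ASSUMED definition of ACA):
    (sensitivity + specificity) / 2."""
    sensitivity = tp / (tp + fn)  # accuracy on the positive class
    specificity = tn / (tn + fp)  # accuracy on the negative class
    return (sensitivity + specificity) / 2


if __name__ == "__main__":
    # Toy counts chosen only for illustration.
    print(mcc(40, 30, 20, 10))
    print(average_classwise_accuracy(40, 30, 20, 10))
```

Both metrics stay informative when the positive and negative classes are imbalanced, which may be why the abstract reports them rather than plain accuracy.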

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Analyzing miRNA co-expression networks to explore TF-miRNA regulation

Author: A Krek
A Tanzer
AT Amin
BP Lewis
D Brown
D Karolchik
DO Perkins
DP Bartel
E Hornstein
G Wang
GN Brock
J Handl
J Lu
LC Laurent
LF Sempere
Malay Bhattacharyya
MB Eisen
N Slonim
P Chopra
P Datta
P Rousseeuw
R McGill
R Shalgi
S Bandyopadhyay
S Bandyopadhyay
S Baskerville
Sanghamitra Bandyopadhyay
Z Tian
Publication venue: BioMed Central
Publication date: 01/01/2009

Crossref

Springer - Publisher Connector

PubMed Central

PuTmiR: A database for extracting neighboring transcription factors of human microRNAs

Author: C Tibiche
CJ Braun
CW Hsu
D Brown
D Karolchik
DO Perkins
DP Bartel
FJ Slack
GA Calin
H Wang
HK Saini
HK Saini
J Lu
J Wang
KK Farh
Malay Bhattacharyya
N Liu
Q Cui
Q Cui
R Shalgi
RJ Roberts
S Bandyopadhyay
S Bandyopadhyay
S Fujita
S Griffiths-Jones
Sanghamitra Bandyopadhyay
T Fukao
V Matys
X Xie
Z Duan
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Some recent investigations in systems biology have revealed the existence of a complex regulatory network between genes, microRNAs (miRNAs) and transcription factors (TFs). In this paper, we focus on TF to miRNA regulation and provide a novel interface for extracting the list of putative TFs for human miRNAs. A putative TF of an miRNA is considered here to be one that binds within the close genomic locality of that miRNA with respect to its starting or ending base pair on the chromosome. Recent studies suggest that these putative TFs are possible regulators of those miRNAs. Description: The interface is built around two datasets that consist of the exhaustive lists of putative TFs binding in the 10 kb upstream region (USR) and downstream region (DSR) of human miRNAs, respectively. A web server, named PuTmiR, is designed. It provides an option for extracting the putative TFs for human miRNAs, as per the requirement of a user, based on genomic locality, i.e., any upstream or downstream region of interest less than 10 kb. The degree distributions of the number of putative TFs and miRNAs against each other for the 10 kb USR and DSR are analyzed from the data, and they reveal some interesting results. We also report a significant regulatory activity of the YY1 protein over a set of oncomiRNAs related to colon cancer. Conclusion: The interface provided by the PuTmiR web server is an important resource for analyzing the direct and indirect regulation of human miRNAs. While it is already established that miRNAs are regulated by TFs binding to their USR, this database might help to study whether an miRNA can also be regulated by TFs binding to its DSR.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Multi-Class Clustering of Cancer Subtypes through SVM Based Ensemble of Pareto-Optimal Solutions for Gene Marker Identification

Author: A Chlenski
A Strehl
AA Alizadeh
AK Jain
AK Jain
Alfons Navarro
Anirban Mukhopadhyay
C Coello Coello
DE Goldberg
E Zitzler
E Zitzler
F Hedborg
G Melino
J Han
J Handl
J Khan
K Crammer
K Deb
K Deb
K Deb
KL Schaefer
KP Kumar
KY Yeung
KY Yeung
L Fei
MCP de Souto
P Rousseeuw
P Tamayo
S Bandyopadhyay
S Bandyopadhyay
S Kilpinen
S Kwon
Sanghamitra Bandyopadhyay
T Ward
TR Golub
U Alon
U Maulik
U Maulik
U Maulik
Ujjwal Maulik
V Vapnik
Publication venue: Public Library of Science
Publication date: 12/11/2010

With the advancement of microarray technology, it is now possible to study the expression profiles of thousands of genes across different experimental conditions or tissue samples simultaneously. Microarray cancer datasets, organized in a samples-versus-genes fashion, are being used to classify tissue samples as benign or malignant, or into their subtypes. They are also useful for identifying potential gene markers for each cancer subtype, which helps in the successful diagnosis of particular cancer types. In this article, we present an unsupervised cancer classification technique based on multiobjective genetic clustering of the tissue samples. A real-coded encoding of the cluster centers is used, and cluster compactness and separation are simultaneously optimized. The resultant set of near-Pareto-optimal solutions contains a number of non-dominated solutions. A novel approach is proposed to combine the clustering information possessed by the non-dominated solutions through a Support Vector Machine (SVM) classifier. The final clustering is obtained by consensus among the clusterings yielded by different kernel functions. The performance of the proposed multiobjective clustering method has been compared with that of several other microarray clustering algorithms on three publicly available benchmark cancer datasets. Moreover, statistical significance tests have been conducted to establish the statistical superiority of the proposed clustering method. Furthermore, relevant gene markers have been identified using the clustering result produced by the proposed method and demonstrated visually. Biological relationships among the gene markers are also studied based on gene ontology. The results obtained are promising and may have an important impact in the area of unsupervised cancer classification as well as gene marker identification for multiple cancer subtypes.
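
The notion of non-dominated solutions used in this abstract can be sketched as follows. This is a minimal illustration of Pareto filtering only, not the authors' genetic clustering algorithm or SVM ensemble. It assumes each candidate clustering is scored by a pair `(compactness, -separation)` so that both objectives are minimized; the function names and the toy scores are hypothetical.

```python
def dominates(a, b):
    """Return True if objective vector a dominates b.

    All objectives are minimized: a must be no worse than b everywhere
    and strictly better in at least one objective.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(points):
    """Keep the points that no other point dominates (the Pareto front)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # Toy (compactness, -separation) scores for four candidate clusterings.
    candidates = [(1.0, -5.0), (2.0, -6.0), (2.5, -4.0), (1.5, -5.5)]
    print(non_dominated(candidates))
```

In this toy set, `(2.5, -4.0)` is dropped because `(1.0, -5.0)` is both more compact and better separated. The remaining three trade one objective against the other, so none of them can be called best without a further decision step such as the consensus the abstract describes.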

Public Library of Science (PLOS)

Crossref

PubMed Central

SFSSClass: an integrated approach for miRNA based tumor classification

Author: A Pasquinelli
A Tanay
AI Saeed
B Harfe
B Reinhart
D Bartel
G Calin
G Calin
I Kononenko
J Chou
J Lu
KY Yeung
M Lagos-Quintana
M Lagos-Quintana
Michael Q Zhang
MV Iorio
NC Lau
P Blower
R Lee
R Tibshirani
Ramkrishna Mitra
S Volinia
Sanghamitra Bandyopadhyay
SC Madiera
U Shankavaram
Ujjwal Maulik
Y Wang
Y Zheng
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: MicroRNA (miRNA) expression profiling data has recently been found to be particularly important in cancer research and can be used as a diagnostic and prognostic tool. Current approaches to tumor classification using miRNA expression data do not integrate the experimental knowledge available in the literature. A judicious integration of such knowledge with effective miRNA and sample selection through a biclustering approach could be an important step in improving the accuracy of tumor classification. Results: In this article, a novel classification technique called SFSSClass is developed that judiciously integrates a biclustering technique, SAMBA, for simultaneous feature (miRNA) and sample (tissue) selection (SFSS); a cancer-miRNA network that we have developed by mining the literature of experimentally verified cancer-miRNA relationships; and a classifier, uncorrelated shrunken centroid (USC). SFSSClass is used for classifying multiple classes of tumors and cancer cell lines. In one part of the investigation, poorly differentiated tumors (PDT) having non-diagnostic histological appearance are classified while training on more differentiated tumor (MDT) samples. The proposed method is found to outperform the best known accuracy in the literature on the experimental data sets. For example, while the best accuracy reported in the literature for classifying PDT samples is approximately 76.5%, the accuracy of SFSSClass is found to be approximately 82.3%. The advantage of incorporating biclustering integrated with the cancer-miRNA network is evident from the consistently better performance of SFSSClass (integration of SAMBA, cancer-miRNA network and USC) over USC (e.g., approximately 70.5% for SFSSClass versus approximately 58.8% in classifying a set of 17 MDT samples from 9 tumor types, approximately 91.7% for SFSSClass versus approximately 75% in classifying 12 cell lines from 6 tumor types, and approximately 82.3% for SFSSClass versus approximately 41.2% in classifying 17 PDT samples from 11 tumor types). Conclusion: In this article, we develop the SFSSClass algorithm, which judiciously integrates a biclustering technique for simultaneous feature (miRNA) and sample (tissue) selection, the cancer-miRNA network and a classifier. The novel integration of experimental knowledge with computational tools efficiently selects relevant features that have high intra-class and low inter-class similarity. The performance of SFSSClass is found to be significantly improved with respect to the other existing approaches.

Crossref

Cold Spring Harbor Laboratory Institutional Repository

Springer - Publisher Connector

PubMed Central

A Novel Biclustering Approach to Association Rule Mining for Predicting HIV-1–Human Protein Interactions

Author: A Ben-Hur
A Mukhopadhyay
A Mukhopadhyay
A Mukhopadhyay
A Panchenko
A Prelic
AL DeFranco
Anirban Mukhopadhyay
B Goethals
C Zhou
D Gibellini
F Supek
H Vashistha
J Doolittle
J Hipp
J Huang
J Jiang
JI MacPherson
L Zhang
MD Dyer
MJ Zaki
MR Arkin
N Lin
N Pasquier
O Tastan
P Gupta
Peter Csermely
R Agrawal
R Agrawal
R Cheung
R Jansen
RG Ptak
RN Saha
S Bandyopadhyay
Sanghamitra Bandyopadhyay
SC Madeira
U Maulik
U Maulik
U Maulik
U Maulik
Ujjwal Maulik
W Fu
X Wang
Y Qi
Y Qi
Y Yamanishi
Publication venue: Public Library of Science
Publication date: 23/04/2012

Identification of potential viral-host protein interactions is a vital and useful approach towards the development of new drugs targeting those interactions. Computational tools are now being used to predict viral-host interactions. Recently, a database containing records of experimentally validated interactions between a set of HIV-1 proteins and a set of human proteins has been published. The problem of predicting new interactions based on this database is usually posed as a classification problem. However, posing it as a classification problem suffers from the lack of biologically validated negative interactions. It is therefore beneficial to use the existing database to predict new viral-host interactions without the need for negative samples. Motivated by this, in this article the HIV-1–human protein interaction database has been analyzed using association rule mining. The main objective is to identify a set of association rules both among the HIV-1 proteins and among the human proteins, and to use these rules for predicting new interactions. To this end, a novel association rule mining technique based on biclustering has been proposed for discovering frequent closed itemsets, followed by the association rules, from the adjacency matrix of the HIV-1–human interaction network. Novel HIV-1–human interactions have been predicted based on the discovered association rules and tested for biological significance. To validate the predicted interactions, gene ontology-based and pathway-based studies have been performed. These studies show that the human proteins predicted to interact with a particular viral protein share many common biological activities. Moreover, a literature survey has been used to identify some predicted interactions that are already validated experimentally but are not present in the database. Comparison with other prediction methods is also discussed.
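
To make the association-rule vocabulary in this abstract concrete, the sketch below computes the standard support and confidence of a rule over a small set of transactions. It does not reproduce the biclustering-based mining of frequent closed itemsets that the article proposes. It assumes each transaction is the set of viral proteins that one human protein interacts with; the labels `v1` to `v3` and the toy data are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent:
    support(antecedent and consequent together) / support(antecedent)."""
    s_antecedent = support(transactions, antecedent)
    if s_antecedent == 0:
        return 0.0  # rule is undefined; report 0 by convention
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / s_antecedent


if __name__ == "__main__":
    # Toy data: one set of (hypothetical) viral proteins per human protein.
    transactions = [{"v1", "v2"}, {"v1", "v2", "v3"}, {"v1"}, {"v2", "v3"}]
    print(support(transactions, {"v1", "v2"}))
    print(confidence(transactions, {"v1"}, {"v2"}))
```

A rule with high confidence, such as `v1 -> v2` here, suggests that a human protein known to interact with `v1` but not yet recorded with `v2` is a candidate for a new interaction. That is the prediction step the abstract describes, and it needs no negative examples.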

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

FigShare

Fuzzy based Impulse Noise Reduction Method

Author: AM Mirza
Ayyaz Hussain
C-S Lee
C-S Lee
G Resconi
HL Eng
I Pitas
J Astola
JH Wang
JW Tukey
K Arakawa
M Senthil Arumugam
M. Arfan Jaffar
P Liu
Pitas
S Schulte
S Schulte
SJ Ko
Sohail Masood Bhatti
Sriparna Saha
Sanghamitra Bandyopadhyay
Z Wang
Publication venue: 'Springer Science and Business Media LLC'

Crossref