Genome-wide Predictive Simulation on the Effect of Perturbation and the Cause of Phenotypic variations with Network Biology Approach

Author: Jang In Sock
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2012

Thanks to modern high-throughput technologies such as microarray-based gene expression profiling, a large amount of molecular profile data has been generated in several disease-related contexts. Although these data likely contain systems-level information about disease regulation, revealing the underlying dynamics between genes and the mechanisms of gene regulation in a genome-wide way remains a major challenge. Understanding these mechanisms in a genome-wide fashion, and the resulting dynamical behavior, is a key goal of the nascent field of systems biology. One approach to dissecting the logic of the cell is to use reverse engineering algorithms that infer regulatory interactions from molecular profile data. In this context, information-theoretic approaches have been very successful: for instance, the ARACNe algorithm has inferred transcriptional interactions between transcription factors and their target genes; similarly, the MINDy algorithm has identified post-translational modulators of transcription factor activity by multivariate analysis of large gene expression profile datasets. Many methods have been proposed to improve ARACNe, both in computational efficiency and in the accuracy of the predicted interactions. Yet the core of ARACNe, i.e., the data processing inequality (DPI), has remained virtually unchanged even though modern information theory has extended the DPI theorem to higher-order interactions. First, we introduce an improvement of ARACNe, hARACNe, which recursively applies a higher-order DPI analysis. We show that the new algorithm successfully detects false positive feed-forward loops involving more than three genes. Second, we extend the MINDy algorithm using co-information as a novel metric, thus replacing the conditional mutual information and significantly improving the algorithm's predictions.
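As a rough illustration of the data processing inequality step mentioned above, the sketch below prunes the weakest edge in every fully connected gene triplet. It shows only the standard three-gene DPI idea, not the higher-order hARACNe analysis described in the thesis; the function name `dpi_prune`, the `tolerance` parameter, and the input layout (a dictionary of precomputed mutual information estimates) are assumptions made for this example.

```python
from itertools import combinations


def dpi_prune(mi, tolerance=0.0):
    """Drop the weakest edge of each fully connected triplet (three-gene DPI).

    mi: dict mapping frozenset({gene_a, gene_b}) -> mutual information estimate.
    tolerance: hypothetical slack; 0.0 means any strictly weakest edge is dropped.
    """
    genes = sorted({g for edge in mi for g in edge})
    to_remove = set()
    for a, b, c in combinations(genes, 3):
        edges = [frozenset((a, b)), frozenset((b, c)), frozenset((a, c))]
        if not all(e in mi for e in edges):
            continue  # DPI only applies when all three edges are present
        weakest = min(edges, key=lambda e: mi[e])
        others = [mi[e] for e in edges if e != weakest]
        # The weakest edge is treated as an indirect interaction.
        if mi[weakest] < min(others) * (1.0 - tolerance):
            to_remove.add(weakest)
    # Removal is decided on the original network, then applied in one pass.
    return {e: v for e, v in mi.items() if e not in to_remove}


network = {
    frozenset(("TF1", "G1")): 0.9,
    frozenset(("G1", "G2")): 0.8,
    frozenset(("TF1", "G2")): 0.3,  # likely indirect: TF1 -> G1 -> G2
}
pruned = dpi_prune(network)
```

In this toy network the TF1/G2 edge has the lowest mutual information in its triplet, so it is the one removed.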
Broadly, the two ultimate goals of systems perturbation studies are to reveal how human diseases are connected with genes and to find the regulatory mechanisms that determine disease cell behavior. However, these goals remain daunting: even the most talented researchers still have to rely on laborious genetic screens, and only very simplified hypotheses about the effects of a given perturbation have been experimentally validated, and these have been analyzed only roughly, using very limited regulatory sub-networks such as pathways. To overcome these limitations, this thesis research explores the use of gene regulatory networks. Specifically, we propose a new algorithm that can accurately predict cell state in a genome-wide fashion following perturbation of individual genes, such as in silencing or ectopic expression experiments. Experimentally validated methods to predict genome-wide changes in a cellular system following a genetic perturbation are still unavailable, and even though phenotypic variations are experimentally profiled and gene signatures are selected by statistical testing, finding the exact regulator that systematically causes significant variations in a gene signature is still quite challenging. In this research, I introduce and experimentally validate a probabilistic Bayesian method to simulate the propagation of genetic perturbations on integrated gene regulatory networks inferred by the hARACNe and coMINDy algorithms from human B cell data. With the same predictive framework, we also computationally predict the master driver (regulator) that is most likely to have produced the observed variations in gene expression levels; these studies serve as a systematized pre-screening process before genetic manipulation. I predict in silico the effect of silencing several genes as well as the cause of phenotypic variations.
Performance analysis, tested by Gene Set Enrichment Analysis (GSEA), shows that the new methods are highly predictive, thus providing an initial step toward building predictive probabilistic regulatory models, which may be applicable as pre-screening steps in perturbation studies.

Columbia University Academic Commons

Improving Breast Cancer Survival Analysis through Competition-Based Multidimensional Modeling

Author: Alvarez Mariano Javier
Aparicio Samuel
Bilal Erhan
Børresen-Dale Anne-Lise
Caldas Carlos
Califano Andrea
Curtis Christina
Dutkowski Janusz
Friend Stephen H.
Guinney Justin
Ideker Trey
Jang In Sock
Kristensen Vessela N.
Logsdon Benjamin A.
Margolin Adam A.
Mecham Brigham H.
Pandey Gaurav
Rueda Oscar M.
Sauerwine Benjamin A.
Schadt Eric E.
Shimoni Yishai
Stolovitzky Gustavo A.
Tost Jorg
Vollan Hans Kristian Moen
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2013

Breast cancer is the most common malignancy in women and is responsible for hundreds of thousands of deaths annually. As with most cancers, it is a heterogeneous disease and different breast cancer subtypes are treated differently. Understanding the difference in prognosis for breast cancer based on its molecular and phenotypic features is one avenue for improving treatment by matching the proper treatment with molecular subtypes of the disease. In this work, we employed a competition-based approach to modeling breast cancer prognosis using large datasets containing genomic and clinical information and an online real-time leaderboard program used to speed feedback to the modeling team and to encourage each modeler to work towards achieving a higher ranked submission. We find that machine learning methods combined with molecular features selected based on expert prior knowledge can improve survival predictions compared to current best-in-class methodologies and that ensemble models trained across multiple user submissions systematically outperform individual models within the ensemble. We also find that model scores are highly consistent across multiple independent evaluations. This study serves as the pilot phase of a much larger competition open to the whole research community, with the goal of understanding general strategies for model optimization using clinical and molecular profiling data and providing an objective, transparent system for assessing prognostic models

Crossref

Columbia University Academic Commons

Directory of Open Access Journals

PubMed Central

Community assessment to advance computational prediction of cancer drug combinations in a pharmacogenomic screen

Author: Bence Szalai
Dennis Wang
Elias Chaibub Neto
Eric K.Y. Tang
Giovanni Y. Di Veroli
Gustavo Stolovitzky
In Sock Jang
Jaewoo Kang
Jonathan R. Dry
Julio Saez-Rodriguez
Justin Guinney
Krishna C. Bulusu
Mathew J. Garnett
Mehmet Eren Ahsen
Michael P. Menden
Mike J. Mason
Mikhail Zaslavskiy
Minji Jeon
Robert Vogel
Russ Wolfinger
Stephen Fawell
Thea Norman
Thomas Yu
Tin Nguyen
Yanfang Guan
Zara Ghazoui
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 28/10/2022

The effectiveness of most cancer targeted therapies is short-lived. Tumors often develop resistance that might be overcome with drug combinations. However, the number of possible combinations is vast, necessitating data-driven approaches to find optimal patient-specific treatments. Here we report AstraZeneca's large drug combination dataset, consisting of 11,576 experiments from 910 combinations across 85 molecularly characterized cancer cell lines, and results of a DREAM Challenge to evaluate computational strategies for predicting synergistic drug pairs and biomarkers. 160 teams participated, providing comprehensive methodological development and benchmarking. Winning methods incorporate prior knowledge of drug-target interactions. Synergy is predicted with an accuracy matching biological replicates for >60% of combinations. However, 20% of drug combinations are poorly predicted by all methods. Genomic rationales for synergy predictions are identified, including ADAM17 inhibitor antagonism when combined with PIK3CB/D inhibition, in contrast to synergy when combined with other PI3K-pathway inhibitors in PIK3CA mutant cells.

UTUPub

Molecular Beam Epitaxy of Highly Crystalline Monolayer Molybdenum Disulfide on Hexagonal Boron Nitride

Author: Ding Zijing
Fu Deyi
Fu Wei
Jang A-Rang
Li Lingjun
Loh Kian Ping
Pantelides Sokrates T.
Poh Sock Mui
Ren Tianhua
Shin Hyeon Suk
Shin Tae Joo
Song Peng
Xu Hai
Yoon Seong In
Zhang Yu-Yang
Zhao Xiaoxu
Zhou Wu
Publication venue: 'American Chemical Society (ACS)'
Publication date: 21/06/2017

Atomically thin molybdenum disulfide (MoS2), a direct-band-gap semiconductor, is promising for applications in electronics and optoelectronics, but the scalable synthesis of highly crystalline film remains challenging. Here we report the successful epitaxial growth of a continuous, uniform, highly crystalline monolayer MoS2 film on hexagonal boron nitride (h-BN) by molecular beam epitaxy. Atomic force microscopy and electron microscopy studies reveal that MoS2 grown on h-BN primarily consists of two types of nucleation grains (0° aligned and 60° antialigned domains). By adopting a high growth temperature and ultralow precursor flux, the formation of 60° antialigned grains is largely suppressed. The resulting perfectly aligned grains merge seamlessly into a highly crystalline film. Large-scale monolayer MoS2 film can be grown on a 2 in. h-BN/sapphire wafer, for which surface morphology and Raman mapping confirm good spatial uniformity. Our study represents a significant step in the scalable synthesis of highly crystalline MoS2 films on atomically flat surfaces and paves the way to large-scale applications.

ScholarWorks@UNIST

FigShare

Functional Kinomics Identifies Candidate Therapeutic Targets in Head and Neck Cancer

Author: Adam A. Margolin
Asel Biktasova
Carla Grandori
Chang Xu
Christopher J. Kemp
Christopher M. Schaupp
Debnath
Eduardo Méndez
In Sock Jang
James Annis
Katayama
Kay E. Gurley
Luisa Angelica Lerma
Mano
Matsumoto
Michael Kao
Russell Moser
Wendell G. Yarbrough
Xu
Publication venue: 'American Association for Cancer Research (AACR)'

Crossref

Joint cell-line and patient modeling of drug sensitivity reveals novel molecular biomarkers for targeted and conventional chemotherapy.

Author: Adam Margolin
Antoine Hollebecque
Benjamin Besse
Charles Ferte
Christophe Massard
Elias Chaibub Neto
Eric Angevin
Fabrice Andre
Frederic Commo
In Sock Jang
Jean-Charles Soria
Justin Guinney
Ludovic Lacroix
Mehmet Gonen
Michel Ducreux
Olga Nikolova
Stephen Henry Friend
Valerie Koubi-Pick
Publication venue: 'American Society of Clinical Oncology (ASCO)'

Crossref

Gene expression subclass analysis.

(A) Comparison of hierarchical clustering of METABRIC data (left panel) and Perou data (right panel). Hierarchical clustering on the gene expression data of the PAM50 genes in both datasets reveals a similar gene expression pattern that separates into several subclasses. Although several classes are apparent, they are consistent with sample assignment into basal-like, Her2-enriched and luminal subclasses in the Perou data. Similarly, in the METABRIC data the subclasses are consistent with the available clinical data for triple-negative, ER and PR status, and HER2 positive. (B) Kaplan-Meier plot for subclasses. The METABRIC test dataset was separated into 3 major subclasses according to clinical features: triple negative (red); ER or PR positive status (blue); and HER2 positive with ER and PR negative status (green). The survival curve was estimated using a standard Kaplan-Meier curve, and shows the expected differences in overall survival between the subclasses. (C,D) Kaplan-Meier curve by grade and histology. The test dataset was separated by tumor grade (subplot C; grade 1 – red, grade 2 – green, grade 3 – blue), or by histology (subplot D; Infiltrating Lobular – red, Infiltrating Ductal – yellow, Medullary – green, Mixed Histology – blue, or Mucinous – purple). The survival curves were estimated using a standard Kaplan-Meier curve, and show the expected differences in overall survival for the clinical features.
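For readers unfamiliar with the Kaplan-Meier curves referred to in this caption, here is a minimal pure-Python sketch of the product-limit estimator. It is not the code used to produce the figure; the function name `kaplan_meier` and the toy follow-up data are assumptions made for illustration.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times: follow-up duration per patient.
    events: 1 if death was observed at that time, 0 if the patient was censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            # Product-limit update: multiply by the fraction surviving time t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # censored patients leave the risk set silently
    return curve


# Hypothetical toy cohort: the second patient is censored at time 2.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

Computing one such curve per subclass (for example per tumor grade) and plotting them together gives the kind of comparison the caption describes.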

FigShare

Distribution of concordance index scores of models submitted in the pilot competition.

(A) Models are categorized by the type of features they use. Boxes indicate the 25th (lower end), 50th (middle red line) and 75th (upper end) percentiles of the scores in each category, while the whiskers indicate the 10th and 90th percentiles of the scores. The scores for the baseline and best performer are highlighted. (B) Model performance by submission date. In the initial phase of the competition, slight improvements over the baseline model were achieved by applying machine learning approaches to only the clinical data (red circles), whereas initial attempts to incorporate molecular data significantly decreased performance (green, purple, and black circles). In the intermediate phase of the competition, models combining molecular and clinical data (green circles) predominated and achieved slightly improved performance over clinical-only models. Towards the end of the competition, models combining clinical information with molecular features selected based on prior information (purple circles) predominated.
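The concordance index scored in this figure can be sketched as follows. The caption does not say which variant the competition used, so this example assumes the common Harrell-style definition for right-censored data; the function name `concordance_index` and the convention that a higher risk score means shorter expected survival are also assumptions.

```python
def concordance_index(times, events, risk_scores):
    """Fraction of comparable patient pairs that the risk scores order correctly.

    A pair (i, j) is comparable when patient i has an observed event
    (events[i] == 1) strictly before patient j's follow-up time.
    Tied risk scores count as half-correct.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: risk decreases as survival time increases.
perfect = concordance_index([1, 2, 3], [1, 1, 1], [3.0, 2.0, 1.0])
reversed_order = concordance_index([1, 2, 3], [1, 1, 1], [1.0, 2.0, 3.0])
```

A score of 0.5 corresponds to random ordering, which is why small improvements over a baseline can still be meaningful.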

FigShare

Model performance by feature set and learning algorithm.

(A) The concordance index is displayed for each model from the controlled experiment (Table S4). The methods and feature sets are arranged according to the mean concordance index score. The ensemble method (cyan curve) infers survival predictions based on the average rank of samples from each of the four other learning algorithms, and the ensemble feature set uses the average rank of samples based on models trained using all of the other feature sets. Results for the METABRIC2 and MicMa datasets are shown in Figure S1. (B) The concordance index of models from the controlled phase by type. The ensemble method again utilizes the average rank for models in each category.
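The average-rank ensemble described in this caption can be illustrated with a short sketch. It assumes each model outputs one numeric prediction per sample and that tied predictions share their average rank; the helper names `rank` and `average_rank_ensemble` are hypothetical and not taken from the paper.

```python
def rank(values):
    """Rank values from 1 (smallest) to n, giving ties their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def average_rank_ensemble(predictions):
    """Combine models by averaging each sample's rank across models.

    predictions: list of per-model prediction lists, all over the same samples.
    """
    per_model = [rank(p) for p in predictions]
    n_samples = len(predictions[0])
    return [sum(r[s] for r in per_model) / len(per_model) for s in range(n_samples)]


# Two hypothetical models scoring three samples on different scales.
ensemble = average_rank_ensemble([[0.1, 0.5, 0.9], [10.0, 30.0, 20.0]])
```

Working with ranks rather than raw outputs lets models with different prediction scales be combined without calibration, and the result can be fed directly to a rank-based score such as the concordance index.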

FigShare

Consistency of results in 2 additional datasets.

(A,C) Concordance index scores for all models evaluated in the controlled experiment. Scores from the original evaluation are compared against METABRIC2 (A) and MicMa (C). The 4 machine learning algorithms are displayed in different colors. (B,D) Individual plots for each machine learning algorithm.

FigShare