Conditional independence relations among biological markers may improve clinical decision as in the case of triple negative breast cancers
The associations among different biomarkers are important in clinical settings because they help characterise specific pathways related to the natural history of the disease and to its genetic and environmental determinants. Despite the availability of binary/linear (or at least monotonic) correlation indices, the full exploitation of molecular information depends on knowledge of the direct/indirect conditional independence (and possibly causal) relationships among biomarkers, and between biomarkers and target variables, in the population of interest. In other words, it depends on inferences performed on the joint multivariate distribution of markers and target variables. Graphical models, such as Bayesian Networks, are well suited to this purpose. Therefore, we reconsidered a previously published case study on classical biomarkers in breast cancer, namely estrogen receptor (ER), progesterone receptor (PR), a proliferative index (Ki67/MIB-1) and two proteins, HER2/neu (NEU) and p53, to infer the conditional independence relations existing in the joint distribution by learning the structure of graphs that entail those independence relations. We also examined the conditional distribution of a special molecular phenotype, called triple-negative, in which ER, PR and NEU were absent. We confirmed that ER is a key marker and found that it was able to define subpopulations of patients characterized by different conditional independence relations among biomarkers. We also found preliminary evidence that, given a triple-negative profile, the distribution of p53 protein is concentrated mostly in the 'zero' and 'high' states, providing useful information for selecting patients who could benefit from adjuvant anthracycline/alkylating agent-based chemotherapy.
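The distinction the abstract draws between marginal association and conditional independence can be illustrated with a small sketch. The code below is not the authors' structure-learning procedure; it only computes empirical mutual information and conditional mutual information for two binary markers X and Y and a third variable Z. The function names and the counts are assumptions invented for illustration and are not patient data.

```python
from collections import Counter
from math import log


def mutual_information(pairs):
    """Empirical mutual information (in nats) between the two fields of each pair."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    return sum(
        (c / n) * log(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def conditional_mutual_information(triples):
    """Empirical I(X; Y | Z): the MI inside each stratum of Z, weighted by stratum size."""
    n = len(triples)
    strata = {}
    for x, y, z in triples:
        strata.setdefault(z, []).append((x, y))
    return sum(len(p) / n * mutual_information(p) for p in strata.values())


def expand(counts, z):
    """Turn {(x, y): count} into a list of (x, y, z) records."""
    return [(x, y, z) for (x, y), c in counts.items() for _ in range(c)]


# Synthetic counts, invented for illustration only (hypothetical, not patient data).
# Within each stratum of Z, X and Y are exactly independent, but the marker
# frequencies differ between strata, which induces a marginal association.
records = (
    expand({(1, 1): 1, (1, 0): 3, (0, 1): 3, (0, 0): 9}, z=0)
    + expand({(1, 1): 9, (1, 0): 3, (0, 1): 3, (0, 0): 1}, z=1)
)

marginal_mi = mutual_information([(x, y) for x, y, _ in records])
conditional_mi = conditional_mutual_information(records)
print(f"I(X;Y) = {marginal_mi:.4f}, I(X;Y|Z) = {conditional_mi:.4f}")
```

In this toy data the two markers look associated when Z is ignored, yet the association vanishes once Z is conditioned on. Relations of this kind are what a learned graph structure is meant to encode.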
Hidden Markov Models
Hidden Markov Models (HMMs), although known for decades, have become widely used in recent years and are still under active development. This book presents theoretical issues and a variety of HMM applications in speech recognition and synthesis, medicine, neurosciences, computational biology, bioinformatics, seismology, environment protection and engineering. I hope that readers will find this book useful and helpful for their own research.
Building trajectories through clinical data to model disease progression
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. Clinical trials are typically conducted over a population within a defined time period in order to illuminate certain characteristics of a health issue or disease process. These cross-sectional studies provide a snapshot of these disease processes over a large number of people but do not allow us to model the temporal nature of disease, which is essential for making detailed prognostic predictions. Longitudinal studies, on the other hand, are used to explore how these processes develop over time in a number of people but can be expensive and time-consuming, and many studies only cover a relatively small window within the disease process. This thesis describes the application of intelligent data analysis techniques for extracting information from time series generated by different diseases. The aim of this thesis is to identify intermediate stages in a disease process and sub-categories of the disease exhibiting subtly different symptoms. It explores the use of a bootstrap technique that fits trajectories through the data, generating "pseudo time-series". It addresses issues including how clinical variables interact as a disease progresses along the trajectories in the data, and how to automatically identify different disease states along these trajectories, as well as the transitions between them. The thesis documents how reliable time-series models can be created from large amounts of historical cross-sectional data, and a novel relabelling/latent variable approach has enabled the exploration of the temporal nature of disease progression. The proposed algorithms are tested extensively on simulated data and on three real clinical datasets. Finally, a study is carried out to explore whether we can "calibrate" pseudo time-series models with real longitudinal data in order to improve them. Plausible directions for future research are discussed at the end of the thesis.
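To make the idea of fitting a trajectory through cross-sectional data more concrete, here is a minimal sketch of one way a single pseudo time-series could be built. It is an assumption for illustration, not necessarily the procedure used in the thesis: it draws a bootstrap resample, links the resampled points with a minimum spanning tree over Euclidean distances, and reads off the tree path from a randomly chosen healthy sample to a randomly chosen diseased sample. All function names, the labels `"healthy"` and `"disease"`, and the toy data are hypothetical.

```python
import math
import random


def bootstrap_sample(samples, size, rng):
    """Draw `size` samples with replacement (one bootstrap resample)."""
    return [rng.choice(samples) for _ in range(size)]


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph; returns an adjacency list."""
    n = len(points)
    adjacency = {i: [] for i in range(n)}
    # best[i] = (distance to the growing tree, nearest tree node) for nodes not yet in the tree
    best = {i: (math.dist(points[0], points[i]), 0) for i in range(1, n)}
    while best:
        node = min(best, key=lambda i: best[i][0])
        _, parent = best.pop(node)
        adjacency[node].append(parent)
        adjacency[parent].append(node)
        for other in best:
            d = math.dist(points[node], points[other])
            if d < best[other][0]:
                best[other] = (d, node)
    return adjacency


def tree_path(adjacency, start, end):
    """Unique path between two nodes of a tree, found by iterative depth-first search."""
    stack = [(start, [start])]
    seen = {start}
    while stack:
        node, path = stack.pop()
        if node == end:
            return path
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append((nxt, path + [nxt]))
    return []


def pseudo_time_series(samples, size, rng):
    """samples: list of (feature_vector, label), label being 'healthy' or 'disease'."""
    resample = bootstrap_sample(samples, size, rng)
    points = [features for features, _ in resample]
    healthy = [i for i, (_, label) in enumerate(resample) if label == "healthy"]
    diseased = [i for i, (_, label) in enumerate(resample) if label == "disease"]
    if not healthy or not diseased:
        return []  # this resample cannot anchor a trajectory
    start, end = rng.choice(healthy), rng.choice(diseased)
    path = tree_path(minimum_spanning_tree(points), start, end)
    return [resample[i] for i in path]


# Toy cross-sectional data (hypothetical): severity increases with t.
toy_samples = [((float(t), float(t)), "healthy" if t < 5 else "disease") for t in range(10)]
trajectory = pseudo_time_series(toy_samples, size=30, rng=random.Random(0))
print([label for _, label in trajectory])
```

Repeating the procedure with different resamples would yield many such trajectories, which could then be used as training sequences for a temporal model.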
Data based identification and prediction of nonlinear and complex dynamical systems
We thank Dr. R. Yang (formerly at ASU), Dr. R.-Q. Su (formerly at ASU), and Mr. Zhesi Shen for their contributions to a number of original papers on which this Review is partly based. This work was supported by ARO under Grant No. W911NF-14-1-0504. W.-X. Wang was also supported by NSFC under Grants No. 61573064 and No. 61074116, as well as by the Fundamental Research Funds for the Central Universities, Beijing Nova Programme. Peer reviewed. Postprint.
Finding Complex Biological Relationships in Recent PubMed Articles Using Bio-LDA
The overwhelming amount of available scholarly literature in the life sciences poses significant challenges to scientists wishing to keep up with important developments related to their research, but also provides a useful resource for the discovery of recent information concerning genes, diseases, compounds and the interactions between them. In this paper, we describe an algorithm called Bio-LDA that uses extracted biological terminology to automatically identify latent topics, and provides a variety of measures to uncover putative relations among topics and bio-terms. Relationships identified using those approaches are combined with existing data in life science datasets to provide additional insight. Three case studies demonstrate the utility of the Bio-LDA model, including association prediction, association search and connectivity map generation. This combined approach offers new opportunities for knowledge discovery in many areas of biology including target identification, lead hopping and drug repurposing. Comment: 14 pages, 8 figures, 10 tables
The Reasonable Effectiveness of Randomness in Scalable and Integrative Gene Regulatory Network Inference and Beyond
Gene regulation is orchestrated by a vast number of molecules, including transcription factors and co-factors, chromatin regulators, as well as epigenetic mechanisms, and it has been shown that transcriptional misregulation, e.g., caused by mutations in regulatory sequences, is responsible for a plethora of diseases, including cancer and developmental or neurological disorders. As a consequence, decoding the architecture of gene regulatory networks has become one of the most important tasks in modern (computational) biology. However, to advance our understanding of the mechanisms involved in the transcriptional apparatus, we need scalable approaches that can deal with the increasing number of large-scale, high-resolution biological datasets. In particular, such approaches need to be capable of efficiently integrating and exploiting the biological and technological heterogeneity of such datasets in order to best infer the underlying, highly dynamic regulatory networks, often in the absence of sufficient ground truth data for model training or testing. With respect to scalability, randomized approaches have proven to be a promising alternative to deterministic methods in computational biology. As an example, one of the top-performing algorithms in a community challenge on gene regulatory network inference from transcriptomic data is based on a random forest regression model. In this concise survey, we aim to highlight how randomized methods may serve as a highly valuable tool, in particular with increasing amounts of large-scale biological experiments and datasets being collected. Given the complexity and interdisciplinary nature of the gene regulatory network inference problem, we hope our survey may be helpful to both computational and biological scientists.
It is our aim to provide a starting point for a dialogue about the concepts, benefits, and caveats of the toolbox of randomized methods, since unravelling the intricate web of highly dynamic regulatory events will be one fundamental step in understanding the mechanisms of life and eventually developing efficient therapies to treat and cure diseases.
Machine learning techniques for decoding information in RNA interactions and DNA sequences
Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Computer Science and Engineering, 2020. 2. Kim Sun.
Phenotypic differences among organisms are mainly due to differences in genetic information. As a result of changes in genetic information, an organism may evolve into a different species, and patients with the same disease may have different prognoses. This important biological information can be observed in the form of various omics data using high-throughput technologies such as sequencing instruments. However, interpreting such omics data is challenging because the data have very high dimensionality but a relatively small number of samples. Typically, the number of dimensions is higher than the number of samples, which makes the interpretation of omics data one of the most challenging machine learning problems.
My doctoral study aims to develop new bioinformatics methods for decoding information in these high-dimensional data by utilizing machine learning algorithms.
The first study analyzes the difference in the amount of information between different regions of the DNA sequence. To achieve this goal, a ranked k-spectrum string kernel, the RKSS kernel, is developed for comparative and evolutionary comparison of sequences from various genomic regions among multiple species. The RKSS kernel extends the existing k-spectrum string kernel by utilizing rank information of k-mers and a landmark, a set of k-mers that represents what the species have in common. By using a landmark as a reference point for comparison, the number of k-mers needed to calculate sequence similarities is dramatically reduced. In experiments on three different genomic regions, the RKSS kernel captured more reliable distances between species according to the genetic information content of the target region. The RKSS kernel was also able to rank the regions in an order consistent with common biological knowledge.
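For readers unfamiliar with the baseline that the RKSS kernel extends, the following is a minimal sketch of the standard k-spectrum string kernel, in which two sequences are compared through the dot product of their k-mer count vectors. It does not implement the rank information or the landmark of the RKSS kernel, whose details are not given in this abstract; the function names and the example sequences are assumptions chosen for illustration.

```python
from collections import Counter
from math import sqrt


def kmer_counts(sequence, k):
    """Count every overlapping substring of length k in the sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def spectrum_kernel(seq_a, seq_b, k):
    """k-spectrum kernel: dot product of the two k-mer count vectors."""
    counts_a, counts_b = kmer_counts(seq_a, k), kmer_counts(seq_b, k)
    # A Counter returns 0 for k-mers that do not occur in seq_b.
    return sum(count * counts_b[kmer] for kmer, count in counts_a.items())


def normalized_spectrum_kernel(seq_a, seq_b, k):
    """Cosine-normalized kernel value in [0, 1], so sequence length matters less."""
    denom = sqrt(spectrum_kernel(seq_a, seq_a, k) * spectrum_kernel(seq_b, seq_b, k))
    return spectrum_kernel(seq_a, seq_b, k) / denom if denom else 0.0


# Hypothetical toy sequences, not taken from the thesis data.
similarity = spectrum_kernel("ACGTACGT", "ACGTTT", k=3)
print(similarity, round(normalized_spectrum_kernel("ACGTACGT", "ACGTTT", k=3), 3))
```

Because the number of possible k-mers grows as 4^k for DNA, the count vectors become very sparse for larger k, which is the cost that a restricted reference set of k-mers is meant to reduce.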
The second study aims to efficiently decode complex genetic interactions using biological networks and then to classify cancer subtypes by interpreting biological functions. To achieve this goal, a pathway-based deep learning model using a graph convolutional network and a multi-attention based ensemble (GCN+MAE) is developed for cancer subtype classification. In order to efficiently reduce the relationships between genes using pathway information, GCN+MAE is designed as an explainable deep learning structure using a graph convolutional network and an attention mechanism. The extracted pathway-level information on cancer subtypes is then propagated back to the gene level by network propagation. In experiments on five cancer data sets, GCN+MAE showed better cancer subtype classification performance than existing subtype classification models and captured subtype-specific pathways and their biological functions.
The third study identifies sub-networks of a biological pathway. The goal is to dissect a biological pathway into multiple sub-networks, each of which corresponds to a single functional unit. To achieve this goal, a condition-specific sub-module detection method for biological networks, MIDAS (MIning Differentially Activated Subpaths), is developed. From the pathway, edge activities are measured using gene expression and network topology. Using these activities, differentially activated subpaths are explored by a statistical approach. In addition, by extending this idea to a graph convolutional network, different sub-networks are highlighted by attention mechanisms. In the experiment with breast cancer data, MIDAS and the deep learning model successfully decomposed gene-level features into sub-modules of single functions.
In summary, my doctoral study proposes new computational methods to compare genomic DNA sequences in terms of their information content, to model pathway-based cancer subtype classifications and regulations, and to identify condition-specific sub-modules among multiple cancer subtypes.
Chapter 1 Introduction
1.1 Biological questions with genetic information
1.1.1 Biological Sequences
1.1.2 Gene expression
1.2 Formulating computational problems for the biological questions
1.2.1 Decoding biological sequences by k-mer vectors
1.2.2 Interpretation of complex relationships between genes
1.3 Three computational problems for the biological questions
1.4 Outline of the thesis
Chapter 2 Ranked k-spectrum kernel for comparative and evolutionary comparison of DNA sequences
2.1 Motivation
2.1.1 String kernel for sequence comparison
2.1.2 Approach: RKSS kernel
2.2 Methods
2.2.1 Mapping biological sequences to k-mer space: the k-spectrum string kernel
2.2.2 The ranked k-spectrum string kernel with a landmark
2.2.3 Single landmark-based reconstruction of phylogenetic tree
2.2.4 Multiple landmark-based distance comparison of exons, introns, CpG islands
2.2.5 Sequence Data for analysis
2.3 Results
2.3.1 Reconstruction of phylogenetic tree on the exons, introns, and CpG islands
2.3.2 Landmark space captures the characteristics of three genomic regions
2.3.3 Cross-evaluation of the landmark-based feature space
Chapter 3 Pathway-based cancer subtype classification and interpretation by attention mechanism and network propagation
3.1 Motivation
3.2 Methods
3.2.1 Encoding biological prior knowledge using Graph Convolutional Network
3.2.2 Re-producing comprehensive biological process by Multi-Attention based Ensemble
3.2.3 Linking pathways and transcription factors by network propagation with permutation-based normalization
3.3 Results
3.3.1 Pathway database and cancer data set
3.3.2 Evaluation of individual GCN pathway models
3.3.3 Performance of ensemble of GCN pathway models with multi-attention
3.3.4 Identification of TFs as regulator of pathways and GO term analysis of TF target genes
Chapter 4 Detecting sub-modules in biological networks with gene expression by statistical approach and graph convolutional network
4.1 Motivation
4.1.1 Pathway based analysis of transcriptome data
4.1.2 Challenges and Summary of Approach
4.2 Methods
4.2.1 Convert single KEGG pathway to directed graph
4.2.2 Calculate edge activity for each sample
4.2.3 Mining differentially activated subpath among classes
4.2.4 Prioritizing subpaths by the permutation test
4.2.5 Extension: graph convolutional network and class activation map
4.3 Results
4.3.1 Identifying 36 subtype specific subpaths in breast cancer
4.3.2 Subpath activities have a good discrimination power for cancer subtype classification
4.3.3 Subpath activities have a good prognostic power for survival outcomes
4.3.4 Comparison with an existing tool, PATHOME
4.3.5 Extension: detection of subnetwork on PPI network
Chapter 5 Conclusions
Abstract in Korean