
    Pathway-Based Multi-Omics Data Integration for Breast Cancer Diagnosis and Prognosis.

    Ph.D. Thesis. University of Hawaiʻi at Mānoa 2017

    Algorithmic methods to infer the evolutionary trajectories in cancer progression

    The genomic evolution inherent to cancer relates directly to a renewed focus on the voluminous next-generation sequencing data and machine learning for the inference of explanatory models of how the (epi)genomic events are choreographed in cancer initiation and development. However, despite the increasing availability of multiple additional -omics data, this quest has been frustrated by various theoretical and technical hurdles, mostly stemming from the dramatic heterogeneity of the disease. In this paper, we build on our recent work on the 'selective advantage' relation among driver mutations in cancer progression and investigate its applicability to the modeling problem at the population level. Here, we introduce PiCnIc (Pipeline for Cancer Inference), a versatile, modular, and customizable pipeline to extract ensemble-level progression models from cross-sectional sequenced cancer genomes. The pipeline has many translational implications because it combines state-of-the-art techniques for sample stratification, driver selection, identification of fitness-equivalent exclusive alterations, and progression model inference. We demonstrate PiCnIc's ability to reproduce much of the current knowledge on colorectal cancer progression as well as to suggest novel experimentally verifiable hypotheses

    Molecular epidemiology studies on risk factors for breast cancer and disease aggressiveness

    Breast cancer is a heterogeneous disease. Aggressive subtypes are characterized by faster growth rates and an increased capability to invade and metastasize, leading to poorer clinical outcomes. In this thesis, we use a molecular epidemiology approach to investigate the association between risk factors and aggressive breast cancer defined by tumor characteristics, intrinsic subtypes, mode of detection, and survival. Using a variety of methods, we analyzed data from well-characterized breast cancer cohorts in Sweden, genome-wide association studies, and gene expression profiling of tumors. In Paper I, we found that breast cancer genetic load, defined by rare deleterious variants in 31 breast cancer genes, is positively associated with unfavorable tumor characteristics, patient survival, and mode of detection, whereas common variants are not. In Paper II, we observed that women with low breast cancer risk, as defined by the Tyrer-Cuzick risk score, were more likely to develop aggressive tumors. We computed a low-risk gene expression profile that was consistently associated with worse prognosis. In addition, our analysis showed that increased proliferation, rather than estrogen status, underlies this association. In Paper III, we examined gene expression profiles in a subset of aggressive breast cancer tumors known as interval cancers. By taking mammographic density and intrinsic PAM50 subtypes into account, we found an interval cancer gene expression profile to be associated with immune subtypes in breast cancer, particularly those involving interferon response. In Paper IV, we show that breast cancer has a shared immune-related genetic component with celiac disease, an autoimmune disorder. Consistent with previous epidemiological findings, we found that a higher genetic load for celiac disease was associated with lower breast cancer risk.
Overall, this thesis aims to provide scientific evidence towards a better understanding of the factors underlying the development of aggressive breast cancers, which could shed light on the design of better preventative strategies aimed at lowering disease mortality.

    Discovery of aggressive breast cancer biomarkers using quantitative proteomics and bioinformatics

    Thesis (Ph.D.) -- Seoul National University Graduate School : College of Medicine, Department of Medicine, 2022.2. 유한석.
Introduction: Mass spectrometry-based proteomics is a technology that addresses large-scale molecular and cellular biology at the protein level. By identifying and quantifying proteins at scale, proteomic analysis makes it possible to interpret protein sequence, expression, post-transcriptional modification, and protein-protein interactions. Proteins are extracted from cell lines and from limited clinical samples such as body fluids, fresh frozen tissue, and formalin-fixed paraffin-embedded (FFPE) tissue. Next-generation high-speed mass spectrometry with high throughput and sensitivity quantifies thousands of proteins simultaneously, producing large volumes of data. Bioinformatic analysis can detect differences in protein expression associated with disease status, prognosis, and treatment response, and can further suggest the biological mechanisms of disease.
Methods: In Chapter 1, we investigated the role of the cancer stem cell marker CD44 in the claudin-low subtype, the most aggressive subtype of triple-negative breast cancer (TNBC). Cell lines with modulated CD44 expression were established by genetic manipulation, and the changes in protein expression following CD44 knockdown were analyzed to demonstrate its molecular role. In Chapter 2, to discover prognostic biomarkers for breast cancer patients at high risk of distant metastasis to other organs, FFPE tumor tissue was collected from 28 patients at the same stage (early distant metastasis: 9; late distant metastasis: 9; no distant metastasis: 10). A proteomic method for analyzing limited sample amounts was established. Candidate biomarkers for predicting distant metastasis were identified and validated by developing a regression model on external transcriptome data.
Results: In Chapter 1, 7396 proteins were identified in the claudin-low breast cancer cell line MDA-MB-231 and 6567 in Hs578T. Bioinformatic analysis (gene ontology and protein-protein interaction network analysis) of the proteins with statistically significant expression differences, 4908 in MDA-MB-231 and 855 in Hs578T, pointed to oncogenic processes acting through cell proliferation, metabolic processes, and regulation of gene expression. Functional studies performed to confirm the biological mechanism verified that CD44 regulates cell proliferation and migration by regulating the expression of a large number of proteins. In Chapter 2, tumor regions were selectively isolated from breast cancer FFPE slides and analyzed by mass spectrometry, identifying 9455 proteins. Among proteins whose expression differed significantly according to the time from primary cancer diagnosis to distant metastasis, seven final biomarker candidates were selected through comparative analysis, correlation network analysis, machine learning-based feature selection, and survival analysis. A Cox proportional hazards regression model built with these seven candidates on external data confirmed that distant metastasis could be predicted.
Conclusions: In Chapter 1, we generated comparative proteome expression data for two claudin-low breast cancer cell lines in which expression of the cancer stem cell marker CD44 was reduced. Analysis of these data confirmed that CD44 coordinately regulates gene expression, metabolism, and adhesion in cancer cells, thereby affecting cell proliferation and migration, which are core oncogenic processes. This helps clarify, at the molecular level, the biological mechanism of CD44 as a key regulator of aggressive triple-negative breast cancer, and further indicates that it could be a potential therapeutic target in claudin-low breast cancer. In Chapter 2, proteomic analysis of FFPE tumor tissue from breast cancer patients produced in-depth proteome data and identified potential biomarkers for predicting distant metastasis. The development of such prognostic biomarkers for distant metastasis, together with the elucidation of molecular mechanisms through bioinformatic analysis, is expected to serve as key supporting evidence for precision medicine and to help in planning effective treatment for breast cancer patients.
Mass spectrometry (MS)-based proteomics covers large-scale molecular and cellular biology at the protein level. Through the identification and quantification of proteins, proteome analysis can interpret protein sequence, post-transcriptional modification, and protein-protein interactions. This allows us to profile new disease biomarkers. From cell lines to limited clinical samples (body fluids, fresh frozen tissues, and FFPE tissues), thousands of proteins are measured simultaneously to detect changes in expression level with disease status. The resulting expression data show complex and ambiguous patterns. Therefore, refined bioinformatics algorithms have to be applied to determine these unique biomarker patterns. A proteomic study discovers a list of biomarkers and helps elucidate the biological mechanisms. In Chapter Ⅰ, mass spectrometry-based proteomics was performed using breast cancer cells. To discover global proteome changes induced by CD44 expression levels, we regulated CD44 transcription by siRNA in two claudin-low breast cancer cell lines. For deep coverage of the proteome, we used tandem mass tag-based MS analysis. We discovered that 2736 proteins were upregulated and 2172 proteins were downregulated in CD44-knockdown MDA-MB-231 cells. In CD44-knockdown Hs 578T cells, 412 proteins were upregulated and 443 were downregulated.
Informatics analysis (gene ontology and protein-protein interaction network) demonstrated altered oncogenic cellular processes, including proliferation, metabolism, and regulation of gene expression. To confirm these changes in biological patterns, functional studies were conducted. As a result, we characterized the CD44-regulated proteome of claudin-low breast cancer cells, revealing changes that mediate cell proliferation and migration. In Chapter Ⅱ, label-free MS proteomic analysis was performed on clinical FFPE tissues. To discover candidate prognostic markers for distant metastasis of breast cancer, primary tumor samples from 10 no-metastasis, 9 late-metastasis, and 9 early-metastasis patients were analyzed. To achieve an in-depth proteome from a minimum of FFPE slides per sample, we applied well-defined proteomic strategies with high-resolution quadrupole Orbitrap LC-MS/MS. We identified a total of 9,455 protein groups from FFPE slides at a 1% peptide and protein FDR level. Five biomarker candidates were differentially expressed in pair-wise comparison, and correlation network analysis narrowed these five candidates to two no-metastasis-specific proteins and one late-metastasis-specific protein. In addition, machine learning-based feature selection detected ten early-metastasis classifier proteins, which a systems biology method narrowed to seven proteins. For external validation, we used published mRNA data from primary breast tumors.
Consequently, we suggested seven prognostic protein marker candidates that can help identify patients who need active treatment.
Contents: General Abstract; Table of Contents; Lists of Tables and Figures; List of Abbreviations; General Introduction; Chapter Ⅰ. Quantitative Proteomics Reveals Knockdown of CD44 Promotes Proliferation and Migration in Claudin-Low MDA-MB-231 and Hs 578T Breast Cancer Cell Lines (Abstract, Introduction, Material and Methods, Results, Discussion); Chapter Ⅱ. In-depth proteome profiling of breast cancer formalin-fixed paraffin-embedded tissue for distant metastasis (Abstract, Introduction, Material and Methods, Results, Discussion); General Discussion; References; Abstract in Korean.

    To metabolomics and beyond: a technological portfolio to investigate cancer metabolism

    Tumour cells have exquisite flexibility in reprogramming their metabolism in order to support tumour initiation, progression, metastasis and resistance to therapies. These reprogrammed activities include a complete rewiring of the bioenergetic, biosynthetic and redox status to sustain the increased energetic demand of the cells. Over the last decades, the cancer metabolism field has seen an explosion of new biochemical technologies, giving researchers more tools than ever before to navigate this complexity. Within a cell or a tissue, the metabolites constitute the direct signature of the molecular phenotype, and thus their profiling has concrete clinical applications in oncology. Metabolomics and fluxomics are key technological approaches that have revolutionized the field, enabling researchers to build both a qualitative and a mechanistic model of the biochemical activities in cancer. Furthermore, the upgrade from bulk to single-cell analysis technologies has provided an unprecedented opportunity to investigate cancer biology at cellular resolution, allowing an in-depth quantitative analysis of complex and heterogeneous diseases. More recently, the advent of functional genomic screening has allowed the identification of molecular pathways, cellular processes, biomarkers and novel therapeutic targets that, in concert with other technologies, allow patient stratification and identification of new treatment regimens. This review is intended to be a guide to cancer metabolism for researchers, highlighting current and emerging technologies and emphasizing their advantages, disadvantages and applications, with the potential of leading the development of innovative anti-cancer therapies.

    A Multi-Cohort and Multi-Omics Meta-Analysis Framework to Identify Network-Based Gene Signatures

    Although massive amounts of condition-specific molecular profiles are being accumulated in public repositories every day, meaningful interpretation of these data remains a major challenge. In an effort to identify the biomarkers that describe the key biological phenomena for a given condition, several approaches have been developed over the past few years. However, the majority of these approaches either (i) do not consider the known intermolecular interactions, or (ii) do not integrate molecular data of multiple types (e.g., genomics, transcriptomics, proteomics, epigenomics, etc.), and thus potentially fail to capture the true biological changes responsible for complex diseases (e.g., cancer). In addition, these approaches often ignore the heterogeneity and study bias present in independent molecular cohorts. In this manuscript, we propose a novel multi-cohort and multi-omics meta-analysis framework that overcomes all three limitations mentioned above in order to identify robust molecular subnetworks that capture the key dynamic nature of a given biological condition. Our framework integrates multiple independent gene expression studies, unmatched DNA methylation studies, and protein-protein interactions to identify methylation-driven subnetworks. We demonstrate the proposed framework by constructing subnetworks related to two complex diseases: glioblastoma and low-grade gliomas. We validate the identified subnetworks by showing their ability to predict patients' clinical outcome on multiple independent validation cohorts

    An intelligent management of integrated biomedical data for digital health via Network Medicine and its application to different human diseases

    Personalized medicine aims to tailor health care to each person's unique signature, making it possible to better distinguish an individual patient from others with similar clinical manifestations. Many different biomedical data types contribute to defining this unique signature, such as omics data produced through next-generation sequencing technologies. The integration of single-omics data, in a sequential or simultaneous manner, could help to understand the interplay of the different molecules, thus helping to bridge the gap between genotype and phenotype. To this end, Network Medicine offers a promising formalism for multi-omics data integration by providing a holistic approach that looks at the whole system at once rather than focusing on single entities. This thesis concerns the integration of various omics data following two different procedures within the framework of Network Medicine: (i) a procedural multi-omics data integration, where a single omics was first selected to perform the main analysis, and then the other omics were used in cascade to molecularly characterize the results obtained in the main analysis; and (ii) a parallel multi-omics data integration, where the result was given by the intersection of the results of each single-omics analysis. The procedural multi-omics data integration was leveraged to study Colorectal and Breast Cancer. In the Colorectal Cancer case study, we defined the molecular signatures of a new subgroup of Colorectal Cancer possibly eligible for immune-checkpoint inhibitor therapy. Moreover, in the Breast Cancer case study, we defined 11 prognostic biomarkers specific for the Basal-like subtype of Breast Cancer. The parallel multi-omics data integration, by contrast, was used to study COVID-19 and Chronic Obstructive Pulmonary Disease. In the COVID-19 case study, we defined a pool of drugs potentially repurposable for COVID-19.
In the Chronic Obstructive Pulmonary Disease case study, we discovered a group of differentially expressed and methylated genes that have considerable biological specificity and could be related to the inflammatory pathological mechanism of Chronic Obstructive Pulmonary Disease.