Evidence amalgamation, plausibility, and cancer research
Cancer research is experiencing ‘paradigm instability’, since two rival theories of carcinogenesis confront each other, namely the Somatic Mutation Theory and the Tissue Organization Field Theory. Despite this theoretical uncertainty, a huge quantity of data is available thanks to improvements in genome sequencing techniques. Some authors think that the development of new statistical tools will overcome the lack of a shared theoretical perspective on cancer by amalgamating as much data as possible. We think instead that a deeper understanding of cancer can be achieved by means of more theoretical work, rather than by merely accumulating more data. To support our thesis, we introduce the analytic view of theory development, which rests on the concept of plausibility, and make clear in what sense plausibility and probability are distinct concepts. Then, the concept of plausibility is used to point out the ineliminable role played by the epistemic subject in the development of statistical tools and in the process of theory assessment. We then address a central issue in cancer research, namely the relevance of computational tools developed by bioinformaticists to detect driver mutations in the debate between the two main rival theories of carcinogenesis. Finally, we briefly extend our considerations on the role that plausibility plays in evidence amalgamation from cancer research to the more general issue of the divergences between frequentists and Bayesians in the philosophy of medicine and statistics. We argue that taking plausibility-based considerations into account can help clarify some epistemological shortcomings that afflict both of these perspectives.
Leveraging big data resources and data integration in biology: applying computational systems analyses and machine learning to gain insights into the biology of cancers
Recently, many "molecular profiling" projects have yielded vast amounts of genetic, epigenetic, transcription, protein expression, metabolic and drug response data for cancerous tumours, healthy tissues, and cell lines. We aim to facilitate a multi-scale understanding of these high-dimensional biological data and the complexity of the relationships between the different data types taken from human tumours. Further, we intend to identify molecular disease subtypes of various cancers, uncover the subtype-specific drug targets and identify sets of therapeutic molecules that could potentially be used to inhibit these targets. We collected data from over 20 publicly available resources. We then leverage integrative computational systems analyses, network analyses and machine learning, to gain insights into the pathophysiology of pancreatic cancer and 32 other human cancer types. Here, we uncover aberrations in multiple cell signalling and metabolic pathways that implicate regulatory kinases and the Warburg effect as the likely drivers of the distinct molecular signatures of three established pancreatic cancer subtypes. Then, we apply an integrative clustering method to four different types of molecular data to reveal that pancreatic tumours can be segregated into two distinct subtypes. We define sets of proteins, mRNAs, miRNAs and DNA methylation patterns that could serve as biomarkers to accurately differentiate between the two pancreatic cancer subtypes. Then we confirm the biological relevance of the identified biomarkers by showing that these can be used together with pattern-recognition algorithms to infer the drug sensitivity of pancreatic cancer cell lines accurately. Further, we evaluate the alterations of metabolic pathway genes across 32 human cancers. We find that while alterations of metabolic genes are pervasive across all human cancers, the extent of these gene alterations varies between them. 
Based on these gene alterations, we define two distinct cancer supertypes that tend to be associated with different clinical outcomes and show that these supertypes are likely to respond differently to anticancer drugs. Overall, we show that the time has already arrived when we can leverage available data resources to potentially elicit more precise and personalised cancer therapies that would yield better clinical outcomes at a much lower cost than is currently being achieved.
Simultaneous modelling and clustering of visual field data
This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London. In the health-informatics and bio-medical domains, clinicians produce an enormous amount of data, which can be complex and high in dimensionality. This includes visual field data, which are used for managing the second leading cause of blindness in the world: glaucoma. Visual field data are the most common type of data collected to diagnose glaucoma in patients, and the data usually consist of 54 or 76 variables (referred to as visual field locations). Due to the large number of variables, the six nerve fiber bundles (6NFB), a grouping of visual field locations, are the standard clusters used in visual field data to represent the physiological traits of the retina. However, this research proposes a technique for finding other significant spatial clusters of visual field locations that yield higher classification accuracy than the 6NFB.
This thesis presents a novel clustering technique, namely, Simultaneous Modelling and Clustering (SMC). SMC clusters data based on classification accuracy using heuristic search techniques. The method searches for a collection of significant clusters of visual field locations that indicate the progression of visual field loss. The aim of this research is two-fold. Firstly, SMC algorithms are developed and tested on data to investigate the effectiveness and efficiency of the method using optimisation and classification methods. Secondly, a significant clustering arrangement of the visual field, in which highly interrelated visual field locations represent the progression of visual field loss with high classification accuracy, is sought to complement the 6NFB in the diagnosis of glaucoma. Medical practitioners can use such a new clustering arrangement of visual field locations alongside the 6NFB when diagnosing glaucoma in patients.
This research conducts extensive experimental work on both visual field and simulated data to evaluate the proposed method. The results suggest that the proposed method is effective and efficient in clustering visual field data and improving classification accuracy. The key contributions of this work are the novel model-based clustering of visual field data, effective and efficient algorithms for SMC, practical knowledge of visual field data in the diagnosis of glaucoma, and the presentation of a generic framework for modelling and clustering that is highly applicable to many other dataset/model combinations.
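The abstract describes SMC only in outline: variables are grouped into clusters, and a heuristic search keeps the grouping that gives the best classification accuracy. The following is a minimal illustrative sketch of that general idea, not the thesis's actual algorithm. The hill-climbing search, the nearest-centroid classifier, the leave-one-out scoring, and all function names (`cluster_means`, `accuracy`, `smc_hill_climb`, `make_toy_data`) are assumptions made here for illustration, and the data are synthetic.

```python
import random
from statistics import mean


def cluster_means(row, assignment, k):
    """Collapse one sample into k features: the mean of each cluster's variables."""
    groups = [[] for _ in range(k)]
    for value, c in zip(row, assignment):
        groups[c].append(value)
    # An empty cluster contributes a constant 0.0 feature.
    return [mean(g) if g else 0.0 for g in groups]


def accuracy(X, y, assignment, k):
    """Leave-one-out accuracy of a nearest-centroid classifier on cluster means."""
    Z = [cluster_means(row, assignment, k) for row in X]
    correct = 0
    for i, (z, label) in enumerate(zip(Z, y)):
        best_label, best_dist = None, float("inf")
        for cls in sorted(set(y)):
            # Class centroid computed without the held-out sample i.
            members = [Z[j] for j in range(len(Z)) if j != i and y[j] == cls]
            if not members:
                continue
            centroid = [mean(col) for col in zip(*members)]
            dist = sum((a - b) ** 2 for a, b in zip(z, centroid))
            if dist < best_dist:
                best_label, best_dist = cls, dist
        correct += best_label == label
    return correct / len(X)


def smc_hill_climb(X, y, k, iterations=200, seed=0):
    """Hill climbing (an assumed stand-in heuristic) over variable-to-cluster assignments."""
    rng = random.Random(seed)
    n_vars = len(X[0])
    assignment = [rng.randrange(k) for _ in range(n_vars)]
    best = accuracy(X, y, assignment, k)
    for _ in range(iterations):
        candidate = assignment[:]
        # Move one randomly chosen variable to a randomly chosen cluster.
        candidate[rng.randrange(n_vars)] = rng.randrange(k)
        score = accuracy(X, y, candidate, k)
        if score >= best:  # ties are accepted so the search can cross plateaus
            assignment, best = candidate, score
    return assignment, best


def make_toy_data(n_per_class=10, n_vars=6, seed=1):
    """Synthetic two-class data; only the first half of the variables carries signal."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(2.0 * label if v < n_vars // 2 else 0.0, 1.0)
                   for v in range(n_vars)]
            X.append(row)
            y.append(label)
    return X, y
```

Calling `smc_hill_climb(*make_toy_data(), k=2)` returns a cluster label per variable together with the accuracy that grouping achieved. Because the score of a grouping is the accuracy of the model built on it, modelling and clustering are evaluated together in each search step, which is the sense of "simultaneous" this sketch tries to convey.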