    Identifying Driver Genomic Alterations in Cancers by Searching Minimum-Weight, Mutually Exclusive Sets

    An important goal of cancer genomic research is to identify the driving pathways underlying disease mechanisms and the heterogeneity of cancers. It is well known that somatic genome alterations (SGAs) affecting the genes that encode the proteins within a common signaling pathway exhibit mutual exclusivity: these SGAs usually do not co-occur in a tumor. With some success, this characteristic has been used as an objective function to guide the search for driver mutations within a pathway. However, mutual exclusivity alone is not sufficient to indicate that genes affected by such SGAs are in common pathways. Here, we propose a novel, signal-oriented framework for identifying driver SGAs. First, we identify the perturbed cellular signals by mining the gene expression data. Next, we search for a set of SGA events that carries strong information with respect to such perturbed signals while exhibiting mutual exclusivity. Finally, we design and implement an efficient exact algorithm to solve an NP-hard problem encountered in our approach. We apply this framework to the ovarian and glioblastoma tumor data available from the TCGA database, and perform systematic evaluations. Our results indicate that the signal-oriented approach enhances the ability to find informative sets of driver SGAs that likely constitute signaling pathways.

    An exact algorithm for finding cancer driver somatic genome alterations: The weighted mutually exclusive maximum set cover problem

    Background: The mutual exclusivity of somatic genome alterations (SGAs), such as somatic mutations and copy number alterations, is an important observation in tumors and is widely used to search for cancer signaling pathways or SGAs related to tumor development. However, one problem with current methods that use mutual exclusivity is that they are not signal-based; another is that they use heuristic algorithms to handle the NP-hard problems, which cannot guarantee finding the optimal solutions of their models. Method: In this study, we propose a novel signal-based method that uses the intrinsic relationship between SGAs on signaling pathways and expression changes of downstream genes regulated by those pathways to identify cancer signaling pathways using the mutually exclusive property. We also present a relatively efficient exact algorithm that is guaranteed to obtain the optimal solution of the new computational model. Results: We have applied our new model and exact algorithm to breast cancer data. The results reveal that our new approach increases the capability of finding better solutions in cancer research applications. Our new exact algorithm has a time complexity of O*(1.325^m) (following the recent convention, the star * indicates that the polynomial part of the time complexity is neglected), and it has solved the NP-hard problem of our model efficiently. Conclusion: Our new method and algorithm can discover the true causes behind the phenotypes, such as which SGA events lead to abnormality of the cell cycle or cause metastasis to go out of control in tumors; thus, it identifies target candidates for precision (or targeted) therapeutics.
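
The idea of searching for a mutually exclusive set of SGAs that covers the tumors showing a perturbed signal can be illustrated with a small brute-force sketch. Everything here is an assumption made for illustration: the toy data, the names `sga_to_tumors` and `perturbed`, and the objective (coverage minus overlap), which is a simple stand-in and not the weighted model or the O*(1.325^m) algorithm described in the abstracts.

```python
from itertools import combinations

# Hypothetical toy data: each SGA maps to the set of tumors it alters.
sga_to_tumors = {
    "A": {1, 2, 3},
    "B": {4, 5},
    "C": {3, 4, 6},
    "D": {6},
}
# Tumors in which the expression signal of interest is perturbed (assumed given).
perturbed = {1, 2, 3, 4, 5, 6}


def coverage_and_overlap(sgas):
    """Return (perturbed tumors covered, number of redundant hits)."""
    hits = [sga_to_tumors[s] & perturbed for s in sgas]
    covered = set().union(*hits) if hits else set()
    total_hits = sum(len(h) for h in hits)
    # Every hit beyond the first on a tumor violates mutual exclusivity.
    return len(covered), total_hits - len(covered)


def score(sgas):
    """Coverage minus overlap: an illustrative objective, not the papers' model."""
    covered, overlap = coverage_and_overlap(sgas)
    return covered - overlap


def best_set(max_size):
    """Exhaustively score every SGA subset up to max_size (exponential time)."""
    names = sorted(sga_to_tumors)
    candidates = (
        subset
        for k in range(1, max_size + 1)
        for subset in combinations(names, k)
    )
    return max(candidates, key=score)
```

On this toy instance, the set {A, B, D} covers all six tumors with no overlap, whereas any set containing C pays an overlap penalty. Exhaustive enumeration is exact but grows exponentially with the number of SGAs, which is why a faster exact algorithm matters for real data.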

    Identifying Patterns of Cancer Disease Mechanisms by Mining Alternative Representations of Genomic Alterations

    Cancer is a complex disease driven by somatic genomic alterations (SGAs) that perturb signaling pathways and consequently cellular function. Identifying combinatorial patterns of pathway perturbations would provide insights into common disease mechanisms shared among tumors, which is important for guiding treatment and predicting outcome. However, identifying perturbed pathways is challenging, because the same pathway can be perturbed by different SGAs in different tumors. We started by designing a novel semantic representation that captures the functional similarity of distinct SGAs perturbing a common pathway in different tumors. This representation was used alongside the nested hierarchical Dirichlet process topic model to identify combinatorial patterns in altered signaling pathways. We found that the topic model was able to capture the functional relationships between topics. It was also able to identify cancer subtypes composed of tumors from different tissues of origin that exhibit different survival rates. These results led us to investigate the performance of the methodology on pan-cancer data, as well as in conjunction with cancer driver data. The results revealed that the framework was still able to identify clinically relevant features in pan-cancer data. Moreover, the addition of driver data decreased the noise in the data and improved the separation of tumors in the clustering results. This supported including driver data in our methodology. To obtain gene representations independent of the literature, we developed a biological representation that could identify functionally related genes, and tested its performance when used alongside topic modeling. We found that the topic association patterns separated tumors by their tissue of origin. However, analyzing some of the cancer types on an individual basis still led to significant differences in survival.
    Our studies show the potential for using alternative representations in conjunction with topic modeling to investigate complex genomic diseases. With further research and refinement, this methodology has the potential to capture the relationship between pathways involved in cancer. This would contribute to a better understanding of cancer disease mechanisms and treatment.

    On the investigation of the large-scale grouping constrained storage location assignment problem

    The primary focus of this study is a novel optimisation problem, namely the Storage Location Assignment Problem with Grouping Constraint (SLAP-GC). The problem stems from real-world applications and is significant both theoretically and for its applicability to resource allocation tasks where groupings must be considered. The aim of this problem is to minimise the total operational cost in a warehouse through stock rearrangement. The problem consists of two interdependent subproblems: grouping items of the same product, and assigning items to locations to minimise picking distance. The interactions between these two subproblems make this problem significantly different from previous Storage Location Assignment Problems (SLAP), a well-studied field in logistics. Existing approaches for SLAP are not directly applicable to SLAP-GC. This dissertation lays a foundation for research on grouping constraints and other optimisation problems with similar interactions between subproblems. First, this study presents a formal definition of SLAP-GC. Then it offers a formal proof of the NP-completeness of SLAP-GC by reduction from the well-known 3-Partition problem. This suggests that real-world instances of SLAP-GC should not be tackled with exact approaches, but with approximation and heuristic approaches. We then explore decomposition and modelling techniques for SLAP-GC and develop three types of promising heuristic approaches: a hyperheuristic approach, a metaheuristic approach and a matheuristic approach. Comprehensive experimental studies are conducted on both synthetic benchmark instances and real-world instances to examine their efficiency, efficacy, and scalability. Through the analysis of the experimental results, the suitability of the proposed methods is verified on various SLAP-GC scenarios. In addition, we demonstrate that, with the proposed decomposition, large-scale SLAP-GC can be handled efficiently by the three proposed heuristic-based approaches.
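
The interaction between grouping and location assignment can be made concrete with a tiny exhaustive sketch. It rests on simplifying assumptions that are not taken from the dissertation: grouping is modelled as each product occupying a contiguous run of slots, cost is picks per item times slot distance, and the instance (`products`, `distances`) is hypothetical. It is not one of the three heuristic approaches named in the abstract.

```python
from itertools import permutations

# Hypothetical instance: product -> (number of items, picks per item).
products = {"P1": (2, 5), "P2": (1, 9), "P3": (3, 1)}
# Assumed distance from the depot to each storage slot, in slot order.
distances = [1, 2, 3, 4, 5, 6]


def layout_cost(order):
    """Total picking cost when products fill consecutive slots in this order."""
    cost, slot = 0, 0
    for name in order:
        count, picks = products[name]
        for _ in range(count):
            cost += picks * distances[slot]
            slot += 1
    return cost


def best_layout():
    """Try every product order; each order keeps a product's items contiguous."""
    return min(permutations(sorted(products)), key=layout_cost)
```

Enumerating product orders is feasible for three products but grows factorially, which is consistent with the abstract's point that large instances call for heuristic rather than exact approaches.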