A Multiobjective Evolutionary Conceptual Clustering Methodology for Gene Annotation Within Structural Databases: A Case of Study on the Gene Ontology Database
Current tools and techniques for examining the content of large databases are often hampered by their inability to support searches based on criteria that are meaningful to their users. These shortcomings are particularly evident in data banks storing representations of structural data such as biological networks. Conceptual clustering techniques have proved appropriate for uncovering relationships between the features that characterize objects in structural data. However, typical conceptual clustering approaches recover the most obvious relations but fail to discover the less frequent, more informative underlying data associations. The combination of evolutionary algorithms with multiobjective and multimodal optimization techniques is a suitable tool for solving this problem. We propose a novel conceptual clustering methodology termed evolutionary multiobjective conceptual clustering (EMO-CC), relying on the NSGA-II multiobjective (MO) genetic algorithm. We apply this methodology to identify conceptual models in structural databases generated from gene ontologies. These models can explain and predict phenotypes in the immunoinflammatory response problem, similar to those provided by gene expression or other genetic markers. The analysis of these results reveals that our approach uncovers cohesive clusters, even those comprising a small number of observations explained by several features, which allows describing objects and their interactions from different perspectives and at different levels of detail.
Funding: Ministerio de Ciencia y Tecnología TIC-2003-00877, BIO2004-0270E, TIN2006-1287
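NSGA-II, on which EMO-CC relies, ranks candidate solutions by Pareto dominance before applying its crowding-distance selection. A minimal sketch of the non-dominated filtering step, with hypothetical objective vectors rather than the paper's actual cluster-quality objectives:

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Return the non-dominated subset of a list of objective vectors."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]

# Hypothetical two-objective scores for four candidate clusterings:
# (1 - cohesion, cluster_count), both to be minimized.
candidates = [(0.2, 5), (0.1, 8), (0.3, 3), (0.25, 6)]
front = pareto_front(candidates)
# (0.25, 6) is dominated by (0.2, 5); the remaining three trade off the objectives.
```

The surviving front contains the mutually incomparable solutions; NSGA-II would then recurse on the dominated remainder to build successive fronts.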
Graphs in molecular biology
Graph theoretical concepts are useful for the description and analysis of interactions and relationships in biological systems. We give a brief introduction into some of the concepts and their areas of application in molecular biology. We discuss software that is available through the Bioconductor project and present a simple example application to the integration of a protein-protein interaction and a co-expression network.
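The example application mentioned above, integrating a protein-protein interaction network with a co-expression network, reduces at its core to intersecting two undirected edge sets. A toy sketch with made-up protein names (the Bioconductor tooling itself is R-based; this shows only the underlying set operation):

```python
def norm(edge):
    """Treat edges as undirected by sorting the endpoint names."""
    return tuple(sorted(edge))

# Hypothetical edge lists from a PPI screen and a co-expression analysis.
ppi = {norm(e) for e in [("A", "B"), ("B", "C"), ("C", "D")]}
coexpr = {norm(e) for e in [("B", "A"), ("C", "B"), ("D", "E")]}

# Edges supported by both data types form the integrated network.
supported = ppi & coexpr  # {("A", "B"), ("B", "C")}
```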
Dynamic Influence Networks for Rule-based Models
We introduce the Dynamic Influence Network (DIN), a novel visual analytics
technique for representing and analyzing rule-based models of protein-protein
interaction networks. Rule-based modeling has proved instrumental in developing
biological models that are concise, comprehensible, easily extensible, and that
mitigate the combinatorial complexity of multi-state and multi-component
biological molecules. Our technique visualizes the dynamics of these rules as
they evolve over time. Using the data produced by KaSim, an open source
stochastic simulator of rule-based models written in the Kappa language, DINs
provide a node-link diagram that represents the influence that each rule has on
the other rules. That is, rather than representing individual biological
components or types, we instead represent the rules about them (as nodes) and
the current influence of these rules (as links). Using our interactive DIN-Viz
software tool, researchers are able to query this dynamic network to find
meaningful patterns about biological processes, and to identify salient aspects
of complex rule-based models. To evaluate the effectiveness of our approach, we
investigate a simulation of a circadian clock model that illustrates the
oscillatory behavior of the KaiC protein phosphorylation cycle.
Comment: Accepted to TVCG, in press
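As a rough illustration of links between rules rather than between components, one crude proxy for influence is how often one rule's firing immediately precedes another's in a simulation trace. This sketch is hypothetical; it reproduces neither KaSim's output format nor the paper's influence metric:

```python
from collections import defaultdict

def influence_links(trace):
    """Count consecutive rule firings r1 -> r2 as crude influence weights."""
    links = defaultdict(int)
    for r1, r2 in zip(trace, trace[1:]):
        links[(r1, r2)] += 1
    return dict(links)

# Hypothetical firing sequence from a stochastic simulation run.
trace = ["bind", "phosphorylate", "bind", "phosphorylate", "unbind"]
links = influence_links(trace)
# {("bind", "phosphorylate"): 2, ("phosphorylate", "bind"): 1,
#  ("phosphorylate", "unbind"): 1}
```

The resulting weighted dictionary is exactly the node-link structure a DIN visualizes: rules as nodes, counts as link weights.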
Resolving Structure in Human Brain Organization: Identifying Mesoscale Organization in Weighted Network Representations
Human brain anatomy and function display a combination of modular and
hierarchical organization, suggesting the importance of both cohesive
structures and variable resolutions in the facilitation of healthy cognitive
processes. However, tools to simultaneously probe these features of brain
architecture require further development. We propose and apply a set of methods
to extract cohesive structures in network representations of brain connectivity
using multi-resolution techniques. We employ a combination of soft
thresholding, windowed thresholding, and resolution in community detection
that enables us to identify and isolate structures associated with different
weights. One such mesoscale structure is bipartivity, which quantifies the
extent to which the brain is divided into two partitions with high connectivity
between partitions and low connectivity within partitions. A second,
complementary mesoscale structure is modularity, which quantifies the extent to
which the brain is divided into multiple communities with strong connectivity
within each community and weak connectivity between communities. Our methods
lead to multi-resolution curves of these network diagnostics over a range of
spatial, geometric, and structural scales. For statistical comparison, we
contrast our results with those obtained for several benchmark null models. Our
work demonstrates that multi-resolution diagnostic curves capture complex
organizational profiles in weighted graphs. We apply these methods to the
identification of resolution-specific characteristics of healthy weighted graph
architecture and altered connectivity profiles in psychiatric disease.
Comment: Comments welcome
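For a fixed partition, the modularity diagnostic swept in this study reduces to Newman's Q, optionally scaled by a resolution parameter. A minimal unweighted sketch (the study analyzes weighted graphs; the `gamma` argument here stands in for the structural-resolution parameter):

```python
def modularity(edges, community, gamma=1.0):
    """Newman modularity of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs; community: dict node -> community label;
    gamma: resolution parameter (gamma=1 recovers standard modularity).
    """
    m = len(edges)
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    q = 0.0
    for u in deg:
        for v in deg:
            if community[u] == community[v]:
                a = sum(1 for e in edges if e in ((u, v), (v, u)))
                q += a - gamma * deg[u] * deg[v] / (2 * m)
    return q / (2 * m)

# Two triangles joined by one edge: a clearly modular toy graph.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
part = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
q = modularity(edges, part)  # 5/14, about 0.357 for this partition
```

Putting every node in one community gives Q = 0, the expected baseline; raising `gamma` penalizes large communities and drives the optimum toward finer partitions.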
Identification of phenotype-specific networks from paired gene expression-cell shape imaging data
The morphology of breast cancer cells is often used as an indicator of tumor severity and prognosis. Additionally, morphology can be used to identify more fine-grained, molecular developments within a cancer cell, such as transcriptomic changes and signaling pathway activity. Delineating the interface between morphology and signaling is important to understand the mechanical cues that a cell processes in order to undergo epithelial-to-mesenchymal transition and consequently metastasize. However, the exact regulatory systems that define these changes remain poorly characterized. In this study, we used a network-systems approach to integrate imaging data and RNA-seq expression data. Our workflow allowed the discovery of unbiased and context-specific gene expression signatures and cell signaling subnetworks relevant to the regulation of cell shape, rather than focusing on the identification of previously known, but not always representative, pathways. By constructing a cell-shape signaling network from shape-correlated gene expression modules and their upstream regulators, we found central roles for developmental pathways such as WNT and Notch, as well as evidence for the fine control of NF-kB signaling by numerous kinase and transcriptional regulators. Further analysis of our network implicates a gene expression module enriched in the RAP1 signaling pathway as a mediator between the sensing of mechanical stimuli and regulation of NF-kB activity, with specific relevance to cell shape in breast cancer.
Multimodal Data Fusion and Quantitative Analysis for Medical Applications
Medical big data is not only enormous in size but also heterogeneous and complex in structure, which makes it difficult for conventional systems or algorithms to process. These heterogeneous medical data include imaging data (e.g., Positron Emission Tomography (PET), Computerized Tomography (CT), Magnetic Resonance Imaging (MRI)) and non-imaging data (e.g., laboratory biomarkers, electronic medical records, and hand-written doctor notes). Multimodal data fusion is an emerging and vital field that addresses this challenge, aiming to process and analyze the complex, diverse, and heterogeneous multimodal data. Fusion algorithms bring great potential to medical data analysis by 1) taking advantage of complementary information from different sources (such as the functional-structural complementarity of PET/CT images) and 2) exploiting consensus information that reflects the intrinsic essence of the data (such as the genetic essence underlying medical imaging and clinical symptoms). Thus, multimodal data fusion benefits a wide range of quantitative medical applications, including personalized patient care, more effective medical operation planning, and preventive public health.
Though there has been extensive research on computational approaches for multimodal fusion, there are three major challenges of multimodal data fusion in quantitative medical applications, which are summarized as feature-level fusion, information-level fusion and knowledge-level fusion:
• Feature-level fusion. The first challenge is to mine multimodal biomarkers from high-dimensional, small-sample multimodal medical datasets, whose dimensionality hinders the effective discovery of informative multimodal biomarkers. Specifically, efficient dimension reduction algorithms are required to alleviate the "curse of dimensionality" and to meet the criteria for discovering interpretable, relevant, non-redundant, and generalizable multimodal biomarkers.
• Information-level fusion. The second challenge is to exploit and interpret inter-modal and intra-modal information for precise clinical decisions. Although radiomics and multi-branch deep learning have been used for implicit information fusion under label supervision, methods that explicitly explore inter-modal relationships in medical applications are lacking. Unsupervised multimodal learning can mine inter-modal relationships, reduce the reliance on labor-intensive labeled data, and explore potentially undiscovered biomarkers; however, mining discriminative information without label supervision remains an open challenge. Furthermore, interpreting complex non-linear cross-modal associations, especially in deep multimodal learning, is another critical challenge in information-level fusion, one that hinders the exploration of multimodal interactions in disease mechanisms.
• Knowledge-level fusion. The third challenge is quantitative knowledge distillation from multi-focus regions in medical imaging. Although characterizing imaging features from single lesions, using either feature engineering or deep learning, has been investigated in recent years, both approaches neglect the importance of inter-region spatial relationships. A topological profiling tool for multi-focus regions is therefore in high demand, yet missing from current feature engineering and deep learning methods. Furthermore, incorporating domain knowledge alongside the knowledge distilled from multi-focus regions is another challenge in knowledge-level fusion.
To address the three challenges in multimodal data fusion, this thesis provides a multi-level fusion framework for multimodal biomarker mining, multimodal deep learning, and knowledge distillation from multi-focus regions. Specifically, our major contributions in this thesis include:
• To address the challenges in feature-level fusion, we propose an Integrative Multimodal Biomarker Mining framework to select interpretable, relevant, non-redundant, and generalizable multimodal biomarkers from high-dimensional, small-sample imaging and non-imaging data for diagnostic and prognostic applications. The feature selection criteria (representativeness, robustness, discriminability, and non-redundancy) are addressed by consensus clustering, a Wilcoxon filter, sequential forward selection, and correlation analysis, respectively. The SHapley Additive exPlanations (SHAP) method and nomograms are employed to further enhance feature interpretability in machine learning models.
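Of the steps listed above, sequential forward selection is the easiest to isolate: greedily add the feature that most improves a score, stopping when no candidate helps. A generic sketch with a placeholder scoring function, not the thesis's Wilcoxon/consensus pipeline:

```python
def forward_select(features, score, max_k=None):
    """Greedy sequential forward selection over a list of feature names.

    score(subset) -> float, higher is better; stops when no candidate
    improves the current best score, or when max_k features are chosen.
    """
    selected = []
    best = score(selected)
    while max_k is None or len(selected) < max_k:
        gains = [(score(selected + [f]), f) for f in features if f not in selected]
        if not gains:
            break
        top_score, top_f = max(gains)
        if top_score <= best:
            break  # no remaining feature improves the score
        selected.append(top_f)
        best = top_score
    return selected

# Hypothetical scorer: rewards features "f1" and "f3", penalizes set size.
def toy_score(subset):
    return len(set(subset) & {"f1", "f3"}) - 0.1 * len(subset)

chosen = forward_select(["f1", "f2", "f3", "f4"], toy_score)
# Selects exactly "f1" and "f3"; adding "f2" or "f4" lowers the score.
```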
• To address the challenges in information-level fusion, we propose an Interpretable Deep Correlational Fusion framework, based on canonical correlation analysis (CCA) for 1) cohesive multimodal fusion of medical imaging and non-imaging data, and 2) interpretation of complex non-linear cross-modal associations. Specifically, two novel loss functions are proposed to optimize the discovery of informative multimodal representations in both supervised and unsupervised deep learning, by jointly learning inter-modal consensus and intra-modal discriminative information. An interpretation module is proposed to decipher the complex non-linear cross-modal association by leveraging interpretation methods in both deep learning and multimodal consensus learning.
• To address the challenges in knowledge-level fusion, we propose a Dynamic Topological Analysis (DTA) framework, based on persistent homology, for knowledge distillation from inter-connected multi-focus regions in medical imaging and for the incorporation of domain knowledge. Unlike conventional feature engineering and deep learning, the DTA framework explicitly quantifies inter-region topological relationships, including global-level geometric structure and community-level clusters. A K-simplex Community Graph is proposed to construct the dynamic community graph representing community-level multi-scale graph structure. The constructed dynamic graph is subsequently tracked with a novel Decomposed Persistence algorithm. Domain knowledge is incorporated into the Adaptive Community Profile, which summarizes the tracked multi-scale community topology together with additional customizable, clinically important factors.
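In dimension zero, persistent homology tracks when connected components appear and merge as edges enter a filtration by increasing weight, which a union-find structure captures directly. A minimal sketch of this simplest ingredient (the thesis's Decomposed Persistence algorithm and community-graph construction are not reproduced):

```python
def zero_dim_persistence(nodes, weighted_edges):
    """Death times of connected components in an edge-weight filtration.

    Every node is born at weight 0; when two components merge, the younger
    one (here: the higher-indexed root, a simplifying convention) dies at
    the weight of the merging edge. Returns the list of death weights and
    the number of components that survive to infinity.
    """
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    deaths = []
    for w, u, v in sorted(weighted_edges):  # filtration: increasing weight
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[max(ru, rv)] = min(ru, rv)
            deaths.append(w)
    survivors = len(nodes) - len(deaths)
    return deaths, survivors

# Hypothetical weighted graph: two small clusters joined by a heavy edge.
edges = [(0.1, 0, 1), (0.2, 1, 2), (0.3, 3, 4), (0.9, 2, 3)]
deaths, survivors = zero_dim_persistence([0, 1, 2, 3, 4], edges)
# deaths == [0.1, 0.2, 0.3, 0.9]; one component survives to infinity.
```

The long-lived merge at weight 0.9 is the topological signature of the two-cluster structure, the kind of feature a persistence-based profile would retain.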
Solutions to decision-making problems in management engineering using molecular computational algorithms and experimentations
Degree system: new; Report number: Kou 3368; Degree type: Doctor of Engineering; Date conferred: 2011/5/23; Waseda University degree number: Shin 568