80 research outputs found

    Protein complex detection with semi-supervised learning in protein interaction networks

    Abstract

    Background: Protein-protein interactions (PPIs) play fundamental roles in nearly all biological processes, and the systematic analysis of PPI networks can greatly improve our understanding of cellular organization, processes, and functions. In this paper, we investigate the problem of protein complex detection from noisy protein interaction data, i.e., finding subsets of proteins that are closely coupled via protein interactions. Protein complexes are likely to overlap and the interaction data are very noisy, so effectively analyzing the massive data for biologically meaningful protein complex detection is a great challenge.

    Results: Most previous approaches address the problem with traditional unsupervised graph clustering methods. Here, we take a different point of view, redefining the properties and features of protein complexes and designing a “semi-supervised” method to analyze the problem. We use a neural network with a semi-supervised mechanism to detect protein complexes: by recursively retraining the model, we find optimized parameters and thereby successfully detect the complexes. Comparison results show that our algorithm identifies protein complexes missed by other methods, and that it achieves better precision and recall for the identified complexes than existing methods. In addition, the proposed framework is easy to extend in the future.

    Conclusions: A weighted network is more appropriate for representing a protein interaction network than a traditional unweighted one. In addition, integrating biological and topological features to represent protein complexes is more meaningful than using dense subgraphs alone. Finally, the “semi-supervised” learning model is a promising way to detect protein complexes as more biological and topological features become available.
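
    To make the semi-supervised mechanism concrete, the following Python sketch shows one generic way to implement the recursive retraining the abstract describes: a small neural network is fit on a few labeled complex candidates, and high-confidence predictions on unlabeled candidates are promoted to pseudo-labels each round. The feature set, the confidence threshold, and the use of scikit-learn's MLPClassifier are illustrative assumptions, not details from the paper.

    import numpy as np
    from sklearn.neural_network import MLPClassifier

    def self_train(X_labeled, y_labeled, X_unlabeled, rounds=5, conf=0.9):
        # Recursively retrain: promote confident predictions to pseudo-labels.
        X, y, pool = X_labeled.copy(), y_labeled.copy(), X_unlabeled.copy()
        model = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
        for _ in range(rounds):
            model.fit(X, y)
            if len(pool) == 0:
                break
            proba = model.predict_proba(pool)
            confident = proba.max(axis=1) >= conf   # high-confidence candidates only
            if not confident.any():
                break
            X = np.vstack([X, pool[confident]])
            y = np.concatenate([y, proba[confident].argmax(axis=1)])
            pool = pool[~confident]                 # shrink the unlabeled pool
        return model

    # Each row is a candidate subgraph: (weighted density, size, clustering coeff.)
    X_lab = np.array([[0.9, 5, 0.8], [0.2, 3, 0.1], [0.8, 7, 0.7], [0.1, 4, 0.2]])
    y_lab = np.array([1, 0, 1, 0])                  # 1 = known complex, 0 = not
    X_unl = np.random.rand(20, 3)                   # unlabeled candidates
    clf = self_train(X_lab, y_lab, X_unl)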

    Recent advances in clustering methods for protein interaction networks

    The increasing availability of large-scale protein-protein interaction data has made it possible to understand the basic components and organization of cell machinery at the network level. The ensuing challenge is how to analyze such complex interaction data to reveal the principles of cellular organization, processes, and functions. Many studies have shown that clustering protein interaction networks is an effective approach for identifying protein complexes or functional modules, and this has become a major research topic in systems biology. In this review, recent advances in clustering methods for protein interaction networks will be presented in detail. The prediction of protein functions and interactions based on modules will be covered. Finally, the performance of different clustering methods will be compared and directions for future research will be discussed.
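
    As a toy illustration of the clustering theme surveyed here, the Python sketch below partitions a small made-up interaction graph with networkx's modularity-based community detection; the chosen algorithm and edge list are stand-in assumptions for the specialized PPI methods (e.g., MCL, MCODE) the review covers.

    import networkx as nx
    from networkx.algorithms.community import greedy_modularity_communities

    G = nx.Graph()
    G.add_edges_from([
        ("A", "B"), ("B", "C"), ("A", "C"),   # one dense group
        ("D", "E"), ("E", "F"), ("D", "F"),   # another dense group
        ("C", "D"),                           # sparse bridge between them
    ])
    # Each detected community is a candidate protein complex / functional module.
    for module in greedy_modularity_communities(G):
        print(sorted(module))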

    A review of aligners for protein-protein interaction networks

    A protein-protein interaction (PPI) dataset can be considered as a network, and alignment is the process of mapping nodes from one network to another. The main objective of network alignment is to identify small, well-defined interactome units, such as protein complexes or conserved pathways, that are analogous across the input networks. Network alignment uncovers the relationship between protein complexes and functions. Similarity between two graph structures can be identified by evaluating their topology; network aligners identify either topological or sequence similarity. Gene annotations reveal functional or sequence similarity, which can be evaluated using semantic similarity measures. In this paper, we review various network aligners and classify them according to their methodologies. We also discuss the different evaluation metrics and the popular databases of protein interactions.
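
    The scoring idea shared by many aligners can be sketched as follows: each cross-network node pair receives a convex combination of a topological score and a sequence/annotation similarity score, and pairs are then matched greedily. The degree-based topological score, the alpha weight, and the sim_seq table below are illustrative assumptions, not a description of any particular aligner.

    import networkx as nx

    def align(G1, G2, sim_seq, alpha=0.5):
        # Score every candidate pair by topology plus external sequence similarity.
        d1 = max(d for _, d in G1.degree())
        d2 = max(d for _, d in G2.degree())
        scores = []
        for u in G1:
            for v in G2:
                topo = 1 - abs(G1.degree(u) / d1 - G2.degree(v) / d2)
                scores.append((alpha * topo + (1 - alpha) * sim_seq.get((u, v), 0.0), u, v))
        mapping, used1, used2 = {}, set(), set()
        for s, u, v in sorted(scores, reverse=True):   # greedy: best pairs first
            if u not in used1 and v not in used2:
                mapping[u] = v
                used1.add(u)
                used2.add(v)
        return mapping

    G1 = nx.path_graph(["a", "b", "c"])
    G2 = nx.path_graph(["x", "y", "z"])
    print(align(G1, G2, sim_seq={("b", "y"): 1.0}))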

    Seventh Biennial Report: June 2003 - March 2005


    Thermal-Aware Networked Many-Core Systems

    Advancements in IC processing technology have driven innovation and growth in the consumer electronics sector and the evolution of the IT infrastructure supporting this exponential growth. One of the most difficult obstacles to this growth is the removal of the large amount of heat generated by the processing and communicating nodes of a system. Technology scaling and increasing power density are directly driving up on-chip temperatures. This raises cooling budgets and affects both the lifetime reliability and the performance of the system. Hence, reducing on-chip temperatures has become a major design concern for modern microprocessors. This dissertation addresses thermal challenges at different levels for both 2D planar and 3D stacked systems. It proposes a self-timed thermal monitoring strategy based on the liberal use of on-chip thermal sensors, making use of noise-variation-tolerant, leakage-current-based thermal sensing. To study thermal management issues from the early design stages, accurate thermal modeling and analysis at design time are essential; in this regard, the spatial temperature profile of global Cu nanowires for on-chip interconnects is analyzed. The dissertation also presents a 3D thermal model of a multicore system to investigate the effects of hotspots and the placement of silicon die layers on the thermal performance of a modern flip-chip package. For a 3D stacked system, the primary design goal is to maximise performance within the given power and thermal envelopes; hence, a thermally efficient routing strategy for 3D NoC-Bus hybrid architectures is proposed that mitigates on-chip temperatures by steering most of the switching activity to the die closest to the heat sink. Finally, an exploration of various thermal-aware placement approaches for both 2D and 3D stacked systems is presented. Various thermal models are developed and thermal control metrics are extracted, and an efficient thermal-aware application mapping algorithm for a 2D NoC is presented. The proposed mapping algorithm is shown to reduce the effective area subject to high temperatures compared to the state of the art.
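
    As a hedged illustration of the kind of thermal-aware mapping explored here, the Python sketch below greedily assigns the highest-power tasks to the mesh tiles farthest from the chip center, where lateral heat spreading typically keeps temperatures lower. The 3x3 mesh, the power numbers, and the distance-from-center heuristic are invented for illustration and are not the dissertation's algorithm.

    import itertools

    def thermal_aware_map(task_power, mesh_w=3, mesh_h=3):
        # Order tiles from coolest (outermost) to hottest (central).
        cx, cy = (mesh_w - 1) / 2, (mesh_h - 1) / 2
        tiles = sorted(itertools.product(range(mesh_w), range(mesh_h)),
                       key=lambda t: -(abs(t[0] - cx) + abs(t[1] - cy)))
        # Place the most power-hungry tasks on the coolest tiles first.
        tasks = sorted(task_power, key=task_power.get, reverse=True)
        return dict(zip(tasks, tiles))

    placement = thermal_aware_map({"t0": 1.8, "t1": 0.4, "t2": 1.1, "t3": 0.2})
    print(placement)   # e.g. t0 (highest power) lands on a corner tile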

    Computational Labeling, Partitioning, and Balancing of Molecular Networks

    Recent advances in high-throughput techniques enable accurate large-scale quantification of molecules, including mRNAs, proteins, and metabolites. Differential expression of these molecules in case and control samples provides a way to select phenotype-associated molecules with statistically significant changes. However, given a ranked list of significant molecular changes, how those molecules work together to drive phenotype formation is still unclear. In particular, changes in molecular quantities are insufficient to interpret changes in functional behavior. My study aims to answer this question by integrating molecular network data to systematically model and estimate the changes in molecular functional behaviors. We build three computational models to label, partition, and balance molecular networks using modern machine learning techniques. (1) Due to the incompleteness of protein functional annotation, we develop AptRank, an adaptive PageRank model for protein function prediction on bilayer networks. By integrating the Gene Ontology (GO) hierarchy with a protein-protein interaction network, AptRank outperforms four state-of-the-art methods in a comprehensive evaluation on benchmark datasets. (2) We next extend AptRank into a network partitioning method, BioSweeper, to identify functional network modules in which molecules share similar functions and are densely connected to each other. Compared to traditional network partitioning methods that use only network connections, BioSweeper, which integrates the GO hierarchy, can automatically identify functionally enriched network modules. (3) Finally, we conduct a differential interaction analysis, difFBA, on protein-protein interaction networks by simulating protein fluxes using flux balance analysis (FBA). We test difFBA on quantitative proteomic data from colon cancer and demonstrate that it offers more insight into functional changes in molecular behavior than protein quantity changes alone. We conclude that our integrative network models increase the observational dimensions of complex biological systems and enable us to more deeply understand the causal relationships between genotypes and phenotypes.
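
    The diffusion idea behind AptRank can be sketched, in a much-simplified form, as personalized PageRank on a combined graph in which proteins link to each other (PPI edges) and to GO terms (annotation edges); a query protein's restart vector then ranks GO terms as candidate functions. The adaptive, bilayer-weighted machinery of the actual method is omitted, and the tiny adjacency matrix and restart parameter below are invented.

    import numpy as np

    def personalized_pagerank(A, seed, alpha=0.85, iters=100):
        P = A / A.sum(axis=0, keepdims=True)        # column-stochastic transitions
        r = seed.copy()
        for _ in range(iters):
            r = alpha * P @ r + (1 - alpha) * seed  # diffuse, then restart at seed
        return r

    # Nodes 0-2: proteins (PPI edges); nodes 3-4: GO terms (annotation edges).
    A = np.array([[0, 1, 1, 1, 0],
                  [1, 0, 1, 0, 1],
                  [1, 1, 0, 0, 0],
                  [1, 0, 0, 0, 0],
                  [0, 1, 0, 0, 0]], dtype=float)
    seed = np.array([0, 0, 1, 0, 0], dtype=float)   # query protein: node 2
    scores = personalized_pagerank(A, seed)
    print(scores[3:])   # GO-term scores suggest predicted functions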

    Enabling knowledge-defined networks: deep reinforcement learning, graph neural networks and network analytics

    Significant breakthroughs in the last decade in the Machine Learning (ML) field have ushered in a new era of Artificial Intelligence (AI). In particular, recent advances in Deep Learning (DL) have enabled the development of a new breed of modeling and optimization tools with a plethora of applications in fields like natural language processing and computer vision. In this context, the Knowledge-Defined Networking (KDN) paradigm highlights the lack of adoption of AI techniques in computer networks and, as a result, proposes a novel architecture that relies on Software-Defined Networking (SDN) and modern network analytics techniques to facilitate the deployment of ML-based solutions for efficient network operation. This dissertation aims to be a step forward in the realization of Knowledge-Defined Networks. In particular, we focus on the application of AI techniques to control and optimize networks more efficiently and automatically. To this end, we identify two components within the KDN context whose development may be crucial to achieving self-operating networks in the future: (i) the automatic control module, and (ii) the network analytics platform. The first part of this thesis is devoted to the construction of efficient automatic control modules. First, we explore the application of Deep Reinforcement Learning (DRL) algorithms to optimize the routing configuration in networks. DRL has recently demonstrated an outstanding capability to efficiently solve decision-making problems in other fields. However, early DRL-based attempts to optimize routing in networks failed to achieve good results, often under-performing traditional heuristics. In contrast to previous DRL-based solutions, we propose a more elaborate network representation that helps DRL agents learn efficient routing strategies. Our evaluation results show that DRL agents using the proposed representation achieve better performance and learn faster how to route traffic in an Optical Transport Network (OTN) use case. Second, we lay the foundations for the use of Graph Neural Networks (GNN) in building ML-based network optimization tools. GNNs are a recently proposed family of DL models specifically tailored to operate and generalize over graphs of variable size and structure. In this thesis, we posit that GNNs are well suited to model the relationships between different network elements inherently represented as graphs (e.g., topology, routing). In particular, we use a custom GNN architecture to build a routing optimization solution that, unlike previous ML-based proposals, is able to generalize well to topologies, routing configurations, and traffic never seen during the training phase. The second part of this thesis investigates the design of practical and efficient network analytics solutions in the KDN context. Network analytics tools are crucial to provide the control plane with a rich and timely view of the network state. However, this is not a trivial task considering that all this information typically turns into big data in real-world networks. In this context, we analyze the main aspects that should be considered when measuring and classifying traffic in SDN (e.g., scalability, accuracy, cost). As a result, we propose a practical solution that produces flow-level measurement reports similar to those of NetFlow/IPFIX in traditional networks.
    The proposed system relies only on native features of OpenFlow, currently among the most established standards in SDN, and incorporates mechanisms to efficiently maintain flow-level statistics in commodity switches and report them asynchronously to the control plane. Additionally, a system that combines ML and Deep Packet Inspection (DPI) identifies the applications that generate each traffic flow.
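
    To give a feel for why GNNs generalize over topologies and routing configurations, the Python sketch below runs a few iterations of message passing in which the hidden states of links sharing a path update one another, loosely in the spirit of the thesis's custom architecture. The toy topology, the dimensions, and the simple additive update are assumptions; the real model uses learned recurrent updates and a readout trained on network performance data.

    import numpy as np

    rng = np.random.default_rng(0)
    H = rng.normal(size=(4, 8))            # one hidden state per link, dim 8
    paths = [[0, 1, 3], [2, 3]]            # routing: each path is a sequence of links
    W = rng.normal(size=(8, 8)) * 0.1      # shared message transformation

    for _ in range(3):                     # T message-passing iterations
        M = np.zeros_like(H)
        for path in paths:                 # links on the same path exchange messages
            for i in path:
                for j in path:
                    if i != j:
                        M[i] += H[j] @ W
        H = np.tanh(H + M)                 # aggregate and update hidden states

    print(H)   # link states now encode path-level dependencies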

    Relational learning on temporal knowledge graphs

    Over the last decade, there has been increasing interest in relational machine learning (RML), which studies methods for the statistical analysis of relational or graph-structured data. Relational data arise naturally in many real-world applications, including social networks, recommender systems, and computational finance. Such data can be represented as a graph consisting of nodes (entities) and labeled edges (relationships between entities). While traditional machine learning techniques are based on feature vectors, RML takes relations into account and permits inference among entities. Recently, performing prediction and learning tasks on knowledge graphs has become a main topic in RML. Knowledge graphs (KGs) are widely used resources for studying multi-relational data in the form of a directed graph, where each labeled edge describes a factual statement, such as (Munich, locatedIn, Germany). Traditionally, knowledge graphs are considered to represent stationary relationships that do not change over time. In contrast, event-based multi-relational data exhibit complex temporal dynamics in addition to their multi-relational nature. For example, the political relationship between two countries may intensify because of trade disputes, or the president of a country may change after an election. To represent the temporal aspect, temporal knowledge graphs (tKGs) were introduced, which store a temporal event as a quadruple by extending the static triple with a timestamp describing when the event occurred, e.g., (Barack Obama, visit, India, 2010-11-06). Thus, each edge in the graph has temporal information associated with it and may recur or evolve over time. Among the various learning paradigms on KGs, knowledge representation learning (KRL), also known as knowledge graph embedding, has achieved great success. KRL maps entities and relations into low-dimensional vector spaces while capturing their semantic meanings. However, KRL approaches have mostly been designed for static KGs and lack the ability to utilize the rich temporal dynamics available in tKGs. In this thesis, we study state-of-the-art representation learning techniques for temporal knowledge graphs that can capture temporal dependencies across entities in addition to their relational dependencies. We develop representations for two inference tasks: tKG forecasting and completion. The former forecasts future events using historical observations up to the present time, while the latter predicts missing links at observed timestamps. For tKG forecasting, we show how to make the reasoning process interpretable while maintaining performance by employing a sequential reasoning process over local subgraphs. In addition, we propose a continuous-depth multi-relational graph neural network based on a novel graph neural ordinary differential equation. It allows learning continuous-time representations of tKGs, especially in cases with observations at irregular time intervals, as encountered in online analysis. For tKG completion, we systematically review multiple benchmark models, thoroughly investigate the significance of the temporal encoding technique proposed in each model, and provide the first unified open-source framework gathering the implementations of well-known tKG completion models.
    Finally, we discuss the power of geometric learning and show that learning evolving entity representations in a product of Riemannian manifolds can better reflect the geometric structures of tKGs and achieve better performance than Euclidean embeddings while requiring significantly fewer model parameters.
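
    A minimal sketch of the quadruple-based embedding idea: a TransE-style translation score extended with a timestamp embedding, so the same (subject, relation, object) triple can score differently at different times. The toy vocabulary, the dimensionality, and the random (untrained) vectors are assumptions; the thesis studies far richer encoders, such as subgraph reasoning and graph neural ODEs.

    import numpy as np

    rng = np.random.default_rng(1)
    dim = 16
    ent = {e: rng.normal(size=dim) for e in ["Barack_Obama", "India", "Germany"]}
    rel = {r: rng.normal(size=dim) for r in ["visit"]}
    ts = {t: rng.normal(size=dim) for t in ["2010-11-06", "2015-01-01"]}

    def score(s, r, o, t):
        # Plausible quadruples should satisfy s + r + t ~= o (small distance).
        return -np.linalg.norm(ent[s] + rel[r] + ts[t] - ent[o])

    # After training, the true quadruple would outscore corrupted ones.
    print(score("Barack_Obama", "visit", "India", "2010-11-06"))
    print(score("Barack_Obama", "visit", "Germany", "2010-11-06"))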

    Towards Optimal Application Mapping for Energy-Efficient Many-Core Platforms


    Mining a Small Medical Data Set by Integrating the Decision Tree and t-test

    Although several researchers have used statistical methods to prove that aspiration followed by the injection of 95% ethanol left in situ (retention) is an effective treatment for ovarian endometriomas, very few discuss the different conditions that could produce different recovery rates for patients. This study therefore adopts statistical methods and decision tree techniques together to analyze the postoperative status of ovarian endometriosis patients under different conditions. Since the collected data set is small, containing only 212 records, we use all of the data as training data. Instead of using the resulting tree to generate rules directly, we first use the value at each node as a cut point to generate all possible rules from the tree; then, using t-tests, we verify these rules to discover useful descriptive rules. Experimental results show that our approach can find new and interesting knowledge about recurrent ovarian endometriomas under different conditions.
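
    The two-step procedure described in the abstract can be sketched as follows: fit a decision tree, treat every split threshold as a candidate cut point, and keep only the cut points where a t-test finds a significant outcome difference between the two sides. The synthetic data, the feature count, and the significance level below are assumptions standing in for the study's 212 clinical records.

    import numpy as np
    from scipy import stats
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(0)
    X = rng.normal(size=(212, 3))                   # 212 records, 3 clinical features
    y = (X[:, 0] + rng.normal(scale=0.5, size=212) > 0).astype(int)   # recurrence label

    tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

    # Every internal node's threshold becomes a candidate rule, verified by t-test.
    for f, t in zip(tree.tree_.feature, tree.tree_.threshold):
        if f < 0:
            continue                                # skip leaf nodes (no split)
        left, right = y[X[:, f] <= t], y[X[:, f] > t]
        _, p = stats.ttest_ind(left, right, equal_var=False)
        if p < 0.05:                                # keep significant rules only
            print(f"rule: feature {f} <= {t:.2f}   (p = {p:.4f})")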