
    Algorithms and Software for the Analysis of Large Complex Networks

    The work presented intersects three main areas: graph algorithmics, network science, and applied software engineering. Each computational method discussed relates to one of the main tasks of data analysis: extracting structural features from network data (e.g., community detection); transforming network data (e.g., sparsifying a network to reduce its size while preserving essential properties); or modeling networks realistically through generative models.
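    As a concrete illustration of one of these tasks, the sketch below sparsifies a network by keeping each edge with a fixed probability and compares a basic property before and after. This is a minimal example using networkx, not a method from the thesis; the retention probability p and the density check are arbitrary illustrative choices.

```python
# Minimal sketch: sparsify a network by uniform random edge sampling
# and compare edge counts and density. Illustrative only.
import random
import networkx as nx

def sparsify(G: nx.Graph, p: float, seed: int = 42) -> nx.Graph:
    """Keep each edge independently with probability p."""
    rng = random.Random(seed)
    H = nx.Graph()
    H.add_nodes_from(G.nodes())
    H.add_edges_from(e for e in G.edges() if rng.random() < p)
    return H

G = nx.barabasi_albert_graph(1000, 5)
H = sparsify(G, p=0.5)
print(G.number_of_edges(), H.number_of_edges())
print(nx.density(G), nx.density(H))
```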

    MuxViz: A Tool for Multilayer Analysis and Visualization of Networks

    Multilayer relationships among entities, and information about entities, must be accompanied by the means to analyze, visualize, and obtain insights from such data. We present open-source software (muxViz) that contains a collection of algorithms for the analysis of multilayer networks, an important way to represent a large variety of complex systems throughout science and engineering. We demonstrate the ability of muxViz to analyze and interactively visualize multilayer data using empirical genetic, neuronal, and transportation networks. Our software is available at https://github.com/manlius/muxViz.

    Comment: 18 pages, 10 figures (text of the accepted manuscript)
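    For readers unfamiliar with multilayer representations, the sketch below builds a supra-adjacency matrix, the block-matrix encoding commonly used in multilayer network analysis. It is generic numpy code, not muxViz's API; the uniform inter-layer coupling weight w is an assumption.

```python
# Sketch of a supra-adjacency matrix for a multilayer network:
# per-layer adjacency matrices on the block diagonal, plus identity
# couplings linking each node to its counterparts in other layers.
import numpy as np

def supra_adjacency(layers, w=1.0):
    """layers: list of (n x n) adjacency matrices, one per layer."""
    n = layers[0].shape[0]
    L = len(layers)
    S = np.zeros((n * L, n * L))
    for a in range(L):
        S[a*n:(a+1)*n, a*n:(a+1)*n] = layers[a]
        for b in range(L):
            if a != b:
                S[a*n:(a+1)*n, b*n:(b+1)*n] = w * np.eye(n)
    return S

A1 = np.array([[0, 1], [1, 0]])  # layer 1: two nodes, one edge
A2 = np.array([[0, 0], [0, 0]])  # layer 2: no intra-layer edges
print(supra_adjacency([A1, A2]))
```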

    Application of Group Testing for Analyzing Noisy Networks

    My dissertation focuses on developing scalable algorithms for analyzing large complex networks and evaluating how the results change with perturbations to the network. Network analysis has become a ubiquitous and very effective tool in big data analysis, particularly for understanding the mechanisms of complex systems that arise in diverse disciplines such as cybersecurity [83], biology [15], sociology [5], and epidemiology [7]. However, data from real-world systems are inherently noisy because they are influenced by fluctuations in experiments, subjective interpretation of data, and limitations of computing resources. The corresponding networks are therefore also approximate. This research addresses the problem of obtaining accurate results from large noisy networks efficiently. The dissertation has four main components. The first consists of efficient and scalable algorithms for centrality computations that produce reliable results on noisy networks. Two novel contributions in this area are a group-testing-based algorithm [16] for identifying high-centrality vertices that is substantially faster than current methods, and an algorithm for computing the betweenness centrality of a specific vertex. The second component consists of quantitative metrics that measure how different noise models affect analysis results. We implemented a uniform perturbation model based on random addition/deletion of edges, and quantified the stability of a network by investigating the effect perturbations have on the top-k ranked vertices and on the local structural properties of the top-ranked vertices. The third component is efficient software for network analysis: I have been part of the development of ESSENS (Extensible, Scalable Software for Evolving NetworkS) [76], a software package that effectively supports our algorithms on large networks. The fourth component is a literature review of the noise models researchers have applied to networks and the methods they have used to quantify the stability, sensitivity, robustness, and reliability of networks. Together, these four aspects lead to efficient, accurate, and highly scalable algorithms for analyzing noisy networks.
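    A minimal sketch of the uniform perturbation experiment described above, assuming networkx: perturb a graph by deleting and adding a fraction of edges uniformly at random, then measure the overlap of the top-k betweenness-centrality vertices before and after. Function and parameter names are illustrative, not taken from the dissertation or from ESSENS.

```python
# Illustrative uniform perturbation model: delete m random edges,
# add m random non-edges, then compare top-k betweenness rankings.
import random
import networkx as nx

def perturb(G, frac, seed=0):
    rng = random.Random(seed)
    H = G.copy()
    m = int(frac * H.number_of_edges())
    H.remove_edges_from(rng.sample(list(H.edges()), m))
    nodes = list(H.nodes())
    added = 0
    while added < m:
        u, v = rng.sample(nodes, 2)
        if not H.has_edge(u, v):
            H.add_edge(u, v)
            added += 1
    return H

def topk_overlap(G, H, k=10):
    top = lambda g: {v for v, _ in sorted(nx.betweenness_centrality(g).items(),
                                          key=lambda x: -x[1])[:k]}
    return len(top(G) & top(H)) / k  # 1.0 = top-k unchanged by noise

G = nx.erdos_renyi_graph(200, 0.05, seed=1)
print(topk_overlap(G, perturb(G, 0.05)))
```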

    DIAMIN: a software library for the distributed analysis of large-scale molecular interaction networks

    Background: Huge amounts of molecular interaction data are continuously produced and stored in public databases. Although many bioinformatics tools have been proposed in the literature for their analysis, based on modeling them through different types of biological networks, several problems remain unsolved when the analysis must be performed at large scale.

    Results: We propose DIAMIN, a high-level software library that facilitates the development of applications for the efficient analysis of large-scale molecular interaction networks. DIAMIN relies on distributed computing and is implemented in Java on top of the Apache Spark framework. It delivers a set of functionalities implementing different tasks on an abstract representation of very large graphs, providing built-in support for methods and algorithms commonly used to analyze these networks. DIAMIN has been tested on data retrieved from two of the most widely used molecular interaction databases, proving to be highly efficient and scalable. As shown by several provided examples, DIAMIN can be used by users without any distributed-programming experience to perform various types of data analysis and to implement new algorithms based on its primitives.

    Conclusions: DIAMIN proved successful in allowing users to solve specific biological problems that can be modeled with biological networks. The software is freely available, which will hopefully foster its rapid adoption by the scientific community for both specific data-analysis tasks and more complex ones.
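    DIAMIN's own Java API is not reproduced here. As a rough illustration of the distributed primitives such a library builds on, the PySpark sketch below computes vertex degrees from an edge list with flatMap/reduceByKey; all names are illustrative.

```python
# Sketch of the distributed style underlying Spark-based graph analysis:
# each edge emits both endpoints, and degrees are summed per vertex.
from pyspark import SparkContext

sc = SparkContext(appName="degree-sketch")
edges = sc.parallelize([("a", "b"), ("b", "c"), ("a", "c")])

degrees = (edges
           .flatMap(lambda e: [(e[0], 1), (e[1], 1)])  # emit both endpoints
           .reduceByKey(lambda x, y: x + y))           # sum counts per vertex

print(degrees.collect())  # e.g. [('a', 2), ('b', 2), ('c', 2)]
sc.stop()
```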

    Algorithms for effective querying of compound graph-based pathway databases

    Background: Graph-based pathway ontologies and databases are widely used to represent data about cellular processes. This representation makes it possible to programmatically integrate cellular networks and to investigate them using the well-understood concepts of graph theory in order to predict their structural and dynamic properties. An extension of this graph representation, namely hierarchically structured or compound graphs, in which a member of a biological network may recursively contain a sub-network of a logically similar group of biological objects, provides many additional benefits for the analysis of biological pathways, including reduction of complexity by decomposition into distinct components or modules. In this regard, it is essential to effectively query such integrated large compound networks to extract the sub-networks of interest with the help of efficient algorithms and software tools.

    Results: Towards this goal, we developed a querying framework, along with a number of graph-theoretic algorithms from simple neighborhood queries to shortest paths to feedback loops, that is applicable to all sorts of graph-based pathway databases, from PPIs (protein-protein interactions) to metabolic and signaling pathways. The framework is unique in that it can account for compound or nested structures and ubiquitous entities present in the pathway data. In addition, the queries may be related to each other through "AND" and "OR" operators, and can be recursively organized into a tree, in which the result of one query might be a source and/or target for another, to form more complex queries. The algorithms were implemented within the querying component of a new version of the software tool PATIKAweb (Pathway Analysis Tool for Integration and Knowledge Acquisition) and have proven useful for answering a number of biologically significant questions for large graph-based pathway databases.

    Conclusion: The PATIKA Project Web site is http://www.patika.org. PATIKAweb version 2.1 is available at http://web.patika.org.
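    As a minimal sketch of the simplest query type mentioned above (a neighborhood query), the code below runs a depth-limited breadth-first search from a set of seed nodes. It ignores the compound/nested structures and the AND/OR query composition that the actual framework handles; the toy pathway data is an assumption.

```python
# Depth-limited BFS neighborhood query over a plain adjacency dict.
from collections import deque

def neighborhood(adj, seeds, depth):
    """Return all nodes within `depth` hops of any seed node."""
    seen = set(seeds)
    frontier = deque((s, 0) for s in seeds)
    while frontier:
        node, d = frontier.popleft()
        if d == depth:
            continue
        for nbr in adj.get(node, ()):
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, d + 1))
    return seen

# Hypothetical toy signaling chain, for illustration only.
adj = {"EGF": ["EGFR"], "EGFR": ["RAS"], "RAS": ["RAF"], "RAF": ["MEK"]}
print(neighborhood(adj, {"EGF"}, 2))  # {'EGF', 'EGFR', 'RAS'}
```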

    Proceedings of the Conference on Human and Economic Resources

    Recent developments in information technologies and telecommunications have given rise to an extraordinary increase in data transactions in the financial markets. In large and transparent markets with lower transaction and information costs, financial participants react more rapidly to changes in the profitability of their assets and in their perception of the risks of different financial instruments. If the rapidity of reaction of financial players is the main feature of globalized markets, then only advanced information technologies that use data resources efficiently are capable of reflecting the complex nature of financial markets. The aim of this paper is to show how new information technologies affect the modelling of financial markets and decisions by using limited data resources within an intelligent system. Using intelligent information systems, mainly neural networks, the paper shows how limited economic data can be used for efficient economic decisions in global financial markets. Advances in microprocessors and software technologies make it possible to develop increasingly powerful systems at reasonable cost. These technologies have created artificial systems that imitate the human brain for efficient analysis of economic data. According to Hertz, Krogh and Palmer (1991), artificial neural networks, which have a structure similar to the brain's, consist of nodes passing activation signals to each other. When the incoming activation signals from other nodes are combined, a node may produce an activation signal, modified by the connection weight between it and the node to which it is linked. Using financial data from international foreign exchange markets, namely daily time series of the EUR/USD parity, and employing certain neural network algorithms, the paper shows that new information technologies have advantages for the efficient use of limited economic data in modeling. New information technologies have many advantages over statistical methods for efficient data modeling: they are capable of analyzing complex patterns quickly and with a high degree of accuracy, and since "artificial" information systems make no assumptions about the distribution of the data, they are not biased in their analysis. Especially when markets are non-linear and complex, intelligent systems are more powerful at explaining market behavior in chaotic environments. With more advanced information technologies it may become possible to model the full complexity of economic life; future research needs a stronger interaction between economics and computer science.

    Keywords: neural networks, knowledge, information technology, communication technology
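    To make the node-and-weight mechanism cited from Hertz, Krogh and Palmer concrete, here is a minimal numpy sketch of a one-hidden-layer network trained by gradient descent on toy lagged-return data. The architecture, the synthetic data, and all hyperparameters are assumptions for illustration, not the paper's actual model of the EUR/USD series.

```python
# Toy feedforward network: nodes combine weighted incoming signals,
# apply an activation (tanh), and weights are adjusted by gradient descent.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))  # 5 lagged returns per sample (synthetic)
y = X @ np.array([0.4, 0.3, 0.2, 0.1, 0.0]) + 0.01 * rng.normal(size=500)

W1 = rng.normal(scale=0.1, size=(5, 8)); b1 = np.zeros(8)
W2 = rng.normal(scale=0.1, size=(8, 1)); b2 = np.zeros(1)
lr = 0.01

for _ in range(2000):
    h = np.tanh(X @ W1 + b1)            # combine inputs via connection weights
    pred = (h @ W2 + b2).ravel()
    err = pred - y
    # backpropagate the squared-error gradient
    gW2 = h.T @ err[:, None] / len(y); gb2 = err.mean(keepdims=True)
    dh = (err[:, None] @ W2.T) * (1 - h**2)
    gW1 = X.T @ dh / len(y); gb1 = dh.mean(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2; W1 -= lr * gW1; b1 -= lr * gb1

print("MSE:", float(np.mean((pred - y) ** 2)))
```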

    Efficiently Counting Complex Multilayer Temporal Motifs in Large-Scale Networks

    This paper proposes novel algorithms for efficiently counting complex network motifs in dynamic networks that change over time. Network motifs are small characteristic configurations of a few nodes and edges, and have repeatedly been shown to provide insightful information for understanding the meso-level structure of a network. Here, we deal with counting more complex temporal motifs in large-scale networks that may consist of millions of nodes and edges. The first contribution is an efficient approach to counting temporal motifs in multilayer networks and networks with partial timing, two prevalent aspects of many real-world complex networks. We analyze the complexity of these algorithms and empirically validate their performance on a number of real-world user communication networks extracted from online knowledge exchange platforms. Among other things, we find that the multilayer aspects provide significant insights into how complex user interaction patterns differ substantially between online platforms. The second contribution is an analysis of the viability of motif counting algorithms for motifs larger than the triad motifs studied in previous work. We provide a novel categorization of motifs of size four and determine how, and at what computational cost, these motifs can still be counted efficiently. In doing so, we delineate the "computational frontier" of temporal motif counting algorithms.

    Algorithms and the Foundations of Software Technology
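    As a hedged illustration of what temporal motif counting means, the brute-force sketch below counts directed triangles whose three timestamped edges occur in order within a window delta. The paper's algorithms are far more efficient and handle multilayer and partially timed data, which this toy deliberately does not.

```python
# Brute-force temporal triangle count: u->v, v->w, w->u with increasing
# timestamps spanning at most delta. Input events must be sorted by time.
from itertools import combinations

def temporal_triangles(events, delta):
    count = 0
    for e1, e2, e3 in combinations(events, 3):
        (u, v, t1), (v2, w, t2), (w2, u2, t3) = e1, e2, e3
        if v == v2 and w == w2 and u == u2 and u != w and t3 - t1 <= delta:
            count += 1
    return count

events = [("a", "b", 1), ("b", "c", 2), ("c", "a", 3), ("c", "a", 9)]
print(temporal_triangles(events, delta=5))  # 1
```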

    ADAM: Analysis of Discrete Models of Biological Systems Using Computer Algebra

    Background: Many biological systems are modeled qualitatively with discrete models, such as probabilistic Boolean networks, logical models, Petri nets, and agent-based models, to gain a better understanding of them. The computational complexity of analyzing the complete dynamics of these models grows exponentially in the number of variables, which impedes working with complex models. Software tools to analyze discrete models exist, but they either lack the algorithmic functionality to analyze complex models deterministically or are inaccessible to many users because they require understanding of the underlying algorithm and implementation, lack a graphical user interface, or are hard to install. Efficient analysis methods that are accessible to modelers and easy to use are needed.

    Results: We propose a method for efficiently identifying attractors and introduce the web-based tool Analysis of Dynamic Algebraic Models (ADAM), which provides this and other analysis methods for discrete models. ADAM automatically converts several discrete model types into polynomial dynamical systems and analyzes their dynamics using tools from computer algebra. Specifically, we propose a method that reduces identifying the attractors of a discrete model to solving a system of polynomial equations, a long-studied problem in computer algebra. Based on extensive experimentation with both discrete models arising in systems biology and randomly generated networks, we found that the algebraic algorithms presented in this manuscript are fast for systems with the structure maintained by most biological systems, namely sparseness and robustness. For a large set of published complex discrete models, ADAM identified the attractors in less than one second.

    Conclusions: Discrete modeling techniques are a useful tool for analyzing complex biological systems, and there is a need in the biological community for accessible, efficient analysis tools. ADAM provides analysis methods based on mathematical algorithms as a web-based tool for several different input formats, and it makes analysis of complex models accessible to a larger community, as it is platform-independent as a web service and does not require understanding of the underlying mathematics.
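    To illustrate the core idea (attractors of a Boolean network viewed as solutions of polynomial equations over GF(2)), here is a tiny sketch that encodes AND/OR/NOT as polynomials and finds the fixed points of a hypothetical 3-node network by exhaustive search. ADAM itself uses computer-algebra methods rather than enumeration; the example network is an assumption.

```python
# Boolean network as a polynomial dynamical system over GF(2):
# AND -> x*y, OR -> x+y+x*y, NOT -> 1+x. Steady states satisfy f(x) = x.
from itertools import product

f = [
    lambda x: (x[1] * x[2]) % 2,              # x0' = x1 AND x2
    lambda x: (x[0] + x[2] + x[0]*x[2]) % 2,  # x1' = x0 OR x2
    lambda x: x[0],                           # x2' = x0 (copy)
]

fixed_points = [x for x in product((0, 1), repeat=3)
                if all(fi(x) == x[i] for i, fi in enumerate(f))]
print(fixed_points)  # [(0, 0, 0), (1, 1, 1)]
```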