846 research outputs found

    Fractal-like Distributions over the Rational Numbers in High-throughput Biological and Clinical Data

    Recent developments in extracting and processing biological and clinical data are enabling quantitative approaches to the study of living systems. High-throughput sequencing, expression profiles, proteomics, and electronic health records are some examples of such technologies. Extracting meaningful information from these technologies requires careful analysis of the large volumes of data they produce. In this note, we present a set of distributions that commonly appear in the analysis of such data. These distributions have some interesting features: they are discontinuous at the rational numbers but continuous at the irrational numbers, and they possess a certain self-similar (fractal-like) structure. The first set of examples presented here is drawn from a high-throughput sequencing experiment, where the self-similar distributions appear in the evaluation of the sequencing technology's error rate and in the identification of tumorigenic genomic alterations. The other examples are obtained from risk factor evaluation and the analysis of relative disease prevalence and comorbidity as they appear in electronic clinical data. The distributions are also relevant to the identification of subclonal populations in tumors and to the study of the evolution of infectious diseases, more precisely the study of quasi-species and intrahost diversity of viral populations.
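
One way to see why ratios of integer counts pile up at simple rationals is the toy sketch below. It is not the authors' analysis: it assumes every pair (k, n) with 0 <= k <= n <= max_depth is equally likely, and the names `ratio_counts` and `max_depth` are hypothetical. Under that assumption a reduced fraction p/q is hit once for every multiple of q, so spikes shrink as the denominator grows.

```python
from collections import Counter
from fractions import Fraction


def ratio_counts(max_depth):
    """Count how often each reduced ratio k/n occurs for 1 <= n <= max_depth.

    Toy assumption: every (k, n) pair is equally likely. Real read depths
    and variant counts are not uniform.
    """
    counts = Counter()
    for n in range(1, max_depth + 1):
        for k in range(n + 1):
            # Fraction reduces k/n, so 1/2, 2/4, 3/6, ... share one key.
            counts[Fraction(k, n)] += 1
    return counts


counts = ratio_counts(12)
# Ratios with small denominators are hit most often.
for value in (Fraction(1, 2), Fraction(1, 3), Fraction(1, 5), Fraction(1, 7)):
    print(value, counts[value])
```

In this toy setting the count at p/q equals the number of multiples of q up to `max_depth`, which gives the self-similar pattern of tall spikes at 1/2, shorter ones at 1/3 and 2/3, and so on.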

    Power Law of Customers' Expenditures in Convenience Stores

    In a convenience store chain, the tail of the cumulative density function of a person's expenditure during a single shopping trip follows a power law with an exponent of -2.5. The exponent is independent of the location of the store, the shopper's age, the day of the week, and the time of day. Comment: 9 pages, 5 figures. Accepted for publication in Journal of the Physical Society of Japan Vol.77No.
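
A tail exponent of this kind is often estimated with the Hill (maximum-likelihood) estimator. The sketch below is an illustration only, not the paper's procedure: it draws synthetic Pareto samples rather than store data, and `hill_exponent`, `spend` and `x_min` are hypothetical names. It returns the positive magnitude of the exponent, so a tail decaying as x^(-2.5) yields a value near 2.5.

```python
import math
import random


def hill_exponent(samples, x_min):
    """Hill / maximum-likelihood estimate of alpha in P(X > x) ~ x**(-alpha).

    Only samples at or above x_min (the assumed start of the tail) are used.
    """
    tail = [x for x in samples if x >= x_min]
    return len(tail) / sum(math.log(x / x_min) for x in tail)


# Synthetic stand-in for expenditures: Pareto samples with alpha = 2.5.
rng = random.Random(0)
spend = [rng.paretovariate(2.5) for _ in range(20000)]
alpha_hat = hill_exponent(spend, x_min=1.0)
print(round(alpha_hat, 2))
```

With real data the choice of `x_min` matters, since the power law is claimed only for the tail.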

    Off the Beaten Path: Let's Replace Term-Based Retrieval with k-NN Search

    Retrieval pipelines commonly rely on a term-based search to obtain candidate records, which are subsequently re-ranked. Some candidates are missed by this approach, e.g., due to a vocabulary mismatch. We address this issue by replacing the term-based search with a generic k-NN retrieval algorithm, where a similarity function can take into account subtle term associations. While an exact brute-force k-NN search using this similarity function is slow, we demonstrate that an approximate algorithm can be nearly two orders of magnitude faster at the expense of only a small loss in accuracy. A retrieval pipeline using an approximate k-NN search can be more effective and efficient than the term-based pipeline. This opens up new possibilities for designing effective retrieval pipelines. Our software (including data-generating code) and derivative data based on the Stack Overflow collection are available online.
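
The exact brute-force baseline mentioned above can be sketched as follows. This is a minimal illustration, not the authors' software: cosine similarity over dense vectors stands in for their similarity function, and `cosine`, `knn` and the toy `records` are hypothetical.

```python
import heapq
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def knn(query, records, k):
    """Exact brute-force k-NN: score every record, keep the k most similar ids."""
    best = heapq.nlargest(k, records.items(), key=lambda item: cosine(query, item[1]))
    return [record_id for record_id, _ in best]


records = {
    "a": [1.0, 0.0, 0.0],
    "b": [0.9, 0.1, 0.0],
    "c": [0.0, 1.0, 0.0],
    "d": [0.0, 0.0, 1.0],
}
print(knn([1.0, 0.0, 0.0], records, k=2))
```

Every query scans all records, which is the cost an approximate index tries to avoid in exchange for a small loss in accuracy.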

    Degree distributions of growing networks

    The in-degree and out-degree distributions of a growing network model are determined. The in-degree is the number of incoming links to a given node, and the out-degree is the number of outgoing links. The network is built by (i) creation of new nodes which each immediately attach to a pre-existing node, and (ii) creation of new links between pre-existing nodes. This process naturally generates correlated in- and out-degree distributions. When the node and link creation rates are linear functions of node degree, these distributions exhibit distinct power-law forms. By tuning the parameters in these rates to reasonable values, exponents which agree with those of the web graph are obtained.
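
The two growth steps can be simulated with a short sketch. The specific rates are assumptions chosen for illustration: a target is picked with weight (in-degree + `lam`), a source among existing nodes with weight (out-degree + `mu`), and a new node is created with probability `p`. The names `grow_network`, `p`, `lam` and `mu` are hypothetical and not taken from the paper.

```python
import random


def grow_network(steps, p=0.2, lam=1.0, mu=1.0, seed=0):
    """Grow a directed network, adding one link per step.

    With probability p a new node is created and links to an existing node;
    otherwise a link is added between two existing nodes. Selection weights
    are linear in degree (an assumed form). Self-loops are not excluded.
    """
    rng = random.Random(seed)
    in_deg, out_deg, edges = [0], [0], []
    for _ in range(steps):
        nodes = range(len(in_deg))
        # Target chosen with weight (in-degree + lam).
        target = rng.choices(nodes, weights=[d + lam for d in in_deg])[0]
        if rng.random() < p:
            # Step (i): a new node appears and attaches to the target.
            source = len(in_deg)
            in_deg.append(0)
            out_deg.append(0)
        else:
            # Step (ii): source chosen with weight (out-degree + mu).
            source = rng.choices(nodes, weights=[d + mu for d in out_deg])[0]
        edges.append((source, target))
        out_deg[source] += 1
        in_deg[target] += 1
    return in_deg, out_deg, edges


in_deg, out_deg, edges = grow_network(2000)
print(len(in_deg), max(in_deg), max(out_deg))
```

Histograms of `in_deg` and `out_deg` from longer runs can then be compared against power-law forms.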

    Truncation of power law behavior in "scale-free" network models due to information filtering

    We formulate a general model for the growth of scale-free networks under information-filtering conditions, that is, when the nodes can process information about only a subset of the existing nodes in the network. We find that the distribution of the number of incoming links to a node follows a universal scaling form, i.e., that it decays as a power law with an exponential truncation controlled not only by the system size but also by a feature not previously considered, the subset of the network "accessible" to the node. We test our model with empirical data for the World Wide Web and find agreement. Comment: LaTeX2e and RevTeX4, 4 pages, 4 figures. Accepted for publication in Physical Review Letters

    Traffic on complex networks: Towards understanding global statistical properties from microscopic density fluctuations

    We study the microscopic time fluctuations of traffic load and the global statistical properties of dense traffic of particles on scale-free cyclic graphs. For a wide range of driving rates R the traffic is stationary and the load time series exhibits antipersistence due to the regulatory role of the superstructure associated with two hub nodes in the network. We discuss how the superstructure affects the functioning of the network at high traffic density and at the jamming threshold. The degree of correlations systematically decreases with increasing traffic density and eventually disappears when approaching a jamming density Rc. Already before jamming we observe qualitative changes in the global network-load distributions and the particle queuing times. These changes are related to the occurrence of temporary crises in which the network load increases dramatically and then slowly falls back to a value characterizing free flow.

    A primer to common major gastrointestinal post-surgical anatomy on CT—a pictorial review

    The post-operative abdomen can be challenging to interpret, and knowledge of normal post-operative anatomy is important for diagnosing complications. The aim of this pictorial essay is to describe a few selected common, major gastrointestinal surgeries and their clinical indications, and to depict their normal post-operative computed tomography (CT) appearance. This essay provides some clues to identify the surgeries, which can be helpful especially when surgical history is lacking: recognition of the organ(s) involved, determination of what was resected, and familiarity with the type of anastomoses used.

    Large-scale structure of a nation-wide production network

    Production in an economy is a set of firms' activities as suppliers and customers; a firm buys goods from other firms, adds value, and sells products to others in a giant network of production. Empirical study is lacking despite the fact that the structure of the production network is important for understanding and modeling many aspects of economic dynamics. We study a nation-wide production network comprising a million firms and millions of supplier-customer links by using recent statistical methods developed in physics. The empirical analysis shows a scale-free degree distribution, disassortativity, correlation of degree with firm size, and community structure with sectoral and regional modules. Since suppliers usually provide credit to their customers, who supply it to theirs in turn, each link is actually a creditor-debtor relationship. We also study chains of failures or bankruptcies that take place along those links in the network, and the corresponding avalanche-size distribution. Comment: 17 pages with 8 figures; revised section VI and references added

    Graph Metrics for Temporal Networks

    Temporal networks, i.e., networks in which the interactions among a set of elementary units change over time, can be modelled in terms of time-varying graphs, which are time-ordered sequences of graphs over a set of nodes. In such graphs, the concepts of node adjacency and reachability crucially depend on the exact temporal ordering of the links. Consequently, all the concepts and metrics proposed and used for the characterisation of static complex networks have to be redefined or appropriately extended to time-varying graphs, in order to take into account the effects of time ordering on causality. In this chapter we discuss how to represent temporal networks and we review the definitions of walks, paths, connectedness and connected components valid for graphs in which the links fluctuate over time. We then focus on temporal node-node distance, and we discuss how to characterise link persistence and the temporal small-world behaviour in this class of networks. Finally, we discuss the extension of classic centrality measures, including closeness, betweenness and spectral centrality, to the case of time-varying graphs, and we review the work on temporal motifs analysis and the definition of modularity for temporal graphs. Comment: 26 pages, 5 figures, Chapter in Temporal Networks (Petter Holme and Jari Saramäki editors). Springer. Berlin, Heidelberg 201
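
The dependence of reachability on link ordering can be shown with a small sketch. It rests on assumptions that may differ from the chapter's definitions: links are undirected, each snapshot is a set of node pairs, and at most one hop is taken per snapshot. The function name `reachable_from` is hypothetical.

```python
def reachable_from(source, snapshots):
    """Return the nodes reachable from source by a time-respecting walk.

    snapshots is a time-ordered list of sets of undirected links (u, v).
    Assumption: a walk may cross at most one link per snapshot.
    """
    reached = {source}
    for links in snapshots:
        newly_reached = set()
        for u, v in links:
            if u in reached:
                newly_reached.add(v)
            if v in reached:
                newly_reached.add(u)
        # Update only after scanning the snapshot: one hop per time step.
        reached |= newly_reached
    return reached


forward = [{("a", "b")}, {("b", "c")}]
backward = [{("b", "c")}, {("a", "b")}]
print(sorted(reachable_from("a", forward)))
print(sorted(reachable_from("a", backward)))
```

Both sequences contain the same links, yet "c" is reachable from "a" only when the a-b link appears before the b-c link, which is why static definitions need to be extended.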

    Random graphs with arbitrary degree distributions and their applications

    Recent work on the structure of social networks and the internet has focussed attention on graphs with distributions of vertex degree that are significantly different from the Poisson degree distributions that have been widely studied in the past. In this paper we develop in detail the theory of random graphs with arbitrary degree distributions. In addition to simple undirected, unipartite graphs, we examine the properties of directed and bipartite graphs. Among other results, we derive exact expressions for the position of the phase transition at which a giant component first forms, the mean component size, the size of the giant component if there is one, the mean number of vertices a certain distance away from a randomly chosen vertex, and the average vertex-vertex distance within a graph. We apply our theory to some real-world graphs, including the world-wide web and collaboration graphs of scientists and Fortune 1000 company directors. We demonstrate that in some cases random graphs with appropriate distributions of vertex degree predict with surprising accuracy the behavior of the real world, while in others there is a measurable discrepancy between theory and reality, perhaps indicating the presence of additional social structure in the network that is not captured by the random graph. Comment: 19 pages, 11 figures, some new material added in this version along with minor updates and corrections
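
For undirected graphs, the giant-component condition in this theory is commonly written as z2 > z1, where z1 = <k> is the mean degree and z2 = <k^2> - <k> is the mean number of second neighbours; below the transition the mean component size is 1 + z1^2 / (z1 - z2). The sketch below evaluates these quantities for a degree distribution supplied as a dictionary. The function names and example distributions are hypothetical, and the abstract itself does not state the formulas.

```python
def neighbour_moments(degree_probs):
    """Return (z1, z2) for a degree distribution {degree: probability}.

    z1 = <k> is the mean degree; z2 = <k^2> - <k> is the mean number of
    second neighbours in the random-graph (configuration model) setting.
    """
    z1 = sum(k * p for k, p in degree_probs.items())
    k2 = sum(k * k * p for k, p in degree_probs.items())
    return z1, k2 - z1


def has_giant_component(degree_probs):
    """Giant component exists when z2 > z1, i.e. sum_k k(k - 2) p_k > 0."""
    z1, z2 = neighbour_moments(degree_probs)
    return z2 > z1


def mean_component_size(degree_probs):
    """Mean component size 1 + z1^2 / (z1 - z2); valid only below the transition."""
    z1, z2 = neighbour_moments(degree_probs)
    if z2 >= z1:
        raise ValueError("at or above the transition: mean size diverges")
    return 1 + z1 * z1 / (z1 - z2)


sparse = {1: 0.9, 3: 0.1}  # mostly degree-1 vertices
dense = {1: 0.5, 3: 0.5}   # enough degree-3 vertices to percolate
print(has_giant_component(sparse), mean_component_size(sparse))
print(has_giant_component(dense))
```

The directed and bipartite cases treated in the paper use different conditions and are not covered by this sketch.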