Fractal-like Distributions over the Rational Numbers in High-throughput Biological and Clinical Data
Recent developments in extracting and processing biological and clinical data are allowing quantitative approaches to studying living systems. High-throughput sequencing, expression profiles, proteomics, and electronic health records are some examples of such technologies. Extracting meaningful information from those technologies requires careful analysis of the large volumes of data they produce. In this note, we present a set of distributions that commonly appear in the analysis of such data. These distributions present some interesting features: they are discontinuous in the rational numbers, but continuous in the irrational numbers, and possess a certain self-similar (fractal-like) structure. The first set of examples presented here is drawn from a high-throughput sequencing experiment. Here, the self-similar distributions appear as part of the evaluation of the error rate of the sequencing technology and the identification of tumorigenic genomic alterations. The other examples are obtained from risk factor evaluation and analysis of relative disease prevalence and comorbidity as these appear in electronic clinical data. The distributions are also relevant to the identification of subclonal populations in tumors and the study of the evolution of infectious diseases, and more precisely the study of quasi-species and intrahost diversity of viral populations.
Power Law of Customers' Expenditures in Convenience Stores
In a convenience store chain, a tail of the cumulative density function of
the expenditure of a person during a single shopping trip follows a power law
with an exponent of -2.5. The exponent is independent of the location of the
store, the shopper's age, the day of week, and the time of day.
Comment: 9 pages, 5 figures. Accepted for publication in Journal of the
Physical Society of Japan Vol.77No.
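As an illustration of how a tail exponent of this kind might be estimated, the following is a minimal sketch, not the authors' method. It assumes the tail of the cumulative function has the form P(X > x) ~ x^(-alpha) above a chosen threshold, and uses the standard maximum-likelihood (Hill) estimator. The function name `tail_exponent`, the threshold `x_min`, and the synthetic data are all hypothetical choices made for this example.

```python
import math
import random


def tail_exponent(samples, x_min):
    """Hill (maximum-likelihood) estimate of alpha, assuming
    P(X > x) ~ x**(-alpha) for x >= x_min."""
    tail = [x for x in samples if x >= x_min]
    if not tail:
        raise ValueError("no samples at or above x_min")
    log_sum = sum(math.log(x / x_min) for x in tail)
    return len(tail) / log_sum


# Synthetic check (not real expenditure data): draw Pareto samples with a
# known exponent by inverse-transform sampling, then recover the exponent.
rng = random.Random(0)
alpha_true = 2.5
x_min = 1.0
samples = [x_min * (1.0 - rng.random()) ** (-1.0 / alpha_true)
           for _ in range(50_000)]
alpha_hat = tail_exponent(samples, x_min)
```

On real data the result depends strongly on the choice of `x_min`, so the estimate is usually checked over a range of thresholds.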
Off the Beaten Path: Let's Replace Term-Based Retrieval with k-NN Search
Retrieval pipelines commonly rely on a term-based search to obtain candidate
records, which are subsequently re-ranked. Some candidates are missed by this
approach, e.g., due to a vocabulary mismatch. We address this issue by
replacing the term-based search with a generic k-NN retrieval algorithm, where
a similarity function can take into account subtle term associations. While an
exact brute-force k-NN search using this similarity function is slow, we
demonstrate that an approximate algorithm can be nearly two orders of magnitude
faster at the expense of only a small loss in accuracy. A retrieval pipeline
using an approximate k-NN search can be more effective and efficient than the
term-based pipeline. This opens up new possibilities for designing effective
retrieval pipelines. Our software (including data-generating code) and
derivative data based on the Stack Overflow collection are available online.
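The exact brute-force k-NN search mentioned in this abstract can be sketched as follows. This is an illustrative baseline only: the abstract does not specify the similarity function, so cosine similarity over dense vectors is used here as a stand-in, and the names `brute_force_knn`, `cosine`, and the toy records are hypothetical.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def brute_force_knn(query, records, k, similarity=cosine):
    """Score every record against the query and return the ids of the k best.

    The cost is one similarity evaluation per record, which is why an exact
    search becomes slow on large collections.
    """
    scored = ((similarity(query, vec), rec_id) for rec_id, vec in records.items())
    return [rec_id for _, rec_id in heapq.nlargest(k, scored)]


# Toy collection (made-up vectors, not the Stack Overflow data).
records = {
    "d1": [1.0, 0.0, 0.0],
    "d2": [0.9, 0.1, 0.0],
    "d3": [0.0, 1.0, 0.0],
    "d4": [0.0, 0.0, 1.0],
}
top2 = brute_force_knn([1.0, 0.05, 0.0], records, k=2)
```

An approximate k-NN method would replace the exhaustive scan with an index that inspects only a fraction of the records, trading a small loss in accuracy for speed.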
Degree distributions of growing networks
The in-degree and out-degree distributions of a growing network model are determined. The in-degree is the number of incoming links to a given node (and vice versa for out-degree). The network is built by (i) creation of new nodes which each immediately attach to a pre-existing node, and (ii) creation of new links between pre-existing nodes. This process naturally generates correlated in- and out-degree distributions. When the node and link creation rates are linear functions of node degree, these distributions exhibit distinct power-law forms. By tuning the parameters in these rates to reasonable values, exponents which agree with those of the web graph are obtained.
Truncation of power law behavior in "scale-free" network models due to information filtering
We formulate a general model for the growth of scale-free networks under
filtering information conditions--that is, when the nodes can process
information about only a subset of the existing nodes in the network. We find
that the distribution of the number of incoming links to a node follows a
universal scaling form, i.e., that it decays as a power law with an exponential
truncation controlled not only by the system size but also by a feature not
previously considered, the subset of the network "accessible" to the node. We
test our model with empirical data for the World Wide Web and find agreement.
Comment: LaTeX2e and RevTeX4, 4 pages, 4 figures. Accepted for publication in
Physical Review Letters
Traffic on complex networks: Towards understanding global statistical properties from microscopic density fluctuations
We study the microscopic time fluctuations of traffic load and the global statistical properties of a dense traffic of particles on scale-free cyclic graphs. For a wide range of driving rates R the traffic is stationary and the load time series exhibits antipersistence due to the regulatory role of the superstructure associated with two hub nodes in the network. We discuss how the superstructure affects the functioning of the network at high traffic density and at the jamming threshold. The degree of correlations systematically decreases with increasing traffic density and eventually disappears when approaching a jamming density Rc. Already before jamming we observe qualitative changes in the global network-load distributions and the particle queuing times. These changes are related to the occurrence of temporary crises in which the network load increases dramatically, and then slowly falls back to a value characterizing free flow.
A primer to common major gastrointestinal post-surgical anatomy on CT—a pictorial review
The post-operative abdomen can be challenging, and knowledge of normal post-operative anatomy is important for diagnosing complications. The aim of this pictorial essay is to describe a few selected common, major gastrointestinal surgeries and their clinical indications, and to depict their normal post-operative computed tomography (CT) appearance. This essay provides some clues to identify the surgeries, which can be helpful especially when surgical history is lacking: recognition of the organ(s) involved, determination of what was resected, and familiarity with the type of anastomoses used.
Large-scale structure of a nation-wide production network
Production in an economy is a set of firms' activities as suppliers and
customers; a firm buys goods from other firms, adds value, and sells
products to others in a giant network of production. Empirical study is lacking
despite the fact that the structure of the production network is important for
understanding and modelling many aspects of dynamics in the economy. We study a
nation-wide production network comprising a million firms and millions of
supplier-customer links by using recent statistical methods developed in
physics. The empirical analysis shows a scale-free degree distribution,
disassortativity, correlation of degree with firm size, and community structure
with sectoral and regional modules. Since suppliers usually provide credit to
their customers, who supply it to theirs in turn, each link is actually a
creditor-debtor relationship. We also study chains of failures or bankruptcies
that take place along those links in the network, and the corresponding
avalanche-size distribution.
Comment: 17 pages with 8 figures; revised section VI and references added
Graph Metrics for Temporal Networks
Temporal networks, i.e., networks in which the interactions among a set of
elementary units change over time, can be modelled in terms of time-varying
graphs, which are time-ordered sequences of graphs over a set of nodes. In such
graphs, the concepts of node adjacency and reachability crucially depend on the
exact temporal ordering of the links. Consequently, all the concepts and
metrics proposed and used for the characterisation of static complex networks
have to be redefined or appropriately extended to time-varying graphs, in order
to take into account the effects of time ordering on causality. In this chapter
we discuss how to represent temporal networks and we review the definitions of
walks, paths, connectedness and connected components valid for graphs in which
the links fluctuate over time. We then focus on temporal node-node distance,
and we discuss how to characterise link persistence and the temporal
small-world behaviour in this class of networks. Finally, we discuss the
extension of classic centrality measures, including closeness, betweenness and
spectral centrality, to the case of time-varying graphs, and we review the work
on temporal motifs analysis and the definition of modularity for temporal
graphs.
Comment: 26 pages, 5 figures, Chapter in Temporal Networks (Petter Holme and
Jari Saramäki editors). Springer. Berlin, Heidelberg 201
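The dependence of reachability on the temporal ordering of links can be illustrated with a small sketch. It assumes a time-varying graph stored as a time-ordered list of undirected edge lists (one per snapshot), and a convention in which a walk traverses at most one link per snapshot; both the representation and the function name `earliest_arrival` are assumptions made for this example, not definitions taken from the chapter.

```python
def earliest_arrival(snapshots, source):
    """Map each node reachable from `source` to the index of the first
    snapshot at which it is reached. The source itself maps to -1.

    A link in snapshot t can only be used from a node that was already
    reached strictly before t, so paths must respect time ordering.
    """
    arrival = {source: -1}
    for t, edges in enumerate(snapshots):
        reached_before = set(arrival)  # frozen view: nodes reached before t
        for u, v in edges:
            for a, b in ((u, v), (v, u)):  # links are treated as undirected
                if a in reached_before and b not in arrival:
                    arrival[b] = t
    return arrival


# Three snapshots; aggregated over time they form the chain A-B-C-D.
snapshots = [
    [("B", "C")],  # t = 0
    [("A", "B")],  # t = 1
    [("C", "D")],  # t = 2
]
from_a = earliest_arrival(snapshots, "A")
from_c = earliest_arrival(snapshots, "C")
```

In this toy example `from_c` contains all four nodes, while `from_a` contains only A and B, because the link B-C occurs before A-B. Temporal reachability is therefore not symmetric even though every link is undirected.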
Random graphs with arbitrary degree distributions and their applications
Recent work on the structure of social networks and the internet has focussed
attention on graphs with distributions of vertex degree that are significantly
different from the Poisson degree distributions that have been widely studied
in the past. In this paper we develop in detail the theory of random graphs
with arbitrary degree distributions. In addition to simple undirected,
unipartite graphs, we examine the properties of directed and bipartite graphs.
Among other results, we derive exact expressions for the position of the phase
transition at which a giant component first forms, the mean component size, the
size of the giant component if there is one, the mean number of vertices a
certain distance away from a randomly chosen vertex, and the average
vertex-vertex distance within a graph. We apply our theory to some real-world
graphs, including the world-wide web and collaboration graphs of scientists and
Fortune 1000 company directors. We demonstrate that in some cases random graphs
with appropriate distributions of vertex degree predict with surprising
accuracy the behavior of the real world, while in others there is a measurable
discrepancy between theory and reality, perhaps indicating the presence of
additional social structure in the network that is not captured by the random
graph.
Comment: 19 pages, 11 figures, some new material added in this version along
with minor updates and corrections
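As a hedged illustration of the phase transition mentioned in this abstract: for undirected random graphs with degree distribution p_k, a commonly used criterion (often attributed to Molloy and Reed) is that a giant component exists when the sum over k of k(k-2)p_k is positive. The sketch below evaluates that sum for a truncated Poisson distribution, where the transition is expected at mean degree 1. The function names and the truncation point `k_max` are choices made for this example.

```python
import math


def molloy_reed_sum(p):
    """Return sum_k k*(k-2)*p[k] for a degree distribution given as a list,
    where p[k] is the probability that a vertex has degree k."""
    return sum(k * (k - 2) * pk for k, pk in enumerate(p))


def poisson_pmf(mean, k_max=100):
    """Poisson probabilities p[0..k_max], built iteratively to avoid
    computing large factorials. The tail beyond k_max is dropped."""
    probs = [math.exp(-mean)]
    for k in range(1, k_max + 1):
        probs.append(probs[-1] * mean / k)
    return probs


# For a Poisson distribution with mean z the sum equals z*(z - 1),
# so it changes sign at z = 1.
below = molloy_reed_sum(poisson_pmf(0.5))  # negative: no giant component
above = molloy_reed_sum(poisson_pmf(1.5))  # positive: giant component
```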