369 research outputs found

    Broadcasting on Random Directed Acyclic Graphs

    We study a generalization of the well-known model of broadcasting on trees. Consider a directed acyclic graph (DAG) with a unique source vertex $X$, and suppose all other vertices have indegree $d \geq 2$. Let the vertices at distance $k$ from $X$ be called layer $k$. At layer $0$, $X$ is given a random bit. At layer $k \geq 1$, each vertex receives $d$ bits from its parents in layer $k-1$, which are transmitted along independent binary symmetric channel edges, and combines them using a $d$-ary Boolean processing function. The goal is to reconstruct $X$ with probability of error bounded away from $1/2$ using the values of all vertices at an arbitrarily deep layer. This question is closely related to models of reliable computation and storage, and information flow in biological networks. In this paper, we analyze randomly constructed DAGs, for which we show that broadcasting is only possible if the noise level is below a certain degree- and function-dependent critical threshold. For $d \geq 3$, and random DAGs with layer sizes $\Omega(\log k)$ and majority processing functions, we identify the critical threshold. For $d = 2$, we establish a similar result for NAND processing functions. We also prove a partial converse for odd $d \geq 3$ illustrating that the identified thresholds are impossible to improve by selecting different processing functions if the decoder is restricted to using a single vertex. Finally, for any noise level, we construct explicit DAGs (using expander graphs) with bounded degree and layer sizes $\Theta(\log k)$ admitting reconstruction. In particular, we show that such DAGs can be generated in deterministic quasi-polynomial time or randomized polylogarithmic time in the depth. These results portray a doubly-exponential advantage for storing a bit in DAGs compared to trees, where $d = 1$ but layer sizes must grow exponentially with depth in order to enable broadcasting.
    Comment: 33 pages, double column format. arXiv admin note: text overlap with arXiv:1803.0752
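
    The threshold behaviour described in this abstract can be illustrated with a toy recursion. The sketch below is not the paper's analysis: it assumes $d = 3$, majority processing, and (as a simplification) that the three bits arriving at a vertex are independent, each equal to the source bit with probability `p`. The function names `majority3_step` and `iterate_layers` are hypothetical.

```python
def majority3_step(p: float, delta: float) -> float:
    """One layer of the idealized recursion for d = 3 with majority voting.

    p     : probability that a parent's bit equals the source bit
    delta : crossover probability of each binary symmetric channel edge
    Assumes (as a simplification) that the three received bits are independent.
    """
    # Probability that a bit is still correct after crossing one noisy edge.
    q = p * (1 - delta) + (1 - p) * delta
    # Majority of three independent bits is correct when at least two are.
    return 3 * q**2 - 2 * q**3


def iterate_layers(delta: float, depth: int, p0: float = 1.0) -> float:
    """Apply the recursion `depth` times, starting from a noiseless source."""
    p = p0
    for _ in range(depth):
        p = majority3_step(p, delta)
    return p


if __name__ == "__main__":
    for delta in (0.10, 0.20):
        print(delta, iterate_layers(delta, depth=200))
```

    In this idealized recursion the fixed point at $1/2$ is unstable when $\delta < 1/6$ and stable when $\delta > 1/6$, so a low noise level keeps `p` bounded away from $1/2$ at any depth while a high noise level drives it to $1/2$.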

    Sequence queries on temporal graphs

    Graphs that evolve over time are called temporal graphs. They can be used to describe and represent real-world networks, including transportation networks, social networks, and communication networks, with higher fidelity and accuracy. However, research is still limited on how to manage large-scale temporal graphs and execute queries over these graphs efficiently and effectively. This thesis investigates the problems of temporal graph data management related to node and edge sequence queries. In temporal graphs, nodes and edges can evolve over time. Therefore, sequence queries on nodes and edges can be key components in managing temporal graphs. In this thesis, the node sequence query decomposes into two parts: graph node similarity and subsequence matching. For node similarity, this thesis proposes a modified tree edit distance that is metric and polynomially computable and has a natural, intuitive interpretation. Note that the proposed node similarity works even for inter-graph nodes and therefore can be used for graph de-anonymization, network transfer learning, and cross-network mining, among other tasks. The subsequence matching query proposed in this thesis is a framework that can be adopted to index generic sequence and time-series data, including trajectory data and even DNA sequences for subsequence retrieval. For edge sequence queries, this thesis proposes an efficient storage and optimized indexing technique that allows for efficient retrieval of temporal subgraphs that satisfy certain temporal predicates. For this problem, this thesis develops a lightweight data management engine prototype that can support time-sensitive temporal graph analytics efficiently even on a single PC.

    Information and Decision Theoretic Approaches to Problems in Active Diagnosis.

    In applications such as active learning or disease/fault diagnosis, one often encounters the problem of identifying an unknown object while minimizing the number of "yes" or "no" questions (queries) posed about that object. This problem has been commonly referred to as object/entity identification or active diagnosis in the literature. In this thesis, we consider several extensions of this fundamental problem that are motivated by practical considerations in real-world, time-critical identification tasks such as emergency response. First, we consider the problem where the objects are partitioned into groups, and the goal is to identify only the group to which the object belongs. We then consider the case where the cost of identifying an object grows exponentially in the number of queries. To address these problems we show that a standard algorithm for object identification, known as the splitting algorithm or generalized binary search (GBS), may be viewed as a generalization of Shannon-Fano coding. We then extend this result to the group-based and the exponential cost settings, leading to new, improved algorithms. We then study the problem of active diagnosis under persistent query noise. Previous work in this area either assumed that the noise is independent or that the underlying query noise distribution is completely known. We make no such assumptions, and introduce an algorithm that returns a ranked list of objects, such that the expected rank of the true object is optimized. Finally, we study the problem of active diagnosis where multiple objects are present, such as in disease/fault diagnosis. Current algorithms in this area have an exponential time complexity, making them slow and intractable. We address this issue by proposing an extension of our rank-based approach to the multiple object scenario, where we optimize the area under the ROC curve of the rank-based output. The AUC criterion allows us to make a simplifying assumption that significantly reduces the complexity of active diagnosis (from exponential to near quadratic), with little or no compromise on the performance quality. Further, we demonstrate the performance of the proposed algorithms through extensive experiments on both synthetic and real-world datasets.
    Ph.D., Electrical Engineering: Systems, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/91606/1/gowtham_1.pd
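
    The splitting idea behind generalized binary search can be sketched in a few lines. This is a minimal, noise-free illustration and not the thesis's algorithms: it assumes each object has a known 0/1 response to every query, and at each step it asks the unused query that divides the remaining prior mass most evenly. The name `gbs_identify` and the data layout are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def gbs_identify(
    priors: Dict[str, float],
    answers: Dict[str, List[int]],
    truth: str,
) -> Tuple[Set[str], List[int]]:
    """Greedy splitting (noise-free sketch).

    priors  : prior probability of each object
    answers : answers[obj][q] is obj's 0/1 response to query q
    truth   : the object actually present (used only to answer queries)
    Returns the surviving candidates and the list of queries asked.
    """
    candidates = set(priors)
    asked: List[int] = []
    n_queries = len(next(iter(answers.values())))

    while len(candidates) > 1:
        best_q, best_gap = None, None
        for q in range(n_queries):
            if q in asked:
                continue
            yes = sum(priors[o] for o in candidates if answers[o][q] == 1)
            no = sum(priors[o] for o in candidates if answers[o][q] == 0)
            if yes == 0 or no == 0:
                continue  # this query cannot separate the candidates
            gap = abs(yes - no)
            if best_gap is None or gap < best_gap:
                best_q, best_gap = q, gap
        if best_q is None:
            break  # remaining candidates are indistinguishable
        asked.append(best_q)
        reply = answers[truth][best_q]
        candidates = {o for o in candidates if answers[o][best_q] == reply}

    return candidates, asked


if __name__ == "__main__":
    priors = {"a": 0.25, "b": 0.25, "c": 0.25, "d": 0.25}
    answers = {"a": [1, 1, 1], "b": [1, 0, 0], "c": [0, 1, 0], "d": [0, 0, 0]}
    print(gbs_identify(priors, answers, truth="c"))
```

    With four equally likely objects, two evenly splitting queries are enough to isolate the true object; the third query, which singles out only one object, is never asked.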

    Second-Price Proxy Auctions in Bidder-Seller Networks

    We analyze a model of Internet auctions. Sellers each offer one item of a heterogeneous good to bidders who have unit-demand preferences. Items are sold in second-price proxy auctions. We derive a perfect Bayesian (epsilon-) equilibrium. Our experimental findings support the theoretical results. An analysis of sellers' revenues in the Vickrey auction reveals that they are non-monotonic in bids for substitutes valuations. We combine these results to investigate incomplete bidder-seller networks.
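
    The second-price rule at the core of this model can be shown with a small sketch. It covers a single item only and makes simplifying assumptions that are not taken from the paper: no bid increments, no reserve price, and ties broken in favour of the bidder listed first. The function name `second_price_outcome` is hypothetical.

```python
from typing import Dict, Tuple


def second_price_outcome(bids: Dict[str, float]) -> Tuple[str, float]:
    """Return (winner, price) for one item under the second-price rule.

    The highest bidder wins and pays the second-highest bid.
    Assumes at least two bidders; ties go to the earlier entry in `bids`.
    """
    if len(bids) < 2:
        raise ValueError("need at least two bids")
    # sorted() is stable, so equal bids keep their original order.
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1]
    return winner, price


if __name__ == "__main__":
    print(second_price_outcome({"ann": 10.0, "bob": 7.0, "cat": 12.0}))
```

    Here the winner's own bid affects only whether she wins, not what she pays, which is the property that distinguishes second-price formats from first-price ones.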

    Networks and Learning in Game Theory.

    This work concentrates on two topics: networks and game theory, and learning in games. The first part of this thesis looks at network games and the role of incomplete information in such games. It is assumed that players are located on a network and interact with their neighbors in the network. Players only have incomplete information on the network structure. The first part of this thesis studies how players' beliefs over the network they belong to affect game-theoretic outcomes, and develops a natural model for players' beliefs. The second part of this thesis focuses on learning in games. An intuitive learning model is introduced, and the predictions of this model are analyzed. Furthermore, learning in a class of congestion games is studied from different perspectives.

    Ordinal regression methods: survey and experimental study

    Ordinal regression problems are those machine learning problems where the objective is to classify patterns using a categorical scale which shows a natural order between the labels. Many real-world applications present this labelling structure, and that has increased the number of methods and algorithms developed in recent years in this field. Although ordinal regression can be addressed using standard nominal classification techniques, there are several algorithms which can specifically benefit from the ordering information. Therefore, this paper is aimed at reviewing the state of the art on these techniques and proposing a taxonomy based on how the models are constructed to take the order into account. Furthermore, a thorough experimental study is proposed to check if the use of the order information improves the performance of the models obtained, considering some of the approaches within the taxonomy. The results confirm that ordering information benefits ordinal models, improving their accuracy and the closeness of the predictions to actual targets in the ordinal scale.
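
    The distinction between accuracy and closeness on the ordinal scale can be made concrete with a short sketch. It assumes labels are encoded as integer ranks (0, 1, 2, ...) and uses mean absolute error over ranks as the closeness measure; the specific metrics used in the paper's experiments are not stated in this abstract, and the function names are hypothetical.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of exactly correct labels (ignores label order)."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def ordinal_mae(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean absolute difference between true and predicted ranks."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    y_true = [0, 1, 2, 3]
    far_miss = [0, 1, 2, 0]   # one error, three ranks away
    near_miss = [0, 1, 2, 2]  # one error, one rank away
    for name, pred in (("far", far_miss), ("near", near_miss)):
        print(name, accuracy(y_true, pred), ordinal_mae(y_true, pred))
```

    Both predictors make a single mistake and so have the same accuracy, but the rank-based error separates the near miss from the far miss, which is the kind of information a nominal classifier's evaluation discards.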

    Analysing causal structures with entropy

    A central question for causal inference is to decide whether a set of correlations fits a given causal structure. In general, this decision problem is computationally infeasible and hence several approaches have emerged that look for certificates of compatibility. Here we review several such approaches based on entropy. We bring together the key aspects of these entropic techniques with unified terminology, filling several gaps and establishing new connections regarding their relation, all illustrated with examples. We consider cases where unobserved causes are classical, quantum and post-quantum and discuss what entropic analyses tell us about the difference. This has applications to quantum cryptography, where it can be crucial to eliminate the possibility of classical causes. We discuss the achievements and limitations of the entropic approach in comparison to other techniques and point out the main open problems.
    Comment: 19 (+3) pages, 5 (+1) figures. A few minor updates and corrections. There is a small error in the published version of this manuscript: the claim in the last sentence of Section 2(a)(ii) should be restricted to four variables. This is correct in the arXiv version.