
    Finding the Graph of Epidemic Cascades

    We consider the problem of finding the graph on which an epidemic cascade spreads, given only the times when each node gets infected. While this is a problem of importance in several contexts -- offline and online social networks, e-commerce, epidemiology, vulnerabilities in infrastructure networks -- there has been very little work, analytical or empirical, on finding the graph. Clearly, it is impossible to do so from just one cascade; our interest is in learning the graph from a small number of cascades. For the classic and popular "independent cascade" SIR epidemics, we analytically establish the number of cascades required by both the global maximum-likelihood (ML) estimator, and a natural greedy algorithm. Both results are based on a key observation: the global graph learning problem decouples into $n$ local problems -- one for each node. For a node of degree $d$, we show that its neighborhood can be reliably found once it has been infected $O(d^2 \log n)$ times (for ML on general graphs) or $O(d \log n)$ times (for greedy on trees). We also provide a corresponding information-theoretic lower bound of $\Omega(d \log n)$; thus our bounds are essentially tight. Furthermore, if we are given side-information in the form of a super-graph of the actual graph (as is often the case), then the number of cascade samples required -- in all cases -- becomes independent of the network size $n$. Finally, we show that for a very general SIR epidemic cascade model, the Markov graph of infection times is obtained via the moralization of the network graph. Comment: To appear in Proc. ACM SIGMETRICS/Performance 201
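To make the input to this problem concrete, here is a minimal sketch of a discrete-time independent cascade, which produces the per-node infection times a graph learner would observe. It is an illustration under stated assumptions, not the paper's code: the function name `simulate_cascade`, the adjacency-dict representation, and the single shared infection probability `p` are all hypothetical choices made here.

```python
import random

def simulate_cascade(neighbors, p, seeds, rng):
    """Simulate one discrete-time independent cascade.

    neighbors: dict mapping each node to an iterable of its neighbors (assumed format).
    p: infection probability, assumed identical on every edge for simplicity.
    seeds: nodes infected at time 0.
    rng: a random.Random instance.
    Returns a dict {node: infection time}; nodes never infected are absent.
    """
    times = {s: 0 for s in seeds}
    frontier = list(seeds)
    t = 0
    while frontier:
        t += 1
        next_frontier = []
        for u in frontier:
            # Each newly infected node gets exactly one chance per neighbor,
            # after which it is removed (the "R" in SIR).
            for v in neighbors.get(u, ()):
                if v not in times and rng.random() < p:
                    times[v] = t
                    next_frontier.append(v)
        frontier = next_frontier
    return times

# A path graph 0 - 1 - 2 - 3 with certain transmission.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
observed = simulate_cascade(path, 1.0, [0], random.Random(0))
```

Only the returned times, not the edges that carried the infection, would be available to the estimator; this is why a single cascade cannot identify the graph.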

    Edit Distance: Sketching, Streaming and Document Exchange

    We show that in the document exchange problem, where Alice holds $x \in \{0,1\}^n$ and Bob holds $y \in \{0,1\}^n$, Alice can send Bob a message of size $O(K(\log^2 K + \log n))$ bits such that Bob can recover $x$ using the message and his input $y$ if the edit distance between $x$ and $y$ is no more than $K$, and output "error" otherwise. Both the encoding and decoding can be done in time $\tilde{O}(n + \mathsf{poly}(K))$. This result significantly improves the previous communication bounds under polynomial encoding/decoding time. We also show that in the referee model, where Alice and Bob hold $x$ and $y$ respectively, they can compute sketches of $x$ and $y$ of sizes $\mathsf{poly}(K \log n)$ bits (the encoding), and send to the referee, who can then compute the edit distance between $x$ and $y$ together with all the edit operations if the edit distance is no more than $K$, and output "error" otherwise (the decoding). To the best of our knowledge, this is the first result for sketching edit distance using $\mathsf{poly}(K \log n)$ bits. Moreover, the encoding phase of our sketching algorithm can be performed by scanning the input string in one pass. Thus our sketching algorithm also implies the first streaming algorithm for computing edit distance and all the edits exactly using $\mathsf{poly}(K \log n)$ bits of space. Comment: Full version of an article to be presented at the 57th Annual IEEE Symposium on Foundations of Computer Science (FOCS 2016
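The "answer if the edit distance is at most $K$, otherwise report an error" contract can be illustrated with a plain banded dynamic program. This is a swapped-in textbook technique that needs both strings in memory, not the sketching or document exchange protocol from the abstract; the name `bounded_edit_distance` and the use of `None` to stand for "error" are assumptions made for this sketch.

```python
def bounded_edit_distance(x, y, k):
    """Return the edit distance between x and y if it is <= k, else None.

    Only cells within k of the diagonal are filled, so the running time is
    O(len(x) * k) rather than O(len(x) * len(y)).
    """
    n, m = len(x), len(y)
    if abs(n - m) > k:
        return None  # length gap alone already exceeds the threshold
    inf = k + 1  # any value above k is treated as "too far"
    prev = [j if j <= k else inf for j in range(m + 1)]
    for i in range(1, n + 1):
        cur = [inf] * (m + 1)
        lo, hi = max(0, i - k), min(m, i + k)
        if lo == 0:
            cur[0] = i  # delete the first i characters of x (i <= k here)
        for j in range(max(1, lo), hi + 1):
            substitute = prev[j - 1] + (x[i - 1] != y[j - 1])
            delete = prev[j] + 1
            insert = cur[j - 1] + 1
            cur[j] = min(substitute, delete, insert, inf)
        prev = cur
    return prev[m] if prev[m] <= k else None

within = bounded_edit_distance("kitten", "sitting", 3)
too_far = bounded_edit_distance("kitten", "sitting", 2)
```

The band is what makes the threshold useful: once the distance is known to be at most $K$, only a narrow strip of the alignment table can matter.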

    Analysis of Business Process Model Similarity Checking Using the Greedy Graph Matching Technique

    Becoming a large company requires the right strategy, such as a business merger, which produces a large number of business processes, so there are often identical business processes that serve the same goal. A system that can check the similarity of business processes is therefore needed. Business process similarity checking is a system that checks the similarity of two business processes and produces a similarity value. The greedy graph matching method is used to check similarity. This method looks for the node pairs with the most optimal matching score. In each iteration, greedy selects the node pair with the highest syntactic similarity value, and that node pair is then removed. In the process, syntactic similarity, node insertions/deletions (sn), edge insertions/deletions (se), and average node substitutions (sbv) are calculated. The result of this study is a similarity value of 0.51837 for JPO and TDP, and likewise 0.51837 for TDP and JPO. These results show that greedy graph matching produces the same similarity value regardless of which of the two business processes is treated as business process 1 and which as business process 2. However, greedy graph matching is not effective for computing graph edit distance similarity, because in each iteration greedy only looks at the highest syntactic similarity value on the same node, and the selected node pair is then removed. Yet it is possible that in a later iteration the removed node would have had a pairing with a higher syntactic similarity value. Keywords: business process similarity checking, greedy graph matching, syntactic similarity, node insertions/deletions, edge insertions/deletions, node substitution
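The greedy pairing loop described in this abstract can be sketched as follows. This is only an illustration under assumptions: the abstract does not define its syntactic similarity measure, so Python's `difflib.SequenceMatcher` ratio stands in for it, and the names `label_similarity` and `greedy_match`, the `threshold` cutoff, and the sample labels are all hypothetical. The sn, se, and sbv terms of the final score are not computed here.

```python
from difflib import SequenceMatcher

def label_similarity(a, b):
    """Stand-in syntactic similarity in [0, 1] (assumed measure, not the paper's)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def greedy_match(labels1, labels2, threshold=0.5):
    """Greedily pair node labels from two process models.

    Each iteration takes the still-unmatched pair with the highest similarity,
    records it, and removes both nodes from further consideration.
    """
    open1, open2 = set(labels1), set(labels2)
    mapping = []
    while open1 and open2:
        score, a, b = max(
            (label_similarity(a, b), a, b) for a in open1 for b in open2
        )
        if score < threshold:
            break  # the remaining pairs are too dissimilar to count as matches
        mapping.append((a, b, score))
        open1.remove(a)
        open2.remove(b)
    return mapping

model1 = ["Receive order", "Check stock", "Ship goods"]
model2 = ["Ship goods", "Receive order", "Send invoice"]
pairs = {(a, b) for a, b, _ in greedy_match(model1, model2)}
```

Because a chosen pair is never revisited, an early locally best choice can block a better overall assignment, which is the weakness the abstract points out.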