Search CORE

1,104 research outputs found

Catching the head, tail, and everything in between: a streaming algorithm for the degree distribution

Author: McGregor Andrew
Seshadhri C.
Simpson Olivia
Publication venue
Publication date: 25/11/2015
Field of study

The degree distribution is one of the most fundamental graph properties of interest for real-world graphs. It has been widely observed in numerous domains that graphs typically have a tailed or scale-free degree distribution. While the average degree is usually quite small, the variance is quite high and there are vertices with degrees at all scales. We focus on the problem of approximating the degree distribution of a large streaming graph, with small storage. We design an algorithm headtail, whose main novelty is a new estimator of infrequent degrees using truncated geometric random variables. We give a mathematical analysis of headtail and show that it has excellent behavior in practice. We can process streams will millions of edges with storage less than 1% and get extremely accurate approximations for all scales in the degree distribution. We also introduce a new notion of Relative Hausdorff distance between tailed histograms. Existing notions of distances between distributions are not suitable, since they ignore infrequent degrees in the tail. The Relative Hausdorff distance measures deviations at all scales, and is a more suitable distance for comparing degree distributions. By tracking this new measure, we are able to give strong empirical evidence of the convergence of headtail

arXiv.org e-Print Archive

Crossref

Finding Matchings in the Streaming Model

Author: McGregor Andrew
Publication venue: ScholarlyCommons
Publication date: 01/01/2004
Field of study

This report presents algorithms for finding large matchings in the streaming model. In this model, applicable when dealing with massive graphs, edges are streamed-in in some arbitrary order rather than residing in randomly accessible memory. For ε \u3e 0, we achieve a 1/(1+ε) approximation for maximum cardinality matching and a 1/(2+ε) approximation to maximum weighted matching. Both algorithms use a constant number of passes

ScholarlyCommons@Penn

Trace Reconstruction: Generalized and Parameterized

Author: Krishnamurthy Akshay
Mazumdar Arya
McGregor Andrew
Pal Soumyabrata
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 27th Annual European Symposium on Algorithms (ESA 2019)
Publication date: 01/01/2019
Field of study

In the beautifully simple-to-state problem of trace reconstruction, the goal is to reconstruct an unknown binary string x given random "traces" of x where each trace is generated by deleting each coordinate of x independently with probability p<1. The problem is well studied both when the unknown string is arbitrary and when it is chosen uniformly at random. For both settings, there is still an exponential gap between upper and lower sample complexity bounds and our understanding of the problem is still surprisingly limited. In this paper, we consider natural parameterizations and generalizations of this problem in an effort to attain a deeper and more comprehensive understanding. Perhaps our most surprising results are: 1) We prove that exp(O(n^(1/4) sqrt{log n})) traces suffice for reconstructing arbitrary matrices. In the matrix version of the problem, each row and column of an unknown sqrt{n} x sqrt{n} matrix is deleted independently with probability p. Our results contrasts with the best known results for sequence reconstruction where the best known upper bound is exp(O(n^(1/3))). 2) An optimal result for random matrix reconstruction: we show that Theta(log n) traces are necessary and sufficient. This is in contrast to the problem for random sequences where there is a super-logarithmic lower bound and the best known upper bound is exp({O}(log^(1/3) n)). 3) We show that exp(O(k^(1/3) log^(2/3) n)) traces suffice to reconstruct k-sparse strings, providing an improvement over the best known sequence reconstruction results when k = o(n/log^2 n). 4) We show that poly(n) traces suffice if x is k-sparse and we additionally have a "separation" promise, specifically that the indices of 1\u27s in x all differ by Omega(k log n)

Dagstuhl Research Online Publication Server

Information Cost Tradeoffs for Augmented Index and Streaming Language Recognition

Author: Chakrabarti Amit
Cormode Graham
Kondapally Ranganath
McGregor Andrew
Publication venue
Publication date: 01/01/2010
Field of study

This paper makes three main contributions to the theory of communication complexity and stream computation. First, we present new bounds on the information complexity of AUGMENTED-INDEX. In contrast to analogous results for INDEX by Jain, Radhakrishnan and Sen [J. ACM, 2009], we have to overcome the significant technical challenge that protocols for AUGMENTED-INDEX may violate the "rectangle property" due to the inherent input sharing. Second, we use these bounds to resolve an open problem of Magniez, Mathieu and Nayak [STOC, 2010] that asked about the multi-pass complexity of recognizing Dyck languages. This results in a natural separation between the standard multi-pass model and the multi-pass model that permits reverse passes. Third, we present the first passive memory checkers that verify the interaction transcripts of priority queues, stacks, and double-ended queues. We obtain tight upper and lower bounds for these problems, thereby addressing an important sub-class of the memory checking framework of Blum et al. [Algorithmica, 1994]

arXiv.org e-Print Archive

CiteSeerX

ScholarWorks@UMass Amherst

Dartmouth Digital Commons (Dartmouth College)

Design of a processor to support the teaching of computer systems

Author: Armstrong Dean Andrew
McGregor Anthony James
Pearson Murray W.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2002
Field of study

Teaching computer systems, including computer architecture, assembly language programming and operating system implementation, is a challenging occupation. At the University of Waikato this is made doubly true because we require all computer science and information systems students study this material at second year. The challenges of teaching difficult material to a wide range of students have driven us to find ways of making the material more accessible. The corner stone of our strategy for delivering this material is the design and implementation of a custom CPU that meets the needs of teaching. This paper describes our motivation and these needs. We present the CPU and board design and describe the implementation of the CPU in an FPGA. The paper also includes some reflections on the use of a real CPU rather than a simulation environment. We conclude with a discussion of how the CPU can be used for advanced classes in computer architecture and a description of the current status of the project

Research Commons@Waikato

The Shock Production And Quantitative Spectroscopy Of The Magnesium-hydride Doublet-a-pi - Doublet-x-sigma-plus Delta(v)=0 Sequence

Author: Mcgregor Andrew Thomas
Publication venue: Scholarship@Western
Publication date: 01/01/1975
Field of study

Scholarship@Western