Random Set Partitions: Asymptotics of Subset Counts
We study the asymptotics of subset counts for the uniformly random partition of the set [n]. It is known that typically most of the subsets of the random partition are of size r, with r e^r = n. Confirming a conjecture formulated by Arratia and Tavaré, we prove that the counts of other subsets are close, in terms of the total variation distance, to the corresponding segments of a sequence {Z_j} of independent, Poisson(r^j/j!) distributed random variables. DeLaurentis and Pittel had proved that the finite-dimensional distributions of a continuous time process that counts the typical size subsets converge to those of the Brownian Bridge process. Combining the two results allows us to prove a functional limit theorem which covers a broad class of the integral functionals. Among illustrations, we prove that the total number of refinements of a random partition is asymptotically lognormal.
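As a rough numerical illustration of the quantities named in this abstract, the following sketch solves r e^r = n for r and tabulates the Poisson means r^j/j!. The function names `solve_r` and `poisson_means` are hypothetical and do not come from the paper, and Newton's method is just one convenient way to find the root.

```python
import math


def solve_r(n, tol=1e-12):
    """Return the positive root r of r * exp(r) = n using Newton's method."""
    r = math.log(n + 1.0)  # starting guess that lies to the right of the root
    for _ in range(100):
        f = r * math.exp(r) - n
        step = f / (math.exp(r) * (r + 1.0))  # f divided by its derivative
        r -= step
        if abs(step) < tol:
            break
    return r


def poisson_means(r, j_max):
    """Return {j: r**j / j!} for j = 1..j_max, the means of the Z_j."""
    return {j: r ** j / math.factorial(j) for j in range(1, j_max + 1)}


if __name__ == "__main__":
    n = 1000
    r = solve_r(n)
    means = poisson_means(r, 12)
    print(f"n = {n}, r = {r:.6f}")
    for j, mean in means.items():
        print(f"j = {j:2d}  mean = {mean:.4f}")
```

Summed over all j ≥ 1, the means r^j/j! add up to e^r − 1, which equals n/r − 1 when r e^r = n.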
Dividing population genetic distance data with the software Partitioning Optimization with Restricted Growth Strings (PORGS): an application for Chinook salmon (Oncorhynchus tshawytscha), Vancouver Island, British Columbia
A new method of finding the optimal group membership and number of groupings to partition population genetic distance data is presented. The software program Partitioning Optimization with Restricted Growth Strings (PORGS) visits all possible set partitions and deems
acceptable partitions to be those that reduce mean intracluster distance. The optimal number of groups is determined with the gap statistic which compares PORGS results with a reference distribution. The PORGS method was validated by a simulated data set with a known distribution.
For efficiency, where values of n were larger, restricted growth strings (RGS) were used to bipartition populations during a nested search (bi-PORGS). Bi-PORGS was applied to a set of genetic data from 18 Chinook salmon (Oncorhynchus
tshawytscha) populations from the west coast of Vancouver Island. The optimal grouping of these populations
corresponded to four geographic locations: 1) Quatsino Sound, 2) Nootka Sound, 3) Clayoquot + Barkley sounds,
and 4) southwest Vancouver Island. However, assignment of populations to groups did not strictly reflect the geographical divisions; fish of Barkley Sound origin that had strayed into the Gold River and close genetic similarity
between transferred and donor populations meant groupings crossed geographic boundaries. Overall, stock structure determined by this partitioning method was similar to that
determined by the unweighted pair-group method with arithmetic averages (UPGMA), an agglomerative clustering algorithm.
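The idea of visiting every set partition through restricted growth strings can be sketched as follows. This is a minimal illustration, not the PORGS software: the function names are hypothetical, the distance matrix is invented for the example, and the gap statistic used to choose the number of groups is left out, so the number of groups k is fixed by hand here.

```python
from itertools import combinations


def restricted_growth_strings(n):
    """Yield every restricted growth string of length n.

    A string a satisfies a[0] = 0 and a[i] <= 1 + max(a[:i]); each string
    encodes exactly one set partition (a[i] is the block label of item i).
    """
    def extend(prefix, current_max):
        if len(prefix) == n:
            yield tuple(prefix)
            return
        for label in range(current_max + 2):
            prefix.append(label)
            yield from extend(prefix, max(current_max, label))
            prefix.pop()

    if n == 0:
        yield ()
        return
    yield from extend([0], 0)


def mean_intracluster_distance(rgs, dist):
    """Average distance over all pairs of items that share a block."""
    pairs = [(i, j) for i, j in combinations(range(len(rgs)), 2)
             if rgs[i] == rgs[j]]
    if not pairs:  # all blocks are singletons
        return 0.0
    return sum(dist[i][j] for i, j in pairs) / len(pairs)


def best_partition(dist, k):
    """Exhaustively pick the k-block partition with the smallest mean distance."""
    n = len(dist)
    candidates = (s for s in restricted_growth_strings(n) if max(s) + 1 == k)
    return min(candidates, key=lambda s: mean_intracluster_distance(s, dist))


if __name__ == "__main__":
    # Toy symmetric distance matrix (made up): items 0,1 and 2,3 are close.
    dist = [[0, 1, 9, 9],
            [1, 0, 9, 9],
            [9, 9, 0, 1],
            [9, 9, 1, 0]]
    print(best_partition(dist, 2))
```

The number of strings grows as the Bell numbers (15 for n = 4, 52 for n = 5), so an exhaustive visit is only practical for small n, which is consistent with the abstract's move to a nested bipartitioning search for larger n.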
Central Limit Theorems for some Set Partition Statistics
We prove the conjectured limiting normality for the number of crossings of a
uniformly chosen set partition of [n] = {1,2,...,n}. The arguments use a novel
stochastic representation and are also used to prove central limit theorems for
the dimension index and the number of levels.
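For readers unfamiliar with the crossing statistic, here is a small sketch that counts crossings of a given set partition. It assumes the usual arc-diagram convention, which the abstract does not spell out: each block contributes arcs between its consecutive elements, and two arcs (i1, j1) and (i2, j2) cross when i1 < i2 < j1 < j2. The function names are hypothetical.

```python
def arcs(blocks):
    """Return the arcs joining consecutive elements within each block."""
    result = []
    for block in blocks:
        ordered = sorted(block)
        result.extend(zip(ordered, ordered[1:]))
    return result


def crossings(blocks):
    """Count pairs of arcs (i1, j1), (i2, j2) with i1 < i2 < j1 < j2."""
    ordered_arcs = sorted(arcs(blocks))  # sorted by left endpoint
    count = 0
    for idx, (i1, j1) in enumerate(ordered_arcs):
        for i2, j2 in ordered_arcs[idx + 1:]:
            if i1 < i2 < j1 < j2:
                count += 1
    return count


if __name__ == "__main__":
    print(crossings([{1, 3}, {2, 4}]))   # the two arcs interleave
    print(crossings([{1, 4}, {2, 3}]))   # nested arcs do not cross
```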
The first order convergence law fails for random perfect graphs
We consider first order expressible properties of random perfect graphs. That
is, we pick a graph G_n uniformly at random from all (labelled) perfect
graphs on n vertices and consider the probability that it satisfies some
graph property that can be expressed in the first order language of graphs. We
show that there exists such a first order expressible property for which the
probability that G_n satisfies it does not converge as n → ∞.