72 research outputs found
Combinatorial approach to Modularity
Communities are clusters of nodes with a higher than average density of
internal connections. Their detection is of great relevance to better
understand the structure and hierarchies present in a network. Modularity has
become a standard tool in the area of community detection, providing at the
same time a way to evaluate partitions and, by maximizing it, a method to find
communities. In this work, we study the modularity from a combinatorial point
of view. Our analysis (like the modularity definition itself) relies on the
configurational model, a technique that, given a graph, produces a series of
randomized copies that keep the degree sequence invariant. We develop an approach
that enumerates the null model partitions and can be used to calculate the
probability distribution function of the modularity. Our theory allows for a
deep inquiry of several interesting features characterizing modularity such as
its resolution limit and the statistics of the partitions that maximize it.
Additionally, the study of the probability of extremes of the modularity in the
random graph partitions opens the way for a definition of the statistical
significance of network partitions.
Comment: 8 pages, 4 figures
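As a rough illustration of the quantities this abstract refers to, the sketch below computes the standard Newman-Girvan modularity of a partition and compares it with values obtained on degree-preserving randomizations of the same graph. It is not the enumeration method described in the abstract: Monte Carlo stub shuffling is swapped in for the combinatorial calculation, and the function names (`modularity`, `rewire`) and the toy graph are assumptions made for the example.

```python
import random
from collections import defaultdict


def modularity(edges, community):
    """Newman-Girvan modularity Q = sum_c [ L_c / m - (d_c / 2m)^2 ].

    edges: list of (u, v) pairs of an undirected graph (self-loops allowed).
    community: dict mapping each node to its community label.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")
    internal = defaultdict(int)    # L_c: edges with both ends in community c
    degree_sum = defaultdict(int)  # d_c: total degree of community c
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


def rewire(edges, rng):
    """One degree-preserving randomization by stub matching.

    Assumption: self-loops and multi-edges are allowed in the null model.
    """
    stubs = [node for edge in edges for node in edge]
    rng.shuffle(stubs)
    return list(zip(stubs[0::2], stubs[1::2]))


# Toy graph (hypothetical): two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
partition = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}

rng = random.Random(0)
observed = modularity(edges, partition)
null = [modularity(rewire(edges, rng), partition) for _ in range(1000)]
# Fraction of randomized graphs on which the same partition scores at least
# as high; a crude empirical stand-in for a significance estimate.
tail = sum(q >= observed for q in null) / len(null)
print(f"Q = {observed:.4f}, empirical tail fraction = {tail:.3f}")
```

Sampling keeps the example short but only approximates the tail of the distribution; an exact enumeration, as the abstract describes, would be needed to study the extremes reliably.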
Determinants of response to a parent questionnaire about development and behaviour in 3 year olds: European multicentre study of congenital toxoplasmosis.
Background:
We aimed to determine how response to a parent-completed postal questionnaire measuring development, behaviour, impairment, and parental concerns and anxiety varies across European centres.
Methods:
Prospective cohort study of 3-year-old children, with and without congenital toxoplasmosis, who were identified by prenatal or neonatal screening for toxoplasmosis in 11 centres in 7 countries. Parents were mailed a questionnaire that comprised all or part of existing validated tools. We determined the effect of characteristics of the centre and child on response, age at questionnaire completion, and response to child drawing tasks.
Results:
The questionnaire took 21 minutes to complete on average. 67% (714/1058) of parents responded. Few parents (60/1058) refused to participate. The strongest determinants of response were the score for organisational attributes of the study centre (such as direct involvement in follow-up and access to an address register) and infection with congenital toxoplasmosis. Age at completion was associated with study centre, presence of neurological abnormalities in early infancy, and duration of prenatal treatment. Completion rates for individual questions exceeded 92%, except for child-completed drawings of a man (70%), which were completed more often by girls, by older children, and in certain centres.
Conclusion:
Differences in response across European centres were predominantly related to the organisation of follow-up and access to correct addresses. The questionnaire was acceptable in all six countries and offers a low-cost tool for assessing development, behaviour, and parental concerns and anxiety in multinational studies.
Testing for an Unusual Distribution of Rare Variants
Technological advances make it possible to use high-throughput sequencing as a primary discovery tool in medical genetics, specifically for assaying rare variation. Still, this approach faces the analytic challenge that the influence of very rare variants can only be evaluated effectively as a group. A further complication is that any given rare variant could have no effect, could increase risk, or could be protective. We propose here the C-alpha test statistic as a novel approach for testing for the presence of this mixture of effects across a set of rare variants. Unlike existing burden tests, C-alpha, by testing the variance rather than the mean, maintains consistent power when the target set contains both risk and protective variants. Through simulations and analysis of case/control data, we demonstrate good power relative to existing methods that assess the burden of rare variants in individuals.
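To make the "variance rather than the mean" idea concrete, here is a minimal sketch of a C-alpha-style statistic. It assumes the commonly cited form T = Σ_i [(y_i - n_i p0)^2 - n_i p0 (1 - p0)], where y_i is the number of copies of variant i seen in cases, n_i is its total count, and p0 is the proportion of cases in the sample; T is standardized by its binomial null variance and compared with a normal distribution. The function name `c_alpha`, the input layout, and the omission of any special handling of singleton variants are assumptions of this example, not details taken from the abstract.

```python
import math


def c_alpha(case_counts, total_counts, p0):
    """Sketch of a C-alpha-style overdispersion test for a set of rare variants.

    case_counts[i]:  copies of variant i observed in cases (y_i)
    total_counts[i]: copies of variant i observed overall (n_i)
    p0:              expected fraction of copies in cases under the null
    Returns (T, Z, one-sided p-value from a normal approximation).
    """
    t = 0.0
    var = 0.0
    for y, n in zip(case_counts, total_counts):
        excess = n * p0 * (1 - p0)  # binomial variance under the null
        t += (y - n * p0) ** 2 - excess
        # Null variance of this variant's term, summed over Binomial(n, p0).
        for u in range(n + 1):
            pmf = math.comb(n, u) * p0 ** u * (1 - p0) ** (n - u)
            var += ((u - n * p0) ** 2 - excess) ** 2 * pmf
    if var <= 0:
        raise ValueError("null variance is zero; the statistic is undefined")
    z = t / math.sqrt(var)
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # upper tail of N(0, 1)
    return t, z, p_value


# Hypothetical data: two variants seen only in cases, two only in controls.
t, z, p = c_alpha([4, 0, 4, 0], [4, 4, 4, 4], p0=0.5)
print(f"T = {t:.1f}, Z = {z:.2f}, p = {p:.2e}")
```

In this toy input the cases carry 8 of the 16 copies, exactly the expected share, so a statistic based on the overall case burden would see no signal, while the variance-based statistic is large because every variant is skewed toward one group.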
The log-likelihood ratio for sparse multinomial mixtures
The log likelihood ratio is expanded for testing a sequence of multinomial null hypotheses against a sequence of close multinomial mixture alternative hypotheses. As the number of categories grows without limit, the sample size increases and the variances of the mixing distributions tend to zero. The limiting form of the log likelihood ratio is functionally different from previously studied goodness of fit statistics. The statistic derived here exhibits moderate asymptotic power when Pearson's chi-square is biased.
Likelihood ratio tests for central mixtures
The log likelihood ratio is expanded for testing a simple null hypothesis against a sequence of alternative hypotheses in which the observations are sampled from a mixture of distributions located near the null hypothesis. The test criterion depends on the mixing distribution only through its variance and is approximately normally distributed under both hypotheses.
Keywords: mixture of distributions; tests of homogeneity; likelihood ratio tests; close alternative hypothesis