Anchorage Community Survey 2007 Survey Sampling Design: Power and Sample Size
This working paper documents the power analysis, literature review, and precision considerations contemplated in designing the Anchorage Community Survey's (ACS) 2007 sampling design. The ACS will obtain at least 30 completed surveys from individuals in each of the 55 census tracts that make up the Anchorage Municipality, allowing us to discern a fairly small effect size of 0.30 with our smallest anticipated intraclass correlation and a moderate effect size of 0.40 with our largest anticipated intraclass correlation, both at 0.80 power level. This cluster sample size and number of clusters should yield sufficient precision to allow good estimation of variance components and standard errors, acceptable reliability estimates, and reasonable aggregated measures of constructed neighborhood variables from individual survey item responses.
Abstract /
Introduction /
Number of clusters (J) = 55 /
Cluster Size (n) = 30 /
Intraclass correlation (ρ) = .10 to .20 /
Effect size (δ) = .30 or greater /
Power Graphs /
Support from the Literature /
A Note on Precision /
Reference
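The interplay of cluster size and intraclass correlation described above can be sketched with the standard Kish design-effect formula. This is a hedged illustration only; the paper's actual power computations are not reproduced here, and the function names are mine:

```python
# Sketch of the design-effect reasoning behind a cluster sample like the ACS.
# With n respondents per cluster and intraclass correlation rho, variance of
# cluster-sampled estimates is inflated by deff = 1 + (n - 1) * rho.

def design_effect(n: int, rho: float) -> float:
    """Kish design effect for clusters of size n with intraclass corr. rho."""
    return 1 + (n - 1) * rho

def effective_sample_size(J: int, n: int, rho: float) -> float:
    """Effective number of independent observations for J clusters of size n."""
    return (J * n) / design_effect(n, rho)

J, n = 55, 30  # number of clusters and cluster size from the ACS design
for rho in (0.10, 0.20):
    print(f"rho={rho}: deff={design_effect(n, rho):.2f}, "
          f"effective n={effective_sample_size(J, n, rho):.0f}")
```

With ρ = .10 the 1,650 completed surveys behave like roughly 423 independent observations; with ρ = .20, roughly 243 — which is why the detectable effect size grows from 0.30 to 0.40 as the anticipated intraclass correlation rises.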
Large Low-Diameter Graphs are Good Expanders
We revisit the classical question of the relationship between the diameter of a graph and its expansion properties. One direction is well understood: expander graphs exhibit essentially the lowest possible diameter. We focus on the reverse direction, showing that "sufficiently large" graphs of fixed diameter and degree must be "good" expanders. We prove this statement for various definitions of "sufficiently large" (multiplicative/additive factor from the largest possible size), for different forms of expansion (edge, vertex, and spectral expansion), and for both directed and undirected graphs. A recurring theme is that the lower the diameter of the graph and (more importantly) the larger its size, the better the expansion guarantees. Aside from inherent theoretical interest, our motivation stems from the domain of network design. Both low-diameter networks and expanders are prominent approaches to designing high-performance networks in parallel computing, HPC, datacenter networking, and beyond. Our results establish that these two approaches are, in fact, inextricably intertwined. We leave the reader with many intriguing questions for future research.
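The diameter/expansion contrast can be illustrated numerically. The following sketch compares the spectral gap of two extreme d-regular graphs; the choice of graphs and of the adjacency spectral gap as the expansion measure are mine, not the paper's:

```python
import numpy as np

def spectral_gap(A: np.ndarray) -> float:
    """Spectral gap of a regular graph: largest adjacency eigenvalue minus
    the second largest. A larger gap indicates better (spectral) expansion."""
    eig = np.sort(np.linalg.eigvalsh(A))[::-1]
    return eig[0] - eig[1]

def cycle(n: int) -> np.ndarray:
    """Adjacency matrix of the n-cycle: high diameter (~n/2), poor expander."""
    A = np.zeros((n, n))
    for i in range(n):
        A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1
    return A

def complete(n: int) -> np.ndarray:
    """Adjacency matrix of K_n: diameter 1, best possible expansion."""
    return np.ones((n, n)) - np.eye(n)

print(spectral_gap(cycle(20)))     # vanishing gap: 2 - 2*cos(2*pi/20)
print(spectral_gap(complete(20)))  # maximal gap: (n-1) - (-1) = 20
```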
Pliability and approximating max-CSPs
We identify a sufficient condition, treewidth-pliability, that gives a polynomial-time algorithm for an arbitrarily good approximation of the optimal value in a large class of Max-2-CSPs parameterised by the class of allowed constraint graphs (with arbitrary constraints on an unbounded alphabet). Our result applies more generally to the maximum homomorphism problem between two rational-valued structures.

The condition unifies the two main approaches for designing a polynomial-time approximation scheme. One is Baker's layering technique, which applies to sparse graphs such as planar or excluded-minor graphs. The other is based on Szemerédi's regularity lemma and applies to dense graphs. We extend the applicability of both techniques to new classes of Max-CSPs. On the other hand, we prove that the condition cannot be used to find solutions (as opposed to approximating the optimal value) in general.

Treewidth-pliability turns out to be a robust notion that can be defined in several equivalent ways, including characterisations via size, treedepth, or the Hadwiger number. We show connections to the notions of fractional-treewidth-fragility from structural graph theory, hyperfiniteness from the area of property testing, and regularity partitions from the theory of dense graph limits. These may be of independent interest. In particular, we show that a monotone class of graphs is hyperfinite if and only if it is fractionally-treewidth-fragile and has bounded degree.
Inductive queries for a drug designing robot scientist
It is increasingly clear that machine learning algorithms need to be integrated in an iterative scientific discovery loop, in which data is queried repeatedly by means of inductive queries and where the computer provides guidance to the experiments that are being performed. In this chapter, we summarise several key challenges in achieving this integration of machine learning and data mining algorithms in methods for the discovery of Quantitative Structure Activity Relationships (QSARs). We introduce the concept of a robot scientist, in which all steps of the discovery process are automated; we discuss the representation of molecular data such that knowledge discovery tools can analyse it; and we discuss the adaptation of machine learning and data mining algorithms to guide QSAR experiments.