81,741 research outputs found

    Statistical significance of variables driving systematic variation

    There are a number of well-established methods such as principal components analysis (PCA) for automatically capturing systematic variation due to latent variables in large-scale genomic data. PCA and related methods may directly provide a quantitative characterization of a complex biological variable that is otherwise difficult to precisely define or model. An unsolved problem in this context is how to systematically identify the genomic variables that are drivers of systematic variation captured by PCA. Principal components (and other estimates of systematic variation) are directly constructed from the genomic variables themselves, making measures of statistical significance artificially inflated when using conventional methods due to over-fitting. We introduce a new approach called the jackstraw that allows one to accurately identify genomic variables that are statistically significantly associated with any subset or linear combination of principal components (PCs). The proposed method can greatly simplify complex significance testing problems encountered in genomics and can be utilized to identify the genomic variables significantly associated with latent variables. Using simulation, we demonstrate that our method attains accurate measures of statistical significance over a range of relevant scenarios. We consider yeast cell-cycle gene expression data, and show that the proposed method can be used to straightforwardly identify statistically significant genes that are cell-cycle regulated. We also analyze gene expression data from post-trauma patients, allowing the gene expression data to provide a molecularly-driven phenotype. We find a greater enrichment for inflammatory-related gene sets compared to using a clinically defined phenotype. The proposed method provides a useful bridge between large-scale quantifications of systematic variation and gene-level significance analyses.
    Comment: 35 pages, 1 table, 6 main figures, 7 supplementary figures

    Clustering of Local Group distances: publication bias or correlated measurements? I. The Large Magellanic Cloud

    The distance to the Large Magellanic Cloud (LMC) represents a key local rung of the extragalactic distance ladder. Yet, the galaxy's distance modulus has long been an issue of contention, in particular in view of claims that most newly determined distance moduli cluster tightly - and with a small spread - around the "canonical" distance modulus, (m-M)_0 = 18.50 mag. We compiled 233 separate LMC distance determinations published between 1990 and 2013. Our analysis of the individual distance moduli, as well as of their two-year means and standard deviations resulting from this largest data set of LMC distance moduli available to date, focuses specifically on Cepheid and RR Lyrae variable-star tracer populations, as well as on distance estimates based on features in the observational Hertzsprung-Russell diagram. We conclude that strong publication bias is unlikely to have been the main driver of the majority of published LMC distance moduli. However, for a given distance tracer, the body of publications leading to the tightly clustered distances is based on highly non-independent tracer samples and analysis methods, hence leading to significant correlations among the LMC distances reported in subsequent articles. Based on a careful, weighted combination, in a statistical sense, of the main stellar population tracers, we recommend that a slightly adjusted canonical distance modulus of (m-M)_0 = 18.49 ± 0.09 mag be used for all practical purposes that require a general distance scale without the need for accuracies of better than a few percent.
    Comment: 35 pages (AASTeX preprint format), 5 postscript figures; AJ, in press. For full database of LMC distance moduli, see http://astro-expat.info/Data/pubbias.htm
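    For readers unfamiliar with distance moduli, the recommended value can be turned into a physical distance with the standard relation d = 10^((mu + 5) / 5) parsecs, where mu = (m-M)_0. The sketch below is an illustration only and is not taken from the paper; the function names are hypothetical, and the error conversion assumes small, symmetric uncertainties (first-order propagation).

```python
import math


def distance_pc(mu):
    """Distance in parsecs for an extinction-corrected distance modulus mu = (m-M)_0."""
    return 10 ** ((mu + 5.0) / 5.0)


def fractional_distance_error(sigma_mu):
    """First-order fractional distance uncertainty for a modulus uncertainty sigma_mu (mag)."""
    return math.log(10) / 5.0 * sigma_mu


# Values quoted in the abstract above.
mu, sigma_mu = 18.49, 0.09

d_kpc = distance_pc(mu) / 1000.0
frac = fractional_distance_error(sigma_mu)

print(f"distance: {d_kpc:.1f} kpc")
print(f"fractional uncertainty: {100 * frac:.1f}%")
```

    Under these assumptions the modulus corresponds to a distance of roughly 50 kpc with an uncertainty of about 4%, which is consistent with the abstract's remark about accuracies of "a few percent".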

    Perceived Diversity of Complex Environmental Systems: Multidimensional Measurement and Synthetic Indicators

    The general attitude towards the sustainable management of environmental resources is evolving towards the implementation of ‘participatory’ (as opposed to the classical ‘command and control’) and, especially at local scale, ‘bottom up’ (as opposed to the classical ‘top down’) approaches. This progress is driving strong interest in the development and application of methodologies able to ‘discover’ and ‘measure’ how environmental systems tend to be perceived by the different stakeholders. Due to the ‘nature’ of the investigated systems, often too ‘complex’ to be treated through a classical deterministic approach, as is typical for ‘hard’ physical/mathematical sciences, any ‘measurement’ necessarily has to be multidimensional. In the present report an approach more typical of ‘soft’ social sciences is presented and applied to the analysis of the sustainable management of water resources in seven Southern and Eastern Mediterranean watersheds. The methodology is based on the development and analysis (explorative factor analysis, multidimensional scaling) of a questionnaire and is aimed at the ‘discovery’ and ‘measurement’ of a latent multidimensional ‘underlying structure’ (‘conceptual map’). It is the opinion of the authors that the identification of a set of ‘consistent’, ‘independent’, ‘bottom up’ and ‘shared’ synthetic indicators (aggregated indices) could be strongly facilitated by the interpretation of the dimensions of the emerging ‘underlying structure’.
    Keywords: Participative Approach, Cognitive Map, Factor Analysis, Indicators of Sustainability, Sustainable Water Management

    Women's Opportunities under Different Constellations of Family Policies in Western Countries: Inequality Tradeoffs Re-examined

    Women’s rising labor force participation since the 1960s was long seen as heralding decreasing gender inequalities. According to influential social science writings this view must now be revised; “women friendly” policies bringing women into the workforce are held to create major inequality tradeoffs between quantity and quality in women’s jobs. Unintentionally, such policies increase employer statistical discrimination and create glass ceilings impeding women’s access to influential positions and high wages. This paper re-examines the theoretical and empirical bases for analyzing family policy effects on gender inequalities. Including capabilities as well as earnings in definitions of gender inequality, we improve possibilities for causal analyses by mapping institutional constellations of separate dimensions of family policies in Western countries. Reflecting conflicting political forces as well as religion, contrary to accepted assumptions of uni-dimensionality, family policies are multi-dimensional, with main distinctions favoring traditional families, mothers’ employment, or market reliance. Using multilevel analyses and broad sets of outcome variables, we show that methodological mistakes largely invalidate earlier causal interpretations of major tradeoffs between quantity and quality in women’s labor force participation. Positive policy effects facilitate work-family reconciliation and combine women’s increased labor force participation with relatively high fertility. While major negative policy effects for women with tertiary education are difficult to find, family policies clearly differ in the extent to which they improve opportunities for women without university degrees.

    Preserving the impossible: conservation of soft-sediment hominin footprint sites and strategies for three-dimensional digital data capture.

    Human footprints provide some of the most publicly emotive and tangible evidence of our ancestors. To the scientific community they provide evidence of stature, presence, behaviour and, in the case of early hominins, potential evidence with respect to the evolution of gait. While footprint sites are rare in the geological record, their number has increased in recent years along with the analytical tools available for their study. Many of these sites are at risk from rapid erosion, including the Ileret footprints in northern Kenya, which are second only in age to those at Laetoli (Tanzania). Unlithified, soft-sediment footprint sites such as these pose a significant geoconservation challenge. In the first part of this paper conservation and preservation options are explored, leading to the conclusion that to 'record and digitally rescue' provides the only viable approach. Key to such strategies is the increasing availability of three-dimensional data capture via optical laser scanning and/or digital photogrammetry. Within the discipline there is a developing schism between those who favour one approach over the other, and geoconservationists and the scientific community require some form of objective appraisal of these alternatives. Consequently, in the second part of this paper we evaluate these alternative approaches and the role they can play in a 'record and digitally rescue' conservation strategy. Using modern footprint data, digital models created via optical laser scanning are compared to those generated by state-of-the-art photogrammetry. Both methods give comparable although subtly different results. This data is evaluated alongside a review of field deployment issues to provide guidance to the community with respect to the factors which need to be considered in digital conservation of human/hominin footprints.

    2D and 3D Dense-Fluid Shear Flows via Nonequilibrium Molecular Dynamics. Comparison of Time-and-Space-Averaged Tensor Temperature and Normal Stresses from Doll's, Sllod, and Boundary-Driven Shear Algorithms

    Homogeneous shear flows (with constant strain rate du/dy) are generated with the Doll's and Sllod algorithms and compared to corresponding inhomogeneous boundary-driven flows. We use one-, two-, and three-dimensional smooth-particle weight functions for computing instantaneous spatial averages. The nonlinear stress differences are small, but significant, in both two and three space dimensions. In homogeneous systems the sign and magnitude of the shear-plane stress difference, P(xx) - P(yy), depend on both the thermostat type and the chosen shear-flow algorithm. The Doll's and Sllod algorithms predict opposite signs for this stress difference, with the Sllod approach definitely wrong, but somewhat closer to the (boundary-driven) truth. Neither of the homogeneous shear algorithms predicts the correct ordering of the kinetic temperatures, T(xx) > T(zz) > T(yy).
    Comment: 34 pages with 12 figures, under consideration by Physical Review

    Civic Associations That Work: The Contributions of Leadership to Organizational Effectiveness

    Why are some civic associations more effective at advancing their public agendas, engaging members, and developing leaders? We introduce a multi-dimensional framework for analyzing the comparative effectiveness of member-based civic associations in terms of public influence, member engagement, and leader development. Theoretical expectations in organization studies, sociology, political science, and industrial relations hold that organizations benefiting from either a favorable environment or abundant resources will be most effective. Using systematic data on the Sierra Club's 400 local organizations, we assess these factors alongside an alternative approach focusing on the role of leaders, how they work together, and the activities they carry out to build capacity and conduct programs. While we find modest support for the importance of an organization's available resources and external environment, we find strong evidence for each of our three outcomes supporting our claim that effectiveness in civic associations depends to a large degree on internal organizational practices.
    This publication is Hauser Center Working Paper No. 36. The Hauser Center Working Paper Series was launched during the summer of 2000. The Series enables the Hauser Center to share with a broad audience important works-in-progress written by Hauser Center scholars and researchers.