
    Normalized Affymetrix expression data are biased by G-quadruplex formation

    Probes with runs of four or more guanines (G-stacks) in their sequences can exhibit a level of hybridization that is unrelated to the expression levels of the mRNA that they are intended to measure. This is most likely caused by the formation of G-quadruplexes, where inter-probe guanines form Hoogsteen hydrogen bonds, which probes with G-stacks are capable of forming. We demonstrate that for a specific microarray data set using the Human HG-U133A Affymetrix GeneChip and RMA normalization there is significant bias in the expression levels, the fold change and the correlations between expression levels. These effects grow more pronounced as the number of G-stack probes in a probe set increases. Approximately 14% of the probe sets are directly affected. The analysis was repeated for a number of other normalization pipelines and two, FARMS and PLIER, minimized the bias to some extent. We estimate that ~15% of the data sets deposited in the GEO database are susceptible to the effect. The inclusion of G-stack probes in the affected data sets can bias key parameters used in the selection and clustering of genes. The elimination of these probes from any analysis in such affected data sets outweighs the increase of noise in the signal. © 2011 The Author(s)
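As a rough illustration of the G-stack definition above (a run of four or more guanines), the following Python sketch flags such probes and filters them out of a probe set. The function names and the example sequences are hypothetical and are not taken from the paper; this is not the authors' pipeline, only a minimal sketch of the filtering idea.

```python
import re

# A G-stack is taken here to be a run of four or more consecutive guanines,
# following the definition given in the abstract.
G_STACK = re.compile(r"G{4,}")


def has_g_stack(probe_sequence: str) -> bool:
    """Return True if the probe sequence contains a run of four or more G's."""
    return G_STACK.search(probe_sequence.upper()) is not None


def count_g_stack_probes(probe_set: list[str]) -> int:
    """Count how many probes in a probe set contain a G-stack."""
    return sum(has_g_stack(seq) for seq in probe_set)


def drop_g_stack_probes(probe_set: list[str]) -> list[str]:
    """Return the probe set with all G-stack probes removed."""
    return [seq for seq in probe_set if not has_g_stack(seq)]


if __name__ == "__main__":
    # Made-up sequences, for illustration only.
    probes = ["ACGGGGTACCT", "ACGGGTAGGGA", "TTACGTACGTA"]
    print(count_g_stack_probes(probes))
    print(drop_g_stack_probes(probes))
```

Counting G-stack probes per probe set matters because the abstract reports that the bias grows with that count, so a per-set count is a natural first diagnostic before deciding whether to drop probes.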

    Dealing with missing standard deviation and mean values in meta-analysis of continuous outcomes: a systematic review

    Background: Rigorous, informative meta-analyses rely on availability of appropriate summary statistics or individual participant data. For continuous outcomes, especially those with naturally skewed distributions, summary information on the mean or variability often goes unreported. While full reporting of original trial data is the ideal, we sought to identify methods for handling unreported mean or variability summary statistics in meta-analysis. Methods: We undertook two systematic literature reviews to identify methodological approaches used to deal with missing mean or variability summary statistics. Five electronic databases were searched, in addition to the Cochrane Colloquium abstract books and the Cochrane Statistics Methods Group mailing list archive. We also conducted cited reference searching and emailed topic experts to identify recent methodological developments. Details recorded included the description of the method, the information required to implement the method, any underlying assumptions and whether the method could be readily applied in standard statistical software. We provided a summary description of the methods identified, illustrating selected methods in example meta-analysis scenarios. Results: For missing standard deviations (SDs), following screening of 503 articles, fifteen methods were identified in addition to those reported in a previous review. These included Bayesian hierarchical modelling at the meta-analysis level; summary statistic level imputation based on observed SD values from other trials in the meta-analysis; a practical approximation based on the range; and algebraic estimation of the SD based on other summary statistics. Following screening of 1124 articles for methods estimating the mean, one approximate Bayesian computation approach and three papers based on alternative summary statistics were identified. 
    Illustrative meta-analyses showed that when replacing a missing SD the approximation using the range minimised loss of precision and generally performed better than omitting trials. When estimating missing means, a formula using the median, lower quartile and upper quartile performed best in preserving the precision of the meta-analysis findings, although in some scenarios, omitting trials gave superior results. Conclusions: Methods based on summary statistics (minimum, maximum, lower quartile, upper quartile, median) reported in the literature facilitate more comprehensive inclusion of randomised controlled trials with missing mean or variability summary statistics within meta-analyses.
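The abstract does not state the formulas it evaluated. Purely as an illustration, the Python sketch below uses two commonly quoted rules of thumb: SD approximated as range/4, and the mean approximated as the average of the lower quartile, median and upper quartile. Both choices are assumptions made here and may differ from the exact variants assessed in the review; the function names are hypothetical.

```python
def sd_from_range(minimum: float, maximum: float) -> float:
    """Approximate a missing SD as range / 4 (assumed rule of thumb)."""
    if maximum < minimum:
        raise ValueError("maximum must be >= minimum")
    return (maximum - minimum) / 4.0


def mean_from_quartiles(q1: float, median: float, q3: float) -> float:
    """Approximate a missing mean as (q1 + median + q3) / 3 (assumed formula)."""
    if not q1 <= median <= q3:
        raise ValueError("expected q1 <= median <= q3")
    return (q1 + median + q3) / 3.0


if __name__ == "__main__":
    # Made-up summary statistics for a single hypothetical trial arm.
    print(sd_from_range(10.0, 30.0))
    print(mean_from_quartiles(2.0, 4.0, 9.0))
```

Estimates of this kind let a trial that reports only a range or quartiles stay in the meta-analysis, which is the trade-off against omitting the trial that the abstract describes.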

    Labour Market Racial Discrimination in South Africa Revisited

    Discrimination is a significant issue in labour market economics across developed as well as developing countries. In this paper, we inquire into the actual size of wage discrimination in the Republic of South Africa, accounting for large differences in individual endowments. We apply the Oaxaca-Blinder decomposition as well as propensity score matching to adequately determine the role of discrimination in the wage gaps observed. Although the size of the absolute racial wage gap is enormous, amounting to more than 500%, the actual estimated effect not attributable to other factors ranges between 45% and 55%. This estimator, however, assumes homogeneous discrimination across the wage distribution, while data suggest that there are significant educational, sectoral and occupational differentials. To account for these effects, we implement propensity score matching by finding “statistical twins” of the White population among the Black population, thereby demonstrating how wages differ between these groups in comparable labour market situations. Here too we find that wages for the White population are on average approximately 30% higher, while the effects vary across quartiles of the wage distribution. Keywords: discrimination, Oaxaca-Blinder decomposition, propensity score matching, Republic of South Africa, racial wage gap

    Real-time Planning as Decision-making Under Uncertainty

    In real-time planning, an agent must select the next action to take within a fixed time bound. Many popular real-time heuristic search methods approach this by expanding nodes using time-limited A* and selecting the action leading toward the frontier node with the lowest f value. In this thesis, we reconsider real-time planning as a problem of decision-making under uncertainty. We treat heuristic values as uncertain evidence and we explore several backup methods for aggregating this evidence. We then propose a novel lookahead strategy that expands nodes to minimize risk, the expected regret in case a non-optimal action is chosen. We evaluate these methods in a simple synthetic benchmark and the sliding tile puzzle and find that they outperform previous methods. This work illustrates how uncertainty can arise even when solving deterministic planning problems, due to the inherent ignorance of time-limited search algorithms about those portions of the state space that they have not computed, and how an agent can benefit from explicitly meta-reasoning about this uncertainty.
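The baseline rule mentioned above, committing to the action that leads toward the frontier node with the lowest f value, can be sketched in Python as follows. This shows only the conventional selection step that the thesis sets out to reconsider, not its proposed risk-based strategy. The `FrontierNode` type, its fields and the example values are hypothetical, and f is assumed to be g + h as in standard A*.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrontierNode:
    first_action: str  # top-level action on the path from the current state to this node
    g: float           # cost accumulated from the current state to this node
    h: float           # heuristic estimate of the remaining cost to a goal

    @property
    def f(self) -> float:
        # Standard A* evaluation: cost so far plus estimated cost to go.
        return self.g + self.h


def select_action(frontier: list[FrontierNode]) -> str:
    """Return the top-level action leading toward the lowest-f frontier node."""
    if not frontier:
        raise ValueError("frontier is empty")
    best = min(frontier, key=lambda node: node.f)
    return best.first_action


if __name__ == "__main__":
    # Made-up frontier after a time-limited lookahead.
    frontier = [
        FrontierNode("left", g=2.0, h=5.0),
        FrontierNode("right", g=3.0, h=3.0),
        FrontierNode("up", g=1.0, h=7.0),
    ]
    print(select_action(frontier))
```

This rule treats each f value as if it were exact. The abstract's point is that heuristic values are uncertain evidence, so a single lowest-f node may not justify committing to its action.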

    FindFoci: a focus detection algorithm with automated parameter training that closely matches human assignments, reduces human inconsistencies and increases speed of analysis

    Accurate and reproducible quantification of the accumulation of proteins into foci in cells is essential for data interpretation and for biological inferences. To improve reproducibility, much emphasis has been placed on the preparation of samples, but less attention has been given to reporting and standardizing the quantification of foci. The current standard to quantitate foci in open-source software is to manually determine a range of parameters based on the outcome of one or a few representative images and then apply the parameter combination to the analysis of a larger dataset. Here, we demonstrate the power and utility of using machine learning to train a new algorithm (FindFoci) to determine optimal parameters. FindFoci closely matches human assignments and allows rapid automated exploration of parameter space. Thus, individuals can train the algorithm to mirror their own assignments and then automate focus counting using the same parameters across a large number of images. Using the training algorithm to match human assignments of foci, we demonstrate that applying an optimal parameter combination from a single image is not broadly applicable to analysis of other images scored by the same experimenter or by other experimenters. Our analysis thus reveals wide variation in human assignment of foci and their quantification. To overcome this, we developed training on multiple images, which reduces the inconsistency of using a single or a few images to set parameters for focus detection. FindFoci is provided as an open-source plugin for ImageJ.

    K+a galaxies in the zCOSMOS Survey: Physical properties of systems in their post-starburst phase

    The identities of the main processes triggering and quenching star-formation in galaxies remain unclear. A key stage in evolution, however, appears to be represented by post-starburst galaxies. To investigate their impact on galaxy evolution, we initiated a multiwavelength study of galaxies with k+a spectral features in the COSMOS field. We examine a mass-selected sample of k+a galaxies at z=0.48-1.2 using the spectroscopic zCOSMOS sample. K+a galaxies occupy the brightest tail of the luminosity distribution. They are as massive as quiescent galaxies and populate the green valley in the colour versus luminosity (or stellar mass) distribution. A small percentage (<8%) of these galaxies have radio and/or X-ray counterparts (implying an upper limit to the SFR of ~8Msun/yr). Over the entire redshift range explored, the class of k+a galaxies is morphologically a heterogeneous population with a similar incidence of bulge-dominated and disky galaxies. This distribution does not vary with the strength of the Hdelta absorption line but instead with stellar mass in a way reminiscent of the well-known mass-morphology relation. Although k+a galaxies are also found in underdense regions, they appear to reside typically in a similarly rich environment as quiescent galaxies on a physical scale of ~2-8Mpc, and in groups they show a morphological early-to-late type ratio similar to the quiescent galaxy class. With the current data set, we do not find evidence of statistically significant evolution in either the number/mass density of k+a galaxies at intermediate redshift with respect to the local values, or the spectral properties. Those galaxies, which are affected by a sudden quenching of their star-formation activity, may increase the stellar mass of the red-sequence by up to a non-negligible level of ~10%. Comment: 17 pages, 9 figures. Accepted for publication in Astronomy and Astrophysics on 09/09/2009 (no changes wrt v1)