
    Quantifying Selective Reporting and the Proteus Phenomenon for Multiple Datasets with Similar Bias

    Meta-analyses play an important role in synthesizing evidence from diverse studies and datasets that address similar questions. A major obstacle for meta-analyses arises from biases in reporting. In particular, it is speculated that findings which do not achieve formal statistical significance are less likely to be reported than statistically significant findings. Moreover, the patterns of bias can be complex and may also depend on the timing of the research results and their relationship with previously published work. In this paper, we present an approach that is specifically designed to analyze large-scale datasets on published results. Such datasets are currently emerging in diverse research fields, particularly in molecular medicine. We use our approach to investigate a dataset on Alzheimer's disease (AD) that covers 1167 results from case-control studies on 102 genetic markers. We observe that initial studies on a genetic marker tend to be substantially more biased than subsequent replications. The chances for initial, statistically non-significant results to be published are estimated to be about 44% (95% CI, 32% to 63%) relative to statistically significant results, while statistically non-significant replications have almost the same chance of being published as statistically significant replications (84%; 95% CI, 66% to 107%). Early replications tend to be biased against initial findings, an observation previously termed the Proteus phenomenon: the chances for non-significant studies going in the same direction as the initial result are estimated to be lower than the chances for non-significant studies opposing the initial result (73%; 95% CI, 55% to 96%). Such dynamic patterns of bias are difficult to capture with conventional methods, which typically assume that simple publication bias operates. Our approach captures and corrects for complex dynamic patterns of bias, and thereby helps generate conclusions from published results that are more robust against the presence of different coexisting types of selective reporting.
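    The abstract quantifies selective reporting through relative publication chances but does not spell out the model. Purely as an illustrative sketch (not the authors' actual method), a generic one-parameter selection model of this kind could be written as follows; the function names, the single weight parameter w, and the optimizer choice are assumptions for illustration.

```python
# Hedged sketch, NOT the paper's model: significant results (|z| > 1.96) are
# assumed always published; non-significant results are published with relative
# probability w. Maximum likelihood recovers the true effect mu and w from the
# published effect estimates and their standard errors.
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize

def neg_log_likelihood(params, effects, ses):
    mu, w = params
    significant = np.abs(effects / ses) > 1.96
    weight = np.where(significant, 1.0, w)     # publication weight of each result
    # Probability that a study with this standard error yields a significant result.
    p_sig = 1.0 - (norm.cdf(1.96 - mu / ses) - norm.cdf(-1.96 - mu / ses))
    norm_const = p_sig + w * (1.0 - p_sig)     # expected publication probability
    return -np.sum(np.log(weight) + norm.logpdf(effects, loc=mu, scale=ses)
                   - np.log(norm_const))

def fit_selection_model(effects, ses):
    """Estimate (mu, w); w < 1 means non-significant results are under-published."""
    res = minimize(neg_log_likelihood, x0=[0.0, 0.5],
                   args=(np.asarray(effects, float), np.asarray(ses, float)),
                   bounds=[(None, None), (1e-6, None)], method="L-BFGS-B")
    return res.x
```

    In this toy version, w plays the role of the relative publication chances quoted above (about 44% for initial non-significant results, 84% for non-significant replications); the actual analysis additionally distinguishes initial studies from replications and models the direction of effects.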

    Decision-Making in Research Tasks with Sequential Testing

    Background: In a recent controversial essay published by J. P. A. Ioannidis in PLoS Medicine, it has been argued that in some research fields, most of the published findings are false. Based on theoretical reasoning, it can be shown that small effect sizes, error-prone tests, low priors of the tested hypotheses and biases in the evaluation and publication of research findings increase the fraction of false positives. These findings raise concerns about the reliability of research. However, they are based on a very simple scenario of scientific research, where single tests are used to evaluate independent hypotheses. Methodology/Principal Findings: In this study, we present computer simulations and experimental approaches for analyzing more realistic scenarios. In these scenarios, research tasks are solved sequentially, i.e. subsequent tests can be chosen depending on previous results. We investigate simple sequential testing and scenarios where only a selected subset of results can be published and used for future rounds of test choice. Results from computer simulations indicate that for the tasks analyzed in this study, the fraction of false among the positive findings declines over several rounds of testing if the most informative tests are performed. Our experiments show that human subjects frequently perform the most informative tests, leading to a decline of false positives as expected from the simulations. Conclusions/Significance: For the research tasks studied here, findings tend to become more reliable over time. We also find that performance was surprisingly inefficient in those experimental settings where not all performed tests could be published. Our results may help optimize existing procedures used in the practice of scientific research and provide guidance for the development of novel forms of scholarly communication.
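    The abstract does not give the simulation details; as an illustration only of the basic mechanism, namely that following up only positive findings raises the prior among retested hypotheses, a toy sequential-testing simulation might look like the sketch below. All parameter values and the retest-positives-only strategy are assumptions, not taken from the study.

```python
# Toy sketch, not the study's actual simulation: hypotheses with a low prior of
# being true are tested in rounds; only hypotheses that tested positive are
# carried into the next round, so the fraction of false positives among the
# positive findings declines over time.
import numpy as np

rng = np.random.default_rng(0)
n_hypotheses, prior, alpha, power, n_rounds = 100_000, 0.1, 0.05, 0.8, 4

is_true = rng.random(n_hypotheses) < prior
active = np.ones(n_hypotheses, dtype=bool)   # hypotheses still being pursued

for round_idx in range(1, n_rounds + 1):
    p_positive = np.where(is_true, power, alpha)   # chance of a positive test result
    positive = active & (rng.random(n_hypotheses) < p_positive)
    fp_fraction = float(np.mean(~is_true[positive])) if positive.any() else float("nan")
    print(f"round {round_idx}: false positives among positives = {fp_fraction:.3f}")
    active = positive   # only positive findings are followed up in the next round
```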

    Measuring co-authorship and networking-adjusted scientific impact

    Appraisal of the scientific impact of researchers, teams and institutions with productivity and citation metrics has major repercussions. Funding and promotion of individuals and survival of teams and institutions depend on publications and citations. In this competitive environment, the number of authors per paper is increasing and apparently some co-authors do not satisfy authorship criteria. Listing of individual contributions is still sporadic and also open to manipulation. Metrics are needed to measure the networking intensity for a single scientist or group of scientists, accounting for patterns of co-authorship. Here, I define I1 for a single scientist as the number of authors who appear in at least I1 papers of the specific scientist. For a group of scientists or an institution, In is defined as the number of authors who appear in at least In papers that bear the affiliation of the group or institution. I1 depends on the number of papers authored, Np. The power exponent R of the relationship between I1 and Np categorizes scientists as solitary (R>2.5), nuclear (R=2.25-2.5), networked (R=2-2.25), extensively networked (R=1.75-2) or collaborators (R<1.75). R may be used to adjust the citation impact of a scientist for co-authorship networking. In similarly provides a simple measure of the effective networking size to adjust the citation impact of groups or institutions. Empirical data are provided for single scientists and institutions for the proposed metrics. Cautious adoption of adjustments for co-authorship and networking in scientific appraisals may offer incentives for more accountable co-authorship behaviour in published articles.
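    To make the I1 definition above concrete, the following sketch shows one way it could be computed from a scientist's list of papers; the data layout and function name are illustrative assumptions, not from the paper.

```python
# Illustrative sketch of the I1 index defined above: the largest k such that at
# least k distinct co-authors each appear on at least k of the scientist's papers
# (computed analogously to the h-index, but over co-author appearance counts).
from collections import Counter

def i1_index(papers, scientist):
    """papers: list of author-name lists; scientist: name of the focal author."""
    counts = Counter(author
                     for authors in papers if scientist in authors
                     for author in authors if author != scientist)
    appearances = sorted(counts.values(), reverse=True)
    i1 = 0
    for rank, n_joint_papers in enumerate(appearances, start=1):
        if n_joint_papers >= rank:
            i1 = rank
    return i1

papers = [["A", "B"], ["A", "B", "C"], ["A", "D"]]
print(i1_index(papers, "A"))  # B appears on 2 papers, C and D on 1 each -> 1
```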

    Reporting of Human Genome Epidemiology (HuGE) association studies: An empirical assessment

    Background: Several thousand human genome epidemiology association studies are published every year investigating the relationship between common genetic variants and diverse phenotypes. Transparent reporting of study methods and results allows readers to better assess the validity of study findings. Here, we document reporting practices of human genome epidemiology studies. Methods: Articles were randomly selected from a continuously updated database of human genome epidemiology association studies to be representative of genetic epidemiology literature. The main analysis evaluated 315 articles published in 2001–2003. For a comparative update, we evaluated 28 more recent articles published in 2006, focusing on issues that were poorly reported in 2001–2003. Results: During both time periods, most studies comprised relatively small study populations and examined one or more genetic variants within a single gene. Articles were inconsistent in reporting the data needed to assess selection bias and the methods used to minimize misclassification (of the genotype, outcome, and environmental exposure) or to identify population stratification. Statistical power, the use of unrelated study participants, and the use of replicate samples were reported more often in articles published during 2006 when compared with the earlier sample. Conclusion: We conclude that many items needed to assess error and bias in human genome epidemiology association studies are not consistently reported. Although some improvements were seen over time, reporting guidelines and online supplemental material may help enhance the transparency of this literature.

    International ranking systems for universities and institutions: a critical appraisal

    Background: Ranking of universities and institutions has attracted wide attention recently. Several systems have been proposed that attempt to rank academic institutions worldwide. Methods: We review the two most publicly visible ranking systems, the Shanghai Jiao Tong University 'Academic Ranking of World Universities' and the Times Higher Education Supplement 'World University Rankings', and also briefly review other ranking systems that use different criteria. We assess the construct validity for educational and research excellence and the measurement validity of each of the proposed ranking criteria, and try to identify generic challenges in international ranking of universities and institutions. Results: None of the reviewed criteria for international ranking seems to have very good construct validity for both educational and research excellence, and most don't have very good construct validity even for just one of these two aspects of excellence. Measurement error for many items is also considerable or is not possible to determine due to lack of publication of the relevant data and methodology details. The concordance between the 2006 rankings by Shanghai and Times is modest at best, with only 133 universities shared in their top 200 lists. The examination of the existing international ranking systems suggests that generic challenges include adjustment for institutional size, definition of institutions, implications of average measurements of excellence versus measurements of extremes, adjustments for scientific field, time frame of measurement and allocation of credit for excellence. Conclusion: Naïve lists of international institutional rankings that do not address these fundamental challenges with transparent methods are misleading and should be abandoned. We make some suggestions on how focused and standardized evaluations of excellence could be improved and placed in proper context.

    Experimental evaluation of train and test split strategies in link prediction

    In link prediction, the goal is to predict which links will appear in the future of an evolving network. To estimate the performance of these models in a supervised machine learning setting, disjoint and independent train and test sets are needed. However, objects in a real-world network are inherently related to each other. Therefore, it is far from trivial to separate candidate links into these disjoint sets. Here we characterize and empirically investigate the two dominant approaches from the literature for creating separate train and test sets in link prediction, referred to as random and temporal splits. Comparing the performance of these two approaches on several large temporal network datasets, we find evidence that random splits may result in overly optimistic results, whereas a temporal split may give a fairer and more realistic indication of performance. Results appear robust to the selection of temporal intervals. These findings will be of interest to researchers who employ link prediction or other machine learning tasks in networks.
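    The abstract contrasts random and temporal splits without giving implementation details; as an illustrative sketch only (the edge-list format, field order, and split ratio are assumptions), the two strategies can be set up roughly as follows.

```python
# Illustrative sketch, not the paper's code: two ways to split a timestamped
# edge list into train and test sets for supervised link prediction.
import random

def random_split(edges, test_fraction=0.2, seed=0):
    """Ignore time: shuffle the edges and hold out a random subset for testing."""
    edges = list(edges)
    random.Random(seed).shuffle(edges)
    cut = int(len(edges) * (1 - test_fraction))
    return edges[:cut], edges[cut:]

def temporal_split(edges, test_fraction=0.2):
    """Respect time: train on the earliest edges, test on the most recent ones."""
    edges = sorted(edges, key=lambda e: e[2])   # each edge is (u, v, timestamp)
    cut = int(len(edges) * (1 - test_fraction))
    return edges[:cut], edges[cut:]
```

    A random split lets "future" edges leak into training, which is one way the overly optimistic results mentioned above can arise; the temporal split evaluates the model the way it would actually be used.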

    Transparent and accurate reporting increases reliability, utility, and impact of your research: reporting guidelines and the EQUATOR Network

    Although current electronic methods of scientific publishing offer increased opportunities for publishing all research studies and describing them in sufficient detail, the health research literature still suffers from many shortcomings. These shortcomings seriously undermine the value and utility of the literature and waste scarce resources invested in the research. In recent years there have been several positive steps aimed at improving this situation, such as a strengthening of journals' policies on research publication and the wide requirement to register clinical trials.

    Effectiveness of strategies to increase the validity of findings from association studies: size vs. replication

    Background: The capacity of multiple comparisons to produce false positive findings in genetic association studies is abundantly clear. To address this issue, the concept of the false positive report probability (FPRP) measures "the probability of no true association between a genetic variant and disease given a statistically significant finding". This concept involves the notion of a prior probability of an association between a genetic variant and a disease, making it difficult to achieve acceptable levels for the FPRP when the prior probability is low. Increasing the sample size is of limited efficiency in improving the situation. Methods: To further clarify this problem, the concept of the true report probability (TRP) is introduced by analogy to the positive predictive value (PPV) of diagnostic testing. The approach is extended to consider the effects of replication studies. The formula for the TRP after k replication studies is mathematically derived and shown to depend only on the prior probability, alpha, power, and the number of replication studies. Results: Case-control association studies are used to illustrate the TRP concept for replication strategies. Based on power considerations, a relationship is derived between the TRP after k replication studies and the sample size of each individual study. That relationship enables study designers to optimize their study plans. Further, it is demonstrated that replication is efficient in increasing the TRP even in the case of a low prior probability of an association and without requiring very large sample sizes for each individual study. Conclusions: The true report probability is a comprehensive and straightforward concept for assessing the validity of positive statistical testing results in association studies. By its extension to replication strategies, it can be demonstrated in a transparent manner that replication is highly effective in distinguishing spurious from true associations. Based on the generalized TRP method for replication designs, optimal research strategy and sample size planning become possible.
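    The abstract states that the TRP after k replication studies depends only on the prior probability, alpha, power, and k, but does not reproduce the formula. The sketch below shows one plausible form, assuming a finding is accepted only if the initial study and all k replications are statistically significant; it is an illustration by analogy to the PPV, not necessarily the authors' exact derivation.

```python
# Hedged sketch: TRP after k replications, ASSUMING acceptance requires the
# initial study and all k replications to be significant. By analogy to the PPV:
#   TRP_k = prior * power^(k+1) / (prior * power^(k+1) + (1 - prior) * alpha^(k+1))
def true_report_probability(prior, alpha, power, k):
    true_and_accepted = prior * power ** (k + 1)
    false_and_accepted = (1 - prior) * alpha ** (k + 1)
    return true_and_accepted / (true_and_accepted + false_and_accepted)

# Even with a low prior, a couple of replications raise the TRP sharply:
print(true_report_probability(prior=0.01, alpha=0.05, power=0.8, k=0))  # ~0.14
print(true_report_probability(prior=0.01, alpha=0.05, power=0.8, k=2))  # ~0.98
```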

    The influence of the team in conducting a systematic review

    There is an increasing body of research documenting flaws in the methodological and reporting conduct of many published systematic reviews. When good systematic review practice is questioned, attention is rarely turned to the composition of the team that conducted the systematic review. This commentary highlights a number of relevant articles indicating how the composition of the review team could jeopardise the integrity of the systematic review and its conclusions. Key biases such as sponsorship bias and researcher allegiance require closer attention, but there may also be less obvious affiliations in teams conducting secondary evidence syntheses. The importance of transparency and disclosure is now firmly on the agenda for clinical trials and primary research, but the meta-biases to which systematic reviews may be exposed now require further scrutiny.

    Systematic meta-analyses and field synopsis of genetic association studies of violence and aggression

    A large number of candidate gene studies for aggression and violence have been conducted. Successful identification of associations between genetic markers and aggression would contribute to understanding the neurobiology of antisocial behavior and potentially provide useful tools for risk prediction and therapeutic targets for high-risk groups of patients and offenders. We systematically reviewed the literature and assessed the evidence on genetic association studies of aggression and related outcomes in order to provide a field synopsis. We searched the PubMed and HuGE Navigator databases and sought additional data through reviewing reference lists and correspondence with investigators. Genetic association studies were included if outcome data on aggression or violent behavior, either as a binary outcome or as a quantitative trait, were provided. From 1331 potentially relevant investigations, 185 studies constituting 277 independent associations on 31 genes fulfilled the predetermined selection criteria. Data from variants investigated in three or more samples were combined in meta-analyses, and potential sources of heterogeneity were investigated using subgroup analyses. In the primary analyses, which used relaxed inclusion criteria, we found no association between any polymorphism analyzed and aggression at the 5% level of significance. Subgroup analyses, including by severity of outcome, age group, characteristics of the sample and ethnicity, did not demonstrate any consistent findings. Current evidence does not support the use of such genes to predict dangerousness or as markers for therapeutic interventions.
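    The abstract mentions that variants investigated in three or more samples were combined in meta-analyses with investigation of heterogeneity; as a generic illustration only (the DerSimonian-Laird random-effects estimator used here is an assumption about the method, not taken from the paper), such a combination of log odds ratios might look like this.

```python
# Generic sketch of a random-effects meta-analysis (DerSimonian-Laird) of log
# odds ratios, of the kind commonly used in genetic association field synopses;
# not the authors' actual analysis code.
import numpy as np

def random_effects_meta(log_ors, variances):
    """Return the pooled log OR, its standard error, and the I^2 statistic (%)."""
    log_ors, variances = np.asarray(log_ors, float), np.asarray(variances, float)
    w = 1.0 / variances                                   # fixed-effect weights
    pooled_fixed = np.sum(w * log_ors) / np.sum(w)
    q = np.sum(w * (log_ors - pooled_fixed) ** 2)         # Cochran's Q
    df = len(log_ors) - 1
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - df) / c)                         # between-study variance
    w_re = 1.0 / (variances + tau2)                       # random-effects weights
    pooled = np.sum(w_re * log_ors) / np.sum(w_re)
    se = np.sqrt(1.0 / np.sum(w_re))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0   # heterogeneity statistic
    return pooled, se, i2
```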