60 research outputs found

    Is omission of free text records a possible source of data loss and bias in Clinical Practice Research Datalink studies? A case-control study.

    This is the final version of the article. Available from the publisher via the DOI in this record. OBJECTIVES: To estimate data loss and bias in studies of Clinical Practice Research Datalink (CPRD) data that restrict analyses to Read codes, omitting anything recorded as text. DESIGN: Matched case-control study. SETTING: Patients contributing data to the CPRD. PARTICIPANTS: 4915 bladder and 3635 pancreatic cancer cases diagnosed between 1 January 2000 and 31 December 2009, matched on age, sex and general practitioner practice to up to 5 controls (bladder: n=21 718; pancreas: n=16 459). The analysis period was the year before cancer diagnosis. PRIMARY AND SECONDARY OUTCOME MEASURES: Frequency of haematuria, jaundice and abdominal pain, grouped by recording style: Read code or text-only (ie, hidden text). The association between recording style and case-control status (χ² test). For each feature, the odds ratio (OR; conditional logistic regression) and positive predictive value (PPV; Bayes' theorem) for cancer, before and after addition of hidden text records. RESULTS: Of the 20 958 total records of the features, 7951 (38%) were recorded in hidden text. Hidden text recording was more strongly associated with controls than with cases for haematuria (140/336=42% vs 556/3147=18%) in bladder cancer (χ² test, p<0.001), and for jaundice (21/31=67% vs 463/1565=30%, p<0.0001) and abdominal pain (323/1126=29% vs 397/1789=22%, p<0.001) in pancreatic cancer. Adding hidden text records corrected PPVs of haematuria for bladder cancer from 4.0% (95% CI 3.5% to 4.6%) to 2.9% (2.6% to 3.2%), and of jaundice for pancreatic cancer from 12.8% (7.3% to 21.6%) to 6.3% (4.5% to 8.7%). Adding hidden text records did not alter the PPV of abdominal pain for bladder (codes: 0.14%, 0.13% to 0.16% vs codes plus hidden text: 0.14%, 0.13% to 0.15%) or pancreatic (0.23%, 0.21% to 0.25% vs 0.21%, 0.20% to 0.22%) cancer. CONCLUSIONS: Omission of text records from CPRD studies introduces bias that inflates outcome measures for recognised alarm symptoms. This potentially reinforces clinicians' views of the known importance of these symptoms, marginalising the significance of 'low-risk but not no-risk' symptoms. SJP is funded by a University of Exeter PhD studentship. This report presents independent research part funded by the National Institute for Health Research Programme Grants for Applied Research programme (RP-PG-0608-10045). The views expressed are those of the authors and not necessarily those of the National Health Service, the National Institute for Health Research, or the Department of Health.
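
The haematuria comparison reported in the abstract above can be reproduced approximately from the published counts. The following is a minimal sketch, assuming an uncorrected Pearson χ² test on a 2×2 table of recording style by case-control status; the abstract does not say whether the authors applied a continuity correction, so the statistic here is illustrative rather than a reconstruction of their analysis. The function and variable names are hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Haematuria records in the bladder cancer analysis, as reported in the abstract:
# controls 140 of 336 in hidden text, cases 556 of 3147 in hidden text.
controls_hidden, controls_total = 140, 336
cases_hidden, cases_total = 556, 3147

# Records that carried a Read code are the remainder in each group.
controls_coded = controls_total - controls_hidden
cases_coded = cases_total - cases_hidden

pct_controls = 100 * controls_hidden / controls_total
pct_cases = 100 * cases_hidden / cases_total
chi2 = chi_square_2x2(controls_hidden, controls_coded, cases_hidden, cases_coded)

# Critical value of the chi-square distribution for p = 0.001 with 1 degree of freedom.
CRITICAL_P001_DF1 = 10.828

print(f"hidden text: controls {pct_controls:.0f}%, cases {pct_cases:.0f}%")
print(f"chi-square exceeds the p = 0.001 threshold: {chi2 > CRITICAL_P001_DF1}")
```

Comparing the statistic with the tabulated critical value avoids needing a statistics library; a full analysis would compute the p-value directly.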

    Mixed-marker approach suggests maternal philopatry and sex-biased behaviours of narrow sawfish Anoxypristis cuspidata

    ABSTRACT: The narrow sawfish Anoxypristis cuspidata belongs to the most endangered family of chondrichthyan fishes, the sawfishes (Pristidae). This species has undergone significant declines in geographic range and abundance due to anthropogenic activities including fishing and habitat destruction. Very little is known of adult movements within its distribution. In order to better manage and protect this endangered species, understanding patterns of habitat use, connectivity and behaviour is important. Using a combination of partial mitochondrial sequences (control region [CR] and NADH dehydrogenase 4 [ND4]) and nuclear markers (microsatellites), this study assessed the genetic population structure of A. cuspidata in Australia and Papua New Guinea. Significant population structuring using mitochondrial DNA was found between the east Australian coast, Gulf of Papua and Gulf of Carpentaria (using concatenated CR and ND4 markers) (analysis of molecular variance [AMOVA], ΦST = 0.082, p = FST = 0.012, p = 1.000). Our results suggest that a combination of historic genetic drift, maternal natal philopatry and possible male-biased dispersal likely drives the genetic patterns observed. Given the endangered status of A. cuspidata and the lack of knowledge about it, this study presents important insights that may be used to inform management efforts.

    A compendium and functional characterization of mammalian genes involved in adaptation to Arctic or Antarctic environments

    Many mammals are well adapted to surviving in extremely cold environments. These species have likely accumulated genetic changes that help them efficiently cope with low temperatures. It is not known whether the same genes related to cold adaptation in one species would be under selection in another species. The aims of this study therefore were: to create a compendium of mammalian genes related to adaptations to a low temperature environment; to identify genes related to cold tolerance that have been subjected to independent positive selection in several species; and to determine promising candidate genes/pathways/organs for further empirical research on cold adaptation in mammals.

    Identification and Analysis of Co-Occurrence Networks with NetCutter

    BACKGROUND: Co-occurrence analysis is a technique often applied in text mining, comparative genomics, and promoter analysis. The methodologies and statistical models used to evaluate the significance of association between co-occurring entities are quite diverse, however. METHODOLOGY/PRINCIPAL FINDINGS: We present a general framework for co-occurrence analysis based on a bipartite graph representation of the data, a novel co-occurrence statistic, and software performing co-occurrence analysis as well as generation and analysis of co-occurrence networks. We show that the overall stringency of co-occurrence analysis depends critically on the choice of the null-model used to evaluate the significance of co-occurrence and find that random sampling from a complete permutation set of the bipartite graph permits co-occurrence analysis with optimal stringency. We show that the Poisson-binomial distribution is the most natural co-occurrence probability distribution when vertex degrees of the bipartite graph are variable, which is usually the case. Calculation of Poisson-binomial P-values is difficult, however. Therefore, we propose a fast bi-binomial approximation for calculation of P-values and show that this statistic is superior to other measures of association such as the Jaccard coefficient and the uncertainty coefficient. Furthermore, co-occurrence analysis of more than two entities can be performed using the same statistical model, which leads to increased signal-to-noise ratios, robustness towards noise, and the identification of implicit relationships between co-occurring entities. Using NetCutter, we identify a novel protein biosynthesis related set of genes that are frequently coordinately deregulated in human cancer related gene expression studies. NetCutter is available at http://bio.ifom-ieo-campus.it/NetCutter/. CONCLUSION: Our approach can be applied to any set of categorical data where co-occurrence analysis might reveal functional relationships such as clinical parameters associated with cancer subtypes or SNPs associated with disease phenotypes. The stringency of our approach is expected to offer an advantage in a variety of applications.
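
To illustrate the Poisson-binomial distribution mentioned in the abstract above: it describes the number of successes in independent trials whose success probabilities differ. The following is a minimal sketch of an exact upper-tail P-value computed by dynamic programming. It is not NetCutter's bi-binomial approximation, which the abstract does not describe in detail, and the per-trial probabilities and function names are hypothetical.

```python
def poisson_binomial_pmf(probs):
    """Exact PMF of the number of successes in independent trials.

    probs[i] is the success probability of trial i. The result is a list
    whose entry k is P(X = k), built up one trial at a time.
    """
    pmf = [1.0]  # with zero trials, X = 0 with certainty
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)  # this trial fails
            nxt[k + 1] += mass * p      # this trial succeeds
        pmf = nxt
    return pmf


def upper_tail_p_value(probs, observed):
    """P(X >= observed) under the Poisson-binomial null model."""
    return sum(poisson_binomial_pmf(probs)[observed:])


# Hypothetical example: two entities could co-occur in each of five lists,
# with a different chance probability per list (for instance, because the
# lists differ in size). They were observed together in four of them.
chance_per_list = [0.05, 0.10, 0.02, 0.20, 0.08]
p_value = upper_tail_p_value(chance_per_list, observed=4)
print(f"P(co-occurrences >= 4) = {p_value:.3g}")
```

The exact recursion costs time quadratic in the number of trials, which may be one reason an approximation is attractive for large data sets.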

    Do diagnostic delays in cancer matter?

    BACKGROUND: The United Kingdom has poorer cancer outcomes than many other countries, due partly to delays in diagnosing symptomatic cancer, leading to more advanced stage at diagnosis. Delays can occur at the level of patients, primary care, systems and secondary care. There is considerable potential for interventions to minimise delays and lead to earlier-stage diagnosis. METHODS: Scoping review of the published studies, with a focus on methodological issues. RESULTS: Trial data in this area are lacking, and observational studies often show no association or a negative one. This review offers methodological explanations for these counter-intuitive findings. CONCLUSION: While diagnostic delays do matter, their importance is uncertain and must be determined through more sophisticated methods.

    A multilocus assay reveals high nucleotide diversity and limited differentiation among Scandinavian willow grouse (Lagopus lagopus)

    BACKGROUND: There is so far very little data on autosomal nucleotide diversity in birds, except for data from the domesticated chicken and some passerine species. Estimates of nucleotide diversity reported so far in birds have been high (~10⁻³) and a likely explanation for this is the generally higher effective population sizes compared to mammals. In this study, the level of nucleotide diversity has been examined in the willow grouse, a non-domesticated bird species from the order Galliformes, which also holds the chicken. The willow grouse (Lagopus lagopus) has an almost circumpolar distribution but is absent from Greenland and the north Atlantic islands. It primarily inhabits tundra, forest edge habitats and sub-alpine vegetation. Willow grouse are hunted throughout their range, and regionally the species is a game bird of great cultural and economic importance. RESULTS: We sequenced 18 autosomal protein coding loci from approximately 15–18 individuals per population. We found a total of 127 SNPs, which corresponds to 1 SNP every 51 bp. 26 SNPs were amino acid replacement substitutions. Total nucleotide diversity (πt) was between 1.30 × 10⁻⁴ and 7.66 × 10⁻³ (average πt = 2.72 × 10⁻³ ± 2.06 × 10⁻³) and silent nucleotide diversity varied between 4.20 × 10⁻⁴ and 2.76 × 10⁻² (average πS = 9.22 × 10⁻³ ± 7.43 × 10⁻⁴). The synonymous diversity is approximately 20 times higher than in humans and two times higher than in chicken. Non-synonymous diversity was on average 18 times lower than the synonymous diversity and varied between 0 and 4.90 × 10⁻³ (average πa = 5.08 × 10⁻⁴ ± 7.43 × 10³), which suggests that purifying selection is strong in these genes. FST values based on synonymous SNPs varied between -5.60 × 10⁻⁴ and 0.20 among loci and revealed low levels of differentiation among the four localities, with an overall value of FST = 0.03 (95% CI: 0.006 – 0.057) over 60 unlinked loci. Non-synonymous SNPs gave similar results. Low levels of linkage disequilibrium were observed within genes, with an average r² = 0.084 ± 0.110, which is expected for a large outbred population with no population differentiation. The mean per site per generation recombination parameter (ρ) was comparably high (0.028 ± 0.018), indicating high recombination rates in these genes. CONCLUSION: We found unusually high levels of nucleotide diversity in the Scandinavian willow grouse as well as very little population structure among localities with up to 1647 km distance. There are also low levels of linkage disequilibrium within the genes and the population recombination rate is high, which is indicative of an old panmictic population, where recombination has had time to break up any haplotype blocks. The non-synonymous nucleotide diversity is low compared with the silent, which is in agreement with effective purifying selection, possibly due to the large effective population size.
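
Nucleotide diversity (π), the statistic reported in the abstract above, is commonly defined as the average number of pairwise differences per site among sampled sequences. The following is a minimal sketch of that definition on a toy alignment. The sequences are invented for illustration and are not willow grouse data, and the study may have used different estimators or software.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site for equal-length aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned, non-empty and of equal length")

    # Count mismatching sites for every unordered pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(seqs, 2)
    )
    n_pairs = len(seqs) * (len(seqs) - 1) // 2
    return total_diffs / (n_pairs * length)


# Toy alignment (hypothetical): three sequences of four sites each.
toy_alignment = ["AAAA", "AAAT", "AATT"]
pi = nucleotide_diversity(toy_alignment)
print(f"pi = {pi:.4f}")
```

In the toy alignment the three pairs differ at 1, 2 and 1 sites, so π is 4 differences over 3 pairs and 4 sites.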

    Actinomycete integrative and conjugative elements

    This paper reviews current knowledge on actinomycete integrative and conjugative elements (AICEs). The best characterised AICEs, pSAM2 of Streptomyces ambofaciens (10.9 kb), SLP1 (17.3 kb) of Streptomyces coelicolor and pMEA300 of Amycolatopsis methanolica (13.3 kb), are present as integrative elements in specific tRNA genes, and are capable of conjugative transfer. These AICEs have a highly conserved structural organisation, with functional modules for excision/integration, replication, conjugative transfer, and regulation. Recently, it has been shown that pMEA300 and the related elements pMEA100 of Amycolatopsis mediterranei and pSE211 of Saccharopolyspora erythraea form a novel group of AICEs, the pMEA-elements, based on the unique characteristics of their replication initiator protein RepAM. Evaluation of a large collection of Amycolatopsis isolates has allowed identification of multiple pMEA-like elements. Our data show that, as AICEs, they mainly coevolved with their natural host in an integrated form, rather than being dispersed via horizontal gene transfer. The pMEA-like elements could be separated into two distinct populations from different geographical origins. One group was most closely related to pMEA300 and was found in isolates from Australia and Asia, whereas pMEA100-related sequences were present in European isolates. Genome sequence data have contributed enormously to the recent insight that AICEs are present in many actinomycete genera. The sequence data also provide more insight into their evolutionary relationships, revealing their modular composition and their likely combined descent from bacterial plasmids and bacteriophages. Evidence is accumulating that AICEs act as modulators of host genome diversity and are also involved in the acquisition of secondary metabolite clusters and foreign DNA via horizontal gene transfer. 
Although still speculative, these AICEs may play a role in the spread of antibiotic resistance factors into pathogenic bacteria. The novel insights on AICE characteristics presented in this review may be used for the effective construction of new vectors that allow us to engineer and optimise strains for the production of commercially and medically interesting secondary metabolites and bioactive proteins.

    Optimization of Muscle Activity for Task-Level Goals Predicts Complex Changes in Limb Forces across Biomechanical Contexts

    Optimality principles have been proposed as a general framework for understanding motor control in animals and humans, largely based on their ability to predict general features of movement in idealized motor tasks. However, generalizing these concepts past proof-of-principle to understand the neuromechanical transformation from task-level control to detailed execution-level muscle activity and forces during behaviorally-relevant motor tasks has proved difficult. In an unrestrained balance task in cats, we demonstrate that achieving task-level constraints on center of mass forces and moments while minimizing control effort predicts detailed patterns of muscle activity and ground reaction forces in an anatomically-realistic musculoskeletal model. Whereas optimization is typically used to resolve redundancy at a single level of the motor hierarchy, we simultaneously resolved redundancy across both muscles and limbs and directly compared predictions to experimental measures across multiple perturbation directions that elicit different intra- and interlimb coordination patterns. Further, although some candidate task-level variables and cost functions generated indistinguishable predictions in a single biomechanical context, we identified a common optimization framework that could predict up to 48 experimental conditions per animal (n = 3) across both perturbation directions and different biomechanical contexts created by altering animals' postural configuration. Predictions were further improved by imposing experimentally-derived muscle synergy constraints, suggesting additional task variables or costs that may be relevant to the neural control of balance. These results suggested that reduced-dimension neural control mechanisms such as muscle synergies can achieve similar kinetics to the optimal solution, but with increased control effort (≈2×) compared to individual muscle control. Our results are consistent with the idea that hierarchical, task-level neural control mechanisms previously associated with voluntary tasks may also be used in automatic brainstem-mediated pathways for balance.

    Assignment of PolyProline II Conformation and Analysis of Sequence – Structure Relationship

    BACKGROUND: Secondary structures are elements of great importance in structural biology, biochemistry and bioinformatics. They are broadly composed of two repetitive structures, namely α-helices and β-sheets, apart from turns, and the rest is associated with coil. These repetitive secondary structures have specific and conserved biophysical and geometric properties. PolyProline II (PPII) helix is yet another interesting repetitive structure which is less frequent and not usually associated with stabilizing interactions. Recent studies have shown that PPII frequency is higher than expected, and that PPII helices could have an important role in protein-protein interactions. METHODOLOGY/PRINCIPAL FINDINGS: A major factor that limits the study of PPII is that its assignment cannot be carried out with the most commonly used secondary structure assignment methods (SSAMs). The purpose of this work is to propose a PPII assignment methodology that can be defined in the frame of DSSP secondary structure assignment. Considering the ambiguity in PPII assignments by different methods, a consensus assignment strategy was utilized. To define the most consensual rule of PPII assignment, three SSAMs that can assign PPII were compared and analyzed. The assignment rule was defined to have a maximum coverage of all assignments made by these SSAMs. Not many constraints were added to the assignment and only PPII helices of at least 2 residues in length are defined. CONCLUSIONS/SIGNIFICANCE: The simple rules designed in this study for characterizing PPII conformation lead to the assignment of 5% of all amino acids as PPII. Sequence-structure relationships associated with PPII, defined by the different SSAMs, underline a few striking differences. A specific study of amino acid preferences in their N and C-cap regions was carried out, along with an analysis of their solvent accessibility and contact patterns. The assignment of PPII can thus be coupled with DSSP, which opens a simple way for further analysis in this field.

    Population genomics of marine zooplankton

    Author Posting. © The Author(s), 2017. This is the author's version of the work. It is posted here for personal use, not for redistribution. The definitive version was published in Bucklin, Ann et al. "Population Genomics of Marine Zooplankton." Population Genomics: Marine Organisms. Ed. Om P. Rajora and Marjorie Oleksiak. Springer, 2018. doi:10.1007/13836_2017_9. The exceptionally large population size and cosmopolitan biogeographic distribution that distinguish many – but not all – marine zooplankton species generate similarly exceptional patterns of population genetic and genomic diversity and structure. The phylogenetic diversity of zooplankton has slowed the application of population genomic approaches, due to lack of genomic resources for closely related species and diversity of genomic architecture, including highly-replicated genomes of many crustaceans. Use of numerous genomic markers, especially single nucleotide polymorphisms (SNPs), is transforming our ability to analyze population genetics and connectivity of marine zooplankton, and providing new understanding and different answers than earlier analyses, which typically used mitochondrial DNA and microsatellite markers. Population genomic approaches have confirmed that, despite high dispersal potential, many zooplankton species exhibit genetic structuring among geographic populations, especially at large ocean-basin scales, and have revealed patterns and pathways of population connectivity that do not always track ocean circulation. Genomic and transcriptomic resources are critically needed to allow further examination of micro-evolution and local adaptation, including identification of genes that show evidence of selection. These new tools will also enable further examination of the significance of small-scale genetic heterogeneity of marine zooplankton, to discriminate genetic “noise” in large and patchy populations from local adaptation to environmental conditions and change. Support was provided by the US National Science Foundation to AB and RJO (PLR-1044982) and to RJO (MCB-1613856); support to IS and MC was provided by Nord University (Norway).