Mining protein function from text using term-based support vector machines
Background: Text mining has spurred huge interest in the domain of biology. The goal of the BioCreAtIvE exercise was to evaluate the performance of current text mining systems. We participated in Task 2, which addressed assigning Gene Ontology terms to human proteins and selecting relevant evidence from full-text documents. We approached it as a modified form of the document classification task. We used a supervised machine-learning approach (based on support vector machines) to assign protein function and select passages that support the assignments. As classification features, we used a protein's co-occurring terms that were automatically extracted from documents. Results: The results evaluated by curators were modest, and quite variable for different problems: in many cases we have relatively good assignment of GO terms to proteins, but the selected supporting text was typically non-relevant (precision spanning from 3% to 50%). The method appears to work best when a substantial set of relevant documents is obtained, while it works poorly on single documents and/or short passages. The initial results suggest that our approach can also mine annotations from text even when an explicit statement relating a protein to a GO term is absent. Conclusion: A machine learning approach to mining protein function predictions from text can yield good performance only if sufficient training data is available and a significant amount of supporting data is used for prediction. The most promising results are for combined document retrieval and GO term assignment, which calls for the integration of methods developed in BioCreAtIvE Task 1 and Task 2.
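As a rough illustration of the term-based classification idea described in this abstract, the sketch below trains a tiny linear SVM on bag-of-terms features. It is not the authors' system: the toy corpus, the function names and the plain subgradient-descent training loop are all assumptions made for illustration only.

```python
from collections import Counter

# Hypothetical toy corpus: (passage, +1 if it supports the GO term, -1 otherwise).
TRAIN = [
    ("protein kinase phosphorylates substrate in the cell", 1),
    ("kinase activity phosphorylates target protein", 1),
    ("patients were recruited from the clinic", -1),
    ("the clinic recruited patients for the study", -1),
]


def featurize(text):
    """Bag-of-terms features: map each lower-cased term to its count."""
    return Counter(text.lower().split())


def dot(w, x):
    """Sparse dot product between a weight dict and a term-count dict."""
    return sum(w.get(term, 0.0) * count for term, count in x.items())


def train_linear_svm(data, epochs=50, lam=0.01, lr=0.1):
    """Subgradient descent on the L2-regularised hinge loss (assumed settings)."""
    w, b = {}, 0.0
    for _ in range(epochs):
        for text, y in data:
            x = featurize(text)
            margin = y * (dot(w, x) + b)
            # Regularisation step: shrink every weight slightly.
            for term in w:
                w[term] *= 1 - lr * lam
            # Hinge-loss step: only examples inside the margin move the weights.
            if margin < 1:
                for term, count in x.items():
                    w[term] = w.get(term, 0.0) + lr * y * count
                b += lr * y
    return w, b


def predict(w, b, text):
    """Return +1 if the passage is classified as supporting, else -1."""
    return 1 if dot(w, featurize(text)) + b >= 0 else -1
```

Calling `train_linear_svm(TRAIN)` and then `predict(w, b, passage)` scores a new passage. With so few examples the classifier only separates the two toy vocabularies, which echoes the abstract's point that performance depends on having enough training data.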
Symptom lead times in lung and colorectal cancers: What are the benefits of symptom-based approaches to early diagnosis?
This is the final version of the article. Available from Cancer Research UK via the DOI in this record. Background: Individuals with undiagnosed lung and colorectal cancers present with non-specific symptoms in primary care more often than matched controls. Increased access to diagnostic services for patients with symptoms generates more early-stage diagnoses, but the mechanisms for this are only partially understood. Methods: We re-analysed a UK-based case-control study to estimate the Symptom Lead Time (SLT) distribution for a range of potential symptom criteria for investigation. Symptom Lead Time is the time between symptoms caused by cancer and eventual diagnosis, and is analogous to Lead Time in a screening programme. We also estimated the proportion of symptoms in lung and colorectal cancer cases that are actually caused by the cancer. Results: Mean Symptom Lead Times were between 4.1 and 6.0 months, with medians between 2.0 and 3.2 months. Symptom Lead Time did not depend on stage at diagnosis, nor which criteria for investigation are adopted. Depending on the criteria, an estimated 27-48% of symptoms in individuals with as yet undiagnosed lung cancer, and 12-32% with undiagnosed colorectal cancer are not caused by the cancer. Conclusions: In most cancer cases detected by a symptom-based programme, the symptoms are caused by cancer. These cases have a short lead time and benefit relatively little. However, in a significant minority of cases cancer detection is serendipitous. This group experiences the benefits of a standard screening programme, a substantial mean lead time and a higher probability of early-stage diagnosis. This work was supported by the National Institute for Health Research (NIHR) Programme Grants for Applied Research Programme, RP-PG-0608-10045
Promoting mental health and well-being in schools: examining mindfulness, relaxation and strategies for safety and well-being in English primary and secondary schools—study protocol for a multi-school, cluster randomised controlled trial (INSPIRE)
There are increasing rates of internalising difficulties, particularly anxiety and depression, being reported in children and young people in England. School-based universal prevention programmes are thought to be one way of helping tackle such difficulties. This paper describes an update to a four-arm cluster randomised controlled trial (http://www.isrctn.com/ISRCTN16386254), investigating the effectiveness of three different interventions when compared to usual provision, in English primary and secondary pupils. Due to the COVID-19 pandemic, the trial was put on hold and subsequently prolonged. Data collection will now run until 2024. The key changes to the trial outlined here include clarification of the inclusion and exclusion criteria, an amended timeline reflecting changes to the recruitment period of the trial due to the COVID-19 pandemic and clarification of the data that will be included in the statistical analysis, since the second wave of the trial was disrupted due to COVID-19. Trial registration: ISRCTN Registry ISRCTN16386254. Registered on 30 August 2018
Unboxing mutations: Connecting mutation types with evolutionary consequences
A key step in understanding the genetic basis of different evolutionary outcomes (e.g., adaptation) is to determine the roles played by different mutation types (e.g., SNPs, translocations and inversions). To do this we must simultaneously consider different mutation types in an evolutionary framework. Here, we propose a research framework that directly utilizes the most important characteristics of mutations, their population genetic effects, to determine their relative evolutionary significance in a given scenario. We review known population genetic effects of different mutation types and show how these may be connected to different evolutionary outcomes. We provide examples of how to implement this framework and pinpoint areas where more data, theory and synthesis are needed. Linking experimental and theoretical approaches to examine different mutation types simultaneously is a critical step towards understanding their evolutionary significance
Developing a community-based genetic nomenclature for anole lizards
Background: Comparative studies of amniotes have been hindered by a dearth of reptilian molecular sequences. With the genomic assembly of the green anole, Anolis carolinensis, available, non-avian reptilian genes can now be compared to mammalian, avian, and amphibian homologs. Furthermore, with more than 350 extant species in the genus Anolis, anoles are an unparalleled example of tetrapod genetic diversity and divergence. As an important ecological, genetic and now genomic reference, it is imperative to develop a standardized Anolis gene nomenclature alongside associated vocabularies and other useful metrics. Results: Here we report the formation of the Anolis Gene Nomenclature Committee (AGNC) and propose a standardized evolutionary characterization code that will help researchers to define gene orthology and paralogy with tetrapod homologs, provide a system for naming novel genes in Anolis and other reptiles, furnish abbreviations to facilitate comparative studies among the Anolis species and related iguanid squamates, and classify the geographical origins of Anolis subpopulations. Conclusions: This report has been generated in close consultation with members of the Anolis and genomic research communities, and using public database resources including NCBI and Ensembl. Updates will continue to be regularly posted to new research community websites such as lizardbase. We anticipate that this standardized gene nomenclature will facilitate the accessibility of reptilian sequences for comparative studies among tetrapods and will further serve as a template for other communities in their sequencing and annotation initiatives.
Defining the stock structure of northern Australia's threadfin salmon species
The requirement for Queensland, Northern Territory and Western Australian jurisdictions to ensure sustainable harvest of fish resources relies on robust information on the resource status. In northern Australia, management of inshore fisheries that target blue threadfin (Eleutheronema tetradactylum) and king threadfin (Polydactylus macrochir) is independent for each of these jurisdictions. However, the lack of information on the stock structure and biology of threadfins means that the appropriate spatial scale of management is not known and assessment of the resource status is not possible. Establishing the stock structure of blue and king threadfin would also immensely improve the relevance of future resource assessments for fishery management of threadfins across northern Australia. This highlighted the urgent need for stock structure information for these species
Symptom lead times in lung and colorectal cancers: what are the benefits of symptom-based approaches to early diagnosis?
BACKGROUND: Individuals with undiagnosed lung and colorectal cancers present with non-specific symptoms in primary care more often than matched controls. Increased access to diagnostic services for patients with symptoms generates more early-stage diagnoses, but the mechanisms for this are only partially understood. METHODS: We re-analysed a UK-based case–control study to estimate the Symptom Lead Time (SLT) distribution for a range of potential symptom criteria for investigation. Symptom Lead Time is the time between symptoms caused by cancer and eventual diagnosis, and is analogous to Lead Time in a screening programme. We also estimated the proportion of symptoms in lung and colorectal cancer cases that are actually caused by the cancer. RESULTS: Mean Symptom Lead Times were between 4.1 and 6.0 months, with medians between 2.0 and 3.2 months. Symptom Lead Time did not depend on stage at diagnosis, nor which criteria for investigation are adopted. Depending on the criteria, an estimated 27–48% of symptoms in individuals with as yet undiagnosed lung cancer, and 12–32% with undiagnosed colorectal cancer are not caused by the cancer. CONCLUSIONS: In most cancer cases detected by a symptom-based programme, the symptoms are caused by cancer. These cases have a short lead time and benefit relatively little. However, in a significant minority of cases cancer detection is serendipitous. This group experiences the benefits of a standard screening programme, a substantial mean lead time and a higher probability of early-stage diagnosis
Comparative BAC-based mapping in the white-throated sparrow, a novel behavioral genomics model, using interspecies overgo hybridization
BACKGROUND
The genomics era has produced an arsenal of resources from sequenced organisms allowing researchers to target species that do not have comparable mapping and sequence information. These new "non-model" organisms offer unique opportunities to examine environmental effects on genomic patterns and processes. Here we use comparative mapping as a first step in characterizing the genome organization of a novel animal model, the white-throated sparrow (Zonotrichia albicollis), which occurs as white or tan morphs that exhibit alternative behaviors and physiology. Morph is determined by the presence or absence of a complex chromosomal rearrangement. This species is an ideal model for behavioral genomics because the association between genotype and phenotype is absolute, making it possible to identify the genomic bases of phenotypic variation.
FINDINGS
We initiated a genomic study in this species by characterizing the white-throated sparrow BAC library via filter hybridization with overgo probes designed for the chicken, turkey, and zebra finch. Cross-species hybridization resulted in 640 positive sparrow BACs assigned to 77 chicken loci across almost all macro- and microchromosomes, with a focus on the chromosomes associated with morph. Out of 216 overgos, 36% of the probes hybridized successfully, with an average number of 3.0 positive sparrow BACs per overgo.
CONCLUSIONS
These data will be utilized for determining chromosomal architecture and for fine-scale mapping of candidate genes associated with phenotypic differences. Our research confirms the utility of interspecies hybridization for developing comparative maps in other non-model organisms
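The hybridization figures reported in the findings above can be cross-checked with a few lines of arithmetic. The variable names are illustrative, and reading the 3.0 average as "positive BACs divided by all overgos tested" is an assumption, since the abstract does not state the denominator.

```python
# Figures quoted in the abstract.
overgos_tested = 216
reported_success_rate = 0.36
positive_bacs = 640
chicken_loci = 77

# About 78 probes would correspond to a 36% success rate.
successful_probes = round(overgos_tested * reported_success_rate)

# 77 loci out of 216 overgos is also about 36%.
loci_fraction = chicken_loci / overgos_tested

# Assumed denominator: all overgos tested, which gives roughly 3.0 BACs per overgo.
bacs_per_overgo = positive_bacs / overgos_tested

print(successful_probes, round(loci_fraction, 2), round(bacs_per_overgo, 1))
```

Under that assumption the three reported figures are mutually consistent. Dividing by only the successful probes would give a noticeably higher average, so the abstract's 3.0 most plausibly refers to all overgos tested.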