64 research outputs found

    Semi-automatic conversion of BioProp semantic annotation to PASBio annotation

    Background: Semantic role labeling (SRL) is an important text analysis technique. In SRL, sentences are represented by one or more predicate-argument structures (PAS). Each PAS is composed of a predicate (verb) and several arguments (noun phrases, adverbial phrases, etc.) with different semantic roles, including main arguments (agent or patient) as well as adjunct arguments (time, manner, or location). PropBank is the most widely used PAS corpus and annotation format in the newswire domain. In the biomedical field, however, more detailed and restrictive PAS annotation formats such as PASBio are popular. Unfortunately, due to the lack of an annotated PASBio corpus, no publicly available machine-learning (ML) based SRL systems based on PASBio have been developed. In previous work, we constructed a biomedical corpus based on the PropBank standard called BioProp, on which we developed an ML-based SRL system, BIOSMILE. In this paper, we aim to build a system to convert BIOSMILE's BioProp annotation output to PASBio annotation. Our system consists of BIOSMILE in combination with a BioProp-PASBio rule-based converter, and an additional semi-automatic rule generator. Results: Our first experiment evaluated our rule-based converter's performance independently of BIOSMILE's performance. The converter achieved an F-score of 85.29%. The second experiment evaluated the combined system (BIOSMILE + rule-based converter). The system achieved an F-score of 69.08% for PASBio's 29 verbs. Conclusion: Our approach allows PAS conversion between BioProp and PASBio annotation using BIOSMILE alongside our newly developed semi-automatic rule generator and rule-based converter. Our system can match the performance of other state-of-the-art domain-specific ML-based SRL systems and can be easily customized for PASBio application development.

    GWAS analysis of handgrip and lower body strength in older adults in the CHARGE consortium

    Decline in muscle strength with aging is an important predictor of health trajectory in the elderly. Several factors, including genetics, are proposed contributors to variability in muscle strength. To identify genetic contributors to muscle strength, a meta-analysis of genomewide association studies of handgrip was conducted. Grip strength was measured using a handheld dynamometer in 27 581 individuals of European descent over 65 years of age from 14 cohort studies. Genomewide association analysis was conducted on ~2.7 million imputed and genotyped variants (SNPs). Replication of the most significant findings was conducted using data from 6393 individuals from three cohorts. GWAS of lower body strength was also characterized in a subset of cohorts. Two genomewide significant (P-value < 5 × 10⁻⁸) and 39 suggestive (P-value < 5 × 10⁻⁵) associations were observed from meta-analysis of the discovery cohorts. After meta-analysis with replication cohorts, genomewide significant association was observed for rs752045 on chromosome 8 (β = 0.47, SE = 0.08, P-value = 5.20 × 10⁻¹⁰). This SNP is mapped to an intergenic region and is located within an accessible chromatin region (DNase hypersensitivity site) in skeletal muscle myotubes differentiated from the human skeletal muscle myoblast cell line. This locus alters a binding motif of the CCAAT/enhancer-binding protein-β (CEBPB) that is implicated in muscle repair mechanisms. GWAS of lower body strength did not yield significant results. A common genetic variant in a chromosomal region that regulates myotube differentiation and muscle repair may contribute to variability in grip strength in the elderly. Further studies are needed to uncover the mechanisms that link this genetic variant with muscle strength.

    Guidelines for the use and interpretation of assays for monitoring autophagy (3rd edition)

    In 2008 we published the first set of guidelines for standardizing research in autophagy. Since then, research on this topic has continued to accelerate, and many new scientists have entered the field. Our knowledge base and relevant new technologies have also been expanding. Accordingly, it is important to update these guidelines for monitoring autophagy in different organisms. Various reviews have described the range of assays that have been used for this purpose. Nevertheless, there continues to be confusion regarding acceptable methods to measure autophagy, especially in multicellular eukaryotes. For example, a key point that needs to be emphasized is that there is a difference between measurements that monitor the numbers or volume of autophagic elements (e.g., autophagosomes or autolysosomes) at any stage of the autophagic process versus those that measure flux through the autophagy pathway (i.e., the complete process including the amount and rate of cargo sequestered and degraded). In particular, a block in macroautophagy that results in autophagosome accumulation must be differentiated from stimuli that increase autophagic activity, defined as increased autophagy induction coupled with increased delivery to, and degradation within, lysosomes (in most higher eukaryotes and some protists such as Dictyostelium) or the vacuole (in plants and fungi). In other words, it is especially important that investigators new to the field understand that the appearance of more autophagosomes does not necessarily equate with more autophagy. In fact, in many cases, autophagosomes accumulate because of a block in trafficking to lysosomes without a concomitant change in autophagosome biogenesis, whereas an increase in autolysosomes may reflect a reduction in degradative activity. It is worth emphasizing here that lysosomal digestion is a stage of autophagy and evaluating its competence is a crucial part of the evaluation of autophagic flux, or complete autophagy.
    Here, we present a set of guidelines for the selection and interpretation of methods for use by investigators who aim to examine macroautophagy and related processes, as well as for reviewers who need to provide realistic and reasonable critiques of papers that are focused on these processes. These guidelines are not meant to be a formulaic set of rules, because the appropriate assays depend in part on the question being asked and the system being used. In addition, we emphasize that no individual assay is guaranteed to be the most appropriate one in every situation, and we strongly recommend the use of multiple assays to monitor autophagy. Along these lines, because of the potential for pleiotropic effects due to blocking autophagy through genetic manipulation, it is imperative to delete or knock down more than one autophagy-related gene. In addition, some individual Atg proteins, or groups of proteins, are involved in other cellular pathways, so not all Atg proteins can be used as a specific marker for an autophagic process. In these guidelines, we consider these various methods of assessing autophagy and what information can, or cannot, be obtained from them. Finally, by discussing the merits and limits of particular autophagy assays, we hope to encourage technical innovation in the field.

    The 5p15.33 Locus Is Associated with Risk of Lung Adenocarcinoma in Never-Smoking Females in Asia

    Genome-wide association studies of lung cancer reported in populations of European background have identified three regions on chromosomes 5p15.33, 6p21.33, and 15q25 that have achieved genome-wide significance with p-values of 10⁻⁷ or lower. These studies have been performed primarily in cigarette smokers, raising the possibility that the observed associations could be related to tobacco use, lung carcinogenesis, or both. Since most women in Asia do not smoke, we conducted a genome-wide association study of lung adenocarcinoma in never-smoking females (584 cases, 585 controls) among Han Chinese in Taiwan and found that the most significant association was for rs2736100 on chromosome 5p15.33 (p = 1.30 × 10⁻¹¹). This finding was independently replicated in seven studies from East Asia totaling 1,164 lung adenocarcinomas and 1,736 controls (p = 5.38 × 10⁻¹¹). A pooled analysis achieved genome-wide significance for rs2736100. This SNP marker localizes to the CLPTM1L-TERT locus on chromosome 5p15.33 (p = 2.60 × 10⁻²⁰, allelic risk = 1.54, 95% Confidence Interval (CI) 1.41–1.68). Risks for heterozygote and homozygote carriers of the minor allele were 1.62 (95% CI: 1.40–1.87) and 2.35 (95% CI: 1.95–2.83), respectively. In summary, our results show that genetic variation in the CLPTM1L-TERT locus of chromosome 5p15.33 is directly associated with the risk of lung cancer, most notably adenocarcinoma.

    Finishing the euchromatic sequence of the human genome

    The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.

    Large meta-analysis of genome-wide association studies identifies five loci for lean body mass

    Lean body mass, consisting mostly of skeletal muscle, is important for healthy aging. We performed a genome-wide association study for whole body (20 cohorts of European ancestry with n = 38,292) and appendicular (arms and legs) lean body mass (n = 28,330) measured using dual energy X-ray absorptiometry or bioelectrical impedance analysis, adjusted for sex, age, height, and fat mass. Twenty-one single-nucleotide polymorphisms were significantly associated with lean body mass either genome wide (p < 5 × 10⁻⁸) or suggestively genome wide (p < 2.3 × 10⁻⁶). Replication in 63,475 (47,227 of European ancestry) individuals from 33 cohorts for whole body lean body mass and in 45,090 (42,360 of European ancestry) subjects from 25 cohorts for appendicular lean body mass was successful for five single-nucleotide polymorphisms in/near HSD17B11, VCAN, ADAMTSL3, IRS1, and FTO for total lean body mass and for three single-nucleotide polymorphisms in/near VCAN, ADAMTSL3, and IRS1 for appendicular lean body mass. Our findings provide new insight into the genetics of lean body mass.

    Genome Sequence of the Anaerobic, Thermophilic, and Cellulolytic Bacterium “Anaerocellum thermophilum” DSM 6725

    “Anaerocellum thermophilum” DSM 6725 is a strictly anaerobic bacterium that grows optimally at 75°C. It uses a variety of polysaccharides, including crystalline cellulose and untreated plant biomass, and has potential utility in biomass conversion. Here we report its complete genome sequence of 2.97 Mb, which is contained within one chromosome and two plasmids (of 8.3 and 3.6 kb). The genome encodes a broad set of cellulolytic enzymes, transporters, and pathways for sugar utilization and, compared to those of other saccharolytic, anaerobic thermophiles, is most similar to that of Caldicellulosiruptor saccharolyticus DSM 8903.

    Concomitant loss of NDH complex-related genes within chloroplast and nuclear genomes in some orchids

    The chloroplast NAD(P)H dehydrogenase-like (NDH) complex consists of about 30 subunits from both the nuclear and chloroplast genomes and is ubiquitous across most land plants. In some orchids, such as Phalaenopsis equestris, Dendrobium officinale and Dendrobium catenatum, most of the 11 chloroplast genome-encoded ndh genes (cp-ndh) have been lost. Here we investigated whether functional cp-ndh genes have been completely lost in these orchids or whether they have been transferred and retained in the nuclear genome. Further, we assessed whether both cp-ndh genes and nucleus-encoded NDH-related genes can be lost, resulting in the absence of the NDH complex. Comparative analyses of the genome of Apostasia odorata, an orchid species with a complete complement of cp-ndh genes which represents the sister lineage to all other orchids, and three published orchid genome sequences for P. equestris, D. officinale and D. catenatum, which are all missing cp-ndh genes, indicated that copies of cp-ndh genes are not present in any of these four nuclear genomes. This observation suggests that the NDH complex is not necessary for some plants. Comparative genomic/transcriptomic analyses of currently available plastid genome sequences and nuclear transcriptome data showed that 47 out of 660 photoautotrophic plants and all the heterotrophic plants are missing plastid-encoded cp-ndh genes and exhibit no evidence for maintenance of a functional NDH complex. Our data indicate that the NDH complex can be lost in photoautotrophic plant species. Further, the loss of the NDH complex may increase the probability of transition from a photoautotrophic to a heterotrophic life history.