    Large expert-curated database for benchmarking document similarity detection in biomedical literature search

    Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article/s. The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency–Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research.
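    As a rough illustration of one of the baseline methods named in the abstract, the sketch below scores tokenized candidate documents against a tokenized seed text with a common textbook form of Okapi BM25. It is not the consortium's implementation: the function name `bm25_scores`, the parameter defaults `k1=1.5` and `b=0.75`, the smoothed IDF variant and the toy documents are all assumptions made for this example.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return one BM25 score per document in `docs` for the tokenized `query`.

    `query` is a list of tokens; `docs` is a non-empty list of token lists.
    k1 and b are the usual BM25 free parameters (values here are assumptions).
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs

    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for d in docs:
        df.update(set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in set(query):
            if term not in tf:
                continue
            # Smoothed IDF that stays non-negative for very common terms.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            denom = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores


# Hypothetical toy corpus, for illustration only.
toy_docs = [
    ["splice", "site", "intron"],
    ["butterfly", "flight"],
    ["intron", "splicing", "intron"],
]
toy_scores = bm25_scores(["intron"], toy_docs)
```

    In this toy corpus the second document shares no term with the query and scores zero, while the third document, which repeats the query term, outranks the first document of equal length.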

    Fitness differences associated with Pgi SNP genotypes in the Glanville fritillary butterfly (Melitaea cinxia)

    Allozyme variation at the phosphoglucose isomerase (PGI) locus in the Glanville fritillary butterfly (Melitaea cinxia) is associated with variation in flight metabolic rate, dispersal rate, fecundity and local population growth rate. To map allozyme to DNA variation and to survey putative functional variation in genomic DNA, we cloned the coding sequence of Pgi and identified nonsynonymous variable sites that determine the most common allozyme alleles. We show that these single-nucleotide polymorphisms (SNPs) exhibit significant excess of heterozygotes in field-collected population samples as well as in laboratory crosses. This is in contrast to previous results for the same species in which other allozymes and SNPs were in Hardy-Weinberg equilibrium or exhibited an excess of homozygotes. Our results suggest that viability selection favours Pgi heterozygotes. Although this is consistent with direct overdominance at Pgi, we cannot exclude the possibility that heterozygote advantage is caused by the presence of one or more deleterious alleles at linked loci.
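    To make the notion of heterozygote excess relative to Hardy-Weinberg equilibrium concrete, the sketch below compares observed genotype counts at a biallelic site with the counts expected from the estimated allele frequencies. It is a generic textbook calculation, not the authors' analysis; the function name `hwe_chi_square` and the genotype counts are hypothetical and not taken from the study.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic and inbreeding coefficient F for one biallelic site.

    Arguments are observed genotype counts (AA, AB, BB). Assumes both alleles
    are present, so that every expected count is non-zero.
    Returns (chi2, f), where f < 0 indicates heterozygote excess and
    f > 0 indicates heterozygote deficit.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p                        # frequency of allele B

    # Genotype counts expected under Hardy-Weinberg proportions.
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)

    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    f = 1 - n_ab / expected[1]
    return chi2, f


# Hypothetical counts with many heterozygotes, for illustration only.
chi2, f = hwe_chi_square(10, 80, 10)
```

    With these made-up counts the allele frequencies are both 0.5, so 50 heterozygotes are expected but 80 are observed, giving a negative F. The chi-square statistic would be compared against a chi-square distribution with one degree of freedom.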

    The U11/U12 snRNP 65K protein acts as a molecular bridge, binding the U12 snRNA and U11-59K protein

    U11 and U12 interact cooperatively with the 5′ splice site and branch site of pre-mRNA as a stable preformed di-snRNP complex, thereby bridging the 5′ and 3′ ends of the intron within the U12-dependent prespliceosome. To identify proteins contributing to di-snRNP formation and intron bridging, we investigated protein–protein and protein–RNA interactions between components of the U11/U12 snRNP. We demonstrate that the U11/U12-65K protein possesses dual binding activity, interacting directly with U12 snRNA via its C-terminal RRM and the U11-associated 59K protein via its N-terminal half. We provide evidence that, in contrast to the previously published U12 snRNA secondary structure model, the 3′ half of U12 forms an extended stem-loop with a highly conserved seven-nucleotide loop and that the latter serves as the 65K binding site. Addition of an oligonucleotide comprising the 65K binding site to an in vitro splicing reaction inhibited U12-dependent, but not U2-dependent, pre-mRNA splicing. Taken together, these data suggest that U11/U12-65K and U11-59K contribute to di-snRNP formation and intron bridging in the minor prespliceosome.

    Missing-in-metastasis MIM/MTSS1 promotes actin assembly at intercellular junctions and is required for integrity of kidney epithelia

    MIM/MTSS1 is a tissue-specific regulator of plasma membrane dynamics, whose altered expression levels have been linked to cancer metastasis. MIM deforms phosphoinositide-rich membranes through its I-BAR domain and interacts with actin monomers through its WH2 domain. Recent work proposed that MIM also potentiates Sonic hedgehog (Shh)-induced gene expression. Here, we generated MIM mutant mice and found that full-length MIM protein is dispensable for embryonic development. However, MIM-deficient mice displayed a severe urinary concentration defect caused by compromised integrity of kidney epithelia intercellular junctions, which led to bone abnormalities and end-stage renal failure. In cultured kidney epithelial (MDCK) cells, MIM displayed dynamic localization to adherens junctions, where it promoted Arp2/3-mediated actin filament assembly. This activity was dependent on the ability of MIM to interact with both membranes and actin monomers. Furthermore, results from the mouse model and cell culture experiments suggest that full-length MIM is not crucial for Shh signaling, at least during embryogenesis. Collectively, these data demonstrate that MIM modulates interplay between the actin cytoskeleton and plasma membrane to promote the maintenance of intercellular contacts in kidney epithelia.

    An LKB1 AT-AC intron mutation causes Peutz-Jeghers syndrome via splicing at noncanonical cryptic splice sites

    Peutz-Jeghers syndrome (PJS) is an autosomal dominant disorder associated with gastrointestinal polyposis and an increased cancer risk. PJS is caused by germline mutations in the tumor suppressor gene LKB1. One such mutation, IVS2+1A>G, alters the second intron 5' splice site, which has sequence features of a U12-type AT-AC intron. We report that in patients, LKB1 RNA splicing occurs from the mutated 5' splice site to several cryptic, noncanonical 3' splice sites immediately adjacent to the normal 3' splice site. In vitro splicing analysis demonstrates that this aberrant splicing is mediated by the U12-dependent spliceosome. The results indicate that the minor spliceosome can use a variety of 3' splice site sequences to pair to a given 5' splice site, albeit with tight constraints for maintaining the 3' splice site position. The unusual splicing defect associated with this PJS-causing mutation uncovers differences in splice-site recognition between the major and minor pre-mRNA splicing pathways.