8 research outputs found

    The Protein-Protein Interaction tasks of BioCreative III: classification/ranking of articles and linking bio-ontology concepts to full text

    Get PDF
    BACKGROUND: Determining usefulness of biomedical text mining systems requires realistic task definition and data selection criteria without artificial constraints, measuring performance aspects that go beyond traditional metrics. The BioCreative III Protein-Protein Interaction (PPI) tasks were motivated by such considerations, trying to address aspects including how the end user would oversee the generated output, for instance by providing ranked results, textual evidence for human interpretation or measuring time savings by using automated systems. Detecting articles describing complex biological events like PPIs was addressed in the Article Classification Task (ACT), where participants were asked to implement tools for detecting PPI-describing abstracts. Therefore the BCIII-ACT corpus was provided, which includes a training, development and test set of over 12,000 PPI relevant and non-relevant PubMed abstracts labeled manually by domain experts and recording also the human classification times. The Interaction Method Task (IMT) went beyond abstracts and required mining for associations between more than 3,500 full text articles and interaction detection method ontology concepts that had been applied to detect the PPIs reported in them.RESULTS:A total of 11 teams participated in at least one of the two PPI tasks (10 in ACT and 8 in the IMT) and a total of 62 persons were involved either as participants or in preparing data sets/evaluating these tasks. Per task, each team was allowed to submit five runs offline and another five online via the BioCreative Meta-Server. From the 52 runs submitted for the ACT, the highest Matthew's Correlation Coefficient (MCC) score measured was 0.55 at an accuracy of 89 and the best AUC iP/R was 68. Most ACT teams explored machine learning methods, some of them also used lexical resources like MeSH terms, PSI-MI concepts or particular lists of verbs and nouns, some integrated NER approaches. For the IMT, a total of 42 runs were evaluated by comparing systems against manually generated annotations done by curators from the BioGRID and MINT databases. The highest AUC iP/R achieved by any run was 53, the best MCC score 0.55. In case of competitive systems with an acceptable recall (above 35) the macro-averaged precision ranged between 50 and 80, with a maximum F-Score of 55. CONCLUSIONS: The results of the ACT task of BioCreative III indicate that classification of large unbalanced article collections reflecting the real class imbalance is still challenging. Nevertheless, text-mining tools that report ranked lists of relevant articles for manual selection can potentially reduce the time needed to identify half of the relevant articles to less than 1/4 of the time when compared to unranked results. Detecting associations between full text articles and interaction detection method PSI-MI terms (IMT) is more difficult than might be anticipated. This is due to the variability of method term mentions, errors resulting from pre-processing of articles provided as PDF files, and the heterogeneity and different granularity of method term concepts encountered in the ontology. However, combining the sophisticated techniques developed by the participants with supporting evidence strings derived from the articles for human interpretation could result in practical modules for biological annotation workflows

    The First Post-Kepler Brightness Dips of KIC 8462852

    Get PDF
    We present a photometric detection of the first brightness dips of the unique variable star KIC 8462852 since the end of the Kepler space mission in 2013 May. Our regular photometric surveillance started in October 2015, and a sequence of dipping began in 2017 May continuing on through the end of 2017, when the star was no longer visible from Earth. We distinguish four main 1-2.5% dips, named "Elsie," "Celeste," "Skara Brae," and "Angkor", which persist on timescales from several days to weeks. Our main results so far are: (i) there are no apparent changes of the stellar spectrum or polarization during the dips; (ii) the multiband photometry of the dips shows differential reddening favoring non-grey extinction. Therefore, our data are inconsistent with dip models that invoke optically thick material, but rather they are in-line with predictions for an occulter consisting primarily of ordinary dust, where much of the material must be optically thin with a size scale <<1um, and may also be consistent with models invoking variations intrinsic to the stellar photosphere. Notably, our data do not place constraints on the color of the longer-term "secular" dimming, which may be caused by independent processes, or probe different regimes of a single process

    The First Post-Kepler Brightness Dips of KIC 8462852

    Full text link

    Identification of 15 new psoriasis susceptibility loci highlights the role of innate immunity

    Get PDF
    To gain further insight into the genetic architecture of psoriasis, we conducted a meta-analysis of 3 genome-wide association studies (GWAS) and 2 independent data sets genotyped on the Immunochip, including 10,588 cases and 22,806 controls. We identified 15 new susceptibility loci, increasing to 36 the number associated with psoriasis in European individuals. We also identified, using conditional analyses, five independent signals within previously known loci. The newly identified loci shared with other autoimmune diseases include candidate genes with roles in regulating T-cell function (such as RUNX3, TAGAP and STAT3). Notably, they included candidate genes whose products are involved in innate host defense, including interferon-mediated antiviral responses (DDX58), macrophage activation (ZC3H12C) and nuclear factor (NF)-κB signaling (CARD14 and CARM1). These results portend a better understanding of shared and distinctive genetic determinants of immune-mediated inflammatory disorders and emphasize the importance of the skin in innate and acquired host defense

    Seven prostate cancer susceptibility loci identified by a multi-stage genome-wide association study

    No full text
    Prostate cancer (PrCa) is the most frequently diagnosed male cancer in developed countries. We conducted a multi-stage genome-wide association study for PrCa and previously reported the results of the first two stages, which identified 16 PrCa susceptibility loci. We report here the results of stage 3, in which we evaluated 1,536 SNPs in 4,574 individuals with prostate cancer (cases) and 4,164 controls. We followed up ten new association signals through genotyping in 51,311 samples in 30 studies from the Prostate Cancer Association Group to Investigate Cancer Associated Alterations in the Genome (PRACTICAL) consortium. In addition to replicating previously reported loci, we identified seven new prostate cancer susceptibility loci on chromosomes 2p11, 3q23, 3q26, 5p12, 6p21, 12q13 and Xq12 (P = 4.0 x 10(-8) to P = 2.7 x 10(-24)). We also identified a SNP in TERT more strongly associated with PrCa than that previously reported. More than 40 PrCa susceptibility loci, explaining similar to 25% of the familial risk in this disease, have now been identified
    corecore