210,027 research outputs found

    Denotational Aspects of Untyped Normalization by Evaluation

    Get PDF
    We show that the standard normalization-by-evaluation construction for the simply-typed lambda_{beta eta}-calculus has a natural counterpart for the untyped lambda_beta-calculus, with the central type-indexed logical relation replaced by a "recursively defined'' invariant relation, in the style of Pitts. In fact, the construction can be seen as generalizing a computational-adequacy argument for an untyped, call-by-name language to normalization instead of evaluation. In the untyped setting, not all terms have normal forms, so the normalization function is necessarily partial. We establish its correctness in the senses of soundness (the output term, if any, is in normal form and beta-equivalent to the input term); identification ( beta-equivalent terms are mapped to the same result); and completeness (the function is defined for all terms that do have normal forms). We also show how the semantic construction enables a simple yet formal correctness proof for the normalization algorithm, expressed as a functional program in an ML-like call-by-value language. Finally, we generalize the construction to produce an infinitary variant of normal forms, namely Böhm trees. We show that the three-part characterization of correctness, as well as the proofs, extend naturally to this generalization

    XML document design via GN-DTD

    Get PDF
    Designing a well-structured XML document is important for the sake of readability and maintainability. More importantly, this will avoid data redundancies and update anomalies when maintaining a large quantity of XML based documents. In this paper, we propose a method to improve XML structural design by adopting graphical notations for Document Type Definitions (GN-DTD), which is used to describe the structure of an XML document at the schema level. Multiples levels of normal forms for GN-DTD are proposed on the basis of conceptual model approaches and theories of normalization. The normalization rules are applied to transform a poorly designed XML document into a well-designed based on normalized GN-DTD, which is illustrated through examples

    Cell line name recognition in support of the identification of synthetic lethality in cancer from text

    Get PDF
    Motivation: The recognition and normalization of cell line names in text is an important task in biomedical text mining research, facilitating for instance the identification of synthetically lethal genes from the literature. While several tools have previously been developed to address cell line recognition, it is unclear whether available systems can perform sufficiently well in realistic and broad-coverage applications such as extracting synthetically lethal genes from the cancer literature. In this study, we revisit the cell line name recognition task, evaluating both available systems and newly introduced methods on various resources to obtain a reliable tagger not tied to any specific subdomain. In support of this task, we introduce two text collections manually annotated for cell line names: the broad-coverage corpus Gellus and CLL, a focused target domain corpus. Results: We find that the best performance is achieved using NERsuite, a machine learning system based on Conditional Random Fields, trained on the Gellus corpus and supported with a dictionary of cell line names. The system achieves an F-score of 88.46% on the test set of Gellus and 85.98% on the independently annotated CLL corpus. It was further applied at large scale to 24 302 102 unannotated articles, resulting in the identification of 5 181 342 cell line mentions, normalized to 11 755 unique cell line database identifiers
    • …
    corecore