5 research outputs found

    CSI-OMIM - Clinical Synopsis Search in OMIM

    Background: The OMIM database is a tool used daily by geneticists. Syndrome pages include a Clinical Synopsis section containing a list of known phenotypes comprising a clinical syndrome. The phenotypes are in free text and different phrases are often used to describe the same phenotype, the differences originating in spelling variations or typing errors, varying sentence structures and terminological variants.

    These variations hinder searching for syndromes or using the large amount of phenotypic information for research purposes. In addition, negation forms also create false positives when searching the textual description of phenotypes and induce noise in text mining applications.

    Description: Our method allows efficient and complete search of OMIM phenotypes as well as improved data-mining of the OMIM phenome. Applying natural language processing, each phrase is tagged with additional semantic information using UMLS and MESH. Using a grammar based method, annotated phrases are clustered into groups denoting similar phenotypes. These groups of synonymous expressions enable precise search, as query terms can be matched with the many variations that appear in OMIM, while avoiding over-matching expressions that include the query term in a negative context. On the basis of these clusters, we computed pair-wise similarity among syndromes in OMIM. Using this new similarity measure, we identified 79,770 new connections between syndromes, an average of 16 new connections per syndrome. Our project is Web-based and available at http://fohs.bgu.ac.il/s2g/csiomim

    Conclusions: The resulting enhanced search functionality provides clinicians with an efficient tool for diagnosis. This search application is also used for finding similar syndromes for the candidate gene prioritization tool S2G.

    The enhanced OMIM database we produced can be further used for bioinformatics purposes such as linking phenotypes and genes based on syndrome similarities and the known genes in Morbidmap.
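
The abstract above says that pair-wise similarity among syndromes was computed from the phenotype clusters, but it does not say which measure was used. As a rough illustration only, the sketch below assumes each syndrome is represented by the set of cluster IDs its phenotypes fall into and scores pairs with the Jaccard index. The Jaccard choice, the function names, and the toy data are all assumptions and are not taken from the paper.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_similarity(syndrome_clusters):
    """Map each unordered pair of syndromes to its Jaccard similarity."""
    return {
        (s, t): jaccard(syndrome_clusters[s], syndrome_clusters[t])
        for s, t in combinations(sorted(syndrome_clusters), 2)
    }


# Hypothetical data: syndrome name -> set of phenotype-cluster IDs.
toy = {
    "syndrome_A": {"c1", "c2", "c3"},
    "syndrome_B": {"c2", "c3", "c4"},
    "syndrome_C": {"c5"},
}
scores = pairwise_similarity(toy)
```

Because phenotypes are compared through cluster IDs instead of raw phrases, two syndromes described with different wording for the same phenotype still count as sharing it, which is the point the abstract makes about synonymous expressions.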

    The Minimal Seed Set Problem

    This paper defines and studies a new, interesting, and challenging benchmark problem that originates in systems biology. The minimal seed-set problem is defined as follows: given a description of the metabolic reactions of an organism, characterize the minimal set of nutrients with which it could synthesize all nutrients it is capable of synthesizing. Current methods used in systems biology yield only approximate solutions. Although it is natural to cast it as a planning problem, current optimal planners are unable to solve it, while non-optimal planners return plans that are very far from optimal. As a planning problem, it is inherently delete-free, has many zero-cost actions, all propositions are landmarks, and many legal permutations of the plan exist. We show how a simple uninformed search algorithm that exploits inherent independence between sub-goals can solve it optimally by reducing the branching factor drastically.
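
To make the problem statement above concrete, here is a minimal sketch on a toy network. It assumes a reaction is a pair of sets (inputs, outputs) that can fire once all of its inputs are available, and it finds a smallest seed set by exhaustive enumeration. This is not the search algorithm the paper proposes; it is exponential in the number of candidate nutrients, and the function names and the toy reactions are hypothetical.

```python
from itertools import combinations


def closure(seeds, reactions):
    """Forward-chain: fire every reaction whose inputs are all available."""
    have = set(seeds)
    changed = True
    while changed:
        changed = False
        for inputs, outputs in reactions:
            if inputs <= have and not outputs <= have:
                have |= outputs
                changed = True
    return have


def minimal_seed_set(reactions, targets, candidates):
    """Return a smallest subset of candidates whose closure covers targets."""
    for k in range(len(candidates) + 1):
        for seeds in combinations(sorted(candidates), k):
            if targets <= closure(seeds, reactions):
                return set(seeds)
    return None  # no subset of the candidates covers the targets


# Hypothetical toy network: a -> b, b + c -> d, d -> c.
reactions = [({"a"}, {"b"}), ({"b", "c"}, {"d"}), ({"d"}, {"c"})]
nutrients = {"a", "b", "c", "d"}
seeds = minimal_seed_set(reactions, nutrients, nutrients)
```

In this toy network nothing produces `a`, so it must be a seed, and one of `c` or `d` is needed to enter the cycle between them. When several seed sets tie for the smallest size, the enumeration returns the first one in sorted order.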

    Pruning Methods for Optimal Delete-Free Planning

    Delete-free planning underlies many popular relaxation (h+) based heuristics used in state-of-the-art planners; it provides a simpler setting for exploring new pruning methods and other ideas; and a number of interesting recent planning domains are naturally delete-free. In this paper we explore new pruning methods for planning in delete-free planning domains. First, we observe that optimal delete-free plans can be composed from contiguous sub-plans that focus on one fact landmark at a time. Thus, instead of attempting to achieve the goal, the planner can focus on more easily achievable landmarks at each stage. Then, we suggest a number of complementary pruning techniques that are made more powerful with this observation. To carry out these pruning techniques efficiently, we make heavy use of an And/Or graph depicting the planning problem. We empirically evaluate these ideas using the FD framework, and show that they lead to clear improvements.
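
For readers unfamiliar with the setting, the sketch below shows what optimal delete-free planning means on a toy task. Actions only add facts and never remove them, so a state is simply the set of facts achieved so far. The code is a plain uniform-cost search over fact sets, assuming non-negative action costs. It implements none of the landmark-based pruning or And/Or graph techniques described in the abstract, and the action tuple layout (name, preconditions, add effects, cost) and the toy task are assumptions.

```python
import heapq
from itertools import count


def optimal_delete_free_plan(init, goal, actions):
    """Uniform-cost search over fact sets; returns (cost, plan) or None."""
    start, goal = frozenset(init), frozenset(goal)
    tie = count()  # tie-breaker so the heap never compares states or plans
    frontier = [(0, next(tie), start, [])]
    best = {start: 0}
    while frontier:
        cost, _, state, plan = heapq.heappop(frontier)
        if goal <= state:
            return cost, plan
        if cost > best.get(state, float("inf")):
            continue  # stale queue entry
        for name, pre, add, c in actions:
            # Skip inapplicable actions and actions that add nothing new.
            if pre <= state and not add <= state:
                nxt = state | add
                new_cost = cost + c
                if new_cost < best.get(nxt, float("inf")):
                    best[nxt] = new_cost
                    heapq.heappush(
                        frontier, (new_cost, next(tie), nxt, plan + [name])
                    )
    return None


# Hypothetical toy task: reach g from p, directly (cost 5) or via q (cost 1 + 2).
actions = [
    ("direct", {"p"}, {"g"}, 5),
    ("step1", {"p"}, {"q"}, 1),
    ("step2", {"q"}, {"g"}, 2),
]
result = optimal_delete_free_plan({"p"}, {"g"}, actions)
```

Skipping actions whose add effects are already present is safe here precisely because nothing is ever deleted: such an action cannot change the state.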