26 research outputs found

    BR-Explorer: An FCA-based algorithm for Information Retrieval

    In this paper we present BR-Explorer, an algorithm based on formal concept analysis (FCA) that addresses the problem of retrieving the objects relevant to a given query. Initially, a formal context representing the relation between a set of objects and the corresponding set of attributes is given, and the associated concept lattice is built. BR-Explorer starts by generating a formal concept representing the query and classifies this query concept in the concept lattice. It then locates the so-called "pivot" concept in the lattice and builds the query result step by step by considering the superconcepts of the pivot. Finally, BR-Explorer returns a set of objects ranked by their relevance to the query.
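
    The retrieval idea in this abstract can be illustrated with a minimal sketch. It is not the published BR-Explorer algorithm: the toy context `CONTEXT`, the function names `extent` and `rank`, and the scoring by number of shared query attributes are all assumptions made for illustration. The sketch never builds the lattice explicitly. Instead it visits subsets of the query attributes from largest to smallest; the extent of each subset is the extent of a concept that lies above the query concept, which mimics moving from a pivot to its superconcepts.

```python
from itertools import combinations

# Hypothetical toy formal context: object -> set of attributes.
CONTEXT = {
    "d1": {"fca", "retrieval", "lattice"},
    "d2": {"fca", "lattice"},
    "d3": {"retrieval"},
    "d4": {"proteins"},
}

def extent(attrs, context):
    """Derivation operator: objects that have every attribute in attrs."""
    return {obj for obj, obj_attrs in context.items() if attrs <= obj_attrs}

def rank(query, context):
    """Rank objects by the number of query attributes they share.

    Subsets of the query are visited from largest to smallest, so objects
    found earlier match the query more closely. This is exponential in the
    query size and is only meant for small illustrative queries.
    """
    ranked, seen = [], set()
    for size in range(len(query), 0, -1):
        for subset in combinations(sorted(query), size):
            # Sort the newly found objects for a deterministic order.
            for obj in sorted(extent(set(subset), context) - seen):
                seen.add(obj)
                ranked.append((obj, size))
    return ranked

result = rank({"fca", "retrieval"}, CONTEXT)
print(result)  # [('d1', 2), ('d2', 1), ('d3', 1)]
```

    Objects sharing no attribute with the query (here `d4`) are left out of the result, which matches the intuition that only superconcepts with a non-empty shared intent contribute answers.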

    Kbdock - Searching and organising the structural space of protein-protein interactions

    Big data is a recurring problem in structural bioinformatics, where even a single experimentally determined protein structure can contain several different interacting protein domains and often involves many tens of thousands of 3D atomic coordinates. If we consider all protein structures that have ever been solved, the immense structural space of protein-protein interactions needs to be organised systematically in order to make sense of the many functional and evolutionary relationships that exist between different protein families and their interactions. This article describes some new developments in Kbdock, a knowledge-based approach for classifying and annotating protein interactions at the protein domain level.

    Neighborhood-Based Label Propagation in Large Protein Graphs

    Understanding protein function is one of the keys to understanding life at the molecular level. It is also important in several scenarios, including human disease and drug discovery. In this age of rapid and affordable biological sequencing, the number of sequences accumulating in databases is rising at an increasing rate. This presents many challenges for biologists and computer scientists alike. In order to make sense of this huge quantity of data, these sequences should be annotated with functional properties. UniProtKB consists of two components: i) the UniProtKB/Swiss-Prot database, containing protein sequences with reliable information manually reviewed by expert bio-curators, and ii) the UniProtKB/TrEMBL database, which is used for storing and processing the unknown sequences. Hence, for every protein the sequence is available along with a few additional attributes, such as the taxon and some structural domains. Pairwise similarity can be defined and computed on proteins based on such attributes. Other important attributes, such as function and cellular localization, are present for proteins in Swiss-Prot but often missing for proteins in TrEMBL. The enormous number of protein sequences now in TrEMBL calls for rapid procedures to annotate them automatically. In this work, we present DistNBLP, a novel Distributed Neighborhood-Based Label Propagation approach for large-scale annotation of proteins. The functional annotations of reviewed proteins are used to predict those of non-reviewed proteins using label propagation on a graph representation of the protein database. DistNBLP is built on top of the "akka" toolkit for building resilient distributed message-driven applications.
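
    A minimal single-machine sketch of neighborhood-based label propagation is given below. It is not DistNBLP and does not use akka: the function name `propagate_labels`, the weighted-vote rule, the synchronous rounds, and the example protein identifiers and labels are all assumptions made for illustration. Each unlabeled node takes the label with the largest total similarity weight among its already labeled neighbors, and the process repeats until no new node can be labeled.

```python
from collections import Counter, defaultdict

def propagate_labels(edges, labels, rounds=5):
    """Propagate labels over a weighted, undirected similarity graph.

    edges  -- iterable of (node_u, node_v, weight) tuples
    labels -- dict mapping already annotated nodes to their label
    rounds -- maximum number of synchronous propagation rounds
    """
    neighbors = defaultdict(list)
    for u, v, weight in edges:
        neighbors[u].append((v, weight))
        neighbors[v].append((u, weight))

    current = dict(labels)  # known labels are never overwritten
    for _ in range(rounds):
        updates = {}
        for node in neighbors:
            if node in current:
                continue
            votes = Counter()
            for other, weight in neighbors[node]:
                if other in current:
                    votes[current[other]] += weight
            if votes:
                # Ties are broken alphabetically to keep the result deterministic.
                updates[node] = max(sorted(votes), key=votes.get)
        if not updates:
            break
        current.update(updates)  # apply all updates at the end of the round
    return current

# Hypothetical data: P* stand for reviewed proteins, Q* for non-reviewed ones.
edges = [("P1", "Q1", 0.9), ("P2", "Q1", 0.4), ("Q1", "Q2", 0.8)]
known = {"P1": "kinase", "P2": "transporter"}
print(propagate_labels(edges, known))
```

    In this example `Q1` is labeled in the first round from its reviewed neighbors, and `Q2` is labeled in the second round from `Q1`, which shows how annotations can reach proteins with no reviewed neighbor of their own.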

    Discovering ADE associations from EHRs using pattern structures and ontologies

    Patient Electronic Health Records (EHRs) constitute an essential resource for studying Adverse Drug Events (ADEs). We explore an original approach to identify frequently associated ADEs in subgroups of patients. Because ADEs have complex manifestations, we use formal concept analysis and its pattern structures, a mathematical framework that allows generalization while taking into account domain knowledge formalized in medical ontologies. Results obtained with three different settings show that this approach is flexible and allows extraction of association rules at various levels of generalization.

    Modeling grapevines from an image sequence

    This article presents work on the image-based modeling of plants with strongly constrained geometries. From image sequences acquired in a vineyard, we instantiate a parametric model of the plots, the rows, and the individual vines. The model is derived from prior knowledge, and its parameters are extracted from the images. These parameters are then supplied to the model, which generates a representation of the filmed plant, row, or plot.

    SNP-Ontology for semantic integration of genomic variation data

    A formal ontology is proposed as a means of guiding data selection and of semantically integrating data about genomic variations. The resulting SNP-Ontology is used to initialize a SNP-dedicated knowledge base that integrates information on genomic variations regardless of their original representations.

    Extracting ADE associations from patient records: an experiment with pattern structures and ontologies

    Electronic Health Records (EHRs) are a resource of great interest for studying Adverse Drug Events (ADEs). We propose to mine EHRs to identify ADEs that are frequently associated in subgroups of patients. Because ADEs have complex manifestations, we use formal concept analysis and its pattern structures, a mathematical framework that allows generalization, while exploiting medical domain knowledge formalized in ontologies. The results obtained in three experiments show that this approach is flexible and makes it possible to extract association rules at various levels of generalization.

    Reduced structural flexibility of eplet amino acids in HLA proteins

    The proteins encoded in the HLA (Human Leukocyte Antigen) system are largely responsible for compatibility in organ transplants. To date, the molecular determinants involved in the recognition of HLA antigens by recipient antibodies are unknown. Here we explore flexibility as a potential determinant. For this purpose, we compare the N-RMSF (Normalized Root Mean Square Fluctuation) of amino acids labeled as confirmed eplets (regions defined around polymorphic amino acids) with that of amino acids that have not been reported as eplets. We found that eplet amino acids tend to be less flexible than non-eplet amino acids, which suggests that antibodies preferentially bind less mobile regions.
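
    For reference, the standard (unnormalized) root mean square fluctuation of an atom or residue $i$ over $T$ structural snapshots is recalled below. The symbols $\mathbf{r}_i(t)$ (position at snapshot $t$) and $\langle \mathbf{r}_i \rangle$ (mean position over the snapshots) are notation introduced here. The abstract does not state how the normalization in N-RMSF is carried out, so that step is deliberately left out.

```latex
\mathrm{RMSF}_i = \sqrt{\frac{1}{T} \sum_{t=1}^{T} \left\lVert \mathbf{r}_i(t) - \langle \mathbf{r}_i \rangle \right\rVert^{2}},
\qquad
\langle \mathbf{r}_i \rangle = \frac{1}{T} \sum_{t=1}^{T} \mathbf{r}_i(t)
```

    A lower value means that the residue stays closer to its average position, which is the sense in which the abstract calls eplet amino acids less flexible.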