10 research outputs found
OBIRS-feedback, une méthode de reformulation utilisant une ontologie de domaine [OBIRS-feedback, a query reformulation method using a domain ontology]
National audience. The precision of an information retrieval system (IRS) can be degraded because users have difficulty formulating their information needs precisely. Query reformulation or expansion is one response to this problem in the context of IRSs. In this article, we propose a new method for reformulating conceptual queries which, starting from documents judged relevant by the user and from a domain ontology, searches for a set of concepts that maximizes the performance of the IRS. This performance is evaluated, in an original way, using indicators for which a formalization is proposed. The method was evaluated using our OBIRS engine, the MeSH domain ontology, and the MuCHMORE test collection.
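The search for a set of concepts described in this abstract can be illustrated with a minimal sketch. It is not the OBIRS-feedback method itself: the greedy forward selection, the objective (mean score of relevant documents minus mean score of the other documents), the toy overlap score, and every name below (`overlap_score`, `objective`, `reformulate`, `RELEVANT`, `OTHERS`) are assumptions chosen only for illustration.

```python
from typing import Callable, FrozenSet, List, Set

Doc = FrozenSet[str]                      # a document, reduced to its annotating concepts
Score = Callable[[Set[str], Doc], float]  # scores a document against a conceptual query


def overlap_score(query: Set[str], doc: Doc) -> float:
    """Toy stand-in for a semantic score: fraction of query concepts found in the document."""
    return len(query & doc) / len(query) if query else 0.0


def objective(query: Set[str], relevant: List[Doc], others: List[Doc], score: Score) -> float:
    """Assumed objective: mean score of relevant documents minus mean score of the others."""
    rel = sum(score(query, d) for d in relevant) / len(relevant)
    oth = sum(score(query, d) for d in others) / len(others) if others else 0.0
    return rel - oth


def reformulate(relevant: List[Doc], others: List[Doc],
                score: Score = overlap_score, max_size: int = 3) -> Set[str]:
    """Greedy forward selection over the concepts that annotate the relevant documents."""
    candidates: Set[str] = set().union(*relevant)
    query: Set[str] = set()
    best = float("-inf")
    while len(query) < max_size:
        # Objective value obtained by adding each remaining candidate concept.
        gains = {c: objective(query | {c}, relevant, others, score)
                 for c in candidates - query}
        if not gains:
            break
        # Sorting first makes tie-breaking deterministic (alphabetical order).
        concept = max(sorted(gains), key=gains.get)
        if gains[concept] <= best:
            break  # no candidate improves the objective any further
        query.add(concept)
        best = gains[concept]
    return query


# Hypothetical feedback: two documents judged relevant, one that was not.
RELEVANT = [frozenset({"leukemia", "gene"}), frozenset({"leukemia", "transcription"})]
OTHERS = [frozenset({"gene", "diabetes"})]

print(reformulate(RELEVANT, OTHERS))
```

In a real system the toy overlap score would be replaced by a relevance function based on semantic similarity in the ontology; the greedy loop itself is unchanged by that substitution.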
How ontology based information retrieval systems may benefit from lexical text analysis
International audience. The exponential growth of available electronic data is almost useless without efficient tools to retrieve the right information at the right time. It is now widely acknowledged that information retrieval systems need to take semantics into account to enhance the use of available information. However, there is still a gap between the amounts of relevant information that can be accessed through optimized IRSs on the one hand, and users' ability to grasp and process a handful of relevant data at once on the other. This chapter shows how conceptual and lexical approaches may be jointly used to enrich document description. After a survey on semantic based methodologies designed to efficiently retrieve and exploit information, hybrid approaches are discussed. The original approach presented here benefits from both lexical and ontological document description, and combines them in a software architecture dedicated to information retrieval and rendering in specific domains.
ONTOLOGY BASED INFORMATION RETRIEVAL
Domain ontologies provide a knowledge model in which the key concepts of a domain are organized through hierarchical relationships. In conceptual Information Retrieval Systems (IRSs), where ontologies are used both to index documents and to express queries, they help avoid the natural-language ambiguities that penalize classical IRSs. This thesis focuses on the use of ontologies in the matching process, during which an IRS ranks the documents of a collection according to their relevance to a user query. We propose to compute this relevance by aggregating elementary scores between each document and each concept of the query. This simple and intuitive aggregation incorporates a user-dependent preference model and a semantic similarity measure attached to the ontology. Its main benefit is that it makes it possible to explain to the user why our IRS, OBIRS, considers the selected documents relevant. To reinforce this justification, we propose an original visualization: results are displayed as icons (pictograms) that summarize their elementary scores, and are placed on a semantic map so that their graphical distance to the query, represented as a probe, reflects their overall relevance. Because information retrieval is an iterative process, users must be able to interact with the IRS, to understand and assess the results, and to be guided in reformulating their query.
Inspired by a proven strategy from vector-space IRSs, we formalize ontology-based relevance feedback for conceptual IRSs. Reformulation becomes an optimization problem that uses the user's feedback on the first results as a training set: the goal is a conceptual query that optimizes a tradeoff, modeled by an objective function, between closeness to relevant documents and remoteness from irrelevant ones. Starting from a set of concepts of interest, we propose a heuristic that efficiently builds a near-optimal query while testing only a subspace of the possible conceptual queries. We show that the concepts of this subspace can be identified efficiently thanks to two simple properties, satisfied by a large share of semantic similarity measures, that suffice to guarantee the connectivity of a concept's semantic neighborhood. Hence only an excerpt of the ontology's DAG structure is explored during query reformulation.
These approaches have been implemented in OBIRS, our ontology-based IRS, and validated in two ways: automatic assessment on standard test collections, and case studies involving experts from the biomedical domain.
Utilisation d'ontologies comme support à la recherche et à la navigation dans une collection de documents [Using ontologies to support search and navigation in a document collection]
Domain ontologies provide a conceptual formalization of domain knowledge, organized as a hierarchy of concepts. One contribution of this thesis is to use them in conceptual Information Retrieval Systems (IRSs), in particular to assess the relevance of documents with respect to a given query. For this matching process, we propose a model that incorporates both user preferences and a semantic similarity measure attached to the domain ontology. Using an original visualization, this approach makes it possible to explain to the user why the selected documents are relevant. Because information retrieval is an iterative process, users must be guided in reformulating their queries so as to better specify their information needs. We formalize ontology-based relevance feedback as an optimization problem that uses the user's feedback on the first results as a training set, with an objective function and a heuristic that efficiently builds a near-optimal query. These approaches have been validated in two ways: automatic assessment on standard test collections, and case studies involving experts from the biomedical domain.
Utilisation de proximités sémantiques pour améliorer la recherche et le rendu d'information [Using semantic proximities to improve information retrieval and rendering]
12 pages. National audience. To make effective use of ever larger document collections, search engines must evolve. Their current limitations are mainly that the measure of a document's relevance to a query is often not explicit, and that interaction with the list of results is limited. We propose an ontology-based querying method and environment that use aggregation operators to compute an overall relevance measure. This measure depends on the semantic proximity of the corpus documents to each concept of the query on the one hand, and on the user's preferences on the other. We then build a semantic map that reflects the relevance of the selected documents and makes explicit how well they match the query. This human-machine interface opens the way to an iterative and interactive querying process.
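The aggregation described in this abstract can be illustrated with a minimal sketch. It is not the OBIRS implementation: the toy similarity (overlap of ancestor sets), the weighted power mean used as aggregation operator, the miniature concept hierarchy, and all names (`toy_similarity`, `elementary_score`, `global_relevance`, `ANCESTORS`, `QUERY`, `DOCS`) are assumptions made for illustration only.

```python
from typing import Callable, Dict, Iterable, Set

# Hypothetical miniature hierarchy: each concept maps to the set of its ancestors.
ANCESTORS: Dict[str, Set[str]] = {
    "root": set(),
    "disease": {"root"},
    "cancer": {"disease", "root"},
    "leukemia": {"cancer", "disease", "root"},
    "gene": {"root"},
}


def toy_similarity(a: str, b: str, ancestors: Dict[str, Set[str]]) -> float:
    """Assumed similarity: Jaccard overlap of the ancestor sets, each including the concept itself."""
    sa = ancestors[a] | {a}
    sb = ancestors[b] | {b}
    return len(sa & sb) / len(sa | sb)


def sim(a: str, b: str) -> float:
    """Similarity bound to the toy hierarchy above."""
    return toy_similarity(a, b, ANCESTORS)


def elementary_score(query_concept: str, doc_concepts: Iterable[str],
                     similarity: Callable[[str, str], float]) -> float:
    """Best similarity between one query concept and the concepts annotating a document."""
    return max((similarity(query_concept, c) for c in doc_concepts), default=0.0)


def global_relevance(query_weights: Dict[str, float], doc_concepts: Iterable[str],
                     similarity: Callable[[str, str], float], p: float = 1.0) -> float:
    """Weighted power mean of the elementary scores (p > 0; p = 1 is the weighted average)."""
    doc_concepts = list(doc_concepts)
    total = sum(query_weights.values())
    acc = sum(w * elementary_score(q, doc_concepts, similarity) ** p
              for q, w in query_weights.items())
    return (acc / total) ** (1.0 / p)


# Hypothetical query: the user cares twice as much about "cancer" as about "gene".
QUERY = {"cancer": 2.0, "gene": 1.0}
DOCS = {"doc1": ["leukemia"], "doc2": ["gene", "disease"]}

ranking = sorted(((name, global_relevance(QUERY, concepts, sim))
                  for name, concepts in DOCS.items()),
                 key=lambda item: -item[1])
for name, relevance in ranking:
    print(f"{name}: {relevance:.3f}")
```

Because the elementary scores are kept separate until the final aggregation, they can be shown to the user per query concept, which is the kind of explanation the semantic map is meant to convey.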
User centered and ontology based information retrieval system for life science - OBIRS
International audience. Because of the increasing amount of electronic data, designing efficient tools to retrieve and exploit documents is a major challenge. Current search engines suffer from two main drawbacks: there is limited interaction with the list of retrieved documents, and no explanation is given for how well they match the query. Users may thus be confused by the selection and have no idea how to adapt their query so that the results match their expectations. This paper describes a querying method and an environment based on aggregation models to assess the relevance of documents annotated with concepts of an ontology. The selected documents are then displayed on a semantic map that provides graphical indications making explicit to what extent they match the user's query; this human-machine interface favors a more interactive exploration of the data corpus.
User centered and ontology based information retrieval system for life sciences
Abstract
Background: Because of the increasing number of electronic resources, designing efficient tools to retrieve and exploit them is a major challenge. Some improvements have been offered by semantic Web technologies and applications based on domain ontologies. In life science, for instance, the Gene Ontology is widely exploited in genomic applications, and the Medical Subject Headings is the basis of the indexing of biomedical publications and of the information retrieval process proposed by PubMed. However, current search engines suffer from two main drawbacks: there is limited user interaction with the list of retrieved resources, and no explanation is provided for how well they match the query. Users may thus be confused by the selection and have no idea how to adapt their queries so that the results match their expectations.
Results: This paper describes an information retrieval system that relies on a domain ontology to widen the set of relevant documents retrieved, and that uses a graphical rendering of query results to favor user interactions. Semantic proximities between ontology concepts and aggregating models are used to assess how well documents match a query. The selected documents are displayed on a semantic map that provides graphical indications making explicit to what extent they match the user's query; this human-machine interface favors a more interactive and iterative exploration of the data corpus by facilitating the weighting of query concepts and visual explanation. We illustrate the benefit of using this information retrieval system on two case studies, one of which aims at collecting human genes related to transcription factors involved in the hemopoiesis pathway.
Conclusions: The ontology based information retrieval system described in this paper (OBIRS) is freely available at: http://www.ontotoolkit.mines-ales.fr/ObirsClient/. This environment is a first step towards a user centred application in which the system highlights relevant information to support decision making.
The Neuron Phenotype Ontology: A FAIR Approach to Proposing and Classifying Neuronal Types.
The challenge of defining and cataloging the building blocks of the brain requires a standardized approach to naming neurons and organizing knowledge about their properties. The US Brain Initiative Cell Census Network, Human Cell Atlas, Blue Brain Project, and others are generating vast amounts of data and characterizing large numbers of neurons throughout the nervous system. The neuroscientific literature contains many neuron names (e.g. parvalbumin-positive interneuron or layer 5 pyramidal cell) that are commonly used and generally accepted. However, it is often unclear how such common usage types relate to many evidence-based types that are proposed based on the results of new techniques. Further, comparing different types across labs remains a significant challenge. Here, we propose an interoperable knowledge representation, the Neuron Phenotype Ontology (NPO), that provides a standardized and automatable approach for naming cell types and normalizing their constituent phenotypes using identifiers from community ontologies as a common language. The NPO provides a framework for systematically organizing knowledge about cellular properties and enables interoperability with existing neuron naming schemes. We evaluate the NPO by populating a knowledge base with three independent cortical neuron classifications derived from published data sets that describe neurons according to molecular, morphological, electrophysiological, and synaptic properties. Competency queries to this knowledge base demonstrate that the NPO knowledge model enables interoperability between the three test cases and neuron names commonly used in the literature