140,222 research outputs found

    Concept coupling learning for improving concept lattice-based document retrieval

    Full text link
    © 2017 Elsevier Ltd The semantic information in any document collection is critical for query understanding in information retrieval. Existing concept lattice-based retrieval systems mainly rely on the partial order relation of formal concepts to index documents. However, the methods used by these systems often ignore the explicit semantic information between the formal concepts extracted from the collection. In this paper, a concept coupling relationship analysis model is proposed to learn and aggregate the intra- and inter-concept coupling relationships. The intra-concept coupling relationship employs the common terms of formal concepts to describe the explicit semantics of formal concepts. The inter-concept coupling relationship adopts the partial order relation of formal concepts to capture the implicit dependency of formal concepts. Based on the concept coupling relationship analysis model, we propose a concept lattice-based retrieval framework. This framework represents user queries and documents in a concept space based on fuzzy formal concept analysis, utilizes a concept lattice as a semantic index to organize documents, and ranks documents with respect to the learned concept coupling relationships. Experiments are performed on the text collections acquired from the SMART information retrieval system. Compared with classic concept lattice-based retrieval methods, our proposed method achieves at least 9%, 8% and 15% improvement in terms of average MAP, IAP@11 and P@10 respectively on all the collections

    A model for information retrieval driven by conceptual spaces

    Get PDF
    A retrieval model describes the transformation of a query into a set of documents. The question is: what drives this transformation? For semantic information retrieval type of models this transformation is driven by the content and structure of the semantic models. In this case, Knowledge Organization Systems (KOSs) are the semantic models that encode the meaning employed for monolingual and cross-language retrieval. The focus of this research is the relationship between these meanings’ representations and their role and potential in augmenting existing retrieval models effectiveness. The proposed approach is unique in explicitly interpreting a semantic reference as a pointer to a concept in the semantic model that activates all its linked neighboring concepts. It is in fact the formalization of the information retrieval model and the integration of knowledge resources from the Linguistic Linked Open Data cloud that is distinctive from other approaches. The preprocessing of the semantic model using Formal Concept Analysis enables the extraction of conceptual spaces (formal contexts)that are based on sub-graphs from the original structure of the semantic model. The types of conceptual spaces built in this case are limited by the KOSs structural relations relevant to retrieval: exact match, broader, narrower, and related. They capture the definitional and relational aspects of the concepts in the semantic model. Also, each formal context is assigned an operational role in the flow of processes of the retrieval system enabling a clear path towards the implementations of monolingual and cross-lingual systems. By following this model’s theoretical description in constructing a retrieval system, evaluation results have shown statistically significant results in both monolingual and bilingual settings when no methods for query expansion were used. The test suite was run on the Cross-Language Evaluation Forum Domain Specific 2004-2006 collection with additional extensions to match the specifics of this model

    FCAIR 2012 Formal Concept Analysis Meets Information Retrieval Workshop co-located with the 35th European Conference on Information Retrieval (ECIR 2013) March 24, 2013, Moscow, Russia

    Get PDF
    International audienceFormal Concept Analysis (FCA) is a mathematically well-founded theory aimed at data analysis and classifiation. The area came into being in the early 1980s and has since then spawned over 10000 scientific publications and a variety of practically deployed tools. FCA allows one to build from a data table with objects in rows and attributes in columns a taxonomic data structure called concept lattice, which can be used for many purposes, especially for Knowledge Discovery and Information Retrieval. The Formal Concept Analysis Meets Information Retrieval (FCAIR) workshop collocated with the 35th European Conference on Information Retrieval (ECIR 2013) was intended, on the one hand, to attract researchers from FCA community to a broad discussion of FCA-based research on information retrieval, and, on the other hand, to promote ideas, models, and methods of FCA in the community of Information Retrieval

    Using the Web 1T 5-Gram Database for Attribute Selection in Formal Concept Analysis to Correct Overstemmed Clusters

    Full text link
    Information retrieval is the process of finding information from an unstructured collection of data. The process of information retrieval involves building an index, commonly called an inverted file. As part of the inverted file, information retrieval algorithms often stem words to a common root. Stemming involves reducing a document term to its root. There are many ways to stem a word: affix removal and successor variety are two common categories of stemmers. The Porter Stemming Algorithm is a suffix removal stemmer that operates as a rule-based process on English words. We can think of stemming as a way to cluster related words together according to one common stem. However, sometimes Porter includes words in a cluster that are un-related. This experiment attempts to correct these stemming errors through the use of Formal Concept Analysis (FCA). FCA is the process of formulating formal concepts from a given formal context. A formal context consists of a set of objects, G, a set of attributes, M, and a binary relation I that indicates the attributes possessed by each object. A formal concept is formed by computing the closure of a subset of objects and attributes. Attribute selection is of critical importance in FCA; using the Cranfield document collection, this experiment attempted to view attributes as a function of word-relatedness and crafted a comparison measure between each word in the stemmed cluster using the Google Web 1T 5-gram data set. Using FCA to correct the clusters, the results showed a varying level of success for precision and recall values dependent upon the error threshold allowed

    Cheating to achieve Formal Concept Analysis over a large formal context

    Get PDF
    International audienceResearchers are facing one of the main problems of the Information Era. As more articles are made electronically available, it gets harder to follow trends in the different domains of research. Cheap, coherent and fast to construct knowledge models of research domains will be much required when information becomes unmanageable. While Formal Concept Analysis (FCA) has been widely used on several areas to construct knowledge artifacts for this purpose (Ontology development, Information Retrieval, Software Refactoring, Knowledge Discovery), the large amount of documents and terminology used on research domains makes it not a very good option (because of the high computational cost and humanly-unprocessable output). In this article we propose a novel heuristic to create a taxonomy from a large term-document dataset using Latent Semantic Analysis and Formal Concept Analysis. We provide and discuss its implementation on a real dataset from the Software Architecture community obtained from the ISI Web of Knowledge (4400 documents)

    C++ class retrieval using formal concept analysis.

    Get PDF
    Type is used as a search key in component retrieval. In retrieving a class in an object-oriented language such as C++ or Java, the type information of a class can be represented as a set of types of functions and variables defined in the class. Formal Concept Analysis is applied to construct a structured class library in which search can be much faster than in an unstructured class library. In this thesis, we present a method of representing a C++ class based on type information, and constructing a C++ class library using Formal Concept Analysis, and a retrieval process based on the structured library called a concept lattice. If the desired class that exactly matches with the query which the user provided does not exist, then more generic template classes are retrieved. A prototype retrieval system for C++ classes is implemented based on the proposed retrieval method. This prototype system constructs a concept lattice of the classes and provides a user interface that supports the retrieval process. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2002 .A395. Source: Masters Abstracts International, Volume: 41-04, page: 1099. Adviser: Young G. Park. Thesis (M.Sc.)--University of Windsor (Canada), 2002

    Semantic Indexing and Retrieval based on Formal Concept Analysis

    Get PDF
    Semantic indexing and retrieval has become an important research area, as the available amount of information on the Web is growing more and more. In this paper, we introduce an original approach to semantic indexing and retrieval based on Formal Concept Analysis. The concept lattice is used as a semantic index and we propose an original algorithm for traversing the lattice and answering user queries. This framework has been used and evaluated on song datasets

    Exploring a Geographical Dataset with GEOLIS

    Get PDF
    International audienceGeographical data are mainly structured in layers of information. However, this model of organisation is not convenient for navigation inside a dataset, and so limits geographical data exploration to querying. We think information retrieval could be made easier in GIS by the introduction of a navigation based on geographical object properties. For this purpose, we propose a prototype, GEOLIS1, which tightly combines querying and navigation in the search process of geographical data. GEOLIS relies on Logical Information Systems (LIS), which are based on Formal Concept Analysis (FCA) and logics. In this paper, we detail data organisation and navigation process in GEOLIS. We also present the results of an experimentation led on a real dataset
    • …
    corecore