12 research outputs found

    Expanding Database Keyword Search for Database Exploration

    Database keyword search (DB KWS) has received a lot of attention in the database research community. Although much of the research has been motivated by improving performance, recent research has also paid increased attention to its role in exploring database contents and in data mining. In this paper we explore aspects related to DB KWS in two steps. First, we expand DB KWS by incorporating ontologies to better capture users’ intentions. Second, we examine how KWS or ontology-enriched KWS can offer useful hints for better understanding and in-depth analysis of the data contents, or for data mining.

    Adaptation of language model of Information Retrieval for empty answers Problem in databases

    Information on the web is increasingly retrieved from relational databases whose query language is based on exact matching: a tuple either satisfies the query completely or not at all. The results returned to the user contain only tuples that satisfy the query conditions. The user can therefore face the problem of empty answers when a query is too selective. To overcome this problem, several approaches have been proposed in the literature, in particular those based on relaxing query conditions. Other works suggest using fuzzy set theory to introduce flexible queries. Another line of research proposes adapting information retrieval (IR) approaches to obtain approximate matching in databases. In this paper we discuss an adaptation of the IR language model to deal with empty answers. The main idea behind our approach is that, instead of returning an empty response, the system returns a ranked list of the tuples whose values are most similar to those specified in the user's query.
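The idea of returning a ranked list rather than an empty answer can be sketched roughly as follows. This is only an illustration, not the paper's model: it assumes a standard query-likelihood score with Jelinek-Mercer smoothing, treats each tuple as a bag of words, and the names `rank_tuples`, `tokenize`, and the parameter `lam` are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple assumption)."""
    return text.lower().split()


def rank_tuples(rows, query, lam=0.5):
    """Rank rows (dicts) by a smoothed query-likelihood score.

    Each row is treated as a small document made of its attribute values.
    lam weights the row's own term distribution against the whole table's.
    """
    docs = [Counter(tokenize(" ".join(str(v) for v in row.values())))
            for row in rows]
    collection = Counter()
    for doc in docs:
        collection.update(doc)
    total = sum(collection.values())

    scored = []
    for row, doc in zip(rows, docs):
        length = sum(doc.values())
        score = 0.0
        for term in tokenize(query):
            p_doc = doc[term] / length if length else 0.0
            p_col = collection[term] / total if total else 0.0
            p = lam * p_doc + (1 - lam) * p_col
            # Terms absent from the whole table get a small floor so log() is defined.
            score += math.log(p if p > 0 else 1e-9)
        scored.append((score, row))

    # Highest score first; Python's sort is stable, so ties keep table order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [row for _, row in scored]
```

With an exact-match query such as make = 'toyota' AND color = 'green' on a table with no green cars, a conventional engine returns nothing, whereas `rank_tuples(rows, "toyota green")` would still place the Toyota rows ahead of the others.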

    Weakening of fuzzy relational queries: an absolute proximity relation-based approach

    In this paper we address the problem of query failure in the context of flexible querying. We propose a fuzzy set–based approach for relaxing queries involving gradual predicates. This approach relies on the notion of a proximity relation, which is defined in an absolute way. We show how such a proximity relation allows a given predicate to be transformed into an enlarged one. The resulting predicate is semantically close to the original one and is obtained by a simple fuzzy arithmetic operation. The main features of the weakening mechanism are investigated, and a comparative study with some methods proposed for fuzzy query weakening is presented as well. Finally, an example is provided to illustrate our proposal in the case of conjunctive queries.

    No-But-Semantic-Match: Computing Semantically Matched XML Keyword Search Results

    Users are rarely familiar with the content of a data source they are querying, and therefore cannot avoid using keywords that do not exist in the data source. Traditional systems may respond with an empty result, causing dissatisfaction, while the data source in effect holds semantically related content. In this paper we study this no-but-semantic-match problem on XML keyword search and propose a solution which enables us to present the top-k semantically related results to the user. Our solution involves two steps: (a) extracting semantically related candidate queries from the original query and (b) processing candidate queries and retrieving the top-k semantically related results. Candidate queries are generated by replacing non-mapped keywords with candidate keywords obtained from an ontological knowledge base. Candidate results are scored using their cohesiveness and their similarity to the original query. Since the number of queries to process can be large, with each result having to be analyzed, we propose pruning techniques to retrieve the top-k results efficiently. We develop two query processing algorithms based on our pruning techniques. Further, we exploit a property of the candidate queries to propose a technique for processing multiple queries in batch, which improves the performance substantially. Extensive experiments on two real datasets verify the effectiveness and efficiency of the proposed approaches. Comment: 24 pages, 21 figures, 6 tables, submitted to The VLDB Journal for possible publication
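Step (a), generating candidate queries by replacing non-mapped keywords, might look roughly like the sketch below. It is an assumption-laden illustration rather than the authors' algorithm: the ontology is modelled as a plain dictionary of (substitute, similarity) pairs, a candidate's score is simply the product of its substitutes' similarities, and the result-level cohesiveness scoring and pruning described in the abstract are not modelled. The names `candidate_queries`, `data_vocab`, and `ontology` are hypothetical.

```python
from itertools import product


def candidate_queries(query, data_vocab, ontology):
    """Build candidate queries for keywords that do not occur in the data.

    query:      list of keywords typed by the user
    data_vocab: set of keywords that actually occur in the data source
    ontology:   dict mapping a keyword to [(substitute, similarity), ...]
    Returns a list of (keywords, score) pairs, best score first.
    """
    options = []
    for keyword in query:
        if keyword in data_vocab:
            # Mapped keyword: keep it unchanged with full similarity.
            options.append([(keyword, 1.0)])
        else:
            # Non-mapped keyword: keep only substitutes present in the data.
            substitutes = [(alt, sim) for alt, sim in ontology.get(keyword, [])
                           if alt in data_vocab]
            if not substitutes:
                return []  # no usable replacement, so no candidate query
            options.append(substitutes)

    candidates = []
    for combo in product(*options):
        keywords = [kw for kw, _ in combo]
        score = 1.0
        for _, sim in combo:
            score *= sim
        candidates.append((keywords, score))

    candidates.sort(key=lambda cand: cand[1], reverse=True)
    return candidates
```

Because every non-mapped keyword multiplies the number of combinations, the candidate set grows quickly, which is consistent with the abstract's motivation for pruning and batch processing.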

    PERSONALIZED INDEXING OF MUSIC BY EMOTIONS

    How a person interprets music and what prompts a person to feel certain emotions are two very subjective things. This dissertation presents a method where a system can learn and track a user’s listening habits with the purpose of recommending songs that fit the user’s specific way of interpreting music and emotions. First a literature review is presented which shows an overview of the current state of recommender systems, as well as describing classifiers; then the process of collecting user data is discussed; then the process of training and testing personalized classifiers is described; finally a system combining the personalized classifiers with clustered data into a hierarchy of recommender systems is presented

    Portal de Emprego Inteligente

    The importance of the internet is now a reality: the ease with which it lets us access products and services, find information, or even bring people together makes it ever more indispensable. More and more of our lives take place over the internet. Whether it is simply checking a shop's opening hours, buying products through online sales platforms, carrying out banking transactions, or even handling tax operations, the internet is part of our lives. New business areas even emerge as internet use becomes widespread. Naturally, the emergence of platforms that replicate the real world in the virtual world becomes quite obvious and increasingly desired. Job searching is very common in the real world, and with the internet, platforms dedicated to this area have appeared and continue to appear. Companies offering jobs turn to these platforms because they reach many users and are generally free, combining the best of both worlds. The need to reach the target audience more effectively leads to platforms with finer granularity of job areas, or platforms specialized in particular areas. However, search on these platforms falls short of what is desired, because it does not take into account the relevance of a job to the user and therefore presents irrelevant results. To offer a new paradigm for job search, a knowledge-based platform was created that extends the traditional type of search, obtaining more results that are highly relevant to the user.

    Qualitätskontrolle mittels semantischer Technologien in digitalen Bibliotheken

    Controlled content quality, especially in terms of indexing, is one of the major advantages of using digital libraries in contrast to general Web sources or Web search engines. Therefore, more and more digital libraries offer corpora related to a specialized domain. Beyond simple keyword-based searches, the resulting information systems often rely on entity-centered searches. To offer this kind of search, high-quality document processing is essential. However, considering today’s information flood, the mostly manual effort of acquiring new sources and creating suitable (semantic) metadata for content indexing and retrieval is already prohibitive. A recent solution is the automatic generation of metadata, where mostly statistical techniques such as document classification and entity extraction are becoming more widespread. But in this case neglecting quality assurance is even more problematic, because heuristic generation often fails and the resulting low-quality metadata will directly diminish the quality of service that a digital library provides. Thus, quality assessment of an information system’s metadata annotations used for subsequent querying of collections has to be enabled. In this thesis we discuss the importance of metadata quality assessment for information systems and the benefits gained from controlled and guaranteed quality.
    Controlled metadata quality is one of the most important advantages of using digital libraries compared with Web search engines. Building on this high-quality content, digital libraries are creating more and more subject-specific portals. Besides simple keyword search, the resulting information systems often also offer object-centered searches. Enabling such object search, however, requires high-quality processing of the underlying documents. Considering today's flood of information, the effort of manually acquiring new sources and creating (semantic) metadata for indexing is already prohibitive. A current solution to this problem is the mostly automatic generation of (semantic) metadata by statistical methods such as automatic document classification and entity extraction. When such methods are used, though, neglecting quality is even more problematic, because heuristic generation is often error-prone. The resulting poor metadata quality directly degrades the service quality of a digital library. Quality assessment of the metadata must therefore be guaranteed. In this thesis we discuss the importance of metadata quality for digital libraries and the opportunities that arise from controlled and guaranteed quality.

    Enhancing the Usability of XML keyword Search

    Ph.D. (Doctor of Philosophy)

    Similarity-aware query refinement for data exploration
