Expanding Database Keyword Search for Database Exploration
Database keyword search (DB KWS) has received much attention in the database research community. Although much of the research has been motivated by improving performance, recent work has also paid increasing attention to its role in database content exploration and data mining. In this paper we explore aspects of DB KWS in two steps. First, we expand DB KWS by incorporating ontologies to better capture users' intentions. Second, we examine how KWS, or ontology-enriched KWS, can offer useful hints for better understanding of the data and for in-depth analysis of the data contents, i.e., data mining.
Adaptation of language model of Information Retrieval for empty answers Problem in databases
Information over the web is increasingly retrieved from relational databases, in which the query language is based on exact matching: the data either fully satisfy the query or not. The results returned to the user contain only tuples that satisfy the conditions of the query. The user can therefore be confronted with the problem of empty answers when the query is too selective. To overcome this problem, several approaches have been proposed in the literature, in particular those based on relaxing query conditions. Other works suggest the use of fuzzy set theory to introduce flexible queries. Another line of research proposes adapting information retrieval (IR) approaches to obtain approximate matching in databases. In this paper we discuss an adaptation of the language model of IR to deal with empty answers. The main idea behind our approach is that instead of returning an empty response to the user, we return a ranked list of tuples whose values are most similar to those specified in the user's query.
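The ranking idea described in the abstract can be illustrated with a minimal sketch. The toy relation, attribute names, similarity functions, and the Jelinek-Mercer-style smoothing below are all illustrative assumptions, not the authors' actual model:

```python
import math

# Toy relation; attribute names and rows are illustrative assumptions.
CARS = [
    {"make": "toyota", "price": 9000,  "year": 2012},
    {"make": "honda",  "price": 15000, "year": 2018},
    {"make": "toyota", "price": 21000, "year": 2021},
]

def sim(q, v, scale=1.0):
    """Similarity in (0, 1]: exact match for strings, exponential
    decay with distance for numeric values."""
    if isinstance(v, str):
        return 1.0 if q == v else 1e-6   # small floor instead of zero
    return math.exp(-abs(q - v) / scale)

def rank(query, rows, lam=0.8):
    """Jelinek-Mercer-style smoothing: mix each tuple's own match
    score with the collection average, rank tuples by log-probability."""
    scales = {a: max(1.0, max(abs(r[a]) for r in rows) / 10)
              for a in query if not isinstance(query[a], str)}
    def attr_sim(r, a):
        return sim(query[a], r[a], scales.get(a, 1.0))
    coll = {a: sum(attr_sim(r, a) for r in rows) / len(rows) for a in query}
    scored = []
    for r in rows:
        logp = sum(math.log(lam * attr_sim(r, a) + (1 - lam) * coll[a])
                   for a in query)
        scored.append((logp, r))
    return [r for _, r in sorted(scored, key=lambda x: -x[0])]

# A too-selective query that would return an empty answer under exact
# matching; here it yields a ranked list instead, closest tuple first.
best = rank({"make": "toyota", "price": 10000}, CARS)
```

Instead of an empty result, the user sees every tuple ordered by how closely its values approximate the query conditions.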
Weakening of fuzzy relational queries: an absolute proximity relation-based approach
In this paper we address the problem of query failure in the context of flexible querying. We propose a fuzzy set-based approach for relaxing queries involving gradual predicates. This approach relies on the notion of a proximity relation defined in an absolute way. We show how such a proximity relation allows a given predicate to be transformed into an enlarged one. The resulting predicate is semantically not far from the original and is obtained by a simple fuzzy arithmetic operation. The main features of the weakening mechanism are investigated, and a comparative study with some methods proposed for the purpose of fuzzy query weakening is presented as well. Finally, an example illustrates our proposal in the case of conjunctive queries.
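The enlargement step can be sketched for trapezoidal fuzzy predicates, where adding a symmetric tolerance interval by fuzzy arithmetic widens the support. The predicate "young" and the tolerance value below are illustrative assumptions, not the paper's worked example:

```python
def trapezoid(a, b, c, d):
    """Membership function of a trapezoidal fuzzy predicate:
    0 outside (a, d), 1 on [b, c], linear in between."""
    def mu(x):
        if x <= a or x >= d:
            return 0.0
        if b <= x <= c:
            return 1.0
        if x < b:
            return (x - a) / (b - a)
        return (d - x) / (d - c)
    return mu

def weaken(a, b, c, d, eps):
    """Enlarge the predicate by fuzzy addition of a symmetric tolerance
    interval (-eps, 0, 0, eps): the core [b, c] is kept, the support
    stretches to (a - eps, d + eps)."""
    return (a - eps, b, c, d + eps)

# Illustrative predicate "young": full membership up to 25, fading out at 35.
young = trapezoid(0, 0, 25, 35)
# Weakened "young": a 36-year-old now gets a non-zero degree
# instead of failing the predicate outright.
young_relaxed = trapezoid(*weaken(0, 0, 25, 35, 5))
```

The relaxed predicate stays semantically close to the original: every value the original accepted keeps its degree of membership in the core, while borderline values gain a small positive degree.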
No-But-Semantic-Match: Computing Semantically Matched XML Keyword Search Results
Users are rarely familiar with the content of a data source they are
querying, and therefore cannot avoid using keywords that do not exist in the
data source. Traditional systems may respond with an empty result, causing
dissatisfaction, while the data source in effect holds semantically related
content. In this paper we study this no-but-semantic-match problem on XML
keyword search and propose a solution which enables us to present the top-k
semantically related results to the user. Our solution involves two steps: (a)
extracting semantically related candidate queries from the original query and
(b) processing candidate queries and retrieving the top-k semantically related
results. Candidate queries are generated by replacement of non-mapped keywords
with candidate keywords obtained from an ontological knowledge base. Candidate
results are scored using their cohesiveness and their similarity to the
original query. Since the number of queries to process can be large, with each
result having to be analyzed, we propose pruning techniques to retrieve the
top-k results efficiently. We develop two query processing algorithms based
on our pruning techniques. Further, we exploit a property of the candidate
queries to propose a technique for processing multiple queries in batch, which
improves the performance substantially. Extensive experiments on two real
datasets verify the effectiveness and efficiency of the proposed approaches.
Comment: 24 pages, 21 figures, 6 tables; submitted to The VLDB Journal for possible publication
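Step (a) of the abstract, generating candidate queries by replacing non-mapped keywords with ontology-derived alternatives and scoring each candidate, can be sketched as follows. The toy ontology, vocabulary, and similarity weights are illustrative assumptions, not the authors' knowledge base or scoring function:

```python
from itertools import product

# Toy "ontology": semantically related alternatives per keyword,
# each with an assumed similarity weight.
RELATED = {
    "film":   [("movie", 0.9), ("picture", 0.6)],
    "author": [("writer", 0.9), ("creator", 0.5)],
}

def candidate_queries(query, data_terms):
    """Keep keywords that occur in the data source; replace each
    non-mapped keyword with its ontological alternatives. Score every
    candidate query by the product of its replacement similarities."""
    options = []
    for kw in query:
        if kw in data_terms:
            options.append([(kw, 1.0)])          # mapped keyword: keep as-is
        else:
            options.append(RELATED.get(kw, []))  # non-mapped: substitute
    cands = []
    for combo in product(*options):
        score = 1.0
        for _, s in combo:
            score *= s
        cands.append((score, [t for t, _ in combo]))
    return sorted(cands, reverse=True)

# "film" and "author" do not occur in the data source; "prize" does.
data_terms = {"movie", "writer", "creator", "prize"}
ranked = candidate_queries(["film", "author", "prize"], data_terms)
```

Processing the candidates in descending score order is what makes the pruning in step (b) possible: once the k-th best result so far outscores the best achievable score of the remaining candidate queries, the rest can be skipped.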
Personalized Indexing of Music by Emotions
How a person interprets music and what prompts a person to feel certain emotions are two very subjective things. This dissertation presents a method by which a system can learn and track a user's listening habits in order to recommend songs that fit the user's specific way of interpreting music and emotions. First, a literature review gives an overview of the current state of recommender systems and describes classifiers; then the process of collecting user data is discussed; next, the process of training and testing personalized classifiers is described; finally, a system combining the personalized classifiers with clustered data into a hierarchy of recommender systems is presented.
Portal de Emprego Inteligente (Intelligent Job Portal)
The importance of the internet is today a reality: the ease with which it lets us access products or services and information, or even brings people closer together, makes it ever more indispensable. More and more of our lives take place through the internet. Whether it is a simple look-up of a store's opening hours, the purchase of products through online sales platforms, banking transactions, or even tax operations, the internet is part of our lives. New business areas even emerge with the massification of internet use. Naturally, the emergence of platforms that replicate the real world in the virtual world becomes quite obvious and increasingly desired.
Job searching is quite common in the real world. Naturally, with the internet, platforms dedicated to this area have emerged and continue to emerge. Companies offering jobs turn to these platforms because they reach many users and are generally free, combining the best of both worlds.
The need to reach the target audience more effectively leads to platforms with finer-grained job categories, or platforms specialized in particular areas. However, search on these platforms falls short of what is desired, because it does not take into account the relevance of a job to the user and thus presents irrelevant results.
To offer a new job-search paradigm, we created a knowledge-enabled platform that extends traditional search, obtaining more results that are highly relevant to the user.
Qualitätskontrolle mittels semantischer Technologien in digitalen Bibliotheken (Quality Control Using Semantic Technologies in Digital Libraries)
Controlled content quality, especially in terms of indexing, is one of the major advantages of digital libraries in contrast to general Web sources or Web search engines. Therefore, more and more digital libraries offer corpora related to a specialized domain. Beyond simple keyword-based searches, the resulting information systems often rely on entity-centered searches. To be able to offer this kind of search, high-quality document processing is essential.
However, considering today's information flood, the mostly manual effort of acquiring new sources and creating suitable (semantic) metadata for content indexing and retrieval is already prohibitive. A recent solution is the automatic generation of metadata, where mostly statistical techniques such as document classification and entity extraction are becoming more widespread. But in this case, neglecting quality assurance is even more problematic, because heuristic generation often fails, and the resulting low-quality metadata will directly diminish the quality of service that a digital library provides. Thus, the quality of the metadata annotations an information system uses for subsequent querying of its collections has to be assessable. In this thesis we discuss the importance of metadata quality assessment for information systems and the benefits gained from controlled and guaranteed quality.
Redesigning the Structure and Access Paths of Databases for Effective and Efficient Query Processing
Many database users are not familiar with formal query languages, the concept of a schema, or the exact content of their database. It is thus challenging for these users to formulate their information needs over semi-structured and structured databases. To address this problem, researchers have proposed usable query interfaces over which users can formulate their information needs without knowing formal query languages, the schema, or the exact content of the database. Although these interfaces increase the usability of databases, they inherently suffer from low effectiveness and efficiency. The recent growth in databases' content size and schema complexity only exacerbates this problem. In this dissertation, we present a set of approaches to redesign the components of database management systems to improve the effectiveness and efficiency of query processing. We present theoretical and empirical results on the impact of database size and schema complexity on the effectiveness of keyword query search. Based on these results, we propose a system that answers keyword queries more effectively. Furthermore, we present an online learning method that improves the response time of query processing over large databases.