    Answering Complex Questions by Joining Multi-Document Evidence with Quasi Knowledge Graphs

    Direct answering of questions that involve multiple entities and relations is a challenge for text-based QA. This problem is most pronounced when answers can be found only by joining evidence from multiple documents. Curated knowledge graphs (KGs) may yield good answers, but are limited by their inherent incompleteness and potential staleness. This paper presents QUEST, a method that can answer complex questions directly from textual sources on-the-fly, by computing similarity joins over partial results from different documents. Our method is completely unsupervised, avoiding training-data bottlenecks and being able to cope with rapidly evolving ad hoc topics and formulation style in user questions. QUEST builds a noisy quasi KG with node and edge weights, consisting of dynamically retrieved entity names and relational phrases. It augments this graph with types and semantic alignments, and computes the best answers by an algorithm for Group Steiner Trees. We evaluate QUEST on benchmarks of complex questions, and show that it substantially outperforms state-of-the-art baselines.

    Asynchronous Graph Pattern Matching on Multiprocessor Systems

    Pattern matching on large graphs is the foundation for a variety of application domains. Strict latency requirements and continuously increasing graph sizes demand the usage of highly parallel in-memory graph processing engines that need to consider non-uniform memory access (NUMA) and concurrency issues to scale up on modern multiprocessor systems. To tackle these aspects, graph partitioning becomes increasingly important. Hence, we present a technique to process graph pattern matching on NUMA systems in this paper. As a scalable pattern matching processing infrastructure, we leverage a data-oriented architecture that preserves data locality and minimizes concurrency-related bottlenecks on NUMA systems. We show in detail how graph pattern matching can be asynchronously processed on a multiprocessor system. Comment: 14 Pages, Extended version for ADBIS 201

    Web Queries: From a Web of Data to a Semantic Web?


    On the topology of network fine structures

    Multi-relational dynamics are ubiquitous in many complex systems, such as transportation, social, and biological systems. This thesis studies the two mathematical objects that encapsulate these relationships: multiplexes and interval graphs. The former is the modern approach in Network Science to generalizing the edges in graphs, while the latter was popularized during the 1960s in Graph Theory. Although multiplexes and interval graphs are nearly 50 years apart, their motivations are similar, and it is worthwhile to investigate their structural connections and properties. This thesis looks into these mathematical objects and presents their connections. For example, we look at the community structures in multiplexes and learn how unstable the detection algorithms are. This instability can lead researchers to the wrong conclusions. Thus it is important to get the formalism precise, and this thesis shows that the complexity of interval graphs is an indicator of the precision. However, this measure of complexity is a computationally hard problem in Graph Theory, and in turn we use a heuristic strategy from Network Science to tackle the problem. One of the main contributions of this thesis is the compilation of the disparate literature on these mathematical objects. The novelty of this contribution is in using the statistical tools from population biology to deduce the completeness of this thesis's bibliography. It can also be used as a framework for researchers to quantify the comprehensiveness of their preliminary investigations. From the large body of multiplex research, the thesis focuses on the statistical properties of the projection of multiplexes (the reduction of a multi-relational system to a single-relationship network). This is important because projection is always used as the baseline for many relevant algorithms, and its topology offers insight into the dynamics of the system.

    Data Model and Query Constructs for Versatile Web Query Languages

    As the Semantic Web is gaining momentum, the need for truly versatile query languages becomes increasingly apparent. A Web query language is called versatile if it can access in the same query program data in different formats (e.g. XML and RDF). Most query languages are not versatile: they have not been specifically designed to cope with both worlds, providing a uniform language and common constructs to query and transform data in various formats. Moreover, most of them do not provide a flexible data model that is powerful enough to naturally convey both Semantic Web data formats (especially RDF and Topic Maps) and XML. This article highlights challenges related to the data model and language constructs for querying both standard Web and Semantic Web data with an emphasis on facilitating sophisticated reasoning. It is shown that Xcerpt’s data model and querying constructs are particularly well-suited for the Semantic Web, but that some adjustments of the Xcerpt syntax allow for even more effective and natural querying of RDF and Topic Maps.

    Analyzing the Adoption Rate of Local Variable Type Inference in Open-source Java 10 Projects

    Type Inference is used in programming languages to improve writability. In this paper, we look more specifically at Local Variable Type Inference (LVTI). For those unfamiliar with LVTI, we also give an in-depth explanation of what it is and how it works. There is a lot of debate surrounding Type Inference in modern-day programming languages, specifically over whether the costs associated with LVTI outweigh the benefits. It has found its way into many higher-level languages including C#, C++, JavaScript, Swift, Kotlin, Rust, Go, etc. In this paper, we look at the usefulness of LVTI and its popularity since the release of Java 10. Our study shows that LVTI in Java has not received widespread adoption. We also explain a possible reason for this, based on the information we gathered from our empirical study, which involved statically analyzing 6 popular open source Java 10 projects. We also discuss different scenarios in which Type Inference can obscure different programming errors.
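
For readers unfamiliar with LVTI, the following is a minimal sketch of the Java 10 `var` keyword. It is not taken from the paper: the class name `LvtiDemo`, its methods, and the integer-division scenario are illustrative assumptions, chosen to show one way an inferred type can hide a mistake.

```java
import java.util.ArrayList;

// Hypothetical demo class; names are illustrative, not from the paper.
public class LvtiDemo {

    // `var` infers the static type from the initializer. Here the inferred
    // type is ArrayList<String>, not the List<String> interface.
    public static String inferredListType() {
        var names = new ArrayList<String>();
        names.add("a");
        return names.getClass().getName();
    }

    // Assumed example of an obscured error: all three variables are
    // inferred as int, so the division silently truncates.
    public static String ratioAsText() {
        var total = 7;              // inferred as int
        var count = 2;              // inferred as int
        var ratio = total / count;  // int division, no fractional part
        return String.valueOf(ratio);
    }

    public static void main(String[] args) {
        System.out.println(inferredListType());
        System.out.println(ratioAsText());
    }
}
```

With explicit declarations such as `double ratio`, a reviewer sees the intended type at the declaration site; with `var`, the type must be read off the initializer.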

    Semantics and result disambiguation for keyword search on tree data

    Keyword search is a popular technique for searching tree-structured data (e.g., XML, JSON) on the web because it frees the user from learning a complex query language and the structure of the data sources. However, the convenience of keyword search comes with drawbacks. The imprecision of the keyword queries usually results in a very large number of results of which only very few are relevant to the query. Multiple previous approaches have tried to address this problem. Some of them exploit structural and semantic properties of the tree data in order to filter out irrelevant results while others use a scoring function to rank the candidate results. These are not easy tasks, though, and in both cases relevant results might be missed and the users might spend a significant amount of time searching for their intended result in a plethora of candidates. Another drawback of keyword search on tree data, also due to the incapacity of keyword queries to precisely express the user intent, is that the query answer may contain different types of meaningful results even though the user is interested in only some of them. Both problems of keyword search on tree data are addressed in this dissertation. First, an original approach for answering keyword queries is proposed. This approach extracts structural patterns of the query matches and reasons with them in order to return meaningful results ranked with respect to their relevance to the query. The proposed semantics performs comparisons between patterns of results by using different types of homomorphisms between the patterns. These comparisons are used to organize the patterns into a graph of patterns which is leveraged to determine ranking and filtering semantics. The experimental results show that the approach produces query results of higher quality compared to the previous ones. To address the second problem, an original approach for clustering the keyword search results on tree data is introduced. The clustered output allows the user to focus on a subset of the results, and to save time and effort while looking for the relevant results. The approach performs clustering at different levels of granularity to group similar results together effectively. The similarity of the results and result clusters is decided using relations on structural patterns of the results defined based on homomorphisms between path patterns. An originality of the clustering approach is that the clusters are ranked at different levels of granularity to quickly guide the user to the relevant result patterns. An efficient stack-based algorithm is presented for generating result patterns and constructing the clustering hierarchy. The extensive experimentation with multiple real datasets shows that the algorithm is fast and scalable. It also shows that the clustering methodology allows the users to effectively retrieve their intended results, and outperforms a recent state-of-the-art clustering approach. In order to tackle the second problem from a different aspect, diversifying the results of keyword search is addressed. Diversification aims to provide the users with a ranked list of results which balances the relevance and redundancy of the results. Measures for quantifying the relevance and dissimilarity of result patterns are presented and a heuristic for generating a diverse set of results using these metrics is introduced.
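
The abstract does not specify the diversification heuristic, so the following is only a generic sketch of the idea of balancing relevance against redundancy. It uses a greedy selection in the style of maximal marginal relevance, which is a swapped-in technique and not the dissertation's method. The class name `DiversifyDemo`, the `lambda` weight, and the precomputed relevance and similarity arrays are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the heuristic from the dissertation.
public class DiversifyDemo {

    // Greedily picks up to k result indices. At each step the candidate
    // maximizing lambda * relevance - (1 - lambda) * (max similarity to
    // any already chosen result) is selected.
    public static List<Integer> diversify(double[] relevance,
                                          double[][] similarity,
                                          int k,
                                          double lambda) {
        List<Integer> chosen = new ArrayList<>();
        boolean[] used = new boolean[relevance.length];
        while (chosen.size() < k && chosen.size() < relevance.length) {
            int best = -1;
            double bestScore = Double.NEGATIVE_INFINITY;
            for (int i = 0; i < relevance.length; i++) {
                if (used[i]) {
                    continue;
                }
                // Redundancy penalty: closest already-selected result.
                double maxSim = 0.0;
                for (int j : chosen) {
                    maxSim = Math.max(maxSim, similarity[i][j]);
                }
                double score = lambda * relevance[i] - (1 - lambda) * maxSim;
                if (score > bestScore) {
                    bestScore = score;
                    best = i;
                }
            }
            used[best] = true;
            chosen.add(best);
        }
        return chosen;
    }
}
```

With `lambda` close to 1 the ranking reduces to plain relevance ordering; lower values push near-duplicate result patterns further down the list.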

    Developing an online database of experts for the Worcester Regional Chamber of Commerce

    The Worcester Regional Chamber of Commerce, as part of its mission to attract business to the Worcester area, wants to create an online searchable database of industry experts made up of faculty members of the colleges and universities in the Worcester area. This online database will be placed on the Higher Education – Business Partnership page of the Chamber's website. The limitation placed on this request is that the Regional Chamber currently has no monetary or information technology resources to devote to it. The proliferation of "as a Service" information technology offerings provides a number of options for satisfying the request for an online searchable database of individuals; some services are geared more specifically to this type of need and are intended for the nonprofit sector as well. The recommendation of this report is for the Worcester Regional Chamber of Commerce to consider these options even if it requires a small investment of funds on their part.