24 research outputs found
Distributed Querying of Large Labeled Graphs
The graph is a vital abstract data type with profound significance in many applications. Because of their versatility, graphs have been adapted into several different forms, and one such adaptation with many practical applications is the “Labeled Graph”, in which vertices and edges are labeled. An enormous research effort has been invested into the task of managing and querying graphs, yet many challenges remain unsolved. In this thesis, we advance the state of the art for the following query models and propose distributed solutions to process them in an efficient and scalable manner.
• Set Reachability. We formalize and investigate a generalization of the basic notion of reachability, called set reachability. Set reachability deals with finding all reachable pairs for given source and target sets. We present a non-iterative distributed solution that takes only a single round of communication for any set reachability query. This is achieved by precomputation, replication, and indexing of partial reachabilities among the boundary vertices.
• Basic Graph Patterns (BGP). Supported by the majority of query languages, BGP queries are a common mode of querying knowledge graphs, biological datasets, etc. We present a novel distributed architecture that relies on the concepts of asynchronous execution, join-ahead pruning, and a multi-threaded query processing framework to process BGP queries in an efficient and scalable manner.
• Generalized Graph Patterns (GGP). These queries combine the semantics of pattern matching and navigational queries, and are popular in scenarios where the schema of the underlying graph is either unknown or only partially known. We present a distributed solution with a bimodal indexing layout that individually supports efficient processing of BGP queries and navigational queries. Furthermore, we design a unified query optimizer and processor to process GGP queries efficiently and scalably.
To this end, we propose a prototype distributed engine, coined “TriAD” (Triple Asynchronous and Distributed), that supports all the aforementioned query models. We also provide a detailed empirical evaluation of TriAD in comparison to several state-of-the-art systems over multiple real-world and synthetic datasets.
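The set reachability query described in the abstract above can be illustrated with a minimal, centralized sketch. It shows only the query semantics (all reachable source/target pairs), not the thesis's distributed, single-round method. The function name `set_reachability`, the adjacency-dictionary graph representation, and the convention that a vertex reaches itself are assumptions made for illustration.

```python
from collections import deque


def set_reachability(graph, sources, targets):
    """Return every (s, t) with s in sources, t in targets, and t reachable from s.

    `graph` is a hypothetical adjacency dict: vertex -> iterable of successors.
    Assumption: a vertex is considered reachable from itself (path of length 0).
    """
    targets = set(targets)
    pairs = set()
    for s in sources:
        # Plain breadth-first search from each source vertex.
        seen = {s}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            if v in targets:
                pairs.add((s, v))
            for w in graph.get(v, ()):
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
    return pairs
```

For a graph with edges a→b, b→c, and d→c, calling `set_reachability({"a": ["b"], "b": ["c"], "d": ["c"]}, ["a", "d"], ["b", "c"])` yields the pairs (a, b), (a, c), and (d, c). Running one search per source is the naive baseline; the abstract's approach instead precomputes and indexes partial reachabilities among boundary vertices.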
SPAR-Key: Processing SPARQL-Fulltext Queries to Solve Jeopardy! Clues
We describe our SPAR-Key query engine, which implements indexing, ranking, and query processing techniques to run a new kind of SPARQL-fulltext queries that were provided in the context of the INEX 2013 Jeopardy task.
TriAD: A Distributed Shared-nothing RDF Engine Based on Asynchronous Message Passing
We investigate a new approach to the design of distributed, shared-nothing RDF engines. Our engine, coined “TriAD”, combines join-ahead pruning via a novel form of RDF graph summarization with a locality-based, horizontal partitioning of RDF triples into a grid-like, distributed index structure. The multi-threaded and distributed execution of joins in TriAD is facilitated by an asynchronous Message Passing protocol which allows us to run multiple join operators along a query plan in a fully parallel, asynchronous fashion. We believe that our architecture provides a so far unique approach to join-ahead pruning in a distributed environment, as the more classical form of sideways information passing would not permit executing distributed joins in an asynchronous way. Our experiments over the LUBM, BTC and WSDTS benchmarks demonstrate that TriAD consistently outperforms centralized RDF engines by up to two orders of magnitude, while gaining a factor of more than three compared to the currently fastest, distributed engines. To our knowledge, we are thus able to report the so far fastest query response times for the above benchmarks using a mid-range server and regular Ethernet setup.
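The idea of join-ahead pruning over a graph summary, as mentioned in the abstract above, can be sketched on a toy scale. This is an illustration under stated assumptions, not TriAD's actual algorithm: the partition map `part`, the helper names, and the restriction to a two-pattern path query (?x p1 ?y)(?y p2 ?z) are all hypothetical. The query is first joined over the much smaller summary graph, and only data triples whose partitions took part in a summary match are kept for the real join.

```python
def summarize(triples, part):
    """Collapse each triple (s, p, o) to its partition-level edge (assumed summary form)."""
    return {(part[s], p, part[o]) for s, p, o in triples}


def path_join(edges, p1, p2):
    """Evaluate (?x p1 ?y)(?y p2 ?z) over a collection of (s, p, o) edges."""
    by_subject = {}
    for s, p, o in edges:
        if p == p2:
            by_subject.setdefault(s, []).append(o)
    return {(x, y, z)
            for x, p, y in edges if p == p1
            for z in by_subject.get(y, ())}


def pruned_path_join(triples, part, p1, p2):
    """Join on the summary first, then join only the triples that survive pruning."""
    summary_matches = path_join(summarize(triples, part), p1, p2)
    ok_first = {(sx, sy) for sx, sy, _ in summary_matches}
    ok_second = {(sy, sz) for _, sy, sz in summary_matches}
    kept = [(s, p, o) for s, p, o in triples
            if (p == p1 and (part[s], part[o]) in ok_first)
            or (p == p2 and (part[s], part[o]) in ok_second)]
    # Return the join result plus the number of triples that had to be scanned.
    return path_join(kept, p1, p2), len(kept)
```

The pruning is safe in this sketch because every real match (x, y, z) maps to a summary match over the partitions of x, y, and z, so both of its triples are retained. How much is pruned depends entirely on the quality of the partitioning, which this toy example takes as given.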
Why are NLP Models Fumbling at Elementary Math? A Survey of Deep Learning based Word Problem Solvers
From the latter half of the last decade, there has been a growing interest in
developing algorithms for automatically solving mathematical word problems
(MWP). It is a challenging and unique task that demands blending surface level
text pattern recognition with mathematical reasoning. In spite of extensive
research, we are still miles away from building robust representations of
elementary math word problems and effective solutions for the general task. In
this paper, we critically examine the various models that have been developed
for solving word problems, their pros and cons and the challenges ahead. In the
last two years, a lot of deep learning models have recorded competing results
on benchmark datasets, making a critical and conceptual analysis of literature
highly useful at this juncture. We take a step back and analyse why, in spite
of this abundance in scholarly interest, the predominantly used experiment and
dataset designs continue to be a stumbling block. From the vantage point of
having analyzed the literature closely, we also endeavour to provide a road-map
for future math word problem research.