14,946 research outputs found
FishMark: A Linked Data Application Benchmark
Abstract. FishBase is an important species data collection produced by the FishBase Information and Research Group Inc (FIN), a not-forprofit NGO with the aim of collecting comprehensive information (from the taxonomic to the ecological) about all the world’s finned fish species. FishBase is exposed as a MySQL backed website (supporting a range of canned, although complex queries) and serves over 33 million hits per month. FishDelish is a transformation of FishBase into LinkedData weighing in at 1.38 billion triples. We have ported a substantial number of FishBase SQL queries to FishDelish SPARQL query which form the basis of a new linked data application benchmark (using our derivative of the Berlin SPARQL Benchmark harness). We use this benchmarking framework to compare the performance of the native MySQL application, the Virtuoso RDF triple store, and the Quest OBDA system on a fishbase.org like application.
Learning a Neural Semantic Parser from User Feedback
We present an approach to rapidly and easily build natural language
interfaces to databases for new domains, whose performance improves over time
based on user feedback, and requires minimal intervention. To achieve this, we
adapt neural sequence models to map utterances directly to SQL with its full
expressivity, bypassing any intermediate meaning representations. These models
are immediately deployed online to solicit feedback from real users to flag
incorrect queries. Finally, the popularity of SQL facilitates gathering
annotations for incorrect predictions using the crowd, which is directly used
to improve our models. This complete feedback loop, without intermediate
representations or database specific engineering, opens up new ways of building
high quality semantic parsers. Experiments suggest that this approach can be
deployed quickly for any new target domain, as we show by learning a semantic
parser for an online academic database from scratch.Comment: Accepted at ACL 201
SODA: Generating SQL for Business Users
The purpose of data warehouses is to enable business analysts to make better
decisions. Over the years the technology has matured and data warehouses have
become extremely successful. As a consequence, more and more data has been
added to the data warehouses and their schemas have become increasingly
complex. These systems still work great in order to generate pre-canned
reports. However, with their current complexity, they tend to be a poor match
for non tech-savvy business analysts who need answers to ad-hoc queries that
were not anticipated. This paper describes the design, implementation, and
experience of the SODA system (Search over DAta Warehouse). SODA bridges the
gap between the business needs of analysts and the technical complexity of
current data warehouses. SODA enables a Google-like search experience for data
warehouses by taking keyword queries of business users and automatically
generating executable SQL. The key idea is to use a graph pattern matching
algorithm that uses the metadata model of the data warehouse. Our results with
real data from a global player in the financial services industry show that
SODA produces queries with high precision and recall, and makes it much easier
for business users to interactively explore highly-complex data warehouses.Comment: VLDB201
- …