SODA: Generating SQL for Business Users
The purpose of data warehouses is to enable business analysts to make better
decisions. Over the years the technology has matured and data warehouses have
become extremely successful. As a consequence, more and more data has been
added to the data warehouses and their schemas have become increasingly
complex. These systems still work well for generating pre-canned
reports. However, with their current complexity, they tend to be a poor match
for non-tech-savvy business analysts who need answers to ad-hoc queries that
were not anticipated. This paper describes the design, implementation, and
experience of the SODA system (Search over DAta Warehouse). SODA bridges the
gap between the business needs of analysts and the technical complexity of
current data warehouses. SODA enables a Google-like search experience for data
warehouses by taking keyword queries of business users and automatically
generating executable SQL. The key idea is to use a graph pattern matching
algorithm that uses the metadata model of the data warehouse. Our results with
real data from a global player in the financial services industry show that
SODA produces queries with high precision and recall, and makes it much easier
for business users to interactively explore highly complex data warehouses.
Comment: VLDB201
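As a rough illustration of the keyword-to-SQL idea described in this abstract, the sketch below maps keywords to columns through a small metadata index and joins the matched tables by walking a foreign-key graph. It is a toy under stated assumptions, not the SODA algorithm: the table names, the keyword index, and the helper functions `join_path` and `keywords_to_sql` are all hypothetical, and a plain breadth-first search stands in for the paper's graph pattern matching.

```python
from collections import deque

# Hypothetical metadata (not from the paper): keyword -> (table, column).
KEYWORD_INDEX = {
    "customer": ("customers", "name"),
    "balance": ("accounts", "balance"),
}

# Hypothetical foreign-key graph; each edge carries its join condition.
JOIN_GRAPH = {
    "customers": {"accounts": "customers.id = accounts.customer_id"},
    "accounts": {"customers": "customers.id = accounts.customer_id"},
}


def join_path(start, goal):
    """Breadth-first search for the shortest chain of joins from start to goal.

    Returns a list of (table, join_condition) steps, or None if unreachable.
    """
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        table, steps = queue.popleft()
        if table == goal:
            return steps
        for neighbour, condition in JOIN_GRAPH.get(table, {}).items():
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, steps + [(neighbour, condition)]))
    return None


def keywords_to_sql(query):
    """Turn a keyword query into a SELECT statement using the metadata above."""
    hits = [KEYWORD_INDEX[w] for w in query.lower().split() if w in KEYWORD_INDEX]
    if not hits:
        raise ValueError("no keyword matched the metadata")
    root = hits[0][0]
    sql = "SELECT " + ", ".join(f"{t}.{c}" for t, c in hits) + f" FROM {root}"
    joined = {root}
    for table, _ in hits[1:]:
        if table in joined:
            continue
        steps = join_path(root, table)
        if steps is None:
            raise ValueError(f"no join path from {root} to {table}")
        # Add every table on the path, including intermediate ones.
        for step_table, condition in steps:
            if step_table not in joined:
                joined.add(step_table)
                sql += f" JOIN {step_table} ON {condition}"
    return sql


print(keywords_to_sql("customer balance"))
```

With this made-up schema, the query "customer balance" selects one column from each matched table and joins the two tables on their foreign key. A real system would also have to rank alternative interpretations and handle filters and aggregates, which this sketch leaves out.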
The National Transport Data Framework
Report by Professor Peter Landshoff (Cambridge University) and
Professor John Polak (Imperial College London) on a project for
the Department for Transport.
The NTDF is designed to be a resource where data owners deposit descriptions of their data
in a central catalogue, so that people can search for data, find them,
and understand their characteristics. The value of this is to individuals, to
commercial organizations, and to public bodies. For example, services that
provide better information to travellers will help to make their journey
less stressful and persuade them to make more use of public transport.
Transport operators need very diverse information to help them
plan developments to their services: demographic, geographical, economic etc.
And policy makers need a similar range of information to help them decide
how to divide their budget and afterwards to evaluate how valuable the spending has
been. This work was supported by the Department for Transport (DfT).
Using Fuzzy Linguistic Representations to Provide Explanatory Semantics for Data Warehouses
A data warehouse integrates large amounts of extracted and summarized data from multiple sources for direct querying and analysis. While it gives decision makers easy access to such historical and aggregate data, the real meaning of the data has been ignored. For example, whether a total sales amount of 1,000 items indicates good or bad sales performance remains unclear. From the decision makers' point of view, the semantics that convey the meaning of the data matter more than the raw numbers. In this paper, we explore the use of fuzzy technology to provide these semantics for the summarizations and aggregates developed in data warehousing systems. A three-layered data warehouse semantic model, consisting of quantitative (numerical) summarization, qualitative (categorical) summarization, and quantifier summarization, is proposed for capturing and explicating the semantics of warehoused data. Based on the model, several algebraic operators are defined. We also extend the SQL language to allow for flexible queries against such enhanced data warehouses.
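To make the step from quantitative to qualitative summarization concrete, the sketch below grades a sales total against three linguistic labels using triangular fuzzy membership functions. It is only an illustration under stated assumptions: the labels, the breakpoints, and the function names `triangular` and `describe` are hypothetical and are not taken from the paper, whose model and operators may differ.

```python
def triangular(x, left, peak, right):
    """Triangular membership: 0 at or outside [left, right], 1 at the peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical linguistic labels for a total sales amount (items sold).
# The breakpoints are made up; totals are assumed to lie in [0, 2000].
LABELS = {
    "poor": (-1000, 0, 1000),
    "average": (0, 1000, 2000),
    "good": (1000, 2000, 3000),
}


def describe(total):
    """Return the degree to which each label applies to a sales total."""
    clamped = min(max(total, 0), 2000)  # keep the value inside the assumed domain
    return {label: triangular(clamped, *shape) for label, shape in LABELS.items()}


memberships = describe(1250)
best_label = max(memberships, key=memberships.get)
print(memberships, best_label)
```

Under these made-up breakpoints, a total of 1,000 items is fully "average", while 1,250 items is mostly "average" and partly "good". Whether 1,000 items counts as good or bad therefore depends entirely on how the labels are defined for the business in question.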
Hierarchical Attention Network for Action Segmentation
The temporal segmentation of events is an essential task and a precursor for
the automatic recognition of human actions in video. Several attempts have
been made to capture frame-level salient aspects through attention, but they
lack the capacity to map the temporal relationships between frames
effectively, as they capture only a limited span of temporal dependencies. To this
end we propose a complete end-to-end supervised learning approach that can
better learn relationships between actions over time, thus improving the
overall segmentation performance. The proposed hierarchical recurrent attention
framework analyses the input video at multiple temporal scales, to form
embeddings at frame level and segment level, and perform fine-grained action
segmentation. This generates a simple, lightweight, yet extremely effective
architecture for segmenting continuous video streams and has multiple
application domains. We evaluate our system on multiple challenging public
benchmark datasets, including the MERL Shopping, 50 Salads, and Georgia Tech
Egocentric datasets, and achieve state-of-the-art performance. The evaluated
datasets encompass numerous video capture settings, including
static overhead camera views and dynamic, egocentric head-mounted camera
views, demonstrating the direct applicability of the proposed framework in a
variety of settings.
Comment: Published in Pattern Recognition Letter