Translating SQL to Spreadsheet: A Survey
Spreadsheets are the most popular and conventionally used databases today. Because spreadsheets are visual, expression-based languages, research into their features is highly relevant. A spreadsheet can be viewed as a relational database: a sheet holds its information as rows, just as each table (relation) in an RDBMS does. Each row represents a record that belongs to one or more relations. Spreadsheets use formulae to extract the required information, but writing them requires expert knowledge of the tool and its usage. The use of spreadsheets can be extended in many directions, because they offer great flexibility in how data is stored and in the dependencies among stored data. We survey research that has paid particular attention to spreadsheets and their applicability in different functional cases, such as data visualization and SQL engines. Our survey focuses on QUERYSHEET, ES-SQL, MDSHEET and PrediCalc [3], [5], [4], [8]. These works motivate our survey and our interest in spreadsheets and their functional extensibility.
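The sheet-as-relation view described in this abstract can be illustrated with a minimal sketch. It is not taken from any of the surveyed systems. The sheet contents, the `staff` table name and the `sheet_to_table` helper are all hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3

# Hypothetical sheet: the first row is the header, the remaining rows are records.
sheet = [
    ["name", "dept", "salary"],
    ["Ann", "Sales", 5200],
    ["Bob", "Sales", 4100],
    ["Cy", "IT", 6000],
]


def sheet_to_table(conn, table, rows):
    """Load a header-plus-records sheet into a relational table (illustrative helper)."""
    header, records = rows[0], rows[1:]
    cols = ", ".join(f'"{c}"' for c in header)
    marks = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({cols})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', records)


conn = sqlite3.connect(":memory:")
sheet_to_table(conn, "staff", sheet)

# The SQL query below plays the role of a conditional-sum formula.
# A spreadsheet counterpart would be: =SUMIF(B2:B4, "Sales", C2:C4)
total_sales = conn.execute(
    "SELECT SUM(salary) FROM staff WHERE dept = ?", ("Sales",)
).fetchone()[0]
print(total_sales)  # 9300
```

The point of the comparison is only that each sheet row maps to one record, so a declarative query and a range formula can express the same selection and aggregation.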
Troubles with Nulls, Views from the Users
Incomplete data, in the form of null values, has been extensively studied since the inception of the relational model in the 1970s. Anecdotally, one hears that the way in which SQL, the standard language for relational databases, handles nulls creates a myriad of problems in everyday applications of database systems. To the best of our knowledge, however, the actual shortcomings of SQL in this respect, as perceived by database practitioners, have not been systematically documented, and it is not known if existing research results can readily be used to address the practical challenges. Our goal is to collect and analyze the shortcomings of nulls and their treatment by SQL, and to re-evaluate existing research in this light. To this end, we designed and conducted a survey on the everyday usage of null values among database users. From the analysis of the results we reached two main conclusions. First, null values are ubiquitous and relevant in real-life scenarios, but SQL's features designed to deal with them cause multiple problems. The severity of these problems varies depending on the SQL features used, and they cannot be reduced to a single issue. Second, foundational research on nulls is misdirected and has been addressing problems of limited practical relevance. We urge the community to view the results of this survey as a way to broaden the spectrum of their research and further bridge the theory-practice gap on null values.
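The abstract does not list the specific problems reported by respondents. As a hedged illustration of the kind of behaviour that SQL's three-valued logic produces, the sketch below uses SQLite through Python's standard library. The `orders` and `blocked` tables are invented for the example and do not come from the survey.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER, customer_id INTEGER);
    CREATE TABLE blocked (customer_id INTEGER);
    INSERT INTO orders VALUES (1, 10), (2, 20), (3, NULL);
    INSERT INTO blocked VALUES (10), (NULL);
""")

# NOT IN over a set containing NULL: every comparison is false or unknown,
# so no row passes the filter, including order 2, which is not blocked.
not_in = conn.execute("""
    SELECT id FROM orders
    WHERE customer_id NOT IN (SELECT customer_id FROM blocked)
    ORDER BY id
""").fetchall()

# NOT EXISTS checks for matching rows instead, so the NULL in `blocked`
# matches nothing and orders 2 and 3 are returned.
not_exists = conn.execute("""
    SELECT id FROM orders o
    WHERE NOT EXISTS (
        SELECT 1 FROM blocked b WHERE b.customer_id = o.customer_id
    )
    ORDER BY id
""").fetchall()

# COUNT(*) counts rows; COUNT(column) skips rows where the column is NULL.
count_all, count_col = conn.execute(
    "SELECT COUNT(*), COUNT(customer_id) FROM orders"
).fetchone()

print(not_in)                # []
print(not_exists)            # [(2,), (3,)]
print(count_all, count_col)  # 3 2
```

Two queries that look equivalent return different answers once a null is present, which is consistent with the abstract's remark that the problems depend on the SQL features used.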
An active learning and training environment for database programming
Active learning facilitated through interactive, self-controlled learning environments differs substantially from traditional instructor-oriented, classroom-based teaching. We present a tool for database programming that integrates knowledge learning and skills training. How these tools are used most effectively is still an open question. Therefore, we discuss analysis and evaluation of these Web-based environments focusing on different aspects of learning behaviour and tool usage. Motivation, acceptance of the learning approach, learning organisation and actual tool usage are aspects of behaviour that require different techniques to be used
Intelligent and adaptive tutoring for active learning and training environments
Active learning facilitated through interactive and adaptive learning environments differs substantially from traditional instructor-oriented, classroom-based teaching. We present a Web-based e-learning environment that integrates knowledge learning and skills training. How these tools are used most effectively is still an open question. We propose knowledge-level interaction and adaptive feedback and guidance as central features. We discuss these features and evaluate the effectiveness of this Web-based environment, focusing on different aspects of learning behaviour and tool usage. Motivation, acceptance of the approach, learning organisation and actual tool usage are aspects of behaviour that require different evaluation techniques to be used
AKARI-CAS --- Online Service for AKARI All-Sky Catalogues
The AKARI All-Sky Catalogues are an important infrared astronomical database
for next-generation astronomy that takes over from the IRAS catalog. We have
developed an online service, AKARI Catalogue Archive Server (AKARI-CAS), for
astronomers. The service includes useful and attractive search tools and visual
tools.
One of the new features of AKARI-CAS is cached SIMBAD/NED entries, which can
match AKARI catalogs with other catalogs stored in SIMBAD or NED. To allow
advanced queries to the databases, direct input of SQL is also supported. In
those queries, fast dynamic cross-identification between registered catalogs is
a remarkable feature. In addition, multiwavelength quick-look images are
displayed in the visualization tools, which will increase the value of the
service.
In the construction of our service, we considered a wide variety of
astronomers' requirements. As a result of our discussion, we concluded that
supporting users' SQL submissions is the best solution for the requirements.
Therefore, we implemented an RDBMS layer so that it covered important
facilities including the whole processing of tables. We found that PostgreSQL
is the best open-source RDBMS product for this purpose, and we wrote code for
both simple and advanced searches as SQL stored functions. To implement
such stored functions for fast radial search and cross-identification with
minimum cost, we applied a simple technique that is not based on dividing
the celestial sphere, unlike HTM or HEALPix. In contrast, the Web application layer
became compact, and was written in simple procedural PHP code. In total, our
system realizes cost-effective maintenance and enhancements.
Comment: Yamauchi, C. et al. 2011, PASP..123..852
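The abstract does not say which technique AKARI-CAS uses for radial search, so the following is only a generic sketch of one approach that avoids partitioning the sphere. It converts positions to Cartesian unit vectors, applies a cheap bounding-box prefilter and then an exact chord-length test. The function names and the sample catalogue are hypothetical. In a database, the prefilter could be served by ordinary indexes on the three coordinates.

```python
import math


def unit_vector(ra_deg, dec_deg):
    """Convert right ascension and declination (degrees) to a unit vector."""
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    return (
        math.cos(dec) * math.cos(ra),
        math.cos(dec) * math.sin(ra),
        math.sin(dec),
    )


def radial_search(catalog, ra_deg, dec_deg, radius_deg):
    """Return names of catalog entries within radius_deg of the given position."""
    cx, cy, cz = unit_vector(ra_deg, dec_deg)
    # Straight-line (chord) distance between two unit vectors separated
    # by the angular search radius.
    chord = 2.0 * math.sin(math.radians(radius_deg) / 2.0)
    hits = []
    for name, ra, dec in catalog:
        x, y, z = unit_vector(ra, dec)
        # Bounding-box prefilter: each coordinate difference is bounded by
        # the chord, so anything outside the box cannot be a match.
        if abs(x - cx) > chord or abs(y - cy) > chord or abs(z - cz) > chord:
            continue
        # Exact test on the squared chord length.
        if (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 <= chord * chord:
            hits.append(name)
    return hits


# Hypothetical catalogue entries: (name, RA in degrees, Dec in degrees).
catalog = [
    ("A", 10.0, 20.0),
    ("B", 10.05, 20.0),
    ("C", 190.0, -20.0),
    ("D", 359.99, 0.0),
    ("E", 0.01, 0.0),
]

near_a = radial_search(catalog, 10.0, 20.0, 0.1)
near_origin = radial_search(catalog, 0.0, 0.0, 0.02)
print(near_a)       # ['A', 'B']
print(near_origin)  # ['D', 'E']
```

Working in Cartesian coordinates means the wrap-around at RA = 0/360 degrees needs no special handling, as the second query shows.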
Towards an automatic data value analysis method for relational databases
Data is becoming one of the world's most valuable resources and it is suggested that those who own the data will own the future. However, despite data being an important asset, data owners struggle to assess its value. Some recent pioneer works have led to an increased awareness of the necessity for measuring data value. They have also put forward some simple but engaging survey-based methods to help with the first-level data assessment in an organisation. However, these methods are manual and they depend on the costly input of domain experts. In this paper, we propose to extend the manual survey-based approaches with additional metrics and dimensions derived from the evolving literature on data value dimensions and tailored specifically for our use case study. We also developed an automatic, metric-based data value assessment approach that (i) automatically quantifies the business value of data in Relational Databases (RDB), and (ii) provides a scoring method that facilitates the ranking and extraction of the most valuable RDB tables. We evaluate our proposed approach on a real-world RDB database from a small online retailer (MyVolts) and show in our experimental study that the data value assessments made by our automated system match those expressed by the domain expert approach
The state of SQL-on-Hadoop in the cloud
Managed Hadoop in the cloud, especially SQL-on-Hadoop, has been gaining attention recently. On Platform-as-a-Service (PaaS), analytical services like Hive and Spark come preconfigured for general-purpose use and ready to run, giving companies a quick entry and on-demand deployment of ready SQL-like solutions for their big data needs. This study evaluates cloud services from an end-user perspective, comparing providers including Microsoft Azure, Amazon Web Services, Google Cloud, and Rackspace. The study focuses on performance, readiness, scalability, and cost-effectiveness of the different solutions at entry/test-level cluster sizes. Results are based on over 15,000 Hive queries derived from the industry standard TPC-H benchmark.
The study is framed within the ALOJA research project, which features an open source benchmarking and analysis platform that has been recently extended to support SQL-on-Hadoop engines.
The ALOJA Project aims to lower the total cost of ownership (TCO) of big data deployments and study their performance characteristics for optimization.
The study benchmarks cloud providers across a diverse range of instance types, and uses input data scales from 1GB to 1TB, in order to survey the popular entry-level PaaS SQL-on-Hadoop solutions, thereby establishing a common results-base upon which subsequent research can be carried out by the project. Initial results already show the main performance trends with respect to both hardware and software configuration, as well as the pricing, similarities and architectural differences of the evaluated PaaS solutions. Whereas some providers focus on decoupling storage and computing resources while offering network-based elastic storage, others choose to keep the local processing model from Hadoop for high performance, at the cost of reduced flexibility. Results also show the importance of application-level tuning and how keeping up-to-date hardware and software stacks can influence performance even more than replicating the on-premises model in the cloud.
This work is partially supported by the Microsoft Azure for Research program, the European Research Council (ERC) under the EU's Horizon 2020 programme (GA 639595), the Spanish Ministry of Education (TIN2015-65316-P), and the Generalitat de Catalunya (2014-SGR-1051).
Peer reviewed. Postprint (author's final draft).