
    Translating SQL to Spreadsheet: A Survey

    Spreadsheets are the most popular and most conventionally used databases today. Because spreadsheets are visual, expression-based languages, research into their features is a highly relevant topic. A spreadsheet can be viewed as a relational database: a sheet holds its information in rows, just as each table (relation) in an RDBMS does. Each row represents a record that belongs to one or more relations. Spreadsheets use formulae to extract the required information, but doing so requires expert knowledge of the tool and its usage. Spreadsheet usage can be extended in many directions, because spreadsheets offer great flexibility in how data is stored and in how stored data depends on other data. We surveyed research that has focused on spreadsheets and their applicability in different functional areas, such as data visualization and SQL engines. Our survey focuses on QUERYSHEET, ES-SQL, MDSHEET and PrediCalc [3], [5], [4], [8]. These works motivate our survey and our interest in spreadsheets and their functional extensibility.

    Troubles with Nulls, Views from the Users

    Incomplete data, in the form of null values, has been extensively studied since the inception of the relational model in the 1970s. Anecdotally, one hears that the way in which SQL, the standard language for relational databases, handles nulls creates a myriad of problems in everyday applications of database systems. To the best of our knowledge, however, the actual shortcomings of SQL in this respect, as perceived by database practitioners, have not been systematically documented, and it is not known if existing research results can readily be used to address the practical challenges. Our goal is to collect and analyze the shortcomings of nulls and their treatment by SQL, and to re-evaluate existing research in this light. To this end, we designed and conducted a survey on the everyday usage of null values among database users. From the analysis of the results we reached two main conclusions. First, null values are ubiquitous and relevant in real-life scenarios, but SQL's features designed to deal with them cause multiple problems. The severity of these problems varies depending on the SQL features used, and they cannot be reduced to a single issue. Second, foundational research on nulls is misdirected and has been addressing problems of limited practical relevance. We urge the community to view the results of this survey as a way to broaden the spectrum of their research and further bridge the theory-practice gap on null values.
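One widely known instance of the kind of problem this abstract alludes to is `NOT IN` over a subquery that contains a null. The sketch below is illustrative only and is not taken from the paper: the `orders` and `blocked` tables and the function name are hypothetical, and SQLite (via Python's standard `sqlite3` module) is used simply because it is easy to run.

```python
import sqlite3


def not_in_with_nulls():
    """Compare NOT IN and NOT EXISTS when the subquery contains a NULL.

    Table and column names are hypothetical, chosen for illustration only.
    """
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute("CREATE TABLE orders (id INTEGER, customer_id INTEGER)")
    cur.execute("CREATE TABLE blocked (customer_id INTEGER)")
    cur.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [(1, 10), (2, 20), (3, None)],
    )
    # The NULL in `blocked` is what changes the outcome of NOT IN.
    cur.executemany("INSERT INTO blocked VALUES (?)", [(10,), (None,)])

    # Under three-valued logic, `x NOT IN (10, NULL)` is never TRUE:
    # it is FALSE when x = 10 and UNKNOWN otherwise, so no row passes.
    naive = cur.execute(
        "SELECT id FROM orders "
        "WHERE customer_id NOT IN (SELECT customer_id FROM blocked) "
        "ORDER BY id"
    ).fetchall()

    # NOT EXISTS only asks whether a matching row was found, so rows
    # with no match (including the order whose customer_id is NULL) pass.
    safe = cur.execute(
        "SELECT id FROM orders o "
        "WHERE NOT EXISTS ("
        "  SELECT 1 FROM blocked b WHERE b.customer_id = o.customer_id"
        ") ORDER BY id"
    ).fetchall()

    conn.close()
    return naive, safe
```

Calling `not_in_with_nulls()` returns an empty list for the `NOT IN` query and `[(2,), (3,)]` for the `NOT EXISTS` query, even though both are often read as "orders whose customer is not blocked".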

    An active learning and training environment for database programming

    Active learning facilitated through interactive, self-controlled learning environments differs substantially from traditional instructor-oriented, classroom-based teaching. We present a tool for database programming that integrates knowledge learning and skills training. How these tools are used most effectively is still an open question. Therefore, we discuss analysis and evaluation of these Web-based environments focusing on different aspects of learning behaviour and tool usage. Motivation, acceptance of the learning approach, learning organisation and actual tool usage are aspects of behaviour that require different techniques to be used

    Intelligent and adaptive tutoring for active learning and training environments

    Active learning facilitated through interactive and adaptive learning environments differs substantially from traditional instructor-oriented, classroom-based teaching. We present a Web-based e-learning environment that integrates knowledge learning and skills training. How these tools are used most effectively is still an open question. We propose knowledge-level interaction and adaptive feedback and guidance as central features. We discuss these features and evaluate the effectiveness of this Web-based environment, focusing on different aspects of learning behaviour and tool usage. Motivation, acceptance of the approach, learning organisation and actual tool usage are aspects of behaviour that require different evaluation techniques to be used

    AKARI-CAS --- Online Service for AKARI All-Sky Catalogues

    The AKARI All-Sky Catalogues are an important infrared astronomical database for next-generation astronomy and succeed the IRAS catalog. We have developed an online service, the AKARI Catalogue Archive Server (AKARI-CAS), for astronomers. The service includes useful and attractive search tools and visual tools. One of the new features of AKARI-CAS is cached SIMBAD/NED entries, which can match AKARI catalogs with other catalogs stored in SIMBAD or NED. To allow advanced queries to the databases, direct input of SQL is also supported. In those queries, fast dynamic cross-identification between registered catalogs is a remarkable feature. In addition, multiwavelength quick-look images are displayed in the visualization tools, which will increase the value of the service. In the construction of our service, we considered a wide variety of astronomers' requirements. As a result of our discussion, we concluded that supporting users' SQL submissions is the best solution for the requirements. Therefore, we implemented an RDBMS layer that covers the important facilities, including the whole processing of tables. We found that PostgreSQL is the best open-source RDBMS product for this purpose, and we wrote the code for both simple and advanced searches as SQL stored functions. To implement such stored functions for fast radial search and cross-identification at minimum cost, we applied a simple technique that is not based on dividing the celestial sphere, unlike HTM or HEALPix. In contrast, the Web application layer became compact and was written in simple procedural PHP code. In total, our system realizes cost-effective maintenance and enhancements. Comment: Yamauchi, C. et al. 2011, PASP..123..852
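The abstract does not say which technique the authors used for radial search. As a rough illustration of what a search without sphere tessellation can look like, the sketch below converts positions to Cartesian unit vectors, applies a cheap per-axis prefilter, and then checks the exact chord distance. This is an assumption made for illustration, not the AKARI-CAS implementation; the function names and the `(name, ra_deg, dec_deg)` catalogue layout are hypothetical.

```python
import math


def unit_vector(ra_deg, dec_deg):
    """Convert right ascension and declination (degrees) to a unit vector."""
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    return (
        math.cos(dec) * math.cos(ra),
        math.cos(dec) * math.sin(ra),
        math.sin(dec),
    )


def radial_search(catalog, ra_deg, dec_deg, radius_deg):
    """Return names of catalogue entries within radius_deg of a position.

    `catalog` is an iterable of (name, ra_deg, dec_deg) tuples; this
    layout is hypothetical and chosen only for the example.
    """
    cx, cy, cz = unit_vector(ra_deg, dec_deg)
    # Two unit vectors separated by angle theta are 2*sin(theta/2) apart.
    chord = 2.0 * math.sin(math.radians(radius_deg) / 2.0)
    hits = []
    for name, ra, dec in catalog:
        x, y, z = unit_vector(ra, dec)
        # Per-axis prefilter: in a database this would be three range
        # conditions that ordinary indexes on x, y and z can serve.
        if abs(x - cx) > chord or abs(y - cy) > chord or abs(z - cz) > chord:
            continue
        # Exact test on the squared chord length.
        if (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 <= chord * chord:
            hits.append(name)
    return hits
```

Working in Cartesian coordinates avoids special handling at the RA = 0/360 degree boundary and near the poles, which is one reason the representation is convenient for this kind of query.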

    Towards an automatic data value analysis method for relational databases

    Data is becoming one of the world’s most valuable resources and it is suggested that those who own the data will own the future. However, despite data being an important asset, data owners struggle to assess its value. Some recent pioneering works have led to an increased awareness of the necessity for measuring data value. They have also put forward some simple but engaging survey-based methods to help with the first-level data assessment in an organisation. However, these methods are manual and they depend on the costly input of domain experts. In this paper, we propose to extend the manual survey-based approaches with additional metrics and dimensions derived from the evolving literature on data value dimensions and tailored specifically for our use case study. We also developed an automatic, metric-based data value assessment approach that (i) automatically quantifies the business value of data in Relational Databases (RDB), and (ii) provides a scoring method that facilitates the ranking and extraction of the most valuable RDB tables. We evaluate our proposed approach on a real-world relational database from a small online retailer (MyVolts) and show in our experimental study that the data value assessments made by our automated system match those expressed by the domain expert approach.

    The state of SQL-on-Hadoop in the cloud

    Managed Hadoop in the cloud, especially SQL-on-Hadoop, has been gaining attention recently. On Platform-as-a-Service (PaaS), analytical services like Hive and Spark come preconfigured for general-purpose use and ready to run, giving companies a quick entry point and on-demand deployment of ready SQL-like solutions for their big data needs. This study evaluates cloud services from an end-user perspective, comparing providers including Microsoft Azure, Amazon Web Services, Google Cloud, and Rackspace. The study focuses on performance, readiness, scalability, and cost-effectiveness of the different solutions at entry/test-level cluster sizes. Results are based on over 15,000 Hive queries derived from the industry standard TPC-H benchmark. The study is framed within the ALOJA research project, which features an open source benchmarking and analysis platform that has been recently extended to support SQL-on-Hadoop engines. The ALOJA Project aims to lower the total cost of ownership (TCO) of big data deployments and study their performance characteristics for optimization. The study benchmarks cloud providers across a diverse range of instance types, and uses input data scales from 1GB to 1TB, in order to survey the popular entry-level PaaS SQL-on-Hadoop solutions, thereby establishing a common results-base upon which subsequent research can be carried out by the project. Initial results already show the main performance trends for both hardware and software configurations, as well as the pricing, similarities and architectural differences of the evaluated PaaS solutions. Whereas some providers focus on decoupling storage and computing resources while offering network-based elastic storage, others choose to keep the local processing model from Hadoop for high performance, at the cost of reduced flexibility. Results also show the importance of application-level tuning and how keeping up-to-date hardware and software stacks can influence performance even more than replicating the on-premises model in the cloud. This work is partially supported by the Microsoft Azure for Research program, the European Research Council (ERC) under the EU's Horizon 2020 programme (GA 639595), the Spanish Ministry of Education (TIN2015-65316-P), and the Generalitat de Catalunya (2014-SGR-1051).