Usable and Scalable Querying of Scientific Datasets
Scientists and engineers often have to analyze and query multiple large databases. For example, analysis of databases created by phasor measurement units (PMUs) can provide insight into the health of the power grid, thereby improving control over operations. Realizing this data-driven control, however, requires validating, processing, and storing massive amounts of PMU data efficiently, which current systems do not always achieve. Furthermore, to use these systems, users must know formal query languages, such as SQL, as well as the structure and content of the database. Scientists, however, are usually unfamiliar with query languages and with the content and structure of the databases. Finally, the information relevant to most queries is spread across multiple data sources, each of which represents information in a distinct form. Traditionally, users have had to hand-code rules to integrate the data from these sources into one database with a homogeneous structure. This takes a great deal of time and effort, and end-users often lack the programming background and expertise required to write and maintain these rules. To address these challenges, we propose novel methods to query multiple large databases easily and efficiently. We also describe a PMU data management system that supports input from multiple PMU data streams, features an event-detection algorithm, and provides an efficient method for retrieving archival data. To be more usable, database systems offer keyword query interfaces that do not require users to know formal query languages or the schema and content of the database. Because keyword queries are inherently ambiguous, it is challenging for database systems to answer them precisely. Using extensive empirical studies, we show that users explore and learn to formulate more precise keyword queries over the course of their interaction with the database system.
We propose an effective and efficient online learning algorithm that adapts to this user learning during the interaction and comes with convergence guarantees. Furthermore, we set forth a novel approach that progressively learns rules for integrating and querying multiple databases from end-user feedback. In our framework, each data source learns to translate its information into a form compatible with the other data sources. We show that our method delivers effective rules using a modest number of interactions with the end-user.