
    Reasoning & Querying – State of the Art

    Various query languages for Web and Semantic Web data, both for practical use and as an area of research in the scientific community, have emerged in recent years. At the same time, the broad adoption of the internet, where keyword search underlies many applications such as search engines, has familiarized casual users with keyword queries as a way to retrieve information. Unlike this easy-to-use style of querying, traditional query languages require knowledge of both the language itself and the data to be queried. Keyword-based query languages for XML and RDF bridge the gap between the two, aiming to enable simple querying of semi-structured data, which is relevant, for example, in the context of the emerging Semantic Web. This article presents an overview of the field of keyword querying for XML and RDF.
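
    To make the contrast concrete, here is a minimal sketch, not taken from the article, of keyword-based retrieval over a small RDF-style triple store: the user supplies only keywords and the matcher returns any triple that mentions all of them, whereas a structured query would require knowing predicate names and graph shape in advance. The triples and the helper below are illustrative assumptions.

```python
# Minimal illustration of keyword querying over RDF-style triples.
# The data and the helper below are illustrative, not from the article.

triples = [
    ("ex:Berlin", "ex:capitalOf", "ex:Germany"),
    ("ex:Berlin", "rdfs:label", "Berlin"),
    ("ex:Germany", "ex:population", "83000000"),
]

def keyword_query(keywords, store):
    """Return triples in which every keyword occurs in some component."""
    hits = []
    for s, p, o in store:
        text = " ".join((s, p, o)).lower()
        if all(kw.lower() in text for kw in keywords):
            hits.append((s, p, o))
    return hits

# A casual user types keywords; no schema knowledge is required.
print(keyword_query(["berlin", "capital"], triples))
# A structured language (e.g. SPARQL) would instead need the exact
# predicate: SELECT ?c WHERE { ?c ex:capitalOf ex:Germany }
```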

    Database Queries that Explain their Work

    Provenance for database queries or scientific workflows is often motivated as providing explanation (increasing understanding of the underlying data sources and processes used to compute the query) and reproducibility (the capability to recompute the results on different inputs, possibly specialized to a part of the output). Many provenance systems claim to provide such capabilities; however, most lack formal definitions or guarantees of these properties, while others provide formal guarantees only for relatively limited classes of changes. Building on recent work on provenance traces and slicing for functional programming languages, we introduce a detailed tracing model of provenance for the multiset-valued Nested Relational Calculus, define trace slicing algorithms that extract the subtraces needed to explain or recompute specific parts of the output, and define query slicing and differencing techniques that support explanation. We state and prove correctness properties for these techniques and present a proof-of-concept implementation in Haskell. Comment: PPDP 201
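
    The following is a much simpler sketch of the underlying idea of data provenance, not the paper's trace-slicing model for the Nested Relational Calculus: each input row carries an identifier, and query operators propagate the set of identifiers that contributed to each output value, so a result can be traced back to its sources. All names and data are assumptions for illustration.

```python
# Toy where-provenance: selection and projection propagate, for each
# output row, the set of input-row identifiers it was derived from.
# Data and operator names are illustrative assumptions.

rows = [
    {"id": "r1", "name": "alice", "dept": "cs"},
    {"id": "r2", "name": "bob",   "dept": "math"},
    {"id": "r3", "name": "carol", "dept": "cs"},
]

def select(pred, table):
    # A selected row's provenance is just its own identifier.
    return [(row, {row["id"]}) for row in table if pred(row)]

def project(fields, annotated):
    # Projection keeps the provenance annotation unchanged.
    return [({f: row[f] for f in fields}, prov) for row, prov in annotated]

result = project(["name"], select(lambda r: r["dept"] == "cs", rows))
for value, prov in result:
    print(value, "derived from", prov)
# {'name': 'alice'} derived from {'r1'}
# {'name': 'carol'} derived from {'r3'}
```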

    Study of various data mining techniques

    The advent of computing technology has significantly influenced our lives, and two major areas of this influence are Business Data Processing and Scientific Computing. During the initial years of developing computer techniques for business, computer professionals were concerned with designing files to store data so that information could be retrieved efficiently. There were restrictions on the storage size available for data and on the speed of accessing it. Needless to say, the activity was restricted to very few, highly qualified professionals. Then came the era in which the task was simplified by the DBMS [1]. Responsibility for intricate tasks, such as the declarative aspects of the program, was passed on to the database administrator, and users could pose their queries in simpler notations such as query languages.

    Federated Query Processing

    Big data plays a relevant role in promoting both manufacturing and scientific development through industrial digitization and emerging interdisciplinary research. Semantic web technologies have also experienced great progress, and scientific communities and practitioners have contributed to the problem of big data management with ontological models, controlled vocabularies, linked datasets, data models, query languages, and tools for transforming big data into knowledge from which decisions can be made. Despite the significant impact of big data and semantic web technologies, we are entering a new era in which domains like genomics are projected to grow very rapidly over the next decade. In this new era, integrating big data demands novel and scalable tools that enable not only big data ingestion and curation but also efficient large-scale exploration and discovery. Federated query processing techniques provide a solution that scales up to large volumes of data distributed across multiple data sources. These techniques rely on source descriptions to identify the data sources relevant to a query, and to find efficient execution plans that minimize the total execution time of a query and maximize the completeness of the answers. This chapter summarizes the main characteristics of a federated query engine, reviews the current state of the field, and outlines the problems that remain open and represent grand challenges for the area.
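
    As a minimal sketch of the source-selection step that a federated engine performs, assume each source description simply lists the predicates that source can answer; the engine then routes each triple pattern of a query only to the sources whose descriptions cover it. The endpoints, predicates, and query below are invented for illustration and are not from the chapter.

```python
# Sketch of source selection in a federated query engine: route each
# triple pattern to the sources whose descriptions cover its predicate.
# Source descriptions and the query are illustrative assumptions.

source_descriptions = {
    "endpoint_A": {"ex:gene", "ex:encodes"},
    "endpoint_B": {"ex:encodes", "ex:interactsWith"},
}

query_patterns = [
    ("?g", "ex:gene", "?name"),
    ("?g", "ex:encodes", "?protein"),
]

def select_sources(patterns, descriptions):
    routing = {}
    for pattern in patterns:
        predicate = pattern[1]
        routing[pattern] = [
            src for src, preds in descriptions.items() if predicate in preds
        ]
    return routing

for pattern, sources in select_sources(query_patterns, source_descriptions).items():
    print(pattern, "->", sources)
# A real engine would then build an execution plan that joins the
# sub-results while minimizing total execution time and maximizing
# answer completeness.
```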

    The crustal dynamics intelligent user interface anthology

    The National Space Science Data Center (NSSDC) has initiated an Intelligent Data Management (IDM) research effort which has, as one of its components, the development of an Intelligent User Interface (IUI). The intent of the IUI is to provide a friendly and intelligent user interface service based on expert systems and natural language processing technologies. The purpose of such a service is to support the large number of potential scientific and engineering users who need space- and land-related research and technical data but have little or no experience with query languages or understanding of the information content or architecture of the databases of interest. This document presents the design concepts, development approach, and performance evaluation of a prototype IUI system for the Crustal Dynamics Project Database, which was developed using a microcomputer-based expert system tool (M.1), the natural language query processor THEMIS, and the graphics software system GSS. The IUI design is based on a multiple-view representation of a database from both the user and database perspectives, with intelligent processes to translate between the views.
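
    As a rough illustration of the multiple-view idea, an intelligent interface can keep a dictionary from user-oriented vocabulary to database fields and rewrite a casually phrased request into a structured query. The vocabulary mapping, table, and column names below are invented for this sketch and are not drawn from the Crustal Dynamics system.

```python
# Illustrative translation from a user-view request to a database-view
# query; the vocabulary mapping and schema names are assumptions.

user_to_db = {
    "station": "SITE_ID",
    "baseline length": "BASELINE_LEN_M",
    "observation date": "OBS_DATE",
}

def translate(user_fields, table):
    columns = ", ".join(user_to_db[f] for f in user_fields)
    return f"SELECT {columns} FROM {table}"

print(translate(["station", "baseline length"], "CRUSTAL_OBS"))
# -> SELECT SITE_ID, BASELINE_LEN_M FROM CRUSTAL_OBS
```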

    The design and implementation of a meaning driven data query language

    We present the design and implementation of a Meaning Driven Data Query Language (MDDQL), which aims at constructing queries through system-made suggestions of natural language based query terms, covering both scientific application domain terms and operators/operations. A query construction blackboard is used, where query language terms are suggested to the user in the user's preferred natural language and in a name-centered way, together with their connotations. This helps in understanding the meaning of the terms and/or operators or operations to be included in the query. Furthermore, the construction of the query becomes an incremental refinement of the query under construction through semantic constraints, where only those domain language terms and/or operators/operations are suggested that result in meaningful combinations of query terms with respect to the scientific application domain semantics. Therefore, semantically meaningless queries can be prevented during query construction. Such a semantics-aware mechanism is not available in conventional database query languages such as SQL, where one is allowed to execute a query calculating, for example, the average of numerical data values even though they represent codes of categorical values. Moreover, the end-user needs no familiarity with the semantics of complex database schemes, with the interpretation of the symbols (names of classes/tables/attributes, value codes) underlying the storage model, or with the syntax of a database-specific query language. The constructed query can be submitted to the MDDQL query interpretation and transformation engine, where the corresponding SQL query is generated and delegated to a DBMS (e.g., Oracle, MS Access, SQL Server). Generation of SQL statements addressing NF2 data models, such as those provided by the object-relational Oracle DBMS, is also enabled. The query result is presented in a table-based form in which all storage model symbols are interpreted, and it can be exported for use with statistical software packages (e.g., SPSS).
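
    The abstract's own example, averaging numeric codes that actually encode categories, can be made concrete with a small type-aware query builder. The schema annotations and class names below are assumptions for illustration and are not MDDQL's actual data structures.

```python
# Sketch of a semantics-aware query builder that rejects meaningless
# aggregations at construction time; the schema annotations are assumed.

schema = {
    "age":        {"kind": "numeric"},
    "blood_type": {"kind": "categorical"},  # stored as integer codes 1..4
}

class MeaninglessQueryError(Exception):
    pass

def build_avg_query(column, table):
    if schema[column]["kind"] != "numeric":
        raise MeaninglessQueryError(
            f"AVG({column}) is not meaningful: its values are categorical codes"
        )
    return f"SELECT AVG({column}) FROM {table}"

print(build_avg_query("age", "patients"))      # accepted
try:
    build_avg_query("blood_type", "patients")  # rejected before reaching SQL
except MeaninglessQueryError as err:
    print(err)
```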

    Data Vaults: a Database Welcome to Scientific File Repositories

    Efficient management and exploration of high-volume scientific file repositories have become pivotal for advancement in science. We propose to demonstrate the Data Vault, an extension of the database system architecture that transparently opens scientific file repositories for efficient in-database processing and exploration. The Data Vault facilitates science data analysis using high-level declarative languages, such as traditional SQL and the novel array-oriented SciQL. Data of interest are loaded from the attached repository in a just-in-time manner, without the need for up-front data ingestion. The demo is built around concrete implementations of the Data Vault for two scientific use cases: seismic time series and Earth observation images. The seismic Data Vault uses queries submitted by the audience to illustrate the internals of the Data Vault by revealing the mechanisms of dynamic query plan generation and on-demand external data ingestion. The image Data Vault shows an application view from the perspective of data mining researchers.
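
    A highly simplified sketch of the just-in-time idea follows: data stay in the external repository and are ingested only when a query first touches them. The repository contents, cache, and loader are assumptions made for illustration and are unrelated to the Data Vault's actual internals.

```python
# Sketch of just-in-time loading from a file repository: nothing is
# ingested up front; a file is parsed only when a query first needs it.
# Repository contents and the loader are illustrative assumptions.

repository = {
    "seis_2011_01.txt": "1.2 0.9 1.4",
    "seis_2011_02.txt": "2.1 1.8 2.0",
}

cache = {}  # in-database representation, filled on demand

def load(filename):
    if filename not in cache:
        print(f"ingesting {filename} on demand")
        cache[filename] = [float(x) for x in repository[filename].split()]
    return cache[filename]

def query_max(filename):
    return max(load(filename))

print(query_max("seis_2011_01.txt"))  # first query triggers ingestion
print(query_max("seis_2011_01.txt"))  # served from already-loaded data
```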

    Object-relational spatio-temporal databases

    We present an object-relational model for uniform handling of dimensional data. Spatial, temporal, spatio-temporal, and ordinary data are special cases of dimensional data. This uniformity is achieved through the concept of dimension alignment, which automatically allows lower-dimensional data and queries to be used in a higher-dimensional context. Unlike ordinary data, dimensional objects are interwoven. We introduce object identity (oid) fragments to circumvent data redundancy at the logical level. Computed types are placed appropriately in a type hierarchy to allow maximal use of existing methods. A query language for spatio-temporal data is presented for associative navigation, and a framework for algebraic optimization of the query language is suggested. A pattern matching language is designed for complex querying of spatio-temporal data, seamlessly extending the associative navigation in our query language. The pattern matching language recognizes special features of time and space, providing an appropriate level of abstraction for application development compared to traditional languages. This reduces the need for embedding the query language in a lower-level language such as C++. The pattern matching language is also dimensionally extensible. It allows querying of data with multiple granularities and of continuous data, and it provides hooks for direct querying of scientific data (observations). Our model is dimensionally extensible and is also an extension of a relational model for dimensional data. Moreover, dimensionality and the addition of oids are mutually orthogonal concepts. Thus, starting from classical ordinary data, one may migrate to higher forms of relational or object-relational data in any sequence, without having to recode application software. Our model does not deal with complex objects, which is left as a future extension.
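
    As a rough sketch of what dimension alignment could look like in practice, ordinary (non-temporal) facts can be promoted to the temporal dimension as valid-for-all-time tuples so that they combine uniformly with temporal data. The relations and interval representation below are invented for illustration and are not the dissertation's actual model.

```python
# Sketch of dimension alignment: ordinary tuples are promoted to the
# temporal dimension so a temporal join treats both inputs uniformly.
# Data and interval representation are illustrative assumptions.

ALWAYS = (float("-inf"), float("inf"))

departments = [{"dept": "geology", "building": "B2"}]                      # ordinary data
assignments = [{"dept": "geology", "head": "kim", "valid": (2001, 2005)}]  # temporal data

def align(ordinary_rows):
    """Promote ordinary tuples to the temporal dimension."""
    return [dict(row, valid=ALWAYS) for row in ordinary_rows]

def overlap(a, b):
    return max(a[0], b[0]) < min(a[1], b[1])

# The temporal join now works uniformly on aligned and temporal inputs.
joined = [
    dict(d, **a) for d in align(departments) for a in assignments
    if d["dept"] == a["dept"] and overlap(d["valid"], a["valid"])
]
print(joined)
# [{'dept': 'geology', 'building': 'B2', 'valid': (2001, 2005), 'head': 'kim'}]
```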

    Generic functional requirements for a NASA general-purpose data base management system

    Generic functional requirements for a general-purpose, multi-mission data base management system (DBMS) for application to remotely sensed scientific data bases are detailed. The motivation for utilizing DBMS technology in this environment is explained. The major requirements include: (1) a DBMS for scientific observational data; (2) a multi-mission capability; (3) user-friendliness; (4) extensive and integrated information about the data; (5) robust languages for defining data structures and formats; (6) scientific data types and structures; (7) flexible physical access mechanisms; (8) ways of representing spatial relationships; (9) a high-level, nonprocedural, interactive query and data manipulation language; (10) data base maintenance utilities; (11) high-rate input/output and large-volume data storage; and (12) adaptability to a distributed data base and/or data base machine configuration. Detailed functions are specified in a top-down hierarchic fashion. Implementation, performance, and support requirements are also given.

    Multiple Retrieval Models and Regression Models for Prior Art Search

    This paper presents the system PATATRAS (PATent and Article Tracking, Retrieval and AnalysiS), built for the IP track of CLEF 2009. Our approach has three main characteristics: (1) the use of multiple retrieval models (KL, Okapi) and term index definitions (lemma, phrase, concept) for the three languages considered in the track (English, French, German), producing ten different sets of ranked results; (2) the merging of the different results based on multiple regression models, using an additional validation set created from the patent collection; and (3) the exploitation of patent metadata and of the citation structures for creating restricted initial working sets of patents and for producing a final re-ranking regression model. As we exploit the specific metadata of the patent documents and the citation relations only when creating the initial working sets and during the final post-ranking step, our architecture remains generic and easy to extend.
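
    As a minimal sketch of merging multiple ranked result sets with a learned model, assume each retrieval run scores documents and that a regression fitted on a validation set supplies the combination weights; the runs, scores, and weights below are invented for illustration and are not PATATRAS's actual values.

```python
# Sketch of merging several retrieval runs with learned linear weights:
# each run scores documents; the weights stand in for what a regression
# model fitted on validation data might produce. All values are invented.

runs = {
    "okapi_lemma": {"patent_A": 0.82, "patent_B": 0.40},
    "kl_phrase":   {"patent_A": 0.55, "patent_B": 0.61},
    "kl_concept":  {"patent_B": 0.73},
}

weights = {"okapi_lemma": 0.5, "kl_phrase": 0.3, "kl_concept": 0.2}

def merge(runs, weights):
    combined = {}
    for run, scores in runs.items():
        for doc, score in scores.items():
            combined[doc] = combined.get(doc, 0.0) + weights[run] * score
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)

print(merge(runs, weights))  # final ranking over the merged candidate set
```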