Beyond relational databases: preserving the data

Author: Faria Luís
Ferreira Bruno
Ferreira Miguel
Ramalho José Carlos
Publication venue: 'Informa UK Limited'
Publication date: 01/01/2020

Relational databases are one of the main technologies supporting information assets in today’s organizations. They are designed to store, organize and retrieve digital information, and are such a fundamental part of information systems that most would not be able to function without them. Very often, the information contained in databases is irreplaceable or prohibitively expensive to reacquire; therefore, steps must be taken to ensure that the information within databases is preserved. This paper describes a methodology for long-term preservation of relational databases based on information extraction and format migration to a preservation format. It also presents a tool that was developed to support this methodology, the Database Preservation Toolkit (DBPTK), as well as the processes and formats needed to preserve databases. DBPTK connects to live relational databases and extracts information into formats better suited to long-term preservation. Supported preservation formats include SIARD 2, which was created through a cooperation between the Swiss Federal Archives and the E-ARK project and is becoming a standard in the area. DBPTK has a flexible plugin-based architecture that enables its use for other purposes, such as database upgrades and database migration between different systems. The real case scenarios presented demonstrate the usefulness, correctness and performance of the tool. The initial E-ARK project was in part supported by the European Commission within the Competitiveness and Innovation Programme 2007–2013, Grant Agreement no. 620998 under the Policy Support Programme.
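
The extract-and-migrate idea in this abstract can be illustrated with a minimal sketch. It is not DBPTK and does not produce SIARD 2: it assumes a SQLite source and uses a hypothetical `export_database` helper that writes the schema as a JSON-ready dictionary and each table as CSV text, two plain formats chosen here only for illustration.

```python
import csv
import io
import json
import sqlite3


def export_database(conn):
    """Return (metadata, tables): a schema description plus CSV text per table.

    Hypothetical helper for illustration only; not part of DBPTK.
    """
    cur = conn.cursor()
    names = [row[0] for row in cur.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    metadata = {"tables": []}
    tables = {}
    for name in names:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        columns = [{"name": c[1], "type": c[2]}
                   for c in cur.execute(f'PRAGMA table_info("{name}")')]
        metadata["tables"].append({"name": name, "columns": columns})
        # Serialize the rows as CSV, header first.
        buffer = io.StringIO()
        writer = csv.writer(buffer, lineterminator="\n")
        writer.writerow([c["name"] for c in columns])
        writer.writerows(cur.execute(f'SELECT * FROM "{name}"'))
        tables[name] = buffer.getvalue()
    return metadata, tables


# Stand-in for a live database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO author VALUES (?, ?)", [(1, "Ada"), (2, "Grace")])

metadata, tables = export_database(conn)
print(json.dumps(metadata, indent=2))
print(tables["author"])
```

Keeping the schema description separate from the row data means the content stays readable without the original database engine, which is the property a preservation format is meant to provide.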

Universidade do Minho: RepositoriUM

Crossref

ANSWERING GEOSPARQL QUERIES OVER RELATIONAL DATA

Author: G. Xiao
K. Bereta
M. Koubarakis
Publication date: 01/07/2017

In this paper we present the system Ontop-spatial, which answers GeoSPARQL queries on top of geospatial relational databases by performing on-the-fly GeoSPARQL-to-SQL translation using ontologies and mappings. GeoSPARQL is a geospatial extension of the query language SPARQL, standardized by OGC for querying geospatial RDF data. Our approach goes beyond relational databases and covers all data that can have a relational structure, even at the logical level. Our purpose is to enable on-the-fly GeoSPARQL querying that integrates multiple geospatial sources, without converting and materializing the original data as RDF and then storing them in a triple store. This approach is more suitable in cases where the original datasets are stored in large relational databases (or, more generally, in files with relational structure) and/or are frequently updated.
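
As a rough illustration of mapping-driven query translation, the toy sketch below turns a single triple pattern `?s <predicate> ?o` into SQL over a relational table instead of materializing RDF. It is not Ontop-spatial's algorithm: the `MAPPINGS` dictionary, the `example.org` predicates, the `city` table and the `translate` function are all assumptions made for this example, and no spatial functions are handled.

```python
import sqlite3

# Hypothetical mapping: predicate IRI -> (table, subject column, object column).
MAPPINGS = {
    "http://example.org/hasName": ("city", "id", "name"),
    "http://example.org/hasWkt": ("city", "id", "wkt"),
}


def translate(predicate):
    """Translate the triple pattern `?s <predicate> ?o` into a SQL query."""
    table, subject_col, object_col = MAPPINGS[predicate]
    # A NULL column value corresponds to no triple, so those rows are skipped.
    return (f"SELECT {subject_col} AS s, {object_col} AS o "
            f"FROM {table} WHERE {object_col} IS NOT NULL")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE city (id INTEGER PRIMARY KEY, name TEXT, wkt TEXT)")
conn.execute("INSERT INTO city VALUES (1, 'Athens', 'POINT(23.73 37.98)')")
conn.execute("INSERT INTO city VALUES (2, 'Unknown', NULL)")

sql = translate("http://example.org/hasWkt")
rows = conn.execute(sql).fetchall()
print(sql)
print(rows)
```

Because the query is rewritten at request time, updates to the `city` table are visible immediately, with no RDF export step to repeat.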

Directory of Open Access Journals

Open Access Repository

Infinite Probabilistic Databases

Author: Grohe Martin
Lindner Peter
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 23rd International Conference on Database Theory (ICDT 2020)
Publication date: 01/01/2020

Probabilistic databases (PDBs) are used to model uncertainty in data in a quantitative way. In the standard formal framework, PDBs are finite probability spaces over relational database instances. It has been argued convincingly that this is not compatible with an open-world semantics (Ceylan et al., KR 2016) and with application scenarios that are modeled by continuous probability distributions (Dalvi et al., CACM 2009). We recently introduced a model of PDBs as infinite probability spaces that addresses these issues (Grohe and Lindner, PODS 2019). While that work was mainly concerned with countably infinite probability spaces, our focus here is on uncountable spaces. Such an extension is necessary to model typical continuous probability distributions that appear in many applications. However, an extension beyond countable probability spaces raises nontrivial foundational issues concerned with the measurability of events and queries and ultimately with the question whether queries have a well-defined semantics. It turns out that so-called finite point processes are the appropriate model from probability theory for dealing with probabilistic databases. This model allows us to construct suitable (uncountable) probability spaces of database instances in a systematic way. Our main technical results are measurability statements for relational algebra queries as well as aggregate queries and Datalog queries.
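
To make the finite point process idea concrete, the sketch below samples one database instance as a finite collection of facts whose number is random and whose attribute values come from a continuous distribution. It is only an illustration under assumptions chosen here (a Poisson-distributed number of facts, a Gaussian attribute, a single hypothetical `Temperature` relation) and is not the construction from the paper.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw a Poisson(rate) count with Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_instance(rate, rng):
    """Sample one instance: a finite list of Temperature facts."""
    size = sample_poisson(rate, rng)
    # Each fact carries a real-valued attribute, so the space of possible
    # instances is uncountable even though every instance is finite.
    return [("Temperature", rng.gauss(20.0, 5.0)) for _ in range(size)]


rng = random.Random(0)
instance = sample_instance(3.0, rng)
print(len(instance), instance)

# A simple aggregate query evaluated on the sampled instance.
count_above_20 = sum(1 for _, value in instance if value > 20.0)
print(count_above_20)
```

Estimating the probability of a query answer then amounts to repeating the sampling step and averaging, which only makes sense if the query is a measurable function of the instance.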

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Infinite Probabilistic Databases

Author: Grohe Martin
Lindner Peter
Publication date: 29/07/2021

Probabilistic databases (PDBs) model uncertainty in data in a quantitative way. In the established formal framework, probabilistic (relational) databases are finite probability spaces over relational database instances. This finiteness can clash with intuitive query behavior (Ceylan et al., KR 2016), and with application scenarios that are better modeled by continuous probability distributions (Dalvi et al., CACM 2009). We formally introduced infinite PDBs in (Grohe and Lindner, PODS 2019) with a primary focus on countably infinite spaces. However, an extension beyond countable probability spaces raises nontrivial foundational issues concerned with the measurability of events and queries and ultimately with the question whether queries have a well-defined semantics. We argue that finite point processes are an appropriate model from probability theory for dealing with general probabilistic databases. This allows us to construct suitable (uncountable) probability spaces of database instances in a systematic way. Our main technical results are measurability statements for relational algebra queries as well as aggregate queries and Datalog queries. Comment: This is the full version of the paper "Infinite Probabilistic Databases" presented at ICDT 2020 (arXiv:1904.06766).

arXiv.org e-Print Archive

Episciences.org

Directory of Open Access Journals

Teaching Tip: Teaching NoSQL Databases in a Database Course for Business Students

Author: Wang Hai
Wang Shouhong
Publication venue: AIS Electronic Library (AISeL)
Publication date: 15/03/2023

NoSQL databases have been used in organizations for decades. Few database textbooks on the market, however, offer materials on NoSQL that go beyond general introductions and are suitable for typical business students. In fact, users of the typical NoSQL systems on the software market need to have certain computer programming skills. This teaching tip introduces a small unit on NoSQL databases in a traditional database course for students in all business majors. The unit uses a Microsoft Excel-based NoSQL database example to explain the basics of NoSQL, describes the four essential types of NoSQL databases, and discusses representative NoSQL database management systems on the software market. As this unit does not require computer programming skills, it can be easily integrated into an existing relational database course for business students. The unit was tested twice. Students demonstrated positive first-hand practical experience with NoSQL, beyond its general concepts.
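
The abstract does not name the four types. Assuming it means the commonly cited key-value, document, wide-column and graph models, the sketch below shows how the same customer data might be shaped in each, using plain Python structures rather than any real NoSQL product. All keys, field names and the `neighbours` helper are made up for illustration.

```python
# Key-value: an opaque value looked up by a unique key.
key_value = {"cust:1": "Ada Lovelace"}

# Document: a self-describing, possibly nested record per key.
document = {
    "cust:1": {"name": "Ada Lovelace", "orders": [{"sku": "A7", "qty": 2}]},
}

# Wide-column: row key -> column family -> column -> value.
wide_column = {
    "cust:1": {"profile": {"name": "Ada Lovelace"}, "stats": {"orders": 1}},
}

# Graph: nodes plus labeled, directed edges between them.
graph = {
    "nodes": {"cust:1", "sku:A7"},
    "edges": [("cust:1", "BOUGHT", "sku:A7")],
}


def neighbours(graph, node, label):
    """Return the targets of edges with the given label leaving `node`."""
    return [dst for src, lbl, dst in graph["edges"] if src == node and lbl == label]


print(key_value["cust:1"])
print(document["cust:1"]["orders"][0]["qty"])
print(wide_column["cust:1"]["stats"]["orders"])
print(neighbours(graph, "cust:1", "BOUGHT"))
```

None of the four shapes requires a fixed table schema, which is the main contrast with the relational model taught in the rest of such a course.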

AIS Electronic Library (AISeL)

DBRepo: a Semantic Digital Repository for Relational Databases

Author: Ganguly Raman
Gergely Eva
Michlits Cornelia
Rauber Andreas
Staudinger Moritz
Stytsenko Kirill
Weise Martin
Publication venue: University of Edinburgh
Publication date: 15/06/2022

Data curation is a complex, multi-faceted task. While dedicated data stewards are starting to take care of these activities in close collaboration with researchers for many types of (usually file-based) data in many institutions, this is rarely yet the case for data held in relational databases. Beyond large-scale infrastructures hosting e.g. climate or genome data, researchers usually have to create, build and maintain their database, care about security patches, and feed data into it in order to use it in their research. Data curation, if at all, usually happens after a project is finished, when data may be exported for digital preservation into file repository systems. We present DBRepo, a semantic digital repository for relational databases in a private cloud setting designed to (1) host research data stored in relational databases right from the beginning of a research project, (2) provide separation of concerns, allowing the researchers to focus on the domain aspects of the data and their work while bringing in experts to handle classic data management tasks, (3) improve findability, accessibility and reusability by offering semantic mapping of metadata attributes, and (4) focus on reproducibility in dynamically evolving data by supporting versioning and precise identification/cite-ability for arbitrary subsets of data.
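
Point (4), versioning plus precise identification of data subsets, can be sketched as follows. This is not DBRepo's implementation: it assumes a hypothetical `reading` table in SQLite whose rows carry integer `valid_from` and `valid_to` version markers, and it fingerprints a query result with SHA-256 so that a cited subset can be re-executed and checked later.

```python
import hashlib
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reading (sensor TEXT, value REAL, "
    "valid_from INTEGER, valid_to INTEGER)")
conn.executemany("INSERT INTO reading VALUES (?, ?, ?, ?)", [
    ("s1", 20.5, 1, 3),     # superseded at version 3
    ("s1", 21.0, 3, None),  # current value for s1
    ("s2", 18.2, 2, None),  # current value for s2
])


def subset_as_of(conn, version):
    """Return the rows that were valid at the given version, in a fixed order."""
    return conn.execute(
        "SELECT sensor, value FROM reading "
        "WHERE valid_from <= ? AND (valid_to IS NULL OR valid_to > ?) "
        "ORDER BY sensor", (version, version)).fetchall()


def result_hash(rows):
    """Fingerprint a result set so a later re-execution can be verified."""
    return hashlib.sha256(repr(rows).encode("utf-8")).hexdigest()


cited_rows = subset_as_of(conn, 2)
cited_hash = result_hash(cited_rows)
print(cited_rows, cited_hash[:12])

# The data keep evolving, yet re-running the cited query at version 2
# still reproduces exactly the same subset.
conn.execute("INSERT INTO reading VALUES ('s3', 19.9, 4, NULL)")
assert result_hash(subset_as_of(conn, 2)) == cited_hash
```

Storing the query, the version marker and the hash together gives a citation that identifies an arbitrary subset without copying the data out of the database.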

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

International Journal of Digital Curation