
192 research outputs found

Innovative Evaluation System – IESM: An Architecture for the Database Management System for Mobile Application

Author: Arumugam Vidyapriyadarshini
Joan Lu
Sundaram Aswin
Publication date: 01/07/2011

Mobile applications have developed rapidly in recent years, especially in academic settings such as the student response systems [1-8] used in universities and other educational institutions, yet no effective and scalable database management system has been reported that supports fast and reliable data storage and retrieval. This paper presents a database management architecture for an Innovative Evaluation System based on mobile learning applications. The need for a relatively stable, independent and extensible data model for faster data storage and retrieval is analyzed and investigated. The paper concludes by emphasizing the need for further investigation into high throughput, so as to support multimedia data such as video clips, images and documents.

CiteSeerX

University of Huddersfield Repository

Towards Business Intelligence over Unified Structured and Unstructured Data Using XML

Author: Vishu Krishnamurthy
Zhen Hua Liu
Publication venue: 'IntechOpen'
Publication date: 01/02/2012

IntechOpen

A Nine Month Progress Report on an Investigation into Mechanisms for Improving Triple Store Performance

Author: Owens Alisdair
Publication venue: s.n.

This report considers the requirement for fast, efficient, and scalable triple stores as part of the effort to produce the Semantic Web. It summarises relevant information in the major background field of Database Management Systems (DBMS), and provides an overview of the techniques currently in use amongst the triple store community. The report concludes that for individuals and organisations to be willing to provide large amounts of information as openly-accessible nodes on the Semantic Web, storage and querying of the data must be cheaper and faster than it is currently. Experiences from the DBMS field can be used to maximise triple store performance, and suggestions are provided for lines of investigation in areas of storage, indexing, and query optimisation. Finally, work packages are provided describing expected timetables for further study of these topics
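The abstract above names storage and indexing as levers for triple store performance without describing a specific design. As a minimal sketch of one indexing approach that is common in the triple store literature (not necessarily one the report evaluates), the Python below keeps three permutation indexes (subject-predicate-object, predicate-object-subject, object-subject-predicate) so that a pattern with any bound term can start from a dictionary lookup instead of a full scan. The class name `TinyTripleStore` and the sample data are hypothetical.

```python
from collections import defaultdict


class TinyTripleStore:
    """Illustrative in-memory triple store with three permutation indexes."""

    def __init__(self):
        # Each index maps first key -> second key -> set of third values.
        self.spo = defaultdict(lambda: defaultdict(set))
        self.pos = defaultdict(lambda: defaultdict(set))
        self.osp = defaultdict(lambda: defaultdict(set))

    def add(self, s, p, o):
        # Every triple is written to all three indexes (space traded for speed).
        self.spo[s][p].add(o)
        self.pos[p][o].add(s)
        self.osp[o][s].add(p)

    def match(self, s=None, p=None, o=None):
        """Yield triples matching the pattern; None acts as a wildcard."""
        if s is not None:
            # Subject bound: start from the SPO index.
            for p2, objs in self.spo.get(s, {}).items():
                if p is not None and p2 != p:
                    continue
                for o2 in objs:
                    if o is None or o2 == o:
                        yield (s, p2, o2)
        elif p is not None:
            # Predicate bound, subject free: start from the POS index.
            for o2, subs in self.pos.get(p, {}).items():
                if o is not None and o2 != o:
                    continue
                for s2 in subs:
                    yield (s2, p, o2)
        elif o is not None:
            # Only the object is bound: start from the OSP index.
            for s2, preds in self.osp.get(o, {}).items():
                for p2 in preds:
                    yield (s2, p2, o)
        else:
            # Nothing bound: full scan.
            for s2, by_pred in self.spo.items():
                for p2, objs in by_pred.items():
                    for o2 in objs:
                        yield (s2, p2, o2)


store = TinyTripleStore()
store.add("alice", "knows", "bob")
store.add("alice", "worksAt", "acme")
store.add("carol", "knows", "bob")

who_knows_bob = sorted(store.match(p="knows", o="bob"))
print(who_knows_bob)
```

The trade-off in this sketch is that every triple is stored three times, which is the kind of storage-versus-query-speed tension the report's lines of investigation are concerned with.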

Southampton (e-Prints Soton)

Investigation into Indexing XML Data Techniques

Author: Joan Lu
Klaib Alhadi
Publication date: 21/07/2014

The rapid development of XML technology has improved the WWW: XML data has many advantages and has become a common technology for transferring data across the internet. The objective of this research is therefore to investigate XML indexing techniques in terms of their structures. The main goal of this investigation is to identify the main limitations of these techniques and any other open issues. The research considers the most common XML indexing techniques and compares them, and then discusses the limitations this comparison reveals. In conclusion, the main problem of all the XML indexing techniques is the trade-off between the size and the efficiency of the indexes: all the indexes become large in order to perform well, and none of them is suitable for all users' requirements. However, each of these techniques has some advantages.

University of Huddersfield Repository

Enabling Graph Analysis Over Relational Databases

Author: Xirogiannopoulos Konstantinos
Publication date: 01/01/2019

Complex interactions and systems can be modeled by analyzing the connections between underlying entities or objects described by a dataset. These relationships form networks (graphs), the analysis of which has been shown to provide tremendous value in areas ranging from retail to many scientific domains. This value is obtained by using various methodologies from network science, a field that focuses on studying network representations in the real world. In particular, "graph algorithms", which iteratively traverse a graph's connections, are often leveraged to gain insights. To take advantage of the opportunity presented by graph algorithms, a variety of specialized graph data management systems and analysis frameworks have been proposed in recent years, and these have made significant advances in efficiently storing and analyzing graph-structured data. Most datasets, however, currently reside not in these specialized systems but in general-purpose relational database management systems (RDBMS). A relational or similarly structured system is typically governed by a schema of varying strictness that implements constraints and is meticulously designed for the specific enterprise. Such structured datasets contain many relationships between the entities therein, which can be seen as latent or "hidden" graphs that exist inherently inside the datasets. However, these relationships can typically be traversed only by running expensive JOINs in SQL or similar languages. Thus, for users to traverse these latent graphs efficiently and conduct analysis, data needs to be transformed and migrated to specialized systems. This creates barriers that hinder and discourage graph analysis; our vision is to break these barriers. In this dissertation we investigate the opportunities and challenges involved in efficiently leveraging relationships within data stored in structured databases.
First, we present GraphGen, a lightweight software layer that is independent from the underlying database, and provides interfaces for graph analysis of data in RDBMSs. GraphGen is the first such system that introduces an intuitive high-level language for specifying graphs of interest, and utilizes in-memory graph representations to tackle the problems associated with analyzing graphs that are hidden inside structured datasets. We show GraphGen can analyze such graphs in orders of magnitude less memory, and often computation time, while eliminating manual Extract-Transform-Load (ETL) effort. Second, we examine how in-memory graph representations of RDBMS data can be used to enhance relational query processing. We present a novel, general framework for executing GROUP BY aggregation over conjunctive queries which avoids materialization of intermediate JOIN results, and wrap this framework inside a multi-way relational operator called Join-Agg. We show that Join-Agg can compute aggregates over a class of relational and graph queries using orders of magnitude less memory and computation time
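To make the "latent graph" idea in the abstract above concrete, here is a minimal Python sketch. It is not GraphGen's language or API, which the abstract does not show. It assumes a hypothetical `author_paper` table in SQLite and treats co-authorship as the hidden graph. Instead of materializing every co-author pair with a self-join, it reads the table once into a bipartite in-memory map and derives neighbors on demand.

```python
import sqlite3
from collections import defaultdict


def load_condensed_graph(conn):
    """Scan the (author, paper) rows once and keep them as a bipartite map."""
    papers_of = defaultdict(set)   # author -> papers they wrote
    authors_of = defaultdict(set)  # paper -> authors who wrote it
    for author, paper in conn.execute("SELECT author, paper FROM author_paper"):
        papers_of[author].add(paper)
        authors_of[paper].add(author)
    return papers_of, authors_of


def neighbors(author, papers_of, authors_of):
    """Co-authors of `author`, derived on demand from the bipartite map."""
    result = set()
    for paper in papers_of.get(author, ()):
        result |= authors_of[paper]
    result.discard(author)
    return result


# Hypothetical sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE author_paper (author TEXT, paper TEXT)")
conn.executemany(
    "INSERT INTO author_paper VALUES (?, ?)",
    [("ann", "p1"), ("bob", "p1"), ("cat", "p1"), ("bob", "p2"), ("dan", "p2")],
)

papers_of, authors_of = load_condensed_graph(conn)
coauthors_of_bob = sorted(neighbors("bob", papers_of, authors_of))

# The equivalent SQL self-join materializes one row per co-author pair.
joined = conn.execute(
    "SELECT DISTINCT b.author FROM author_paper a "
    "JOIN author_paper b ON a.paper = b.paper "
    "WHERE a.author = 'bob' AND b.author <> 'bob' ORDER BY b.author"
).fetchall()
coauthors_via_join = [row[0] for row in joined]
print(coauthors_of_bob, coauthors_via_join)
```

In this sketch the bipartite map grows with the number of table rows, whereas the joined edge list grows with the number of co-author pairs per paper. That difference illustrates, in miniature, the kind of memory saving the abstract attributes to in-memory graph representations.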

Digital Repository at the University of Maryland

ROOT - A C++ Framework for Petabyte Data Storage, Statistical Analysis and Visualization

Author: Antcheva Ilka
Ballintijn Maarten
Bellenot Bertrand
Biskup Marek
Brun Rene
Buncic Nenad
Canal Philippe
Casadei Diego
Couet Olivier
Fine Valery
Franco Leandro
Ganis Gerardo
Gheata Andrei
Goto Masaharu
Iwaszkiewicz Jan
Kreshuk Anna
Maline David Gonzalez
Maunder Richard
Moneta Lorenzo
Naumann Axel
Offermann Eddy
Onuchin Valeriy
Panacek Suzanne
Rademakers Fons
Russo Paul
Segura Diego Marcos
Tadel Matevz
Publication venue: 'Elsevier BV'
Publication date: 31/08/2015

ROOT is an object-oriented C++ framework conceived in the high-energy physics (HEP) community, designed for storing and analyzing petabytes of data in an efficient way. Any instance of a C++ class can be stored into a ROOT file in a machine-independent compressed binary format. In ROOT the TTree object container is optimized for statistical data analysis over very large data sets by using vertical data storage techniques. These containers can span a large number of files on local disks, the web, or a number of different shared file systems. In order to analyze this data, the user can choose from a wide set of mathematical and statistical functions, including linear algebra classes, numerical algorithms such as integration and minimization, and various methods for performing regression analysis (fitting). In particular, ROOT offers packages for complex data modeling and fitting, as well as multivariate classification based on machine learning techniques. Central to these analysis tools are the histogram classes, which provide binning of one- and multi-dimensional data. Results can be saved in high-quality graphical formats like PostScript and PDF or in bitmap formats like JPG or GIF. The result can also be stored into ROOT macros that allow a full recreation and rework of the graphics. Users typically create their analysis macros step by step, making use of the interactive C++ interpreter CINT, while running over small data samples. Once the development is finished, they can run these macros at full compiled speed over large data sets, using on-the-fly compilation, or by creating a stand-alone batch program. Finally, if processing farms are available, the user can reduce the execution time of intrinsically parallel tasks (e.g. data mining in HEP) by using PROOF, which will take care of optimally distributing the work over the available resources in a transparent way.

arXiv.org e-Print Archive

CERN Document Server

GPU-based JSON data processing using structural indexes

Author: Vlaswinkel Koen R.
Publication date: 05/08/2021

Pure OAI Repository

Accelerating data retrieval steps in XML documents

Author: Shen Yun
Publication date: 01/01/2005

Repository@Hull - Worktribe

Demystifying Graph Databases: Analysis and Taxonomy of Data Organization, System Designs, and Graph Queries

Author: Alonso Gustavo
Barthels Claude
Besta Maciej
Fischer Marc
Gerstenberger Robert
Hoefler Torsten
Peter Emanuel
Podstawski Michał
Publication date: 04/11/2022

Graph processing has become an important part of multiple areas of computer science, such as machine learning, computational sciences, medical applications, social network analysis, and many others. Numerous graphs such as web or social networks may contain up to trillions of edges. Often, these graphs are also dynamic (their structure changes over time) and have domain-specific rich data associated with vertices and edges. Graph database systems such as Neo4j enable storing, processing, and analyzing such large, evolving, and rich datasets. Due to the sheer size of such datasets, combined with the irregular nature of graph processing, these systems face unique design challenges. To facilitate the understanding of this emerging domain, we present the first survey and taxonomy of graph database systems. We focus on identifying and analyzing fundamental categories of these systems (e.g., triple stores, tuple stores, native graph database systems, or object-oriented systems), the associated graph models (e.g., RDF or Labeled Property Graph), data organization techniques (e.g., storing graph data in indexing structures or dividing data into records), and different aspects of data distribution and query execution (e.g., support for sharding and ACID). 51 graph database systems are presented and compared, including Neo4j, OrientDB, or Virtuoso. We outline graph database queries and relationships with associated domains (NoSQL stores, graph streaming, and dynamic graph algorithms). Finally, we describe research and engineering challenges to outline the future of graph databases

arXiv.org e-Print Archive