174,236 research outputs found

Storage Solutions for Big Data Systems: A Qualitative Study and Comparison

Author: Alam Mansaf
Ali Syed Arshad
Khan Samiya
Liu Xiufeng
Publication date: 01/01/2019

Big data systems development is full of challenges in view of the variety of application areas and domains that this technology promises to serve. Typically, fundamental design decisions involved in big data systems design include choosing appropriate storage and computing infrastructures. In this age of heterogeneous systems that integrate different technologies for an optimized solution to a specific real-world problem, big data systems are no exception. As far as the storage aspect of any big data system is concerned, the primary facet is the storage infrastructure, and NoSQL seems to be the right technology to fulfill its requirements. However, every big data application has variable data characteristics and thus, the corresponding data fits into a different data model. This paper presents a feature and use case analysis and comparison of the four main data models, namely document-oriented, key-value, graph and wide-column. Moreover, a feature analysis of 80 NoSQL solutions has been provided, elaborating on the criteria and points that a developer must consider while making a possible choice. Typically, big data storage needs to communicate with the execution engine and other processing and visualization technologies to create a comprehensive solution. This brings the second facet of big data storage, big data file formats, into the picture. The second half of the research paper compares the advantages, shortcomings and possible use cases of available big data file formats for Hadoop, which is the foundation for most big data computing technologies. Decentralized storage and blockchain are seen as the next generation of big data storage, and their challenges and future prospects have also been discussed.

arXiv.org e-Print Archive

Using a Logic Programming Framework to Control Database Query Dialogues in Natural Language

Author: D.H.D. Warren
E. Bick
H. Kamp
I. Androutsopoulos
L. Quintano
S. Abreu
S.P. Abreu
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2006

We present a natural language question/answering system that interfaces with the University of Évora databases and uses clarification dialogs to resolve ambiguous user questions. It was developed in an integrated logic programming framework, based on constraint logic programming using the GnuProlog(-cx) language [2,11] and the ISCO framework [1]. The use of this LP framework allows the integration of Prolog-like inference mechanisms with classes and inheritance and constraint solving algorithms, and provides the connection with relational databases, such as PostgreSQL. The system focuses on the pragmatic analysis of questions, to handle ambiguity, and on an efficient dialogue mechanism, which is able to pose relevant questions to clarify the user's intentions in a straightforward manner. Proper noun resolution and the pp-attachment problem are also handled. This paper briefly presents this innovative system, focusing on its ability to correctly determine the user's intention through its dialogue capability.

Repositório Científico da Universidade de Évora

User Defined Types and Nested Tables in Object Relational Databases

Author: Byrne Bernadette
Garvey Mary
Publication date: 01/06/2006

Bernadette Byrne, Mary Garvey, ‘User Defined Types and Nested Tables in Object Relational Databases’, paper presented at the United Kingdom Academy for Information Systems 2006: Putting Theory into Practice, Cheltenham, UK, 5-7 June, 2006. There has been much research and work into incorporating objects into databases with a number of object databases being developed in the 1980s and 1990s. During the 1990s the concept of object relational databases became popular, with object extensions to the relational model. As a result, several relational databases have added such extensions. There has been little in the way of formal evaluation of object relational extensions to commercial database systems. In this work an airline flight logging system, a real-world database application, was taken and a database developed using a regular relational database and again using object relational extensions, allowing the evaluation of the relational extensions. Peer reviewed.

Towards a Novel Cooperative Logistics Information System Framework

Author: Amanton Laurent
Sanlaville Eric
Zaidi Fares
Publication date: 08/07/2018

Supply chains and logistics have a growing importance in the global economy. Supply Chain Information Systems around the world are heterogeneous, and each one can both produce and receive massive amounts of structured and unstructured data in real time, usually generated by information systems, connected objects or manually by humans. This heterogeneity arises because Logistics Information Systems components and processes are developed with different modelling methods and run on many platforms; hence, decision making is difficult in such a multi-actor environment. In this paper we identify some current challenges and integration issues between separately designed Logistics Information Systems (LIS), and we propose a Distributed Cooperative Logistics Platform (DCLP) framework based on NoSQL, which facilitates real-time cooperation between stakeholders and improves the decision-making process in a multi-actor environment. We also include a case study of a Hospital Supply Chain (HSC), and a brief discussion of perspectives and the future scope of work.

arXiv.org e-Print Archive

Clinical metagenomics.

Author: Chiu Charles Y
Miller Steven A
Publication venue: eScholarship, University of California
Publication date: 01/06/2019

Clinical metagenomic next-generation sequencing (mNGS), the comprehensive analysis of microbial and host genetic material (DNA and RNA) in samples from patients, is rapidly moving from research to clinical laboratories. This emerging approach is changing how physicians diagnose and treat infectious disease, with applications spanning a wide range of areas, including antimicrobial resistance, the microbiome, human host gene expression (transcriptomics) and oncology. Here, we focus on the challenges of implementing mNGS in the clinical laboratory and address potential solutions for maximizing its impact on patient care and public health.

eScholarship - University of California

A unified view of data-intensive flows in business intelligence systems: a survey

Author: Abelló Gamazo Alberto
Jovanovic Petar
Romero Moral Óscar
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

Data-intensive flows are central processes in today’s business intelligence (BI) systems, deploying different technologies to deliver data, from a multitude of data sources, in user-preferred and analysis-ready formats. To meet the complex requirements of next generation BI systems, we often need an effective combination of the traditionally batched extract-transform-load (ETL) processes that populate a data warehouse (DW) from integrated data sources, and more real-time and operational data flows that integrate source data at runtime. Both academia and industry thus must have a clear understanding of the foundations of data-intensive flows and the challenges of moving towards next generation BI environments. In this paper we present a survey of today’s research on data-intensive flows and the related fundamental fields of database theory. The study is based on a proposed set of dimensions describing the important challenges of data-intensive flows in the next generation BI setting. As a result of this survey, we envision an architecture of a system for managing the lifecycle of data-intensive flows. The results further provide a comprehensive understanding of data-intensive flows, recognizing challenges that are still to be addressed, and how the current solutions can be applied for addressing these challenges. Peer Reviewed. Postprint (author's final draft).