246 research outputs found
Growth of relational model: Interdependence and complementary to big data
A database management system is a constant application of science that provides a platform for the creation, movement, and use of voluminous data. The area has witnessed a series of developments and technological advancements from its conventional structured database to the recent buzzword, bigdata. This paper aims to provide a complete model of a relational database that is still being widely used because of its well known ACID properties namely, atomicity, consistency, integrity and durability. Specifically, the objective of this paper is to highlight the adoption of relational model approaches by bigdata techniques. Towards addressing the reason for this in corporation, this paper qualitatively studied the advancements done over a while on the relational data model. First, the variations in the data storage layout are illustrated based on the needs of the application. Second, quick data retrieval techniques like indexing, query processing and concurrency control methods are revealed. The paper provides vital insights to appraise the efficiency of the structured database in the unstructured environment, particularly when both consistency and scalability become an issue in the working of the hybrid transactional and analytical database management system
Index structures for distributed text databases
The Web has became an obiquitous resource for distributed computing making it relevant to investigate new ways of providing efficient access to services available at dedicated sites. Efficiency is an ever-increasing demand which can be only satisfied with the development of parallel algorithms which are efficient in practice. This tutorial paper focuses on the design, analysis and implementation of parallel algorithms and data structures for widely-used text database applications on the Web. In particular we describe parallel algorithms for inverted files and suffix arrays structures that are suitable for implementing search engines. Algorithmic design is effected on top of the BSP model of parallel computing. This model ensures portability across diverse parallel architectures ranging from clusters to super-computers.Facultad de Informátic
Index structures for distributed text databases
The Web has became an obiquitous resource for distributed computing making it relevant to investigate new ways of providing efficient access to services available at dedicated sites. Efficiency is an ever-increasing demand which can be only satisfied with the development of parallel algorithms which are efficient in practice. This tutorial paper focuses on the design, analysis and implementation of parallel algorithms and data structures for widely-used text database applications on the Web. In particular we describe parallel algorithms for inverted files and suffix arrays structures that are suitable for implementing search engines. Algorithmic design is effected on top of the BSP model of parallel computing. This model ensures portability across diverse parallel architectures ranging from clusters to super-computers.Facultad de Informátic
Index structures for distributed text databases
The Web has became an obiquitous resource for distributed computing making it relevant to investigate new ways of providing efficient access to services available at dedicated sites. Efficiency is an ever-increasing demand which can be only satisfied with the development of parallel algorithms which are efficient in practice. This tutorial paper focuses on the design, analysis and implementation of parallel algorithms and data structures for widely-used text database applications on the Web. In particular we describe parallel algorithms for inverted files and suffix arrays structures that are suitable for implementing search engines. Algorithmic design is effected on top of the BSP model of parallel computing. This model ensures portability across diverse parallel architectures ranging from clusters to super-computers.Facultad de Informátic
Manufacturing equipment data collection framework
Estágio realizado na Qimonda Portugal, S. A. e orientado pelo Eng.º Nuno SoaresTese de mestrado integrado. Engenharia Informática e Computação. Faculdade de Engenharia. Universidade do Porto. 200
Recommended from our members
Distributed Inverted Files and Performance: A Study of Parallelism and Data Distribution Methods in IR
The study investigates the performance of parallel information retrieval (IR) algorithms on different data distribution methods for Inverted files to identify which is the best for the requirements of specific IR tasks. We define a data distribution method as a way of distributing Inverted file data to local disks on a parallel machine. A data distribution method may be on-the-fly (with one copy of the index held), replication (all nodes have all of the index) or partitioning (data for index is split amongst nodes). Partitioning of inverted file data can be done in many ways but we consider only two: by term (Termld) and by document (Dodd). Termld partitioning is a type of partitioning which distributes unique word data to a single partition, while D odd partitioning distributes unique document data to a single partition. We consider the issue of improving the performance of standard IR algorithms on these data distribution methods by looking at sequential job service not concurrent job service, e.g. we consider the issue of sequential query service not concurrent query service. This methodology rules out some distribution methods for some tasks studied. We consider the following main tasks of IR: indexing, search, passage retrieval, inverted file update and query optimisation for routing /filtering. We produce a synthetic performance model for each of these tasks for the purposes of comparison. We have two subsidiary aims; one was to demonstrate portability of our implemented data structures and algorithms on different parallel machines. Secondly, we also study the possibility of increased retrieval effectiveness by examining a larger section of the search space for both passage retrieval and routing/filtering. We consider the implications of concurrency in updates on Inverted files. Our theoretical and empirical results show that in most cases the D odd partitioning method is the best data distribution method apart from routing/filtering where replication was found to be superior
An Introduction to Database Systems
This textbook introduces the basic concepts of database systems. These concepts are presented through numerous examples in modeling and design. The material in this book is geared to an introductory course in database systems offered at the junior or senior level of Computer Science. It could also be used in a first year graduate course in database systems, focusing on a selection of the advanced topics in the latter chapters
- …