
725 research outputs found

Expressions for Batched Searching of Sequential and Hierarchical Files

Author: NC DOCKS at The University of North Carolina at Greensboro
Palvia Prashant
Publication date: 01/01/1985

Batching yields significant savings in access costs in sequential, tree-structured, and random files. A direct and simple expression is developed for computing the average number of records/pages accessed to satisfy a batched query of a sequential file. The advantages of batching for sequential and random files are discussed. A direct equation is provided for the number of nodes accessed in unbatched queries of hierarchical files. An exact recursive expression is developed for node accesses in batched queries of hierarchical files. In addition to the recursive relationship, good closed-form upper- and lower-bound approximations are provided for the case of batched queries of hierarchical files.
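A minimal sketch of the kind of saving this abstract describes, and not the paper's own expression: assume a sequential file of n records, a batch of k distinct requested records chosen uniformly at random, and a scan that starts at the first record and stops at the last requested one. Under those assumptions the expected scan length for the batch is k(n+1)/(k+1), against k(n+1)/2 when each request triggers its own scan. All function names below are illustrative.

```python
from fractions import Fraction
from itertools import combinations


def unbatched_scan_cost(n, k):
    """Expected records read when each of k requests runs its own scan.

    Assumption: each scan starts at record 1 and stops at its target,
    which is uniform on 1..n, so one scan reads (n + 1) / 2 on average.
    """
    return Fraction(k * (n + 1), 2)


def batched_scan_cost(n, k):
    """Expected records read when k distinct requests share one scan.

    Assumption: the scan stops at the largest requested position; the
    expected maximum of k distinct uniform positions is k(n + 1)/(k + 1).
    """
    return Fraction(k * (n + 1), k + 1)


def batched_scan_cost_brute_force(n, k):
    """Check the closed form by enumerating every possible batch."""
    batches = list(combinations(range(1, n + 1), k))
    return Fraction(sum(max(batch) for batch in batches), len(batches))


if __name__ == "__main__":
    n, k = 10, 3
    print("unbatched:", unbatched_scan_cost(n, k))
    print("batched:  ", batched_scan_cost(n, k))
    print("enumerated:", batched_scan_cost_brute_force(n, k))
```

For n = 10 and k = 3 the shared scan reads 33/4 records on average, half of the 33/2 read by three separate scans; the gap widens as k grows.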

The University of North Carolina at Greensboro

Batched Searching in Database Organizations

Author: NC DOCKS at The University of North Carolina at Greensboro
Palvia Prashant
Publication date: 01/01/1988

Savings in the number of page accesses due to batching on sequential, tree-structured, and random files are well known and have been reported in the literature. This paper asserts that substantial savings can also be obtained in database organizations by batching the requests for records (in queries), and also by batching intermediate processing requests while traversing the database. A simple database having two interrelated files is used to demonstrate such savings. For the simple database, three variations on batching are reported and compared with the case of unbatched requests. New mathematical expressions have been developed for the batched cases as well as for the unbatched case, and the savings are demonstrated with some example problems. As an extension, larger databases will enjoy even greater savings due to batching. The paper also discusses several strategies for applying the batching approach to current databases, and the advantages of emerging very large main memories for the batching approach.

The University of North Carolina at Greensboro

The Effect of Buffer Size on Pages Accessed in Random Files

Author: NC DOCKS at The University of North Carolina at Greensboro
Palvia Prashant
Publication date: 01/01/1988

Prior work on estimating the number of pages (blocks) accessed from secondary memory to retrieve a given number of records for a query has ignored the effect of main-memory buffer size. While this may have no adverse impact in special cases, in most cases a limited buffer size increases the number of page accesses. This paper describes the reasons for the impact of a limited buffer size and develops new expressions for the number of pages accessed. The accuracy of the expressions is evaluated by simulation modeling, and the effects of limited buffer size are discussed. Analytical work in database analysis and design should use the new expressions, especially when the effect of the buffer size is significant.
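For orientation, one well-known estimate that takes no account of buffer size is the formula usually attributed to Yao: when k distinct records are drawn uniformly from n records stored p per page, the expected number of distinct pages touched is m(1 - C(n-p, k)/C(n, k)), with m = n/p pages. Whether this is among the prior works the abstract has in mind is an assumption, and the sketch below is not one of the paper's new expressions. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def expected_pages_accessed(n, p, k):
    """Expected distinct pages touched by k distinct records drawn
    uniformly from n records stored p per page (buffer size ignored)."""
    if n % p != 0:
        raise ValueError("n must be a multiple of p")
    if not 0 <= k <= n:
        raise ValueError("k must be between 0 and n")
    m = n // p  # number of pages in the file
    # Probability that a given page holds none of the k selected records.
    miss = Fraction(comb(n - p, k), comb(n, k))
    return m * (1 - miss)


def expected_pages_brute_force(n, p, k):
    """Check the closed form by enumerating every possible selection."""
    batches = list(combinations(range(n), k))
    total = sum(len({record // p for record in batch}) for batch in batches)
    return Fraction(total, len(batches))


if __name__ == "__main__":
    print(expected_pages_accessed(4, 2, 2))
    print(expected_pages_brute_force(4, 2, 2))
```

The estimate counts each page once however often it is revisited, which is the same as assuming a page stays available for the whole query. That is the buffer-related simplification the abstract says earlier estimates make.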

The University of North Carolina at Greensboro

Estimating disk head movement in batched searching

Author: B. Shneiderman
C. K. Wong
F. W. Burton
J. B. Rothnie
J. G. Kollias
J. Zahorian
P. Palvia
S. B. Yao
S. Christodoulakis
S. J. Waters
T. J. Teorey
Y. Manolopoulos
Y. P. Manolopoulos
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Parallel Working-Set Search Structures

Author: Akhremtsev Yaroslav
Crauser A.
Frias Leonor
Oyama Y.
Richard
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 11/07/2018

In this paper we present two versions of a parallel working-set map on p processors that supports searches, insertions and deletions. In both versions, the total work of all operations when the map has size at least p is bounded by the working-set bound, i.e., the cost of an item depends on how recently it was accessed (for some linearization): accessing an item in the map with recency r takes O(1+log r) work. In the simpler version each map operation has O((log p)^2+log n) span (where n is the maximum size of the map). In the pipelined version each map operation on an item with recency r has O((log p)^2+log r) span. (Operations in parallel may have overlapping span; span is additive only for operations in sequence.) Both data structures are designed to be used by a dynamic multithreading parallel program that at each step executes a unit-time instruction or makes a data structure call. To achieve the stated bounds, the pipelined data structure requires a weak-priority scheduler, which supports a limited form of 2-level prioritization. At the end we explain how the results translate to practical implementations using work-stealing schedulers. To the best of our knowledge, this is the first parallel implementation of a self-adjusting search structure where the cost of an operation adapts to the access sequence. A corollary of the working-set bound is that it achieves work static optimality: the total work is bounded by the access costs in an optimal static search tree. Comment: Authors' version of a paper accepted to SPAA 201

arXiv.org e-Print Archive

Crossref

Real-Time Stream Processing in Embedded Systems

Author: Mei Haitao
Publication venue: University of York
Publication date: 01/09/2017

Modern real-time embedded systems often involve computationally intensive data processing algorithms to meet their application requirements. As a result, there has been an increase in the use of multiprocessor platforms. The stream processing programming model aims to facilitate the construction of concurrent data processing programs to exploit the parallelism available on these architectures. However, most current stream processing frameworks or languages are not designed for use in real-time systems, let alone systems that might also have hard real-time control algorithms. This thesis contends that a generic architecture of a real-time stream processing infrastructure can be created to support predictable processing of both batched and live streaming data sources, and integrated with hard real-time control algorithms. The thesis first reviews relevant stream processing techniques, and identifies the open issues. Then a real-time stream processing task model, and an architecture for supporting that model, are proposed. An approach to the integration of stream processing tasks into a real-time environment that also has hard real-time components is presented. Data is processed in parallel using execution-time servers allocated to each core. An algorithm is presented for selecting the parameters of the servers that maximises their capacities (within an overall deadline) and ensures that hard real-time components remain schedulable. Response-time analysis is derived to guarantee that the real-time requirements (deadlines for batched data processing, and latency for each data item for live data) for the stream processing activity are met. A framework, called SPRY, is implemented to support the proposed real-time stream processing architecture. The framework supports fully-partitioned applications that are scheduled using fixed priority-based scheduling techniques. A case study based on a modified Generic Avionics Platform is given to demonstrate the overall approach. Finally, the evaluation shows that the presented approach provides better schedulability than alternative approaches.

White Rose E-theses Online

Ubik--a framework for the development of distributed organizations

Author: De Jong Stephen Peter
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/1989

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1989. Includes bibliographical references (leaves 206-210). By Stephen Peter de Jong. Ph.D.

DSpace@MIT


Network Structures, Concurrency, and Interpretability: Lessons from the Development of an AI Enabled Graph Database System

Author: Cooper Hal James
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2020

This thesis describes the development of the SmartGraph, an AI enabled graph database. The need for such a system has been independently recognized in the isolated fields of graph databases, graph computing, and computational graph deep learning systems, such as TensorFlow. Though prior works have investigated some relationships between these fields, we believe that the SmartGraph is the first system designed from conception to incorporate the most significant and useful characteristics of each. Examples include the ability to store graph structured data, run analytics natively on this data, and run gradient descent algorithms. It is the synergistic aspects of combining these fields that provide the most novel results presented in this dissertation. Key among them is how the notion of “graph querying” as used in graph databases can be used to solve a problem that has plagued deep learning systems since their inception; rather than attempting to embed graph structured datasets into restrictive vector spaces, we instead allow the deep learning functionality of the system to natively perform graph querying in memory during optimization as a way of interpreting (and learning) the graph. This results in a concept of natural and interpretable processing of graph structured data. Graph computing systems have traditionally used distributed computing across multiple compute nodes (e.g. separate machines connected via Ethernet or internet) to deal with large-scale datasets whilst working sequentially on problems over entire datasets. In this dissertation, we outline a distributed graph computing methodology that facilitates all the above capabilities (even in an environment consisting of a single physical machine) while allowing for a workflow more typical of a graph database than a graph computing system; massive concurrent access allowing for arbitrarily asynchronous execution of queries and analytics across the entire system. 
Further, we demonstrate how this methodology is key to the artificial intelligence capabilities of the system.

Columbia University Academic Commons

Information handling: Concepts which emerged in practical situations and are analysed cybernetically

Author: Hibbs Genevieve Mary
Publication venue: Brunel University School of Engineering and Design PhD Theses
Publication date: 01/01/1990

This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University.

OpenGrey Repository

Brunel University Research Archive