93 research outputs found
Parallel Access of Out-Of-Core Dense Extendible Arrays
Datasets used in scientific and engineering applications are often modeled as dense multi-dimensional arrays. For very large datasets, the corresponding array models are typically stored out-of-core as array files. The array elements are mapped onto consecutive linear locations that correspond to the linear ordering of the multi-dimensional indices. The two conventional mappings are row-major order and column-major order. Such conventional mappings of dense array files severely limit both the performance of applications and the extendibility of the dataset. Firstly, an array file organized in, say, row-major order gives abysmal performance to applications that subsequently access the data in column-major order. Secondly, any subsequent expansion of the array file is limited to only one dimension. Expanding such conventional out-of-core arrays along arbitrary dimensions requires storage reorganization that can be very expensive. We present a solution for storing out-of-core dense extendible arrays that resolves both limitations. The method uses a mapping function F*(), together with information maintained in axial vectors, to compute the linear address of an extendible array element when passed its k-dimensional index. We also give the inverse function, F⁻¹*(), for deriving the k-dimensional index when given the linear address. We show how the mapping function, in combination with MPI-IO and a parallel file system, allows the extendible array to grow without reorganization and without significant performance degradation for applications accessing elements in any desired order. We give methods for reading and writing sub-arrays into and out of parallel applications that run on a cluster of workstations. The axial vectors are replicated and maintained in each node that accesses sub-array elements.
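The idea of an address mapping driven by axial vectors can be illustrated with a minimal in-memory sketch. This is not the paper's F*() or its file layout: the class name `ExtendibleArray`, the record layout `(start_index, start_address, strides)`, and the rule of choosing the candidate record with the largest start address are assumptions made for illustration. Each extension appends one new segment at the end of the address space, so addresses of existing elements never change.

```python
from bisect import bisect_right
from math import prod


class ExtendibleArray:
    """Hypothetical sketch of a dense k-dimensional extendible array mapping."""

    def __init__(self, shape):
        self.shape = list(shape)
        strides = self._strides(self.shape, major=0)  # plain row-major block
        # axial[d] holds one record per extension along dimension d:
        # (start_index, start_address, strides of that segment)
        self.axial = [[(0, 0, strides)] for _ in self.shape]
        self.size = prod(self.shape)

    @staticmethod
    def _strides(shape, major):
        # Row-major strides, with `major` forced to be the slowest-varying axis.
        order = [major] + [d for d in range(len(shape)) if d != major]
        strides, step = [0] * len(shape), 1
        for d in reversed(order):
            strides[d] = step
            step *= shape[d]
        return tuple(strides)

    def extend(self, dim, delta):
        # Append a segment of delta hyperplanes along `dim`; nothing is moved.
        strides = self._strides(self.shape, major=dim)
        self.axial[dim].append((self.shape[dim], self.size, strides))
        self.size += delta * strides[dim]
        self.shape[dim] += delta

    def address(self, index):
        best = None
        for d, i in enumerate(index):
            if not 0 <= i < self.shape[d]:
                raise IndexError(index)
            starts = [rec[0] for rec in self.axial[d]]
            rec = self.axial[d][bisect_right(starts, i) - 1]
            # The element lives in the most recently allocated candidate segment.
            if best is None or rec[1] > best[1][1]:
                best = (d, rec)
        d, (start_index, start_address, strides) = best
        offset = sum(
            s * (i - start_index if axis == d else i)
            for axis, (s, i) in enumerate(zip(strides, index))
        )
        return start_address + offset


a = ExtendibleArray((2, 2))
a.extend(1, 1)  # grow the column dimension: shape becomes (2, 3)
a.extend(0, 1)  # grow the row dimension: shape becomes (3, 3)
print(a.address((1, 2)), a.address((2, 2)))
```

In this sketch a lookup costs one binary search per dimension, and only the small axial vectors need to be shared among the processes that access the array.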
Data structures
We discuss data structures and their methods of analysis. In particular, we treat the unweighted and weighted dictionary problem, self-organizing data structures, persistent data structures, the union-find-split problem, priority queues, the nearest common ancestor problem, the selection and merging problem, and dynamization techniques. The methods of analysis are worst-case, average-case, and amortized.
Online Data Structures in External Memory
The original publication is available at www.springerlink.com
The data sets for many of today's computer applications are too large to fit within the computer's internal memory and must instead be stored on external storage devices such as disks. A major performance bottleneck can be the input/output communication (or I/O) between the external and internal memories. In this paper we discuss a variety of online data structures for external memory, some very old and some very new, such as hashing (for dictionaries), B-trees (for dictionaries and 1-D range search), buffer trees (for batched dynamic problems), interval trees with weight-balanced B-trees (for stabbing queries), priority search trees (for 3-sided 2-D range search), and R-trees and other spatial structures. We also discuss several open problems along the way.
An attributed-based database structure for small computers
Contemporary database systems are used in a variety of business applications requiring rapid retrieval of on-line data. When records contain unique information indexed by a single key, the retrieval operation can be simplified. However, when added generality and flexibility are needed, inverted files and sophisticated data models result in a system of interconnecting pointers and their associated data management programs. These, in turn, add unnecessary complexity to the overall system.
This paper develops a simplified data representation that overcomes many of the objections to other flexible database representations. In particular, we solve the following problems inherent in other database techniques by employing a technique called hashed index arrays.
(1) Space-consuming redundancy inherent in inverted files or linked structures is reduced to a minimum, thus making the indexed array structure attractive for small systems.
(2) Records may be indexed by multiple keys (combined domain values). Each key field need not store a unique value, thus providing multi-attribute processing via a hashing function.
(3) Fast retrieval, updates, and database "growth" are possible without excessive complexity. This is made possible by using extendible array structures based on shells rather than rows and columns.
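As a rough illustration of what a shell-based layout can look like, the sketch below uses the classic square-shell ordering for a two-dimensional array. It is not taken from the paper: the function names `shell_address` and `shell_index` are hypothetical, and the paper's hashed index arrays may use a different shell shape or ordering. Shell m holds every element whose larger index equals m, so the array can grow in either dimension by appending shells, with no existing element changing its address.

```python
from math import isqrt


def shell_address(i, j):
    """Linear address of element (i, j) in a square-shell layout (assumed ordering)."""
    m = max(i, j)             # shell number
    return m * m + m + i - j  # shell m occupies addresses m*m .. m*m + 2*m


def shell_index(addr):
    """Inverse mapping: recover (i, j) from a linear address."""
    m = isqrt(addr)           # shell number
    r = addr - m * m          # position within the shell, 0 .. 2*m
    return (r, m) if r <= m else (m, 2 * m - r)


# Every element of a 3 x 3 array receives a distinct address in 0..8.
for i in range(3):
    print([shell_address(i, j) for j in range(3)])
```

Unlike a row-major layout, where adding a column shifts every later row, this ordering leaves all earlier addresses untouched when a new shell is added.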
Definition of the concept “vector” in set theoretic programming languages
A basic set of data types for a set theoretic programming language is presented. The emphasis is on a certain definition of vector in terms of sets. The definition is contrasted with other methods of defining vectors in terms of sets, such as Kuratowski's device.
Grid File Approach to Large Multidimensional Dynamic Data Structures
Computing and Information Science