Techniques for Ensuring Index Usage Predictability in Microsoft SQL Server
The demand for high-performance data operations is growing in parallel with the rapid growth of data itself. Retrieving data for analysis, manipulating it, and inserting it into data stores must all be performed efficiently, using techniques that ensure speed, reliability and accuracy. This paper investigates techniques and practices that improve the performance of data retrieval using SQL and Microsoft SQL Server. Because SQL is a declarative language that specifies what should be produced as a result, rather than how to achieve that result, this paper looks at the internals of SQL Server that affect the "how" of queries and data operations, in order to propose techniques that ensure performance gains. The paper aims to shed light on the limitations and variance in index usage, and to answer the question of why indexes are sometimes used, and at other times not, for the same query. To overcome these index limitations, the "index fusion" technique is proposed.
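As a rough illustration of how index usage can vary between logically equivalent queries, the sketch below compares the plans for two predicates that return the same rows. It is an assumption-laden stand-in, not the paper's method: it uses SQLite through Python's standard `sqlite3` module rather than SQL Server, the `orders` table and `idx_orders_customer` index are hypothetical, and it shows only one well-known cause (an expression wrapped around the indexed column), which may differ from the causes the paper examines.

```python
import sqlite3


def plan_details(conn, sql):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[3] for row in rows]


# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, note TEXT)"
)
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
conn.executemany(
    "INSERT INTO orders (customer_id, note) VALUES (?, ?)",
    [(i % 100, "n") for i in range(1000)],
)

# Bare column in the predicate: the planner can search the index.
seek_plan = plan_details(conn, "SELECT * FROM orders WHERE customer_id = 42")

# Same rows, but the column is wrapped in an expression, so the
# planner cannot match the predicate to the index and scans the table.
scan_plan = plan_details(conn, "SELECT * FROM orders WHERE customer_id + 0 = 42")

print(seek_plan)
print(scan_plan)
```

The exact plan wording differs between SQLite versions, and SQL Server exposes its plans through different tooling, so the printed text should be read as indicative only.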
A Paradigm for Scalable, Transactional, and Efficient Spatial Indexes
With large volumes of geo-tagged data collected in various applications, spatial query processing becomes essential. Query engines depend on efficient indexes to expedite processing. There are three main challenges: scaling out to accommodate large volumes of spatial data, supporting transactional primitives for strong consistency guarantees, and adapting to highly dynamic workloads. This thesis proposes a paradigm for scalable, transactional, and efficient spatial indexes to significantly reduce development efforts in designing and comparing multiple spatial indexes. This thesis first introduces a distributed and transactional key value store called DTranx to persist the spatial indexes. DTranx follows the SEDA architecture to exploit high concurrency in multi-core environments and it adopts a hybrid of optimistic concurrency control and two-phase commit protocols to narrow down the critical sections of distributed locking during transaction commits. Moreover, DTranx integrates a persistent memory based write-ahead log to reduce durability overhead and combines a garbage collection mechanism without affecting normal transactions. To maintain high throughput for search workloads when databases are constantly updated, snapshot transactions are introduced. Then, a paradigm is presented with a set of intuitive APIs and a Mempool runtime to reduce development efforts. Mempool transparently synchronizes local states of data structures with DTranx and it handles two critical tasks: address translation and transparent server synchronization, of which the latter includes transaction construction and data synchronization. Furthermore, a dynamic partitioning strategy is integrated into DTranx to generate partitioning and replication plans that reduce inter-server communications and balance resource usage. Lastly, single-threaded data structures BTree and RTree are converted into distributed versions within two weeks. The BTree and RTree achieve 253.07 kops/sec and 77.83 kops/sec throughput respectively for pure search operations in a 25-server cluster.