3 research outputs found

    A survey on big data indexing strategies

    Get PDF
    The operations of the Internet have led to a significant growth and accumulation of data known as Big Data.Individuals and organizations that utilize this data, had no idea, nor were they prepared for this data explosion.Hence, the available solutions cannot meet the needs of the growing heterogeneous data in terms of processing. This results in inefficient information retrieval or search query results.The design of indexing strategies that can support this need is required. A survey on various indexing strategies and how they are utilized for solving Big Data management issues can serve as a guide for choosing the strategy best suited for a problem, and can also serve as a base for the design of more efficient indexing strategies.The aim of the study is to explore the characteristics of the indexing strategies used in Big Data manageability by covering some of the weaknesses and strengths of B-tree, R-tree, to name but a few. This paper covers some popular indexing strategies used for Big Data management. It exposes the potentials of each by carefully exploring their properties in ways that are related to problem solving

    ADAM - a database and information retrieval system for big multimedia collections

    Get PDF
    The past decade has seen the rapid proliferation of low-priced devices for recording image, audio and video data in nearly unlimited quantity. Multimedia is Big Data, not only in terms of their volume, but also with respect to their heterogeneous nature. This also includes the variety of the queries to be executed. Current approaches for searching in big multimedia collections mainly rely on keywords. However, manually annotating each single object in a large collection is not feasible. Therefore, content-based multimedia retrieval –using sample objects as query input– is increasingly becoming an important requirement for dealing with the data deluge. In image databases, for instance, effective methods exploit the use of exemplary images or hand-drawn sketches as query input. In this paper, we introduce ADAM, a novel multimedia retrieval system that is tailored to large collections and that is able to support both Boolean retrieval for structured data and similarity-based retrieval for feature vectors extracted from the multimedia objects. For efficient query processing in such big multimedia data, ADAM allows the distribution of the indexed collection to multiple shards and performs queries in a MapReduce style. Furthermore, it supports a signature-based indexing strategy for similarity search that heavily reduces the query time. The efficiency of ADAM has been successfully evaluated in a content-based image retrieval application on the basis of 14 million images from the ImageNet collection
    corecore