20 research outputs found

    Implementation in Data Cube Mining for Map Reduce Paradigm

    Computing holistic measures for cube groups over large Twitter data sets is infeasible for many analyses. The data set is collected from Twitter users; a cube is first created, and dimension measures are then computed using the roll-up function. In practice, cube materialization and mining over web data sets pose various challenges. The MapReduce (map-shuffle-reduce) paradigm can efficiently extract the cube and compute aggregate functions over Twitter attributes. MR-Cube can efficiently and effectively compute cubes of holistic measures over large tuple-aggregation sets, whereas existing techniques cannot scale holistic measures to large numbers of tuples. DOI: 10.17762/ijritcc2321-8169.150614
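    A minimal sketch of the roll-up/cube aggregation described above, assuming tweet records as Python dicts (field names are illustrative, not from the paper). Each record is emitted under every subset of the group-by dimensions, emulating the map/reduce cube lattice; note this works directly only for distributive measures such as SUM, whereas the holistic measures targeted by MR-Cube need partitioning strategies beyond this sketch:

    ```python
    from itertools import combinations
    from collections import defaultdict

    def cube_aggregate(records, dims, measure):
        """Compute SUM aggregates for every subset of `dims` (the full
        data cube), emulating the map/reduce grouping step."""
        cube = defaultdict(float)
        for rec in records:
            # "map": emit the record under every dimension subset (roll-up lattice)
            for r in range(len(dims) + 1):
                for subset in combinations(dims, r):
                    key = tuple((d, rec[d]) for d in subset)
                    # "reduce": aggregate the measure per cube group
                    cube[key] += rec[measure]
        return dict(cube)

    tweets = [  # illustrative tweet-like records
        {"user": "a", "topic": "news",  "retweets": 3},
        {"user": "a", "topic": "sport", "retweets": 1},
        {"user": "b", "topic": "news",  "retweets": 2},
    ]
    cube = cube_aggregate(tweets, dims=("user", "topic"), measure="retweets")
    # cube[()] is the grand total (empty group-by);
    # cube[(("user", "a"),)] is the roll-up over user "a"
    ```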

    Static query result caching revisited

    Query result caching is an important mechanism for search engine efficiency. In this study, we first review several query features that are used to determine the contents of a static result cache. Next, we introduce a new feature that more accurately represents the popularity of a query by measuring the stability of query frequency over a set of time intervals. Experimental results show that this new feature achieves hit ratios better than those of the previously proposed features.
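    The stability idea can be sketched as follows: split the query log into time intervals and score each query by how steady its per-interval frequency is. The abstract does not give the exact formula, so the inverse coefficient of variation below is only an illustration of the concept:

    ```python
    from statistics import mean, pstdev

    def frequency_stability(interval_counts):
        """Stability of a query's frequency across time intervals,
        here the inverse coefficient of variation (higher = steadier).
        The feature definition in the paper may differ; this only
        illustrates why steady queries beat bursty ones in a static cache."""
        m = mean(interval_counts)
        if m == 0:
            return 0.0
        return m / (pstdev(interval_counts) + 1e-9)

    stable = frequency_stability([10, 11, 9, 10])  # steady query
    bursty = frequency_stability([40, 0, 0, 0])    # one-off burst, same total
    # the steady query scores higher, so it is the better static-cache candidate
    ```

    Raw total frequency cannot tell these two queries apart (both occur 40 times), which is exactly the weakness a stability feature addresses.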

    Algorithmic improvements and data structures for highly efficient searches

    Search on the Internet poses constant challenges. Data are ever richer and more complex, are used and change in real time, and provide new value, but only if they are available in a timely and usable form. Users increasingly rely on search engines to satisfy their informational, navigational, and transactional needs, requiring engines to answer thousands of queries per second. To efficiently handle the size of a document collection crawled from the web, search engines use distributed data structures to make search efficient and caching techniques to optimize response times. This project proposes designing and evaluating advanced data structures together with new algorithmic techniques to improve search performance over web-scale data collections. Track: Distributed and parallel processing. Red de Universidades con Carreras en Informática (RedUNCI)

    Consistency mechanisms for a distributed lookup service supporting mobile applications

    This paper presents a general-purpose distributed lookup service, denoted Passive Distributed Indexing (PDI). PDI stores entries in the form of (key, value) pairs in index caches located in each mobile device. Index caches are filled by epidemic dissemination of popular index entries. By exploiting node mobility, PDI can resolve most queries locally without sending messages outside the radio coverage of the inquiring node. Thus, PDI reduces network traffic for the resolution of keys to values. To keep index caches coherent, configurable value timeouts implementing implicit invalidation and lazy invalidation caches implementing explicit invalidation are introduced. Inconsistency in index caches due to weak connectivity or node failure is handled by value timeouts. Lazy invalidation caches reduce the fraction of stale index entries due to modified data at the origin node. Similar to index caches, invalidation caches are filled by epidemic distribution of invalidation messages. Simulation results show that with a suitable integration of both invalidation mechanisms, more than 95% of the results delivered by PDI index caches are up-to-date for the considered scenario.
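    The two invalidation mechanisms can be sketched in a toy index cache: a value timeout drops entries implicitly after a configurable age, while an invalidation cache records explicitly revoked keys and rejects older re-inserts arriving via epidemic dissemination. Class and method names are illustrative, not from the paper:

    ```python
    import time

    class IndexCache:
        """Toy (key, value) index cache with PDI-style consistency:
        value timeouts give implicit invalidation; a lazy invalidation
        cache records explicit revocations. Illustrative sketch only."""
        def __init__(self, timeout):
            self.timeout = timeout
            self.entries = {}      # key -> (value, inserted_at)
            self.invalidated = {}  # key -> invalidation timestamp

        def put(self, key, value, now=None):
            now = time.time() if now is None else now
            # drop values no newer than a known explicit invalidation
            if key in self.invalidated and now <= self.invalidated[key]:
                return
            self.entries[key] = (value, now)

        def invalidate(self, key, now=None):
            now = time.time() if now is None else now
            self.invalidated[key] = now      # lazy invalidation cache entry
            self.entries.pop(key, None)

        def lookup(self, key, now=None):
            now = time.time() if now is None else now
            if key in self.entries:
                value, t = self.entries[key]
                if now - t <= self.timeout:  # value timeout: implicit invalidation
                    return value
                del self.entries[key]        # expired entry
            return None

    c = IndexCache(timeout=10)
    c.put("file.mp3", "nodeA", now=0)
    c.lookup("file.mp3", now=5)    # -> "nodeA" (still fresh)
    c.lookup("file.mp3", now=20)   # -> None (dropped by value timeout)
    ```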

    Exploiting epidemic data dissemination for consistent lookup operations in mobile applications

    This paper presents a general-purpose distributed lookup service, denoted Passive Distributed Indexing (PDI). PDI stores entries in the form of (key, value) pairs in index caches located at mobile devices. Index caches are filled by epidemic dissemination of popular index entries. By exploiting node mobility, PDI can resolve most queries locally without sending messages outside the radio coverage of the inquiring node. To keep index caches coherent, configurable value timeouts implementing implicit invalidation and lazy invalidation caches implementing explicit invalidation are introduced. Inconsistency in index caches due to weak connectivity or node failure is handled by value timeouts. Lazy invalidation caches reduce the fraction of stale index entries due to modified data at the origin node. Similar to index caches, invalidation caches are filled by epidemic distribution of invalidation messages. We evaluate the performance of PDI for a mobile P2P file sharing application and a mobile instant messaging application. Simulation results show that with a suitable integration of both invalidation mechanisms, up to 80% of the lookup operations return correct results and more than 90% of the results delivered by PDI index caches are up-to-date.

    Exploiting Available Memory and Disk for Scalable Instant Overview Search

    Search-As-You-Type (or Instant Search) is a recently introduced functionality which shows predictive results while the user types a query letter by letter. In this paper we generalize and propose an extension of this technique which, apart from showing the first page of results on the fly, shows various other kinds of information, e.g. the outcome of results-clustering techniques, or metadata-based groupings of the results. Although this functionality is more informative than classic search-as-you-type, since it combines Autocompletion, Search-As-You-Type, and Results Clustering, the provision of real-time interaction is more challenging. To tackle this issue we propose an approach based on pre-computed information and we comparatively evaluate various index structures for making real-time interaction feasible, even if the size of the available memory space is limited. This comparison reveals the memory/performance trade-off and allows deciding which index structure to use according to the available main memory and desired performance. Furthermore, we show that an incremental algorithm can be used to keep the index structure fresh.
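    The precomputation idea can be sketched as a prefix map: every prefix of every indexed query points to an already-computed first page of results and its clusters, so a keystroke costs one dictionary lookup instead of a query evaluation. This is only a sketch of the concept; the paper compares several concrete memory- and disk-based index structures:

    ```python
    class InstantIndex:
        """Toy precomputed prefix index for search-as-you-type.
        Structure and names are illustrative, not from the paper."""
        def __init__(self):
            self.by_prefix = {}  # prefix -> precomputed payload

        def add(self, query, payload):
            # store the payload under every prefix of the query; a real
            # system would rank competing completions per prefix
            for i in range(1, len(query) + 1):
                self.by_prefix.setdefault(query[:i], payload)

        def on_keystroke(self, typed):
            # O(1) per keystroke: no query evaluation at request time
            return self.by_prefix.get(typed)

    idx = InstantIndex()
    idx.add("caching", {"results": ["doc1", "doc7"], "clusters": ["web", "cpu"]})
    idx.on_keystroke("cac")  # -> the precomputed payload for "caching"
    ```

    The memory/performance trade-off studied in the paper shows up immediately: materializing every prefix is fast but memory-hungry, which is why disk-resident and hybrid index structures are compared.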

    The value of location in keyword auctions

    Sponsored links on search engines are an emerging advertising tool, whereby a number of slots are put on sale through keyword auctions. This is also known as contextual advertising. Slot assignment and pricing in keyword auctions are essential for the search engine's management, since they provide the main stream of revenues, and are typically accomplished by the Generalized Second Price (GSP) mechanism. In GSP the price of slots is a monotone function of the slot location, being larger for the highest slots. Though a higher location is associated with larger revenues, the lower costs associated with the lowest slots may make them more attractive for the advertiser. The contribution of this research is to show, by analytical and simulation results based on the theory of order statistics, that advertisers may not get the optimal slot they aim at (the slot maximizing their expected profit) and that the GSP mechanism may be unfair to all the winning bidders but the one who submitted the lowest bid.
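    The GSP mechanism itself is simple to state: bidders sorted by bid receive slots in decreasing click-through-rate order, and each winner pays the next-highest bid per click. The sketch below, with purely illustrative bids and click-through rates, also shows the effect the abstract describes: a lower slot can yield the higher expected profit.

    ```python
    def gsp(bids, ctrs):
        """Generalized Second Price allocation: bidders sorted by bid get
        slots in CTR order; each pays the next bidder's bid per click.
        Returns (bidder, slot_ctr, price_per_click) per winner."""
        order = sorted(range(len(bids)), key=lambda i: -bids[i])
        alloc = []
        for slot, i in enumerate(order[:len(ctrs)]):
            next_bid = bids[order[slot + 1]] if slot + 1 < len(order) else 0.0
            alloc.append((i, ctrs[slot], next_bid))
        return alloc

    bids = [5.0, 4.0, 1.0]  # per-click bids (illustrative)
    ctrs = [0.10, 0.06]     # slot click-through rates, top slot first
    alloc = gsp(bids, ctrs)
    value = 6.0             # assume every bidder values a click at 6.0
    profits = [ctr * (value - price) for _, ctr, price in alloc]
    # top slot: 0.10 * (6 - 4) = 0.20; second slot: 0.06 * (6 - 1) = 0.30
    # here the *lower* slot gives the higher expected profit
    ```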

    A five-level static cache architecture for web search engines

    Caching is a crucial performance component of large-scale web search engines, as it greatly helps reduce average query response times and query processing workloads on backend search clusters. In this paper, we describe a multi-level static cache architecture that stores five different item types: query results, precomputed scores, posting lists, precomputed intersections of posting lists, and documents. Moreover, we propose a greedy heuristic to prioritize items for caching, based on gains computed using items' past access frequencies, estimated computational costs, and storage overheads. This heuristic takes into account the inter-dependency between individual items when making its caching decisions, i.e., after a particular item is cached, the gains of all items affected by this decision are updated. Our simulations under realistic assumptions reveal that the proposed heuristic performs better than dividing the entire cache space among particular item types at fixed proportions. © 2010 Elsevier Ltd. All rights reserved.
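    The core of such a gain-based heuristic can be sketched as a greedy knapsack fill: score each candidate item by access frequency times estimated processing cost, normalized by its storage size, and cache in decreasing score order. The inter-item dependency updates from the paper (re-scoring affected items after each caching decision) are omitted here; all item names and numbers are illustrative:

    ```python
    import heapq

    def greedy_cache(items, capacity):
        """Greedy selection for a multi-type static cache.
        items: (name, access_frequency, processing_cost, storage_size).
        Sketch only: omits the paper's inter-item gain updates."""
        # max-heap ordered by gain density = frequency * cost / size
        heap = [(-(freq * cost) / size, size, name)
                for name, freq, cost, size in items]
        heapq.heapify(heap)
        cached, used = [], 0
        while heap:
            _, size, name = heapq.heappop(heap)
            if used + size <= capacity:  # skip items that no longer fit
                cached.append(name)
                used += size
        return cached

    items = [  # (name, frequency, cost, size) -- illustrative values
        ("result:q1",   900, 1.0,  2),
        ("postings:w1",  50, 8.0, 10),
        ("doc:d9",       10, 2.0,  8),
    ]
    greedy_cache(items, capacity=12)  # -> ['result:q1', 'postings:w1']
    ```

    Because all five item types compete in one ranking, the split of cache space among types emerges from the data rather than being fixed in advance, which is the advantage the paper's simulations demonstrate.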

    Search engine performance optimization: methods and techniques [version 3; peer review: 2 approved, 1 not approved]

    Background: With the rapid advancement of information technology, search engine optimisation (SEO) has become crucial for enhancing the visibility and relevance of online content. In this context, the use of cloud platforms like Microsoft Azure is being explored to bolster SEO capabilities.
    Methods: This scientific article offers an in-depth study of search engine optimisation. It explores the different methods and techniques used to improve the performance and efficiency of a search engine, focusing on key aspects such as result relevance, search speed and user experience. The article also presents case studies and concrete examples to illustrate the practical application of optimisation techniques.
    Results: The results demonstrate the importance of optimisation in delivering high quality search results and meeting the increasing demands of users.
    Conclusions: The article addresses the enhancement of search engines through the Microsoft Azure infrastructure and its associated components. It highlights methods such as indexing, semantic analysis, parallel searches, and caching to strengthen the relevance of results, speed up searches, and optimise the user experience. Following the application of these methods, a marked improvement was observed in these areas, thereby showcasing the capability of Microsoft Azure in enhancing search engines. The study sheds light on the implementation and analysis of these Azure-focused techniques, introduces a methodology for assessing their efficacy, and details the specific benefits of each method. Looking forward, the article suggests integrating artificial intelligence to elevate the relevance of results, venturing into other cloud infrastructures to boost performance, and evaluating these methods in specific scenarios, such as multimedia information search. In summary, with Microsoft Azure, the enhancement of search engines appears promising, with increased relevance and a heightened user experience in a rapidly evolving sector.