
    Fast Data in the Era of Big Data: Twitter's Real-Time Related Query Suggestion Architecture

    We present the architecture behind Twitter's real-time related query suggestion and spelling correction service. Although these tasks have received much attention in the web search literature, the Twitter context introduces a real-time "twist": after significant breaking news events, we aim to provide relevant results within minutes. This paper provides a case study illustrating the challenges of real-time data processing in the era of "big data". We tell the story of how our system was built twice: our first implementation was built on a typical Hadoop-based analytics stack, but was later replaced because it did not meet the latency requirements necessary to generate meaningful real-time results. The second implementation, which is the system deployed in production, is a custom in-memory processing engine specifically designed for the task. This experience taught us that the current typical usage of Hadoop as a "big data" platform, while great for experimentation, is not well suited to low-latency processing, and points the way to future work on data analytics platforms that can handle "big" as well as "fast" data.

    What is Missing for the Full Deployment of Mobile Search Services? Results from a Survey with Experts

    Web search providers have developed a highly successful business model, which has rendered them amongst some of the most profitable companies operating on the internet. Many observers regard mobile search as the next new big market. In contrast to search on PCs, however, the provision of search on mobiles is still in its infancy. In order to shed light on the real prospects of mobile search we performed a two-round Delphi exercise with experts, in which we included two innovative elements. First, the Delphi exercise included seven forward-looking scenarios for discussion. Then, the second round of the Delphi was carried out during a workshop with 19 of the original 61 participants involved. In this paper we present the findings from the discussions of this final round. Our study confirms the high expectations put into the mobile search market. We found that this optimism is rooted in the view that critical technological components are already available. Our paper argues that the technology push is not yet matched by a corresponding market pull. Web search engines, mobile phone manufacturers, and telecom operators are already starting to take action to place themselves in a favourable position. They are exploring trial applications, but business models are still unclear and companies are experimenting with very different approaches. Our Delphi study identifies interfaces as critical for increased mobile search usage. Moreover, experts think that perceived usefulness is valuable but trust is essential and that privacy should be seen as an opportunity rather than a constraint. The paper concludes with some suggestions for fostering innovation, growth and competitiveness in the mobile search domain by increasing the interoperability of services, assuring the openness and mash-ups of content and services, and developing personal identity data management systems to improve user acceptance and enhance trust

    Biomolecular Event Extraction using Natural Language Processing

    Biomedical research and discoveries are communicated through scholarly publications and this literature is voluminous, rich in scientific text and growing exponentially by the day. Biomedical journals publish nearly three thousand research articles daily, making literature search a challenging proposition for researchers. Biomolecular events involve genes, proteins, metabolites, and enzymes that provide invaluable insights into biological processes and explain the physiological functional mechanisms. Text mining (TM) or extraction of such events automatically from big data is the only quick and viable solution to gather any useful information. Such events extracted from biological literature have a broad range of applications like database curation, ontology construction, semantic web search and interactive systems. However, automatic extraction has its challenges on account of ambiguity and the diverse nature of natural language and associated linguistic occurrences like speculations, negations etc., which commonly exist in biomedical texts and lead to erroneous elucidation. In the last decade, many strategies have been proposed in this field, using different paradigms like Biomedical natural language processing (BioNLP), machine learning and deep learning. Also, new parallel computing architectures like graphics processing units (GPUs) have emerged as possible candidates to accelerate the event extraction pipeline. This paper reviews and provides a summarization of the key approaches in complex biomolecular big data event extraction tasks and recommends a balanced architecture in terms of accuracy, speed, computational cost, and memory usage towards developing a robust GPU-accelerated BioNLP system.

    The benefits of resource discovery for publishers: a librarian’s view

    A core goal of librarians is to maximize usage of the content to which their libraries subscribe. Webscale or resource discovery systems offer a single search box for library users to access subscribed content. This article examines usage data at the University of Huddersfield to show how resource discovery has helped to increase the usage of publisher content that has been made available to discovery vendors, and considers the implications for publishers who are yet to do this. The article concludes that resource discovery systems have effectively levelled the playing field, allowing small to medium sized publishers to make content discoverable to users, and encourages publishers who do not have their content indexed in resource discovery systems to speak to discovery service vendors in order to do so at the earliest opportunity.

    Library Resources: Procurement, Innovation and Exploitation in a Digital World

    The possibilities of the digital future require new models for procurement, innovation and exploitation. Emma Crowley and Chris Spencer describe the skills staff need to deliver resources in hybrid and digital environments. The chapter demonstrates the innovative ways that librarians use to procure and exploit the wealth of resources available in a digital world. They also describe the technological developments that can be adopted to improve workflow processes and they highlight the challenges faced on this fascinating journey.

    ARM Wrestling with Big Data: A Study of Commodity ARM64 Server for Big Data Workloads

    ARM processors have dominated the mobile device market in the last decade due to their favorable computing to energy ratio. In this age of Cloud data centers and Big Data analytics, the focus is increasingly on power efficient processing, rather than just high throughput computing. ARM's first commodity server-grade processor is the recent AMD A1100-series processor, based on a 64-bit ARM Cortex A57 architecture. In this paper, we study the performance and energy efficiency of a server based on this ARM64 CPU, relative to a comparable server running an AMD Opteron 3300-series x64 CPU, for Big Data workloads. Specifically, we study these for Intel's HiBench suite of web, query and machine learning benchmarks on Apache Hadoop v2.7 in a pseudo-distributed setup, for data sizes up to 20GB files, 5M web pages and 500M tuples. Our results show that the ARM64 server's runtime performance is comparable to the x64 server for integer-based workloads like Sort and Hive queries, and only lags behind for floating-point intensive benchmarks like PageRank, when they do not exploit data parallelism adequately. We also see that the ARM64 server takes 1/3rd the energy, and has an Energy Delay Product (EDP) that is 50-71% lower than the x64 server. These results hold promise for ARM64 data centers hosting Big Data workloads to reduce their operational costs, while opening up opportunities for further analysis. Comment: Accepted for publication in the Proceedings of the 24th IEEE International Conference on High Performance Computing, Data, and Analytics (HiPC), 201

    Stigmergic hyperlink's contributes to web search

    Stigmergic hyperlinks are hyperlinks with a "heart beat": if used they stay healthy and online; if neglected, they fade, eventually getting replaced. Their life attribute is a relative usage measure that regular hyperlinks do not provide, hence PageRank-like measures have historically been well informed about the structure of webs of documents, but unaware of what users effectively do with the links. This paper elaborates on how to input the users’ perspective into Google’s original, structure centric, PageRank metric. The discussion then bridges to the Deep Web, some search challenges, and how stigmergic hyperlinks could help decentralize the search experience, facilitating user generated search solutions and supporting new related business models.

    Web 2.0 and destination marketing: current trends and future directions

    Over the last decade, destination marketers and Destination Marketing Organizations (DMOs) have increasingly invested in Web 2.0 technologies as a cost-effective means of promoting destinations online, in the face of drastic marketing budget cuts. Recent scholarly and industry research has emphasized that Web 2.0 plays an increasing role in destination marketing. However, no comprehensive appraisal of this research area has been conducted so far. To address this gap, this study conducts a quantitative literature review to examine the extent to which Web 2.0 features in destination marketing research that was published until December 2019, by identifying research topics, gaps and future directions, and designing a theory-driven agenda for future research. The study’s findings indicate an increase in scholarly literature revolving around the adoption and use of Web 2.0 for destination marketing purposes. However, the emerging research field is fragmented in scope and displays several gaps. Most of the studies are descriptive in nature and a strong overarching conceptual framework that might help identify critical destination marketing problems linked to Web 2.0 technologies is missing.