Search CORE


Fast Data in the Era of Big Data: Twitter's Real-Time Related Query Suggestion Architecture

Author: Dalton Jeff
Li Zhenghua
Lin Jimmy
Mishne Gilad
Sharma Aneesh
Publication date: 27/10/2012

We present the architecture behind Twitter's real-time related query suggestion and spelling correction service. Although these tasks have received much attention in the web search literature, the Twitter context introduces a real-time "twist": after significant breaking news events, we aim to provide relevant results within minutes. This paper provides a case study illustrating the challenges of real-time data processing in the era of "big data". We tell the story of how our system was built twice: our first implementation was built on a typical Hadoop-based analytics stack, but was later replaced because it did not meet the latency requirements necessary to generate meaningful real-time results. The second implementation, which is the system deployed in production, is a custom in-memory processing engine specifically designed for the task. This experience taught us that the current typical usage of Hadoop as a "big data" platform, while great for experimentation, is not well suited to low-latency processing, and points the way to future work on data analytics platforms that can handle "big" as well as "fast" data.

arXiv.org e-Print Archive

CiteSeerX

What is Missing for the Full Deployment of Mobile Search Services? Results from a Survey with Experts

Author: Bacigalupo M.
Compañó Ramón
Feijoo Gonzalez Claudio Antonio
Gómez Barroso José Luis
Nikolov S.G.
Publication venue: E.T.S.I. Telecomunicación (UPM)
Publication date: 01/08/2010

Web search providers have developed a highly successful business model, which has rendered them amongst some of the most profitable companies operating on the internet. Many observers regard mobile search as the next new big market. In contrast to search on PCs, however, the provision of search on mobiles is still in its infancy. In order to shed light on the real prospects of mobile search we performed a two-round Delphi exercise with experts, in which we included two innovative elements. First, the Delphi exercise included seven forward-looking scenarios for discussion. Then, the second round of the Delphi was carried out during a workshop with 19 of the original 61 participants involved. In this paper we present the findings from the discussions of this final round. Our study confirms the high expectations put into the mobile search market. We found that this optimism is rooted in the view that critical technological components are already available. Our paper argues that the technology push is not yet matched by a corresponding market pull. Web search engines, mobile phone manufacturers, and telecom operators are already starting to take action to place themselves in a favourable position. They are exploring trial applications, but business models are still unclear and companies are experimenting with very different approaches. Our Delphi study identifies interfaces as critical for increased mobile search usage. Moreover, experts think that perceived usefulness is valuable but trust is essential and that privacy should be seen as an opportunity rather than a constraint. The paper concludes with some suggestions for fostering innovation, growth and competitiveness in the mobile search domain by increasing the interoperability of services, assuring the openness and mash-ups of content and services, and developing personal identity data management systems to improve user acceptance and enhance trust.

Archivo Digital UPM

Biomolecular Event Extraction using Natural Language Processing

Author: Anandaraj S.P.
Bali Manish
Publication venue: Faculty of Electrical Engineering, J.J. Strossmayer University of Osijek
Publication date: 01/01/2023

Biomedical research and discoveries are communicated through scholarly publications and this literature is voluminous, rich in scientific text and growing exponentially by the day. Biomedical journals publish nearly three thousand research articles daily, making literature search a challenging proposition for researchers. Biomolecular events involve genes, proteins, metabolites, and enzymes that provide invaluable insights into biological processes and explain the physiological functional mechanisms. Text mining (TM) or extraction of such events automatically from big data is the only quick and viable solution to gather any useful information. Such events extracted from biological literature have a broad range of applications like database curation, ontology construction, semantic web search and interactive systems. However, automatic extraction has its challenges on account of ambiguity and the diverse nature of natural language and associated linguistic occurrences like speculations, negations etc., which commonly exist in biomedical texts and lead to erroneous elucidation. In the last decade, many strategies have been proposed in this field, using different paradigms like Biomedical natural language processing (BioNLP), machine learning and deep learning. Also, new parallel computing architectures like graphical processing units (GPU) have emerged as possible candidates to accelerate the event extraction pipeline. This paper reviews and provides a summarization of the key approaches in complex biomolecular big data event extraction tasks and recommends a balanced architecture in terms of accuracy, speed, computational cost, and memory usage towards developing a robust GPU-accelerated BioNLP system.

HRČAK - Portal of Croatian Scientific and Professional Journals

The benefits of resource discovery for publishers: a librarian’s view

Author: Stone Graham
Publication venue: 'Association of Learned and Professional Society Publishers (ALPSP)'
Publication date: 25/03/2015

A core goal of librarians is to maximize usage of the content to which their libraries subscribe. Webscale or resource discovery systems offer a single search box for library users to access subscribed content. This article examines usage data at the University of Huddersfield to show how resource discovery has helped to increase the usage of publisher content that has been made available to discovery vendors, and considers the implications for publishers who are yet to do this. The article concludes that resource discovery systems have effectively levelled the playing field, allowing small to medium sized publishers to make content discoverable to users, and encourages publishers who do not have their content indexed in resource discovery systems to speak to discovery service vendors in order to do so at the earliest opportunity.

Crossref

University of Huddersfield Repository

Library Resources: Procurement, Innovation and Exploitation in a Digital World

Author: Crowley Emma J.
Spencer Chris
Publication venue: Ashgate
Publication date: 01/01/2011

The possibilities of the digital future require new models for procurement, innovation and exploitation. Emma Crowley and Chris Spencer describe the skills staff need to deliver resources in hybrid and digital environments. The chapter demonstrates the innovative ways that librarians use to procure and exploit the wealth of resources available in a digital world. They also describe the technological developments that can be adopted to improve workflow processes and they highlight the challenges faced on this fascinating journey.

Bournemouth University Research Online

ARM Wrestling with Big Data: A Study of Commodity ARM64 Server for Big Data Workloads

Author: Kalyanasundaram Jayanth
Simmhan Yogesh
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 08/09/2017

ARM processors have dominated the mobile device market in the last decade due to their favorable computing to energy ratio. In this age of Cloud data centers and Big Data analytics, the focus is increasingly on power efficient processing, rather than just high throughput computing. ARM's first commodity server-grade processor is the recent AMD A1100-series processor, based on a 64-bit ARM Cortex A57 architecture. In this paper, we study the performance and energy efficiency of a server based on this ARM64 CPU, relative to a comparable server running an AMD Opteron 3300-series x64 CPU, for Big Data workloads. Specifically, we study these for Intel's HiBench suite of web, query and machine learning benchmarks on Apache Hadoop v2.7 in a pseudo-distributed setup, for data sizes up to 20GB files, 5M web pages and 500M tuples. Our results show that the ARM64 server's runtime performance is comparable to the x64 server for integer-based workloads like Sort and Hive queries, and only lags behind for floating-point intensive benchmarks like PageRank, when they do not exploit data parallelism adequately. We also see that the ARM64 server takes 1/3rd the energy, and has an Energy Delay Product (EDP) that is 50-71% lower than the x64 server. These results hold promise for ARM64 data centers hosting Big Data workloads to reduce their operational costs, while opening up opportunities for further analysis. Comment: Accepted for publication in the Proceedings of the 24th IEEE International Conference on High Performance Computing, Data, and Analytics (HiPC), 201
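As a rough consistency check on the energy and EDP figures in this abstract, the implied runtime ratio can be worked out. This assumes the usual definition of EDP as energy multiplied by runtime, which the abstract does not spell out, and treats the one-third energy figure as exact. The symbols E (energy), T (runtime) and r (fractional EDP reduction) are introduced here for illustration only.

```latex
\begin{align*}
\mathrm{EDP} &= E \cdot T, \\
\frac{\mathrm{EDP}_{\mathrm{ARM}}}{\mathrm{EDP}_{\mathrm{x64}}}
  &= \frac{E_{\mathrm{ARM}}}{E_{\mathrm{x64}}} \cdot \frac{T_{\mathrm{ARM}}}{T_{\mathrm{x64}}}
   = 1 - r, \qquad r \in [0.50,\ 0.71], \\
\frac{T_{\mathrm{ARM}}}{T_{\mathrm{x64}}}
  &= \frac{1 - r}{1/3} = 3\,(1 - r) \in [0.87,\ 1.50].
\end{align*}
```

Under these assumptions the ARM64 server would run somewhere between about 13% faster and 50% slower than the x64 server, which fits the abstract's statement that runtime is comparable for some workloads and lags for others. Energy ratios probably vary by benchmark, so the range is indicative only.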

arXiv.org e-Print Archive

Crossref

Open Access Repository of IISc Research Publications

Stigmergic hyperlink's contributes to web search

Author: Marques Artur
Publication venue: Instituto Politécnico de Santarém
Publication date: 01/01/2017

Stigmergic hyperlinks are hyperlinks with a "heart beat": if used they stay healthy and online; if neglected, they fade, eventually getting replaced. Their life attribute is a relative usage measure that regular hyperlinks do not provide, hence PageRank-like measures have historically been well informed about the structure of webs of documents, but unaware of what users effectively do with the links. This paper elaborates on how to input the users’ perspective into Google’s original, structure centric, PageRank metric. The discussion then bridges to the Deep Web, some search challenges, and how stigmergic hyperlinks could help decentralize the search experience, facilitating user generated search solutions and supporting new related business models.
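The "heart beat" idea in this abstract can be illustrated with a minimal Python sketch. It is not the paper's model: the class name `StigmergicLink`, the linear `boost` and `decay` parameters, the 0.0 to 1.0 life scale, and the `relative_usage` helper are all assumptions made here to show how use could reinforce a link while neglect lets it fade.

```python
from dataclasses import dataclass


@dataclass
class StigmergicLink:
    """Hypothetical hyperlink with a 'life' value (assumed scale: 0.0 to 1.0)."""
    target: str
    life: float = 1.0

    def click(self, boost: float = 0.2) -> None:
        # Use reinforces the link, capped at full health.
        self.life = min(1.0, self.life + boost)

    def tick(self, decay: float = 0.1) -> None:
        # Every time step lets the link fade, floored at zero.
        self.life = max(0.0, self.life - decay)

    @property
    def alive(self) -> bool:
        # A link whose life has reached zero is a candidate for replacement.
        return self.life > 0.0


def relative_usage(links):
    """Normalize life values so they sum to 1, giving a relative usage measure."""
    total = sum(link.life for link in links)
    if total == 0:
        return {link.target: 0.0 for link in links}
    return {link.target: link.life / total for link in links}
```

A link that is clicked on every step stays at full life, while an unused one loses a fixed amount per step until it reaches zero. The normalized values could then, in principle, act as usage-based weights next to a purely structural measure such as PageRank, which is the direction the abstract describes.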

Repositório Cientifico do Instituto Politécnico de Santarém

Web 2.0 and destination marketing: current trends and future directions

Author: Mariani Marcello
Publication venue: 'MDPI AG'
Publication date: 01/01/2020

Over the last decade, destination marketers and Destination Marketing Organizations (DMOs) have increasingly invested in Web 2.0 technologies as a cost-effective means of promoting destinations online, in the face of drastic marketing budget cuts. Recent scholarly and industry research has emphasized that Web 2.0 plays an increasing role in destination marketing. However, no comprehensive appraisal of this research area has been conducted so far. To address this gap, this study conducts a quantitative literature review to examine the extent to which Web 2.0 features in destination marketing research that was published until December 2019, by identifying research topics, gaps and future directions, and designing a theory-driven agenda for future research. The study’s findings indicate an increase in scholarly literature revolving around the adoption and use of Web 2.0 for destination marketing purposes. However, the emerging research field is fragmented in scope and displays several gaps. Most of the studies are descriptive in nature and a strong overarching conceptual framework that might help identify critical destination marketing problems linked to Web 2.0 technologies is missing.

Multidisciplinary Digital Publishing Institute

Central Archive at the University of Reading

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna