Fast Data in the Era of Big Data: Twitter's Real-Time Related Query Suggestion Architecture
We present the architecture behind Twitter's real-time related query
suggestion and spelling correction service. Although these tasks have received
much attention in the web search literature, the Twitter context introduces a
real-time "twist": after significant breaking news events, we aim to provide
relevant results within minutes. This paper provides a case study illustrating
the challenges of real-time data processing in the era of "big data". We tell
the story of how our system was built twice: our first implementation was built
on a typical Hadoop-based analytics stack, but was later replaced because it
did not meet the latency requirements necessary to generate meaningful
real-time results. The second implementation, which is the system deployed in
production, is a custom in-memory processing engine specifically designed for
the task. This experience taught us that the current typical usage of Hadoop as
a "big data" platform, while great for experimentation, is not well suited to
low-latency processing, and points the way to future work on data analytics
platforms that can handle "big" as well as "fast" data
What is Missing for the Full Deployment of Mobile Search Services? Results from a Survey with Experts
Web search providers have developed a highly successful business model, which has rendered them amongst some of the most profitable companies operating on the internet. Many observers regard mobile search as the next new big market. In contrast to search on PCs, however, the provision of search on mobiles is still in its infancy. In order to shed light on the real prospects of mobile search we performed a two-round Delphi exercise with experts, in which we included two innovative elements. First, the Delphi exercise included seven forward-looking scenarios for discussion. Then, the second round of the Delphi was carried out during a workshop with 19 of the original 61 participants involved. In this paper we present the findings from the discussions of this final round. Our study confirms the high expectations put into the mobile search market. We found that this optimism is rooted in the view that critical technological components are already available. Our paper argues that the technology push is not yet matched by a corresponding market pull. Web search engines, mobile phone manufacturers, and telecom operators are already starting to take action to place themselves in a favourable position. They are exploring trial applications, but business models are still unclear and companies are experimenting with very different approaches. Our Delphi study identifies interfaces as critical for increased mobile search usage. Moreover, experts think that perceived usefulness is valuable but trust is essential and that privacy should be seen as an opportunity rather than a constraint. The paper concludes with some suggestions for fostering innovation, growth and competitiveness in the mobile search domain by increasing the interoperability of services, assuring the openness and mash-ups of content and services, and developing personal identity data management systems to improve user acceptance and enhance trust
Biomolecular Event Extraction using Natural Language Processing
Biomedical research and discoveries are communicated through scholarly publications and this literature is voluminous, rich in scientific text and growing exponentially by the day. Biomedical journals publish nearly three thousand research articles daily, making literature search a challenging proposition for researchers. Biomolecular events involve genes, proteins, metabolites, and enzymes that provide invaluable insights into biological processes and explain the physiological functional mechanisms. Text mining (TM) or extraction of such events automatically from big data is the only quick and viable solution to gather any useful information. Such events extracted from biological literature have a broad range of applications like database curation, ontology construction, semantic web search and interactive systems. However, automatic extraction has its challenges on account of ambiguity and the diverse nature of natural language and associated linguistic occurrences like speculations, negations etc., which commonly exist in biomedical texts and lead to erroneous elucidation. In the last decade, many strategies have been proposed in this field, using different paradigms like biomedical natural language processing (BioNLP), machine learning and deep learning. Also, new parallel computing architectures like graphics processing units (GPUs) have emerged as possible candidates to accelerate the event extraction pipeline. This paper reviews and summarizes the key approaches in complex biomolecular big data event extraction tasks and recommends a balanced architecture in terms of accuracy, speed, computational cost, and memory usage towards developing a robust GPU-accelerated BioNLP system
The benefits of resource discovery for publishers: a librarian’s view
A core goal of librarians is to maximize usage of the content to which their libraries subscribe. Webscale or resource discovery systems offer a single search box for library users to access subscribed content. This article examines usage data at the University of Huddersfield to show how resource discovery has helped to increase the usage of publisher content that has been made available to discovery vendors, and considers the implications for publishers who are yet to do this. The article concludes that resource discovery systems have effectively levelled the playing field, allowing small to medium sized publishers to make content discoverable to users, and encourages publishers who do not have their content indexed in resource discovery systems to speak to discovery service vendors in order to do so at the earliest opportunity
Library Resources: Procurement, Innovation and Exploitation in a Digital World
The possibilities of the digital future require new models for procurement, innovation and exploitation. Emma Crowley and Chris Spencer describe the skills staff need to deliver resources in hybrid and digital environments. The chapter demonstrates the innovative ways that librarians use to procure and exploit the wealth of resources available in a digital world. They also describe the technological developments that can be adopted to improve workflow processes and they highlight the challenges faced on this fascinating journey
ARM Wrestling with Big Data: A Study of Commodity ARM64 Server for Big Data Workloads
ARM processors have dominated the mobile device market in the last decade due
to their favorable computing to energy ratio. In this age of Cloud data centers
and Big Data analytics, the focus is increasingly on power efficient
processing, rather than just high throughput computing. ARM's first commodity
server-grade processor is the recent AMD A1100-series processor, based on a
64-bit ARM Cortex A57 architecture. In this paper, we study the performance and
energy efficiency of a server based on this ARM64 CPU, relative to a comparable
server running an AMD Opteron 3300-series x64 CPU, for Big Data workloads.
Specifically, we study these for Intel's HiBench suite of web, query and
machine learning benchmarks on Apache Hadoop v2.7 in a pseudo-distributed
setup, for data sizes up to files, web pages and tuples. Our
results show that the ARM64 server's runtime performance is comparable to the
x64 server for integer-based workloads like Sort and Hive queries, and only
lags behind for floating-point intensive benchmarks like PageRank, when they do
not exploit data parallelism adequately. We also see that the ARM64 server
takes the energy, and has an Energy Delay Product (EDP) that
is lower than the x64 server. These results hold promise for ARM64
data centers hosting Big Data workloads to reduce their operational costs,
while opening up opportunities for further analysis.
Comment: Accepted for publication in the Proceedings of the 24th IEEE
International Conference on High Performance Computing, Data, and Analytics
(HiPC), 201
Stigmergic hyperlink's contributes to web search
Stigmergic hyperlinks are hyperlinks with a "heart beat": if used they stay healthy and online; if
neglected, they fade, eventually getting replaced. Their life attribute is a relative usage measure that
regular hyperlinks do not provide, hence PageRank-like measures have historically been well
informed about the structure of webs of documents, but unaware of what users effectively do with
the links.
This paper elaborates on how to input the users’ perspective into Google’s original, structure centric,
PageRank metric. The discussion then bridges to the Deep Web, some search challenges, and how
stigmergic hyperlinks could help decentralize the search experience, facilitating user generated
search solutions and supporting new related business models.
Web 2.0 and destination marketing: current trends and future directions
Over the last decade, destination marketers and Destination Marketing Organizations (DMOs) have increasingly invested in Web 2.0 technologies as a cost-effective means of promoting destinations online, in the face of drastic marketing budget cuts. Recent scholarly and industry research has emphasized that Web 2.0 plays an increasing role in destination marketing. However, no comprehensive appraisal of this research area has been conducted so far. To address this gap, this study conducts a quantitative literature review to examine the extent to which Web 2.0 features in destination marketing research that was published until December 2019, by identifying research topics, gaps and future directions, and designing a theory-driven agenda for future research. The study’s findings indicate an increase in scholarly literature revolving around the adoption and use of Web 2.0 for destination marketing purposes. However, the emerging research field is fragmented in scope and displays several gaps. Most of the studies are descriptive in nature and a strong overarching conceptual framework that might help identify critical destination marketing problems linked to Web 2.0 technologies is missing