Unleashing the Power of Hashtags in Tweet Analytics with Distributed Framework on Apache Storm
Twitter is a popular social network platform where users interact by posting texts of up to 280 characters called tweets. Hashtags, hyperlinked words in tweets, have become increasingly crucial for tweet retrieval and search. Using hashtags for tweet topic classification is challenging because of context dependence among words, slang, abbreviations, and emoticons in a short tweet, along with the evolving use of hashtags. Since Twitter generates millions of tweets daily, tweet analytics is a fundamental Big Data streaming problem that often requires real-time distributed processing. This paper proposes a distributed online approach to tweet topic classification with hashtags. Implemented on Apache Storm, a distributed real-time framework, our approach incrementally identifies and updates a set of strong predictors in a Naïve Bayes model for classifying each incoming tweet instance. Preliminary experiments show promising results, with up to 97% accuracy and a 37% increase in throughput on eight processors.
Comment: IEEE International Conference on Big Data 201
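The incremental classification core described above can be sketched as a multinomial Naïve Bayes model with per-instance updates. This is a minimal illustrative sketch only: the class name `IncrementalNaiveBayes` and its methods are hypothetical, and the paper's Storm topology and strong-predictor selection are not reproduced here.

```python
import math
from collections import defaultdict

class IncrementalNaiveBayes:
    """Multinomial Naive Bayes with online, per-tweet updates.

    A simplified sketch: each labelled tweet updates the counts in
    place, so the model evolves with the stream rather than being
    retrained in batch.
    """

    def __init__(self):
        self.class_counts = defaultdict(int)   # tweets seen per class
        self.word_counts = defaultdict(lambda: defaultdict(int))
        self.vocab = set()

    def update(self, tokens, label):
        # Incorporate one labelled tweet (tokens include hashtags).
        self.class_counts[label] += 1
        for t in tokens:
            self.word_counts[label][t] += 1
            self.vocab.add(t)

    def predict(self, tokens):
        # Return the class with the highest log-posterior, using
        # Laplace smoothing so unseen tokens do not zero the score.
        total = sum(self.class_counts.values())
        best, best_score = None, float("-inf")
        for c, n_c in self.class_counts.items():
            score = math.log(n_c / total)
            denom = sum(self.word_counts[c].values()) + len(self.vocab)
            for t in tokens:
                score += math.log((self.word_counts[c][t] + 1) / denom)
            if score > best_score:
                best, best_score = c, score
        return best
```

In a Storm deployment, `update` and `predict` would run inside bolts consuming the tweet stream; here they are ordinary method calls.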
Boosting computational power through spatial multiplexing in quantum reservoir computing
Quantum reservoir computing provides a framework for exploiting the natural dynamics of quantum systems as a computational resource. It can implement real-time signal processing and solve temporal machine learning problems in general, which requires memory and nonlinear mapping of the recent input stream using quantum dynamics in the computational-supremacy region, where classical simulation of the system is intractable. A nuclear magnetic resonance spin-ensemble system, currently available in laboratories, is one of the realistic candidates for such physical implementations. In this paper, considering these realistic experimental constraints, we introduce a scheme, which we call a spatial multiplexing technique, to effectively boost the computational power of the platform. This technique exploits the disjoint dynamics of multiple different quantum systems driven in parallel by a common input stream. Accordingly, instead of designing a single large quantum system to increase the number of qubits available as computational nodes, a large number of qubits can be prepared from multiple small quantum systems, which are operationally easy to handle in laboratory experiments. We numerically demonstrate the effectiveness of the technique on several benchmark tasks and quantitatively investigate its specifications, range of validity, and limitations in detail.
Comment: 15 pages
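The spatial multiplexing idea can be illustrated with a classical echo-state-network analogue: several small, independent reservoirs driven by the same input stream, whose states are concatenated before a single linear readout. This is only a classical sketch of the structural idea under assumed parameters (reservoir sizes, spectral radius, memory task), not the paper's quantum model.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_reservoir(n, spectral_radius=0.9):
    # One small reservoir: random recurrent weights rescaled to the
    # desired spectral radius, plus random input weights.
    W = rng.standard_normal((n, n))
    W = W * (spectral_radius / max(abs(np.linalg.eigvals(W))))
    w_in = rng.standard_normal(n)
    return W, w_in

def run(W, w_in, inputs):
    # Drive one reservoir with the common input stream and collect states.
    x = np.zeros(W.shape[0])
    states = []
    for u in inputs:
        x = np.tanh(W @ x + w_in * u)
        states.append(x.copy())
    return np.array(states)

# Four disjoint reservoirs share one input stream ("spatial multiplexing");
# their state trajectories are concatenated into one feature matrix.
inputs = rng.uniform(-1, 1, 200)
target = np.roll(inputs, 1)        # short-term memory task: recall u(t-1)
reservoirs = [make_reservoir(10) for _ in range(4)]
X = np.hstack([run(W, w_in, inputs) for W, w_in in reservoirs])

# Single linear readout trained by least squares over all reservoirs.
w_out, *_ = np.linalg.lstsq(X[1:], target[1:], rcond=None)
pred = X[1:] @ w_out
```

The key structural point survives the classical simplification: the readout sees 4 × 10 nodes without any single 40-node system having to be built or coupled.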
An occam Style Communications System for UNIX Networks
This document describes the design of a communications system which provides occam-style communications primitives under a Unix environment, using TCP/IP protocols and any number of other protocols deemed suitable as underlying transport layers. The system will integrate with a low-overhead scheduler/kernel without incurring significant costs to the execution of processes within the run-time environment. A survey of relevant occam and occam3 features and related research is followed by a look at the Unix and TCP/IP facilities which determine our working constraints, and a description of the T9000 transputer's Virtual Channel Processor, which was instrumental in our formulation. Drawing on the information presented here, a design for the communications system is then proposed. Finally, a preliminary investigation is made of methods for lightweight access control to shared resources in an environment which provides no support for critical sections, semaphores, or busy waiting; this is presented with reference to the mutual exclusion problems which arise within the proposed design. Future directions for the evolution of this project are discussed in conclusion.
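The defining property of an occam channel is synchronous, point-to-point rendezvous: `ch ! value` blocks the sender until the matching `ch ? x` has taken the value. A minimal local sketch of that semantics, assuming one sender and one receiver per channel (the document's actual design layers such channels over TCP/IP, which is not attempted here):

```python
import queue
import threading

class Channel:
    """A point-to-point synchronised channel in the style of occam's CHAN.

    send() does not return until recv() has taken the value, giving the
    rendezvous semantics of occam's  ch ! value  /  ch ? x  primitives.
    Assumes exactly one sender and one receiver.
    """

    def __init__(self):
        self._slot = queue.Queue(maxsize=1)
        self._taken = threading.Event()

    def send(self, value):            # occam:  ch ! value
        self._taken.clear()
        self._slot.put(value)
        self._taken.wait()            # block until the receiver has the value

    def recv(self):                   # occam:  ch ? x
        value = self._slot.get()
        self._taken.set()             # release the waiting sender
        return value
```

A real implementation over sockets would replace the in-process queue with a TCP connection per virtual channel, in the spirit of the T9000 Virtual Channel Processor described above.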
A Case Study in Coordination Programming: Performance Evaluation of S-Net vs Intel's Concurrent Collections
We present a programming methodology and runtime performance case study
comparing the declarative data flow coordination language S-Net with Intel's
Concurrent Collections (CnC). As a coordination language S-Net achieves a
near-complete separation of concerns between sequential software components
implemented in a separate algorithmic language and their parallel orchestration
in an asynchronous data flow streaming network. We investigate the merits of
S-Net and CnC with the help of a relevant and non-trivial linear algebra
problem: tiled Cholesky decomposition. We describe two alternative S-Net
implementations of tiled Cholesky factorization and compare them with two CnC
implementations, one with explicit performance tuning and one without, that
have previously been used to illustrate Intel CnC. Our experiments on a 48-core
machine demonstrate that S-Net manages to outperform CnC on this problem.
Comment: 9 pages, 8 figures, 1 table, accepted for PLC 2014 workshop
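The benchmark kernel both systems implement is tiled Cholesky factorization: the matrix is split into b×b tiles, and each step factors a diagonal tile, solves the tiles below it, and updates the trailing submatrix. A plain sequential NumPy sketch of those tile operations follows (the S-Net and CnC versions run them as concurrent data-flow tasks; `tiled_cholesky` and the tile size are illustrative assumptions):

```python
import numpy as np

def tiled_cholesky(A, b):
    """Right-looking tiled Cholesky: returns lower-triangular L with
    L @ L.T == A. Assumes A is symmetric positive definite and that
    the tile size b divides A's dimension.
    """
    n = A.shape[0]
    L = A.copy()
    nt = n // b
    tile = lambda i, j: L[i * b:(i + 1) * b, j * b:(j + 1) * b]
    for k in range(nt):
        # POTRF: factor the diagonal tile.
        tile(k, k)[:] = np.linalg.cholesky(tile(k, k))
        for i in range(k + 1, nt):
            # TRSM: L_ik = A_ik @ L_kk^{-T}
            tile(i, k)[:] = np.linalg.solve(tile(k, k), tile(i, k).T).T
        for i in range(k + 1, nt):
            for j in range(k + 1, i + 1):
                # SYRK/GEMM: trailing-submatrix update.
                tile(i, j)[:] -= tile(i, k) @ tile(j, k).T
    return np.tril(L)
```

The parallelism the paper exploits comes from the data-flow dependencies between these tile operations: all TRSMs of one step are independent, as are the trailing updates, which is what S-Net's streaming network and CnC's step collections each express in their own way.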
Data Workflow - A Workflow Model for Continuous Data Processing
Online and streaming data are becoming increasingly important for enterprise information systems, e.g. through the integration of sensor data and workflows. The continuous flow of data provided, for example, by sensors requires new workflow models that address the data perspective of these applications, since continuous data is potentially infinite while business process instances are always finite.
In this paper a formal workflow model with data-driven coordination is proposed, making explicit the properties of continuous data processing. These properties can be used to optimize data workflows, i.e., to reduce the computational power needed to process the workflows in an engine by reusing intermediate processing results across several workflows.
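The optimisation mentioned above, reusing intermediate results across workflows, can be illustrated with a toy operator graph in which a shared preprocessing operator caches its output per stream item, so two downstream workflows trigger it only once. The `Op` class and its API are entirely hypothetical, not the paper's formal model:

```python
class Op:
    """One operator in a continuous data workflow (toy sketch).

    The result for each stream item (identified by a sequence number)
    is cached, so workflows that share this operator reuse the
    intermediate result instead of recomputing it.
    """

    def __init__(self, fn, upstream=None):
        self.fn, self.upstream = fn, upstream
        self.calls = 0          # how often fn actually ran
        self._cache = {}

    def process(self, seq_no, item):
        if seq_no not in self._cache:
            value = item if self.upstream is None \
                else self.upstream.process(seq_no, item)
            self.calls += 1
            self._cache[seq_no] = self.fn(value)
        return self._cache[seq_no]

# Two workflows share the same (notionally expensive) preprocessing step.
source = Op(lambda x: x)
clean = Op(lambda x: x * 2, source)     # shared intermediate operator
wf_a = Op(lambda x: x + 1, clean)
wf_b = Op(lambda x: x - 1, clean)

for i, reading in enumerate([3, 5, 7]):  # finite stand-in for an infinite sensor stream
    wf_a.process(i, reading)
    wf_b.process(i, reading)
```

After the loop, `clean` has run only once per item despite feeding both workflows, which is the cost reduction the workflow model's explicit data properties are meant to enable.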