
    Critical Analysis of Solutions to Hadoop Small File Problem

    The Hadoop big data platform is designed to process large volumes of data, and the small file problem is a known performance bottleneck in Hadoop processing. Files smaller than the Hadoop block size create substantial storage overhead at the NameNode and also waste computational resources, because each small file spawns its own map task. Various solutions, such as merging small files or mapping multiple map threads to the same Java Virtual Machine instance, have been proposed to address the small file problem in Hadoop. This survey provides a critical analysis of existing work on small file problems in Hadoop and related platforms such as Spark. The aim is to understand their effectiveness in reducing storage and computational overhead and to identify open issues for further research.

    A Semi-supervised Corpus Annotation for Saudi Sentiment Analysis Using Twitter

    In the literature, limited work has been conducted on developing sentiment resources for the Saudi dialect. The lack of resources such as dialectal lexicons and corpora is one of the major bottlenecks to the successful development of Arabic sentiment analysis models. In this paper, a semi-supervised approach is presented for constructing an annotated sentiment corpus for the Saudi dialect using Twitter. The approach is primarily based on a list of lexicons built using word embedding techniques such as word2vec. A large corpus extracted from Twitter is annotated and manually reviewed to exclude incorrectly annotated tweets; the resulting corpus is publicly available. For corpus validation, state-of-the-art classification algorithms (Logistic Regression, Support Vector Machine, and Naive Bayes) are applied and evaluated. Simulation results demonstrate that the Naive Bayes algorithm outperformed all other approaches, achieving accuracy of up to 91%.
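The validation step the abstract describes, training a Naive Bayes classifier on the labeled tweets, can be sketched with a minimal multinomial Naive Bayes over tokenized text. The tiny English example corpus and function names below are illustrative stand-ins, not the paper's data or implementation:

```python
import math
from collections import Counter

def train_nb(docs, labels):
    """Fit a multinomial Naive Bayes model.

    docs: list of token lists; labels: parallel list of class names.
    Returns (class_counts, per-class word counts, vocabulary).
    """
    class_counts = Counter(labels)
    word_counts = {c: Counter() for c in class_counts}
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab

def predict_nb(model, tokens):
    """Return the class with the highest log-posterior,
    using add-one (Laplace) smoothing for unseen words."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_class, best_lp = None, -math.inf
    for c, count in class_counts.items():
        lp = math.log(count / total_docs)  # log prior
        denom = sum(word_counts[c].values()) + len(vocab)
        for t in tokens:
            lp += math.log((word_counts[c][t] + 1) / denom)
        if lp > best_lp:
            best_class, best_lp = c, lp
    return best_class
```

In the paper's setting the documents would be the manually reviewed Saudi-dialect tweets and the labels their sentiment annotations; a held-out split of the same corpus then yields the reported accuracy figures.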