Search CORE

477 research outputs found

JUMMP: Job Uninterrupted Maneuverable MapReduce Platform

Author: Apon Amy
Duffy Edward
Moody William Clay
Ngo Linh B.
Publication venue: Clemson University Libraries
Publication date: 01/09/2013
Field of study

In this paper, we present JUMMP, the Job Uninterrupted Maneuverable MapReduce Platform, an automated scheduling platform that provides a customized Hadoop environment within a batch-scheduled cluster environment. JUMMP enables an interactive pseudo-persistent MapReduce platform within the existing administrative structure of an academic high performance computing center by “jumping” between nodes with minimal administrative effort. Jumping is implemented by the synchronization of stopping and starting daemon processes on different nodes in the cluster. Our experimental evaluation shows that JUMMP can be as efficient as a persistent Hadoop cluster on dedicated computing resources, depending on the jump time. Additionally, we show that the cluster remains stable, with good performance, in the presence of jumps that occur as frequently as the average length of reduce tasks of the currently executing MapReduce job. JUMMP provides an attractive solution to academic institutions that desire to integrate Hadoop into their current computing environment within their financial, technical, and administrative constraints

Clemson University: TigerPrints

Scalable Audience Reach Estimation in Real-time Online Advertising

Author: Dasdan Ali
Foldes Peter
Jalali Ali
Kolay Santanu
Publication venue
Publication date: 13/05/2013
Field of study

Online advertising has been introduced as one of the most efficient methods of advertising throughout the recent years. Yet, advertisers are concerned about the efficiency of their online advertising campaigns and consequently, would like to restrict their ad impressions to certain websites and/or certain groups of audience. These restrictions, known as targeting criteria, limit the reachability for better performance. This trade-off between reachability and performance illustrates a need for a forecasting system that can quickly predict/estimate (with good accuracy) this trade-off. Designing such a system is challenging due to (a) the huge amount of data to process, and, (b) the need for fast and accurate estimates. In this paper, we propose a distributed fault tolerant system that can generate such estimates fast with good accuracy. The main idea is to keep a small representative sample in memory across multiple machines and formulate the forecasting problem as queries against the sample. The key challenge is to find the best strata across the past data, perform multivariate stratified sampling while ensuring fuzzy fall-back to cover the small minorities. Our results show a significant improvement over the uniform and simple stratified sampling strategies which are currently widely used in the industry

arXiv.org e-Print Archive

CiteSeerX

Crossref

LightLog: A lightweight temporal convolutional network for log anomaly detection on the edge

Author: Chen Luke
Fang Hui
Qin Jing
Tian Ji-Yu
Wang Zumin
Publication venue: 'Elsevier BV'
Publication date: 11/02/2022
Field of study

Ulster University's Research Portal

Designing, Building, and Modeling Maneuverable Applications within Shared Computing Resources

Author: Moody William Clay
Publication venue: Clemson University Libraries
Publication date: 01/05/2015
Field of study

Extending the military principle of maneuver into war-ﬁghting domain of cyberspace, academic and military researchers have produced many theoretical and strategic works, though few have focused on researching actual applications and systems that apply this principle. We present our research in designing, building and modeling maneuverable applications in order to gain the system advantages of resource provisioning, application optimization, and cybersecurity improvement. We have coined the phrase “Maneuverable Applications” to be deﬁned as distributed and parallel application that take advantage of the modiﬁcation, relocation, addition or removal of computing resources, giving the perception of movement. Our work with maneuverable applications has been within shared computing resources, such as the Clemson University Palmetto cluster, where multiple users share access and time to a collection of inter-networked computers and servers. In this dissertation, we describe our implementation and analytic modeling of environments and systems to maneuver computational nodes, network capabilities, and security enhancements for overcoming challenges to a cyberspace platform. Speciﬁcally we describe our work to create a system to provision a big data computational resource within academic environments. We also present a computing testbed built to allow researchers to study network optimizations of data centers. We discuss our Petri Net model of an adaptable system, which increases its cybersecurity posture in the face of varying levels of threat from malicious actors. Lastly, we present work and investigation into integrating these technologies into a prototype resource manager for maneuverable applications and validating our model using this implementation

Clemson University: TigerPrints

MaxHadoop: An Efficient Scalable Emulation Tool to Test SDN Protocols in Emulated Hadoop Environments

Author: Alessio Carmenini
Andrea Marotta
Claudio Calcaterra
Dajana Cassioli
Ubaldo Bucci
Publication venue
Publication date: 29/07/2020
Field of study

AbstractThis paper presents MaxHadoop, a flexible and scalable emulation tool, which allows the efficient and accurate emulation of Hadoop environments over Software Defined Networks (SDNs). Hadoop has been designed to manage endless data-streams over networks, making it a tailored candidate to support the new class of network services belonging to Big Data. The development of Hadoop is contemporary with the evolution of networks towards the new architectures "Software Defined." To create our emulation environment, tailored to SDNs, we employ MaxiNet, given its capability of emulating large-scale SDNs. We make it possible to emulate realistic Hadoop scenarios on large-scale SDNs using low-cost commodity hardware, by resolving a few key limitations of MaxiNet through appropriate configuration settings. We validate the MaxHadoop emulator by executing two benchmarks, namely WordCount and TeraSort, to evaluate a set of Key Performance Indicators. The tests' outcomes evidence that MaxHadoop outperforms other existing emulation tools running over commodity hardware. Finally, we show the potentiality of MaxHadoop by utilizing it to perform a comparison of SDN-based network protocols

Open Access Repository

Recommended from our members

Novel information and data exchange within power systems using enhanced blockchain technologies

Author: Amjad Mubashar
Publication venue: Brunel University London
Publication date: 01/01/2023
Field of study

This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University LondonCurrent energy systems are primarily designed for centralized power generation and supplying bulk electricity to users with stable and predictable usage patterns. However, with the increasing penetration of renewable energy sources (RES), future energy systems will require greater flexibility and wider distribution of both demand and supply. Integrating RES on a large scale poses challenges to the hosting capacity of distribution systems. To address these challenges, the digitalization of energy systems through novel Information and Communication Technologies (ICT) infrastructure is essential. The shift from centralized to highly distributed systems necessitates increased coordination and communication efforts. This is because a distributed system is composed of multiple independent entities that need to communicate and collaborate effectively to accomplish a shared objective. Coordination and communication are necessary to ensure that the system is operating efficiently and effectively. Traditional centralized cloud-based data exchange schemes depend on a single trusted third party, this may lead to single-point failure and lack of data privacy and access control. To overcome these issues, a novel approach is proposed for exchanging data within power systems using blockchain technology. This approach enables users to securely exchange data while maintaining ownership. The experiments conducted demonstrate that the proposed approach can handle more users and enables information and data exchange within power systems. Secondly, this thesis proposes an Artificial Neural Network (ANN) based prediction model to optimize the performance of the blockchain-enabled data exchange approach. A use case for exchanging data within the power system is implemented on the proposed platform using various performance metrics. The results of the proposed approach are compared to two other schemes: the baseline scheme and an optimized scheme. The evaluation results indicate that the proposed approach can enhance network performance when compared to the baseline and optimized schemes. In summary, the proposed novel approach to ICT infrastructure for successfully exchanging information and data within power systems entities. The performance of the novel approach is evaluated based on the ability to handle multiple users, scalability, reliability, and security

Brunel University Research Archive

Towards platforms for improved recommender systems at social media scale

Author: Sowinski Christina Diedhiou
Publication venue
Publication date: 01/09/2019
Field of study

Portsmouth University Research Portal (Pure)