    A Guide for selecting big data analytics tools in an organisation

    Selecting appropriate big data analytics (BDA) tools (software) for business purposes is increasingly challenging, and poor choices sometimes lead to incompatibility with existing technologies. This becomes prohibitive when attempting to execute some functions or activities in an environment. The objective of this study was to propose a model that can be used to guide the selection of BDA tools in an organisation. The interpretivist approach was employed. Qualitative data were collected and analysed using the hermeneutics approach. The analysis focused on examining and gaining a better understanding of the strengths and weaknesses of the most common BDA tools. The technical and non-technical factors that influence the selection of BDA tools were identified, and based on these a solution is proposed in the form of a model. The model is intended to guide the selection of the most appropriate BDA tools in an organisation, and to increase the usefulness of BDA towards improving the organisation's competitiveness.

    A Study of New Classification Criteria for Non-Relational Database Systems in Big Data Management

    With the development of internet applications and the huge daily increase in the volume of produced data, traditional relational database systems have shown weakness in managing this data efficiently, because of the restrictions imposed by their underlying design model. Non-relational databases recently appeared to solve this problem without depending on predesigned structures and schemas, reflecting a flexible model that has provided successful solutions for managing big databases. Within a short time these systems have developed greatly, both qualitatively and quantitatively, and have become a global phenomenon in database management that preceded the appearance of related research studies, something that specialists in the field have only recently noticed. Given the abundance of commercially marketed systems of this kind, comprehensive analytical and design studies of how they work have become a necessity. Such studies form a user guide towards the models and types suitable for managing users' big data, based on several classification properties, and provide an idea of the working mechanism of these systems and the techniques they use, which is what this research addresses.

    Proliferating Cloud Density through Big Data Ecosystem, Novel XCLOUDX Classification and Emergence of as-a-Service Era

    Big Data is permeating ever larger aspects of human life, for both scientific and commercial purposes, especially for massive-scale data analytics beyond the exabyte magnitude. As the footprint of Big Data applications continuously expands, reliance on cloud environments also increases, in order to obtain appropriate, robust and affordable services for dealing with Big Data challenges. Cloud computing avoids the need to locally maintain an overly scaled computing infrastructure, which includes not only dedicated space but also expensive hardware and software. Several data models to process Big Data have already been developed, and a number of such models are still emerging, potentially relying on heterogeneous underlying storage technologies, including cloud computing. In this paper, we investigate the growing role of cloud computing in the Big Data ecosystem. We also propose a novel XCLOUDX {XCloudX, X…X} classification to gauge the intuitiveness of the names of cloud-assisted NoSQL Big Data models, and analyse whether XCloudX models always use cloud computing underneath, or vice versa. XCloudX symbolises those NoSQL Big Data models that embody the term "cloud" in their name, where X is any alphanumeric variable. The discussion is strengthened by a set of important case studies. Furthermore, we study the emergence of the as-a-Service era, motivated by the cloud computing drive, and explore new members beyond the traditional cloud computing stack that have developed over the last few years.

    A multilevel approach to big data analysis using analytic tools and actor network theory

    Background: Over the years, big data analytics has been carried out statically, in a programmed way, which does not allow data sets to be interpreted from a subjective perspective. This approach limits understanding of why and how data sets manifest themselves in the various forms that they do, which negatively affects the accuracy, redundancy and usefulness of data sets and, in turn, the value of operations and the competitive effectiveness of an organisation. The current single-level approach also lacks the detailed examination of data sets that big data deserves in order to improve purposefulness and usefulness. Objective: The purpose of this study was to propose a multilevel approach to big data analysis, including an examination of how a sociotechnical theory, actor network theory (ANT), can be used complementarily with analytic tools for big data analysis. Method: Qualitative methods were employed from the interpretivist perspective. Results: From the findings, a framework was developed that offers big data analytics at two levels, micro- (strategic) and macro- (operational). Based on the framework, a model was developed that can be used to guide the analysis of heterogeneous data sets that exist within networks. Conclusion: The multilevel approach ensures a fully detailed analysis, which is intended to increase accuracy, reduce redundancy and put the manipulation and manifestation of data sets into perspective for improved organisational competitiveness.

    Identifying and Scoping Context-Specific Use Cases For Blockchain-Enabled Systems in the Wild.

    Advances in technology often provide a catalyst for digital innovation. Arising from the global banking crisis at the end of the first decade of the 21st century, decentralised and distributed systems have seen a surge in growth and interest. Blockchain technology, the foundation of the decentralised virtual currency Bitcoin, is one such catalyst. The main component of a blockchain is its public record of verified, timestamped transactions, maintained in an append-only, chain-like data structure. This record is replicated across n nodes in a network of co-operating participants, and this distribution offers public proof of transactions verified in the past. Beyond tokens and virtual currency, real-world use cases for blockchain technology are in need of research and development. The challenge in this proof-of-concept research is to identify an orchestration model of innovation that leads to the successful development of software artefacts that utilise blockchain technology; these artefacts must maximise the potential of the technology and enhance the real-world business application. An original two-phase orchestration model is defined. The model includes both a discovery and an implementation phase, and implements state-of-the-art process innovation frameworks: Capability Maturity Modelling, Business Process Redesign, Open Innovation and Distributed Digital Innovation. The model succeeds in its aim of generating feasible problem-solution design pairings to be implemented as blockchain-enabled software systems. Three systems are developed: an internal supply-chain management system, a crowd-sourced sponsorship model for individual players on a team, and a proof-of-origin smart tag system. The contribution is to have defined an innovation model through which context-specific blockchain use cases can be identified and scoped in the wild.
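    The append-only, hash-linked record described in this abstract can be illustrated with a minimal sketch. This is not the thesis's implementation; all names and the transaction format are invented for illustration, assuming only that each block stores the SHA-256 hash of its predecessor so that tampering with history breaks the chain:

```python
import hashlib
import json
import time

def block_hash(block):
    """Deterministic SHA-256 digest of a block's serialised contents."""
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

class Chain:
    """Append-only chain of timestamped transaction records."""

    def __init__(self):
        # A fixed genesis block anchors the chain.
        genesis = {"index": 0, "timestamp": 0.0, "txs": [], "prev": "0" * 64}
        self.blocks = [genesis]

    def append(self, txs):
        """Append a new block of transactions, linked to the current tip."""
        prev = self.blocks[-1]
        block = {
            "index": prev["index"] + 1,
            "timestamp": time.time(),
            "txs": txs,
            "prev": block_hash(prev),  # hash link to the predecessor
        }
        self.blocks.append(block)
        return block

    def verify(self):
        """Re-derive every link; any edit to an earlier block breaks a hash."""
        for older, newer in zip(self.blocks, self.blocks[1:]):
            if newer["prev"] != block_hash(older):
                return False
        return True
```

    In a real blockchain the record is replicated across the n co-operating nodes the abstract mentions, and consensus rules decide which tip is authoritative; the hash links alone are what make the structure append-only in practice.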

    Big Data Analytics for the Cloud

    The work for this master's thesis is divided into three parts. The first part focuses on the study and presentation of scalable data processing architectures that address the Big Data challenge. The second focuses on the processing of a dataset comprising measurements collected by different sensors installed on a train. The last part focuses on the setup of a server for the open source IoT platform SiteWhere, the dispatch of data to the server, the storage of the data in a NoSQL database, and the processing of these data in a Spark instance. Chapter 1 provides an introduction. In Chapter 2, the architecture and capabilities of SiteWhere as a holistic solution for IoT device management are presented. Chapter 3 introduces the basic notions of the terms "Big Data" and "Cloud", and presents different solutions for the Big Data challenge along with the scientific trends on this topic. In Chapter 4, various clustering algorithms (KMeans, Birch, Mean Shift, DBSCAN) are studied and applied to the real dataset collected from the onboard train sensors. Chapter 5 introduces the notion of time-series forecasting and investigates the forecasting behaviour of two different types of neural networks (MLP, LSTM). Chapter 6 presents the work that took place on the SiteWhere platform: it begins with the dispatch of data to the server, continues with the visualisation, on Grafana, of the train data stored in InfluxDB (a database that SiteWhere supports), and then describes the retrieval of the data from the database and their processing (KMeans clustering and forecasting with MLP) on a Spark instance, together with a comparison between that process and the one on the local system. Chapter 7 provides a summary and highlights some of the conclusions derived and presented in the previous chapters.
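    The clustering step described for Chapter 4 can be sketched with a minimal, dependency-free version of Lloyd's KMeans. The thesis uses library implementations on Spark and on the local system; the function below and the toy sensor readings in the usage note are invented for illustration only:

```python
import random

def kmeans(points, k, iters=50, seed=0):
    """Minimal Lloyd's algorithm: repeatedly assign each point to its
    nearest centroid, then move each centroid to its cluster's mean."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct points
    for _ in range(iters):
        # Assignment step: bucket points by nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k),
                    key=lambda c: sum((a - b) ** 2
                                      for a, b in zip(p, centroids[c])))
            clusters[i].append(p)
        # Update step: recompute each centroid as its cluster's mean.
        new_centroids = []
        for i, cluster in enumerate(clusters):
            if cluster:
                dim = len(cluster[0])
                new_centroids.append(tuple(
                    sum(p[d] for p in cluster) / len(cluster)
                    for d in range(dim)))
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    return centroids, clusters
```

    On two well-separated groups of (for example) temperature/vibration readings, the loop converges in a few iterations; production pipelines such as the Spark one described above use the same assign-update cycle, distributed over partitions of the dataset.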