
28 research outputs found

Data Warehouse with Big Data Technology for Higher Education

Author: Santoso Leo Willyanto
Yulia
Publication venue
Publication date: 18/12/2017
Field of study

Nowadays, data warehouse tools and technologies cannot handle the loading and analysis of data into meaningful information for top management. Big data technology should be implemented to extend existing data warehouse solutions. Universities already collect vast amounts of data, so their academic data has grown significantly and become big academic data. These datasets are rich and growing. The university's top-level management needs tools to produce information from the records. The generated information is expected to support the decision-making process of top-level management. This paper explores how big data technology could be implemented with a data warehouse to support the decision-making process. In this framework, we propose Hadoop as the big data analytics tool to be implemented for data ingestion/staging. The paper concludes by outlining future directions relating to the development and implementation of an institutional project on Big Data.

Scientific Repository

Data Warehouse with Big Data Technology for Higher Education

Author: Santoso Leo Willyanto
Yulia
Publication venue
Publication date: 01/01/2017
Field of study

Nowadays, data warehouse tools and technologies cannot handle the loading and analysis of data into meaningful information for top management. Big data technology should be implemented to extend existing data warehouse solutions. Universities already collect vast amounts of data, so their academic data has grown significantly and become big academic data. These datasets are rich and growing. The university's top-level management needs tools to produce information from the records. The generated information is expected to support the decision-making process of top-level management. This paper explores how big data technology could be implemented with a data warehouse to support the decision-making process. In this framework, we propose Hadoop as the big data analytics tool to be implemented for data ingestion/staging. The paper concludes by outlining future directions relating to the development and implementation of an institutional project on Big Data.

Crossref

Scientific Repository

Exploratory Research on Developing Hadoop-based Data Analytics Tools

Author: Basuki Kenny
Dewi Lily Puspa
Handojo Andreas
Mirabel Mikiavonty Endrawati
Palit Henry Novianus
Publication venue
Publication date: 29/09/2017
Field of study

As we passed the millennium mark, slowly but steadily we were flooded by (digital) data from virtually everywhere. The challenges of big data demand new methods, techniques, and paradigms for processing data in a fast and scalable fashion. Most enterprises today need to employ data analysis technologies to remain competitive and profitable. Alas, the adoption of big data analysis technologies in Indonesia is still in its infancy, even in the academic sector. To encourage wider adoption of big data technologies, this study explored the development of Hadoop-based data analytics tools. Two case studies were used in the exploration. One showcases a performance comparison between Hadoop and a DBMS, whereas the other compares Hadoop with a statistical analysis tool. Results clearly demonstrate that Hadoop is superior in processing large data sizes. We also derive some recommendations to tune Hadoop optimally.

Crossref

Scientific Repository

Sketch of Big Data Real-Time Analytics Model

Author: Amen Bakhtiar
Lu Joan
Publication venue
Publication date: 24/05/2015
Field of study

Big Data has drawn huge attention from researchers in information sciences and from decision makers in governments and enterprises. However, there is a lot of potential and highly useful value hidden in the huge volume of data. Data is the new oil, but unlike oil, data can be refined further to create even more value. Therefore, a new scientific paradigm is born as data-intensive scientific discovery, also known as Big Data. The growing volume of real-time data requires new techniques and technologies to discover valuable insight. In this paper we introduce the Big Data real-time analytics model as a new technique. We discuss and compare several Big Data technologies for real-time processing, along with various challenges and issues in adopting Big Data. Real-time Big Data analysis based on a cloud computing approach is our future research direction.

University of Liverpool Repository

University of Huddersfield Repository

Big data analysis solutions using mapReduce framework

Author: Elagib Sara B.
Hassan Abdalla Hashim Aisha
Najeeb Athaur Rahman
Olanrewaju Rashidah Funke
Publication venue
Publication date: 01/01/2014
Field of study

Recently, data has been generated from a variety of sources in massive volumes, at high rates, and with differing structures; data with these characteristics is called Big Data. Processing and analyzing Big Data is a challenge for current systems because they were designed without Big Data requirements in mind, and most of them were built on a centralized architecture, which is not suitable for Big Data processing because it results in high processing cost and low processing performance and quality. The MapReduce framework was built as a parallel distributed programming model to process such large-scale datasets effectively and efficiently. This paper presents six successful Big Data software analysis solutions implemented on the MapReduce framework, describing their dataset structures and how they were implemented, so that it can guide and help other researchers in their own Big Data solutions.

Crossref

The International Islamic University Malaysia Repository

A Survey on Data Mining and Analysis in Hadoop and MongoDb

Author: C.Zala Manmitsinh
Dhobi Jitendra S.
Publication venue: The International Institute for Science, Technology and Education (IISTE)
Publication date: 29/06/2015
Field of study

Data mining is a process that generates patterns and rules from various types of data marts and data warehouses. The process comprises several steps, including data cleaning and data anomaly detection, after which the clean data is mined with various approaches. In this research we discuss data mining on large datasets (Big Data), where the major issues are scalability and security. Hadoop is the tool used to mine the data, and MongoDB, which follows a key-value paradigm for parsing the data, provides its input. Other approaches and their capability for data storage are also discussed in this report. MapReduce is a method that can be used to reduce the dataset in order to reduce query processing time and improve system throughput. In the proposed system we mine big data with Hadoop and MongoDB, try mining the data with sorted or double-sorted key-value pairs, and analyze the outcome of the system. Keywords: Data Mining, Hadoop, MapReduce, HDFS, MongoDB

International Institute for Science, Technology and Education (IISTE): E-Journals

Potenziale von Big-Data im Gesundheitswesen

Author: Sariyar Murat
Walser Konrad
Publication venue: BFH-Zentrum Digital Society
Publication date: 05/05/2017
Field of study

Owing to increasing IT support, hospitals produce a large amount of digitally stored information. The question therefore arises ever more frequently of what benefit the analysis and use of large heterogeneous data volumes (Big Data) can offer.

Berner Fachhochschule: ARBOR

Concept and benchmark results for Big Data energy forecasting based on Apache Spark

Author: A McAfee
C Monteiro
H Hassani
H Maaß
J Antonanzas
J Jung
JÁ González Ordiano
L Breiman
L Fahrmeir
MJ Duran
R Shyam
RJ Hyndman
S Fan
S Gottwalt
S Klaiber
S Pelland
S Zhou
T Gneiting
T Hong
V Hagenmeyer
X Fang
Y Simmhan
Publication venue: SpringerOpen
Publication date: 01/03/2018
Field of study

The present article describes a concept for the creation and application of energy forecasting models in a distributed environment. Additionally, a benchmark comparing the time required for the training and application of data-driven forecasting models on a single computer and a computing cluster is presented. This comparison is based on a simulated dataset and both R and Apache Spark are used. Furthermore, the obtained results show certain points in which the utilization of distributed computing based on Spark may be advantageous

Crossref

KITopen

Directory of Open Access Journals

Um modelo de arquitetura em camadas empilhadas para Big Data

Author: Nieto Lemus Alba Consuelo
Ordóñez Salinas Sonia
Publication venue: Universidad Icesi
Publication date: 01/04/2016
Field of study

Until recently, the issue of analytical data was related to the Data Warehouse, but Big Data has arisen from the need to analyze new types of unstructured data, both repetitive and non-repetitive. Although this subject has been widely studied, no reference architecture is available for Big Data systems that incorporates the processing of large volumes of raw data, aggregated and non-aggregated. Nor are there complete proposals for managing the data lifecycle or a standardized terminology in this area, much less a methodology supporting the design and development of such an architecture. There are only small-scale, industrial, product-oriented architectures, which limit their scope to solutions for a company or group of companies and focus on technology while omitting the functional point of view. This paper explores the requirements for formulating an architectural model that supports the analysis and management of structured and unstructured data, both repetitive and non-repetitive. It also considers some industrial or technology-oriented architectural proposals and finally proposes a logical model of a multi-layered tiered architecture, which aims to respond to the requirements covering both Data Warehouse and Big Data.

Universidad Icesi, Colombia: Portal de revistas

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Sistemas y Telemática

Biblioteca Digital - Universidad Icesi

A REVIEW ON INTERNET OF THINGS ARCHITECTURE FOR BIG DATA PROCESSING

Author: Akeel Awadh Wid
Khalaf Hamoud Alaa
Salah Hashim Ali
Publication venue: University of Information and Technology Communications
Publication date: 30/06/2020
Field of study

The importance of big data implementations has increased due to the large amount of data gathered via online gates. Businesses and organizations would benefit from big data analysis, e.g. analyzing the political, market, and social interests of people. The Internet of Things (IoT) presents many facilities that support big data transfer between various Internet objects. The integration of big data and IoT offers many implementations in daily life, such as GPS, satellite, and airplane tracking. Many challenges face the integration of big data transfer and IoT technology. The main challenges are the transfer architecture, transfer protocols, and transfer security. The main aim of this paper is to review useful IoT architectures for big data processing, taking into account various requirements such as the transfer protocol. This paper also reviews other important issues such as security requirements and the multiple IoT applications. In addition, the future directions of IoT and big data are explained in this paper.

Iraqi Journal for Computers and Informatics