QUERY PERFORMANCE EVALUATION OVER HEALTH DATA
In recent years, there has been a significant increase in the number and variety of application scenarios studied under the e-health umbrella. Each application generates immense data that grows constantly. In this context, storing and analyzing the data efficiently and economically with conventional database management tools becomes an important challenge. Traditional relational database systems may not meet the requirements of the increased variety, volume, velocity, and dynamic structure of the new datasets. Effective healthcare data management and its transformation into information/knowledge are therefore challenging issues. Organizations that deal with immense data, especially hospitals and medical centers, either have to purchase new systems or re-tool what they already have. The new so-called NoSQL data models, together with management tools such as the Hadoop Distributed File System, are replacing RDBMSs, especially in real-time healthcare data analytics. Performing complex reporting in these applications becomes a real challenge as the size of the data grows exponentially, while customers demand complex analysis and reporting on those data. Compared to traditional DBs, the Hadoop framework is designed to process large volumes of data. In this study, we examine the query performance of traditional DBs and Big Data platforms on healthcare data. We explore whether it is really necessary to invest in a big data environment to run queries on high-volume data, or whether this can also be done with current relational database management systems and their supporting hardware infrastructure. We present our experience and a comprehensive performance evaluation of data management systems in the context of application performance.
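The kind of comparison this abstract describes can be sketched as a micro-benchmark: time one analytical query on a relational engine, then submit the identical SQL to a Hadoop-side engine such as Hive or Spark SQL. The sketch below uses SQLite purely as a stand-in for the RDBMS side; the table, column names, and data are illustrative assumptions, not the paper's actual setup.

```python
# Hypothetical micro-benchmark sketch: time an analytical GROUP BY query
# on a relational engine (SQLite as a small stand-in). Schema and data
# are invented for illustration.
import random
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE admissions (patient_id INT, dept TEXT, cost REAL)")
rng = random.Random(42)
rows = [(i, rng.choice(["cardio", "neuro", "ortho"]), rng.uniform(100, 5000))
        for i in range(100_000)]
conn.executemany("INSERT INTO admissions VALUES (?, ?, ?)", rows)

query = "SELECT dept, COUNT(*), AVG(cost) FROM admissions GROUP BY dept"
t0 = time.perf_counter()
result = conn.execute(query).fetchall()
elapsed = time.perf_counter() - t0
print(f"{len(result)} groups in {elapsed * 1000:.1f} ms")
# The same SQL string could be submitted to Hive or Spark SQL and timed
# there, which is the style of experiment the abstract describes.
```

Repeating such a query at growing data volumes on both platforms is one way to see where (or whether) the relational engine stops being economical.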
Pemanfaatan Database Kependudukan Terdistribusi pada Ragam Aplikasi Sistem Informasi di Pemerintah Kabupaten/Kota (Utilization of a Distributed Population Database in Various Information System Applications in Regency/City Governments)
The Government of Indonesia, through Kemendagri (the Ministry of Home Affairs), is currently implementing an electronic ID card (e-KTP) program built on the Single Identification Number (SIN). The program is expected to generate an accurate database of the national population. The availability of a better population database will provide maximum benefit if it can be utilized in various information system applications at the regency/city level. A distributed population database is a potential enabler of better e-Government, through the development of various primary and secondary/derivative information systems with an integrated database, middleware, and applications built on web service technologies. Continuous evaluation of system performance also needs to be done as part of the information systems life cycle.
Keywords: e-Government, e-KTP, integration, population database, web service
Performance Analysis of Scalable SQL and NoSQL Databases: A Quantitative Approach
Benchmarking is a common method for evaluating and choosing a NoSQL database, and many benchmark reports are already available on the internet and in research papers. Most of them measure database performance only by overall throughput and latency. This is an adequate performance analysis, but it need not be the end. We define some new perspectives that should also be considered during NoSQL performance analysis, and we demonstrate this approach by benchmarking HBase, MongoDB, and sharded MySQL using YCSB. Based on the results, we observe that NoSQL databases do not consider the capability of the data nodes while assigning data to them. These databases' performance is seriously affected by bottleneck nodes, and the databases do not attempt to resolve this bottleneck situation automatically.
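One of the extra perspectives argued for here, load imbalance across data nodes, can be checked from per-node operation counters after a benchmark run. The sketch below uses made-up counter values and a simple threshold; the node names, numbers, and the 1.5x cutoff are all illustrative assumptions, not the paper's methodology.

```python
# Illustrative post-processing of per-node benchmark counters (values are
# invented): flag a bottleneck node whose share of operations far exceeds
# the others, rather than looking only at aggregate throughput/latency.
from statistics import mean, pstdev

ops_per_node = {"node1": 41_000, "node2": 12_500, "node3": 13_100, "node4": 12_900}

values = list(ops_per_node.values())
avg = mean(values)
cv = pstdev(values) / avg            # coefficient of variation of node load
bottlenecks = [n for n, ops in ops_per_node.items() if ops > 1.5 * avg]

print(f"load CV = {cv:.2f}, bottleneck candidates: {bottlenecks}")
```

A high coefficient of variation together with a short list of overloaded nodes is exactly the situation the abstract reports: the cluster's headline throughput hides the fact that one node is doing most of the work.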
Big Data in the Cloud: A Survey
Big Data has become a hot topic across several business areas requiring the storage and processing of huge volumes of data. Cloud computing leverages Big Data by providing high storage and processing capabilities, and it enables corporations to consume resources in a pay-as-you-go model, making clouds the optimal environment for storing and processing huge quantities of data. By using virtualized resources, the Cloud can scale very easily, be highly available, and provide massive storage capacity and processing power. This paper surveys existing database models for storing and processing Big Data within a Cloud environment. In particular, we detail the following NoSQL databases: BigTable, Cassandra, DynamoDB, HBase, Hypertable, and MongoDB. The MapReduce framework and its developments Apache Spark, HaLoop, and Twister, as well as other alternatives such as Apache Giraph, GraphLab, Pregel, and MapD - a novel platform that uses GPU processing to accelerate Big Data processing - are also analyzed. Finally, we present two case studies that demonstrate the successful use of Big Data within Cloud environments and the challenges that must be addressed in the future.
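The MapReduce model the survey covers can be illustrated with the classic word-count example, compressed into a single process: a map phase emits (key, value) pairs, a shuffle groups values by key, and a reduce phase aggregates each group. This is only a toy sketch; real frameworks such as Hadoop and Spark distribute these phases across a cluster.

```python
# Toy single-process illustration of the MapReduce programming model:
# map emits (word, 1) pairs, shuffle groups them by word, reduce sums
# the counts per word.
from collections import defaultdict

def map_phase(docs):
    for doc in docs:
        for word in doc.split():
            yield word.lower(), 1

def shuffle(pairs):
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    return {word: sum(counts) for word, counts in groups.items()}

docs = ["Big Data in the Cloud", "big data processing"]
counts = reduce_phase(shuffle(map_phase(docs)))
print(counts)  # e.g. {'big': 2, 'data': 2, 'in': 1, ...}
```

Systems like Spark and HaLoop improve on this basic model mainly by keeping intermediate data in memory across iterations instead of writing it back to disk between phases.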
SparkIR: a Scalable Distributed Information Retrieval Engine over Spark
Search engines have to deal with a huge amount of data (e.g., billions of documents in the case of the Web) and must find scalable and efficient ways to produce effective search results. In this thesis, we propose to use the Spark framework, an in-memory distributed big data processing framework, and leverage its powerful capabilities for handling large amounts of data to build an efficient and scalable experimental search engine over textual documents. The proposed system, SparkIR, can serve as a research framework for conducting information retrieval (IR) experiments. SparkIR supports two indexing schemes, document-based partitioning and term-based partitioning, to adopt document-at-a-time (DAAT) and term-at-a-time (TAAT) query evaluation methods. Moreover, it offers static and dynamic pruning to improve retrieval efficiency: for static pruning it employs champion lists and tiering, while for dynamic pruning it uses MaxScore top-k retrieval. We evaluated the performance of SparkIR using the ClueWeb12-B13 collection, which contains about 50M English Web pages. Experiments over different subsets of the collection, compared against an Elasticsearch baseline, show that SparkIR exhibits reasonable efficiency and scalability overall for both indexing and retrieval. Because SparkIR is implemented as an open-source library over Spark, its users can also benefit from other Spark libraries (e.g., MLlib and GraphX), which therefore eliminates the need of usin
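The TAAT strategy mentioned above can be shown on a toy in-memory inverted index: each query term's full postings list is accumulated into per-document score buckets before the next term is processed (DAAT instead walks all terms' postings in parallel, document by document). The index contents and weights below are invented for illustration and have nothing to do with SparkIR's actual data structures.

```python
# Toy term-at-a-time (TAAT) scoring over an in-memory inverted index.
# Postings map a term to {doc_id: term_weight}; weights are made up.
from collections import defaultdict

index = {
    "spark":  {1: 2.0, 3: 1.0},
    "search": {1: 1.5, 2: 2.5},
    "engine": {2: 1.0, 3: 0.5},
}

def taat_score(query_terms, k=2):
    # Accumulate each term's whole postings list before moving to the next
    # term; DAAT would instead advance all postings lists together.
    acc = defaultdict(float)
    for term in query_terms:
        for doc_id, weight in index.get(term, {}).items():
            acc[doc_id] += weight
    return sorted(acc.items(), key=lambda x: -x[1])[:k]

print(taat_score(["spark", "search"]))  # doc 1 scores 3.5, doc 2 scores 2.5
```

Pruning techniques like MaxScore build on the DAAT variant: they skip documents whose best possible remaining score cannot enter the current top-k, which is what makes large-scale retrieval affordable.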