963 research outputs found

    Big data values deliverance: OSS model

    Open source software (OSS) repositories, like GitHub, conjointly build numerous big data projects. GitHub developers and/or its responders extend or enhance a project’s software capabilities. Over time, GitHub’s repositories are mined for new knowledge and capabilities. This study’s values-deliverance staging system data mines, isolates, collates and incorporates relevant GitHub text into values-deliverance model constructs. This suggests that differential construct effects influence a project’s activity levels. The study suggests that OSS big data platforms can be data mined with software to isolate and assess the embedded values. It also elucidates pathways where behavioral values-deliverance improvements to GitHub are likely to be most beneficial.

    Performance Evaluation of Hadoop based Big Data Applications with HiBench Benchmarking tool on IaaS Cloud Platforms

    Cloud computing is a computing paradigm where large numbers of devices are connected through networks that provide a dynamically scalable infrastructure for applications, data and storage. Currently, many businesses, from small companies to large companies and industries, are changing their operations to utilize cloud services because cloud platforms can increase a company’s growth through process efficiency and reduced information technology spending [Coles16]. Companies rely on cloud platforms such as Amazon Web Services, Google Compute Engine, and Microsoft Azure for their business development. Due to the emergence of new technologies, devices, and communications, the amount of data produced is growing rapidly every day. Big data is a collection of large datasets, typically hundreds of gigabytes, terabytes or petabytes. Storing and analyzing this huge volume of data is a great challenge for companies and new businesses, and it is the primary focus of this paper. This research was conducted on Amazon’s Elastic Compute Cloud (EC2) and Microsoft Azure platforms using the HiBench Hadoop Big Data Benchmark suite [HiBench16]. Processing huge volumes of data is a tedious task that is normally handled through traditional database servers. In contrast, Hadoop is a powerful framework that handles applications with big data requirements efficiently by using the MapReduce algorithm to run them on systems with many commodity hardware nodes. Hadoop’s distributed file system facilitates rapid storage and data transfer rates of big data among the nodes and remains operational even when a node in the cluster fails. HiBench is a big data benchmarking tool used to evaluate the performance of big data applications whose data are handled and controlled by a Hadoop cluster. A Hadoop cluster environment was enabled and evaluated on the two cloud platforms. A quantitative comparison was performed on Amazon EC2 and Microsoft Azure along with a study of their pricing models. Measures are suggested for future studies and research.
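
The MapReduce model mentioned in the abstract above can be illustrated with a minimal, single-process sketch. This is not Hadoop's API and not the paper's benchmark code: the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample input are hypothetical, chosen only to show the map, shuffle, and reduce steps on a word-count task. In a real cluster the splits would be distributed across nodes.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce step: collapse each key's list of values into one count."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents):
    """Run map, shuffle, and reduce in sequence (hypothetical helper).

    Here everything runs in one process; a framework such as Hadoop
    would run the map and reduce steps on many nodes in parallel.
    """
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    return reduce_phase(shuffle_phase(mapped))


if __name__ == "__main__":
    # Illustrative input splits, not data from the paper.
    splits = ["big data on Hadoop", "Hadoop handles big data"]
    print(word_count(splits))
```

Keeping the three phases as separate functions mirrors the way the model lets each map call and each reduce call run independently, which is what allows the work to be spread over commodity nodes.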

    Some Contribution of Statistical Techniques in Big Data: A Review

    Big Data is a popular topic in research work. Everyone is talking about big data, and it is believed that science, business, industry, government, society etc. will undergo a thorough change with the impact of big data. Big data refers to very large data sets that are complex, contain hidden patterns, and mix structured and unstructured data, which makes them difficult to collect, store, and analyze. Proper advanced techniques are therefore needed to gain knowledge from big data. Big data research faces major challenges in storage, processing, search, sharing, transfer, analysis and visualization. This paper discusses in depth the introduction of big data, its issues, its management and the big data techniques in use. It also presents a review of various advanced statistical techniques for handling key big data applications that involve large data sets. These advanced techniques handle structured as well as unstructured big data in different areas.

    Efficiently Conducting Quality-of-Service Analyses by Templating Architectural Knowledge

    Previously, software architects were unable to effectively and efficiently apply reusable knowledge (e.g., architectural styles and patterns) to architectural analyses. This work tackles this problem with a novel method to create and apply templates for reusable knowledge. These templates capture reusable knowledge formally and can be integrated efficiently into architectural analyses.

    A New Hadoop Based Network Management System with Policy Approach

    In recent years, improvements in network technology and falling technology costs have led to the production of large amounts of data. This massive amount of data needs a mechanism for processing and mining information rapidly. This paper presents a new Hadoop based network management system with a policy approach, which is treated as a hierarchical manager. Storing and processing massive data efficiently are two capabilities that Hadoop provides through HDFS and MapReduce. In this paper, processing time is considered the main factor. The results show that this management system, using the policy approach, increases the performance of the entire system without extra implementation cost. Compared with pure Hadoop and a centralized system, this system is several times faster.