7 research outputs found

    A Novel Approach for Processing of Real Time Big Data for Machine Learning By Using Map reduce Paradigm

    In recent years, Big Data and its analysis have played a dominant role in the optimal storage of semi-structured or unstructured data and in decision making through mining techniques and predictive analytics. Remote sensing in particular collects huge amounts of data as multispectral, high-resolution satellite images. These images contain a variety of data in huge volume as pixels. Distributing high-volume data across multiple commodity systems using a distributed file system is a significant revolution brought about by the Hadoop framework, which handles big data with the available hardware and computational capabilities. MapReduce is a technique that performs Map functions and Reduce functions on the distributed file system. This paper discusses a real-time Big Data analytical architecture for remote sensing satellite applications. To handle remote sensing data, the proposed architecture contains three main units: the Data Pre-Processing Unit (DPREU), the Data Analysis Unit (DAU), and the Data Post-Processing Unit (DPOSTU). First, DPREU acquires the required data from satellite sensors using filtration, balanced distributed storage, and parallel processing in the Hadoop environment. Second, DAU identifies hidden patterns in the data stored in the distributed file system using Map functions followed by Reduce functions in the MapReduce paradigm. Finally, DPOSTU is the upper-layer unit of the proposed architecture, responsible for organizing storage of the results and generating decisions based on the results obtained from DAU. Mapper functions are split into a number of record readers, which read the data loaded into the distributed file system as key-value pairs. The output of each Map function is taken by a Reducer function for further analysis.
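
The Map and Reduce steps described in the abstract above can be illustrated with a minimal, single-process sketch. This is not the paper's Hadoop implementation: the record layout (a band name paired with a list of pixel intensities), the function names, and the choice of a per-band mean as the reduce operation are all assumptions made purely for illustration.

```python
from collections import defaultdict


def map_pixels(record):
    """Map step: emit one (band, intensity) key-value pair per pixel.

    `record` is a hypothetical (band_name, [intensities]) tuple.
    """
    band, pixels = record
    for value in pixels:
        yield band, value


def shuffle(pairs):
    """Group mapped values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_mean(key, values):
    """Reduce step: collapse all values for one key into their mean."""
    return key, sum(values) / len(values)


def run_job(records):
    """Run map, shuffle, and reduce over an in-memory list of records."""
    pairs = (pair for record in records for pair in map_pixels(record))
    grouped = shuffle(pairs)
    return dict(reduce_mean(key, values) for key, values in grouped.items())


# Hypothetical input: pixel intensities split across records by spectral band.
records = [("red", [10, 20, 30]), ("nir", [100, 120]), ("red", [40])]
result = run_job(records)
print(result)
```

In a real Hadoop deployment the records would be read from the distributed file system by record readers and the shuffle would be handled by the framework; here everything runs in one process so the key-value flow is easy to follow.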

    Data Mining with Big Data Using HACE Theorem

    The term Big Data comprises large-volume, complex, growing data sets with multiple, autonomous sources. With the fast development of networking, data storage, and data collection capacity, Big Data is now rapidly expanding in all science and engineering domains, including the physical, biological and biomedical sciences. This paper presents a HACE theorem that characterizes the features of the Big Data revolution, and proposes a Big Data processing model from the data mining perspective. This data-driven model involves demand-driven aggregation of information sources, mining and analysis, user interest modeling, and security and privacy considerations. We analyze the challenging issues in the data-driven model and also in the Big Data revolution.

    A review of Data Mining Techniques Using in Big Data

    The term Big Data comprises large-volume, complex, growing data sets with multiple, autonomous sources. With the fast development of networking, data storage, and data collection capacity, Big Data is now rapidly expanding in all science and engineering domains, including the physical, biological and biomedical sciences. This paper presents a HACE theorem that characterizes the features of the Big Data revolution, and proposes a Big Data processing model from the data mining perspective. This data-driven model involves demand-driven aggregation of information sources, mining and analysis, user interest modeling, and security and privacy considerations. We analyze the challenging issues in the data-driven model and also in the Big Data revolution.

    Data Mining in Social Networks

    The objective of the study is to examine the idea of Big Data and its applications in data mining. The volume of data in the world grows every year and becomes Big Data. Such data can be exploited through a number of data mining tasks. In short, Big Data can be regarded as an “asset”, and data mining is a technique employed to extract useful results from it. This paper implements a HACE algorithm that analyzes the structure of Big Data and presents an efficient data mining technique. The framework model incorporates a mixture of information sources, mining techniques, customer interest, security, and a data protection system. The study also analyzes and presents the challenges and issues faced in the Big Data model.

    Algorithms for Mining the Evolution of Conserved Relational States in Dynamic Networks

    No full text

    A framework for dynamic heterogeneous information networks change discovery based on knowledge engineering and data mining methods

    Information networks are collections of data structures that are used to model interactions in social and living phenomena. They can be either homogeneous or heterogeneous, and static or dynamic, depending upon the type and nature of the relations between the network entities. Static, homogeneous and heterogeneous networks have been widely studied in data mining, but recently there has been renewed interest in the analysis of dynamic heterogeneous information networks (DHIN) because rich temporal, structural and semantic information is hidden in this kind of network. The heterogeneity and dynamicity of real-time networks offer plenty of prospects as well as a lot of challenges for data mining. There has been substantial research on the exploration of entities and their link identification in heterogeneous networks. However, work on the formal construction and change mining of heterogeneous information networks is still in its infancy due to their complex structure and rich semantics. Researchers have used cluster-based methods and frequent pattern-mining techniques in the past for change discovery in dynamic heterogeneous networks. These methods only work on small datasets, only provide structural change discovery, and fail to consider fast, parallel processing of big data. A further problem is that cluster-based approaches provide only the structural changes, while pattern mining provides only the semantic characteristics of changes in a dynamic network.
Another interesting but challenging problem that has not been considered by past studies is how to extract knowledge from these semantically richer networks based on user-specific constraints. This study aims to develop a new change mining system, ChaMining, to investigate dynamic heterogeneous network data using knowledge engineering with semantic web technologies and data mining, in order to overcome the problems of previous techniques. This system and approach are important in academia as well as in real-life applications to support decision-making based on temporal network data patterns. This research has designed a novel framework, “ChaMining”, to (i) find relational patterns in dynamic networks locally and globally by employing domain ontologies; (ii) extract knowledge from these semantically richer networks based on user-specific (meta-path) constraints; (iii) cluster the relational data patterns based on structural properties of nodes in the dynamic network; and (iv) develop a hybrid approach using knowledge engineering, temporal rule mining and clustering to detect changes in dynamic heterogeneous networks. The evidence presented in this research shows that the proposed framework and methods work very efficiently on large benchmark dynamic heterogeneous datasets. The empirical results can contribute to a better understanding of the rich semantics of DHIN and of how to mine them using the proposed hybrid approach. The proposed framework has been evaluated against six previous dynamic change detection algorithms or frameworks, and it performs very well in detecting microscopic as well as macroscopic human-understandable changes. The number of change patterns extracted with this approach was higher than with the previous approaches, which helps to reduce information loss.

    Modular wireless networks for infrastructure-challenged environments

    While access to Internet and cellular connectivity is easily achieved in densely-populated areas, provisioning of communication services is much more challenging in remote rural areas. At the same time, Internet access is of critical importance to residents of such rural communities. People's curiosity and realization of the opportunities provided by Internet and cellular access are the key ingredients of adoption. However, poor network performance can easily impede the process of adoption by discouraging people from accessing and using connectivity. With this in mind, we evaluate the performance and adoption of various connectivity technologies in rural developing regions and identify avenues that need immediate attention to guarantee smoother technology adoption. In light of this analysis we propose novel system designs that meet these needs. In this thesis we focus on cellular and broadband Internet connectivity. Commercial cellular networks are highly centralized, which requires costly backhaul. This, coupled with the high price of equipment, maintenance and licensing, renders cellular network access commercially infeasible in rural areas. At the same time, rural cellular communications are highly local: 70% of the rural-residential calls have an originator-destination pair within the same antenna. In line with this observation we design a low-cost cellular network architecture dubbed Kwiizya to provide local voice and text messaging services in a rural community. Where outbound connectivity is available, Kwiizya can provide global services. While commercial networks are becoming more available in rural areas, they are often out of financial reach of rural residents. Furthermore, these networks typically provide only basic voice and SMS services and no mobile data.
To address these challenges, our proposed work allows Kwiizya to operate in coexistence with commercial cellular networks in order to extend local coverage and provide more advanced services that are not delivered by the commercial networks. Internet connectivity in rural areas is typically provided through slow satellite links. The challenges in performance and adoption of such networks have been previously studied. We add a unique dataset and consequent analysis to this spectrum of work, which captures the upgrade of the gateway connectivity in the rural community of Macha, Zambia from a 256kbps satellite link to a more capable 2Mbps terrestrial link. We show that the improvement in performance and user experience is not necessarily proportional to the bandwidth increase. While this increase improved the network usability, it also opened opportunities for adoption of more demanding services that were previously out of reach. As a result the network performance was severely degraded over the long term. To address these challenges we employ white space communication both for connectivity to more capable remote gateways, as well as for end user connectivity. We develop VillageLink, a distributed method that optimizes channel allocation to maximize throughput and enables both remote gateway access as well as end user coverage. While VillageLink features lightweight channel probing, we also consider external sources of channel availability. We design a novel approach for estimation of channel occupancy called TxMiner, which is capable of extracting transmitter characteristics from raw spectrum measurements. We study the adoption and implications of network connectivity in rural communities. In line with the results of our analyses we design and build system architectures that are geared to meet critical needs in these communities. 
While the focus of analysis in this thesis is on rural sub-Saharan Africa, the proposed designs and system implementations are more general and can serve in infrastructure-challenged communities across the world.