7,131 research outputs found

    Storage Solutions for Big Data Systems: A Qualitative Study and Comparison

    Full text link
    Big data systems development is full of challenges in view of the variety of application areas and domains that this technology promises to serve. Typically, fundamental design decisions involved in big data systems design include choosing appropriate storage and computing infrastructures. In this age of heterogeneous systems that integrate different technologies for optimized solution to a specific real world problem, big data system are not an exception to any such rule. As far as the storage aspect of any big data system is concerned, the primary facet in this regard is a storage infrastructure and NoSQL seems to be the right technology that fulfills its requirements. However, every big data application has variable data characteristics and thus, the corresponding data fits into a different data model. This paper presents feature and use case analysis and comparison of the four main data models namely document oriented, key value, graph and wide column. Moreover, a feature analysis of 80 NoSQL solutions has been provided, elaborating on the criteria and points that a developer must consider while making a possible choice. Typically, big data storage needs to communicate with the execution engine and other processing and visualization technologies to create a comprehensive solution. This brings forth second facet of big data storage, big data file formats, into picture. The second half of the research paper compares the advantages, shortcomings and possible use cases of available big data file formats for Hadoop, which is the foundation for most big data computing technologies. Decentralized storage and blockchain are seen as the next generation of big data storage and its challenges and future prospects have also been discussed

    A Critical Look at Decentralized Personal Data Architectures

    Full text link
    While the Internet was conceived as a decentralized network, the most widely used web applications today tend toward centralization. Control increasingly rests with centralized service providers who, as a consequence, have also amassed unprecedented amounts of data about the behaviors and personalities of individuals. Developers, regulators, and consumer advocates have looked to alternative decentralized architectures as the natural response to threats posed by these centralized services. The result has been a great variety of solutions that include personal data stores (PDS), infomediaries, Vendor Relationship Management (VRM) systems, and federated and distributed social networks. And yet, for all these efforts, decentralized personal data architectures have seen little adoption. This position paper attempts to account for these failures, challenging the accepted wisdom in the web community on the feasibility and desirability of these approaches. We start with a historical discussion of the development of various categories of decentralized personal data architectures. Then we survey the main ideas to illustrate the common themes among these efforts. We tease apart the design characteristics of these systems from the social values that they (are intended to) promote. We use this understanding to point out numerous drawbacks of the decentralization paradigm, some inherent and others incidental. We end with recommendations for designers of these systems for working towards goals that are achievable, but perhaps more limited in scope and ambition

    Mapping Big Data into Knowledge Space with Cognitive Cyber-Infrastructure

    Full text link
    Big data research has attracted great attention in science, technology, industry and society. It is developing with the evolving scientific paradigm, the fourth industrial revolution, and the transformational innovation of technologies. However, its nature and fundamental challenge have not been recognized, and its own methodology has not been formed. This paper explores and answers the following questions: What is big data? What are the basic methods for representing, managing and analyzing big data? What is the relationship between big data and knowledge? Can we find a mapping from big data into knowledge space? What kind of infrastructure is required to support not only big data management and analysis but also knowledge discovery, sharing and management? What is the relationship between big data and science paradigm? What is the nature and fundamental challenge of big data computing? A multi-dimensional perspective is presented toward a methodology of big data computing.Comment: 59 page

    Together we stand, Together we fall, Together we win: Dynamic Team Formation in Massive Open Online Courses

    Full text link
    Massive Open Online Courses (MOOCs) offer a new scalable paradigm for e-learning by providing students with global exposure and opportunities for connecting and interacting with millions of people all around the world. Very often, students work as teams to effectively accomplish course related tasks. However, due to lack of face to face interaction, it becomes difficult for MOOC students to collaborate. Additionally, the instructor also faces challenges in manually organizing students into teams because students flock to these MOOCs in huge numbers. Thus, the proposed research is aimed at developing a robust methodology for dynamic team formation in MOOCs, the theoretical framework for which is grounded at the confluence of organizational team theory, social network analysis and machine learning. A prerequisite for such an undertaking is that we understand the fact that, each and every informal tie established among students offers the opportunities to influence and be influenced. Therefore, we aim to extract value from the inherent connectedness of students in the MOOC. These connections carry with them radical implications for the way students understand each other in the networked learning community. Our approach will enable course instructors to automatically group students in teams that have fairly balanced social connections with their peers, well defined in terms of appropriately selected qualitative and quantitative network metrics.Comment: In Proceedings of 5th IEEE International Conference on Application of Digital Information & Web Technologies (ICADIWT), India, February 2014 (6 pages, 3 figures
    • …
    corecore