1,176 research outputs found

    From Social Data Mining to Forecasting Socio-Economic Crisis

    Full text link
    Socio-economic data mining has a great potential in terms of gaining a better understanding of problems that our economy and society are facing, such as financial instability, shortages of resources, or conflicts. Without large-scale data mining, progress in these areas seems hard or impossible. Therefore, a suitable, distributed data mining infrastructure and research centers should be built in Europe. It also appears appropriate to build a network of Crisis Observatories. They can be imagined as laboratories devoted to the gathering and processing of enormous volumes of data on both natural systems such as the Earth and its ecosystem, as well as on human techno-socio-economic systems, so as to gain early warnings of impending events. Reality mining provides the chance to adapt more quickly and more accurately to changing situations. Further opportunities arise by individually customized services, which however should be provided in a privacy-respecting way. This requires the development of novel ICT (such as a self- organizing Web), but most likely new legal regulations and suitable institutions as well. As long as such regulations are lacking on a world-wide scale, it is in the public interest that scientists explore what can be done with the huge data available. Big data do have the potential to change or even threaten democratic societies. The same applies to sudden and large-scale failures of ICT systems. Therefore, dealing with data must be done with a large degree of responsibility and care. Self-interests of individuals, companies or institutions have limits, where the public interest is affected, and public interest is not a sufficient justification to violate human rights of individuals. Privacy is a high good, as confidentiality is, and damaging it would have serious side effects for society.Comment: 65 pages, 1 figure, Visioneer White Paper, see http://www.visioneer.ethz.c

    Challenges for MapReduce in Big Data

    Get PDF
    In the Big Data community, MapReduce has been seen as one of the key enabling approaches for meeting continuously increasing demands on computing resources imposed by massive data sets. The reason for this is the high scalability of the MapReduce paradigm which allows for massively parallel and distributed execution over a large number of computing nodes. This paper identifies MapReduce issues and challenges in handling Big Data with the objective of providing an overview of the field, facilitating better planning and management of Big Data projects, and identifying opportunities for future research in this field. The identified challenges are grouped into four main categories corresponding to Big Data tasks types: data storage (relational databases and NoSQL stores), Big Data analytics (machine learning and interactive analytics), online processing, and security and privacy. Moreover, current efforts aimed at improving and extending MapReduce to address identified challenges are presented. Consequently, by identifying issues and challenges MapReduce faces when handling Big Data, this study encourages future Big Data research

    Big Data and the Internet of Things

    Full text link
    Advances in sensing and computing capabilities are making it possible to embed increasing computing power in small devices. This has enabled the sensing devices not just to passively capture data at very high resolution but also to take sophisticated actions in response. Combined with advances in communication, this is resulting in an ecosystem of highly interconnected devices referred to as the Internet of Things - IoT. In conjunction, the advances in machine learning have allowed building models on this ever increasing amounts of data. Consequently, devices all the way from heavy assets such as aircraft engines to wearables such as health monitors can all now not only generate massive amounts of data but can draw back on aggregate analytics to "improve" their performance over time. Big data analytics has been identified as a key enabler for the IoT. In this chapter, we discuss various avenues of the IoT where big data analytics either is already making a significant impact or is on the cusp of doing so. We also discuss social implications and areas of concern.Comment: 33 pages. draft of upcoming book chapter in Japkowicz and Stefanowski (eds.) Big Data Analysis: New algorithms for a new society, Springer Series on Studies in Big Data, to appea

    Internet of Things-aided Smart Grid: Technologies, Architectures, Applications, Prototypes, and Future Research Directions

    Full text link
    Traditional power grids are being transformed into Smart Grids (SGs) to address the issues in existing power system due to uni-directional information flow, energy wastage, growing energy demand, reliability and security. SGs offer bi-directional energy flow between service providers and consumers, involving power generation, transmission, distribution and utilization systems. SGs employ various devices for the monitoring, analysis and control of the grid, deployed at power plants, distribution centers and in consumers' premises in a very large number. Hence, an SG requires connectivity, automation and the tracking of such devices. This is achieved with the help of Internet of Things (IoT). IoT helps SG systems to support various network functions throughout the generation, transmission, distribution and consumption of energy by incorporating IoT devices (such as sensors, actuators and smart meters), as well as by providing the connectivity, automation and tracking for such devices. In this paper, we provide a comprehensive survey on IoT-aided SG systems, which includes the existing architectures, applications and prototypes of IoT-aided SG systems. This survey also highlights the open issues, challenges and future research directions for IoT-aided SG systems

    A Strategic Roadmap for Maximizing Big Data Return

    Get PDF
    Big Data has turned out to be one of the popular expressions in IT the last couple of years. In the current digital period, according to the huge improvement occurring in the web and online world innovations, we are facing a gigantic volume of information. The size of data has expanded significantly with the appearance of today's innovation in numerous segments, for example, assembling, business, and science. Types of information have been changed from structured data-driven databases to data including documents, images, audio, video, and social media contents referred to as unstructured data or Big Data. Consequently, most of the organizations try to invest in the big data technology aiming to get value from their investment. However, the organizations face a challenge to determine their requirements and then the technology that suits their businesses. Different technologies are provided by variety of vendors, each of them can be used, and there is no methodology helping them for choosing and making a right decision. Therefore, the objective of this paper is to construct a roadmap for helping the organizations determine their needs and selecting a suitable technology and applying this conducted proposed roadmap practically on two companies

    Big Data and Large-scale Data Analytics: Efficiency of Sustainable Scalability and Security of Centralized Clouds and Edge Deployment Architectures

    Get PDF
    One of the significant shifts of the next-generation computing technologies will certainly be in the development of Big Data (BD) deployment architectures. Apache Hadoop, the BD landmark, evolved as a widely deployed BD operating system. Its new features include federation structure and many associated frameworks, which provide Hadoop 3.x with the maturity to serve different markets. This dissertation addresses two leading issues involved in exploiting BD and large-scale data analytics realm using the Hadoop platform. Namely, (i)Scalability that directly affects the system performance and overall throughput using portable Docker containers. (ii) Security that spread the adoption of data protection practices among practitioners using access controls. An Enhanced Mapreduce Environment (EME), OPportunistic and Elastic Resource Allocation (OPERA) scheduler, BD Federation Access Broker (BDFAB), and a Secure Intelligent Transportation System (SITS) of multi-tiers architecture for data streaming to the cloud computing are the main contribution of this thesis study
    • …
    corecore