
    Fog based intelligent transportation big data analytics in the internet of vehicles environment: motivations, architecture, challenges, and critical issues

    The intelligent transportation system (ITS) concept was introduced to increase road safety, manage traffic efficiently, and preserve our green environment. Nowadays, ITS applications are becoming more data-intensive, and their data are described using the '5Vs of Big Data'. Thus, to fully utilize such data, big data analytics needs to be applied. The Internet of vehicles (IoV) connects the ITS devices to cloud computing centres, where data processing is performed. However, transferring huge amounts of data from geographically distributed devices creates network overhead and bottlenecks, and it consumes network resources. In addition, following the centralized approach to process the ITS big data results in high latency, which cannot be tolerated by delay-sensitive ITS applications. Fog computing is considered a promising technology for real-time big data analytics. Basically, the fog technology complements the role of cloud computing and distributes the data processing at the edge of the network, which provides faster responses to ITS application queries and saves network resources. However, implementing fog computing and the lambda architecture for real-time big data processing is challenging in the dynamic IoV environment. In this regard, a novel architecture for real-time ITS big data analytics in the IoV environment is proposed in this paper. The proposed architecture merges three dimensions: an intelligent computing (i.e. cloud and fog computing) dimension, a real-time big data analytics dimension, and an IoV dimension. Moreover, this paper gives a comprehensive description of the IoV environment, the ITS big data characteristics, the lambda architecture for real-time big data analytics, and several intelligent computing technologies. More importantly, this paper discusses the opportunities and challenges facing the implementation of fog computing and real-time big data analytics in the IoV environment. Finally, the critical issues and future research directions section discusses some issues that should be considered in order to efficiently implement the proposed architecture.

    Communication Theoretic Data Analytics

    Widespread use of the Internet and social networks invokes the generation of big data, which is proving to be useful in a number of applications. To deal with explosively growing amounts of data, data analytics has emerged as a critical technology related to computing, signal processing, and information networking. In this paper, a formalism is considered in which data is modeled as a generalized social network and communication theory and information theory are thereby extended to data analytics. First, the creation of an equalizer to optimize information transfer between two data variables is considered, and financial data is used to demonstrate the advantages. Then, an information coupling approach based on information geometry is applied for dimensionality reduction, with a pattern recognition example to illustrate the effectiveness. These initial trials suggest the potential of communication theoretic data analytics for a wide range of applications.
    Comment: Published in IEEE Journal on Selected Areas in Communications, Jan. 201

    Performance-Aware High-Performance Computing for Remote Sensing Big Data Analytics

    The incredible increase in the volume of data emerging along with recent technological developments has made analysis processes that use traditional approaches more difficult for many organizations. In particular, applications that require timely processing of big data, such as satellite imagery, sensor data, bank operations, web servers, and social networks, need efficient mechanisms for collecting, storing, processing, and analyzing these data. At this point, big data analytics, which encompasses data mining, machine learning, statistics, and similar techniques, helps organizations manage the data end to end. In this chapter, we introduce a novel high-performance computing system on the geo-distributed private cloud for remote sensing applications, which takes advantage of network topology, exploits the utilization and workloads of CPU, storage, and memory resources in a distributed fashion, and optimizes resource allocation for realizing big data analytics efficiently.

    Street Smart in 5G: Vehicular Applications, Communication, and Computing

    Recent advances in information technology have revolutionized the automotive industry, paving the way for next-generation smart vehicular mobility. Specifically, vehicles, roadside units, and other road users can collaborate to deliver novel services and applications that leverage, for example, big vehicular data and machine learning. Relatedly, fifth-generation cellular networks (5G) are being developed and deployed for low-latency, high-reliability, and high-bandwidth communications. Meanwhile, 5G-adjacent technologies such as edge computing allow for data offloading and computation at the edge of the network, thus ensuring even lower latency and context-awareness. Overall, these developments provide a rich ecosystem for the evolution of vehicular applications, communications, and computing. Therefore, in this work, we aim to provide a comprehensive overview of the state of research on vehicular computing in the emerging age of 5G and big data. In particular, this paper highlights several vehicular applications, investigates their requirements, details the enabling communication technologies and computing paradigms, and studies data analytics pipelines and the integration of these enabling technologies in response to application requirements.
    Peer reviewed

    Approximate Data Analytics Systems

    Today, most modern online services make use of big data analytics systems to extract useful information from raw digital data. The data normally arrives as a continuous stream at high speed and in huge volumes. The cost of handling this massive data can be significant. Providing interactive latency in processing the data is often impractical because the data is growing exponentially, even faster than Moore’s law predictions. To overcome this problem, approximate computing has recently emerged as a promising solution. Approximate computing is based on the observation that many modern applications are amenable to an approximate, rather than exact, output. Unlike traditional computing, approximate computing tolerates lower accuracy to achieve lower latency by computing over a partial subset instead of the entire input data. Unfortunately, the advancements in approximate computing are primarily geared towards batch analytics and cannot provide low-latency guarantees in the context of stream processing, where new data continuously arrives as an unbounded stream. In this thesis, we design and implement approximate computing techniques for processing and interacting with high-speed and large-scale stream data to achieve low latency and efficient utilization of resources. To achieve these goals, we have designed and built the following approximate data analytics systems:
    • StreamApprox: a data stream analytics system for approximate computing. This system supports approximate computing for low-latency stream analytics in a transparent way and can adapt to rapid fluctuations of input data streams. In this system, we designed an online adaptive stratified reservoir sampling algorithm to produce approximate output with bounded error.
    • IncApprox: a data analytics system for incremental approximate computing. This system adopts approximate and incremental computing in stream processing to achieve high throughput and low latency with efficient resource utilization. In this system, we designed an online stratified sampling algorithm that uses self-adjusting computation to produce an incrementally updated approximate output with bounded error.
    • PrivApprox: a data stream analytics system for privacy-preserving and approximate computing. This system supports high-utility and low-latency data analytics and preserves users’ privacy at the same time. The system is based on the combination of privacy-preserving data analytics and approximate computing.
    • ApproxJoin: an approximate distributed joins system. This system improves the performance of joins, which are critical but expensive operations in big data systems. In this system, we employed a sketching technique (Bloom filter) to avoid shuffling non-joinable data items through the network and proposed a novel sampling mechanism that executes during the join to obtain an unbiased representative sample of the join output.
    Our evaluation based on micro-benchmarks and real-world case studies shows that these systems can achieve significant performance speedup compared to state-of-the-art systems by tolerating negligible accuracy loss of the analytics output. In addition, our systems allow users to systematically make a trade-off between accuracy and throughput/latency and require no or only minor modifications to existing applications.
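The abstract above mentions stratified reservoir sampling over a stream. The following is a minimal sketch of that general idea, not the StreamApprox algorithm itself: it keeps a fixed-size uniform sample per stratum using the classic reservoir method (Algorithm R) and scales each sample by its stratum's observed count to estimate a sum. The class name `StratifiedReservoir`, the `capacity` parameter, and the `estimate_sum` method are assumptions made for illustration; the adaptive sizing and error bounds described in the thesis are not reproduced here.

```python
import random
from collections import defaultdict


class StratifiedReservoir:
    """Illustrative only: one fixed-size uniform reservoir per stratum."""

    def __init__(self, capacity, seed=None):
        self.capacity = capacity            # max sampled items per stratum (assumed fixed)
        self.rng = random.Random(seed)
        self.samples = defaultdict(list)    # stratum -> sampled values
        self.counts = defaultdict(int)      # stratum -> number of items seen so far

    def add(self, stratum, value):
        """Offer one stream item to the reservoir of its stratum (Algorithm R)."""
        self.counts[stratum] += 1
        seen = self.counts[stratum]
        reservoir = self.samples[stratum]
        if len(reservoir) < self.capacity:
            # Reservoir not full yet: always keep the item.
            reservoir.append(value)
        else:
            # Keep the new item with probability capacity / seen,
            # replacing a uniformly chosen existing slot.
            j = self.rng.randrange(seen)
            if j < self.capacity:
                reservoir[j] = value

    def estimate_sum(self):
        """Estimate the sum of all values seen by scaling each stratum's sample."""
        total = 0.0
        for stratum, reservoir in self.samples.items():
            weight = self.counts[stratum] / len(reservoir)
            total += weight * sum(reservoir)
        return total


if __name__ == "__main__":
    sampler = StratifiedReservoir(capacity=100, seed=1)
    for i in range(10_000):
        # Two hypothetical sub-streams with very different arrival rates.
        stratum = "rare" if i % 100 == 0 else "frequent"
        sampler.add(stratum, float(i % 7))
    print("approximate sum:", sampler.estimate_sum())
```

Sampling each stratum separately keeps a low-rate sub-stream represented even when another sub-stream dominates the input, which is the usual motivation for stratifying before sampling.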

    Towards the Development of a Framework for Socially Responsible Software by Analyzing Social Media Big Data on Cloud Through Ontological Engineering

    A socially responsible internet is the need of the hour, considering its huge potential and role in educating and transforming society. Social computing is emerging as an important area as far as development of the next-generation web is concerned. With the proliferation of social networking applications, a vast amount of data is available on the cloud, which may be analyzed to gain useful insight into behavioral and linguistic patterns of different cultural and socio-economic groups, further classified on the basis of gender, age, etc. The idea is to come up with an appropriate framework for socially responsible software artifacts. These artifacts will monitor online social network data and analyze it from the perspective of socially responsible behavior based on ontological engineering concepts. Identification of socially responsible agents is such an example, though based on a different approach. More examples may be taken from literature dealing with microblog analytics, social semantic web, upper ontology for social web, and social-network-sourced big data analytics. In the present work, it is proposed to focus on analysis/monitoring of socially responsible behavior in social media big data and to develop an upper-level ontology as the framework/tool for such analytics.

    Mobile Big Data Analytics in Healthcare

    Mobile and ubiquitous devices are everywhere around us, generating a considerable amount of data. The concept of mobile computing and analytics is expanding because we use mobile devices day in and day out without even realizing it. These mobile devices use Wi-Fi, Bluetooth, or mobile data to be intermittently connected to the world, generating, sending, and receiving data on the move. The latest mobile applications incorporating graphics, video, and audio are the main causes of load on mobile devices, consuming battery, memory, and processing power. Mobile big data analytics includes, for instance, big health data, big location data, big social media data, and big heterogeneous data. Healthcare is undoubtedly one of the most data-intensive industries nowadays, and the challenge is not only in acquiring, storing, processing, and accessing data, but also in deriving useful insights from it. These insights generated from health data may reduce health monitoring cost, enrich disease diagnosis, therapy, and care, and even save human lives. The challenge in mobile data and big data analytics is how to meet the growing performance demands of these activities while minimizing mobile resource consumption. This thesis proposes a scalable architecture for mobile big data analytics implementing three new algorithms (i.e. mobile resources optimization, mobile analytics customization, and mobile offloading) for the effective usage of resources in performing mobile data analytics. The mobile resources optimization algorithm monitors the resources and switches off unused network connections and application services whenever resources are limited. The analytics customization algorithm, in turn, attempts to save energy by customizing the analytics process while implementing some data-aware techniques. Finally, the mobile offloading algorithm decides on the fly whether to process data locally or delegate it to a cloud back-end server. The ultimate goal of this research is to provide healthcare decision makers with the advancements in mobile big data analytics and support them in handling large and heterogeneous health datasets effectively on the move.

    A Workflow-oriented Language for Scalable Data Analytics

    Proceedings of: First International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2014). Porto (Portugal), August 27-28, 2014.
    Data in digital repositories are becoming more massive and distributed every day. Therefore, analyzing them requires efficient data analysis techniques and scalable storage and computing platforms. Cloud computing infrastructures offer effective support for addressing both the computational and data storage needs of big data mining and parallel knowledge discovery applications. In fact, complex data mining tasks involve data- and compute-intensive algorithms that require large and efficient storage facilities together with high-performance processors to get results in acceptable times. In this paper we describe a Data Mining Cloud Framework (DMCF) designed for developing and executing distributed data analytics applications as workflows of services. We also describe a workflow-oriented language, called JS4Cloud, to support the design and execution of script-based data analysis workflows on DMCF. We finally present a data analysis application developed with JS4Cloud, and the scalability achieved executing it on DMCF.
    The work presented in this paper has been partially supported by EU under the COST programme Action IC1305, 'Network for Sustainable Ultrascale Computing (NESUS)'.