27 research outputs found

    On Efficiently Partitioning a Topic in Apache Kafka

    Apache Kafka addresses the general problem of delivering extremely high-volume event data to diverse consumers via a publish-subscribe messaging system. It uses partitions to scale a topic across many brokers for producers to write data in parallel, and also to facilitate parallel reading of consumers. Even though Apache Kafka provides some out-of-the-box optimizations, it does not strictly define how each topic shall be efficiently distributed into partitions. The well-formulated fine-tuning that is needed in order to improve an Apache Kafka cluster performance is still an open research problem. In this paper, we first model the Apache Kafka topic partitioning process for a given topic. Then, given the set of brokers, constraints and application requirements on throughput, OS load, replication latency and unavailability, we formulate the optimization problem of finding how many partitions are needed and show that it is computationally intractable, being an integer program. Furthermore, we propose two simple, yet efficient heuristics to solve the problem: the first tries to minimize and the second to maximize the number of brokers used in the cluster. Finally, we evaluate their performance via large-scale simulations, considering as benchmarks some Apache Kafka cluster configuration recommendations provided by Microsoft and Confluent. We demonstrate that, unlike the recommendations, the proposed heuristics respect the hard constraints on replication latency and perform better w.r.t. unavailability time and OS load, using the system resources in a more prudent way. Comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible. This work was funded by the European Union's Horizon 2020 research and innovation programme MARVEL under grant agreement No 95733
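
    To make the benchmark side of this abstract concrete, the following is a minimal sketch of the widely quoted throughput rule of thumb for choosing a partition count: at least max(t/p, t/c), where t is the target topic throughput, p the measured per-partition producer throughput, and c the measured per-partition consumer throughput. This is not the paper's optimization model or either of its heuristics, and it ignores the constraints on OS load, replication latency and unavailability that the paper considers. The function name `min_partitions` and its parameter names are hypothetical.

```python
import math


def min_partitions(target_mb_s: float, producer_mb_s: float, consumer_mb_s: float) -> int:
    """Throughput-only lower bound on the partition count: ceil(max(t/p, t/c)).

    Assumption: all three throughputs are measured in the same unit (here MB/s)
    and the per-partition figures come from benchmarking a single partition.
    """
    if min(target_mb_s, producer_mb_s, consumer_mb_s) <= 0:
        raise ValueError("throughputs must be positive")
    # The slower side (producing or consuming) dictates how many partitions
    # are needed to reach the target throughput.
    needed_for_producers = math.ceil(target_mb_s / producer_mb_s)
    needed_for_consumers = math.ceil(target_mb_s / consumer_mb_s)
    return max(needed_for_producers, needed_for_consumers)


if __name__ == "__main__":
    # Hypothetical figures: 100 MB/s target, 10 MB/s per partition when
    # producing, 20 MB/s per partition when consuming.
    print(min_partitions(100, 10, 20))
```

    A bound of this kind only says how many partitions throughput alone requires; deciding how to spread them over brokers under hard constraints is the problem the paper formulates as an integer program.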

    Distributed Path Reconfiguration and Data Forwarding in Industrial IoT Networks

    In today's typical industrial environments, the computation of the data distribution schedules is highly centralised. Typically, a central entity configures the data forwarding paths so as to guarantee low delivery delays between data producers and consumers. However, these requirements might become impossible to meet later on, due to link or node failures, or excessive degradation of their performance. In this paper, we focus on maintaining the network functionality required by the applications after such events. We avoid continuously recomputing the configuration centrally, by designing an energy efficient local and distributed path reconfiguration method. Specifically, given the operational parameters required by the applications, we provide several algorithmic functions which locally reconfigure the data distribution paths, when a communication link or a network node fails. We compare our method through simulations to other state of the art methods and we demonstrate performance gains in terms of energy consumption and data delivery success rate as well as some emerging key insights which can lead to further performance gains

    ML-based Approaches for Wireless NLOS Localization: Input Representations and Uncertainty Estimation

    The challenging problem of non-line-of-sight (NLOS) localization is critical for many wireless networking applications. The lack of available datasets has made NLOS localization difficult to tackle with ML-driven methods, but recent developments in synthetic dataset generation have provided new opportunities for research. This paper explores three different input representations: (i) single wireless radio path features, (ii) wireless radio link features (multi-path), and (iii) image-based representations. Inspired by the two latter new representations, we design two convolutional neural networks (CNNs) and we demonstrate that, although not significantly improving the NLOS localization performance, they are able to support richer prediction outputs, thus allowing deeper analysis of the predictions. In particular, the richer outputs enable reliable identification of non-trustworthy predictions and support the prediction of the top-K candidate locations for a given instance. We also measure how the availability of various features (such as angles of signal departure and arrival) affects the model's performance, providing insights about the types of data that should be collected for enhanced NLOS localization. Our insights motivate future work on building more efficient neural architectures and input representations for improved NLOS localization performance, along with additional useful application features. Comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible. Work partly supported by the RA Science Committee grant No. 22rl-052 (DISTAL) and the EU under Italian National Recovery and Resilience Plan of NextGenerationEU on "Telecommunications of the Future" (PE00000001 - program "RESTART"

    Agnostic Learning for Packing Machine Stoppage Prediction in Smart Factories

    The cyber-physical convergence is opening up new business opportunities for industrial operators. The need for deep integration of the cyber and the physical worlds establishes a rich business agenda towards consolidating new system and network engineering approaches. This revolution would not be possible without the rich and heterogeneous sources of data, as well as the ability to exploit them intelligently, mainly because data will serve as a fundamental resource to promote Industry 4.0. One of the most fruitful research and practice areas emerging from this data-rich, cyber-physical, smart factory environment is the data-driven process monitoring field, which applies machine learning methodologies to enable predictive maintenance applications. In this paper, we examine popular time series forecasting techniques as well as supervised machine learning algorithms in the applied context of Industry 4.0, by transforming and preprocessing the historical industrial dataset of a packing machine's operational state recordings (real data coming from the production line of a manufacturing plant from the food and beverage domain). In our methodology, we use only a single signal concerning the machine's operational status to make our predictions, without considering other operational variables or fault and warning signals, hence its characterization as "agnostic". In this respect, the results demonstrate that the adopted methods achieve quite promising performance on three targeted use cases.

    IEEE Access Special Section Editorial: Wirelessly Powered Networks, and Technologies

    Wireless Power Transfer (WPT) is, by definition, a process that occurs in any system where electrical energy is transmitted from a power source to a load without the connection of electrical conductors. WPT is the driving technology that will enable the next stage in the current consumer electronics revolution, including battery-less sensors, passive RF identification (RFID), passive wireless sensors, the Internet of Things and 5G, and machine-to-machine solutions. WPT-enabled devices can be powered by harvesting energy from the surroundings, including electromagnetic (EM) energy, leading to a new communication networks paradigm, the Wirelessly Powered Networks

    A Survey on Networked Data Streaming With Apache Kafka

    Apache Kafka has become a popular solution for managing networked data streaming in a variety of applications, from industrial to general purpose. This paper systematically surveys the research literature in this field by carefully classifying it into key macro areas, namely algorithms, networks, data, cyber-physical systems, and security. Through this meticulous classification, the paper aims to identify and analyze the optimization aspects relevant to each area, drawing upon practical applications as the basis for analysis. In this respect, the paper synthesizes and consolidates existing knowledge, saving researchers valuable time and effort in searching for relevant information across multiple sources. The tangible benefits of this survey paper include providing a consolidated knowledge base about research-intensive Apache Kafka topics, highlighting practical insights and novel approaches, pointing up cross-domain applications, identifying related research challenges, and serving as a trusted reference for the Apache Kafka community

    Performance Analysis of Latency-Aware Data Management in Industrial IoT Networks

    Maintaining critical data access latency requirements is an important challenge of Industry 4.0. The traditional, centralized industrial networks, which transfer the data to a central network controller prior to delivery, might be incapable of meeting such strict requirements. In this paper, we exploit distributed data management to overcome this issue. Given a set of data, the set of consumer nodes and the maximum access latency that consumers can tolerate, we consider a method for identifying and selecting a limited set of proxies in the network where data needed by the consumer nodes can be cached. The method aims to balance two requirements: data access latency within the given constraints and a low number of selected proxies. We implement the method and evaluate its performance using a network of WSN430 IEEE 802.15.4-enabled open nodes. Additionally, we validate a simulation model and use it for performance evaluation at larger scales and in more general topologies. We demonstrate that the proposed method (i) guarantees average access latency below the given threshold and (ii) outperforms traditional centralized and even distributed approaches.

    Wireless energy transfer in sensor networks with adaptive, limited knowledge protocols

    We investigate the problem of efficient wireless energy transfer in Wireless Rechargeable Sensor Networks (WRSNs). In such networks a special mobile entity (called the Mobile Charger) traverses the network and wirelessly replenishes the energy of sensor nodes. In contrast to most current approaches, we envision methods that are distributed, adaptive and use limited network information. We propose three new, alternative protocols for efficient charging, addressing key issues which we identify, most notably (i) to what extent each sensor should be charged, (ii) what is the best split of the total energy between the charger and the sensors and (iii) what are good trajectories the Mobile Charger should follow. One of our protocols (LRP) performs some distributed, limited sampling of the network status, while another one (RTP) reactively adapts to energy shortage alerts judiciously spread in the network. We conduct detailed simulations in uniform and non-uniform network deployments, using three different underlying routing protocol families. In most cases, both our charging protocols significantly outperform known state of the art methods, while their performance gets quite close to the performance of the global knowledge method (GKP) we also provide. (C) 2014 Elsevier B.V. All rights reserved