31 research outputs found

    MOMIS Dashboard: a powerful data analytics tool for Industry 4.0

    In this work we present the MOMIS Dashboard, an interactive data analytics tool for exploring and visualizing the content of data sources through several kinds of dynamic views (e.g. maps, bar charts, line charts, pie charts). The tool is versatile and supports connections to the main relational DBMSs and Big Data sources. Moreover, it can be connected to MOMIS, a powerful open-source data integration system able to integrate heterogeneous data sources such as enterprise information systems and sensor data. MOMIS Dashboard provides secure permission management that limits data access on the basis of user role, and a Designer for creating and sharing personalized insights on company KPIs, which facilitates collaboration within the enterprise. We illustrate the efficacy of MOMIS Dashboard in a real enterprise scenario: a production monitoring platform that analyzes real-time and historical data collected by sensors on production machines in order to optimize production and energy consumption and to enable preventive maintenance.

    Blockchain based Access Control for Enterprise Blockchain Applications

    Access control is one of the fundamental security mechanisms of IT systems. Most existing access control schemes rely on a centralized party to manage and enforce access control policies. As blockchain technologies, especially permissioned networks, find more applicability beyond cryptocurrencies in enterprise solutions, it is expected that the security requirements will increase. Therefore, it is necessary to develop an access control system that works in a decentralized environment without compromising the unique features of a blockchain. A straightforward method to support access control is to deploy a firewall in front of the enterprise blockchain application. However, this approach does not take advantage of the desirable features of blockchain. To address these concerns, we propose a novel blockchain-based access control scheme that preserves decentralization for access control-related operations. The proposed system also protects user privacy by leveraging ring signatures. We implement a prototype of the scheme using Hyperledger Fabric and assess its performance to show that it is practical for real-world applications.

    On Efficiently Partitioning a Topic in Apache Kafka

    Apache Kafka addresses the general problem of delivering extremely high-volume event data to diverse consumers via a publish-subscribe messaging system. It uses partitions to scale a topic across many brokers, so that producers can write data in parallel and consumers can read it in parallel. Even though Apache Kafka provides some out-of-the-box optimizations, it does not strictly define how each topic should be efficiently distributed into partitions. The well-formulated fine-tuning needed to improve the performance of an Apache Kafka cluster is still an open research problem. In this paper, we first model the Apache Kafka topic partitioning process for a given topic. Then, given the set of brokers, constraints, and application requirements on throughput, OS load, replication latency, and unavailability, we formulate the optimization problem of finding how many partitions are needed and show that it is computationally intractable, being an integer program. Furthermore, we propose two simple yet efficient heuristics to solve the problem: the first tries to minimize and the second to maximize the number of brokers used in the cluster. Finally, we evaluate their performance via large-scale simulations, using as benchmarks some Apache Kafka cluster configuration recommendations provided by Microsoft and Confluent. We demonstrate that, unlike the recommendations, the proposed heuristics respect the hard constraints on replication latency and perform better with respect to unavailability time and OS load, using system resources more prudently. Comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible. This work was funded by the European Union's Horizon 2020 research and innovation programme MARVEL under grant agreement No 95733
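    To make the sizing question in the abstract above concrete, the following is a minimal sketch of a commonly cited throughput rule of thumb for choosing a partition count, max(t/p, t/c), where t is the target topic throughput and p and c are the measured per-partition producer and consumer throughputs. It is not the paper's integer program or either of its heuristics, and it ignores the constraints on OS load, replication latency, and unavailability. The function names and the even-spread estimate of replicas per broker are assumptions made for illustration.

```python
import math


def min_partitions(target_mb_s: float, producer_mb_s: float, consumer_mb_s: float) -> int:
    """Throughput-only lower bound on partitions: max(t/p, t/c), rounded up.

    target_mb_s   -- desired topic throughput t
    producer_mb_s -- measured throughput p of one partition on the producer side
    consumer_mb_s -- measured throughput c of one partition on the consumer side
    """
    if min(target_mb_s, producer_mb_s, consumer_mb_s) <= 0:
        raise ValueError("all throughputs must be positive")
    return max(
        math.ceil(target_mb_s / producer_mb_s),
        math.ceil(target_mb_s / consumer_mb_s),
    )


def replicas_per_broker(partitions: int, replication_factor: int, brokers: int) -> int:
    """Worst-case replica count on one broker if replicas are spread evenly (assumed)."""
    if brokers < replication_factor:
        raise ValueError("need at least as many brokers as the replication factor")
    return math.ceil(partitions * replication_factor / brokers)


if __name__ == "__main__":
    n = min_partitions(target_mb_s=100, producer_mb_s=10, consumer_mb_s=20)
    print(n, replicas_per_broker(n, replication_factor=3, brokers=4))
```

    A throughput-only bound like this says nothing about how many brokers to use, which is the dimension the two heuristics in the abstract trade off.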

    Designing the Architecture of an Intelligent Control System for an Urban Rail Transit System

    The growth in passenger traffic in megalopolises and large urban agglomerations is handled efficiently by integrating urban public transit systems with city railways. Traffic management under these conditions requires intelligent, centralised, multi-level traffic control systems that meet the required indicators of quality, comfort, and safety of passenger transportation. Modern control systems also contribute to traction power savings and are the foundation and an integral part of the digitalisation of urban transit and of cities. Systems that solve traffic planning and control tasks are built using algorithms based on artificial intelligence methods, the principles of hierarchically structured centralised systems, and the opportunities provided by Big Data technology. Under these conditions it is necessary to consider the growing requirements not only for software but also for the theoretical design and practical implementation of network organisation. This article discusses designing the architecture and shaping the requirements for the applications under development and their integration with databases to create a centralised intelligent control system for the urban rail transit system (CICS URTS). The article proposes an original architecture for the network, the routing of information flows, and the software of CICS URTS. The routing design is based on a fully connected network. This significantly increases network bandwidth and meets information protection requirements, since information flows are formed using protocols of the same type, which prevents the emergence of covert transmission channels. Implementing the core with full connectivity makes it possible to pre-form, according to the tags of information flows, the routes for exchanging information between the servers and applications deployed in CICS URTS. 
The use of encrypted tags of information flows makes it much more difficult to carry out attacks and to collect information about the network structure. Platforms for developing intelligent control systems (ICS), which include CICS URTS, along with high computing power, data storage capacity, and new frameworks, are becoming more available to researchers and developers and allow rapid development of ICS. The article discusses the interaction of applications with databases through a combination of several approaches used in the field of Big Data, and substantiates the combination of Internet of Things (IoT) methodology and microservice architecture. This combination makes it possible to single out business processes in the system and to set up streaming processing of data that require operational analysis by a human, which is shown by relevant examples. Thus, the objective of the article is to formalise the principles of organising data exchange between CICS URTS and the automated control systems (ACS) of railway companies (in our case, using the example of JSC Russian Railways), URTS service providers, and city government bodies; to implement the developed approaches in the architecture of CICS URTS; and to formalise the principles of organising the microservice architecture of CICS URTS software. The main research methods are graph theory, Big Data, and IoT methods.

    AIR: A Light-Weight Yet High-Performance Dataflow Engine based on Asynchronous Iterative Routing

    Distributed Stream Processing Systems (DSPSs) are currently among the fastest-emerging topics in data management, with applications ranging from real-time event monitoring to processing complex dataflow programs and big data analytics. The major players in this domain are Apache Spark and Flink, which provide a variety of frontend APIs for SQL, statistical inference, machine learning, stream processing, and many others. Yet few details are reported on the integration of these engines into the underlying High-Performance Computing (HPC) infrastructure and on the communication protocols they use. Spark and Flink, for example, are implemented in Java and still rely on a dedicated master node to manage the control flow among the worker nodes in a compute cluster. In this paper, we describe the architecture of our AIR engine, which is designed from scratch in C++ using the Message Passing Interface (MPI) and pthreads for multithreading, and is deployed directly on top of a common HPC workload manager such as SLURM. AIR implements a light-weight, dynamic sharding protocol (referred to as "Asynchronous Iterative Routing"), which facilitates direct and asynchronous communication among all client nodes and thereby completely avoids the overhead induced by control flow through a master node, which may otherwise form a performance bottleneck. Our experiments over a variety of benchmark settings confirm that AIR outperforms Spark and Flink in terms of latency and throughput by a factor of up to 15; moreover, we demonstrate that AIR scales out much better than existing DSPSs to clusters consisting of up to 8 nodes and 224 cores. Comment: 16 pages, 6 figures, 15 plots
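    The idea of routing without a master node can be illustrated with a generic key-based sharding sketch: every worker applies the same deterministic hash to a record key and so computes the destination worker locally, without consulting a coordinator. This is not AIR's Asynchronous Iterative Routing protocol, whose details the abstract does not give, and it is written in Python rather than C++ with MPI. The names `owner_rank` and `route` are hypothetical.

```python
import zlib
from typing import Iterable, List, Tuple

Record = Tuple[str, int]


def owner_rank(key: str, num_workers: int) -> int:
    """Map a key to a worker index with a hash that is stable across processes."""
    if num_workers <= 0:
        raise ValueError("num_workers must be positive")
    # zlib.crc32 is used instead of the built-in hash(), which is salted per process.
    return zlib.crc32(key.encode("utf-8")) % num_workers


def route(records: Iterable[Record], num_workers: int) -> List[List[Record]]:
    """Group records into one outbox per destination worker.

    In a real engine each outbox would be sent directly to its peer
    (for example with a non-blocking send); here it is only a list.
    """
    outboxes: List[List[Record]] = [[] for _ in range(num_workers)]
    for key, value in records:
        outboxes[owner_rank(key, num_workers)].append((key, value))
    return outboxes


if __name__ == "__main__":
    events = [("sensor-a", 1), ("sensor-b", 2), ("sensor-a", 3)]
    for rank, box in enumerate(route(events, num_workers=4)):
        print(rank, box)
```

    Because the mapping depends only on the key and the worker count, records with the same key always land on the same worker, which is what lets each node route independently.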