    Towards Message Brokers for Generative AI: Survey, Challenges, and Opportunities

    Generative Artificial Intelligence (GenAI), such as Large Language Models (LLMs), is becoming increasingly prevalent across diverse applications. This surge in adoption has sharply increased demand for data-centric GenAI models, highlighting the need for robust data communication infrastructures. Central to this need are message brokers, which serve as essential channels for data transfer between system components. This survey presents a comprehensive analysis of traditional and modern message brokers, offering a comparative study of prevalent platforms. Our study considers numerous criteria including, but not limited to, open-source availability, integrated monitoring tools, message prioritization mechanisms, capabilities for parallel processing, reliability, distribution and clustering functionalities, authentication processes, data persistence strategies, fault tolerance, and scalability. Furthermore, we explore the intrinsic constraints that the design and operation of each message broker might impose, recognizing that these limitations are crucial to understanding their real-world applicability. Finally, this study examines how message broker mechanisms can be enhanced specifically for GenAI contexts, emphasizing the importance of developing a versatile message broker framework that can adapt quickly to the dynamic and growing demands of GenAI. Through this dual-pronged approach, we intend to contribute a foundational compendium that can guide future innovations and infrastructural advancements in GenAI data communication. Comment: 20 pages, 181 references, 7 figures, 5 tables

    Scaling up publish/subscribe overlays using interest correlation for link sharing

    Topic-based publish/subscribe is at the core of many distributed systems, ranging from application integration middleware to news dissemination. Much research has therefore been dedicated to publish/subscribe architectures and protocols, and in particular to the design of overlay networks for decentralized topic-based routing and efficient message dissemination. Nonetheless, existing systems either fail to take full advantage of shared interests when disseminating information, and hence suffer from high maintenance and traffic costs, or construct overlays that cope poorly with the scale and dynamism of large networks. In this paper we present StaN, a decentralized protocol that optimizes the properties of gossip-based overlay networks for topic-based publish/subscribe by sharing a large number of physical connections without disrupting their logical properties. StaN relies only on local knowledge and operates by leveraging common interests among participants to improve global resource usage and promote topic and event scalability. The experimental evaluation under two real workloads, both via a real deployment and through simulation, shows that StaN provides an attractive infrastructure for scalable topic-based publish/subscribe.
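
The notion of "common interests among participants" can be illustrated with a small sketch. This is not the StaN protocol itself, which the abstract does not specify; it only assumes that each participant's interest is a set of topic names and uses Jaccard overlap as one plausible correlation measure. The names `jaccard`, `rank_neighbors` and the sample subscriptions are hypothetical.

```python
def jaccard(a, b):
    """Jaccard overlap of two topic sets (assumed correlation measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_neighbors(my_topics, candidates):
    """Order candidate peers by interest overlap with `my_topics`.

    Peers with more shared topics come first; ties are broken by name.
    A link to a highly correlated peer could carry several topics at once.
    """
    return sorted(
        candidates,
        key=lambda name: (-jaccard(my_topics, candidates[name]), name),
    )


# Hypothetical subscriptions, for illustration only.
subscriptions = {
    "alice": {"sports", "news", "tech"},
    "bob": {"news", "tech"},
    "carol": {"cooking"},
}

others = {k: v for k, v in subscriptions.items() if k != "alice"}
print(rank_neighbors(subscriptions["alice"], others))
```

In this toy setting a single connection between `alice` and `bob` could serve both the `news` and `tech` topics, which is the intuition behind sharing physical links across logical overlays.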

    New Challenges on Web Architectures for the Homogenization of the Heterogeneity of Smart Objects in the Internet of Things

    This thesis deals with two novel Internet of Things (IoT) technologies and their integration into the field of the Smart Grid (SG); these technologies are the Web of Things (WoT) and the Social Internet of Things (SIoT). The WoT is an enabling technology expected to provide a scalable and interoperable environment to the IoT using the existing web infrastructure, web protocols and the semantic web. The SIoT is expected to contribute to solving scalability and discoverability challenges by creating a social network of agents (objects and humans). When exploring the synergy between those technologies, we aim at providing practical and empirical evidence, usually in the form of prototype implementations and empirical experimentation. In relation to the WoT and SG, we create a prototype for the Web of Energy (WoE) that aims at addressing challenges present in the SG domain. The prototype is capable of providing interoperability and homogeneity among diverse protocols. The implementation design is based on the Actor Model, which also makes the prototype scalable. Experimentation shows that the prototype can handle the transmission of messages for SG applications that require communication within critical time thresholds. We also take a similar research direction that is less focused on the SG and targets a broader range of application domains. We integrate the description of flows of execution as Finite-State Machines (FSMs) using web ontologies (Resource Description Framework (RDF)) and WoT methodologies (actions are performed through Hyper-Text Transfer Protocol/Secure (HTTP/S) requests to Uniform Resource Locators (URLs)). This execution flow, which can also be a template to allow flexible configuration at runtime, is deployed and interpreted as (and through) a Virtual Object (VO). The template aims to be reusable and shareable among multiple IoT deployments within the same application domain. Due to the technologies used, the solution is not suitable for time-critical applications (those with a relatively low and rigid time threshold). Nevertheless, it is suitable for non-time-critical applications that require the deployment of VOs with similar execution flows. Finally, we focus on another technology aimed at improving scalability and discoverability in the IoT. The SIoT is emerging as a new IoT structure that links nodes through meaningful relationships. These relationships aim at improving discoverability and, consequently, the scalability of an IoT network. We apply this new paradigm to optimize energy management at the demand side in an SG. Our objective is to harness the features of the SIoT to aid in the creation of Prosumer Community Groups (PCGs) (groups of users that consume or produce energy) with the same Demand Side Management (DSM) goal. We refer to the synergy between the SIoT and the SG as the Social Internet of Energy (SIoE). With the SIoE, and focusing on a specific challenge, we set the conceptual basis for the integration between the SIoT and the SG. Initial experiments show promising results and pave the way for further research and evaluation of the proposal. We conclude that the WoT and the SIoT are two complementary paradigms that nourish the evolution of the next-generation IoT. The next-generation IoT is expected to be a pervasive Multi-Agent System (MAS). Some researchers are already pointing at the Web and its technologies (e.g. Semantic Web, HTTP/S), and more concretely at the WoT, as the environment nourishing the agents. The SIoT can enhance both the environment and the relationships between agents in this fusion. As a specific field of the IoT, the SG can also benefit from IoT advancements.

    mARC: Memory by Association and Reinforcement of Contexts

    This paper introduces the memory by Association and Reinforcement of Contexts (mARC). mARC is a novel data modeling technology rooted in the second quantization formulation of quantum mechanics. It is an all-purpose incremental and unsupervised data storage and retrieval system which can be applied to all types of signal or data, structured or unstructured, textual or not. mARC can be applied to a wide range of information classification and retrieval problems like e-Discovery or contextual navigation. It can also be formulated in the artificial life framework, a.k.a. Conway's "Game of Life" theory. In contrast to Conway's approach, the objects evolve in a massively multidimensional space. In order to start evaluating the potential of mARC, we have built a mARC-based Internet search engine demonstrator with contextual functionality. We compare the behavior of the mARC demonstrator with Google search both in terms of performance and relevance. In the study we find that the mARC search engine demonstrator outperforms Google search by an order of magnitude in response time while providing more relevant results for some classes of queries.

    Extração de conhecimento a partir de fontes semi-estruturadas (Knowledge extraction from semi-structured sources)

    The increasing number of small, cheap devices full of sensing capabilities has created an untapped source of data that can be explored to improve and optimize multiple systems, from small-scale home automation to large-scale applications such as agriculture monitoring, traffic flow and industrial maintenance prediction. Yet this growth goes hand in hand with the increasing difficulty of collecting, storing and organizing all these new data. The lack of standard context representation schemes is one of the main struggles in this area. Furthermore, conventional methods for extracting knowledge from data rely on standard representations or a priori relations. These a priori relations add latent information to the underlying model, in the form of context representation schemes, table relations, or even ontologies. However, these relations are created and maintained by human users. While feasible for small-scale scenarios or specific areas, this becomes increasingly difficult to maintain when considering the potential dimension of IoT and M2M scenarios. This thesis addresses the problem of storing and organizing context information from IoT/M2M scenarios in a meaningful way, without imposing a representation scheme or requiring a priori relations. This work proposes a d-dimension organization model, which was optimized for IoT/M2M data. The model relies on machine learning features to identify similar context sources. These features are then used to learn relations between data sources automatically, providing the foundations for automatic knowledge extraction, on which machine learning, or even conventional methods, can rely to extract knowledge from a potentially relevant dataset. During this work, two different machine learning techniques were tackled: semantic and stream similarity. Semantic similarity estimates the similarity between concepts (in textual form). This thesis proposes an unsupervised learning method for semantic features based on distributional profiles, without requiring any specific corpus. This allows the organizational model to organize data based on concept similarity instead of string matching. Another advantage is that the learning method does not require input from users, making it ideal for massive IoT/M2M scenarios. Stream similarity metrics estimate the similarity between two streams of data. Although these methods have been extensively researched for DNA sequencing, they commonly rely on variants of the longest common sub-sequence. This PhD proposes a generative model for stream characterization, specially optimized for IoT/M2M data. The model can be used to generate statistically significant data streams and estimate the similarity between streams. This is then used by the context organization model to identify context sources with similar stream patterns. The work proposed in this thesis was extensively discussed, developed and published in several international publications. The multiple contributions in projects and collaborations with fellow colleagues, where parts of the work developed were used successfully, support the claim that although the context organization model (and subsequent similarity features) were optimized for IoT/M2M data, they can potentially be extended to deal with any kind of context information in a wide array of applications. The present study was developed in the scope of the Smart Green Homes Project [POCI-01-0247-FEDER-007678], a co-promotion between Bosch Termotecnologia S.A. and the University of Aveiro. It is financed by Portugal 2020 under the Competitiveness and Internationalization Operational Program, and by the European Regional Development Fund. Doctoral Programme in Informatics (Programa Doutoral em Informática)
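
For readers unfamiliar with the longest-common-sub-sequence family of stream similarity metrics mentioned in the abstract above, the following is a minimal sketch of that baseline idea. It is not the generative model proposed in the thesis; it assumes streams have already been discretized into comparable symbols, and the function names `lcs_length` and `lcs_similarity` and the normalization by the longer stream are illustrative choices.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of sequences a and b.

    Classic dynamic programming, keeping only the previous row.
    """
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_similarity(a, b):
    """Similarity in [0, 1]: LCS length divided by the longer stream length.

    The normalization is an assumption made for this sketch.
    """
    if not a and not b:
        return 1.0
    return lcs_length(a, b) / max(len(a), len(b))


# Two hypothetical, already-discretized sensor streams.
stream_1 = ["low", "low", "mid", "high", "mid"]
stream_2 = ["low", "mid", "high", "high", "mid"]
print(lcs_similarity(stream_1, stream_2))
```

A metric of this kind compares symbol order only, which is one reason a model tailored to the statistical behavior of IoT/M2M streams may be preferable.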

    2013 Doctoral Workshop on Distributed Systems

    The Doctoral Workshop on Distributed Systems was held at Les Plans-sur-Bex, Switzerland, from June 26-28, 2013. Ph.D. students from the Universities of Neuchâtel and Bern as well as the University of Applied Sciences of Fribourg presented their current research work and discussed recent research results. This technical report includes the extended abstracts of the talks given during the workshop.

    Agent Organization and Request Propagation in the Knowledge Plane

    In designing and building a network like the Internet, we continue to face the problems of scale and distribution. In particular, network management has become an increasingly difficult task, and network applications often need to maintain efficient connectivity graphs for various purposes. The knowledge plane was proposed as a new construct to improve network management and applications. In this proposal, I present an application-independent mechanism to support the construction of application-specific connectivity graphs. Specifically, I propose to build a network knowledge plane and multiple sub-planes for different areas of network services. The network knowledge plane provides valuable knowledge about the Internet to the sub-planes, and each sub-plane constructs its own connectivity graph using network knowledge and knowledge in its own specific area. I focus on two key design issues: (1) a region-based architecture for agent organization; (2) knowledge dissemination and request propagation. Network management and applications benefit from the underlying network knowledge plane and sub-planes. To demonstrate the effectiveness of this mechanism, I conduct case studies in network management and security.