
    Model-based resource analysis and synthesis of service-oriented automotive software architectures

    Context: Automotive software architectures describe distributed functionality by an interaction of software components. One drawback of today's architectures is their strong integration into the onboard communication network, based on dependencies predefined at design time. The idea is to reduce this rigid integration and these technological dependencies. To this end, service-oriented architecture offers a suitable methodology, since network communication is established dynamically at run-time. Aim: We aim to provide a methodology for analysing hardware resources and synthesising automotive service-oriented architectures based on platform-independent service models, and subsequently focus on transforming these models into a platform-specific architecture realisation process following AUTOSAR Adaptive. Approach: For the platform-independent part, we apply design space exploration and simulation to analyse and synthesise deployment configurations, i.e., mappings of services to hardware resources, at an early development stage. For the platform-specific part, we refine these configurations into AUTOSAR Adaptive software architecture models, which represent the necessary input for a subsequent implementation process. Result: We present deployment configurations that are optimal for a given set of computing resources currently under consideration for our next generation of E/E architecture. We also provide simulation results that demonstrate the ability of these configurations to meet the run-time requirements. Both results helped us decide whether a particular configuration can be implemented. Finally, we provide a prototype of a possible software toolchain for this purpose. Conclusion: The use of models and their analysis are proper means to this end, but the quality and speed of development must also be considered.
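    As a hedged illustration of the deployment-exploration step, the sketch below enumerates mappings of services to ECUs and keeps the feasible one with the lowest worst-case CPU utilisation. The service names, resource figures, and objective are invented for illustration; the thesis's own tooling and AUTOSAR models are not reproduced here.

```python
# Minimal sketch of design space exploration for service deployment,
# assuming simple additive CPU/RAM demands. All names and numbers are
# hypothetical.
from itertools import product

services = {"camera": (30, 128), "fusion": (55, 512), "planner": (40, 256)}  # (CPU %, RAM MB)
ecus = {"ecu1": (100, 1024), "ecu2": (100, 512)}                             # capacities

def feasible(mapping):
    # Sum demands per ECU and reject mappings that exceed any capacity.
    load = {e: [0, 0] for e in ecus}
    for svc, ecu in mapping.items():
        load[ecu][0] += services[svc][0]
        load[ecu][1] += services[svc][1]
    return all(load[e][0] <= ecus[e][0] and load[e][1] <= ecus[e][1] for e in ecus)

def cost(mapping):
    # Example objective: minimise the worst-case CPU utilisation.
    load = {e: 0 for e in ecus}
    for svc, ecu in mapping.items():
        load[ecu] += services[svc][0]
    return max(load.values())

candidates = (dict(zip(services, combo)) for combo in product(ecus, repeat=len(services)))
print(min((m for m in candidates if feasible(m)), key=cost))
```

    Real explorations add timing simulation and many more constraints, but the shape of the problem (enumerate, filter by feasibility, rank by objective) is the same.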

    Scalable processing of aggregate functions for data streams in resource-constrained environments

    The fast evolution of data analytics platforms has resulted in an increasing demand for real-time data stream processing. From Internet of Things applications to the monitoring of telemetry generated in large datacenters, a common demand for currently emerging scenarios is the need to process vast amounts of data with low latencies, generally performing the analysis process as close to the data source as possible. Devices and sensors generate streams of data across a diversity of locations and protocols. That data usually reaches a central platform that is used to store and process the streams. Processing can be done in real time, with transformations and enrichment happening on-the-fly, but it can also happen after data is stored and organized in repositories. In the former case, stream processing technologies are required to operate on the data; in the latter, batch analytics and queries are in common use. Stream processing platforms are required to be malleable and to absorb spikes generated by fluctuations in data generation rates. Data is usually produced as time series that have to be aggregated using multiple operators, with sliding windows being one of the most common abstractions used to process data in real time. To satisfy the above-mentioned demands, efficient stream processing techniques that aggregate data with minimal computational cost need to be developed, even though data analytics might require aggregating extensive windows of data. Approximate computing has been a central paradigm in data analytics for decades, used to improve performance and reduce the resources needed, such as memory, computation time, bandwidth, or energy. In exchange for these improvements, the aggregated results suffer from a level of inaccuracy that in some cases can be predicted and constrained. This doctoral thesis aims to demonstrate that it is possible to have constant-time, memory-efficient aggregation functions with approximate computing mechanisms for constrained environments. To achieve this goal, the work has been structured into three research challenges. First, we introduce a runtime to dynamically construct data stream processing topologies based on user-supplied code. These dynamic topologies are built on-the-fly using a data subscription model defined by the applications that consume data. The subscription-based programming model enables multiple users to deploy their own data-processing services. On top of this runtime, we present the Amortized Monoid Tree Aggregator (AMTA), a general sliding window aggregation framework that seamlessly combines the following features: amortized O(1) time complexity with a worst case of O(log n) between insertions; a window aggregation mechanism and a window slide policy that are both user-programmable; enforcement of the window sliding policy with amortized O(1) computational cost for single evictions and support for bulk evictions at cost O(log n); and a local memory footprint of O(log n). The framework can compute aggregations over multiple data dimensions, and has been designed to decouple computation from data storage through the use of distributed key-value stores to keep window elements and partial aggregations. Especially motivated by edge computing scenarios, we contribute the Approximate and Amortized Monoid Tree Aggregator (A2MTA). It is, to our knowledge, the first general-purpose, programmable sliding window framework that combines constant-time aggregations with error-bounded approximate computing techniques.
    A2MTA uses statistical analysis of the stream data to perform approximate aggregations, providing a critical reduction in the resources needed for massive stream data aggregation and an improvement in performance.
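    To make the monoid idea concrete: any associative operator with an identity element can be maintained over a sliding window in amortized O(1) time. The sketch below uses the classic two-stack formulation rather than AMTA's tree, purely as a minimal illustration of why monoids enable constant-time window aggregation; AMTA adds user-programmable slide policies, bulk evictions, and an O(log n) memory footprint on top of this idea.

```python
# Two-stack sliding-window aggregation for any monoid (op, identity).
# This is a well-known illustration of the principle, not AMTA itself.
class SlidingWindow:
    def __init__(self, op, identity):
        self.op, self.identity = op, identity
        self.front, self.back = [], []   # (value, running_aggregate) pairs

    def _agg(self, stack):
        return stack[-1][1] if stack else self.identity

    def insert(self, x):
        self.back.append((x, self.op(self._agg(self.back), x)))

    def evict(self):
        if not self.front:               # flip: O(n) here, amortized O(1)
            while self.back:
                x, _ = self.back.pop()
                self.front.append((x, self.op(x, self._agg(self.front))))
        self.front.pop()                 # drop the oldest element

    def query(self):
        return self.op(self._agg(self.front), self._agg(self.back))

w = SlidingWindow(max, float("-inf"))    # works equally for sum, min, ...
for v in [3, 1, 4, 1, 5]:
    w.insert(v)
w.evict()                                # evict the oldest element (3)
print(w.query())                         # -> 5
```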

    SAVCBS 2003: Specification and Verification of Component-Based Systems

    These are the proceedings of the SAVCBS 2003 workshop, held at ESEC/FSE 2003 in Helsinki, Finland, in September 2003.

    RHEA: a reactive, heterogeneous, extensible and abstract framework for dataflow programming

    The dataflow computational model enables writing highly parallel programs, which will be deployed on a heterogeneous network, in a concise and readable way. The main advantage is the fact that the system can be conceptually separated into several independent components that can run in parallel and be deployed on different machines. Therefore, concurrency and distribution are implicit, and little or no responsibility for them is placed on the programmer. The framework proposed in this thesis constitutes the underlying system that makes this style of programming possible in JVM-based languages (e.g. Java, Scala, Clojure), while at the same time making it easy to integrate other technologies that rely on the PubSub model, in order to move away from imperative languages and towards a higher level of abstraction. Particular emphasis was put on three domains, namely Big Data, Robotics, and IoT.
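    A minimal, language-agnostic sketch of the subscription-based dataflow idea described above (shown in Python for brevity, though the framework itself targets JVM languages); the Node API and names are invented for illustration, not RHEA's actual interface.

```python
# Hypothetical sketch: consumers extend a dataflow topology at run time
# by subscribing downstream nodes to upstream ones.
class Node:
    def __init__(self, fn):
        self.fn, self.subscribers = fn, []

    def subscribe(self, node):
        # Subscribing grows the topology while it is running.
        self.subscribers.append(node)
        return node

    def emit(self, item):
        out = self.fn(item)
        for sub in self.subscribers:
            sub.emit(out)

source = Node(lambda x: x)                         # pass-through source
doubler = source.subscribe(Node(lambda x: 2 * x))
doubler.subscribe(Node(lambda x: print("got", x)))
source.emit(21)                                    # -> got 42
```

    In a distributed deployment, emit would publish over a PubSub transport instead of calling subscribers in-process, which is what keeps distribution implicit for the programmer.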

    TagNet: a scalable tag-based information-centric network

    The Internet has changed dramatically since the time it was created. What was originally a system to connect relatively few remote users to mainframe computers has now become a global network of billions of diverse devices, serving a large user population, more and more characterized by wireless communication, user mobility, and large-scale, content-rich, multi-user applications that are stretching the basic end-to-end, point-to-point design of TCP/IP. In recent years, researchers have introduced the concept of Information Centric Networking (ICN). The ambition of ICN is to redesign the Internet with a new service model more suitable to today's applications and users. The main idea of ICN is to address information rather than hosts. This means that a user could access information directly, at the network level, without having to first find out which host to contact to obtain that information. The ICN architectures proposed so far are based on a "pull" communication service. This is because today's Internet carries primarily video traffic that is easy to serve through pull communication primitives. Another common design choice in ICN is to name content, typically with hierarchical names similar to file names or URLs. This choice is once again rooted in the use of URLs to access Web content. However, names offer only limited expressiveness and may or may not aggregate well at a global scale. In this thesis we present a new ICN architecture called TagNet. TagNet intends to offer a richer communication model and a new addressing scheme that is at the same time more expressive than hierarchical names from the viewpoint of applications, and more effective from the viewpoint of the network for the purposes of routing and forwarding. For the service model, TagNet extends the mainstream "pull" ICN with an efficient "push" network-level primitive. Such a push service is important for many applications such as social media, news feeds, and the Internet of Things. Push communication could be implemented on top of a pull primitive, but any such implementation would suffer from high traffic overhead and/or poor performance. As for the addressing scheme, TagNet defines and uses different types of addresses for different purposes. Thus TagNet allows applications to describe information by means of sets of tags. Such tag-based descriptors are true content-based addresses, in the sense that they characterize the multi-dimensional nature of information without forcing a partitioning of the information space as is done with hierarchical names. Furthermore, descriptors are completely user-defined, and therefore give more flexibility and expressive power to users and applications, and they also aggregate by subset. By their nature, descriptors have no relation to the network topology and are not intended to identify content uniquely. Therefore, TagNet complements descriptors with locators and identifiers. Locators are network-defined addresses that can be used to forward packets between known nodes (as in the current IP network); content identifiers are unique identifiers for particular blocks of content, and therefore can be used for authentication and caching. In this thesis we propose a complete protocol stack for TagNet covering the routing scheme, forwarding algorithm, and congestion control at the transport level. We then evaluate the whole protocol stack, showing that (1) the use of both push and pull services at the network level reduces network traffic significantly; (2) the tree-based routing scheme we propose scales well, with routing tables that can store billions of descriptors in a few gigabytes thanks to descriptor aggregation; (3) the forwarding engine with specialized matching algorithms for descriptors and locators achieves wire-speed forwarding rates; and (4) the congestion control is able to effectively and fairly allocate all the bandwidth available in the network while minimizing the download time of an object and avoiding congestion.
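    As a rough illustration of subset-based descriptor matching and aggregation, the sketch below treats descriptors as tag sets: a forwarding entry matches a packet when the entry's tags are contained in the packet's descriptor, and a more general entry subsumes a more specific one. The table contents and functions are invented, not TagNet's actual forwarding engine.

```python
# Hypothetical tag-based forwarding table: descriptor -> output port.
table = {
    frozenset({"news"}): "if0",
    frozenset({"news", "sports"}): "if1",
}

def match(packet_tags):
    # All entries whose descriptor is a subset of the packet's tag set.
    return [port for desc, port in table.items() if desc <= packet_tags]

def aggregate(descriptors):
    # Keep only the most general descriptors; supersets are subsumed.
    return [d for d in descriptors if not any(e < d for e in descriptors)]

print(match({"news", "sports", "europe"}))   # -> ['if0', 'if1']
print(aggregate(list(table)))                # -> [frozenset({'news'})]
```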

    A framework for flexible integration in robotics and its applications for calibration and error compensation

    Robotics has been considered a viable automation solution for the aerospace industry to address manufacturing cost. Many existing robot systems augmented with guidance from a large-volume metrology system have proved to meet the high dimensional-accuracy requirements of aero-structure assembly. However, they have mainly been deployed as costly and dedicated systems, which is not ideal for aerospace manufacturing, with its low production rates and long cycle times. The work described in this thesis provides technical solutions to improve the flexibility and cost-efficiency of such metrology-integrated robot systems. To address flexibility, a software framework that supports reconfigurable system integration is developed. The framework provides a design methodology for composing distributed software components that can be integrated dynamically at runtime. This gives the automation devices (robots, metrology, actuators, etc.) controlled by these software components the potential to be assembled on demand for various assembly applications. To reduce the cost of deployment, this thesis proposes a two-stage error compensation scheme for industrial robots that requires only intermittent metrology input, thus allowing one expensive metrology system to be shared by a number of robots. Robot calibration is employed in the first stage to remove the majority of robot inaccuracy; the metrology then corrects the residual errors. In this work, a new calibration model is developed for serial robots with a parallelogram linkage that takes into account both geometric errors and joint deflections induced by link masses and the weight of the end-effector. Experiments are conducted to evaluate the two pieces of work presented above. The proposed framework is used to create a distributed control system that implements calibration and error compensation for a large industrial robot with a parallelogram linkage; the control system is formed by hot-plugging the control applications of the robot and the metrology system. Experimental results show that the developed error model improved the positional accuracy of the loaded robot from several millimetres to less than one millimetre, and halved the time previously required to correct the errors using the metrology alone. The experiments also demonstrate the capability of sharing one metrology system among multiple robots.
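    A conceptual sketch of the two-stage idea follows, with a made-up error model and numbers standing in for the calibration model developed in the thesis: a calibrated model pre-compensates the bulk of the error offline, and a single metrology measurement removes the residual.

```python
# Hypothetical illustration only: the "plant" bias and the 90% model
# coverage are invented, not experimental values from the thesis.
import numpy as np

KNOWN_BIAS = np.array([2.0, -1.5, 0.8])        # mm, pretend true robot error

def robot_move(commanded):
    # Pretend plant: the robot lands off-target by KNOWN_BIAS.
    return commanded + KNOWN_BIAS

def stage1_calibration(target, model_error):
    # Stage 1: the calibrated kinematic model pre-compensates most of
    # the error, with no metrology in the loop.
    return target - model_error

def stage2_metrology(command, target, measured):
    # Stage 2: one intermittent metrology measurement corrects the rest.
    return command + (target - measured)

target = np.array([500.0, 300.0, 200.0])
model_error = 0.9 * KNOWN_BIAS                 # model explains ~90% of error
cmd = stage1_calibration(target, model_error)
measured = robot_move(cmd)                     # ~10% residual remains
cmd = stage2_metrology(cmd, target, measured)
print(robot_move(cmd) - target)                # -> ~[0, 0, 0]
```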

    A Data-driven Methodology Towards Mobility- and Traffic-related Big Spatiotemporal Data Frameworks

    Human population is increasing at unprecedented rates, particularly in urban areas. This increase, along with the rise of a more economically empowered middle class, brings new and complex challenges to the mobility of people within urban areas. To tackle such challenges, transportation and mobility authorities and operators are trying to adopt innovative Big Data-driven mobility- and traffic-related solutions. Such solutions will help decision-making processes that aim to ease the load on an already overloaded transport infrastructure. The information collected from day-to-day mobility and traffic can help to mitigate some of these mobility challenges in urban areas. Road infrastructure and traffic management operators (RITMOs) face several limitations in effectively extracting value from the exponentially growing volumes of mobility- and traffic-related Big Spatiotemporal Data (MobiTrafficBD) that are being acquired and gathered. Research on the topics of Big Data, spatiotemporal data, and especially MobiTrafficBD is scattered, and the existing literature does not offer a concrete, common methodological approach to set up, configure, deploy, and use a complete Big Data-based framework to manage the lifecycle of mobility-related spatiotemporal data, mainly focused on geo-referenced time series (GRTS) and spatiotemporal events (ST Events), extract value from it, and support the decision-making processes of RITMOs. This doctoral thesis proposes a data-driven, prescriptive methodological approach towards the design, development, and deployment of MobiTrafficBD frameworks focused on GRTS and ST Events. Besides a thorough literature review on spatiotemporal data, Big Data, and the merging of these two fields through MobiTrafficBD, the methodological approach comprises a set of general characteristics, technical requirements, logical components, data flows, and technological infrastructure models, as well as guidelines and best practices that aim to guide researchers, practitioners, and stakeholders, such as RITMOs, throughout the design, development, and deployment phases of any MobiTrafficBD framework. This work is intended to be a supporting methodological guide, based on widely used reference architectures and guidelines for Big Data, but enriched with the inherent characteristics and concerns brought about by Big Spatiotemporal Data, as in the case of GRTS and ST Events. The proposed methodology was evaluated and demonstrated in various real-world use cases that deployed MobiTrafficBD-based data management, processing, analytics, and visualisation methods, tools, and technologies, under the umbrella of several research projects funded by the European Commission and the Portuguese Government.
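    To make the two central data shapes concrete, the sketch below defines minimal GRTS and ST Event records; all field names are illustrative assumptions, not schemas prescribed by the thesis.

```python
# Hypothetical minimal records for the two data shapes the methodology
# centres on.
from dataclasses import dataclass

@dataclass
class GRTSPoint:
    """One sample of a geo-referenced time series (e.g. a road sensor)."""
    series_id: str      # the fixed sensor/asset the series is anchored to
    timestamp: float    # epoch seconds
    lat: float
    lon: float
    value: float        # e.g. vehicle count or average speed

@dataclass
class STEvent:
    """A discrete spatiotemporal event (e.g. an accident or road closure)."""
    event_type: str
    start: float        # epoch seconds
    end: float
    lat: float
    lon: float
```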

    Stream Processing Systems Benchmark: StreamBench

    Batch processing technologies (such as MapReduce, Hive, and Pig) have matured and are widely used in industry. These systems successfully solve the problem of processing large volumes of data. However, the data must first be collected and stored in a database or file system, which is very time-consuming, and batch analysis jobs must then run to completion before any results are available. Meanwhile, there are many cases that need analysis results from unbounded sequences of data within seconds or sub-seconds. To satisfy the increasing demand for processing such streaming data, several stream processing systems have been implemented and widely adopted, such as Apache Storm, Apache Spark, IBM InfoSphere Streams, and Apache Flink. They all support online stream processing, high scalability, and task monitoring. However, how to evaluate stream processing systems before choosing one for production development remains an open question. In this thesis, we introduce StreamBench, a benchmark framework to facilitate performance comparisons of stream processing systems. A common API component and a core set of workloads are defined. We implement the common API and run benchmarks for three widely used open-source stream processing systems: Apache Storm, Flink, and Spark Streaming. A key feature of the StreamBench framework is that it is extensible: it supports easy definition of new workloads, in addition to making it easy to benchmark new stream processing systems.
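    A hedged sketch of the "common API plus pluggable engines" pattern described above; every class and method name here is invented for illustration and is not StreamBench's actual API.

```python
# Hypothetical benchmark harness: workloads are written once against a
# common API, and one adapter per engine (Storm, Flink, Spark Streaming)
# plugs into the same harness.
from abc import ABC, abstractmethod
import time

class EngineAdapter(ABC):
    """One adapter per stream processing system under test."""
    @abstractmethod
    def run_workload(self, workload, records): ...

class WordCountWorkload:
    # A core workload defined once, against the common API.
    def apply(self, record):
        for word in record.split():
            yield word, 1

class LocalAdapter(EngineAdapter):
    # Trivial in-process stand-in so the harness runs end to end.
    def run_workload(self, workload, records):
        counts = {}
        for r in records:
            for k, v in workload.apply(r):
                counts[k] = counts.get(k, 0) + v

def benchmark(engine: EngineAdapter, workload, records):
    start = time.perf_counter()
    engine.run_workload(workload, records)
    return len(records) / (time.perf_counter() - start)  # records/second

print(benchmark(LocalAdapter(), WordCountWorkload(), ["a b a"] * 1000))
```

    Benchmarking a new system then only requires writing a new EngineAdapter, and a new workload only requires a new workload class, which is what makes such a framework extensible in both directions.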
