Summary of the 12th International Workshop on Models@run.time
This year the 12th edition of the Models@run.time workshop was held at the 20th International Conference on Model Driven Engineering Languages and Systems. The workshop took place in the city of Austin, Texas, USA, on the 18th of September 2017. The workshop was organized by Sebastian Götz, Nelly Bencomo, Kirstie Bellman and Gordon Blair. Here, we present a summary of the workshop and a synopsis of the topics discussed and highlighted during the workshop.
Feature-Model-Guided Online Learning for Self-Adaptive Systems
A self-adaptive system can modify its own structure and behavior at runtime
based on its perception of the environment, of itself and of its requirements.
To develop a self-adaptive system, software developers codify knowledge about
the system and its environment, as well as how adaptation actions impact the
system. However, the codified knowledge may be insufficient due to design-time
uncertainty, and thus a self-adaptive system may execute adaptation actions
that do not have the desired effect. Online learning is an emerging approach to
address design-time uncertainty by employing machine learning at runtime.
Online learning accumulates knowledge at runtime by, for instance, exploring
not-yet-executed adaptation actions. We address two specific problems with
respect to online learning for self-adaptive systems. First, the number of
possible adaptation actions can be very large. Existing online learning
techniques randomly explore the possible adaptation actions, but this can lead
to slow convergence of the learning process. Second, the possible adaptation
actions can change as a result of system evolution. Existing online learning
techniques are unaware of these changes and thus do not explore new adaptation
actions, but explore adaptation actions that are no longer valid. We propose
using feature models to give structure to the set of adaptation actions and
thereby guide the exploration process during online learning. Experimental
results involving four real-world systems suggest that considering the
hierarchical structure of feature models may speed up convergence by 7.2% on
average. Considering the differences between feature models before and after an
evolution step may speed up convergence by 64.6% on average. [...]
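To make the idea concrete, here is a minimal, hypothetical sketch of feature-model-guided exploration (the feature model, reward function, and exploration schedule are all illustrative, not the paper's actual technique): instead of sampling adaptation actions uniformly, exploration biases each feature toward its best-performing option so far.

```python
import random

# Hypothetical feature model: each adaptation action is one combination
# of options, one option per feature.
FEATURE_TREE = {
    "CacheSize": ["Small", "Medium", "Large"],
    "Compression": ["None", "Gzip"],
}

def all_actions(tree):
    """Enumerate every feature combination (one adaptation action each)."""
    actions = [()]
    for feature, options in tree.items():
        actions = [a + ((feature, opt),) for a in actions for opt in options]
    return actions

def guided_explore(tree, reward_fn, rounds=20, seed=0):
    """Explore actions, biasing each feature toward its best observed option."""
    rng = random.Random(seed)
    scores = {}  # (feature, option) -> running mean reward
    best = (None, float("-inf"))
    for _ in range(rounds):
        action = []
        for feature, options in tree.items():
            # With probability 0.5 exploit the best-known option, else explore.
            known = [(scores.get((feature, o), 0.0), o) for o in options]
            option = max(known)[1] if rng.random() < 0.5 else rng.choice(options)
            action.append((feature, option))
        r = reward_fn(tuple(action))
        for fo in action:  # update running per-option scores
            scores[fo] = scores.get(fo, 0.0) * 0.5 + r * 0.5
        if r > best[1]:
            best = (tuple(action), r)
    return best

# Toy reward: a larger cache and compression perform best.
def reward(action):
    d = dict(action)
    return {"Small": 0, "Medium": 1, "Large": 2}[d["CacheSize"]] + (1 if d["Compression"] == "Gzip" else 0)

best_action, best_reward = guided_explore(FEATURE_TREE, reward)
```

The structure also addresses the evolution problem: when the feature model changes, options absent from the new tree are simply never generated, while new options start unscored and are picked up by the exploration branch.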
On Implementing Autonomic Systems with a Serverless Computing Approach: The Case of Self-Partitioning Cloud Caches
The research community has made significant advances towards realizing self-tuning cloud caches; nevertheless, existing products still require manual expert tuning to maximize performance. Cloud (software) caches are built to swiftly serve requests; thus, avoiding costly functionality additions not directly related to the request-serving control path is critical. We show that serverless computing cloud services can be leveraged to solve the complex optimization problems that arise during self-tuning loops, and can be used to optimize cloud caches virtually for free. To illustrate that our approach is feasible and useful, we implement SPREDS (Self-Partitioning REDiS), a modified version of Redis that optimizes memory management in the multi-instance Redis scenario. A cost analysis shows that the serverless computing approach can lead to significant cost savings: the cost of running the controller as a serverless microservice is 0.85% of the cost of the always-on alternative. Through this case study, we make a strong case for implementing the controller of autonomic systems using a serverless computing approach.
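The architectural idea can be sketched in a few lines (this is an illustration of the pattern, not the actual SPREDS implementation; the proportional-allocation heuristic and names are assumptions): the expensive "plan" step of the self-tuning loop is factored into a function that, in production, would run as an on-demand serverless invocation billed only for the milliseconds it executes.

```python
# Sketch of a self-partitioning control loop in the SPREDS style.
# `plan_partitions` stands in for the remote serverless optimizer.

def plan_partitions(total_mb, demands):
    """Plan step: split a fixed memory budget in proportion to demand."""
    total_demand = sum(demands.values())
    return {name: total_mb * d / total_demand for name, d in demands.items()}

def control_loop(total_mb, observed_demands):
    # Monitor/analyze: per-instance demand (e.g., miss rate x traffic).
    # Plan: in SPREDS this call would be a serverless invocation.
    allocation = plan_partitions(total_mb, observed_demands)
    # Execute: apply the new maxmemory limit to each Redis instance (omitted).
    return allocation

alloc = control_loop(1024, {"redis-a": 3.0, "redis-b": 1.0})
```

Because the controller only exists while it runs, its cost scales with tuning frequency rather than with wall-clock time, which is the source of the 0.85% figure reported above.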
QoS-aware approximate query processing for smart city spatial data streams
Large amounts of georeferenced data streams arrive daily at stream processing systems, owing to the overabundance of affordable Internet of Things (IoT) devices. In addition, practitioners wish to exploit IoT data streams for strategic decision-making. However, mobility data are highly skewed and their arrival rates fluctuate, which poses an extra challenge for data stream processing systems that must meet prespecified latency and accuracy goals. In this paper, we propose ApproxSSPS, a system for approximate processing of georeferenced mobility data at scale with quality-of-service guarantees. We focus on stateful aggregations (e.g., means, counts) and top-N queries. ApproxSSPS features a controller that interactively learns the latency statistics and calculates proper sampling rates to meet latency and/or accuracy targets. An overarching trait of ApproxSSPS is its ability to strike a sound balance between latency and accuracy targets. We evaluate ApproxSSPS on Apache Spark Structured Streaming with real mobility data, and compare it against a state-of-the-art online adaptive processing system. Our extensive experiments show that ApproxSSPS can fulfill latency and accuracy targets under varying parameter configurations and load intensities (i.e., transient peaks in data loads versus slowly arriving streams). Moreover, our results show that ApproxSSPS outperforms the baseline by significant margins. In short, ApproxSSPS is a novel spatial data stream processing system that delivers accurate results in a timely manner by dynamically adjusting the limits on data samples.
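The controller described above can be illustrated with a deliberately simplified feedback step (the real ApproxSSPS controller learns latency statistics; this toy version is purely proportional, and the gain and bounds are assumptions): the sampling rate is lowered when observed latency exceeds the target and raised when there is slack.

```python
# Toy proportional controller for a latency-targeted sampling rate.

def adjust_sampling_rate(rate, observed_latency_ms, target_latency_ms, gain=0.5):
    """Return a new sampling rate in (0, 1] after one control step."""
    # Positive error = slack below the target; negative = over the target.
    error = (target_latency_ms - observed_latency_ms) / target_latency_ms
    new_rate = rate * (1.0 + gain * error)
    return max(0.01, min(1.0, new_rate))  # clamp to a usable range

# Batch ran 50% over its latency target: sample fewer records next batch.
r_over = adjust_sampling_rate(1.0, observed_latency_ms=1500, target_latency_ms=1000)
# Batch finished in half the target time: sampling can be increased.
r_slack = adjust_sampling_rate(0.5, observed_latency_ms=500, target_latency_ms=1000)
```

Sampling more aggressively trades accuracy for latency, which is exactly the balance the system is designed to negotiate.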
Energy-aware cluster reconfiguration algorithm for the big data analytics platform Spark
The development of cloud computing and data analytics technologies has made it possible to process big data faster. Distributed computing schemes, for instance, can help to reduce the time required for data analysis and thus enhance its efficiency. However, few researchers have paid attention to the high energy consumption of such clusters, which places a heavy burden on the environment, especially when the number of nodes is extremely large; as a consequence, the principle of sustainable development is violated. Considering this problem, this paper proposes an approach that removes less-efficient nodes or migrates load away from over-utilized nodes, so as to adjust the load of the cluster properly and thereby achieve the goal of energy conservation. Furthermore, to verify the performance of the proposed methodology, we present simulation results obtained using CloudSim.
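A minimal sketch of this kind of reconfiguration policy might look as follows (the thresholds and the utilization-only node model are illustrative assumptions, not the paper's algorithm): nodes whose utilization is too low to justify their energy cost are marked for removal, while over-utilized nodes are flagged as migration sources.

```python
# Illustrative energy-aware reconfiguration: classify nodes by utilization.
LOW, HIGH = 0.2, 0.85  # assumed utilization thresholds

def reconfigure(cluster):
    """Partition nodes into keep / remove / migrate-from sets."""
    plan = {"keep": [], "remove": [], "migrate": []}
    for node, util in cluster.items():
        if util < LOW:
            plan["remove"].append(node)   # power down under-utilized nodes
        elif util > HIGH:
            plan["migrate"].append(node)  # shift load off overloaded nodes
        else:
            plan["keep"].append(node)
    return plan

plan = reconfigure({"n1": 0.05, "n2": 0.60, "n3": 0.95})
```

A real implementation would additionally check that the remaining nodes can absorb the displaced load before powering anything down.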
Performance Evaluation Analysis of Spark Streaming Backpressure for Data-Intensive Pipelines
A significant rise in the adoption of streaming applications has changed decision-making processes in the last decade. This movement has led to the emergence of several Big Data technologies for in-memory processing, such as Apache Storm, Spark, Heron, Samza, Flink, and others. Spark Streaming, a widespread open-source implementation, processes data-intensive applications that often require large amounts of memory. However, the Spark Unified Memory Manager cannot properly manage sudden or intensive data surges and their related in-memory caching needs, resulting in performance and throughput degradation, high latency, a large number of garbage collection operations, out-of-memory issues, and data loss. This work presents a comprehensive performance evaluation of Spark Streaming backpressure to investigate the hypothesis that it could support data-intensive pipelines under specific pressure requirements. The results reveal that backpressure is suitable only for small and medium pipelines, for both stateless and stateful applications. Furthermore, the evaluation points out the Spark Streaming limitations that lead to in-memory issues for data-intensive pipelines and stateful applications, and indicates potential solutions.
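For reference, backpressure in DStream-based Spark Streaming is controlled through configuration properties such as the following; the keys are Spark's own, while the rate values here are purely illustrative:

```properties
# Enable dynamic rate control driven by observed batch latencies
spark.streaming.backpressure.enabled=true
# Illustrative caps: initial rate before the feedback loop has data,
# plus per-receiver and per-Kafka-partition ceilings (records/second)
spark.streaming.backpressure.initialRate=1000
spark.streaming.receiver.maxRate=10000
spark.streaming.kafka.maxRatePerPartition=5000
```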
Automatic Rescaling and Tuning of Big Data Applications on Container-Based Virtual Environments
Programa Oficial de Doutoramento en Investigación en Tecnoloxías da Información. 524V01
[Abstract]
Current Big Data applications have evolved significantly from their origins, moving from mostly batch workloads to more complex ones that may involve many processing stages using different technologies, or even operating in real time. Moreover, to deploy these applications, commodity clusters have in some cases been replaced by newer, more flexible paradigms such as the Cloud, or even emerging ones such as serverless computing, both usually involving virtualization techniques. This Thesis proposes two frameworks that provide alternative ways to perform in-depth analysis and improved resource management for Big Data applications deployed on virtual environments based on software containers. On the one hand, the BDWatchdog framework is capable of performing real-time, fine-grained analysis in terms of system resource monitoring and code profiling. On the other hand, a framework for the dynamic, real-time scaling of resources according to several tuning policies is described. The first proposed policy revolves around the automatic scaling of the containers' resources according to the applications' real usage, thus providing a serverless environment. Furthermore, an alternative policy focused on energy management is presented, in which power capping and budgeting functionalities are implemented for containers, applications, or even users. Overall, the frameworks proposed in this Thesis aim to showcase how novel ways of analyzing and tuning the resources given to Big Data applications in container clusters are possible, even in real time. The presented use cases are examples of this, and show how Big Data applications can be adapted to newer technologies or paradigms without having to lose their distinctive characteristics.
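The first policy above, usage-driven scaling of container resources, can be sketched as a single control step (the headroom factor, bounds, and function names are assumptions for illustration, not the thesis framework's actual parameters): each container's CPU limit is periodically moved toward its observed usage while keeping some slack so the application is not throttled.

```python
# Sketch of a serverless-style container rescaling step.
HEADROOM = 1.25          # assumed: keep 25% slack above observed usage
MIN_CPU, MAX_CPU = 0.5, 8.0  # assumed per-container limits, in cores

def rescale(current_limit, observed_usage):
    """Return the new CPU limit (cores) for one container."""
    target = observed_usage * HEADROOM
    return round(max(MIN_CPU, min(MAX_CPU, target)), 2)

# A container holding 4 cores but using only 1 is shrunk toward its usage.
new_limit = rescale(current_limit=4.0, observed_usage=1.0)
```

The energy-management policy is analogous, except the quantity being capped and budgeted is power rather than CPU shares.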
Challenges and Opportunities in Applied System Innovation
This book introduces and provides solutions to a variety of problems faced by society, companies, and individuals in a quickly changing and technology-dependent world. The wide acceptance of artificial intelligence, the upcoming fourth industrial revolution, and newly designed 6G technologies are seen as the main enablers and game changers in this environment. The book considers these issues not only from a technological viewpoint but also in terms of how society, labor, and the economy are affected, leading to a circular economy that affects the way people design, operate, and deploy complex systems.
Online disturbance prediction for enhanced availability in smart grids
A gradual move in the electric power industry towards Smart Grids brings new challenges to the system's efficiency and dependability. With growing complexity and the massive introduction of renewable generation, particularly at the distribution level, the number of faults and, consequently, disturbances (errors and failures) is expected to increase significantly. This threatens to compromise the grid's availability, as traditional, reactive management approaches may soon become insufficient. On the other hand, with the grid's digitalization, real-time status data are becoming available. These data may be used to develop advanced management and control methods for a sustainable, more efficient, and more dependable grid. A proactive management approach, based on the use of real-time data for predicting near-future disturbances and acting in anticipation of them, has already been identified by the Smart Grid community as one of the main pillars of dependability of the future grid. The work presented in this dissertation focuses on predicting disturbances in Active Distribution Networks (ADNs), the part of the Smart Grid that is evolving the most. These are distribution networks with a high share of (renewable) distributed generation and with systems in place for real-time monitoring and control. Our main goal is to develop a methodology for proactive network management, in the sense of proactive mitigation of disturbances, and to design and implement a method for their prediction. We focus on predicting voltage sags, as they are among the most frequent and severe disturbances in distribution networks. We address Smart Grid dependability in a holistic manner by considering its cyber and physical aspects. As a result, we identify Smart Grid dependability properties and develop a taxonomy of faults that contributes to a better understanding of the overall dependability of the future grid.
As the process of the grid's digitization is still ongoing, there is a general lack of data on the grid's status, and especially of disturbance-related data. These data are necessary to design an accurate disturbance predictor. To overcome this obstacle, we introduce the concept of fault injection into power system simulation. We develop a framework to simulate the behavior of distribution networks in the presence of faults and of fluctuating generation and load that, alone or combined, may cause disturbances. With this framework we generate a large data set that we use to develop and evaluate a voltage-sag disturbance predictor. To quantify how prediction and proactive mitigation of disturbances enhance availability, we create an availability model of proactive management. The model is generic and may be applied to evaluate the effect of proactive management on availability in other types of systems, and adapted to quantify other properties as well. We also design a metric and a method for optimizing failure prediction to maximize availability under the proactive approach. We conclude that the availability improvement achieved with the proactive approach is comparable to that obtained using high-reliability, costly components. Following the results of the case study conducted for a 14-bus ADN, the grid's availability may be improved by up to an order of magnitude if disturbances are managed proactively instead of reactively.
The main results and contributions may be summarized as follows: (i) a taxonomy of faults in the Smart Grid has been developed; (ii) a methodology and methods for proactive management of disturbances have been proposed; (iii) a model to quantify availability with proactive management has been developed; (iv) a simulation and fault-injection framework has been designed and implemented to generate disturbance-related data; (v) in the scope of a case study, a voltage-sag predictor based on machine-learning classification algorithms has been designed, and the effect of proactive disturbance management on downtime and availability has been quantified.
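The availability argument can be made concrete with the classic steady-state availability formula (the numbers below are illustrative, not from the dissertation's case study): proactive mitigation converts a fraction of failures into much shorter planned interventions, shrinking expected downtime per failure.

```python
# Toy version of an availability model with proactive mitigation.

def availability(mtbf_h, mttr_h):
    """Steady-state availability: uptime / (uptime + downtime)."""
    return mtbf_h / (mtbf_h + mttr_h)

def proactive_availability(mtbf_h, mttr_h, predicted_frac, proactive_repair_h):
    """Availability when `predicted_frac` of failures are mitigated proactively."""
    # Expected downtime per failure mixes short proactive interventions
    # with full reactive repairs for the unpredicted remainder.
    expected_repair = predicted_frac * proactive_repair_h + (1 - predicted_frac) * mttr_h
    return availability(mtbf_h, expected_repair)

a_reactive = availability(1000, 4)                     # purely reactive repair
a_proactive = proactive_availability(1000, 4,          # 80% of failures predicted,
                                     predicted_frac=0.8,
                                     proactive_repair_h=0.5)
```

With these illustrative inputs, expected downtime per failure drops from 4 h to 1.2 h, which is the mechanism behind the order-of-magnitude availability gains reported for the 14-bus case study.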
Automated Analysis of Integrated Software Product Line Specifications
The trend towards digitalization leads to new application scenarios (e.g., Industry 4.0, the Internet of Things, smart grids) that require runtime-adaptive software systems, which adapt to changing environmental conditions through continuous reconfiguration. Integrated software product line specifications enable the precise description of the consistency properties of such systems in a unified representation. The specification language Clafer, for instance, offers language constructs both for characterizing a system's runtime variability and for describing the reconfigurable parts of the system architecture, as well as complex dependencies. To this end, Clafer specifications combine language constructs from UML class diagrams and meta-modeling languages with feature-oriented modeling techniques and constraints in first-order predicate logic. Due to their considerable expressiveness, such integrated product line specifications tend to become very complex in practice (e.g., because of hidden dependencies between configuration options and components). They are therefore highly susceptible to specification errors in the form of inconsistencies, and to design weaknesses in the form of anomalies. Inconsistencies and anomalies, however, must be detected and resolved as early as possible in the design process to avoid drastic follow-up costs at a system's runtime. For this reason, static analysis techniques for the automated analysis of integrated software product line specifications are indispensable.
Existing approaches to consistency checking require the search space for instance search to be restricted in advance, either manually or through heuristically identified bounds. Since, when no instance is found, it is unknown whether this is caused by a search space chosen too small or by an actual inconsistency, existing analysis methods are inherently incomplete and of limited practical use. Moreover, no analyses have yet been proposed for identifying anomalies such as those that can occur in variability models.
Furthermore, while existing methods can handle integer attributes, they do not enable the efficient analysis of specifications that additionally contain real-valued attributes.
In this thesis, we present an approach for the automated analysis of integrated software product line specifications written in the Clafer language. To this end, we present a holistic specification of the structural consistency properties of runtime-adaptive software systems and propose novel types of anomalies that can occur in Clafer specifications. We characterize a core language that enables a complete and correct analysis of Clafer specifications. In addition, we introduce a novel semantic representation as a mathematical optimization problem which, beyond the core language, enables the efficient analysis of practically relevant Clafer specifications and allows established off-the-shelf solvers to be applied. The methods and techniques of this thesis are illustrated by a running example of a self-adaptive communication system and implemented as a prototype. The experimental evaluation shows the effectiveness of our analysis approach as well as considerable improvements in runtime efficiency compared to established methods.
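To illustrate the kinds of anomalies such analyses detect, here is a hypothetical miniature (the feature model, constraint, and brute-force enumeration are illustrative; real tools like the thesis's approach use symbolic solvers rather than enumeration): a "dead" feature appears in no valid configuration, while a "false-optional" feature appears in every one despite being declared optional.

```python
from itertools import product

# Tiny boolean feature model with one cross-tree constraint.
FEATURES = ["Encryption", "Compression", "LowPower"]

def valid(cfg):
    # Illustrative constraint: every product requires Encryption,
    # but LowPower mode forbids it.
    return cfg["Encryption"] and not (cfg["LowPower"] and cfg["Encryption"])

def analyze(features, valid_fn):
    """Enumerate all configurations and flag dead / false-optional features."""
    configs = [dict(zip(features, bits))
               for bits in product([False, True], repeat=len(features))]
    sat = [c for c in configs if valid_fn(c)]
    dead = [f for f in features if all(not c[f] for c in sat)]
    false_optional = [f for f in features if all(c[f] for c in sat)]
    return sat, dead, false_optional

sat, dead, false_optional = analyze(FEATURES, valid)
```

Here the constraint makes LowPower dead and Encryption false-optional, exactly the kind of hidden dependency that motivates automated anomaly analysis.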