
    ConTaaS: An Approach to Internet-Scale Contextualisation for Developing Efficient Internet of Things Applications

    The Internet of Things (IoT) is a recent evolution of the Internet that connects billions of sensors and other devices to it. These IoT devices can communicate directly with one another. They also allow Internet users and applications to access and distil their data, control their functions, and combine the information and functionality of multiple devices to offer novel smart services. Collectively, IoT devices generate massive amounts of data at high velocity, and processing these data to distil high-value information presents an Internet-scale computational challenge. Contextualisation of IoT data can improve the value of the information extracted from the IoT; however, existing contextualisation techniques can only handle small datasets from a modest number of devices. In this paper, we propose a general-purpose architecture and related techniques for the contextualisation of IoT data. In particular, we introduce a Contextualisation-as-a-Service (ConTaaS) architecture that incorporates scalability-improving techniques, together with a proof-of-concept implementation on elastic cloud-based infrastructure that achieves near real-time contextualisation of IoT data. Experimental evaluations validating the efficiency of ConTaaS are also provided in this paper.
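    The abstract does not detail the contextualisation step itself, but the core idea, a stateless enrichment function that can be replicated across elastic cloud infrastructure, can be illustrated with a minimal sketch. All names below (Reading, enrich, the registry layout) are illustrative assumptions, not the paper's API.

```python
# Minimal sketch of a contextualisation step as a ConTaaS-style service
# might expose it. Names and data layout are assumptions for illustration.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class Reading:
    device_id: str
    value: float
    timestamp: datetime
    context: dict = field(default_factory=dict)

def enrich(reading: Reading, device_registry: dict) -> Reading:
    """Attach situational context to a raw IoT reading.

    Stateless by design, so many instances can run behind a cloud
    load balancer to approach near real-time throughput."""
    meta = device_registry.get(reading.device_id, {})
    reading.context["location"] = meta.get("location", "unknown")
    reading.context["daypart"] = (
        "day" if 6 <= reading.timestamp.hour < 18 else "night"
    )
    return reading

registry = {"sensor-42": {"location": "building-A/floor-2"}}
r = enrich(Reading("sensor-42", 21.5, datetime.now(timezone.utc)), registry)
print(r.context)
```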

    Knowledge-base and techniques for effective service-oriented programming & management of hybrid processes

    Recent advances in Web 2.0, SOA, crowd-sourcing, social and collaboration technologies, as well as cloud computing, have transformed the Internet into a global development and deployment platform, giving developers ubiquitous access to countless Web services, resources and tools. While this enables tremendous automation and reuse opportunities, new productivity challenges have also emerged: exploiting services and resources still requires skilled programmers and a development-centric approach, and is thus susceptible to the same repetitive, error-prone and time-consuming integration work each time a developer integrates a new API. Business Process Management (BPM), on the other hand, was proposed to support service-based integration. It provides the benefits of automation and modelling, which appeal to non-technical domain experts, but it proves too rigid for unstructured processes. Without this level of support, building new applications requires either extensive manual programming or homebrew solutions. Alternatively, with the proliferation of SaaS, such tools can be used for independent portions of the overall process, although this either presupposes conforming to the built-in process or results in "shadow processes" that exchange information and decisions over e-mail and the like. There has therefore been a persistent gap in technological support between structured and unstructured processes.

    To address these challenges, this thesis deals with transitioning process support from structured to unstructured processes. We are motivated to harness the foundational capabilities of BPM and apply them to unstructured processes. We achieve this by first addressing the productivity challenges of Web-service integration, simplifying this process while encouraging an incremental curation and collective reuse approach. We then extend this into an innovative Hybrid-Process Management Platform that holistically combines structured, semi-structured and unstructured activities, based on a unified task model that encapsulates a spectrum of process specificity, thereby bridging the aforementioned technology gap. The approach is exposed as service-based libraries and tools, and we devised several use-case scenarios and conducted user studies to evaluate the overall effectiveness of our proposed work.
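    As a rough illustration of what a unified task model spanning a spectrum of process specificity might look like, the hypothetical sketch below treats structured tasks as dependency-ordered and unstructured tasks as always available to pick up. The classes and rules are invented for illustration, not taken from the thesis.

```python
# Hypothetical sketch of a unified task model spanning structured to
# unstructured work; names are illustrative, not the thesis's API.
from dataclasses import dataclass, field
from enum import Enum

class Specificity(Enum):
    STRUCTURED = "structured"        # fixed control flow, BPM-style
    SEMI_STRUCTURED = "semi"         # constrained, but flexible ordering
    UNSTRUCTURED = "ad_hoc"          # free-form collaboration

@dataclass
class Task:
    name: str
    specificity: Specificity
    depends_on: list = field(default_factory=list)  # empty for ad-hoc work

def ready_tasks(tasks, done):
    """A structured task is ready only when its dependencies are done;
    an unstructured task is always available to pick up."""
    return [
        t for t in tasks
        if t.name not in done
        and (t.specificity is Specificity.UNSTRUCTURED
             or all(d in done for d in t.depends_on))
    ]

plan = [
    Task("collect-requirements", Specificity.STRUCTURED),
    Task("review", Specificity.STRUCTURED, depends_on=["collect-requirements"]),
    Task("discuss-on-chat", Specificity.UNSTRUCTURED),
]
print([t.name for t in ready_tasks(plan, done=set())])
```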

    Virtual Machine Image Management for Elastic Resource Usage in Grid Computing

    Grid computing has evolved from an academic concept to a powerful paradigm in high-performance computing (HPC). Over the last few years, powerful Grid computing solutions were developed that allow the execution of computational tasks on distributed computing resources, and Grid computing has recently attracted commercial customers. To enable commercial customers to process sensitive data in the Grid, strong security mechanisms must be put in place to protect that data. In contrast, Cloud computing, which entered the scene in 2006, was driven by industry and designed with security in mind from the beginning: virtualization technology is used to separate users, for example by placing each user of a system inside a virtual machine, which prevents them from accessing other users' data. The use of virtualization in the context of Grid computing was examined early and found to be a promising approach to counter the security threats that appeared with commercial customers. One main part of the work presented in this thesis is the Image Creation Station (ICS), a component that lets users administer their virtual execution environments (virtual machines) themselves and that manages and distributes the virtual machines across the entire system. Whereas Cloud computing was designed so that even inexperienced users can easily execute their computational tasks, Grid computing is much more complex to use. The ICS makes the Grid easier to use by overcoming traditional limitations such as having to install the required software on the compute nodes where tasks run. This allows users to bring commercial software to the Grid for the first time, without local administrators having to install the software on computing nodes accessible to all users. Moreover, the administrative burden is shifted from the local Grid site's administrator to the users or to experienced software providers, enabling the provision of individually tailored virtual machines to each user. The ICS not only enables users to manage their virtual machines themselves; it also ensures that the virtual machines are available on every site of the distributed Grid system. A second aspect of the presented solution focuses on the elasticity of the system: free external resources are automatically acquired depending on the system's current workload. In contrast to existing systems, the presented approach allows the administrator to add or remove resource sets at runtime without restarting the entire system. Moreover, the presented solution lets users not only use existing Grid resources but also scale out to Cloud resources and use them on demand. By ensuring that unused resources are shut down as soon as possible, the computational costs of a given task are minimized. In addition, each user can specify which resources may be used to execute a particular job. This is useful when a job processes sensitive data, for example data that must not leave the company. To obtain comparable functionality in today's systems, a user must submit her computational task to one particular resource set, losing the ability to schedule automatically when more than one set of resources could be used.

    In addition, the proposed solution prioritizes each set of resources by taking different metrics into account (e.g., the level of trust or computational cost) and tries to schedule the job to the resources with the highest priority first. Notably, the priority often mirrors the physical distance between the resources and the user: a locally available cluster usually has a higher priority due to its high level of trust and computational costs that are usually lower than those of Cloud resources. This scheduling strategy therefore minimizes the cost of job execution while improving security at the same time, since data is not necessarily transferred to remote resources and the probability of attacks by malicious external users is reduced. Bringing both components together results in a system that adapts automatically to the current workload by using external (e.g., Cloud) resources together with locally available resources or Grid sites, and that provides individually tailored virtual execution environments to the system's users.
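    The priority-first placement described above can be illustrated with a small sketch: rank the resource sets a job is allowed to use by trust and cost, and fall back to scaling out when none fits. ResourceSet and the scoring formula are assumptions for illustration, not the thesis's actual scheduler.

```python
# Minimal sketch of priority-first scheduling over resource sets.
# The class, weights, and numbers are invented for illustration.
from dataclasses import dataclass

@dataclass
class ResourceSet:
    name: str
    trust: float   # 0.0 (untrusted) .. 1.0 (local cluster)
    cost: float    # normalised cost per CPU hour
    free_slots: int

def schedule(job_slots: int, allowed: list[ResourceSet]) -> ResourceSet | None:
    """Pick the allowed resource set with the highest priority.
    Higher trust and lower cost raise the priority, mimicking the
    preference for local resources described in the abstract."""
    candidates = [r for r in allowed if r.free_slots >= job_slots]
    if not candidates:
        return None  # caller may then scale out, e.g. acquire Cloud nodes
    return max(candidates, key=lambda r: r.trust - r.cost)

sets = [
    ResourceSet("local-cluster", trust=1.0, cost=0.1, free_slots=8),
    ResourceSet("partner-grid", trust=0.6, cost=0.3, free_slots=64),
    ResourceSet("public-cloud", trust=0.3, cost=0.8, free_slots=1000),
]
print(schedule(job_slots=4, allowed=sets).name)  # -> local-cluster
```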

    Design and implementation of a telemetry platform for high-performance computing environments

    A new generation of high-performance and distributed computing applications and services relies on adaptive and dynamic architectures and execution strategies to run efficiently, resiliently, and at scale in today's HPC environments. These architectures require insights into their execution behaviour and the state of their execution environment at various levels of detail in order to make context-aware decisions. HPC telemetry provides this information: the continuous stream of time-series and event data generated on HPC systems by the hardware, operating systems, services, runtime systems, and applications. Current HPC ecosystems do not provide the conceptual models, infrastructure, and interfaces to collect, store, analyse, and integrate telemetry in a structured and efficient way. Consequently, applications and services largely depend on one-off solutions and custom-built technologies to achieve these goals, introducing significant development overheads that inhibit portability and mobility. To facilitate a broader mix of applications, more efficient application development, and swift adoption of adaptive architectures in production, a comprehensive framework for telemetry management and analysis must be provided as part of future HPC ecosystem designs. This thesis provides the blueprint for such a framework: it proposes a new approach to telemetry management in HPC, the Telemetry Platform concept. Starting from the observation that telemetry data and the corresponding analysis and integration patterns on modern multi-tenant HPC systems closely resemble the patterns observed in large-scale data analytics or "Big Data" platforms, the telemetry platform concept takes the data-platform paradigm and architectural approach and applies them to HPC telemetry. The result is the blueprint for a system that provides services for storing, searching, analysing, and integrating telemetry data in HPC applications and other HPC system services. It allows users to create and share telemetry-data-driven insights using everything from simple time-series analysis to complex statistical and machine-learning models, while hiding many of the inherent complexities of data management, such as data transport, clean-up, storage, cataloguing, and access management, and providing appropriate and scalable analytics and integration capabilities. The main contributions of this research are (1) the application of the data-platform concept to HPC telemetry data management and usage; (2) a graph-based, time-variant telemetry data model that captures the structures and properties of platform and applications and in which telemetry data can be organized; (3) an architecture blueprint and prototype of a concrete implementation and integration architecture of the telemetry platform; and (4) a proposal for decoupled HPC application architectures that separate telemetry data management and feedback-control-loop logic from the core application code. First experimental results with the prototype implementation suggest that the telemetry-platform paradigm can reduce overhead and redundancy in the development of telemetry-based application architectures and lower the barrier for HPC systems research and the provisioning of new, innovative HPC system services.
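    A minimal sketch of contribution (2), a graph-based, time-variant telemetry data model, might organize system entities as graph nodes with time-stamped metric series attached to them. The class and method names below are assumptions for illustration, not the prototype's API.

```python
# Illustrative sketch of a graph-based, time-variant telemetry model:
# nodes for system entities (compute nodes, jobs), edges for their
# relations, and time series attached to (node, metric) pairs.
from collections import defaultdict

class TelemetryGraph:
    def __init__(self):
        self.edges = defaultdict(set)    # node -> related nodes
        self.series = defaultdict(list)  # (node, metric) -> [(t, value)]

    def relate(self, parent: str, child: str):
        self.edges[parent].add(child)

    def record(self, node: str, metric: str, t: float, value: float):
        self.series[(node, metric)].append((t, value))

    def window(self, node: str, metric: str, t0: float, t1: float):
        """Query one entity's metric over a time window."""
        return [(t, v) for t, v in self.series[(node, metric)] if t0 <= t <= t1]

g = TelemetryGraph()
g.relate("node-17", "job-4711")                # the job ran on this node
g.record("job-4711", "mem_used_gb", 10.0, 5.2)
g.record("job-4711", "mem_used_gb", 20.0, 7.9)
print(g.window("job-4711", "mem_used_gb", 0.0, 15.0))  # [(10.0, 5.2)]
```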

    Exploratory Cluster Analysis from Ubiquitous Data Streams using Self-Organizing Maps

    This thesis addresses the use of Self-Organizing Maps (SOM) for exploratory cluster analysis over ubiquitous data streams, where two complementary problems arise: first, generating (local) SOM models over potentially unbounded, multi-dimensional, non-stationary data streams; second, extrapolating these capabilities to ubiquitous environments. Towards these problems, original contributions are made in terms of algorithms and methodologies. Two different methods are proposed for the first problem. By focusing on visual knowledge discovery, these methods fill an existing gap in the panorama of current methods for cluster analysis over data streams. Moreover, the original SOM capability of clustering both observations and features is transposed to data streams, making these contributions versatile compared to existing methods, which target a single clustering problem. Additional methodologies that tackle the ubiquitous aspect of data streams are proposed for the second problem, allowing distributed and collaborative learning strategies. Experimental evaluations attest to the effectiveness of the proposed methods, and real-world applications are exemplified, namely electricity consumption data, air-quality monitoring networks, and financial data, motivating their practical use. This research is the first to clearly address the use of the SOM over ubiquitous data streams and opens several further research opportunities.
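    The streaming setting builds on the standard online SOM update rule: for each arriving observation, find the best-matching unit (BMU) and pull it and its grid neighbours toward the observation. The sketch below shows that textbook rule over a single pass of a stream; it is not the thesis's exact stream algorithms.

```python
# Online SOM update over a stream: one sample at a time, no replay.
import numpy as np

rng = np.random.default_rng(0)
grid_h, grid_w, dim = 5, 5, 3
weights = rng.random((grid_h, grid_w, dim))  # codebook vectors on a grid

def som_update(x, weights, lr=0.1, sigma=1.0):
    # Best-matching unit: codebook vector closest to the sample.
    d = np.linalg.norm(weights - x, axis=2)
    bi, bj = np.unravel_index(np.argmin(d), d.shape)
    # Gaussian neighbourhood on the grid around the BMU.
    ii, jj = np.meshgrid(np.arange(grid_h), np.arange(grid_w), indexing="ij")
    h = np.exp(-((ii - bi) ** 2 + (jj - bj) ** 2) / (2 * sigma ** 2))
    # Pull the BMU and its neighbours toward the sample.
    weights += lr * h[..., None] * (x - weights)

# One pass over a (potentially unbounded) stream of observations.
for x in rng.random((1000, dim)):
    som_update(x, weights)
print(weights.shape)  # (5, 5, 3) trained codebook
```

    In a real stream, the learning rate and neighbourhood width would typically decay or adapt to non-stationarity rather than stay fixed as they do here.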

    A platform-independent domain-specific modeling language for multiagent systems

    Associated with the increasing acceptance of agent-based computing as a novel software engineering paradigm, much recent research addresses the development of suitable techniques to support agent-oriented software development. The state of the art in agent-based software development is to (i) design the agent system based on an agent-oriented methodology and (ii) take the resulting design artifact as a basis for manually implementing the agent system using existing agent-oriented programming languages or general-purpose languages like Java. Apart from the errors made when manually transforming an abstract specification into a concrete implementation, the gap between design and implementation may also cause the two to diverge. The framework discussed in this dissertation presents a platform-independent domain-specific modeling language for multiagent systems (MAS) called Dsml4MAS, which allows agent systems to be modelled in a platform-independent and graphical manner. The modelling language comprises (i) an abstract syntax that defines the vocabulary of the language, (ii) a concrete syntax that specifies its graphical representation, and (iii) a formal semantics that gives the vocabulary a precise meaning. Beyond the abstract design, Dsml4MAS can automatically (i) check the generated design artifacts against the formal semantic specification to guarantee the well-formedness of the design and (ii) translate the abstract specification into a concrete implementation. Taken together, this ensures that for any well-formed design an associated implementation can be generated, closing the gap between design and code. Dsml4MAS is part of a (semi-automatic) methodology that (i) allows the abstract specification to be refined step by step down to the concrete implementation and (ii) ensures interoperability with alternative software paradigms such as service-oriented architectures.
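    The check-then-generate pipeline can be illustrated with a toy sketch: validate a platform-independent agent model against simple well-formedness rules, then emit a code skeleton. The model shape and rules below are invented for illustration; Dsml4MAS's actual metamodel and formal semantics are far richer.

```python
# Hypothetical check-then-generate pipeline in the spirit of Dsml4MAS.
from dataclasses import dataclass

@dataclass
class AgentModel:
    name: str
    roles: list
    protocols: list  # (sender_role, receiver_role, message) triples

def well_formed(m: AgentModel) -> list:
    """Return a list of rule violations; an empty list means well-formed."""
    errors = []
    if not m.roles:
        errors.append("agent model declares no roles")
    for sender, receiver, _ in m.protocols:
        for role in (sender, receiver):
            if role not in m.roles:
                errors.append(f"protocol references undeclared role {role!r}")
    return errors

def generate(m: AgentModel) -> str:
    """Translate the abstract model into a concrete (Python) skeleton."""
    lines = [f"class {role.capitalize()}Agent:  # generated from {m.name}"
             for role in m.roles]
    return "\n".join(lines)

model = AgentModel("auction", roles=["buyer", "seller"],
                   protocols=[("buyer", "seller", "bid")])
assert not well_formed(model)  # only generate from well-formed designs
print(generate(model))
```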