39 research outputs found

    Improving Usability And Scalability Of Big Data Workflows In The Cloud

    Big data workflows have recently emerged as the next generation of data-centric workflow technologies to address the five “V” challenges of big data: volume, variety, velocity, veracity, and value. More formally, a big data workflow is the computerized modeling and automation of a process consisting of a set of computational tasks and their data interdependencies to process and analyze data of ever-increasing scale, complexity, and rate of acquisition. The convergence of big data and workflows creates new challenges for the workflow community. First, the variety of big data results in a need to integrate a large number of remote Web services and other heterogeneous task components, which can consume and produce data in various formats and models, into a uniform and interoperable workflow. Existing approaches address the so-called shimming problem only in an ad hoc manner and are unable to provide a generic solution. We automatically insert pieces of code called shims or adaptors to resolve data type mismatches. Second, the volume of big data results in a large number of datasets that need to be queried and analyzed in an effective and personalized manner. Further, there is a strong need for sharing, reusing, and repurposing existing tasks and workflows across different users and institutes. To overcome such limitations, we propose a folksonomy-based social workflow recommendation system to improve workflow design productivity and the efficiency of dataset querying and analysis. Third, the volume of big data results in the need to process and analyze data of ever-increasing scale, complexity, and rate of acquisition. However, a scalable distributed data model that abstracts and automates data distribution, parallelism, and scalable processing is still missing. We propose a NoSQL collectional data model that addresses this limitation.
Finally, the volume of big data, combined with the unbounded resource leasing capability foreseen in the cloud, enables data scientists to wring actionable insights from the data in a time- and cost-efficient manner. We propose the BARENTS scheduler, which supports high-performance workflow scheduling in a heterogeneous cloud-computing environment with a single objective: to minimize the workflow makespan under a user-provided budget constraint.
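
The shimming idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the thesis's algorithm: the task tuples, the type labels (`json_str`, `dict`, `int`), the `SHIMS` registry, and the functions `insert_shims` and `run` are all hypothetical names chosen for illustration. The sketch assumes each task declares one input type and one output type, and that a converter is registered for every mismatch that can occur.

```python
import json
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical task description: (name, input type, output type, function).
Task = Tuple[str, str, str, Callable[[Any], Any]]

# Hypothetical shim registry keyed by (produced type, expected type).
SHIMS: Dict[Tuple[str, str], Callable[[Any], Any]] = {
    ("json_str", "dict"): json.loads,
    ("dict", "json_str"): json.dumps,
}


def insert_shims(tasks: List[Task]) -> List[Task]:
    """Return a new pipeline with a shim added wherever adjacent types differ."""
    result: List[Task] = []
    for task in tasks:
        if result:
            produced = result[-1][2]  # output type of the previous step
            expected = task[1]        # input type of the current task
            if produced != expected:
                shim = SHIMS.get((produced, expected))
                if shim is None:
                    raise TypeError(f"no shim from {produced} to {expected}")
                name = f"shim_{produced}_to_{expected}"
                result.append((name, produced, expected, shim))
        result.append(task)
    return result


def run(tasks: List[Task], value: Any) -> Any:
    """Execute the tasks in order, feeding each output to the next task."""
    for _name, _in_type, _out_type, fn in tasks:
        value = fn(value)
    return value


def fetch(_: Any) -> str:
    # Stand-in for a remote Web service that returns a JSON string.
    return '{"reads": [3, 5, 8]}'


def total(record: dict) -> int:
    # Stand-in for a local task that expects a parsed dictionary.
    return sum(record["reads"])


pipeline = insert_shims([
    ("fetch", "none", "json_str", fetch),
    ("total", "dict", "int", total),
])
```

Here `fetch` produces a JSON string while `total` expects a dictionary, so `insert_shims` places a `json.loads` step between them and the workflow designer never writes the conversion by hand.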

    Semantic Interaction in Web-based Retrieval Systems : Adopting Semantic Web Technologies and Social Networking Paradigms for Interacting with Semi-structured Web Data

    Existing web retrieval models for exploration and interaction with web data do not take semantic information into account, nor do they allow for new forms of interaction by employing meaningful interaction and navigation metaphors in 2D/3D. This thesis researches means of introducing a semantic dimension into the search and exploration process of web content to enable a significantly positive user experience. To this end, an inherently dynamic view beyond single concepts and models from semantic information processing, information extraction and human-machine interaction is adopted. Essential tasks for semantic interaction, such as semantic annotation, semantic mediation and semantic human-computer interaction, were identified and elaborated for two general application scenarios in web retrieval: Web-based Question Answering in a knowledge-based dialogue system and semantic exploration of information spaces in 2D/3D.

    Intelligent business processes composition based on mas, semantic and cloud integration (IPCASCI)

    Component reuse is one of the techniques that most clearly contributes to the evolution of the software industry by providing efficient mechanisms to create quality software. Reuse increases both software reliability, because it relies on previously tested software components, and development productivity, and leads to a clear reduction in cost. Web services have become a standard for application development in cloud computing environments and are essential in business process development. These services facilitate software construction that is relatively fast and efficient, two aspects which can be improved by defining suitable models of reuse. This research work is intended to define a model which contains the construction requirements of new services from service composition. To this end, the composition is based on tested Web services and the artificial intelligence tools at our disposal. It is believed that a multi-agent architecture based on virtual organizations is a suitable tool to facilitate the construction of cloud computing environments for business processes from other existing environments, with help from ontological models as well as tools supporting the BPEL (Business Process Execution Language) standard. In the context of this proposal, we must generate a new business process from the services available in the platform, starting with the requirement specifications that the process should meet. These specifications will be composed of a semi-free description of requirements to describe the new service. The virtual organizations based on a multi-agent system will manage the tasks requiring intelligent behaviour. This system will analyse the input (a textual description of the proposal) in order to deconstruct it into computable functionalities, which will be subsequently treated.
Web services (or business processes) stored for reuse have been created from the perspective of SOA architectures and associated with an ontological component, which allows the multi-agent system (based on virtual organizations) to identify the services needed to complete the reuse process. The proposed model develops a service composition by applying the BPEL standard once the services that will compose the solution business process have been identified. This standard allows us to compose Web services in an easy way and provides the advantage of a direct mapping from Business Process Model and Notation (BPMN) diagrams.

    Semantic discovery and reuse of business process patterns

    Patterns currently play an important role in modern information systems (IS) development, but their use has mainly been restricted to the design and implementation phases of the development lifecycle. Given the increasing significance of business modelling in IS development, patterns have the potential to provide a viable solution for promoting reusability of recurrent generalized models in the very early stages of development. As a statement of research-in-progress, this paper focuses on business process patterns and proposes an initial methodological framework for the discovery and reuse of business process patterns within the IS development lifecycle. The framework borrows ideas from the domain engineering literature and proposes the use of semantics to drive both the discovery of patterns and their reuse.

    Predictive Modeling for Navigating Social Media

    Social media changes the way people use the Web. It has transformed ordinary Web users from information consumers to content contributors. One popular form of content contribution is social tagging, in which users assign tags to Web resources. Through the collective efforts of the social tagging community, a new information space has been created for information navigation. Navigation allows serendipitous discovery of information by examining the information objects linked to one another in the social tagging space. In this dissertation, we study prediction tasks that facilitate navigation in social tagging systems. For social tagging systems to meet the complex navigation needs of users, two issues are fundamental, namely link sparseness and object selection. Link sparseness is observed for many resources that are untagged or inadequately tagged, which hinders navigation to those resources. Object selection becomes a concern when a large number of information objects are linked to the current object, requiring the selection of the more interesting or relevant ones to guide navigation effectively. This dissertation focuses on three dimensions, namely the semantic, social and temporal dimensions, to address link sparseness and object selection. To address link sparseness, we study the task of tag prediction. This task aims to enrich tags for untagged or inadequately tagged resources, such that the predicted tags can serve as navigable links to these resources. For this task, we take a topic modeling approach to exploit the latent semantic relationships between resource content and tags. To address object selection, we study the tasks of personalized tag recommendation and trend discovery using social annotations. Personalized tag recommendation leverages the collective wisdom of the social tagging community to recommend tags that are semantically relevant to the target resource, while being tailored to the tagging preferences of individual users.
For this task, we propose a probabilistic framework which leverages the implicit social links between like-minded users, i.e., those who show similar tagging preferences, to recommend suitable tags. Social tags capture the interest of the users in the annotated resources at different times. These social annotations allow us to construct temporal profiles for the annotated resources. By analyzing these temporal profiles, we unveil the non-trivial temporal trends of the annotated resources, which provide novel metrics for selecting relevant and interesting resources for guiding navigation. For trend discovery using social annotations, we propose a trend discovery process which enables us to analyze trends for a multitude of semantics encapsulated in the temporal profiles of the annotated resources.
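
As a rough illustration of recommending tags through implicit links between like-minded users, the sketch below swaps the dissertation's probabilistic framework for a much simpler neighbour-weighted vote. Everything in it is an assumption made for illustration: the `Taggings` layout, the use of Jaccard similarity over tag vocabularies as the measure of "similar tagging preferences", and the function names.

```python
from collections import Counter
from typing import Dict, List, Set

# Hypothetical data layout: user -> resource -> set of tags assigned.
Taggings = Dict[str, Dict[str, Set[str]]]


def user_vocab(taggings: Taggings, user: str) -> Set[str]:
    """All tags a user has ever assigned, across every resource."""
    tags: Set[str] = set()
    for assigned in taggings.get(user, {}).values():
        tags |= assigned
    return tags


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap of two tag vocabularies, from 0.0 (disjoint) to 1.0 (equal)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_tags(taggings: Taggings, user: str, resource: str,
                   k: int = 3) -> List[str]:
    """Rank tags others gave the resource, weighted by similarity to `user`."""
    own = user_vocab(taggings, user)
    scores: Counter = Counter()
    for other, posts in taggings.items():
        if other == user or resource not in posts:
            continue
        weight = jaccard(own, user_vocab(taggings, other))
        for tag in posts[resource]:
            scores[tag] += weight
    # Sort by descending score, then alphabetically so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, score in ranked[:k] if score > 0]


taggings: Taggings = {
    "alice": {"r1": {"python", "tutorial"}},
    "bob": {"r1": {"python"}, "r2": {"python", "pandas"}},
    "carol": {"r1": {"cooking"}, "r2": {"recipes"}},
}
```

With this toy data, `recommend_tags(taggings, "alice", "r2")` draws only on bob's tags for `r2`, because carol shares no vocabulary with alice and her tags receive zero weight.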

    Development and Evaluation of a Holistic, Cloud-driven and Microservices-based Architecture for Automated Semantic Annotation of Web Documents

    The Semantic Web is based on the concept of representing information on the web such that computers can both understand and process it. This implies defining context for web information to give it a well-defined meaning. Semantic annotation is the process of adding annotation data to web information to provide this much-needed context. However, despite several solutions and techniques for semantic annotation, it still faces challenges which have hindered the growth of the Semantic Web. Despite recent significant technological innovations such as Cloud Computing, the Internet of Things and Mobile Computing, and their various integrations with semantic technologies to provide solutions in IT, little has been done towards leveraging these technologies to address semantic annotation challenges. Hence, this research investigates leveraging the cloud computing paradigm to address some semantic annotation challenges, with a focus on an automated system for providing semantic annotation as a service. Firstly, considering the disparate nature of most current semantic annotation solutions, a holistic perspective on semantic annotation is proposed based on a set of requirements. Then, a capability assessment of the feasibility of leveraging cloud computing is conducted, which produces a Cloud Computing Capability Model for Holistic Semantic Annotation. Furthermore, an investigation into application deployment patterns in the cloud and how they relate to holistic semantic annotation was conducted. A set of determinant factors that define different patterns for application deployment in the cloud were identified, and these resulted in the development of a Cloud Computing Maturity Model and the conceptualisation of a “Cloud-Driven” development methodology for holistic semantic annotation in the cloud. Some key components of the “Cloud-Driven” concept include Microservices, Operating System-Level Virtualisation and Orchestration.
Given the role that microservices software architectural patterns play in developing solutions that can fully maximise cloud computing benefits, CloudSea, a holistic, cloud-driven and microservices-based architecture for automated semantic annotation of web documents, is proposed as a novel approach to semantic annotation. The architecture draws on the theory of “Design Patterns” in Software Engineering for its design and development, which subsequently resulted in the development of twelve Design Patterns and a Pattern Language for Holistic Semantic Annotation, based on the CloudSea architectural design. As proof-of-concept, a prototype implementation of CloudSea was developed and deployed in the cloud based on the “Cloud-Driven” methodology, and a functionality evaluation was carried out on it. A comparative evaluation of the CloudSea architecture was also conducted in relation to current semantic annotation solutions, both those proposed in academic literature and those existing as industry solutions. In addition, to evaluate the proposed Cloud Computing Maturity Model for Holistic Semantic Annotation, an experimental evaluation of the model was conducted by developing six instances of the prototype and deploying them differently, based on the patterns described in the model. This empirical investigation was implemented by testing the instances for performance through a series of API load tests, and the results obtained confirmed the validity of both the “Cloud-Driven” methodology and the entire model.