3,632 research outputs found


    In service-oriented environments, services are composed into workflows for distributed problem solving. A significant advantage of using workflows is that the execution details of the services' transformations can be captured. These execution details, referred to as provenance information, are usually traced automatically and stored in provenance stores. Provenance data is the data recorded by a workflow engine during a workflow execution. It identifies what data is passed between services, which services are involved, and how results are eventually generated for particular sets of input values. Provenance information is of great importance and has found its way into many areas of computer science, such as bioinformatics, databases, social networks, and sensor networks. Current exploitation and application of provenance data is very limited, because provenance systems were originally developed for specific applications. Applying learning and knowledge discovery methods to provenance data can therefore provide rich and useful information on workflows and services. In this work, the challenges with workflows and services are studied to discover the possibilities and benefits of providing solutions by using provenance data. A multifunctional architecture is presented which addresses the workflow and service issues by exploiting provenance data. These challenges include workflow composition, abstract workflow selection, refinement, evaluation, and graph model extraction. The specific contribution of the proposed architecture is its novelty in providing a basis for taking advantage of the previous execution details of services and workflows, along with artificial intelligence and knowledge management techniques, to resolve the major challenges regarding workflows. The presented architecture is application-independent and could be deployed in any area. The requirements for such an architecture, along with its building components, are discussed. Furthermore, the responsibilities of the components, related work, and the implementation details of the architecture and of each component are presented.

    A Framework for Discovery and Diagnosis of Behavioral Transitions in Event-streams

    Data stream mining techniques can be used to track user behaviors as users attempt to achieve their goals. Quality metrics over stream-mined models identify potential changes in user goal attainment. When the quality of some data-mined models varies significantly from that of nearby models, as defined by the quality metrics, the user's behavior is automatically flagged as a potentially significant behavioral change. Decision tree, sequence pattern, and Hidden Markov modeling are used in this study. These three types of modeling can expose different aspects of a user's behavior. In the case of decision tree modeling, the specific changes in user behavior can be characterized automatically by differencing the data-mined decision-tree models. Sequence pattern modeling can shed light on how the user changes his or her sequence of actions, and Hidden Markov modeling can identify the learning transition points. This research describes how model-quality monitoring and these three types of modeling, combined as a generic framework, can aid the recognition and diagnosis of behavioral changes in a case study of cognitive rehabilitation via emailing. The data stream mining techniques mentioned are used to monitor patient goals as part of a clinical plan to aid cognitive rehabilitation. In this context, real-time data mining aids clinicians in tracking user behaviors as patients attempt to achieve their goals. This generic framework can be widely applicable to other real-time, data-intensive analysis problems. To illustrate this, similar Hidden Markov modeling is used to analyze the transactional behavior of a telecommunication company for fraud detection. Fraud can similarly be considered a potentially significant change in transaction behavior.
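The flagging rule in the abstract above (a model whose quality differs significantly from that of nearby models) can be illustrated with a minimal sketch. The abstract does not give the actual metric or significance test, so everything here is an assumption made for illustration: the function name `flag_quality_changes`, the use of one quality score per time window, the neighbourhood `radius`, and the z-score style threshold are all hypothetical.

```python
from statistics import mean, pstdev

def flag_quality_changes(scores, radius=3, z_threshold=3.0, min_spread=0.01):
    """Return indices of windows whose model quality deviates from nearby windows.

    scores: one quality value (e.g. accuracy) per stream-mined model.
    All parameters are illustrative assumptions, not values from the study.
    """
    flagged = []
    for i, score in enumerate(scores):
        lo = max(0, i - radius)
        hi = min(len(scores), i + radius + 1)
        # Nearby models, excluding the model under inspection.
        neighbours = scores[lo:i] + scores[i + 1:hi]
        if len(neighbours) < 2:
            continue
        # Floor the spread so near-constant neighbourhoods do not flag noise.
        spread = max(pstdev(neighbours), min_spread)
        if abs(score - mean(neighbours)) > z_threshold * spread:
            flagged.append(i)
    return flagged

# Hypothetical per-window accuracies with one abrupt drop.
window_scores = [0.90, 0.91, 0.89, 0.90, 0.55, 0.90, 0.91, 0.90]
flagged_windows = flag_quality_changes(window_scores)
```

A flagged window would then be the starting point for diagnosis, for example by differencing the decision trees mined before and after it.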


    The purpose of this study was to conduct data-driven research by employing learning analytics methodology and Big Data in learning management systems (LMSs), and then to identify and compare learners’ interaction patterns in different achievement groups through different course processes in Massive Private Online Courses (MPOCs). Learner interaction is the foundation of a successful online learning experience. However, uncertainty about the temporal and sequential patterns of online interaction, and the lack of knowledge about using dynamic interaction traces in LMSs, have held back research on ways to improve interactive qualities and learning effectiveness in online learning. Also, most research focuses on the most popular form of online learning organization, Massive Open Online Courses (MOOCs), and little research has investigated learners’ interaction behaviors in another important form: MPOCs. To fill these gaps, the study investigates the frequent and effective interaction patterns in different achievement groups as well as in different course processes, and emphasizes the value of LMS trace data (log data) for better serving learners and instructors in online learning. Further, learning analytics methodology and techniques are introduced here into online interaction research. I assume that learners with different achievements exhibit different interaction characteristics. Therefore, the hypotheses in this study are: 1) the interaction activity patterns of the high-achievement group and the low-achievement group are different; 2) in both groups, interaction activity patterns evolve through different course processes (such as the learning process and the exam process). The final purpose is to find interaction activity patterns that characterize the different achievement groups in specific MPOC courses.
Several learning analytics approaches, including Hidden Markov models (HMMs) and related measures, are used to identify frequently occurring interaction activity sequence patterns of High/Low achievement groups in the Learning/Exam processes in MPOC settings. The results demonstrate that High-achievement learners focused especially on content learning, assignments, and quizzes to consolidate their knowledge construction in both the Learning and Exam processes, while Low-achievement learners differed significantly in this respect. Further, High-achievement learners adjusted their learning strategies based on the goals of different course processes; Low-achievement learners were inactive in the learning process and opportunistic in the exam process. In addition, regardless of achievement level or course process, all learners were most interested in checking their performance statements, but they engaged little in forum discussion and group learning. In sum, the comparative analysis implies that certain interaction patterns may distinguish the High-achievement learners from the Low-achievement ones, and that learners change their patterns to some degree depending on the course process. This study is an attempt to conduct learner interaction research by employing learning analytics techniques. In the short term, the results will give in-depth knowledge of the dynamic interaction patterns of MPOC learners. In the long term, the results will help learners to gain insight into and evaluate their learning, help instructors to identify at-risk learners and adjust instructional strategies, and help developers and administrators to build recommendation systems based on objective and comprehensive information, all of which in turn will help to improve the achievements of all learner groups in specific MPOC courses.

    An Event-based Analysis Framework for Open Source Software Development Projects

    The increasing popularity and success of Open Source Software (OSS) development projects has drawn significant attention from academics and open source participants over the last two decades. As one of the key areas in OSS research, assessing and predicting OSS performance is of great value to both OSS communities and organizations that are interested in investing in OSS projects. Most existing research, however, has considered OSS project performance as the outcome of static cross-sectional factors such as number of developers, project activity level, and license choice. While variance studies can identify some predictors of project outcomes, they tend to neglect the actual process of development. Without a closer examination of how events occur, an understanding of OSS projects is incomplete. This dissertation aims to combine process and variance strategies to investigate how OSS projects change over time through their development processes, and to explore how these changes affect project performance. I design, instantiate, and evaluate a framework and an artifact, EventMiner, to analyze OSS projects’ evolution through development activities. This framework integrates concepts from various theories such as distributed cognition (DCog) and complexity theory, applying data mining techniques such as decision trees, motif analysis, and hidden Markov modeling to automatically analyze and interpret the trace data of 103 OSS projects from an open source repository. The results support the construction of process theories on OSS development. The study contributes to the literature on DCog, design routines, OSS development, and OSS performance. The resulting framework allows OSS researchers who are interested in OSS development processes to share and reuse data and data analysis processes in an open-source manner.

    Movement Analytics: Current Status, Application to Manufacturing, and Future Prospects from an AI Perspective

    Data-driven decision making is becoming an integral part of manufacturing companies. Data is collected and commonly used to improve efficiency and produce high-quality items for customers. IoT-based and other forms of object tracking are an emerging tool for collecting movement data of objects/entities (e.g. human workers, moving vehicles, trolleys) over space and time. Movement data can provide valuable insights, such as process bottlenecks, resource utilization, and effective working time, that can be used for decision making and improving efficiency. Turning movement data into valuable information for industrial management and decision making requires analysis methods. We refer to this process as movement analytics. The purpose of this document is to review the current state of work on movement analytics, both in manufacturing and more broadly. We survey relevant work from both a theoretical perspective and an application perspective. From the theoretical perspective, we put an emphasis on useful methods from two research areas: machine learning and logic-based knowledge representation. We also review their combinations in view of movement analytics, and we discuss promising areas for future development and application. Furthermore, we touch on constraint optimization. From an application perspective, we review applications of these methods to movement analytics in a general sense and across various industries. We also describe currently available commercial off-the-shelf products for tracking in manufacturing, and we give an overview of the main concepts of digital twins and their applications.


    This book is divided into different research areas relevant to bioinformatics, such as biological networks, next generation sequencing, high performance computing, molecular modeling, structural bioinformatics, and intelligent data analysis. Each book section introduces the basic concepts and then explains their application to problems of great relevance, so both novice and expert readers can benefit from the information and research works presented here.

    Analyzing Granger causality in climate data with time series classification methods

    Attribution studies in climate science aim to scientifically ascertain the influence of climatic variations on natural or anthropogenic factors. Many of those studies adopt the concept of Granger causality to infer statistical cause-effect relationships, while utilizing traditional autoregressive models. In this article, we investigate the potential of state-of-the-art time series classification techniques to enhance causal inference in climate science. We conduct a comparative experimental study of different types of algorithms on a large test suite that comprises a unique collection of datasets from the area of climate-vegetation dynamics. The results indicate that specialized time series classification methods are able to improve existing inference procedures. Substantial differences are observed among the methods that were tested.
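For readers unfamiliar with the baseline mentioned in the abstract above, the following is a minimal sketch of a traditional Granger causality test with linear autoregressive models. It is not the article's method (the article studies time series classification techniques as an alternative); it only shows the standard idea of comparing an autoregression of y on its own past against one that also includes the past of x. The function names `ols_rss` and `granger_f`, the synthetic data, and the coefficients used to generate it are assumptions made for illustration.

```python
import random

def ols_rss(rows, targets):
    """Least-squares fit via the normal equations; returns the residual sum of squares."""
    k = len(rows[0])
    # Augmented normal-equation matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, targets))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, t in zip(rows, targets))

def granger_f(x, y, lag=1):
    """F statistic for the hypothesis 'x Granger-causes y' using `lag` lags."""
    idx = range(lag, len(y))
    targets = [y[t] for t in idx]
    # Restricted model: intercept plus the past of y only.
    restricted = [[1.0] + [y[t - j] for j in range(1, lag + 1)] for t in idx]
    # Full model: the restricted regressors plus the past of x.
    full = [row + [x[t - j] for j in range(1, lag + 1)]
            for row, t in zip(restricted, idx)]
    rss_r = ols_rss(restricted, targets)
    rss_f = ols_rss(full, targets)
    dof = len(targets) - len(full[0])
    return ((rss_r - rss_f) / lag) / (rss_f / dof)

# Synthetic example (assumed data): y depends on the previous value of x.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(300)]
y = [0.0]
for t in range(1, 300):
    y.append(0.5 * y[-1] + 0.8 * x[t - 1] + 0.1 * rng.gauss(0, 1))

f_x_to_y = granger_f(x, y)
```

A large F statistic means that adding the past of x markedly reduces the prediction error for y, which is what "x Granger-causes y" asserts. In practice the statistic would be compared against an F distribution with `lag` and `dof` degrees of freedom.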