509 research outputs found

    Causal impact analysis for app releases in Google Play

    App developers would like to understand the impact of their own and their competitors' software releases. To address this we introduce Causal Impact Release Analysis for app stores, and our tool, CIRA, that implements this analysis. We mined 38,858 popular Google Play apps, over a period of 12 months. For these apps, we identified 26,339 releases for which there was adequate prior and posterior time series data to facilitate causal impact analysis. We found that 33% of these releases caused a statistically significant change in user ratings. We use our approach to reveal important characteristics that distinguish causal significance in Google Play. To explore the actionability of causal impact analysis, we elicited the opinions of app developers: 56 companies responded, 78% concurred with the causal assessment, of which 33% claimed that their company would consider changing its app release strategy as a result of our findings

    Large Scale Data Mining for IT Service Management

    More than ever, businesses rely heavily on IT service delivery to meet their current and frequently changing business requirements. Optimizing the quality of service delivery improves customer satisfaction and continues to be a critical driver for business growth. The routine maintenance procedure plays a key function in IT service management, which typically involves problem detection, determination and resolution for the service infrastructure. Many IT service providers adopt partial automation for incident diagnosis and resolution, where the operations of the system administrators and the automation are intertwined. Often the system administrators' roles are limited to helping triage tickets to the processing teams for problem resolution. The processing teams are responsible for performing a complex root cause analysis, provided with the system statistics, event and ticket data. The large scale of system statistics, event and ticket data aggravates the burden of problem diagnosis on both the system administrators and the processing teams during routine maintenance procedures. Alleviating the human effort involved in IT service management calls for intelligent and efficient solutions that maximize the automation of routine maintenance procedures. Three research directions are identified and considered to be helpful for IT service management optimization: (1) automatically determine problem categories according to the symptom description in a ticket; (2) intelligently discover interesting temporal patterns from system events; (3) instantly identify temporal dependencies among system performance statistics data. Provided with ticket, event, and system performance statistics data, the three directions can be effectively addressed with a data-driven solution, and the quality of IT service delivery can be improved efficiently and effectively. The dissertation addresses the research topics outlined above.
Concretely, we design and develop data-driven solutions to help system administrators better manage the system and alleviate the human effort involved in IT service management, including (1) a knowledge-guided hierarchical multi-label classification method for IT problem category determination based on both the symptom description in a ticket and the domain knowledge from the system administrators; (2) an efficient expectation maximization approach for temporal event pattern discovery based on a parametric model; (3) an online inference method for time-varying temporal dependency discovery from large-scale time series data

    Estimation of Transfer Entropy


    Multivariate Time Series Data Causal Discovery

    One of the goals of Artificial Intelligence is to achieve human-like intelligence. To that end, several solutions have been proposed over the decades, among which causal structure discovery was proposed as a viable tool for enabling human-like reasoning. It can be treated as two stages: first, causal discovery, which examines the cause-effect relationships between variables; these are then used in the second stage, referred to as causal parameter inference, to perform causal inference using counterfactual/logic-like reasoning similar to how human beings approach a problem. Generally speaking, there are two types of causal discovery algorithms: those that work with random variables and those that work with time series data. The focus of this thesis is on the latter. Performing causal studies on real-world datasets is very challenging for time series data because missing values are prevalent. Currently, all existing causal algorithms require evenly-sampled time series data, which unfortunately are not always available. In this thesis I propose a system that addresses the difficulties hindering causal learning on real-world datasets. The proposed system performs causal discovery using time series data with missing entries (i.e., sparsely sampled data at varying intervals). The solution put forward for this task comprises two parts: data filling with Gaussian Process Regression (GPR), and causal learning using either the traditional Vector Autoregressive Model or a Machine Learning based approach. For the first part, experiments have shown that GPR outperformed all the benchmark filling techniques, such as K Nearest Neighbour regression, Parametric Linear filling and random variable filling. The Root Mean Square Error (RMSE) obtained with GPR filling was the smallest across all filling percentages, beating the benchmark algorithms by margins ranging from 0.05 to 1.5 in RMSE.
As for the second part, an Echo State Network is used for causal learning due to its fast running time and higher prediction capabilities when compared with other causal learning algorithms available in the industry, such as Structural Expectation Maximization (SEM) and the Subsampled Linear Auto-Regression Absolute Coefficients algorithm (SLARAC). When working with 10 percent missing entries, the proposed system is capable of obtaining an MCC score of 0.31 on a -1 to +1 scale, where +1 represents perfect prediction and -1 represents a result of no use at all. The MCC score obtained by the proposed system was significantly higher than those of other methods such as SEM and SLARAC. To showcase the ability of the proposed system to adapt to causal relationships in real-world engineering applications, the experiment was conducted using a chemical refinery dataset called the Tennessee Eastman (TE) dataset
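
The MCC score quoted in the abstract above can be illustrated with a small sketch. It assumes that causal discovery output is scored by comparing a predicted adjacency matrix with a ground-truth one, entry by entry, diagonal included. The abstract does not state how the thesis builds its confusion matrix, so the helper names `matthews_corrcoef` and `edge_mcc` and the matrix layout are hypothetical choices made for illustration only.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: report 0 when any marginal count is zero.
        return 0.0
    return (tp * tn - fp * fn) / denom


def edge_mcc(true_adj, pred_adj):
    """Score predicted causal edges against true edges (assumed 0/1 matrices)."""
    tp = tn = fp = fn = 0
    for true_row, pred_row in zip(true_adj, pred_adj):
        for t, p in zip(true_row, pred_row):
            if t and p:
                tp += 1      # edge present and predicted
            elif not t and not p:
                tn += 1      # edge absent and not predicted
            elif p:
                fp += 1      # edge predicted but absent
            else:
                fn += 1      # edge present but missed
    return matthews_corrcoef(tp, tn, fp, fn)


if __name__ == "__main__":
    # Hypothetical 3-variable graph: X0 -> X1 and X1 -> X2.
    truth = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
    guess = [[0, 1, 0], [0, 0, 0], [1, 0, 0]]
    print(round(edge_mcc(truth, guess), 3))
```

A score near +1 means the predicted edges match the true graph closely, a score near 0 means the prediction is no better than chance, and a score near -1 means the prediction is systematically inverted.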

    Assessment Of Blood Pressure Regulatory Controls To Detect Hypovolemia And Orthostatic Intolerance

    Regulation of blood pressure is vital for maintaining organ perfusion and homeostasis. A significant decline in arterial blood pressure could lead to fainting and hypovolemic shock. In contrast to the young and healthy, people with impaired autonomic control due to aging or disease find regulating blood pressure rather demanding during orthostatic challenge. This thesis assessed blood pressure regulatory controls during orthostatic challenge via traditional as well as novel approaches, with two distinct applications: 1) to design a robust automated system for early identification of hypovolemia and 2) to assess orthostatic tolerance in humans. In chapter 3, moderate intensity hemorrhage was simulated via lower-body negative pressure (LBNP) with the aim of distinguishing moderate intensity hemorrhage (-30 and -40 mmHg LBNP) from resting baseline. Utilizing features extracted from common vital sign monitors, classification accuracies of 82% and 91% were achieved for differentiating -30 and -40 mmHg LBNP, respectively, from baseline. In chapter 4, the cause-and-effect relationship between the representative signals of the cardiovascular and postural systems was examined to ascertain blood pressure homeostasis during standing. The degree of causal interaction between the two systems, studied via convergent cross mapping (CCM), showed a significant bi-directional interaction between the representative signals of the two systems in regulating blood pressure. Therefore, the two systems should be accounted for jointly when addressing the physiology behind falls. Further, in chapter 5, the potential of artificial gravity (2-g at the feet), induced via a short-arm human centrifuge, to evoke blood pressure regulatory controls analogous to standing was investigated.
The observation of no difference in the blood pressure regulatory controls, during 2-g centrifugation compared to standing, strongly supported the hypothesis of artificial hypergravity for mitigating cardiovascular deconditioning, hence minimizing post-flight orthostatic intolerance

    Analyzing Granger causality in climate data with time series classification methods

    Attribution studies in climate science aim to scientifically ascertain the influence of climatic variations on natural or anthropogenic factors. Many of those studies adopt the concept of Granger causality to infer statistical cause-effect relationships, while utilizing traditional autoregressive models. In this article, we investigate the potential of state-of-the-art time series classification techniques to enhance causal inference in climate science. We conduct a comparative experimental study of different types of algorithms on a large test suite that comprises a unique collection of datasets from the area of climate-vegetation dynamics. The results indicate that specialized time series classification methods are able to improve existing inference procedures. Substantial differences are observed among the methods that were tested

    TIME SERIES ANALYSIS AND CLUSTERING TO CHARACTERIZE CARDIORESPIRATORY INSTABILITY PATTERNS IN STEP-DOWN UNIT PATIENTS

    Background: Cardiorespiratory instability (CRI) in noninvasively monitored step-down unit (SDU) patients has a variety of etiologies, and therefore likely manifests in different patterns of vital sign (VS) changes. Objective: We sought to describe differences in admission characteristics and outcomes between patients with and without CRI. We explored the use of clustering techniques to identify VS patterns within the initial CRI epoch (CRI1) and assessed inter-cluster differences in admission characteristics, outcomes and medications. Methods: Admission characteristics and continuous monitoring data (frequency 1/20 Hz) were recorded in 307 patients. VS deviations beyond local instability trigger criteria for 3 consecutive minutes, or for 4 minutes out of a 5-minute moving window, were classified as CRI events. We identified CRI1 in 133 patients, derived statistical features of the CRI1 epoch and employed hierarchical and k-means clustering techniques. We tested several clustering solutions and used 10-fold cross validation and ANOVA to establish the best solution. Inter-cluster differences in admission characteristics, outcomes and medications were assessed. Main Results: Patients transferred to the SDU from units with higher monitoring capability were more likely to develop CRI (CRI n=133, 44% vs no CRI n=174, 31%, p=.042). Patients with at least one CRI event had longer hospital length of stay (CRI 11.3 ± 10.2 days vs no CRI 7.8 ± 9.2, p=.001) and SDU stay (CRI 6.1 ± 4.9 days vs no CRI 3.5 ± 2.9, p<.001). Four main clusters (C) were derived. Clusters were significantly different based on age (p=0.001; younger patients in C1 and older in C2), number of comorbidities (p<0.01; more C2 patients had ≥2), and admission source (p=0.008; more C1 and C4 patients transferred in from a higher intensity monitoring unit). Patients with CRI differed significantly (p<.05) from those without CRI based on medication categories.
Conclusions: CRI1 was associated with prolonged hospital and SDU length of stay. Patients transferred from a higher level of care were more likely to develop CRI, suggesting that they are sicker. Future study will be needed to determine if there are common physiologic underpinnings of VS clusters which might inform monitoring practices and clinical decision-making when CRI first manifests
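
The CRI event rule stated in the Methods of the abstract above (deviation for 3 consecutive minutes, or for 4 minutes out of a 5-minute moving window) can be sketched as follows. This is a minimal illustration, not the study's implementation: it assumes the 1/20 Hz samples have already been reduced to one boolean flag per minute indicating whether a vital sign was beyond its trigger threshold, a step the abstract does not describe. The function name `is_cri_event` is hypothetical.

```python
def is_cri_event(minute_flags):
    """Return True if per-minute deviation flags satisfy either persistence rule.

    minute_flags: sequence of truthy/falsy values, one per minute (assumed input).
    """
    flags = [bool(f) for f in minute_flags]

    # Rule 1: three consecutive flagged minutes.
    run = 0
    for flagged in flags:
        run = run + 1 if flagged else 0
        if run >= 3:
            return True

    # Rule 2: at least four flagged minutes within any five-minute window.
    for start in range(len(flags) - 4):
        if sum(flags[start:start + 5]) >= 4:
            return True

    return False


if __name__ == "__main__":
    print(is_cri_event([0, 1, 1, 1, 0]))        # satisfied by rule 1
    print(is_cri_event([1, 1, 0, 1, 1]))        # satisfied by rule 2 only
    print(is_cri_event([1, 1, 0, 0, 1, 1, 0]))  # neither rule satisfied
```

The second rule tolerates a single brief return to the normal range inside a window, which the consecutive-minute rule alone would miss.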

    The 8th International Conference on Time Series and Forecasting

    The aim of ITISE 2022 is to create a friendly environment that could lead to the establishment or strengthening of scientific collaborations and exchanges among attendees. Therefore, ITISE 2022 is soliciting high-quality original research papers (including significant works-in-progress) on any aspect of time series analysis and forecasting, in order to motivate the generation and use of new knowledge, computational techniques and methods on forecasting in a wide range of fields