41 research outputs found

    Optimal partitioning of directed acyclic graphs with dependent costs between clusters

    Many statistical inference contexts, including Bayesian Networks (BNs), Markov processes and Hidden Markov Models (HMMs), could be supported by partitioning (i.e. mapping) the underlying Directed Acyclic Graph (DAG) into clusters. However, optimal partitioning is challenging, especially in statistical inference, as the cost to be optimised depends on both the nodes within a cluster and the mapping of clusters connected via parent and/or child nodes, which we call dependent clusters. We propose a novel algorithm called DCMAP for optimal cluster mapping with dependent clusters. Given an arbitrarily defined, positive cost function based on the DAG and cluster mappings, we show that DCMAP converges to find all optimal clusters, and returns near-optimal solutions along the way. Empirically, we find that the algorithm is time-efficient for a Dynamic Bayesian Network (DBN) model of a seagrass complex system using a computation cost function. For a 25-node and a 50-node DBN, the search space size was $9.91\times 10^{9}$ and $1.51\times 10^{21}$ possible cluster mappings, respectively, but near-optimal solutions with 88% and 72% similarity to the optimal solution were found at iterations 170 and 865, respectively. The first optimal solution was found at iteration 934 (95% CI 926, 971) and 2256 (2150, 2271), with a cost that was 4% and 0.2% of the naive heuristic cost, respectively.

    Guidelines for model adaptation: A study of the transferability of a general seagrass ecosystem Dynamic Bayesian Networks model

    In general, it is not feasible to collect enough empirical data to capture the entire range of processes that define a complex system, either intrinsically or when viewing the system from a different geographical or temporal perspective. In this context, an alternative approach is to consider model transferability, which is the act of translating a model built for one environment to another less well-known situation. Model transferability and adaptability may be extremely beneficial—approaches that aid in the reuse and adaption of models, particularly for sites with limited data, would benefit from widespread model uptake. Besides the reduced effort required to develop a model, data collection can be simplified when transferring a model to a different application context. The research presented in this paper focused on a case study to identify and implement guidelines for model adaptation. Our study adapted a general Dynamic Bayesian Networks (DBN) of a seagrass ecosystem to a new location where nodes were similar, but the conditional probability tables varied. We focused on two species of seagrass (Zostera noltei and Zostera marina) located in Arcachon Bay, France. Expert knowledge was used to complement peer-reviewed literature to identify which components needed adjustment including parameterization and quantification of the model and desired outcomes. We adopted both linguistic labels and scenario-based elicitation to elicit from experts the conditional probabilities used to quantify the DBN. Following the proposed guidelines, the model structure of the general DBN was retained, but the conditional probability tables were adapted for nodes that characterized the growth dynamics in Zostera spp. population located in Arcachon Bay, as well as the seasonal variation on their reproduction. Particular attention was paid to the light variable as it is a crucial driver of growth and physiology for seagrasses. 
Our guidelines provide a way to adapt a general DBN to specific ecosystems to maximize model reuse and minimize re-development effort. Especially important from a transferability perspective are guidelines for ecosystems with limited data, and how simulation and prior predictive approaches can be used in these contexts.

    Predicting fatigue using countermovement jump force-time signatures: PCA can distinguish neuromuscular versus metabolic fatigue

    Purpose: This study investigated the relationship between the ground reaction force-time profile of a countermovement jump (CMJ) and fatigue, specifically focusing on predicting the onset of neuromuscular versus metabolic fatigue using the CMJ. Method: Ten recreational athletes performed 5 CMJs at time points prior to, immediately following, and at 0.5, 1, 3, 6, 24 and 48 h after training, which comprised repeated sprint sessions of low, moderate, or high workloads. Features of the concentric portion of the CMJ force-time signature at the measurement time points were analysed using Principal Components Analysis (PCA) and functional PCA (fPCA) to better understand fatigue onset given training workload. In addition, Linear Mixed Effects (LME) models were developed to predict the onset of fatigue. Results: The first two Principal Components (PCs) using PCA explained 68% of the variation in CMJ features, capturing variation between athletes through weighted combinations of force, concentric time and power. The next two PCs explained 9.9% of the variation and revealed fatigue effects between 6 and 48 h after training for PC3, and contrasting neuromuscular and metabolic fatigue effects in PC4. fPCA supported these findings and further revealed contrasts between metabolic and neuromuscular fatigue effects in the first and second half of the force-time curve in PC3, and a double peak effect in PC4. Subsequently, CMJ measurements up to 0.5 h after training were used to predict relative peak CMJ force, with mean squared errors of 0.013 and 0.015 at 6 and 48 h, corresponding to metabolic and neuromuscular fatigue, respectively. Conclusion: The CMJ was found to provide a strong predictor of neuromuscular and metabolic fatigue, after accounting for force, concentric time and power. This method can be used to assist coaches to individualise future training based on CMJ response to the immediate session.

    clusterBMA: Bayesian model averaging for clustering

    Within the ensemble clustering literature, various methods have been developed to combine inference across multiple sets of results for unsupervised clustering. The approach of reporting results from one 'best' model out of several candidate clustering models generally ignores the uncertainty that arises from model selection, and results in inferences that are sensitive to the particular model and parameters chosen. Bayesian model averaging (BMA) is a popular approach for combining results across multiple models that offers some attractive benefits in this setting, including probabilistic interpretation of the combined cluster structure and quantification of model-based uncertainty. In this work we introduce clusterBMA, a method that enables weighted model averaging across results from multiple unsupervised clustering algorithms. We use clustering internal validation criteria to develop an approximation of the posterior model probability, used for weighting the results from each model. From a consensus matrix representing a weighted average of the clustering solutions across models, we apply symmetric simplex matrix factorisation to calculate final probabilistic cluster allocations. In addition to outperforming other ensemble clustering methods on simulated data, clusterBMA offers unique features including probabilistic allocation to averaged clusters, combining allocation probabilities from 'hard' and 'soft' clustering algorithms, and measuring model-based uncertainty in averaged cluster allocation. This method is implemented in an accompanying R package of the same name.
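
The consensus-matrix step described in this abstract can be illustrated with a minimal sketch. It is not the clusterBMA package's API: the function names, the use of 0/1 co-association matrices for hard clusterings, and the example weights are all assumptions made for illustration. In the paper the weights come from an approximate posterior model probability, and the subsequent symmetric simplex matrix factorisation step is omitted here.

```python
def coassociation(labels):
    """0/1 matrix: entry (i, j) is 1 when points i and j share a cluster."""
    n = len(labels)
    return [[1.0 if labels[i] == labels[j] else 0.0 for j in range(n)]
            for i in range(n)]


def weighted_consensus(labelings, weights):
    """Weighted average of co-association matrices, one per clustering model.

    Weights are normalised to sum to 1. In this sketch they are hypothetical
    inputs rather than values derived from internal validation criteria.
    """
    total = sum(weights)
    n = len(labelings[0])
    consensus = [[0.0] * n for _ in range(n)]
    for labels, w in zip(labelings, weights):
        c = coassociation(labels)
        for i in range(n):
            for j in range(n):
                consensus[i][j] += (w / total) * c[i][j]
    return consensus


# Two hypothetical hard clusterings of four points, with hypothetical weights.
labelings = [[0, 0, 1, 1], [0, 0, 0, 1]]
weights = [0.75, 0.25]
consensus = weighted_consensus(labelings, weights)
```

Each entry of `consensus` can be read as the weighted share of models that place the two points in the same cluster, which is the kind of matrix a later factorisation step would turn into probabilistic allocations.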

    The Semantic Reader Project: Augmenting Scholarly Documents through AI-Powered Interactive Reading Interfaces

    Scholarly publications are key to the transfer of knowledge from scholars to others. However, research papers are information-dense, and as the volume of the scientific literature grows, the need for new technology to support the reading process grows. In contrast to the process of finding papers, which has been transformed by Internet technology, the experience of reading research papers has changed little in decades. The PDF format for sharing research papers is widely used due to its portability, but it has significant downsides including: static content, poor accessibility for low-vision readers, and difficulty reading on mobile devices. This paper explores the question "Can recent advances in AI and HCI power intelligent, interactive, and accessible reading interfaces -- even for legacy PDFs?" We describe the Semantic Reader Project, a collaborative effort across multiple institutions to explore automatic creation of dynamic reading interfaces for research papers. Through this project, we've developed ten research prototype interfaces and conducted usability studies with more than 300 participants and real-world users showing improved reading experiences for scholars. We've also released a production reading interface for research papers that will incorporate the best features as they mature. We structure this paper around challenges scholars and the public face when reading research papers -- Discovery, Efficiency, Comprehension, Synthesis, and Accessibility -- and present an overview of our progress and remaining open challenges

    Multi-objective mission flight planning in civil unmanned aerial systems

    Unmanned Aerial Vehicles (UAVs) are emerging as an ideal platform for a wide range of civil applications such as disaster monitoring, atmospheric observation and outback delivery. However, the operation of UAVs is currently restricted to specially segregated regions of airspace outside of the National Airspace System (NAS). Mission Flight Planning (MFP) is an integral part of UAV operation that addresses some of the requirements (such as safety and the rules of the air) of integrating UAVs in the NAS. Automated MFP is a key enabler for a number of UAV operating scenarios as it aids in increasing the level of onboard autonomy. For example, onboard MFP is required to ensure continued conformance with the NAS integration requirements when there is an outage in the communications link. MFP is a motion planning task concerned with finding a path between a designated start waypoint and goal waypoint. This path is described with a sequence of 4-dimensional (4D) waypoints (three spatial and one time dimension) or equivalently with a sequence of trajectory segments (or tracks). It is necessary to consider the time dimension as the UAV operates in a dynamic environment. Existing methods for generic motion planning, UAV motion planning and general vehicle motion planning cannot adequately address the requirements of MFP. The flight plan needs to optimise for multiple decision objectives including mission safety objectives, the rules of the air and mission efficiency objectives. Online (in-flight) replanning capability is needed as the UAV operates in a large, dynamic and uncertain outdoor environment. This thesis derives a multi-objective 4D search algorithm entitled Multi-Step A* (MSA*) based on the seminal A* search algorithm. MSA* is proven to find the optimal (least cost) path given a variable successor operator (which enables arbitrary track angle and track velocity resolution). 
Furthermore, it is shown to be of comparable complexity to multi-objective, vector neighbourhood based A* (Vector A*, an extension of A*). A variable successor operator enables the imposition of a multi-resolution lattice structure on the search space (which results in fewer search nodes). Unlike cell decomposition based methods, soundness is guaranteed with multi-resolution MSA*. MSA* is demonstrated through Monte Carlo simulations to be computationally efficient. It is shown that multi-resolution, lattice based MSA* finds paths of equivalent cost (less than 0.5% difference) to Vector A* (the benchmark) in a third of the computation time (on average). This is the first contribution of the research. The second contribution is the discovery of the additive consistency property for planning with multiple decision objectives. Additive consistency ensures that the planner is not biased (which results in a suboptimal path) by ensuring that the cost of traversing a track using one step equals that of traversing the same track using multiple steps. MSA* mitigates uncertainty through online replanning, Multi-Criteria Decision Making (MCDM) and tolerance. Each trajectory segment is modeled with a cell sequence that completely encloses the trajectory segment. The tolerance, measured as the minimum distance between the track and cell boundaries, is the third major contribution. Even though MSA* is demonstrated for UAV MFP, it is extensible to other 4D vehicle motion planning applications. Finally, the research proposes a self-scheduling replanning architecture for MFP. This architecture replicates the decision strategies of human experts to meet the time constraints of online replanning. Based on a feedback loop, the proposed architecture switches between fast, near-optimal planning and optimal planning to minimise the need for hold manoeuvres. 
The derived MFP framework is original and shown, through extensive verification and validation, to satisfy the requirements of UAV MFP. As MFP is an enabling factor for operation of UAVs in the NAS, the presented work is both original and significant
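
The additive consistency property mentioned in this abstract (traversing a track in one step should cost the same as traversing it in several steps) can be checked numerically. The sketch below is only an illustration under stated assumptions: the two objectives, their per-kilometre rates, and the `step_penalty` parameter are hypothetical and are not taken from MSA*.

```python
import math


def track_cost(start, end, risk_per_km=0.25, fuel_per_km=1.0):
    """Cost vector (risk, fuel) for one straight track; rates are hypothetical."""
    length = math.dist(start, end)
    return (risk_per_km * length, fuel_per_km * length)


def path_cost(waypoints, step_penalty=0.0):
    """Sum the cost vectors of consecutive tracks along a waypoint sequence.

    A non-zero step_penalty is added to the fuel objective once per step.
    It exists only to show how additive consistency can be broken.
    """
    risk = fuel = 0.0
    for start, end in zip(waypoints, waypoints[1:]):
        r, f = track_cost(start, end)
        risk += r
        fuel += f + step_penalty
    return (risk, fuel)


def same_cost(a, b):
    """Compare two cost vectors component by component."""
    return all(math.isclose(x, y) for x, y in zip(a, b))


# The same straight track, flown as one step or split at its midpoint.
one_step = [(0.0, 0.0), (6.0, 8.0)]
two_steps = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]

# Length-proportional costs are additively consistent.
consistent = same_cost(path_cost(one_step), path_cost(two_steps))

# A per-step penalty makes the two-step traversal look more expensive,
# which would bias a planner towards coarse steps.
biased = same_cost(path_cost(one_step, step_penalty=0.5),
                   path_cost(two_steps, step_penalty=0.5))
```

Here `consistent` is true and `biased` is false, which is the distinction the property is meant to capture when step sizes vary across a multi-resolution lattice.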

    Classifying ball trajectories in invasion sports using dynamic time warping : A basketball case study

    Comparison and classification of ball trajectories can provide insight to support coaches and players in analysing their plays or opposition plays. This is challenging due to the innate variability and uncertainty of ball trajectories in space and time. We propose a framework based on Dynamic Time Warping (DTW) to cluster, compare and characterise trajectories in relation to play outcomes. Seventy-two international women's basketball games were analysed, where features such as ball trajectory, possession time and possession outcome were recorded. DTW was used to quantify the alignment-adjusted distance between three-dimensional (two spatial, one temporal) trajectories. This distance, along with the final location for the play (usually the shot), was then used to cluster trajectories. These clusters supported the conventional wisdom of higher scoring rates for fast breaks, but also identified other contextual factors affecting scoring rate, including bias towards one side of the court. In addition, some high scoring rate clusters were associated with greater mean change in the direction of ball movement, supporting the notion of entropy affecting effectiveness. Coaches and other end users could use such a framework to make better use of their time by homing in on groups of effective or problematic plays for manual video analysis, both for their own team and when scouting opponents. The framework also suggests new predictors for machine learning to analyse and predict trajectory-based sports.
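
As a rough illustration of the alignment-adjusted distance described above, the following is a minimal sketch of the textbook dynamic-programming form of DTW, not the authors' implementation. It assumes each trajectory is a list of (x, y, t) points and uses plain Euclidean distance as the local cost, which treats spatial and temporal units as comparable; that scaling and the example trajectories are assumptions.

```python
import math


def dtw_distance(a, b):
    """DTW distance between two sequences of equal-dimension points."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] is the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]


# Hypothetical (x, y, t) trajectories: play_b repeats one sample of play_a.
play_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0), (2.0, 0.0, 2.0)]
play_b = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0), (1.0, 0.0, 1.0), (2.0, 0.0, 2.0)]

distance = dtw_distance(play_a, play_b)
```

Because DTW may align one sample with several samples of the other sequence, the repeated point in `play_b` adds no cost. A matrix of such pairwise distances is the kind of input a clustering step could then use.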

    Bayesian hierarchical modelling of basketball tracking data: A case study of spatial entropy and spatial effectiveness

    Spatio-temporal data in sport is increasing rapidly; however, suitable statistical methods for analysing this data are underdeveloped. The current study establishes the need for spatial statistical methods, proposes a Bayesian hierarchical model as an appropriate method for comparing spatial variables, and tests this model across three spatial scales. The need for spatial statistical methods was established through the identification of spatial autocorrelation. This necessitated the use of a Bayesian hierarchical model to test for an association between spatial ball movement entropy and spatial effectiveness. Posterior distribution results showed a generally positive association such that increases in entropy were associated with increases in effectiveness. The strength and confidence of the associations were impacted by the spatial scale, with the 6 × 6 grid showing the most conclusive evidence of a positive relationship; the 4 × 4 grid was mostly positive, however with a large variation; and finally, the basket-centric scale results were less conclusive. The results of the current study demonstrate the suitability of a Bayesian hierarchical model for testing for associations or differences between spatial variables. With the increase in spatial analyses in sport, this study presents an appropriate statistical method for dealing with complex problems associated with spatial analyses.
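
The abstract does not define how spatial ball movement entropy is computed. One plausible reading, sketched below purely as an assumption, is the Shannon entropy of the destinations of ball movements that start in a given grid cell. The function name, the cell labels, and the example data are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(destinations):
    """Shannon entropy (in bits) of the empirical distribution of destinations."""
    counts = Counter(destinations)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical destination cells for ball movements leaving two court cells.
predictable_cell = ["C3", "C3", "C3", "C3"]   # always moves to the same cell
varied_cell = ["A1", "B2", "C3", "D4"]        # spread evenly over four cells

low = shannon_entropy(predictable_cell)
high = shannon_entropy(varied_cell)
```

Under this reading, a cell from which the ball always goes to the same place has zero entropy, while a cell with evenly spread destinations has the maximum for that number of destinations. Per-cell values like these would then be the spatial variable compared against effectiveness.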