Optimal partitioning of directed acyclic graphs with dependent costs between clusters
Many statistical inference contexts, including Bayesian Networks (BNs), Markov processes and Hidden Markov Models (HMMs), can be supported by partitioning (i.e. mapping) the underlying Directed Acyclic Graph (DAG) into
clusters. However, optimal partitioning is challenging, especially for statistical inference, as the cost to be optimised depends both on the nodes within a cluster and on the mapping of clusters connected via parent and/or child nodes; we call these dependent clusters. We propose a novel algorithm called
DCMAP for optimal cluster mapping with dependent clusters. Given an arbitrarily
defined, positive cost function based on the DAG and cluster mappings, we show
that DCMAP converges to find all optimal clusters, and returns near-optimal
solutions along the way. Empirically, we find that the algorithm is
time-efficient for a DBN model of a seagrass complex system using a computation cost function. For 25- and 50-node DBNs, the search space of possible cluster mappings is vast, yet near-optimal solutions with 88% and 72% similarity to the optimal solution were found at iterations 170 and 865, respectively. The first optimal solutions were found at iterations 934 and 2256, with costs that were 4% and 0.2% of the naive heuristic cost, respectively.
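The notion of dependent cluster costs can be made concrete with a toy example. The cost function below is purely illustrative (not the paper's computation cost function): each cluster pays a within-cluster term, and every edge crossing a cluster boundary pays a penalty that depends on how both endpoint clusters are mapped.

```python
# Hypothetical sketch of a dependent-cluster cost on a small DAG.
# All terms and weights are illustrative, not the DCMAP cost function.

def mapping_cost(edges, mapping):
    """Total cost of a cluster mapping for a DAG given as (parent, child) edges."""
    clusters = {}
    for node, c in mapping.items():
        clusters.setdefault(c, set()).add(node)
    # Within-cluster term: grows quadratically with cluster size (illustrative).
    cost = sum(len(members) ** 2 for members in clusters.values())
    # Between-cluster term: an edge crossing a boundary is penalised by the
    # sizes of BOTH endpoint clusters -- the cost depends on how neighbouring
    # clusters are mapped, i.e. the clusters are "dependent".
    for parent, child in edges:
        if mapping[parent] != mapping[child]:
            cost += len(clusters[mapping[parent]]) + len(clusters[mapping[child]])
    return cost

edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
one_cluster = {n: 0 for n in "abcd"}
split = {"a": 0, "b": 1, "c": 1, "d": 1}
print(mapping_cost(edges, one_cluster))  # 16 = 4**2, no crossing edges
print(mapping_cost(edges, split))        # 18 = (1 + 9) + two crossing edges of cost 4
```

Because the crossing penalty depends on both endpoint clusters, re-mapping one cluster changes the cost of its neighbours, which is what makes the optimisation hard.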
Guidelines for model adaptation: A study of the transferability of a general seagrass ecosystem Dynamic Bayesian Networks model
In general, it is not feasible to collect enough empirical data to capture the entire range of processes that define a complex system, either intrinsically or when viewing the system from a different geographical or temporal perspective. In this context, an alternative approach is to consider model transferability, which is the act of translating a model built for one environment to another less well-known situation. Model transferability and adaptability may be extremely beneficial: approaches that aid in the reuse and adaptation of models, particularly for sites with limited data, would encourage widespread model uptake. Besides the reduced effort required to develop a model, data collection can be simplified when transferring a model to a different application context. The research presented in this paper focused on a case study to identify and implement guidelines for model adaptation. Our study adapted a general Dynamic Bayesian Network (DBN) of a seagrass ecosystem to a new location where nodes were similar, but the conditional probability tables varied. We focused on two species of seagrass (Zostera noltei and Zostera marina) located in Arcachon Bay, France. Expert knowledge was used to complement peer-reviewed literature to identify which components needed adjustment, including parameterization and quantification of the model and desired outcomes. We adopted both linguistic labels and scenario-based elicitation to elicit from experts the conditional probabilities used to quantify the DBN. Following the proposed guidelines, the model structure of the general DBN was retained, but the conditional probability tables were adapted for nodes that characterize the growth dynamics of the Zostera spp. population located in Arcachon Bay, as well as the seasonal variation in their reproduction. Particular attention was paid to the light variable, as it is a crucial driver of growth and physiology for seagrasses.
Our guidelines provide a way to adapt a general DBN to specific ecosystems to maximize model reuse and minimize re-development effort. Especially important from a transferability perspective are the guidelines for ecosystems with limited data, and the guidance on how simulation and prior predictive approaches can be used in these contexts.
Predicting seagrass ecosystem resilience to marine heatwave events of variable duration, frequency and re-occurrence patterns with gaps
Background: Seagrass, a vital primary producer habitat, is crucial for maintaining high biodiversity and offers numerous ecosystem services globally. The increasing severity and frequency of marine heatwaves, exacerbated by climate change, pose significant risks to seagrass meadows. Aims: This study acknowledges the uncertainty and variability of marine heatwave scenarios and aims to aid managers and policymakers in understanding simulated responses of seagrass to different durations, frequencies and recurrence gaps of marine heatwaves. Materials and Methods: Using expert knowledge and observed data, we refined a global Dynamic Bayesian Network (DBN) model for a specific case study on Halophila ovalis in Leschenault Estuary, Australia. The model evaluates the potential impact of marine heatwaves on seagrass resilience, examining stress resistance, recovery and extinction risk. Results: Simulations of different marine heatwave scenarios reveal significant impacts on seagrass ecosystems. Scenarios ranged from 30- to 90-day heatwaves, with longer durations causing more significant biomass decline, reduced resistance, higher extinction risk and prolonged recovery. For instance, recovery time may increase from 18 to 26 months with four 60-day and from 24 to 47 months with four 90-day marine heatwave events. Increasing the frequency of marine heatwaves from one to four annual events, with no gaps between occurrences, could raise extinction risk from 11% to 55% for 60-day events and from 17% to 83% for 90-day events. However, introducing gaps between heatwaves enhanced resilience, with spaced events showing lower extinction risks and quicker recovery than consecutive yearly events. Discussion: The study demonstrates the DBN model's utility in simulating the impact of marine heatwaves on seagrass, providing tools for risk-informed assessment of management and restoration efforts.
While these simulations align with existing research on temperature impacts on seagrass, they are not empirical. Conclusion: Further research is necessary to expand our understanding of climate change effects on seagrass ecosystems, guide policy and develop strategies to strengthen marine ecosystem resilience
Predicting fatigue using countermovement jump force-time signatures: PCA can distinguish neuromuscular versus metabolic fatigue
Purpose: This study investigated the relationship between the ground reaction force-time profile of a countermovement jump (CMJ) and fatigue, specifically focusing on predicting the onset of neuromuscular versus metabolic fatigue using the CMJ. Method: Ten recreational athletes performed 5 CMJs at time points prior to, immediately following, and at 0.5, 1, 3, 6, 24 and 48 h after training, which comprised repeated sprint sessions of low, moderate, or high workloads. Features of the concentric portion of the CMJ force-time signature at the measurement time points were analysed using Principal Components Analysis (PCA) and functional PCA (fPCA) to better understand fatigue onset given training workload. In addition, Linear Mixed Effects (LME) models were developed to predict the onset of fatigue. Results: The first two Principal Components (PCs) using PCA explained 68% of the variation in CMJ features, capturing variation between athletes through weighted combinations of force, concentric time and power. The next two PCs explained 9.9% of the variation and revealed fatigue effects between 6 and 48 h after training for PC3, and contrasting neuromuscular and metabolic fatigue effects in PC4. fPCA supported these findings and further revealed contrasts between metabolic and neuromuscular fatigue effects in the first and second halves of the force-time curve in PC3, and a double peak effect in PC4. Subsequently, CMJ measurements up to 0.5 h after training were used to predict relative peak CMJ force, with mean squared errors of 0.013 and 0.015 at 6 and 48 h, corresponding to metabolic and neuromuscular fatigue, respectively. Conclusion: The CMJ was found to be a strong predictor of neuromuscular and metabolic fatigue, after accounting for force, concentric time and power. This method can be used to help coaches individualise future training based on CMJ response to the immediate session.
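As a minimal illustration of the PCA step (using simulated force-time curves, not the study's data), the proportion of variation explained by each component can be read off directly from an SVD of the centred curve matrix:

```python
# Illustrative sketch: PCA on simulated CMJ-like force-time curves.
# The curves, sample sizes and noise levels are invented for the example.
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 100)                    # normalised concentric phase
# 30 simulated jumps: a common force shape whose peak height varies by athlete,
# plus small measurement noise at each time point.
curves = np.array([
    (1.0 + 0.2 * rng.standard_normal()) * np.sin(np.pi * t)
    + 0.05 * rng.standard_normal(t.size)
    for _ in range(30)
])

# PCA via SVD of the centred data matrix.
centred = curves - curves.mean(axis=0)
U, s, Vt = np.linalg.svd(centred, full_matrices=False)
explained = s**2 / np.sum(s**2)               # proportion of variation per PC
scores = centred @ Vt[:4].T                   # per-jump scores on the first 4 PCs
print(explained[:4])
```

Here the first component captures the dominant peak-height variation between simulated jumps, mirroring how the study's first PCs captured between-athlete variation.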
clusterBMA: Bayesian model averaging for clustering
Various methods have been developed to combine inference across multiple sets
of results for unsupervised clustering, within the ensemble clustering
literature. The approach of reporting results from one 'best' model out of
several candidate clustering models generally ignores the uncertainty that
arises from model selection, and results in inferences that are sensitive to
the particular model and parameters chosen. Bayesian model averaging (BMA) is a
popular approach for combining results across multiple models that offers some
attractive benefits in this setting, including probabilistic interpretation of
the combined cluster structure and quantification of model-based uncertainty.
In this work we introduce clusterBMA, a method that enables weighted model
averaging across results from multiple unsupervised clustering algorithms. We
use clustering internal validation criteria to develop an approximation of the
posterior model probability, used for weighting the results from each model.
From a consensus matrix representing a weighted average of the clustering
solutions across models, we apply symmetric simplex matrix factorisation to
calculate final probabilistic cluster allocations. In addition to outperforming
other ensemble clustering methods on simulated data, clusterBMA offers unique
features including probabilistic allocation to averaged clusters, combining
allocation probabilities from 'hard' and 'soft' clustering algorithms, and
measuring model-based uncertainty in averaged cluster allocation. This method
is implemented in an accompanying R package of the same name
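The consensus-matrix idea can be sketched in a few lines. The model weights below are made up for illustration; in clusterBMA they are derived from internal validation criteria approximating posterior model probabilities.

```python
# Minimal sketch of a weighted consensus matrix over hard clusterings:
# co-clustering indicator matrices from each model, averaged with model weights.
# Weights here are illustrative placeholders, not validation-derived.
import numpy as np

def consensus_matrix(labelings, weights):
    """Weighted average of co-clustering indicators from hard clusterings."""
    n = len(labelings[0])
    consensus = np.zeros((n, n))
    for labels, w in zip(labelings, weights):
        labels = np.asarray(labels)
        # Entry (i, j) is 1 when model puts points i and j in the same cluster.
        consensus += w * (labels[:, None] == labels[None, :])
    return consensus / sum(weights)

# Two candidate clusterings of five points; the second model is weighted higher.
model_a = [0, 0, 1, 1, 1]
model_b = [0, 0, 0, 1, 1]
C = consensus_matrix([model_a, model_b], weights=[0.3, 0.7])
print(C[2, 3])  # points 2 and 3 co-cluster only under model A -> 0.3
```

A factorisation of such a consensus matrix (symmetric simplex matrix factorisation in clusterBMA) then yields the final probabilistic allocations.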
The Semantic Reader Project: Augmenting Scholarly Documents through AI-Powered Interactive Reading Interfaces
Scholarly publications are key to the transfer of knowledge from scholars to
others. However, research papers are information-dense, and as the volume of
the scientific literature grows, the need for new technology to support the
reading process grows. In contrast to the process of finding papers, which has
been transformed by Internet technology, the experience of reading research
papers has changed little in decades. The PDF format for sharing research
papers is widely used due to its portability, but it has significant downsides
including: static content, poor accessibility for low-vision readers, and
difficulty reading on mobile devices. This paper explores the question "Can
recent advances in AI and HCI power intelligent, interactive, and accessible
reading interfaces -- even for legacy PDFs?" We describe the Semantic Reader
Project, a collaborative effort across multiple institutions to explore
automatic creation of dynamic reading interfaces for research papers. Through
this project, we've developed ten research prototype interfaces and conducted
usability studies with more than 300 participants and real-world users showing
improved reading experiences for scholars. We've also released a production
reading interface for research papers that will incorporate the best features
as they mature. We structure this paper around challenges scholars and the
public face when reading research papers -- Discovery, Efficiency,
Comprehension, Synthesis, and Accessibility -- and present an overview of our
progress and remaining open challenges
Multi-objective mission flight planning in civil unmanned aerial systems
Unmanned Aerial Vehicles (UAVs) are emerging as an ideal platform for a wide range of civil applications such as disaster monitoring, atmospheric observation and outback delivery. However, the operation of UAVs is currently restricted to specially segregated regions of airspace outside of the National Airspace System (NAS). Mission Flight Planning (MFP) is an integral part of UAV operation that addresses some of the requirements (such as safety and the rules of the air) of integrating UAVs in the NAS. Automated MFP is a key enabler for a number of UAV operating scenarios as it aids in increasing the level of onboard autonomy. For example, onboard MFP is required to ensure continued conformance with the NAS integration requirements when there is an outage in the communications link. MFP is a motion planning task concerned with finding a path between a designated start waypoint and goal waypoint. This path is described with a sequence of 4 Dimensional (4D) waypoints (three spatial and one time dimension) or equivalently with a sequence of trajectory segments (or tracks). It is necessary to consider the time dimension as the UAV operates in a dynamic environment. Existing methods for generic motion planning, UAV motion planning and general vehicle motion planning cannot adequately address the requirements of MFP. The flight plan needs to optimise for multiple decision objectives including mission safety objectives, the rules of the air and mission efficiency objectives. Online (in-flight) replanning capability is needed as the UAV operates in a large, dynamic and uncertain outdoor environment. This thesis derives a multi-objective 4D search algorithm entitled Multi-Step A* (MSA*) based on the seminal A* search algorithm. MSA* is proven to find the optimal (least cost) path given a variable successor operator (which enables arbitrary track angle and track velocity resolution).
Furthermore, it is shown to be of comparable complexity to multi-objective, vector neighbourhood based A* (Vector A*, an extension of A*). A variable successor operator enables the imposition of a multi-resolution lattice structure on the search space (which results in fewer search nodes). Unlike cell decomposition based methods, soundness is guaranteed with multi-resolution MSA*. MSA* is demonstrated through Monte Carlo simulations to be computationally efficient. It is shown that multi-resolution, lattice based MSA* finds paths of equivalent cost (less than 0.5% difference) to Vector A* (the benchmark) in a third of the computation time (on average). This is the first contribution of the research. The second contribution is the discovery of the additive consistency property for planning with multiple decision objectives. Additive consistency ensures that the planner is not biased (which results in a suboptimal path) by ensuring that the cost of traversing a track using one step equals that of traversing the same track using multiple steps. MSA* mitigates uncertainty through online replanning, Multi-Criteria Decision Making (MCDM) and tolerance. Each trajectory segment is modelled with a cell sequence that completely encloses the trajectory segment. The tolerance, measured as the minimum distance between the track and cell boundaries, is the third major contribution. Even though MSA* is demonstrated for UAV MFP, it is extensible to other 4D vehicle motion planning applications. Finally, the research proposes a self-scheduling replanning architecture for MFP. This architecture replicates the decision strategies of human experts to meet the time constraints of online replanning. Based on a feedback loop, the proposed architecture switches between fast, near-optimal planning and optimal planning to minimise the need for hold manoeuvres.
The derived MFP framework is original and shown, through extensive verification and validation, to satisfy the requirements of UAV MFP. As MFP is an enabling factor for operation of UAVs in the NAS, the presented work is both original and significant
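For readers unfamiliar with the underlying machinery, a plain A* planner with a scalarised (weighted-sum) multi-objective cost conveys the basic idea of planning against several objectives at once. This is a generic sketch, not MSA* and not its variable successor operator; the grid, risk field and weight are invented for illustration.

```python
# Generic A* on a 4-connected grid with a weighted-sum multi-objective cost:
# each step costs 1 (efficiency) plus a weighted "risk" for the entered cell
# (a stand-in for a safety objective). Illustrative only, not MSA*.
import heapq

def astar(grid_risk, start, goal, risk_weight=2.0):
    """Least-cost path where edge cost = 1 step + risk_weight * cell risk."""
    rows, cols = len(grid_risk), len(grid_risk[0])

    def h(node):  # admissible heuristic: Manhattan distance (risk >= 0)
        return abs(node[0] - goal[0]) + abs(node[1] - goal[1])

    frontier = [(h(start), 0.0, start, [start])]
    best = {}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        if node in best and best[node] <= g:
            continue
        best[node] = g
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                ng = g + 1 + risk_weight * grid_risk[nr][nc]
                heapq.heappush(frontier,
                               (ng + h((nr, nc)), ng, (nr, nc), path + [(nr, nc)]))
    return None

risk = [[0, 0, 0],
        [0, 9, 0],   # high-risk cell the planner should route around
        [0, 0, 0]]
cost, path = astar(risk, (0, 0), (2, 2))
print(path)  # detours around the centre cell
```

With the risk weight set to zero the planner would happily cut through the risky cell; the weight encodes the trade-off between the safety and efficiency objectives.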
Classifying ball trajectories in invasion sports using dynamic time warping: A basketball case study
Comparison and classification of ball trajectories can provide insight to support coaches and players in analysing their plays or opposition plays. This is challenging due to the innate variability and uncertainty of ball trajectories in space and time. We propose a framework based on Dynamic Time Warping (DTW) to cluster, compare and characterise trajectories in relation to play outcomes. Seventy-two international women's basketball games were analysed, where features such as ball trajectory, possession time and possession outcome were recorded. DTW was used to quantify the alignment-adjusted distance between three-dimensional (two spatial, one temporal) trajectories. This distance, along with the final location for the play (usually the shot), was then used to cluster trajectories. These clusters supported the conventional wisdom of higher scoring rates for fast breaks, but also identified other contextual factors affecting scoring rate, including bias towards one side of the court. In addition, some high scoring rate clusters were associated with greater mean change in the direction of ball movement, supporting the notion of entropy affecting effectiveness. Coaches and other end users could use such a framework to make better use of their time by homing in on groups of effective or problematic plays for manual video analysis, both for their own team and when scouting opponents. The framework also suggests new predictors for machine learning approaches to analysing and predicting trajectory-based sports.
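The DTW distance at the heart of such a framework is the standard dynamic-programming formulation. The sketch below (illustrative, not the paper's exact configuration) shows why it tolerates trajectories sampled at different speeds:

```python
# Standard DTW between two trajectories given as (n x d) arrays of points.
# The example trajectories are synthetic, not data from the study.
import numpy as np

def dtw_distance(a, b):
    """Alignment-adjusted distance between two point sequences via DTW."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])   # pointwise distance
            # Best of match, insertion and deletion -- the warping step.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# The same arc sampled at two different rates: DTW stays small where a naive
# pointwise comparison would be inflated by temporal misalignment.
slow = np.column_stack([np.linspace(0, 1, 20), np.sin(np.linspace(0, np.pi, 20))])
fast = np.column_stack([np.linspace(0, 1, 10), np.sin(np.linspace(0, np.pi, 10))])
print(dtw_distance(slow, fast))
```

Clustering then only needs a pairwise distance matrix from this function, which is how trajectories of unequal duration become directly comparable.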