74 research outputs found
TRUST-LAPSE: An Explainable and Actionable Mistrust Scoring Framework for Model Monitoring
Continuous monitoring of trained ML models to determine when their
predictions should and should not be trusted is essential for their safe
deployment. Such a framework ought to be high-performing, explainable, post-hoc
and actionable. We propose TRUST-LAPSE, a "mistrust" scoring framework for
continuous model monitoring. We assess the trustworthiness of each input
sample's model prediction using a sequence of latent-space embeddings.
Specifically, (a) our latent-space mistrust score estimates mistrust using
distance metrics (Mahalanobis distance) and similarity metrics (cosine
similarity) in the latent-space and (b) our sequential mistrust score
determines deviations in correlations over the sequence of past input
representations in a non-parametric, sliding-window based algorithm for
actionable continuous monitoring. We evaluate TRUST-LAPSE via two downstream
tasks: (1) distributionally shifted input detection, and (2) data drift
detection. We evaluate across diverse domains (audio and vision) using public
datasets, and further benchmark our approach on challenging, real-world
electroencephalogram (EEG) datasets for seizure detection. Our latent-space
mistrust scores achieve state-of-the-art results with AUROCs of 84.1 (vision),
73.9 (audio), and 77.1 (clinical EEGs), outperforming baselines by over 10
points. We expose critical failures in popular baselines that remain
insensitive to input semantic content, rendering them unfit for real-world
model monitoring. We show that our sequential mistrust scores achieve high
drift detection rates; over 90% of the streams show < 20% error for all
domains. Through extensive qualitative and quantitative evaluations, we show
that our mistrust scores are more robust and provide explainability for easy
adoption into practice.Comment: Keywords: Mistrust Scores, Latent-Space, Model monitoring,
Trustworthy AI, Explainable AI, Semantic-guided A
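A minimal sketch may help make the two scores concrete. The Python fragment below uses illustrative function names rather than the authors' released code, and its median/MAD sliding-window rule is an assumption standing in for the paper's correlation-based sequential test; it shows one plausible way to compute a latent-space mistrust score (Mahalanobis distance plus cosine similarity against reference statistics fitted on trusted data) and a non-parametric drift check over a stream of such scores.

```python
import numpy as np

# Hypothetical helper names; the paper's released code may differ.

def fit_reference(train_embeddings):
    """Fit reference statistics (mean, inverse covariance) on trusted training embeddings."""
    mu = train_embeddings.mean(axis=0)
    cov_inv = np.linalg.pinv(np.cov(train_embeddings, rowvar=False))
    return mu, cov_inv

def latent_mistrust(z, mu, cov_inv):
    """Latent-space mistrust: a distance view (Mahalanobis) and a similarity view (cosine)."""
    diff = z - mu
    mahalanobis = float(np.sqrt(diff @ cov_inv @ diff))
    cosine = float(z @ mu / (np.linalg.norm(z) * np.linalg.norm(mu) + 1e-12))
    return mahalanobis, 1.0 - cosine  # higher values = more mistrust

def sequential_flags(scores, window=50, threshold=3.0):
    """Non-parametric sliding-window drift check over a stream of mistrust scores."""
    scores = np.asarray(scores, dtype=float)
    flags = np.zeros(len(scores), dtype=bool)
    for t in range(window, len(scores)):
        ref = scores[t - window:t]
        med = np.median(ref)
        mad = np.median(np.abs(ref - med)) + 1e-12  # robust spread estimate
        flags[t] = abs(scores[t] - med) / mad > threshold
    return flags

# Example: score a held-out embedding against statistics fitted on training data.
rng = np.random.default_rng(0)
train = rng.normal(size=(1000, 32))
mu, cov_inv = fit_reference(train)
print(latent_mistrust(rng.normal(size=32), mu, cov_inv))
```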
Spatiotemporal Modeling of Multivariate Signals With Graph Neural Networks and Structured State Space Models
Multivariate signals are prevalent in various domains, such as healthcare,
transportation systems, and space sciences. Modeling spatiotemporal
dependencies in multivariate signals is challenging due to (1) long-range
temporal dependencies and (2) complex spatial correlations between sensors. To
address these challenges, we propose representing multivariate signals as
graphs and introduce GraphS4mer, a general graph neural network (GNN)
architecture that captures both spatial and temporal dependencies in
multivariate signals. Specifically, (1) we leverage the Structured State Space
model (S4), a state-of-the-art sequence model, to capture long-range temporal
dependencies and (2) we propose a graph structure learning layer in GraphS4mer
to learn dynamically evolving graph structures in the data. We evaluate our
proposed model on three distinct tasks and show that GraphS4mer consistently
improves over existing models, including (1) seizure detection from
electroencephalography signals, outperforming a previous GNN with
self-supervised pretraining by 3.1 points in AUROC; (2) sleep staging from
polysomnography signals, a 4.1 points improvement in macro-F1 score compared to
existing sleep staging models; and (3) traffic forecasting, reducing MAE by
8.8% compared to existing GNNs and by 1.4% compared to Transformer-based
models.
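The two ideas lend themselves to a compact illustration. The PyTorch sketch below is a loose reading of the architecture, not the published model: a graph structure learning layer infers a sensor graph from learned embeddings via cosine similarity, and a long-kernel 1D convolution stands in for the S4 layers purely to keep the example self-contained. All class names here are hypothetical.

```python
import torch
import torch.nn as nn

class GraphStructureLearner(nn.Module):
    """Infer a dense adjacency over sensors from node embeddings (cosine similarity)."""
    def __init__(self, in_dim, embed_dim):
        super().__init__()
        self.proj = nn.Linear(in_dim, embed_dim)

    def forward(self, h):                          # h: (batch, nodes, in_dim)
        e = torch.nn.functional.normalize(self.proj(h), dim=-1)
        return torch.relu(e @ e.transpose(1, 2))   # keep positive similarities only

class GraphS4merSketch(nn.Module):
    def __init__(self, n_features, hidden=64, kernel=15):
        super().__init__()
        # Temporal stage: long-kernel convolution as a stand-in for S4 layers.
        self.temporal = nn.Conv1d(n_features, hidden, kernel, padding=kernel // 2)
        self.gsl = GraphStructureLearner(hidden, hidden)
        self.readout = nn.Linear(hidden, 1)

    def forward(self, x):                          # x: (batch, nodes, time, features)
        b, n, t, f = x.shape
        h = self.temporal(x.reshape(b * n, t, f).transpose(1, 2))  # (b*n, hidden, t)
        h = h.mean(dim=-1).reshape(b, n, -1)       # pool over time -> node embeddings
        adj = self.gsl(h)                          # dynamically learned graph
        h = adj @ h                                # one round of message passing
        return self.readout(h.mean(dim=1))         # graph-level prediction

signals = torch.randn(2, 19, 200, 1)               # e.g. 19 EEG channels, 200 steps
print(GraphS4merSketch(n_features=1)(signals).shape)  # torch.Size([2, 1])
```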
A short and engaging adaptive working memory intervention for children with Developmental Language Disorder: Effects on language and working memory
Recent research has suggested that working memory training interventions may benefit children with Developmental Language Disorder (DLD). The current study investigated a short and engaging adaptive working memory intervention that targeted executive skills and aimed to improve both language comprehension and working memory abilities in children with DLD. Forty-seven 6- to 10-year-old children with DLD were randomly allocated to an executive working memory training intervention (n=24) or an active control group (n=23). A pre-test/intervention/post-test/9-month-follow-up design was used. Outcome measures included assessments of language (to evaluate far transfer of the training) and working memory (to evaluate near transfer of the training). Hierarchical multiple regression analyses controlling for pre-intervention performance and age found group to be a significant predictor of sentence comprehension and of performance on six untrained working memory measures at post-intervention and 9-month follow-up. Children in the intervention group showed significantly higher language comprehension and working memory scores at both time points than children in the active control group. The intervention programme showed potential to improve working memory and language comprehension in children with DLD and demonstrated several advantages: it involved short sessions over a short period; caused little disruption in the school day; and was enjoyed by children.
Population Health Solutions for Assessing Cognitive Impairment in Geriatric Patients.
In December 2017, the National Academy of Neuropsychology convened an interorganizational Summit on Population Health Solutions for Assessing Cognitive Impairment in Geriatric Patients in Denver, Colorado. The Summit brought together representatives of a broad range of stakeholders invested in the care of older adults to focus on the topic of cognitive health and aging. Summit participants specifically examined questions of who should be screened for cognitive impairment and how they should be screened in medical settings. This is important in the context of an acute illness given that the presence of cognitive impairment can have significant implications for care and for the management of concomitant diseases, as well as pose a major risk factor for dementia. Participants arrived at general principles to guide future screening approaches in medical populations and identified knowledge gaps to direct future research. Key learning points of the Summit included: recognizing the importance of educating patients and healthcare providers about the value of assessing current and baseline cognition; emphasizing that any screening tool must be appropriately normalized and validated in the population in which it is used to obtain accurate information, including considerations of language, cultural factors, and education; and recognizing the great potential, with appropriate caveats, of electronic health records to augment cognitive screening and tracking of changes in cognitive health over time.
A community-maintained standard library of population genetic models
The explosion in population genomic data demands ever more complex modes of analysis, and increasingly, these analyses depend on sophisticated simulations. Recent advances in population genetic simulation have made it possible to simulate large and complex models, but specifying such models for a particular simulation engine remains a difficult and error-prone task. Computational genetics researchers currently re-implement simulation models independently, leading to inconsistency and duplication of effort. This situation presents a major barrier to empirical researchers seeking to use simulations for power analyses of upcoming studies or sanity checks on existing genomic data. Population genetics, as a field, also lacks standard benchmarks by which new tools for inference might be measured. Here, we describe a new resource, stdpopsim, that attempts to rectify this situation. Stdpopsim is a community-driven open source project, which provides easy access to a growing catalog of published simulation models from a range of organisms and supports multiple simulation engine backends. This resource is available as a well-documented python library with a simple command-line interface. We share some examples demonstrating how stdpopsim can be used to systematically compare demographic inference methods, and we encourage a broader community of developers to contribute to this growing resource.
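As a taste of how the catalog is used, here is a minimal sketch against the early (0.1-era) stdpopsim API; "HomSap", "OutOfAfrica_3G09", and "chr22" are real catalog identifiers, but the sampling interface has changed across releases, so check the current documentation before running.

```python
import stdpopsim

# Pick a species and a published demographic model from the catalog.
species = stdpopsim.get_species("HomSap")
model = species.get_demographic_model("OutOfAfrica_3G09")

# Simulate human chromosome 22 under that model.
contig = species.get_contig("chr22")

# Draw samples from the model's three populations (YRI, CEU, CHB).
# Note: 0.1.x used model.get_samples(...); newer releases take a
# {population: count} mapping instead.
samples = model.get_samples(10, 10, 10)

# Run the simulation with the msprime backend and save the tree sequence.
engine = stdpopsim.get_engine("msprime")
ts = engine.simulate(model, contig, samples)
ts.dump("ooa_chr22.trees")
print(ts.num_samples, "sample genomes,", ts.num_sites, "variant sites")
```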
Protocol for establishing a core outcome set for evaluation in studies of pulmonary exacerbations in people with cystic fibrosis
Introduction: Pulmonary exacerbations are associated with increased morbidity and mortality in people with cystic fibrosis (CF). There is no consensus about which outcomes should be evaluated in studies of pulmonary exacerbations or how these outcomes should be measured. Outcomes of importance to people with lived experience of the disease are frequently omitted or inconsistently reported in studies, which limits the value of such studies for informing practice and policy. To better standardise outcome reporting and measurement, we aim to develop a core outcome set for studies of pulmonary exacerbations in people with CF (COS-PEX) and consensus recommendations for measurement of core outcomes. Methods and analysis: Preliminary work for development of COS-PEX has been reported, including (1) systematic reviews of outcomes and methods for measurement reported in existing studies of pulmonary exacerbations; (2) workshops with people affected by CF within Australia; and (3) a Bayesian expert knowledge elicitation workshop with health professionals to ascertain outcomes of importance. Here we describe a protocol for the additional stages required for COS-PEX development and for consensus methods for measurement of core outcomes. These include (1) an international two-round online Delphi survey and (2) consensus workshops to review and endorse the proposed COS-PEX and to agree on methods for measurement. Ethics and dissemination: National mutual ethics scheme approval has been provided by the Child and Adolescent Health Service Human Research Ethics Committee (RGS 4926). Results will be disseminated via consumer and research networks and peer-reviewed publications. This study is registered with the Core Outcome Measures in Effectiveness Trials database.
MSH3 polymorphisms and protein levels affect CAG repeat instability in Huntington's disease mice
Expansions of trinucleotide CAG/CTG repeats in somatic tissues are thought to contribute to ongoing disease progression throughout the life of individuals affected with Huntington's disease or myotonic dystrophy. Broad ranges of repeat instability arise between individuals with expanded repeats, suggesting the existence of modifiers of repeat instability. Mice with expanded CAG/CTG repeats show variable levels of instability depending upon mouse strain. However, to date the genetic modifiers underlying these differences have not been identified. We show that in liver and striatum the R6/1 Huntington's disease (HD) (CAG)~100 transgene, when present in a congenic C57BL/6J (B6) background, incurred expansion-biased repeat mutations, whereas the repeat was stable in a congenic BALB/cByJ (CBy) background. Reciprocal congenic mice revealed the Msh3 gene as the determinant of the differences in repeat instability. Expansion bias was observed in congenic mice homozygous for the B6 Msh3 gene on a CBy background, while the CAG tract was stabilized in congenics homozygous for the CBy Msh3 gene on a B6 background. The CAG stabilization was as dramatic as that caused by genetic deficiency of Msh2. The B6 and CBy Msh3 genes had identical promoters but differed in coding regions and showed strikingly different protein levels. The B6 MSH3 variant protein is highly expressed and associated with CAG expansions, while the CBy MSH3 variant protein is expressed at barely detectable levels, associating with CAG stability. The DHFR protein, which is divergently transcribed from a promoter shared by the Msh3 gene, did not show varied levels between mouse strains. Thus, naturally occurring MSH3 protein polymorphisms are modifiers of CAG repeat instability, likely through variable MSH3 protein stability. Since evidence supports that somatic CAG instability is a modifier and predictor of disease, our data are consistent with the hypothesis that variable levels of CAG instability associated with polymorphisms of DNA repair genes may have prognostic implications for various repeat-associated diseases.
Catching Element Formation In The Act
Gamma-ray astronomy explores the most energetic photons in nature to address
some of the most pressing puzzles in contemporary astrophysics. It encompasses
a wide range of objects and phenomena: stars, supernovae, novae, neutron stars,
stellar-mass black holes, nucleosynthesis, the interstellar medium, cosmic rays
and relativistic-particle acceleration, and the evolution of galaxies. MeV
gamma-rays provide a unique probe of nuclear processes in astronomy, directly
measuring radioactive decay, nuclear de-excitation, and positron annihilation.
Gamma-ray photons carry substantial information: they let us see deeper into
these objects, the bulk of the power is often emitted at gamma-ray
energies, and radioactivity provides a natural physical clock that adds unique
information. New science will be driven by time-domain population studies at
gamma-ray energies. This science is enabled by next-generation gamma-ray
instruments with one to two orders of magnitude better sensitivity, larger sky
coverage, and faster cadence than all previous gamma-ray instruments. This
transformative capability permits: (a) the accurate identification of the
gamma-ray emitting objects and correlations with observations taken at other
wavelengths and with other messengers; (b) construction of new gamma-ray maps
of the Milky Way and other nearby galaxies where extended regions are
distinguished from point sources; and (c) considerable serendipitous science of
scarce events -- nearby neutron star mergers, for example. Advances in
technology push the performance of new gamma-ray instruments to address a wide
set of astrophysical questions.