4,849 research outputs found
Examining systemic and dispositional factors impacting historically disenfranchised schools across North Carolina
This mixed-methods sequential explanatory study analyzed North Carolina (NC) school leaders’ dispositions toward eliminating opportunity gaps, as outlined in NC’s strategic plan. The study’s quantitative phase used descriptive and correlation analysis of eight Likert subscales built around four tenets of transformative leadership (Shields, 2011) and aspects of critical race theory (Bell, 1992; Ladson-Billings, 1998; Ladson-Billings & Tate, 2006) to understand systemic inequities and leadership attitudes.
The qualitative phase comprised three analyses of education leadership dispositions and systemic factors in NC schools. The first analysis of State Board of Education meeting minutes from 2018–2023 quantified and analyzed utterances of racism and critical race, outlined the sociopolitical context of such utterances, and identified systemic patterns and state leader dispositions. The second analysis of five interviews of K–12 graduates identified persistent and systemic factors influencing NC education 7 decades after Brown v. Board of Education (1954) and within the context of Leandro v. State of NC (1997), where the NC Supreme Court recognized the state constitutional right for every student to access a “sound basic education.” The final qualitative analysis consisted of five interviews of current NC public school system leaders, for personal narratives of the state of NC schools compared to patterns from lived experiences of NC K–12 graduates.
The study’s findings suggested that NC school and state education leaders experience a racialized dichotomy between willingness for change (equity intentions) and execution of transformative action (practice). Although leaders at the board and school levels recognize the need for inclusivity and equity, a struggle to transcend systemic challenges, especially those rooted in racial biases and power dynamics, is evident. This study may identify leadership qualities needed for change in NC to address systemic inequities, improve educational access, and inform policy to uphold all students’ constitutional right to a sound, basic education.
Energy storage design and integration in power systems by system-value optimization
Energy storage can play a crucial role in decarbonising power systems by balancing
power and energy in time. Wider power system benefits that arise from these
balancing technologies include lower grid expansion, renewable curtailment, and
average electricity costs. However, with the proliferation of new energy storage
technologies, it becomes increasingly difficult to identify which technologies are
economically viable and how to design and integrate them effectively.
Using large-scale energy system models in Europe, the dissertation shows that solely
relying on Levelized Cost of Storage (LCOS) metrics for technology assessments can
mislead and that traditional system-value methods raise important questions about
how to assess multiple energy storage technologies. Further, the work introduces a
new complementary system-value assessment method called the market-potential
method, which provides a systematic deployment analysis for assessing multiple
storage technologies under competition. However, integrating energy storage in
system models can lead to the unintended storage cycling effect, which occurs in
approximately two-thirds of models and significantly distorts results. The thesis
finds that traditional approaches to deal with the issue, such as multi-stage optimization
or mixed integer linear programming approaches, are either ineffective
or computationally inefficient. A new approach is suggested that only requires
appropriate model parameterization with variable costs while keeping the model
convex to reduce the risk of misleading results.
In addition, to enable energy storage assessments and energy system research around
the world, the thesis extended the geographical scope of an existing European open-source
model to global coverage. The newly built energy system model ‘PyPSA-Earth’
is thereby demonstrated and validated in Africa. Using PyPSA-Earth, the thesis
assesses for the first time the system value of 20 energy storage technologies across
multiple scenarios in a representative future power system in Africa. The results offer
insights into approaches for assessing multiple energy storage technologies under
competition in large-scale energy system models. In particular, the dissertation
addresses extreme cost uncertainty through a comprehensive scenario tree and finds
that, apart from lithium and hydrogen, only seven energy storage technologies are
optimization-relevant. The work also discovers that a heterogeneous storage design
can increase power system benefits and that some energy storage technologies are more
important than others. Finally, in contrast to traditional methods that only consider a
single energy storage technology, the thesis finds that optimizing multiple energy storage
options can reduce total system costs by up to 29%.
The presented research findings have the potential to inform decision-making processes
for the sizing, integration, and deployment of energy storage systems in
decarbonized power systems, contributing to a paradigm shift in scientific methodology
and advancing efforts towards a sustainable future.
Multidisciplinary perspectives on Artificial Intelligence and the law
This open access book presents an interdisciplinary, multi-authored, edited collection of chapters on Artificial Intelligence (‘AI’) and the Law. AI technology has come to play a central role in the modern data economy. Through a combination of increased computing power, the growing availability of data and the advancement of algorithms, AI has now become an umbrella term for some of the most transformational technological breakthroughs of this age. The importance of AI stems from both the opportunities that it offers and the challenges that it entails. While AI applications hold the promise of economic growth and efficiency gains, they also create significant risks and uncertainty. The potential and perils of AI have thus come to dominate modern discussions of technology and ethics – and although AI was initially allowed to largely develop without guidelines or rules, few would deny that the law is set to play a fundamental role in shaping the future of AI. As the debate over AI is far from over, the need for rigorous analysis has never been greater. This book thus brings together contributors from different fields and backgrounds to explore how the law might provide answers to some of the most pressing questions raised by AI. An outcome of the Católica Research Centre for the Future of Law and its interdisciplinary working group on Law and Artificial Intelligence, it includes contributions by leading scholars in the fields of technology, ethics and the law.
LIPIcs, Volume 251, ITCS 2023, Complete Volume
Data-assisted modeling of complex chemical and biological systems
Complex systems are abundant in chemistry and biology; they can be multiscale, possibly high-dimensional or stochastic, with nonlinear dynamics and interacting components. It is often nontrivial (and sometimes impossible) to determine and study the macroscopic quantities of interest and the equations they obey. One can only (judiciously or randomly) probe the system, gather observations and study trends. In this thesis, Machine Learning is used as a complement to traditional modeling and numerical methods to enable data-assisted (or data-driven) dynamical systems. As case studies, three complex systems are sourced from diverse fields. The first is a high-dimensional computational neuroscience model of the Suprachiasmatic Nucleus of the human brain, where bifurcation analysis is performed by simply probing the system. Then, manifold learning is employed to discover a latent space of neuronal heterogeneity. Second, Machine Learning surrogate models are used to optimize dynamically operated catalytic reactors. An algorithmic pipeline is presented through which it is possible to program catalysts with active learning. Third, Machine Learning is employed to extract laws of Partial Differential Equations describing bacterial chemotaxis. It is demonstrated how Machine Learning manages to capture the rules of bacterial motility at the macroscopic level, starting from diverse data sources (including real-world experimental data). More importantly, a framework is constructed through which already existing, partial knowledge of the system can be exploited.
These applications showcase how Machine Learning can be used synergistically with traditional simulations in different scenarios: (i) equations are available but the overall system is so high-dimensional that efficiency and explainability suffer, (ii) equations are available but lead to highly nonlinear black-box responses, (iii) only data are available (of varying source and quality) and equations need to be discovered. For such data-assisted dynamical systems, we can perform fundamental tasks, such as integration, steady-state location, continuation and optimization. This work aims to unify traditional scientific computing and Machine Learning in an efficient, data-economical, generalizable way, where both the physical system and the algorithm matter.
Optimal speed trajectory and energy management control for connected and automated vehicles
Connected and automated vehicles (CAVs) are emerging as a promising solution to improve urban mobility, safety, energy efficiency, and passenger comfort with the development of communication technologies, such as vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I). This thesis proposes several control approaches for CAVs with electric powertrains, including hybrid electric vehicles (HEVs) and battery electric vehicles (BEVs), with the main objective of improving energy efficiency by optimising the vehicle speed trajectory and the energy management system. By type of vehicle control, these methods can be categorised into three main scenarios: optimal energy management for a single CAV (single-vehicle), energy-optimal strategy for the vehicle following scenario (two-vehicle), and optimal autonomous intersection management for CAVs (multiple-vehicle).
The first part of this thesis is devoted to the optimal energy management for a single automated series HEV with consideration of engine start-stop system (SSS) under battery charge sustaining operation. A heuristic hysteresis power threshold strategy (HPTS) is proposed to optimise the fuel economy of an HEV with SSS and extra penalty fuel for engine restarts. By a systematic tuning process, the overall control performance of HPTS can be fully optimised for different vehicle parameters and driving cycles.
In the second part, two energy-optimal control strategies via a model predictive control (MPC) framework are proposed for the vehicle following problem. To forecast the behaviour of the preceding vehicle, a neural network predictor is incorporated into a nonlinear MPC method, whose fuel and computational efficiency are verified in numerical comparisons against a practical adaptive cruise control strategy and an impractical optimal control method. A robust MPC (RMPC) via linear matrix inequality (LMI) is also utilised to deal with the uncertainties existing in V2V communication and modelling errors. By conservative relaxation and approximation, the RMPC problem is formulated as a convex semi-definite program, and the simulation results demonstrate the robustness of the RMPC and the computational efficiency afforded by convex optimisation.
The final part focuses on the centralised and decentralised control frameworks at signal-free intersections, where the energy consumption and the crossing time of a group of CAVs are minimised. Their crossing order and velocity trajectories are optimised by convex second-order cone programs in a hierarchical scheme subject to safety constraints. It is shown that the centralised strategy with consideration of turning manoeuvres is effective and outperforms a benchmark solution invoking the widely used first-in-first-out policy. On the other hand, the decentralised method is proposed to further improve computational efficiency and enhance the system robustness via a tube-based RMPC. The numerical examples of both frameworks highlight the importance of examining the trade-off between energy consumption and travel time, as small compromises in travel time could produce significant energy savings.
Data- and expert-driven feature selection for predictive models in healthcare: towards increased interpretability in underdetermined machine learning problems
Modern data acquisition techniques in healthcare generate large collections of data from multiple sources, such as novel diagnosis and treatment methodologies. Some concrete examples are electronic healthcare record systems, genomics, and medical images. This leads to situations with often unstructured, high-dimensional heterogeneous patient cohort data where classical statistical methods may not be sufficient for optimal utilization of the data and informed decision-making. Instead, investigating such data structures with modern machine learning techniques promises to improve the understanding of patient health issues and may provide a better platform for informed decision-making by clinicians. Key requirements for this purpose include (a) sufficiently accurate predictions and (b) model interpretability. Achieving both aspects in parallel is difficult, particularly for datasets with few patients, which are common in the healthcare domain. In such cases, machine learning models encounter mathematically underdetermined systems and may overfit easily on the training data. An important approach to overcome this issue is feature selection, i.e., determining a subset of informative features from the original set of features with respect to the target variable. While potentially raising the predictive performance, feature selection fosters model interpretability by identifying a low number of relevant model parameters to better understand the underlying biological processes that lead to health issues.
Interpretability requires that feature selection is stable, i.e., small changes in the dataset do not lead to changes in the selected feature set. A concept to address instability is ensemble feature selection, i.e. the process of repeating the feature selection multiple times on subsets of samples of the original dataset and aggregating results in a meta-model. This thesis presents two approaches for ensemble feature selection, which are tailored towards high-dimensional data in healthcare: the Repeated Elastic Net Technique for feature selection (RENT) and the User-Guided Bayesian Framework for feature selection (UBayFS). While RENT is purely data-driven and builds upon elastic net regularized models, UBayFS is a general framework for ensembles with the capabilities to include expert knowledge in the feature selection process via prior weights and side constraints. A case study modeling the overall survival of cancer patients compares these novel feature selectors and demonstrates their potential in clinical practice.
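The ensemble idea described above (repeat a selector on subsamples, then aggregate) can be illustrated with a minimal sketch. It is not RENT or UBayFS: to stay dependency-free, the base selector here is a simple ranking by absolute Pearson correlation, swapped in for the elastic net models that RENT builds on, and the names `ensemble_select`, `top_k`, `subsample`, and `min_freq` are hypothetical, as is the toy dataset.

```python
import random

def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5

def ensemble_select(X, y, top_k, n_runs, subsample, min_freq, seed=0):
    """Repeat a simple selector on random subsamples and keep stable features.

    Assumption: the base selector ranks features by |correlation| with y.
    A feature is kept if it lands in the top_k in at least min_freq of runs.
    """
    rng = random.Random(seed)
    n_samples, n_features = len(X), len(X[0])
    counts = [0] * n_features
    size = max(2, int(subsample * n_samples))
    for _ in range(n_runs):
        rows = rng.sample(range(n_samples), size)  # subsample without replacement
        y_sub = [y[i] for i in rows]
        scores = [abs(pearson([X[i][j] for i in rows], y_sub))
                  for j in range(n_features)]
        ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
        for j in ranked[:top_k]:
            counts[j] += 1
    freqs = [c / n_runs for c in counts]
    selected = [j for j, f in enumerate(freqs) if f >= min_freq]
    return selected, freqs

# Toy data (made up): 40 samples, 6 features, only feature 0 drives the target.
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1) for _ in range(6)] for _ in range(40)]
y = [row[0] + 0.1 * data_rng.gauss(0, 1) for row in X]

selected, freqs = ensemble_select(X, y, top_k=2, n_runs=50,
                                  subsample=0.7, min_freq=0.9)
print("selection frequencies:", freqs)
print("stable features:", selected)
```

The selection frequencies act as the aggregated meta-model: features chosen in nearly every run are treated as stable, while features that enter the top ranks only occasionally are discarded.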
Beyond the selection of single features, UBayFS also allows for selecting whole feature groups (feature blocks) that were acquired from multiple data sources, as those mentioned above. Importance quantification of such feature blocks plays a key role in tracing information about the target variable back to the acquisition modalities. Such information on feature block importance may lead to positive effects on the use of human, technical, and financial resources if systematically integrated into the planning of patient treatment by excluding the acquisition of non-informative features. Since a generalization of feature importance measures to block importance is not trivial, this thesis also investigates and compares approaches for feature block importance rankings.
This thesis demonstrates that high-dimensional datasets from multiple data sources in the medical domain can be successfully tackled by the presented approaches for feature selection. Experimental evaluations demonstrate favorable properties in terms of predictive performance, stability, and interpretability of results, which carries a high potential for better data-driven decision support in clinical practice.
Pairwise versus mutual independence: visualisation, actuarial applications and central limit theorems
Accurately capturing the dependence between risks, if it exists, is an increasingly relevant topic of actuarial research. In recent years, several authors have started to relax the traditional 'independence assumption' in a variety of actuarial settings. While it is known that 'mutual independence' between random variables is not equivalent to their 'pairwise independence', this thesis aims to provide a better understanding of the materiality of this difference. The distinction between mutual and pairwise independence matters because, in practice, dependence is often assessed via pairs only, e.g., through correlation matrices, rank-based measures of association, scatterplot matrices, heat-maps, etc. Using such pairwise methods, it is possible to miss some forms of dependence. In this thesis, we explore from several angles how material the difference between pairwise and mutual independence is.
We provide relevant background and motivation for this thesis in Chapter 1, then conduct a literature review in Chapter 2.
In Chapter 3, we focus on visualising the difference between pairwise and mutual independence. To do so, we propose a series of theoretical examples (some of them new) where random variables are pairwise independent but (mutually) dependent, in short, PIBD. We then develop new visualisation tools and use them to illustrate what PIBD variables can look like. We showcase that the dependence involved is possibly very strong. We also use our visualisation tools to identify subtle forms of dependence, which would otherwise be hard to detect.
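A textbook illustration of pairwise independent but dependent variables (not necessarily one of the thesis's own examples) takes two independent fair bits X and Y and sets Z = X XOR Y. The sketch below enumerates the exact joint distribution and checks both notions of independence; the helper names `marginal` and `is_independent` are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Joint pmf of (X, Y, Z): X, Y independent fair bits, Z = X XOR Y.
joint = {}
for x, y in product((0, 1), repeat=2):
    joint[(x, y, x ^ y)] = Fraction(1, 4)

def marginal(indices):
    """Marginal pmf of the coordinates listed in `indices`."""
    out = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in indices)
        out[key] = out.get(key, Fraction(0)) + p
    return out

def is_independent(indices):
    """True if the joint pmf of the chosen coordinates factorises
    into the product of their single-variable marginals."""
    singles = [marginal((i,)) for i in indices]
    joint_sub = marginal(indices)
    for values in product((0, 1), repeat=len(indices)):
        expected = Fraction(1)
        for single, v in zip(singles, values):
            expected *= single.get((v,), Fraction(0))
        if joint_sub.get(values, Fraction(0)) != expected:
            return False
    return True

pairwise = all(is_independent(pair) for pair in [(0, 1), (0, 2), (1, 2)])
mutual = is_independent((0, 1, 2))
print("pairwise independent:", pairwise)  # True
print("mutually independent:", mutual)    # False
```

Every pair looks like two independent coin flips, so any purely pairwise diagnostic reports no dependence, yet Z is fully determined by X and Y: the outcome (0, 0, 1) has probability 0 rather than the 1/8 that mutual independence would require.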
In Chapter 4, we review common dependence models (such as elliptical distributions and Archimedean copulas) used in actuarial science and show that they do not allow for the possibility of PIBD data. We also investigate concrete consequences of the 'nonequivalence' between pairwise and mutual independence. We establish that many results which hold for mutually independent variables do not hold under pairwise independence alone. Those include results about finite sums of random variables, extreme value theory and bootstrap methods. This part thus illustrates what can potentially 'go wrong' if one assumes mutual independence where only pairwise independence holds.
Lastly, in Chapters 5 and 6, we investigate the question of what happens for PIBD variables 'in the limit', i.e., when the sample size goes to infinity. We want to see if the 'problems' caused by dependence vanish for sufficiently large samples. This is a broad question, and we concentrate on the important classical Central Limit Theorem (CLT), for which we find that the answer is largely negative. In particular, we construct new sequences of PIBD variables (with arbitrary margins) for which a CLT does not hold. We derive explicitly the asymptotic distribution of the standardised mean of our sequences, which allows us to illustrate the extent of the 'failure' of a CLT for PIBD variables. We also propose a general methodology to construct dependent K-tuplewise independent (K an arbitrary integer) sequences of random variables with arbitrary margins. In the case K = 3, we use this methodology to derive explicit examples of triplewise independent sequences for which no CLT holds. Those results illustrate that mutual independence is a crucial assumption within CLTs, and that having larger samples is not always a viable solution to the problem of non-independent data.
Robots That Ask For Help: Uncertainty Alignment for Large Language Model Planners
Large language models (LLMs) exhibit a wide range of promising capabilities
-- from step-by-step planning to commonsense reasoning -- that may provide
utility for robots, but remain prone to confidently hallucinated predictions.
In this work, we present KnowNo, which is a framework for measuring and
aligning the uncertainty of LLM-based planners such that they know when they
don't know and ask for help when needed. KnowNo builds on the theory of
conformal prediction to provide statistical guarantees on task completion while
minimizing human help in complex multi-step planning settings. Experiments
across a variety of simulated and real robot setups that involve tasks with
different modes of ambiguity (e.g., from spatial to numeric uncertainties, from
human preferences to Winograd schemas) show that KnowNo performs favorably over
modern baselines (which may involve ensembles or extensive prompt tuning) in
terms of improving efficiency and autonomy, while providing formal assurances.
KnowNo can be used with LLMs out of the box without model-finetuning, and
suggests a promising lightweight approach to modeling uncertainty that can
complement and scale with the growing capabilities of foundation models.
Website: https://robot-help.github.io
Comment: Conference on Robot Learning (CoRL) 2023, Oral Presentation
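The "ask for help when needed" rule can be illustrated with a generic split conformal prediction sketch for a single multiple-choice planning step. This is not the KnowNo implementation, which addresses multi-step planning; the calibration scores, option probabilities, and the function names `conformal_threshold`, `prediction_set`, and `plan_or_ask` are all made up for illustration.

```python
import math

def conformal_threshold(cal_scores, epsilon):
    """Split conformal quantile of nonconformity scores.

    cal_scores: one score per calibration example, here assumed to be
    1 - (model probability of the correct option).
    epsilon: target error rate; the set should contain the correct
    option with probability at least 1 - epsilon.
    """
    n = len(cal_scores)
    rank = math.ceil((n + 1) * (1 - epsilon))
    if rank > n:
        return 1.0  # too few calibration points: keep every option
    return sorted(cal_scores)[rank - 1]

def prediction_set(probs, qhat):
    """Indices of options whose nonconformity score is within the threshold."""
    return [i for i, p in enumerate(probs) if 1.0 - p <= qhat]

def plan_or_ask(probs, qhat):
    """Act if exactly one option survives; otherwise ask a human.
    An empty set is also treated as a request for help."""
    options = prediction_set(probs, qhat)
    if len(options) == 1:
        return ("act", options[0])
    return ("ask", options)

# Hypothetical calibration scores from 7 held-out tasks, epsilon = 0.25.
cal_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]
qhat = conformal_threshold(cal_scores, epsilon=0.25)

confident = plan_or_ask([0.7, 0.2, 0.1], qhat)     # one clear option
ambiguous = plan_or_ask([0.45, 0.45, 0.10], qhat)  # two plausible options
print(qhat, confident, ambiguous)
```

A singleton set lets the planner act autonomously, while a larger set signals ambiguity and triggers a clarification request. The coverage guarantee relies on the calibration and test examples being exchangeable.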
Differentially Private Synthetic Heavy-tailed Data
The U.S. Census Longitudinal Business Database (LBD) product contains
employment and payroll information of all U.S. establishments and firms dating
back to 1976 and is an invaluable resource for economic research. However, the
sensitive information in LBD requires confidentiality measures that the U.S.
Census in part addressed by releasing a synthetic version (SynLBD) of the data
to protect firms' privacy while ensuring its usability for research activities,
but without provable privacy guarantees. In this paper, we propose using the
framework of differential privacy (DP) that offers strong provable privacy
protection against arbitrary adversaries to generate synthetic heavy-tailed
data with a formal privacy guarantee while preserving high levels of utility.
We propose using the K-Norm Gradient Mechanism (KNG) with quantile regression
for DP synthetic data generation. The proposed methodology offers the
flexibility of the well-known exponential mechanism while adding less noise. We
propose implementing KNG in a stepwise and sandwich order, such that new
quantile estimation relies on previously sampled quantiles, to more efficiently
use the privacy-loss budget. Generating synthetic heavy-tailed data with a
formal privacy guarantee while preserving high levels of utility is a
challenging problem for data curators and researchers. However, we show that
the proposed methods can achieve better data utility relative to the original
KNG at the same privacy-loss budget through a simulation study and an
application to the Synthetic Longitudinal Business Database.