Energy storage design and integration in power systems by system-value optimization
Energy storage can play a crucial role in decarbonising power systems by balancing
power and energy over time. Wider power system benefits arising from these
balancing technologies include reductions in grid expansion, renewable curtailment, and
average electricity costs. However, with the proliferation of new energy storage
technologies, it becomes increasingly difficult to identify which technologies are
economically viable and how to design and integrate them effectively.
Using large-scale energy system models in Europe, the dissertation shows that solely
relying on Levelized Cost of Storage (LCOS) metrics for technology assessments can
be misleading and that traditional system-value methods raise important questions about
how to assess multiple energy storage technologies. Further, the work introduces a
new complementary system-value assessment method called the market-potential
method, which provides a systematic deployment analysis for assessing multiple
storage technologies under competition. However, integrating energy storage in
system models can lead to the unintended storage cycling effect, which occurs in
approximately two-thirds of models and significantly distorts results. The thesis
finds that traditional approaches to deal with the issue, such as multi-stage optimization
or mixed integer linear programming approaches, are either ineffective
or computationally inefficient. A new approach is suggested that only requires
appropriate model parameterization with variable costs while keeping the model
convex to reduce the risk of misleading results.
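The LCOS metric the paragraph above cautions against can be made concrete. A minimal sketch (function and parameter names are illustrative, not the thesis's model): LCOS divides the discounted lifetime costs of a storage asset by its discounted lifetime energy discharged.

```python
def lcos(capex, annual_opex, annual_charge_cost, annual_discharge_mwh,
         lifetime_years, discount_rate):
    """Levelized Cost of Storage (currency/MWh): discounted lifetime
    costs divided by discounted MWh discharged over the asset's life."""
    costs = capex          # upfront investment in year 0
    energy = 0.0
    for t in range(1, lifetime_years + 1):
        df = (1 + discount_rate) ** t          # discount factor
        costs += (annual_opex + annual_charge_cost) / df
        energy += annual_discharge_mwh / df
    return costs / energy
```

Because the metric ignores when and where the storage is useful to the system, two technologies with equal LCOS can have very different system value, which is the gap the market-potential method addresses.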
In addition, to enable energy storage assessments and energy system research around
the world, the thesis extended the geographical scope of an existing European open-source
model to global coverage. The newly built energy system model ‘PyPSA-Earth’
is thereby demonstrated and validated in Africa. Using PyPSA-Earth, the thesis
assesses for the first time the system value of 20 energy storage technologies across
multiple scenarios in a representative future power system in Africa. The results offer
insights into approaches for assessing multiple energy storage technologies under
competition in large-scale energy system models. In particular, the dissertation
addresses extreme cost uncertainty through a comprehensive scenario tree and finds
that, apart from lithium and hydrogen, only seven energy storage technologies are
optimization-relevant. The work also discovers that a heterogeneous storage design
can increase power system benefits and that some energy storage technologies are more
important than others. Finally, in contrast to traditional methods that only consider
a single energy storage technology, the thesis finds that optimizing multiple energy
storage options can significantly reduce total system costs, by up to 29%.
The presented research findings have the potential to inform decision-making processes
for the sizing, integration, and deployment of energy storage systems in
decarbonized power systems, contributing to a paradigm shift in scientific methodology
and advancing efforts towards a sustainable future.
Advancing Time-Dependent Earthquake Risk Modelling
Catastrophe (CAT) risk models are commonly used in the (re)insurance industry and by public organizations to estimate potential losses due to natural hazards like earthquakes. Conventional earthquake risk modelling involves several significant modelling assumptions, which mainly neglect: (a) the interaction between adjacent faults; (b) the long-term elastic-rebound behaviour of faults; (c) the short-term hazard increase associated with aftershocks; and (d) the damage accumulation in building assets that results from the occurrence of multiple earthquakes in a short time window. Several recent earthquake events/sequences (e.g., 2010/2011 Canterbury earthquakes, New Zealand; 2019 Ridgecrest earthquakes, USA; 2023 Turkey-Syria earthquakes) have exposed the oversimplification of these assumptions and the need for earthquake risk models to start accounting for the short- and long-term time-dependent characteristics of earthquake risk. This thesis introduces an end-to-end framework for time-dependent earthquake risk modelling that incorporates (a) advancements in long-term time-dependent fault and aftershock modelling in the hazard component of the risk modelling framework; and (b) vulnerability models that account for the damage accumulation due to multiple ground motions occurring in a short period of time. The long-term time-dependent fault model used incorporates the elastic-rebound-motivated methodologies of the latest Uniform California Earthquake Rupture Forecast (UCERF3) and explicitly accounts for fault-interaction triggering between major known faults. The Epidemic-Type Aftershock Sequence (ETAS) model is used to simulate aftershocks, representing the short-term hazard increase observed after large mainshocks. Damage-dependent fragility and vulnerability models are then used to account for damage accumulation.
Sensitivity analyses of direct economic losses to these time dependencies are also conducted, providing valuable guidance on integrating time dependencies in earthquake risk modelling.
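The short-term hazard increase that ETAS captures can be sketched with its standard conditional intensity: a background rate plus Omori-law contributions from all prior events, scaled exponentially by magnitude. The parameter values below are placeholders, not calibrated values from the thesis.

```python
import math

def etas_intensity(t, catalog, mu, K, alpha, c, p, m_ref):
    """Standard ETAS conditional intensity at time t (events per day):
    background rate mu plus, for every earlier event (t_i, M_i), an
    Omori-type decay K * (t - t_i + c)^(-p), amplified by
    exp(alpha * (M_i - m_ref)) for larger-magnitude triggers."""
    rate = mu
    for t_i, m_i in catalog:
        if t_i < t:
            rate += K * math.exp(alpha * (m_i - m_ref)) * (t - t_i + c) ** (-p)
    return rate
```

Simulating a sequence then amounts to thinning a Poisson process with this time-varying rate; a large mainshock in `catalog` raises the intensity sharply and the excess decays as a power law, which is exactly the short-term effect the risk framework feeds into the hazard component.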
Streamlining Knowledge Graph Construction with a façade: The SPARQL Anything project
What should a data integration framework for knowledge engineers look like?
Recent research on Knowledge Graph construction proposes the design of a
façade, a notion borrowed from object-oriented software engineering. This
idea is applied to SPARQL Anything, a system that allows querying heterogeneous
resources as if they were in RDF, in plain SPARQL 1.1, by overloading the
SERVICE clause. SPARQL Anything supports a wide variety of file formats, from
popular ones (CSV, JSON, XML, Spreadsheets) to others that are not supported by
alternative solutions (Markdown, YAML, DOCX, BibTeX). Features include querying
Web APIs with high flexibility, parametrised queries, and chaining multiple
transformations into complex pipelines. In this paper, we describe the design
rationale and software architecture of the SPARQL Anything system. We provide
references to an extensive set of reusable, real-world scenarios from various
application domains. We report on the value to users of the founding
assumptions of its design, compared to alternative solutions, through a
community survey and a field report from industry.
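The SERVICE overloading described above can be illustrated with a minimal query. The `x-sparql-anything:` IRI scheme, the `location` parameter, and the `xyz:` data prefix follow the SPARQL Anything conventions; the file name `people.json` and the `name` key are hypothetical.

```sparql
PREFIX xyz: <http://sparql.xyz/facade-x/data/>

SELECT ?name
WHERE {
  # The SERVICE IRI tells SPARQL Anything which non-RDF
  # resource to expose as-if it were RDF (façade-x model).
  SERVICE <x-sparql-anything:location=./people.json> {
    ?person xyz:name ?name .
  }
}
```

The same query shape works across the supported formats, since each is mapped onto the uniform façade-x container model before the triple patterns are evaluated.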
Understanding and Quantifying Phosphorus Transport from Septic Systems to Lake Auburn
Rural areas often use septic systems to treat household wastewater, which may pose a phosphorus (P) loading risk to nearby water bodies if systems fail or if the soil types are unsuitable for P retention. In the Lake Auburn watershed, septic systems may be a source of phosphorus loading to Lake Auburn, an unfiltered drinking water supply. Site evaluations from municipal permits reveal patterns of septic system locations and soil types in septic drain fields. Many septic drain fields have shallow depths to groundwater or a restrictive layer, which may lead to inadequate P retention in the soil. Areas with a high density of septic systems near the two largest inlets to the lake may be P loading hotspots. Near the Basin inlet, shallow soil depths and proximity to the lake suggest that failing systems may be a source of P loading, and that there may not be robust P removal in these drain fields. A high-density cluster of systems near Townsend Brook is located on a sand and gravel aquifer, with coarse sands that have less capacity to retain P than finer-textured soils. The creation of a model that simulates 200 years (1900 to 2100) of septic system operation demonstrates that septic systems may create a legacy P issue, because P loading was estimated to increase even after new development ceased. The model shows that policy changes may be able to decrease the septic system P load, but that the impact of such changes on P loading estimates may not be substantial for decades. In Auburn, such policy changes also have land use implications that would increase watershed sources of P loading through additional deforestation, impervious surface, and septic systems from new development.
Other studies on Lake Auburn demonstrate that land-based P loading from new development may be higher than the P load reductions from improved wastewater treatment estimated in this study, suggesting that any change to septic system policy that does not also restrict development where it has previously been restricted is likely to lead to a net increase in the cumulative P load.
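The legacy-P mechanism, loading that keeps rising after development stops, can be reproduced with a toy simulation (all numbers hypothetical, not the study's calibrated model): each drain field retains P until its finite soil sorption capacity is exhausted, after which the full load passes through.

```python
def simulate(years, build_until, p_in=0.5, capacity=10.0):
    """Toy legacy-phosphorus model. One new septic system is built per
    year until `build_until`; each discharges p_in kg P/yr, which the
    drain-field soil retains until `capacity` kg is stored, after
    which the load is exported. Returns annual P export (kg/yr)."""
    stored = []            # kg P retained so far in each drain field
    exports = []
    for year in range(years):
        if year < build_until:
            stored.append(0.0)         # one new system this year
        export = 0.0
        for i, s in enumerate(stored):
            retained_now = min(p_in, capacity - s)
            stored[i] = s + retained_now
            export += p_in - retained_now   # overflow reaches the lake
        exports.append(export)
    return exports
```

With these placeholder values each system saturates after 20 years, so the simulated export keeps climbing for two decades after construction ceases, the qualitative legacy effect the abstract describes.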
An uncertainty prediction approach for active learning - application to earth observation
Mapping land cover and land-use dynamics is crucial in remote sensing, since farmers
are encouraged to either intensify or extend crop use due to the ongoing rise in the world’s
population. A major issue in this area is interpreting and classifying a scene captured in
high-resolution satellite imagery. Several methods have been put forth, including neural
networks, which generate data-dependent models (i.e. the model is biased toward the data),
and static rule-based approaches with thresholds, which are limited in terms of diversity
(i.e. the model lacks diversity in terms of rules). However, the problem of having a machine
learning model that, given a large amount of training data, can classify multiple classes over
Sentinel-2 imagery from different geographic areas while outperforming existing approaches remains open.
On the other hand, supervised machine learning has evolved into an essential part of many
areas due to the increasing number of labeled datasets. Examples include creating classifiers
for applications that recognize images and voices, anticipate traffic, propose products, act
as a virtual personal assistant, and detect online fraud, among many more. Since these
classifiers are highly dependent on the training datasets, their performance on unseen
observations is uncertain without human interaction or accurate labels. Thus,
researchers have attempted to evaluate a number of independent models
using a statistical distance. However, the problem of, given a train-test split and classifiers
modeled over the train set, identifying a prediction error using the relation between train
and test sets remains open.
Moreover, while some training data is essential for supervised machine learning, what
happens if there is insufficient labeled data? After all, assigning labels to unlabeled datasets
is a time-consuming process that may need significant expert human involvement. When
there are not enough expert manual labels available for the vast amount of openly accessible
data, active learning becomes crucial. However, given a large amount of training and
unlabeled datasets, having an active learning model that can reduce the training cost of
the classifier and at the same time assist in labeling new data points remains an open
problem.
From the experimental approaches and findings, the main research contributions, which
concentrate on the issue of optical satellite image scene classification, include: building
labeled Sentinel-2 datasets with surface reflectance values; proposal of machine learning
models for pixel-based image scene classification; proposal of a statistical distance based
Evidence Function Model (EFM) to detect ML model misclassification; and proposal of
a generalised sampling approach for active learning that, together with the EFM enables
a way of determining the most informative examples.
Firstly, using a manually annotated Sentinel-2 dataset, Machine Learning (ML) models
for scene classification were developed, and their performance was compared to that of
Sen2Cor – the reference package from the European Space Agency. A micro-F1 value of 84%
was attained by the ML model, a significant improvement over the corresponding
Sen2Cor performance of 59%. Secondly, to quantify the misclassification of the ML models,
the Mahalanobis distance-based EFM was devised. This model achieved, for the labeled
Sentinel-2 dataset, a micro-F1 of 67.89% for misclassification detection. Lastly, EFM was
engineered as a sampling strategy for active learning leading to an approach that attains
the same level of accuracy with only 0.02% of the total training samples when compared
to a classifier trained with the full training set.
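The idea behind the Mahalanobis-distance-based EFM can be sketched in a few lines (an illustration of the principle, not the thesis's implementation): a prediction is flagged as a likely misclassification when the sample lies too far from the predicted class's training distribution. Function names and the threshold are assumptions.

```python
import math

def mahalanobis_2d(x, mean, cov):
    """Mahalanobis distance of a 2-D point x from a class distribution
    given by its mean and 2x2 covariance matrix."""
    dx = (x[0] - mean[0], x[1] - mean[1])
    (a, b), (c, d) = cov
    det = a * d - b * c                      # invert the 2x2 covariance
    inv = ((d / det, -b / det), (-c / det, a / det))
    q = (dx[0] * (inv[0][0] * dx[0] + inv[0][1] * dx[1])
         + dx[1] * (inv[1][0] * dx[0] + inv[1][1] * dx[1]))
    return math.sqrt(q)

def flag_suspect(x, predicted_class_stats, threshold=3.0):
    """Evidence-function sketch: treat the prediction for x as suspect
    when x is farther than `threshold` (in Mahalanobis units) from the
    training distribution of the class the model predicted."""
    mean, cov = predicted_class_stats
    return mahalanobis_2d(x, mean, cov) > threshold
```

In practice the class means and covariances come from the training split, so the statistic measures exactly the train-test relation the abstract mentions: samples unlike anything the classifier was trained on are the ones whose predictions it cannot be trusted with.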
With the help of the above-mentioned research contributions, we were able to provide
an open-source Sentinel-2 image scene classification package, which consists of ready-to-use
Python scripts and an ML model that classifies Sentinel-2 L1C images, generating a
20m-resolution RGB image with the six studied classes (Cloud, Cirrus, Shadow, Snow,
Water, and Other) and giving academics a straightforward method for rapidly and effectively
classifying Sentinel-2 scene images. Additionally, an active learning approach that uses, as
its sampling strategy, the observed prediction uncertainty given by the EFM will allow labeling
only the most informative points to be used as input to build classifiers.
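The sampling step of that active learning loop reduces to ranking the unlabeled pool by an uncertainty score and handing the top candidates to an expert for labeling. A minimal sketch (function names assumed; the score would come from an EFM-style distance):

```python
def select_most_informative(unlabeled, uncertainty, budget):
    """Uncertainty-driven sampling sketch for active learning: rank the
    unlabeled pool by `uncertainty` (higher = less trustworthy
    prediction) and return the `budget` most informative points, to be
    labeled by an expert and added to the training set."""
    ranked = sorted(unlabeled, key=uncertainty, reverse=True)
    return ranked[:budget]
```

Repeating this select-label-retrain cycle is what lets a classifier approach full-training-set accuracy while labeling only a tiny fraction of the samples, as in the 0.02% result reported above.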
Knowledge Graph Building Blocks: An easy-to-use Framework for developing FAIREr Knowledge Graphs
Knowledge graphs and ontologies provide promising technical solutions for
implementing the FAIR Principles for Findable, Accessible, Interoperable, and
Reusable data and metadata. However, they also come with their own challenges.
Nine such challenges are discussed and associated with the criterion of
cognitive interoperability and specific FAIREr principles (FAIR + Explorability
raised) that they fail to meet. We introduce an easy-to-use, open source
knowledge graph framework that is based on knowledge graph building blocks
(KGBBs). KGBBs are small information modules for knowledge-processing, each
based on a specific type of semantic unit. By interrelating several KGBBs, one
can specify a KGBB-driven FAIREr knowledge graph. Besides implementing semantic
units, the KGBB Framework clearly distinguishes and decouples an internal
in-memory data model from data storage, data display, and data access/export
models. We argue that this decoupling is essential for solving many problems of
knowledge management systems. We discuss the architecture of the KGBB Framework
as we envision it, comprising (i) an openly accessible KGBB-Repository for
different types of KGBBs, (ii) a KGBB-Engine for managing and operating FAIREr
knowledge graphs (including automatic provenance tracking, editing changelog,
and versioning of semantic units); (iii) a repository for KGBB-Functions; (iv)
a low-code KGBB-Editor with which domain experts can create new KGBBs and
specify their own FAIREr knowledge graph without having to think about semantic
modelling. We conclude with discussing the nine challenges and how the KGBB
Framework provides solutions for the issues they raise. While most of what we
discuss here is entirely conceptual, we can point to two prototypes that
demonstrate the feasibility in principle of using semantic units and KGBBs to
manage and structure knowledge graphs.
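The decoupling the paper argues for, an internal in-memory model kept separate from display and export models, can be illustrated with a toy building block (purely illustrative; class and method names are assumptions, not the KGBB Framework's API):

```python
class KGBB:
    """Toy knowledge-graph building block: one semantic-unit type whose
    internal in-memory statement store is decoupled from the models
    used for display and for export/serialization."""

    def __init__(self, unit_type):
        self.unit_type = unit_type
        self._statements = []          # internal in-memory data model

    def add(self, subject, predicate, obj):
        self._statements.append((subject, predicate, obj))

    def to_display(self):
        # display model: human-readable sentences for domain experts
        return [f"{s} {p} {o}." for s, p, o in self._statements]

    def to_ntriples(self):
        # export model: an N-Triples-like serialization for machines
        return "\n".join(f"<{s}> <{p}> <{o}> ."
                         for s, p, o in self._statements)
```

Because storage, display, and export are separate methods over one internal model, each can evolve (new serializations, new UIs) without touching the others, which is the essence of the decoupling argument.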
Algorithms for Triangles, Cones & Peaks
Three different geometric objects are at the center of this dissertation: triangles, cones and peaks.
In computational geometry, triangles are the most basic shape for planar subdivisions.
In particular, Delaunay triangulations are widely used for manifold applications in engineering, geographic information systems, telecommunication networks, etc.
We present two novel parallel algorithms to construct the Delaunay triangulation of a given point set.
Yao graphs are geometric spanners that connect each point of a given set to its nearest neighbor in each of a fixed number of cones drawn around it.
They are used to aid the construction of Euclidean minimum spanning trees
or in wireless networks for topology control and routing.
We present the first implementation of an optimal sweepline algorithm to construct Yao graphs.
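The definition above can be made concrete with a brute-force construction (illustrating the definition only; the dissertation's contribution is the optimal sweepline algorithm, which this sketch does not implement):

```python
import math

def yao_graph(points, k=6):
    """Naive O(k * n^2) Yao graph: partition the plane around each
    point into k equal-angle cones and connect the point to its
    nearest neighbour inside each non-empty cone."""
    edges = []
    for i, (px, py) in enumerate(points):
        best = [None] * k                  # (distance, index) per cone
        for j, (qx, qy) in enumerate(points):
            if i == j:
                continue
            angle = math.atan2(qy - py, qx - px) % (2 * math.pi)
            cone = int(angle / (2 * math.pi / k))
            d = math.hypot(qx - px, qy - py)
            if best[cone] is None or d < best[cone][0]:
                best[cone] = (d, j)
        for b in best:
            if b is not None:
                edges.append((i, b[1]))    # directed edge i -> neighbour
    return edges
```

The quadratic scan over all point pairs is what the sweepline algorithm avoids, which matters for the large point sets arising in minimum-spanning-tree construction and wireless topology control.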
One metric to quantify the importance of a mountain peak is its isolation.
Isolation measures the distance between a peak and the closest point of higher elevation.
Computing this metric from high-resolution digital elevation models (DEMs) requires efficient algorithms.
We present a novel sweep-plane algorithm that can calculate the isolation of all peaks on Earth in mere minutes.
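The isolation metric itself is simple to state, as a brute-force baseline on a toy DEM grid shows (definition only; planet-scale DEMs need the sweep-plane algorithm, since this version is quadratic in the number of cells):

```python
import math

def isolation(dem, cell_size=1.0):
    """Naive isolation on a toy DEM grid: for every cell, the distance
    to the closest cell of strictly higher elevation, or None for the
    global maximum. O(n^2) in the number of cells."""
    rows, cols = len(dem), len(dem[0])
    result = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            best = None
            for r2 in range(rows):
                for c2 in range(cols):
                    if dem[r2][c2] > dem[r][c]:
                        d = math.hypot(r2 - r, c2 - c) * cell_size
                        if best is None or d < best:
                            best = d
            result[r][c] = best
    return result
```

High-resolution global DEMs have billions of cells, so the gap between this baseline and an efficient sweep over elevation is the difference between infeasible and minutes.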