Ensuring Access to Safe and Nutritious Food for All Through the Transformation of Food Systems
Self-Supervised Learning to Prove Equivalence Between Straight-Line Programs via Rewrite Rules
We target the problem of automatically synthesizing proofs of semantic
equivalence between two programs made of sequences of statements. We represent
programs using abstract syntax trees (AST), where a given set of
semantics-preserving rewrite rules can be applied on a specific AST pattern to
generate a transformed and semantically equivalent program. In our system, two
programs are equivalent if there exists a sequence of applications of these
rewrite rules that leads to rewriting one program into the other. We propose a
neural network architecture based on a transformer model to generate proofs of
equivalence between program pairs. The system outputs a sequence of rewrites,
and the validity of the sequence is simply checked by verifying it can be
applied. If no valid sequence is produced by the neural network, the system
reports the programs as non-equivalent, ensuring by design no programs may be
incorrectly reported as equivalent. Our system is fully implemented for a given
grammar which can represent straight-line programs with function calls and
multiple types. To efficiently train the system to generate such sequences, we
develop an original incremental training technique, named self-supervised
sample selection. We extensively study the effectiveness of this novel training
approach on proofs of increasing complexity and length. Our system, S4Eq,
achieves 97% proof success on a curated dataset of 10,000 pairs of equivalent
programs.
Model Diagnostics meets Forecast Evaluation: Goodness-of-Fit, Calibration, and Related Topics
Principled forecast evaluation and model diagnostics are vital in fitting probabilistic models and forecasting outcomes of interest. A common principle is that fitted or predicted distributions ought to be calibrated, ideally in the sense that the outcome is indistinguishable from a random draw from the posited distribution. Much of this thesis is centered on calibration properties of various types of forecasts.
In the first part of the thesis, a simple algorithm for exact multinomial goodness-of-fit tests is proposed. The algorithm computes exact p-values based on various test statistics, such as the log-likelihood ratio and Pearson's chi-square. A thorough analysis shows improvement on extant methods. However, the runtime of the algorithm grows exponentially in the number of categories and hence its use is limited.
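To make the notion of an exact multinomial p-value concrete, the following brute-force sketch enumerates every possible count vector and sums the null probabilities of those whose statistic is at least as large as the observed one. It is not the algorithm proposed in the thesis, only a naive baseline; the function names are hypothetical and all null probabilities are assumed to be strictly positive.

```python
from itertools import combinations
from math import exp, lgamma, log

def compositions(n, k):
    """Yield every tuple of k non-negative integers that sums to n."""
    # Stars and bars: choose k - 1 bar positions among n + k - 1 slots.
    for bars in combinations(range(n + k - 1), k - 1):
        prev = -1
        counts = []
        for b in bars:
            counts.append(b - prev - 1)
            prev = b
        counts.append(n + k - 1 - prev - 1)
        yield tuple(counts)

def multinomial_pmf(counts, probs):
    """Multinomial probability of `counts`, computed on the log scale."""
    logp = lgamma(sum(counts) + 1)
    for x, p in zip(counts, probs):
        logp -= lgamma(x + 1)
        if x > 0:
            logp += x * log(p)
    return exp(logp)

def pearson_chi2(counts, probs):
    """Pearson's chi-square statistic against the null probabilities."""
    n = sum(counts)
    return sum((x - n * p) ** 2 / (n * p) for x, p in zip(counts, probs))

def exact_p_value(observed, probs, statistic=pearson_chi2):
    """Sum null probabilities of all outcomes at least as extreme as `observed`."""
    n, k = sum(observed), len(observed)
    t_obs = statistic(observed, probs)
    tol = 1e-12  # guards against floating-point ties
    return sum(
        multinomial_pmf(c, probs)
        for c in compositions(n, k)
        if statistic(c, probs) >= t_obs - tol
    )

print(exact_p_value((4, 0), (0.5, 0.5)))  # approximately 0.125
```

The number of count vectors enumerated is the binomial coefficient C(n + k - 1, k - 1), which is why full enumeration quickly becomes impractical as the number of categories k grows.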
In the second part, a framework rooted in probability theory is developed, which gives rise to hierarchies of calibration, and applies to both predictive distributions and stand-alone point forecasts. Based on a general notion of conditional T-calibration, the thesis introduces population versions of T-reliability diagrams and revisits a score decomposition into measures of miscalibration, discrimination, and uncertainty. Stable and efficient estimators of T-reliability diagrams and score components arise via nonparametric isotonic regression and the pool-adjacent-violators algorithm. For in-sample model diagnostics, a universal coefficient of determination is introduced that nests and reinterprets the classical R² in least squares regression.
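The pool-adjacent-violators algorithm mentioned above can be sketched in a few lines. This is a generic textbook version for the mean functional, not code from the thesis; the function name `pav` and the optional weights argument are assumptions. The input `y` is taken to be the outcomes ordered by increasing forecast value.

```python
def pav(y, w=None):
    """Return the non-decreasing weighted least-squares fit to y."""
    if w is None:
        w = [1.0] * len(y)
    blocks = []  # each block holds [mean, total weight, number of points]
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Merge backwards while adjacent blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / wt, wt, c1 + c2])
    fit = []
    for mean, _, count in blocks:
        fit.extend([mean] * count)
    return fit

print(pav([1, 3, 2, 4]))     # [1.0, 2.5, 2.5, 4.0]
# Binary outcomes sorted by forecast probability: the fitted values estimate
# the conditional event probability, i.e. the curve of a reliability diagram.
print(pav([0, 1, 0, 1, 1]))  # [0.0, 0.5, 0.5, 1.0, 1.0]
```

Plotting the fitted values against the sorted forecasts would give an estimated reliability curve under these assumptions; a calibrated forecast lies close to the diagonal.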
In the third part, probabilistic top lists are proposed as a novel type of prediction in classification, which bridges the gap between single-class predictions and predictive distributions. The probabilistic top list functional is elicited by strictly consistent evaluation metrics, based on symmetric proper scoring rules, which admit comparison of various types of predictions.
Machine Learning Research Trends in Africa: A 30 Years Overview with Bibliometric Analysis Review
In this paper, a critical bibliometric analysis study is conducted, coupled
with an extensive literature survey on recent developments and associated
applications in machine learning research with a perspective on Africa. The
presented bibliometric analysis study consists of 2761 machine learning-related
documents, of which 98% were articles with at least 482 citations published in
903 journals during the past 30 years. Furthermore, the collated documents were
retrieved from the Science Citation Index EXPANDED, comprising research
publications from 54 African countries between 1993 and 2021. The bibliometric
study shows the visualization of the current landscape and future trends in
machine learning research and its application to facilitate future
collaborative research and knowledge exchange among authors from different
research institutions scattered across the African continent.
Discovering the hidden structure of financial markets through Bayesian modelling
Understanding what is driving the price of a financial asset is a question that is currently mostly unanswered. In this work we go beyond the classic one step ahead prediction and instead construct models that create new information on the behaviour of these time series. Our aim is to get a better understanding of the hidden structures that drive the moves of each financial time series and thus the market as a whole.
We propose a tool to decompose multiple time series into economically-meaningful variables to explain the endogenous and exogenous factors driving their underlying variability. The methodology we introduce goes beyond the direct model forecast. Indeed, since our model continuously adapts its variables and coefficients, we can study the time series of coefficients and selected variables. We also present a model to construct the causal graph of relations between these time series and include them in the exogenous factors.
Hence, we obtain a model able to explain what is driving the moves of both each specific time series and the market as a whole. In addition, the obtained graph of the time series provides new information on the underlying risk structure of this environment. With this deeper understanding of the hidden structure we propose novel ways to detect and forecast risks in the market. We investigate our results with inferences up to one month into the future using stocks, FX futures and ETF futures, demonstrating the model's superior performance in terms of accuracy on large moves, longer-term prediction and consistency over time. We also go into more detail on the economic interpretation of the new variables and discuss the resulting graph structure of the market.
Statistical-dynamical analyses and modelling of multi-scale ocean variability
This thesis aims to provide a comprehensive analysis of multi-scale oceanic variabilities using various statistical and dynamical tools and explore the data-driven methods for correct statistical emulation of the oceans. We considered the classical, wind-driven, double-gyre ocean circulation model in quasi-geostrophic approximation and obtained its eddy-resolving solutions in terms of potential vorticity anomaly and geostrophic streamfunctions. The reference solutions possess two asymmetric gyres of opposite circulations and a strong meandering eastward jet separating them with rich eddy activities around it, such as the Gulf Stream in the North Atlantic and Kuroshio in the North Pacific.
This thesis is divided into two parts. The first part discusses a novel scale-separation method based on the local spatial correlations, called correlation-based decomposition (CBD), and provides a comprehensive analysis of mesoscale eddy forcing. In particular, we analyse the instantaneous and time-lagged interactions between the diagnosed eddy forcing and the evolving large-scale potential vorticity anomaly (PVA) using the novel 'product integral' characteristics. The product integral time series uncover robust causality between two drastically different yet interacting flow quantities, termed 'eddy backscatter'. We also show data-driven augmentation of non-eddy-resolving ocean models by feeding them the eddy fields to restore the missing eddy-driven features, such as the merging western boundary currents, their eastward extension and low-frequency variabilities of gyres.
In the second part, we present a systematic inter-comparison of Linear Regression (LR), stochastic and deep-learning methods to build low-cost reduced-order statistical emulators of the oceans. We obtain the forecasts on seasonal and centennial timescales and assess them for their skill, cost and complexity. We found that the multi-level linear stochastic model performs the best, followed by the "hybrid stochastically-augmented deep learning models". The superiority of these methods underscores the importance of incorporating core dynamics, memory effects and model errors for robust emulation of multi-scale dynamical systems, such as the oceans.
Annals [...].
Pedometrics: innovation in tropics; Legacy data: how turn it useful?; Advances in soil sensing; Pedometric guidelines to systematic soil surveys. Online event. Coordinated by: Waldir de Carvalho Junior, Helena Saraiva Koenow Pinheiro, Ricardo Simão Diniz Dalmolin
Knowledge-based artificial neural network modeling assessment: integrating heterogeneous genomics data to uncover lifespan regulation
Biological analytics and more advanced data analysis techniques have made remarkable advancements as the area of machine learning continues to grow. More specifically, genetic modeling and neural network building are gaining interest as they become a fundamental piece of most model building we see today. We propose a Knowledge-Based Artificial Neural Network (KBANN) to predict phenotype while providing insight into affected subsystems. Within KBANN, the input layers are a single Gene Ontology (GO) term or a group of GO terms, while each layer’s input is a single number between 0 and 1, explaining how expressed the given term is. The expression number provides an average of the number of copies that a gene is producing at its current age compared to the average over its entire lifespan. Preliminary results show that the KBANN model can potentially be used to predict lifespan phenotype using the Genotype-Tissue Expression data.
How to Be a God
When it comes to questions concerning the nature of Reality, Philosophers and Theologians have the answers.
Philosophers have the answers that can’t be proven right. Theologians have the answers that can’t be proven wrong.
Today’s designers of Massively-Multiplayer Online Role-Playing Games create realities for a living. They can’t spend centuries mulling over the issues: they have to face them head-on. Their practical experiences can indicate which theoretical proposals actually work in practice.
That’s today’s designers. Tomorrow’s will have a whole new set of questions to answer.
The designers of virtual worlds are the literal gods of those realities. Suppose Artificial Intelligence comes through and allows us to create non-player characters as smart as us. What are our responsibilities as gods? How should we, as gods, conduct ourselves?
How should we be gods?
Deep Reinforcement Learning and Model Predictive Control Approaches for the Scheduled Operation of Domestic Refrigerators
Excess capacity of the UK’s national grid is widely quoted to be reducing to around 4% over the coming years as a consequence of increased economic growth (and hence power usage) and reductions in power generation plants. There is concern that short-term variations in power demand could lead to serious wide-scale disruption on a national scale. This is drawing greater attention to augmenting traditional generation plants with renewable and localized energy storage technologies, and to improved demand side responses (DSR), where power consumers are incentivized to switch off assets when the grid is under pressure. It is estimated, for instance, that refrigeration/HVAC systems alone could account for ~14% of the total UK energy usage, with refrigeration and water heating/cooling systems, in particular, being able to act as real-time ‘buffer’ technologies that can be demand-managed to accommodate transient demands by being switched off for short periods without damaging their outputs. Large populations of thermostatically controlled loads (TCLs) hold significant potential for performing ancillary services in power systems since they are well-established and widely distributed around the power network. In the domestic sector, refrigerators and freezers collectively constitute a very large electrical load since they are continuously connected and are present in almost all households. The rapid proliferation of the ‘Internet of Things’ (IoT) now affords the opportunity to monitor and visualise the performance of smart building appliances and, specifically, to schedule the operation of widely distributed domestic refrigerators and freezers to collectively improve energy efficiency and reduce peak power consumption on the electrical grid.
To accomplish this, this research proposes the real-time estimation of the thermal mass of individual refrigerators in a network using on-line parameter identification, and the co-ordinated (ON-OFF) scheduling of the refrigerator compressors to maintain their respective temperatures within specified hysteresis bands, commensurate with accommodating food safety standards. Custom Model Predictive Control (MPC) schemes and a Machine Learning algorithm (Reinforcement Learning) are researched to realize an appropriate scheduling methodology which is implemented through COTS IoT hardware. Benefits afforded by the proposed schemes are investigated through experimental trials which show that the co-ordinated operation of domestic refrigerators can 1) reduce the peak power consumption as seen from the perspective of the electrical power grid (i.e. peak power shaving), 2) adaptively control the temperature hysteresis band of individual refrigerators to increase operational efficiency, and 3) contribute to a widely distributed aggregated load shed for Demand Side Response purposes in order to aid grid stability. Comparative studies of measurements from experimental trials show that the co-ordinated scheduling of refrigerators allows energy savings of between 19% and 29% compared to their traditional isolated (non-co-operative) operation. Moreover, by adaptively changing the hysteresis bands of individual fridges in response to changes in thermal behaviour, a further 20% of energy savings are possible at local refrigerator level, thereby providing benefits to both network supplier and individual consumer.
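The ON-OFF hysteresis control that this abstract takes as its baseline can be sketched as follows. This is only an illustration of a single, uncoordinated thermostat; it is not the MPC or reinforcement learning scheme of the thesis. The first-order thermal model and every parameter value (`leak`, `cooling`, `ambient`, the 3 to 5 degree band) are made-up assumptions.

```python
def hysteresis_step(temp, compressor_on, low, high):
    """Switch on at the upper limit, off at the lower limit, else hold state."""
    if temp >= high:
        return True
    if temp <= low:
        return False
    return compressor_on

def simulate(steps, temp=5.0, low=3.0, high=5.0,
             ambient=20.0, leak=0.01, cooling=0.5):
    """Simulate one refrigerator with a hypothetical first-order thermal model."""
    on = False
    history = []
    for _ in range(steps):
        on = hysteresis_step(temp, on, low, high)
        # Heat leaks in from the ambient; the compressor removes heat when on.
        temp += leak * (ambient - temp) - (cooling if on else 0.0)
        history.append((temp, on))
    return history

history = simulate(200)
duty_cycle = sum(on for _, on in history) / len(history)
print(f"compressor duty cycle: {duty_cycle:.2f}")
```

In this toy model, widening the band lengthens each ON and OFF period, and the duty cycle is a simple proxy for energy use; a coordinated scheduler would additionally decide when each unit in a population is allowed to switch on.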