114 research outputs found
A Survey of Quantum-Cognitively Inspired Sentiment Analysis Models
Quantum theory, originally proposed as a physical theory to describe the motions of microscopic particles, has been applied to various non-physics domains involving human cognition and decision-making that are inherently uncertain and exhibit certain non-classical, quantum-like characteristics. Sentiment analysis is a typical example of such domains. In the last few years, by leveraging the modeling power of quantum probability (a non-classical probability stemming from quantum mechanics methodology) and deep neural networks, a range of novel quantum-cognitively inspired models for sentiment analysis have emerged and performed well. This survey presents a timely overview of the latest developments in this fascinating cross-disciplinary area. We first provide a background of quantum probability and quantum cognition at a theoretical level, analyzing their advantages over classical theories in modeling the cognitive aspects of sentiment analysis. Then, recent quantum-cognitively inspired models are introduced and discussed in detail, focusing on how they approach the key challenges of the sentiment analysis task. Finally, we discuss the limitations of the current research and highlight future research directions.
LIPIcs, Volume 261, ICALP 2023, Complete Volume
Geographic information extraction from texts
A large volume of unstructured texts, containing valuable geographic information, is available online. This information, provided implicitly or explicitly, is useful not only for scientific studies (e.g., spatial humanities) but also for many practical applications (e.g., geographic information retrieval). Although considerable progress has been achieved in geographic information extraction from texts, there are still unsolved challenges and issues, ranging from methods, systems, and data to applications and privacy. Therefore, this workshop will provide a timely opportunity to discuss recent advances, new ideas, and concepts, and to identify research gaps in geographic information extraction.
Marine macroalgae as an alternative, environment-friendly, and bioactive feeding resource for animals
Doctoral thesis (PhD), Nord University, 2023.
Robust Machine Learning by Integrating Context
Intelligent software has the potential to transform our society and is becoming the building block for many systems in the real world. However, despite the excellent performance of machine learning models on benchmarks, state-of-the-art methods like neural networks often fail once they encounter realistic settings. Because neural networks often learn correlations without reasoning with the right signals and knowledge, they fail when facing shifting distributions, unforeseen corruptions, and worst-case scenarios. Moreover, as black-box models, they are neither interpretable nor easily trusted by users. We need to build robust models for machine learning to be confidently and responsibly deployed in the most critical applications and systems.
In this dissertation, I present our advances toward robust machine learning systems, achieved by tightly integrating context into algorithms. The context has two aspects: the intrinsic structure of natural data, and the extrinsic structure from domain knowledge. Both are crucial: by capitalizing on the intrinsic structure in natural data, my work has shown that we can create robust machine learning systems, even in the worst case, an analytical result that also enjoys strong empirical gains.
Through integrating external knowledge, such as the association between tasks and causal structure, my framework can instruct models to use the right signals for inference, enabling new opportunities for controllable and interpretable models.
This thesis consists of three parts. In the first part, I cover three works that use the intrinsic structure as a constraint to achieve robust inference. I present our framework that performs test-time optimization to respect the natural constraint, which is captured by self-supervised tasks. I illustrate that test-time optimization improves out-of-distribution generalization and adversarial robustness. Beyond the inference algorithm, I show that enforcing intrinsic structure through discrete representations also improves out-of-distribution robustness.
In the second part of the thesis, I then detail my work using external domain knowledge. I first introduce using causal structure from external domain knowledge to improve domain generalization robustness. I then show how the association of multiple tasks and regularization objectives helps robustness.
In the final part of this dissertation, I show three works on trustworthy and reliable foundation models: general-purpose models that will serve as the basis for many AI applications. I show a framework that uses context to secure, interpret, and control foundation models.
Transparency: from tractability to model explanations
As artificial intelligence (AI) and machine learning (ML) models get increasingly incorporated into critical applications, ranging from medical diagnosis to loan approval, they show tremendous potential to impact society in a beneficial way; however, this is predicated on establishing a transparent relationship between humans and automation. In particular, transparency requirements span multiple dimensions, incorporating both technical and societal aspects, in order to promote the responsible use of AI/ML.
In this thesis we present contributions along both of these axes, starting with the technical side and model transparency, where we study ways to enhance tractable probabilistic models (TPMs) with properties that enable acquiring an in-depth understanding of their decision-making process. Following this, we expand the scope of our work, studying how providing explanations about a model's predictions influences the extent to which humans understand and collaborate with it. Finally, we design an introductory course on the emerging field of explanations in AI to foster the competent use of the developed tools and methodologies.
In more detail, the complex design of TPMs makes it very challenging to extract information that conveys meaningful insights, despite the fact that they are closely related to Bayesian networks (BNs), which readily provide such information. This has led to TPMs being viewed as black boxes, in the sense that their internal representations are elusive, in contrast to BNs. The first part of this thesis challenges this view, focusing on the question of whether it is feasible to extend certain transparent features of BNs to TPMs. We start by considering the
problem of transforming TPMs into alternative graphical models in a way that makes their internal representations easy to inspect. Furthermore, we study the utility of existing algorithms in causal applications, where we identify some significant limitations. To remedy this situation, we propose a set of algorithms that result in transformations that accurately uncover the internal representations of TPMs.
Following this result, we look into the problem of incorporating probabilistic constraints into TPMs. Although it is well known that BNs satisfy this property, the complex structure of TPMs impedes applying the same arguments, so advances on this problem have been limited. Nevertheless, in this thesis we provide formal proofs that TPMs can be made to satisfy both probabilistic and causal constraints through parameter manipulation, showing that incorporating a constraint corresponds to solving a system of multilinear equations.
We conclude the technical contributions by studying the problem of generating counterfactual instances for classifiers based on TPMs, motivated by the fact that BNs are the building blocks of most standard approaches to this task. In this thesis we propose a novel algorithm that we prove is guaranteed to generate valid counterfactuals. The resulting algorithm takes advantage of the multilinear structure of TPMs, generalizing existing approaches, while also allowing for incorporating a priori constraints that should be respected by the final counterfactuals.
In the second part of this thesis we go beyond model transparency, looking into the role of explanations in achieving an effective collaboration between human users and AI. To study this we design a behavioural experiment where we show that explanations provide unique insights, which cannot be obtained by looking at more traditional uncertainty measures. The findings of this experiment provide evidence supporting the view that explanations and uncertainty estimates have complementary functions, advocating in favour of incorporating elements of
both in order to promote a synergistic relationship between humans and AI. Finally, building on our findings, in this thesis we design a course on explanations in AI, where we focus on both the technical details of state-of-the-art algorithms as well as the overarching goals, limitations, and methodological approaches in the field. This contribution aims at ensuring that users can make competent use of explanations, a need that has also been highlighted by recent large-scale social initiatives. The resulting course was offered by the University of Edinburgh, at an MSc level, where student evaluations, as well as their performance, showcased the course's effectiveness in achieving its primary goals.
Structure-preserving machine learning for inverse problems
Inverse problems naturally arise in many scientific settings, and the study of these problems has been crucial in the development of important technologies such as medical imaging. In inverse problems, the goal is to estimate an underlying ground truth u†, typically an image, from corresponding measurements y, where u† and y are related by
y = N(A(u†))    (1)
for some forward operator A and noise-generating process N (both of which are generally assumed to be known). Variational regularisation is a well-established approach that can be used to approximately solve inverse problems such as Problem (1). In this approach an image is reconstructed from measurements y by solving a minimisation problem such as
û = argmin_u d(A(u), y) + α J(u).    (2)
While this approach has proven very successful, it generally requires the parts that make up the optimisation problem to be carefully chosen, and the optimisation problem may require considerable computational effort to solve. There is an active line of research into overcoming these issues using data-driven approaches, which aim to use multiple instances of data to inform a method that can be used on similar data. In this dissertation we investigate ways in which favourable properties of the variational regularisation approach can be combined with a data-driven approach to solving inverse problems.
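The variational approach of Problem (2) can be made concrete with a toy example. The sketch below is illustrative, not code from the dissertation: it takes A to be the identity (pure denoising), d the squared L2 distance, and J a Tikhonov-style smoothness penalty on neighbouring differences, and solves (2) by plain gradient descent.

```python
import numpy as np

def reconstruct(y, alpha=0.5, steps=500, lr=0.1):
    """Minimise ||u - y||^2 + alpha * sum_i (u_{i+1} - u_i)^2 by gradient descent."""
    u = y.copy()
    for _ in range(steps):
        grad_fid = 2.0 * (u - y)                # gradient of d(A(u), y), A = identity
        lap = np.zeros_like(u)                  # gradient of the smoothness penalty J
        lap[1:-1] = 2 * u[1:-1] - u[:-2] - u[2:]
        lap[0] = u[0] - u[1]
        lap[-1] = u[-1] - u[-2]
        u -= lr * (grad_fid + 2.0 * alpha * lap)
    return u

rng = np.random.default_rng(0)
truth = np.sin(np.linspace(0, 2 * np.pi, 100))   # smooth ground truth u†
y = truth + 0.3 * rng.standard_normal(100)       # noisy measurements
u_hat = reconstruct(y)
# the regularised reconstruction is closer to the truth than the raw data
print(np.linalg.norm(u_hat - truth) < np.linalg.norm(y - truth))
```

Even in this tiny example the two issues noted above are visible: the weight α and the form of J had to be hand-chosen, and the solution required hundreds of iterations, which is what motivates the data-driven approaches studied in the dissertation.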
In the first chapter of the dissertation, we propose a bilevel optimisation framework that can be used to optimise sampling patterns and regularisation parameters for variational image reconstruction in accelerated magnetic resonance imaging. We use this framework to learn sampling patterns that result in better image reconstructions than standard random variable density sampling patterns that sample with the same rate.
In the second chapter of the dissertation, we study the use of group symmetries in learned reconstruction methods for inverse problems. We show that group invariance of a functional implies that the corresponding proximal operator satisfies a group equivariance property. Applying this idea to model proximal operators as roto-translationally equivariant in an unrolled iterative reconstruction method, we show that reconstruction performance is more robust when tested on images in orientations not seen during training (compared to similar methods that model proximal operators to just be translationally equivariant) and that good methods can be learned with less training data.
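The fact that group invariance of a functional implies group equivariance of its proximal operator can be checked on a concrete case. The sketch below is illustrative, not the thesis code: the functional J(u) = t·||u||₁ is invariant under circular shifts of the coordinates, and its proximal operator, soft-thresholding, accordingly commutes with those shifts.

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of the shift-invariant functional J(u) = t * ||u||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

rng = np.random.default_rng(0)
x = rng.standard_normal(16)
shift = lambda v: np.roll(v, 3)        # a group action: circular shift by 3

# J is invariant under the shift: ||shift(x)||_1 == ||x||_1 ...
assert np.isclose(np.abs(shift(x)).sum(), np.abs(x).sum())
# ... so the prox is equivariant: prox(shift(x)) == shift(prox(x))
print(np.allclose(soft_threshold(shift(x), 0.5),
                  shift(soft_threshold(x, 0.5))))
```

The chapter applies the same principle with roto-translations in place of circular shifts, constraining the learned proximal operators in the unrolled method to be equivariant by construction.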
In the final chapter of the dissertation, we propose a ResNet-styled neural network architecture that is provably nonexpansive. This architecture can be thought of as composing discretisations of gradient flows along learnable convex potentials. Appealing to a classical result on the numerical integration of ODEs, we show that constraining the operator norms of the weight operators is sufficient to give nonexpansiveness, and additional analysis in the case that the numerical integrator is the forward Euler method shows that the neural network is an averaged operator. This guarantees that its fixed point iterations are convergent, and makes it a natural candidate for a learned denoiser in a Plug-and-Play approach to solving inverse problems.
Cantab Capital Institute for the Mathematics of Information
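A single forward-Euler step of this kind of architecture can be sketched and its nonexpansiveness checked numerically. The sketch below is an illustration under stated assumptions, not the thesis implementation: the step x ↦ x − h·Wᵀσ(Wx + b) is a gradient step on a convex potential whose gradient is ||W||₂²-Lipschitz (with σ = ReLU, which is monotone and 1-Lipschitz), so constraining ||W||₂ ≤ 1 and taking h ≤ 2 makes the step nonexpansive.

```python
import numpy as np

rng = np.random.default_rng(0)

def residual_step(x, W, b, h=1.0):
    """One forward-Euler step of the gradient flow of a learnable convex
    potential: x - h * W.T @ relu(W @ x + b). Nonexpansive when
    h * ||W||_2**2 <= 2."""
    return x - h * W.T @ np.maximum(W @ x + b, 0.0)

d = 32
W = rng.standard_normal((d, d))
W /= np.linalg.norm(W, 2)          # constrain the operator norm: ||W||_2 = 1
b = rng.standard_normal(d)

x1, x2 = rng.standard_normal(d), rng.standard_normal(d)
before = np.linalg.norm(x1 - x2)
after = np.linalg.norm(residual_step(x1, W, b) - residual_step(x2, W, b))
print(after <= before + 1e-12)     # distances never increase: nonexpansive
```

Composing such steps preserves nonexpansiveness, which is exactly what makes the resulting network a safe plug-in denoiser for fixed-point iterations in a Plug-and-Play scheme.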
Resilience, reliability, and coordination in autonomous multi-agent systems
Acknowledgements The research reported in this paper was funded and supported by various grants over the years: Robotics and AI in Nuclear (RAIN) Hub (EP/R026084/1); Future AI and Robotics for Space (FAIR-SPACE) Hub (EP/R026092/1); Offshore Robotics for Certification of Assets (ORCA) Hub (EP/R026173/1); the Royal Academy of Engineering under the Chair in Emerging Technologies scheme; Trustworthy Autonomous Systems 'Verifiability Node' (EP/V026801); Scrutable Autonomous Systems (EP/J012084/1); Supporting Security Policy with Effective Digital Intervention (EP/P011829/1); The International Technology Alliance in Network and Information Sciences. Peer reviewed. Postprint.
Probabilistic Parametric Curves for Sequence Modeling
Representations of sequential data are typically based on the assumption that observed sequences are realizations of an unknown underlying stochastic process. Determining such a representation is usually framed as a learning problem and yields a sequence model. In this context, the model must be able to capture the multimodal nature of the data without mixing individual modes. To model an underlying stochastic process, commonly used approaches based on neural networks either learn to parameterize a probability distribution or learn an implicit representation using stochastic inputs or neurons. In doing so, these models typically incorporate Monte Carlo methods or other approximate solutions to enable parameter estimation and probabilistic inference. This holds even for regression-based approaches built on mixture density networks, which likewise require Monte Carlo simulation for multimodal inference. This leaves a research gap for fully regression-based approaches to parameter estimation and probabilistic inference.
Consequently, this thesis presents a probabilistic extension of Bézier curves (𝒩-curves) as a basis for modeling continuous-time stochastic processes with a bounded index set. The proposed model, referred to as the 𝒩-Curves model, is based on mixture density networks (MDNs) and Bézier curves whose control points are assumed to be normally distributed. Using an MDN-based approach is in line with recent attempts to frame uncertainty estimation as a regression problem, and yields a generic model that can serve broadly as a base model for probabilistic sequence modeling. A key advantage of the model is, among other things, the ability to generate smooth, multimodal predictions in a single inference step without requiring Monte Carlo simulation. Moreover, with Bézier curves as its basis, the model can in theory be applied to data of arbitrarily high dimension by embedding the control points in a high-dimensional space. To lift the theoretical restrictions imposed by the focus on bounded index sets, a conceptual extension of the 𝒩-Curves model is also presented, with which infinite stochastic processes can be modeled. Key properties of the proposed model and its extension are demonstrated on several sequence-synthesis examples.
Since the 𝒩-Curves model is sufficiently applicable to most use cases, its suitability is evaluated extensively on several multi-step prediction tasks using real-world data. First, the model is evaluated against commonly used probabilistic sequence models in the context of pedestrian-trajectory prediction, where it outperforms all compared models. A qualitative evaluation examines the model's behavior in a prediction setting, and difficulties in evaluating probabilistic sequence models in a multimodal setting are discussed. In addition, the model is applied to human-motion prediction in order to assess its intended scalability to higher-dimensional data. On this task, the model outperforms commonly used simple and neural-network-based baselines and is on par with various state-of-the-art models in different situations, demonstrating its applicability in this higher-dimensional example. Finally, difficulties in covariance estimation and the smoothing properties of the 𝒩-Curves model are discussed.
Probabilistic Parametric Curves for Sequence Modeling
This work proposes a probabilistic extension to Bézier curves as a basis for effectively modeling stochastic processes with a bounded index set. The proposed stochastic process model is based on Mixture Density Networks and Bézier curves with Gaussian random variables as control points. A key advantage of this model is given by the ability to generate multi-mode predictions in a single inference step, thus avoiding the need for Monte Carlo simulation.
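The single-step inference property follows from a closed-form fact about such curves, which the sketch below illustrates (this is an illustrative reimplementation of the underlying math, not the authors' code). A Bézier curve point X(t) = Σᵢ Bᵢ,ₙ(t)·Pᵢ is a linear combination of its control points; if the Pᵢ are independent Gaussians N(μᵢ, Σᵢ), then X(t) is Gaussian with mean Σᵢ Bᵢ(t)·μᵢ and covariance Σᵢ Bᵢ(t)²·Σᵢ, so the predictive distribution at any t is available in one evaluation, without sampling.

```python
import numpy as np
from math import comb

def bernstein(n, i, t):
    """Bernstein basis polynomial B_{i,n}(t)."""
    return comb(n, i) * t**i * (1 - t)**(n - i)

def curve_distribution(t, means, covs):
    """Mean and covariance of the stochastic Bezier curve at t in [0, 1],
    for independent Gaussian control points N(means[i], covs[i])."""
    n = len(means) - 1
    w = np.array([bernstein(n, i, t) for i in range(n + 1)])
    mean = sum(wi * m for wi, m in zip(w, means))
    cov = sum(wi**2 * c for wi, c in zip(w, covs))
    return mean, cov

# three 2-D control points with growing uncertainty (illustrative values)
means = [np.array([0.0, 0.0]), np.array([1.0, 2.0]), np.array([2.0, 0.0])]
covs = [0.01 * np.eye(2), 0.05 * np.eye(2), 0.1 * np.eye(2)]

m, c = curve_distribution(0.5, means, covs)
print(m)            # 0.25*P0 + 0.5*P1 + 0.25*P2 -> [1.0, 1.0]
print(np.diag(c))   # 0.0625*0.01 + 0.25*0.05 + 0.0625*0.1 per axis
```

A mixture of such curves (as in a Mixture Density Network output) then yields multi-mode predictions in the same single step, one Gaussian component per mode.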