Identifying Student Profiles Within Online Judge Systems Using Explainable Artificial Intelligence
Online Judge (OJ) systems are typically used in programming-related courses because they yield fast and objective assessments of the code developed by the students. Such an evaluation generally provides a single decision based on a rubric, most commonly whether the submission successfully accomplished the assignment. Nevertheless, since in an educational context such information may be deemed insufficient, it would be beneficial for both the student and the instructor to receive additional feedback about the overall development of the task. This work aims to tackle this limitation by further exploiting the information gathered by the OJ and automatically inferring feedback for both the student and the instructor. More precisely, we consider the use of learning-based schemes (particularly, Multi-Instance Learning and classical Machine Learning formulations) to model student behaviour. In addition, Explainable Artificial Intelligence is used to provide human-understandable feedback. The proposal has been evaluated on a case study comprising 2,500 submissions from roughly 90 different students from a programming-related course in a Computer Science degree. The results obtained validate the proposal: the model is capable of significantly predicting the user outcome (either passing or failing the assignment) solely based on the behavioural pattern inferred from the submissions provided to the OJ. Moreover, the proposal is able to identify prone-to-fail student groups and profiles as well as other relevant information, which eventually serves as feedback to both the student and the instructor. This work has been partially funded by the "Programa Redes-I3CE de investigacion en docencia universitaria del Instituto de Ciencias de la Educacion (REDES-I3CE-2020-5069)" of the University of Alicante. The third author is supported by grant APOSTD/2020/256 from "Programa I+D+I de la Generalitat Valenciana".
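The multi-instance view mentioned in this abstract can be illustrated with a minimal sketch: each student is treated as a "bag" of submissions, and the bag is summarised into one feature vector that a classical classifier could consume. The log layout, the feature names, and the aggregation choice are all assumptions made for illustration; the abstract does not state which features or MIL formulation the authors use.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical submission log (not from the paper):
# (student_id, submission_passed_tests, hours_before_deadline)
submissions = [
    ("s1", False, 30.0),
    ("s1", True, 26.0),
    ("s2", False, 2.0),
    ("s2", False, 1.0),
    ("s2", False, 0.5),
]


def build_bags(rows):
    """Group submission instances into one bag per student."""
    bags = defaultdict(list)
    for student, passed, hours in rows:
        bags[student].append((passed, hours))
    return dict(bags)


def bag_features(bag):
    """Summarise a bag of submissions as a fixed-length feature dictionary."""
    return {
        "n_submissions": len(bag),
        "success_rate": mean(1.0 if passed else 0.0 for passed, _ in bag),
        "mean_hours_before_deadline": mean(hours for _, hours in bag),
    }


bags = build_bags(submissions)
features = {student: bag_features(bag) for student, bag in bags.items()}
```

Summarising each bag into a single vector is only one of several multi-instance strategies; it is shown here because the resulting vectors can be passed to any standard classifier and inspected feature by feature.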
Learning Spiking Neural Systems with the Event-Driven Forward-Forward Process
We develop a novel credit assignment algorithm for information processing
with spiking neurons without requiring feedback synapses. Specifically, we
propose an event-driven generalization of the forward-forward and the
predictive forward-forward learning processes for a spiking neural system that
iteratively processes sensory input over a stimulus window. As a result, the
recurrent circuit computes the membrane potential of each neuron in each layer
as a function of local bottom-up, top-down, and lateral signals, facilitating a
dynamic, layer-wise parallel form of neural computation. Unlike spiking neural
coding, which relies on feedback synapses to adjust neural electrical activity,
our model operates purely online and forward in time, offering a promising way
to learn distributed representations of sensory data patterns with temporal
spike signals. Notably, our experimental results on several pattern datasets
demonstrate that the event-driven forward-forward (ED-FF) framework works well
for training a dynamic recurrent spiking system capable of both classification
and reconstruction.
Zero-Shot Rumor Detection with Propagation Structure via Prompt Learning
The spread of rumors along with breaking events seriously hinders the truth
in the era of social media. Previous studies reveal that, due to the lack of
annotated resources, rumors presented in minority languages are hard to
detect. Furthermore, the unforeseen breaking events not involved in
yesterday's news exacerbate the scarcity of data resources. In this work, we
propose a novel zero-shot framework based on prompt learning to detect rumors
falling in different domains or presented in different languages. More
specifically, we first represent rumors circulated on social media as diverse
propagation threads, then design a hierarchical prompt encoding mechanism to
learn language-agnostic contextual representations for both prompts and rumor
data. To further enhance domain adaptation, we model the domain-invariant
structural features from the propagation threads, to incorporate structural
position representations of influential community response. In addition, a new
virtual response augmentation method is used to improve model training.
Extensive experiments conducted on three real-world datasets demonstrate that
our proposed model achieves much better performance than state-of-the-art
methods and exhibits a superior capacity for detecting rumors at early stages. Comment: AAAI 202
A Study of Neural Collapse Phenomenon: Grassmannian Frame, Symmetry, Generalization
In this paper, we extend the original Neural Collapse phenomenon by proving the
Generalized Neural Collapse hypothesis. We obtain the Grassmannian Frame structure
from the optimization and generalization of classification. This structure
maximally separates features of every two classes on a sphere and does not
require a larger feature dimension than the number of classes. Out of curiosity
about the symmetry of the Grassmannian Frame, we conduct experiments to explore whether
models with different Grassmannian Frames have different performance. As a
result, we discover the Symmetric Generalization phenomenon. We provide a
theorem to explain Symmetric Generalization of permutation. However, the
question of why different directions of features can lead to such different
generalization is still open for future investigation. Comment: 25 pages, 2 figures
Antecedents of business intelligence system use
This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London. Organisational reliance on information has become vital for organisational competitiveness. With increasing data volumes, Business Intelligence (BI) becomes a cornerstone of the decision-support system. However, employee resistance to using Business Intelligence Systems (BIS) is evident. This creates a problem for organisations in realising the benefits of BIS. It is thus important to study the enablers of sustained use of BIS amongst employees.
This thesis identifies existing theories that can be used to study BI system use. It integrates and extends technology use theories through a framework focusing on Business Intelligence System Use (BISU). Empirical research is then conducted in Kuwait's telecom and banking industries through a close-ended, self-administered questionnaire using a five-point Likert scale. Responses were received from 211 BI users. The data was analysed using SmartPLS to study the convergent and discriminant validity and reliability. Partial least squares structural equation modelling (PLS-SEM) was used to study the direct and indirect relationships between constructs and test the hypotheses. In addition to SmartPLS, SPSS was used for descriptive analysis.
The results indicated that UTAUT factors consisting of performance expectancy, effort expectancy and social influence positively impact BI system use. Voluntariness of use was found to positively moderate the relationship between social influence and BI system use. Furthermore, BI system quality positively impacts both performance expectancy and effort expectancy. The BI user's self-efficacy also positively impacts effort expectancy. In addition, social influence was found to be positively influenced by organisational factors, namely top management support and information culture.
The findings of this research contribute to the literature by determining and quantifying the factors that influence BISU through the lens of employee perspectives. This thesis also explains how employees' object-based beliefs about BI affect their behavioural beliefs, which in turn impact BISU. Limitations of this research include the omission of UTAUT's facilitating conditions and the limited variance of respondent demographics.
Image classification over unknown and anomalous domains
A longstanding goal in computer vision research is to develop methods that are simultaneously applicable to a broad range of prediction problems. In contrast to this, models often perform best when they are specialized to some task or data type. This thesis investigates the challenges of learning models that generalize well over multiple unknown or anomalous modes and domains in data, and presents new solutions for learning robustly in this setting.
Initial investigations focus on normalization for distributions that contain multiple sources (e.g. images in different styles like cartoons or photos). Experiments demonstrate the extent to which existing modules, batch normalization in particular, struggle with such heterogeneous data, and a new solution is proposed that can better handle data from multiple visual modes, using differing sample statistics for each.
While ideas to counter the overspecialization of models have been formulated in sub-disciplines of transfer learning, e.g. multi-domain and multi-task learning, these usually rely on the existence of meta information, such as task or domain labels. Relaxing this assumption gives rise to a new transfer learning setting, called latent domain learning in this thesis, in which training and inference are carried out over data from multiple visual domains, without domain-level annotations. Customized solutions are required for this, as the performance of standard models degrades: a new data augmentation technique that interpolates between latent domains in an unsupervised way is presented, alongside a dedicated module that sparsely accounts for hidden domains in data, without requiring domain labels to do so.
In addition, the thesis studies the problem of classifying previously unseen or anomalous modes in data, a fundamental problem in one-class learning, and anomaly detection in particular. While recent ideas have been focused on developing self-supervised solutions for the one-class setting, in this thesis new methods based on transfer learning are formulated. Extensive experimental evidence demonstrates that a transfer-based perspective benefits new problems that have recently been proposed in anomaly detection literature, in particular challenging semantic detection tasks.
Machine learning for managing structured and semi-structured data
As the digitalization of private, commercial, and public sectors advances rapidly, an increasing amount of data is becoming available. In order to gain insights or knowledge from these enormous amounts of raw data, a deep analysis is essential. The immense volume requires highly automated processes with minimal manual interaction. In recent years, machine learning methods have taken on a central role in this task. In addition to the individual data points, their interrelationships often play a decisive role, e.g. whether two patients are related to each other or whether they are treated by the same physician. Hence, relational learning is an important branch of research, which studies how to harness this explicitly available structural information between different data points. Recently, graph neural networks have gained importance. These can be considered an extension of convolutional neural networks from regular grids to general (irregular) graphs.
Knowledge graphs play an essential role in representing facts about entities in a machine-readable way. While great efforts are made to store as many facts as possible in these graphs, they often remain incomplete, i.e., true facts are missing. Manual verification and expansion of the graphs is becoming increasingly difficult due to the large volume of data and must therefore be assisted or substituted by automated procedures which predict missing facts. The field of knowledge graph completion can be roughly divided into two categories: Link Prediction and Entity Alignment. In Link Prediction, machine learning models are trained to predict unknown facts between entities based on the known facts. Entity Alignment aims at identifying shared entities between graphs in order to link several such knowledge graphs based on some provided seed alignment pairs.
In this thesis, we present important advances in the field of knowledge graph completion. For Entity Alignment, we show how to reduce the number of required seed alignments while maintaining performance by novel active learning techniques. We also discuss the power of textual features and show that graph-neural-network-based methods have difficulties with noisy alignment data. For Link Prediction, we demonstrate how to improve the prediction for unknown entities at training time by exploiting additional metadata on individual statements, often available in modern graphs. Supported with results from a large-scale experimental study, we present an analysis of the effect of individual components of machine learning models, e.g., the interaction function or loss criterion, on the task of link prediction. We also introduce a software library that simplifies the implementation and study of such components and makes them accessible to a wide research community, ranging from relational learning researchers to applied fields, such as life sciences. Finally, we propose a novel metric for evaluating ranking results, as used for both completion tasks. It allows for easier interpretation and comparison, especially in cases with different numbers of ranking candidates, as encountered in the de-facto standard evaluation protocols for both tasks.
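Why raw ranks are hard to compare across different numbers of candidates can be shown with a small sketch. It divides the observed mean rank by the mean rank expected under a uniformly random ordering, so that 1.0 means chance level regardless of how many candidates were ranked. This chance-adjusted ratio is an illustrative assumption; the abstract does not define the thesis's metric, and the function names below are hypothetical.

```python
def expected_random_rank(num_candidates: int) -> float:
    """Expected rank of the true item under a uniformly random ordering."""
    return (num_candidates + 1) / 2


def chance_adjusted_mean_rank(ranks, num_candidates) -> float:
    """Observed mean rank divided by the mean rank expected by chance.

    ranks[i] is the 1-based rank of the true item in query i, and
    num_candidates[i] is how many candidates were ranked in that query.
    A value of 1.0 is chance level; lower is better.
    """
    if not ranks or len(ranks) != len(num_candidates):
        raise ValueError("ranks and num_candidates must be non-empty and aligned")
    observed = sum(ranks) / len(ranks)
    expected = sum(expected_random_rank(n) for n in num_candidates) / len(ranks)
    return observed / expected


# The same raw mean rank of 5 means very different things for 10 and 1000 candidates.
small_pool = chance_adjusted_mean_rank([5, 5], [10, 10])
large_pool = chance_adjusted_mean_rank([5, 5], [1000, 1000])
```

Here `small_pool` is close to chance level while `large_pool` is far below it, even though both correspond to a raw mean rank of 5.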
AIUCD 2022 - Proceedings
The eleventh edition of the National Conference of the AIUCD-Associazione di Informatica Umanistica is titled Culture digitali. Intersezioni: filosofia, arti, media. The title explicitly calls for a methodological and theoretical reflection on the interrelation between digital technologies, information sciences, philosophical disciplines, the world of the arts, and cultural studies.
SYSTEMS METHODS FOR ANALYSIS OF HETEROGENEOUS GLIOBLASTOMA DATASETS TOWARDS ELUCIDATION OF INTER-TUMOURAL RESISTANCE PATHWAYS AND NEW THERAPEUTIC TARGETS
This PhD thesis describes an endeavour to compile the literature on key molecular mechanisms of Glioblastoma into a directed network following Disease Maps standards, analyse its topology, and compare the results with quantitative analysis of multi-omics datasets in order to investigate Glioblastoma resistance mechanisms. The work also implemented good data management practices and procedures.
SplitMixer: Fat Trimmed From MLP-like Models
We present SplitMixer, a simple and lightweight isotropic MLP-like
architecture, for visual recognition. It contains two types of interleaving
convolutional operations to mix information across spatial locations (spatial
mixing) and channels (channel mixing). The first one includes sequentially
applying two depthwise 1D kernels, instead of a 2D kernel, to mix spatial
information. The second one is splitting the channels into overlapping or
non-overlapping segments, with or without shared parameters, and applying our
proposed channel mixing approaches or 3D convolution to mix channel
information. Depending on design choices, a number of SplitMixer variants can
be constructed to balance accuracy, the number of parameters, and speed. We
show, both theoretically and experimentally, that SplitMixer performs on par
with the state-of-the-art MLP-like models while having a significantly lower
number of parameters and FLOPS. For example, without strong data augmentation
and optimization, SplitMixer achieves around 94% accuracy on CIFAR-10 with only
0.28M parameters, while ConvMixer achieves the same accuracy with about 0.6M
parameters. The well-known MLP-Mixer achieves 85.45% with 17.1M parameters. On
CIFAR-100 dataset, SplitMixer achieves around 73% accuracy, on par with
ConvMixer, but with about 52% fewer parameters and FLOPS. We hope that our
results spark further research towards finding more efficient vision
architectures and facilitate the development of MLP-like models. Code is
available at https://github.com/aliborji/splitmixer
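The parameter saving from replacing one depthwise 2D kernel with two sequential depthwise 1D kernels can be checked with simple arithmetic. The sketch below counts weights only for that spatial-mixing step; the channel width and kernel size are hypothetical values chosen for illustration, not configurations taken from the paper, and the helper names are not from the SplitMixer code.

```python
def depthwise_2d_params(channels: int, k: int, bias: bool = True) -> int:
    """Parameters of one depthwise k x k convolution (one k x k filter per channel)."""
    return channels * k * k + (channels if bias else 0)


def depthwise_1d_pair_params(channels: int, k: int, bias: bool = True) -> int:
    """Parameters of a depthwise 1 x k convolution followed by a depthwise k x 1 one."""
    per_layer = channels * k + (channels if bias else 0)
    return 2 * per_layer


channels, k = 256, 5  # hypothetical width and kernel size, not from the paper
full = depthwise_2d_params(channels, k)
split = depthwise_1d_pair_params(channels, k)
saving = 1 - split / full  # fraction of spatial-mixing parameters saved
```

Ignoring biases, the 2D kernel costs `channels * k * k` weights against `2 * channels * k` for the 1D pair, so the saving grows with the kernel size `k`.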