On the Creativity of Large Language Models
Large Language Models (LLMs) are revolutionizing several areas of Artificial
Intelligence. One of the most remarkable applications is creative writing,
e.g., poetry or storytelling: the generated outputs are often of astonishing
quality. However, a natural question arises: can LLMs really be considered
creative? In this article, we first analyze the development of LLMs through the
lens of creativity theories, investigating the key open questions and
challenges. Then, we discuss a set of "easy" and "hard" problems in machine
creativity, presenting them in relation to LLMs. Finally, we examine the
societal impact of these technologies with a particular focus on the creative
industries.
Predicting Temporal Aspects of Movement for Predictive Replication in Fog Environments
To fully exploit the benefits of the fog environment, efficient management of
data locality is crucial. Blind or reactive data replication falls short in
harnessing the potential of fog computing, necessitating more advanced
techniques for predicting where and when clients will connect. While spatial
prediction has received considerable attention, temporal prediction remains
understudied.
Our paper addresses this gap by examining the advantages of incorporating
temporal prediction into existing spatial prediction models. We also provide a
comprehensive analysis of spatio-temporal prediction models, such as Deep
Neural Networks and Markov models, in the context of predictive replication. We
propose a novel model using Holt-Winters exponential smoothing for temporal
prediction, leveraging sequential and periodical user movement patterns. In a
fog network simulation with real user trajectories, our model achieves a 15%
reduction in excess data at the cost of a marginal 1% decrease in data availability.
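The Holt-Winters method named in the abstract above decomposes a series into level, trend, and seasonal components, each updated by exponential smoothing. The sketch below is a minimal additive variant in plain Python; the smoothing parameters, season length, and synthetic connection series are illustrative assumptions, not the paper's configuration.

```python
# Minimal additive Holt-Winters (triple exponential smoothing) sketch.
# Assumption: alpha/beta/gamma, the season length m, and the toy series
# are illustrative choices, not taken from the paper.

def holt_winters_additive(series, m, alpha=0.5, beta=0.3, gamma=0.2, horizon=1):
    """Fit additive Holt-Winters on `series` (length >= 2*m) and
    forecast `horizon` steps ahead."""
    # Initialise level, trend, and seasonal components from the first two seasons.
    level = sum(series[:m]) / m
    trend = (sum(series[m:2 * m]) - sum(series[:m])) / (m * m)
    season = [series[i] - level for i in range(m)]

    for t in range(m, len(series)):
        last_level = level
        s = season[t % m]
        level = alpha * (series[t] - s) + (1 - alpha) * (level + trend)
        trend = beta * (level - last_level) + (1 - beta) * trend
        season[t % m] = gamma * (series[t] - level) + (1 - gamma) * s

    return [level + (h + 1) * trend + season[(len(series) + h) % m]
            for h in range(horizon)]

# A periodic "connection" signal with a linear trend, as a stand-in for the
# sequential and periodical user movement patterns the paper exploits.
data = [t * 0.1 + [0, 5, 2, 7][t % 4] for t in range(40)]
forecast = holt_winters_additive(data, m=4, horizon=4)
```

On noise-free periodic data like this, the forecast tracks the continuation of the pattern closely; real trajectories would require tuning the smoothing parameters.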
Enhancing the Reasoning Capabilities of Natural Language Inference Models with Attention Mechanisms and External Knowledge
Natural Language Inference (NLI) is fundamental to natural language understanding. The task summarises natural language understanding capabilities within a simple formulation: determining whether a natural language hypothesis can be inferred from a given natural language premise. NLI requires an inference system to address the full complexity of linguistic as well as real-world commonsense knowledge; hence, the inferencing and reasoning capabilities of an NLI system are utilised in other complex language applications such as summarisation and machine comprehension. Consequently, NLI has received significant recent attention from both academia and industry. Despite extensive research, contemporary neural NLI models face challenges arising from their sole reliance on training data to comprehend all the necessary linguistic and real-world commonsense knowledge. Further, different attention mechanisms, crucial to the success of neural NLI models, offer the prospect of better utilisation when employed in combination. In addition, the NLI research field lacks a coherent set of guidelines for the application of one of the most crucial regularisation hyper-parameters in RNN-based NLI models -- dropout.
In this thesis, we present neural models capable of leveraging attention mechanisms, as well as models that utilise external knowledge, to reason about inference. First, a combined attention model that leverages different attention mechanisms is proposed. Experimentation demonstrates that the proposed model is capable of better modelling the semantics of long and complex sentences. Second, to address the limitation of sole reliance on the training data, two novel neural frameworks utilising real-world commonsense and domain-specific external knowledge are introduced. Employing rule-based external knowledge retrieval from knowledge graphs, the first model takes advantage of convolutional encoders and factorised bilinear pooling to augment the reasoning capabilities of state-of-the-art NLI models. Utilising the significant advances in research on contextual word representations, the second model addresses the existing crucial challenges of external knowledge retrieval, learning the encoding of the retrieved knowledge, and fusing the learned encodings into the NLI representations, in unique ways. Experimentation demonstrates the efficacy and superiority of the proposed models over previous state-of-the-art approaches. Third, to address the lack of dropout guidelines, a coherent set of guidelines is introduced, formulated through exhaustive evaluation, analysis, and validation of the proposed RNN-based NLI models.
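The inter-sentence attention that NLI models build on can be illustrated compactly. The thesis' combined attention model is not reproduced here; the sketch below shows only the basic dot-product attention step, in plain Python, where each hypothesis word vector attends over the premise words. The toy vectors are assumptions for illustration.

```python
import math

# Sketch of dot-product inter-sentence attention, the building block that
# combined-attention NLI models extend. Not the thesis' actual model.

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attend(hypothesis, premise):
    """For each hypothesis vector, return a premise summary weighted by
    dot-product similarity (soft alignment between the two sentences)."""
    summaries = []
    for h in hypothesis:
        scores = softmax([sum(hi * pi for hi, pi in zip(h, p)) for p in premise])
        dim = len(premise[0])
        summaries.append([sum(a * p[d] for a, p in zip(scores, premise))
                          for d in range(dim)])
    return summaries

premise = [[1.0, 0.0], [0.0, 1.0]]      # two premise word vectors
hypothesis = [[10.0, 0.0]]              # strongly aligned with the first one
aligned = attend(hypothesis, premise)   # summary dominated by premise[0]
```

In a full model, each aligned summary would be compared with its hypothesis vector (e.g. by concatenation and difference features) before aggregation and classification.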
A dynamic weighted RBF-based ensemble for prediction of time series data from nuclear components
In this paper, an ensemble approach is proposed for the prediction of time series data, based on a Support Vector Regression (SVR) algorithm with an RBF loss function. We propose a strategy to build diverse sub-models of the ensemble based on the Feature Vector Selection (FVS) method of Baudat & Anouar (2003), which decreases the computational burden while preserving the generalization performance of the model. A simple but effective strategy is used to calculate the weights of each data point for the different sub-models built with RBF-SVR. A real case study on a power production component is presented. Comparisons with the results given by the best single SVR model and by a fixed-weights ensemble demonstrate the robustness and accuracy of the proposed ensemble approach.
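The dynamic weighting idea in the abstract above can be sketched simply: each sub-model's weight for a query point comes from an RBF (Gaussian) kernel of the distance between the query and that sub-model's region of competence. The toy linear predictors below stand in for the paper's RBF-SVR sub-models, and the distance-based weighting is an illustrative assumption, not the paper's exact strategy.

```python
import math

# Hedged sketch of a dynamically weighted ensemble. Each sub-model is
# paired with a "center" of its region of competence; weights decay with
# RBF-kernel distance from the query point.

def rbf_weight(x, center, gamma=1.0):
    return math.exp(-gamma * (x - center) ** 2)

def ensemble_predict(x, submodels):
    """submodels: list of (center, predict_fn). Returns the weighted average
    of the sub-model predictions, with per-query RBF weights."""
    weights = [rbf_weight(x, c) for c, _ in submodels]
    total = sum(weights)
    return sum(w * f(x) for w, (_, f) in zip(weights, submodels)) / total

# Two regional experts: one accurate near x = 0, one near x = 5.
experts = [
    (0.0, lambda x: 2 * x),        # toy model fitted around x = 0
    (5.0, lambda x: 2 * x + 0.5),  # toy model fitted around x = 5
]
near_zero = ensemble_predict(0.0, experts)  # dominated by the first expert
near_five = ensemble_predict(5.0, experts)  # dominated by the second expert
```

Because the weights are recomputed per query, the ensemble smoothly hands off between sub-models as the input moves between their regions of competence.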
Neuroevolution in Deep Neural Networks: Current Trends and Future Challenges
A variety of methods have been applied to the architectural configuration and
learning or training of artificial deep neural networks (DNN). These methods
play a crucial role in the success or failure of the DNN for most problems and
applications. Evolutionary Algorithms (EAs) are gaining momentum as a
computationally feasible method for the automated optimisation and training of
DNNs. Neuroevolution is a term which describes these processes of automated
configuration and training of DNNs using EAs. While many works exist in the
literature, no comprehensive surveys currently exist focusing exclusively on
the strengths and limitations of using neuroevolution approaches in DNNs.
The prolonged absence of such surveys can lead to a disjointed and fragmented
field, preventing DNN researchers from adopting neuroevolutionary methods in
their own research and resulting in lost opportunities for improving performance
and for wider application within real-world deep learning problems. This paper
presents a comprehensive survey, discussion and evaluation of the
state-of-the-art works on using EAs for architectural configuration and
training of DNNs. Based on this survey, the paper highlights the most pertinent
current issues and challenges in neuroevolution and identifies multiple
promising future research directions.
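The core loop that neuroevolution methods share can be illustrated with a toy example. The sketch below is a minimal (mu + lambda)-style evolutionary algorithm over a hypothetical architecture encoding (a list of layer widths); the surrogate fitness function stands in for training and validating a real DNN, which is what actual neuroevolution systems do at this step.

```python
import random

# Toy neuroevolution sketch: evolve a DNN architecture encoding (layer
# widths) with selection plus mutation. The fitness is a hand-made
# surrogate, NOT real validation accuracy.

random.seed(0)

def fitness(genome):
    # Hypothetical surrogate: prefer about 3 layers of width about 64.
    return -abs(len(genome) - 3) - sum(abs(w - 64) for w in genome) / 100

def mutate(genome):
    # Perturb widths, and occasionally add or remove a layer.
    child = [max(1, w + random.randint(-8, 8)) for w in genome]
    if random.random() < 0.2 and len(child) < 6:
        child.append(random.choice([16, 32, 64]))
    elif random.random() < 0.2 and len(child) > 1:
        child.pop()
    return child

population = [[random.choice([16, 32, 64]) for _ in range(2)] for _ in range(10)]
for generation in range(30):
    population.sort(key=fitness, reverse=True)
    parents = population[:5]                 # elitist survivor selection
    population = parents + [mutate(random.choice(parents)) for _ in range(5)]

best = max(population, key=fitness)
```

Real neuroevolution systems differ mainly in the encoding (direct vs. indirect), the variation operators, and whether weights as well as architectures are evolved; the select-vary-evaluate loop is the same.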