Multivariate Modeling of Quasar Variability with an Attention-based Variational Autoencoder
This thesis applied HeTVAE, an attention-based VAE neural network capable of multivariate modeling of time series, to a dataset of several thousand multi-band AGN light curves from ZTF and was one of the first attempts to use a neural network to harness the stochastic light curves in their multivariate form. Whereas standard models of AGN variability make prior assumptions, HeTVAE uses no prior knowledge and is able to learn the data distribution in a regularized latent space, reading semantic information via its up-to-date self-supervised training regimen. We have successfully created a dataset class for preprocessing the irregular multivariate time series and for interfacing with the quasi-off-the-shelf network more conveniently. We have also trained several model iterations using one, two, or all three of the ZTF filter dimensions on Durham’s NCC compute cluster, while configuring hyperparameter choices that work robustly for the astronomical dataset. In training the network, we employed the Adam optimizer with a reduce-on-plateau learning rate schedule and a KL-annealing schedule to optimize the VAE’s performance. In our experiments, we show how the VAE has learned the data distribution of the light curves by generating simulated light curves, and we demonstrate its interpretability by visualizing attention scores and the way the light curves are distributed along the continuous latent space using PCA. We show that it orders the light curves across a smooth gradient in the condensed space, from those that have both low-amplitude short-term variation and high-amplitude long-term variation, to those with little variability, to those with both short-term and long-term high-amplitude variation. We also use PCA to display a potential filtering algorithm that enables parsing through large datasets in an intuitive way and present some of the pitfalls of algorithmic bias in anomaly detection.
Finally, we fine-tuned the structurally correct but imprecise multivariate interpolations output by HeTVAE to three objects to show how they could improve constraints on time-delay estimates in the context of reverberation mapping for the relatively low-cadence ZTF data. In short, HeTVAE's use cases are wide-ranging, and it is a step in the right direction toward organizing and processing the millions of AGN light curves incoming from Vera C. Rubin Observatory’s Legacy Survey of Space and Time in their full multivariate form across six optical broadband filters.
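The KL-annealing schedule mentioned in the abstract can be illustrated with a minimal sketch. This is not HeTVAE's actual schedule: the linear ramp, the function names `kl_weight` and `annealed_loss`, and the parameters `warmup_steps` and `max_weight` are assumptions chosen for illustration.

```python
def kl_weight(step: int, warmup_steps: int = 1000, max_weight: float = 1.0) -> float:
    """Hypothetical linear KL-annealing schedule.

    The weight ramps from 0 to max_weight over warmup_steps training steps
    and then stays constant, so the reconstruction term dominates early on.
    """
    if warmup_steps <= 0:
        return max_weight
    return max_weight * min(1.0, step / warmup_steps)


def annealed_loss(recon_loss: float, kl_div: float, step: int,
                  warmup_steps: int = 1000) -> float:
    """VAE objective with the KL term scaled by the annealing weight."""
    return recon_loss + kl_weight(step, warmup_steps) * kl_div
```

Ramping the KL weight up gradually is a common way to keep the latent space from collapsing before the decoder has learned to reconstruct the inputs.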
On the Utility of Representation Learning Algorithms for Myoelectric Interfacing
Electrical activity produced by muscles during voluntary movement is a reflection of the firing patterns of relevant motor neurons and, by extension, the latent motor intent driving the movement. Once transduced via electromyography (EMG) and converted into digital form, this activity can be processed to provide an estimate of the original motor intent and is as such a feasible basis for non-invasive efferent neural interfacing. EMG-based motor intent decoding has so far received the most attention in the field of upper-limb prosthetics, where alternative means of interfacing are scarce and the utility of better control is apparent. Whereas myoelectric prostheses have been available since the 1960s, available EMG control interfaces still lag behind the mechanical capabilities of the artificial limbs they are intended to steer, a gap at least partially due to limitations in current methods for translating EMG into appropriate motion commands. As the relationship between EMG signals and concurrent effector kinematics is highly non-linear and apparently stochastic, finding ways to accurately extract and combine relevant information from across electrode sites is still an active area of inquiry. This dissertation comprises an introduction and eight papers that explore issues afflicting the status quo of myoelectric decoding and possible solutions, all related through their use of learning algorithms and deep Artificial Neural Network (ANN) models. Paper I presents a Convolutional Neural Network (CNN) for multi-label movement decoding of high-density surface EMG (HD-sEMG) signals. Inspired by the successful use of CNNs in Paper I and the work of others, Paper II presents a method for automatic design of CNN architectures for use in myocontrol. Paper III introduces an ANN architecture with an appertaining training framework from which simultaneous and proportional control emerges. Paper IV introduces a dataset of HD-sEMG signals for use with learning algorithms.
Paper V applies a Recurrent Neural Network (RNN) model to decode finger forces from intramuscular EMG. Paper VI introduces a Transformer model for myoelectric interfacing that does not need additional training data to function with previously unseen users. Paper VII compares the performance of a Long Short-Term Memory (LSTM) network to that of classical pattern recognition algorithms. Lastly, Paper VIII describes a framework for synthesizing EMG from multi-articulate gestures, intended to reduce training burden.
Runway Safety Improvements Through a Data Driven Approach for Risk Flight Prediction and Simulation
Runway overrun is one of the most frequently occurring flight accident types threatening the safety of aviation. Sensors have been improved with recent technological advancements and allow data collection during flights. The recorded data helps to better identify the characteristics of runway overruns. The improved technological capabilities and the growing air traffic led to increased momentum for reducing flight risk using artificial intelligence. Discussions on incorporating artificial intelligence to enhance flight safety are timely and critical. Using artificial intelligence, we may be able to develop the tools we need to better identify runway overrun risk and increase awareness of runway overruns. This work seeks to increase attitude, skill, and knowledge (ASK) of runway overrun risks by predicting the flight states near touchdown and simulating the flight exposed to runway overrun precursors.
To achieve this, the methodology develops a prediction model and a simulation model. During the flight training process, the prediction model is used in flight to identify potential risks and the simulation model is used post-flight to review the flight behavior. The prediction model identifies potential risks by predicting flight parameters that best characterize the landing performance during the final approach phase. The predicted flight parameters are used to alert the pilots to any runway overrun precursors that may pose a threat. The predictions and alerts are made when thresholds of various flight parameters are exceeded. The flight simulation model simulates the final approach trajectory with an emphasis on capturing the effect wind has on the aircraft. The focus is on the wind since the wind is a relatively significant factor during the final approach; typically, the aircraft is stabilized during the final approach. The flight simulation is used to quickly assess the differences between flight patterns that have triggered overrun precursors and normal flights with no abnormalities. The differences are crucial in learning how to mitigate adverse flight conditions. Both models are built as neural networks. The main challenges of developing a neural network model are the unique assignment of each model design space and the size of a model design space. A model design space is unique to each problem and cannot accommodate multiple problems. A model design space can also be significantly large depending on the depth of the model. Therefore, a hyperparameter optimization algorithm is investigated and used to design the data and model structures to best characterize the aircraft behavior during the final approach.
A series of experiments is performed to observe how the model accuracy changes with different data pre-processing methods for the prediction model and different neural network models for the simulation model. The data pre-processing methods include indexing the data by different frequencies, by different window sizes, and data clustering. The neural network models include simple Recurrent Neural Networks, Gated Recurrent Units, Long Short-Term Memory, and Neural Network Autoregressive with Exogenous Input. Another series of experiments is performed to evaluate the robustness of these models to adverse wind and flare. This is because different wind conditions and flares represent controls that the models need to map to the predicted flight states. The most robust models are then used to identify significant features for the prediction model and the feasible control space for the simulation model. The outcomes of the most robust models are also mapped to the required landing distance metric so that the results of the prediction and simulation are easily read. Then, the methodology is demonstrated with a sample flight exposed to an overrun precursor, high approach speed, to show how the models can potentially increase attitude, skill, and knowledge of runway overrun risk.
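The pre-processing by sampling frequency and window size described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `make_windows`, the use of a simple integer `stride` to stand in for a lower sampling frequency, and the overlapping-window layout are hypothetical and not taken from the thesis.

```python
import numpy as np


def make_windows(series, window_size: int, stride: int = 1) -> np.ndarray:
    """Downsample a recorded flight parameter and cut it into overlapping windows.

    series      : 1-D (or time-major 2-D) array of recorded samples
    window_size : number of retained samples per window (assumed parameter)
    stride      : keep every stride-th sample, i.e. a lower data frequency
    """
    series = np.asarray(series, dtype=float)
    sampled = series[::stride]                      # reduce the data frequency
    n_windows = len(sampled) - window_size + 1
    if n_windows <= 0:                              # not enough samples for one window
        return np.empty((0, window_size) + sampled.shape[1:])
    # Each row is one model input covering window_size consecutive samples.
    return np.stack([sampled[i:i + window_size] for i in range(n_windows)])
```

For example, `make_windows(np.arange(10), 3, stride=2)` keeps every second sample and yields three windows of length three.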
The main contribution of this work is the evaluation of the accuracy and robustness of prediction and simulation models trained using Flight Operational Quality Assurance (FOQA) data. Unlike many studies that focused on optimizing the model structures to create the two models, this work optimized both data and model structures to ensure that the data capture the dynamics of the aircraft they represent. To achieve this, this work introduced a hybrid genetic algorithm that combines the benefits of conventional and quantum-inspired genetic algorithms to quickly converge to an optimal configuration while exploring the design space. With the optimized model, this work identified the data features from the final approach that contribute most to predicting airspeed, vertical speed, and pitch angle near touchdown. The top contributing features are altitude, angle of attack, core rpm, and airspeeds. For both the prediction and the simulation models, this study examines the impact of various data preprocessing methods on the accuracy of the two models. The results may help future studies identify the right data preprocessing methods for their work. Another contribution of this work is the evaluation of how flight control and wind affect both the prediction and the simulation models. This is achieved by mapping the model accuracy at various levels of control surface deflection, wind speed, and wind direction change. The results showed fairly consistent prediction and simulation accuracy at different levels of control surface deflection and wind conditions. This showed that the neural network-based models are effective in creating robust prediction and simulation models of aircraft during the final approach. The results also showed that data frequency has a significant impact on the prediction and simulation accuracy, so it is important to have sufficient data to train the models in the conditions in which the models will be used.
The final contribution of this work is a demonstration of how the prediction and the simulation models can be used to increase awareness of runway overrun.
Ph.D.
Development of an R package to learn supervised classification techniques
This TFG (Trabajo de Fin de Grado, a final degree project) aims to develop a custom R package for teaching supervised classification algorithms, starting
with the identification of requirements, including algorithms, data structures, and libraries. A strong
theoretical foundation is essential for effective package design. Documentation will explain each function’s
purpose, accompanied by necessary paperwork.
The package will include R scripts and data files in organized directories, complemented by a user
manual for easy installation and usage, even for beginners. Built entirely from scratch without external
dependencies, it’s optimized for accuracy and performance.
In conclusion, this TFG provides a roadmap for creating an R package to teach supervised classification
algorithms, benefiting researchers and practitioners dealing with real-world challenges.
Grado en Ingeniería Informática (Bachelor's Degree in Computer Engineering)
Geometric Learning on Graph Structured Data
Graphs provide a ubiquitous and universal data structure that can be applied in many domains such as social networks, biology, chemistry, physics, and computer science. In this thesis we focus on two fundamental paradigms in graph learning: representation learning and similarity learning over graph-structured data. Graph representation learning aims to learn embeddings for nodes by integrating topological and feature information of a graph. Graph similarity learning uses similarity functions to compute the similarity between pairs of graphs in a vector space. We address several challenging issues in these two paradigms, designing powerful yet efficient and theoretically guaranteed machine learning models that can leverage the rich topological structural properties of real-world graphs.
This thesis is structured into two parts. In the first part of the thesis, we present how to develop powerful Graph Neural Networks (GNNs) for graph representation learning from three different perspectives: (1) spatial GNNs, (2) spectral GNNs, and (3) diffusion GNNs. We discuss the model architecture, representational power, and convergence properties of these GNN models. Specifically, we first study how to develop expressive yet efficient and simple message-passing aggregation schemes that can go beyond the Weisfeiler-Leman test (1-WL). We propose a generalized message-passing framework by incorporating graph structural properties into an aggregation scheme. Then, we introduce a new local isomorphism hierarchy on neighborhood subgraphs. We further develop a novel neural model, namely GraphSNN, and theoretically prove that this model is more expressive than the 1-WL test. After that, we study how to build an effective and efficient graph convolution model with spectral graph filters. In this study, we propose a spectral GNN model, called DFNets, which incorporates a novel class of spectral graph filters, namely feedback-looped filters. As a result, this model can provide better localization on neighborhoods while achieving fast convergence and linear memory requirements. Finally, we study how to capture the rich topological information of a graph using graph diffusion. We propose a novel GNN architecture with dynamic PageRank, based on a learnable transition matrix. We explore two variants of this GNN architecture, a forward-Euler solution and an invariable feature solution, and theoretically prove that our forward-Euler GNN architecture is guaranteed to converge to a stationary distribution.
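For orientation, a plain message-passing step of the kind that the schemes above generalize can be sketched as below. This is a generic mean-aggregation layer, not GraphSNN, DFNets, or the dynamic PageRank model; the function name `message_passing_layer`, the self-loop handling, and the ReLU nonlinearity are assumptions made for illustration.

```python
import numpy as np


def message_passing_layer(adj, features, weight) -> np.ndarray:
    """One generic message-passing step with mean aggregation (illustrative baseline).

    adj      : (n, n) adjacency matrix of an undirected graph
    features : (n, d_in) node feature matrix
    weight   : (d_in, d_out) weight matrix (learned in practice, fixed here)
    """
    adj = np.asarray(adj, dtype=float)
    features = np.asarray(features, dtype=float)
    a_hat = adj + np.eye(adj.shape[0])            # add self-loops so a node keeps its own features
    degree = a_hat.sum(axis=1, keepdims=True)     # neighborhood size including the node itself
    aggregated = (a_hat / degree) @ features      # average over each node's neighborhood
    return np.maximum(aggregated @ np.asarray(weight, dtype=float), 0.0)  # ReLU
```

Because such a layer only sees the multiset of neighbor features, its discriminative power is bounded by the 1-WL test, which is the limitation the structure-aware aggregation in the abstract is meant to overcome.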
In the second part of this thesis, we introduce a new optimal transport distance metric on graphs in a regularized learning framework for graph kernels. This optimal transport distance metric can preserve both local and global structures between graphs during the transport, in addition to preserving features and their local variations. Furthermore, we propose two strongly convex regularization terms to theoretically guarantee convergence and numerical stability in finding an optimal assignment between graphs. One regularization term is used to regularize a Wasserstein distance between graphs in the same ground space. This helps to preserve the local clustering structure on graphs by relaxing the optimal transport problem to a cluster-to-cluster assignment between locally connected vertices. The other regularization term is used to regularize a Gromov-Wasserstein distance between graphs across different ground spaces based on degree-entropy KL divergence. This helps to improve the matching robustness of an optimal alignment so as to preserve the global connectivity structure of graphs. We have evaluated our optimal transport-based graph kernel on different benchmark tasks. The experimental results show that our models considerably outperform all the state-of-the-art methods in all benchmark tasks.
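To make the idea of a regularized optimal transport problem concrete, the sketch below uses the well-known entropic regularization solved with Sinkhorn iterations. This is a swapped-in, standard technique and not the two regularization terms proposed in the thesis; the function name `sinkhorn_plan` and the parameters `reg` and `n_iters` are assumptions.

```python
import numpy as np


def sinkhorn_plan(a, b, cost, reg: float = 0.5, n_iters: int = 200) -> np.ndarray:
    """Entropy-regularized optimal transport plan via Sinkhorn iterations.

    a, b : source and target probability vectors (e.g. node weights of two graphs)
    cost : (len(a), len(b)) ground cost matrix between the two sets of nodes
    reg  : strength of the entropic regularization (illustrative default)
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    kernel = np.exp(-np.asarray(cost, dtype=float) / reg)
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (kernel.T @ u)   # rescale columns toward the target marginal
        u = a / (kernel @ v)     # rescale rows toward the source marginal
    return u[:, None] * kernel * v[None, :]
```

The resulting plan has row sums equal to `a` and column sums approximately equal to `b`; the transport cost is then the sum of `plan * cost`.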
Crashworthiness Optimization using difference-based equivalent static Loads - Sizing and Topology Optimization of Structures subjected to Crash
Structural optimization of crash-related problems usually involves nonlinearities in geometry, material, and contact. For such problems, the sensitivities are either not available or very expensive to compute, so efficient gradient-based optimizers cannot be employed directly. The Difference-based Equivalent Static Load (DiESL) method provides a procedure to circumvent the sensitivity calculation of the original nonlinear dynamic problem by creating linear auxiliary load cases that enable gradient-based optimization. Each linear auxiliary load case represents one specific time step of the original nonlinear dynamic problem.
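The underlying equivalent static load idea can be sketched as follows. The code shows the standard ESL definition, in which the load for each selected time step is the linear stiffness matrix times the nonlinear displacement field at that step; it does not implement the difference-based variant developed in the thesis. The function name `equivalent_static_loads` and the array layout are assumptions.

```python
import numpy as np


def equivalent_static_loads(k_linear, displacements) -> np.ndarray:
    """Standard ESL definition: f_i = K u_i for each selected time step i.

    k_linear      : (n_dof, n_dof) linear stiffness matrix of the structure
    displacements : (n_steps, n_dof) displacements from the nonlinear dynamic analysis
    Returns an (n_steps, n_dof) array with one linear auxiliary load case per time step.
    """
    k_linear = np.asarray(k_linear, dtype=float)
    displacements = np.asarray(displacements, dtype=float)
    return displacements @ k_linear.T
```

Applying each load to the linear model reproduces the nonlinear displacement field of that time step, which is what makes the linear load cases usable for gradient-based optimization.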
In this thesis various extensions of the DiESL method are presented and the method is compared to several other relevant approaches in this field. It is demonstrated how an appropriate selection of the time steps in each cycle can improve the DiESL method's approximation quality. For this purpose, the time steps are selected adaptively such that an appropriate curve, indicating the structure's nonlinear behavior, is fitted by the selected time steps. It turns out that this leads to better optimization results and more reliable convergence behavior.
The DiESL method also enables the adaptation of path-dependent structural properties of the original nonlinear dynamic problem, such as material stiffness, in each linear auxiliary load case. In this thesis, an adaptation of the Young’s modulus and Poisson's ratio at element level in the linear auxiliary load cases, corresponding to the local plasticization in the nonlinear dynamic problem, is tested. To this end, a bilinear material model is employed in the auxiliary load cases. The test examples indicate that an observable improvement can only be obtained if the material of the nonlinear dynamic problem is also idealized bilinearly and the portions of elements in the elastic and plastic ranges are balanced such that the structure’s behavior is not dominated by either one.
Crashworthiness design usually involves two contradictory objectives: the structure's stiffness and its energy absorption behavior. To be able to address the latter, an approach for handling crash forces with the DiESL method is developed and tested using sizing optimization examples. The respective results are validated by comparing them to the theoretically known optimum or to other state-of-the-art methods.
Moreover, the DiESL method is extended to topology optimization utilizing the Solid Isotropic Material with Penalization (SIMP) approach. The method is tested using three examples. The first is a rigid pole colliding with a simple beam structure, where the intrusion of the pole is minimized. The initial velocity of the pole is varied in order to examine the influence of inertia effects on the optimized structures. It is shown that the results differ significantly depending on the chosen initial velocity and, consequently, that they exhibit inertia effects. Moreover, for high initial velocities, the DiESL method achieved a considerably better objective function value than the standard ESL method. The second example is an extruded rocker colliding with a rigid pole, where the intrusion of the pole is likewise minimized. The DiESL method yields results as good as those of the Graph and Heuristic Topology optimization (GHT) approach. However, the number of nonlinear analyses necessary to achieve convergence is significantly smaller when using the DiESL method. Finally, a rail reinforced by an additively manufactured rib is optimized. Here, several optimization runs are executed. The reaction force is maximized, while the mass of the rib is constrained to various fractions of the original rib's mass. This formulation aims to find designs where the original rib's mass, and thus the related production cycle time, is reduced while its stiffness is almost maintained. In doing so, a mass reduction of 30% could be achieved.
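The SIMP interpolation referred to above can be sketched in a few lines. This shows the commonly used modified SIMP form, in which an element's stiffness scales with its density raised to a penalization exponent; the exponent of 3, the small lower bound `e_min`, and the function name `simp_modulus` are common conventions assumed here, not values reported in the thesis.

```python
import numpy as np


def simp_modulus(density, e_solid: float, penalty: float = 3.0, e_min: float = 1e-9):
    """Modified SIMP interpolation: E(rho) = e_min + rho**penalty * (e_solid - e_min).

    density : element density design variable(s) in [0, 1]
    e_solid : Young's modulus of the fully solid material
    penalty : exponent that makes intermediate densities structurally inefficient
    e_min   : small stiffness floor that keeps the stiffness matrix non-singular
    """
    rho = np.clip(np.asarray(density, dtype=float), 0.0, 1.0)
    return e_min + rho ** penalty * (e_solid - e_min)
```

With a penalty of 3, an element at half density retains only about one eighth of the solid stiffness, which pushes the optimizer toward designs that are nearly all solid or void.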