232 research outputs found
BASiS: Batch Aligned Spectral Embedding Space
Graph is a highly generic and diverse representation, suitable for almost any
data processing problem. Spectral graph theory has been shown to provide
powerful algorithms, backed by solid linear algebra theory. It thus can be
extremely instrumental to design deep network building blocks with spectral
graph characteristics. For instance, such a network allows the design of
optimal graphs for certain tasks or obtaining a canonical orthogonal
low-dimensional embedding of the data. Recent attempts to solve this problem
were based on minimizing Rayleigh-quotient type losses. We propose a different
approach of directly learning the eigenspace. A severe problem of the direct
approach, applied in batch-learning, is the inconsistent mapping of features to
eigenspace coordinates in different batches. We analyze the degrees of freedom
of learning this task using batches and propose a stable alignment mechanism
that can work both with batch changes and with graph-metric changes. We show
that our learnt spectral embedding is better in terms of NMI, ACC, Grassmann
distance, orthogonality and classification accuracy, compared to SOTA. In
addition, the learning is more stable. Comment: 14 pages, 10 figures
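The batch-inconsistency problem mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' alignment mechanism: it only shows, assuming NumPy and a hypothetical random orthonormal basis, that two bases related by a rotation have different coordinates while spanning the same subspace, which is what a Grassmann distance measures.

```python
import numpy as np

def grassmann_distance(U, V):
    """Geodesic distance between the column spaces of U and V.

    Both inputs are assumed to have orthonormal columns; the distance is the
    2-norm of the vector of principal angles between the two subspaces.
    """
    cosines = np.linalg.svd(U.T @ V, compute_uv=False)
    angles = np.arccos(np.clip(cosines, -1.0, 1.0))
    return float(np.linalg.norm(angles))

# Hypothetical 2-dimensional embedding of 6 points (orthonormal columns).
rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.standard_normal((6, 2)))

# A second "batch" returns the same subspace in a rotated basis.
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
V = U @ R

coord_gap = float(np.linalg.norm(U - V))   # coordinates disagree
subspace_gap = grassmann_distance(U, V)    # subspaces coincide
```

Here `coord_gap` is clearly nonzero while `subspace_gap` is zero up to rounding, so a coordinate-wise loss would penalise two embeddings that are equivalent as subspaces.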
Flag Aggregator: Scalable Distributed Training under Failures and Augmented Losses using Convex Optimization
Modern ML applications increasingly rely on complex deep learning models and
large datasets. There has been an exponential growth in the amount of
computation needed to train the largest models. Therefore, to scale computation
and data, these models are inevitably trained in a distributed manner in
clusters of nodes, and their updates are aggregated before being applied to the
model. However, a distributed setup is prone to Byzantine failures of
individual nodes, components, and software. With data augmentation added to
these settings, there is a critical need for robust and efficient aggregation
systems. We define the quality of workers as reconstruction ratios,
and formulate aggregation as a Maximum Likelihood Estimation procedure using
Beta densities. We show that the regularized form of the log-likelihood with respect to the
subspace can be approximately solved using an iterative least squares solver, and
provide convergence guarantees using recent Convex Optimization landscape
results. Our empirical findings demonstrate that our approach significantly
enhances the robustness of state-of-the-art Byzantine resilient aggregators. We
evaluate our method in a distributed setup with a parameter server, and show
simultaneous improvements in communication efficiency and accuracy across
various tasks. The code is publicly available at
https://github.com/hamidralmasi/FlagAggregato
Big Data - Supply Chain Management Framework for Forecasting: Data Preprocessing and Machine Learning Techniques
This article intends to systematically identify and comparatively analyze
state-of-the-art supply chain (SC) forecasting strategies and technologies. A
novel framework has been proposed incorporating Big Data Analytics in SC
Management (problem identification, data sources, exploratory data analysis,
machine-learning model training, hyperparameter tuning, performance evaluation,
and optimization), forecasting effects on human-workforce, inventory, and
overall SC. Initially, the need to collect data according to SC strategy and
how to collect them has been discussed. The article discusses the need for
different types of forecasting according to the period or SC objective. The SC
KPIs and the error-measurement systems have been recommended to optimize the
top-performing model. The adverse effects of phantom inventory on forecasting
and the dependence of managerial decisions on the SC KPIs for determining model
performance parameters and improving operations management, transparency, and
planning efficiency have been illustrated. The cyclic connection within the
framework introduces preprocessing optimization based on the post-process KPIs,
optimizing the overall control process (inventory management, workforce
determination, cost, production and capacity planning). The contribution of
this research lies in the standard SC process framework proposal, recommended
forecasting data analysis, forecasting effects on SC performance, machine
learning algorithms optimization followed, and in shedding light on future
research
Minimizing Quotient Regularization Model
Quotient regularization models (QRMs) are a class of powerful regularization
techniques that have gained considerable attention in recent years, due to
their ability to handle complex and highly nonlinear data sets. However, the
nonconvex nature of QRM poses a significant challenge in finding its optimal
solution. We are interested in scenarios where both the numerator and the
denominator of QRM are absolutely one-homogeneous functions, which is widely
applicable in the fields of signal processing and image processing. In this
paper, we utilize a gradient flow to minimize such QRM in combination with a
quadratic data fidelity term. Our scheme involves solving a convex problem
iteratively. The convergence analysis is conducted on a modified scheme in a
continuous formulation, showing the convergence to a stationary point.
Numerical experiments demonstrate the effectiveness of the proposed algorithm
in terms of accuracy, outperforming the state-of-the-art QRM solvers. Comment: 20 pages
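For orientation, the kind of problem described in this abstract can be written schematically as below. The symbols J, H, A, f and \lambda are placeholder notation assumed here, not taken from the paper, and the paper's exact formulation may differ.

```latex
\min_{u} \; \frac{J(u)}{H(u)} + \frac{\lambda}{2}\,\|Au - f\|_2^2,
\qquad
J(cu) = |c|\,J(u), \quad H(cu) = |c|\,H(u) \quad \text{for all } c \in \mathbb{R}.
```

A familiar instance of such a quotient is J = \|\cdot\|_1 and H = \|\cdot\|_2: both are absolutely one-homogeneous, so the ratio is unchanged when u is rescaled by a nonzero constant.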
Accelerating the Computation of Tensor Z-eigenvalues
Efficient solvers for tensor eigenvalue problems are important tools for the
analysis of higher-order data sets. Here we introduce, analyze and demonstrate
an extrapolation method to accelerate the widely used shifted symmetric higher
order power method for tensor Z-eigenvalue problems. We analyze the
asymptotic convergence of the method, determining the range of extrapolation
parameters that induce acceleration, as well as the parameter that gives the
optimal convergence rate. We then introduce an automated method to dynamically
approximate the optimal parameter, and demonstrate its efficiency when the
base iteration is run with either static or adaptively set shifts. Our
numerical results on both even and odd order tensors demonstrate the theory and
show we achieve our theoretically predicted acceleration. Comment: 22 pages, 8 figures, 4 tables
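The base iteration that this abstract accelerates, the shifted symmetric higher-order power method, can be sketched as follows. This is a minimal illustration under stated assumptions: a third-order symmetric tensor, a static shift, NumPy, and a hypothetical test tensor. The extrapolation step proposed in the paper is not included.

```python
import numpy as np

def ss_hopm(A, x0, shift, max_iter=1000, tol=1e-12):
    """Shifted symmetric higher-order power method for a symmetric
    third-order tensor A. Returns an approximate eigenpair (lam, x)
    with A x x = lam * x and ||x|| = 1."""
    x = np.asarray(x0, dtype=float)
    x = x / np.linalg.norm(x)
    lam = np.einsum("ijk,i,j,k->", A, x, x, x)
    for _ in range(max_iter):
        # Power step with shift: y = A x x + shift * x, then renormalise.
        y = np.einsum("ijk,j,k->i", A, x, x) + shift * x
        x = y / np.linalg.norm(y)
        lam_new = np.einsum("ijk,i,j,k->", A, x, x, x)
        converged = abs(lam_new - lam) < tol
        lam = lam_new
        if converged:
            break
    return float(lam), x

# Hypothetical test tensor: 2 * e1⊗e1⊗e1 + 1 * e2⊗e2⊗e2.
e1, e2 = np.eye(3)[0], np.eye(3)[1]
A = (2.0 * np.einsum("i,j,k->ijk", e1, e1, e1)
     + np.einsum("i,j,k->ijk", e2, e2, e2))

lam, x = ss_hopm(A, x0=[0.8, 0.5, 0.3], shift=5.0)
residual = float(np.linalg.norm(np.einsum("ijk,j,k->i", A, x, x) - lam * x))
```

For this tensor and starting point the iterate approaches the eigenpair (2, e1). A larger shift makes the iteration safer but slower, which is the trade-off an acceleration scheme aims to improve.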
Domain-Specific Optimization For Machine Learning System
The machine learning (ML) system has been an indispensable part of the ML ecosystem in recent years. The rapid growth of ML brings new system challenges, such as the need to handle larger-scale data and computation, the requirement for higher execution performance, and lower resource usage, stimulating the demand for improved ML systems. General-purpose system optimization is widely used but brings limited benefits because ML applications vary in execution behavior based on their algorithms, input data, and configurations. It's difficult to perform comprehensive ML system optimizations without application-specific information. Therefore, domain-specific optimization, a method that optimizes particular types of ML applications based on their unique characteristics, is necessary for advanced ML systems. This dissertation performs domain-specific system optimizations for three important ML applications: graph-based applications, SGD-based applications, and Python-based applications. For SGD-based applications, this dissertation proposes a lossy compression scheme for application checkpoint constructions (called LC-Checkpoint). LC-Checkpoint intends to simultaneously maximize the compression rate of checkpoints and reduce the recovery cost of SGD-based training processes. Extensive experiments show that LC-Checkpoint achieves a high compression rate with a lower recovery cost than a state-of-the-art algorithm. For kernel regression applications, this dissertation designs and implements parallel software that targets million-scale datasets. The software is evaluated on two million-scale downstream applications (i.e., an equity return forecasting problem on the US stock dataset, and an image classification problem on the ImageNet dataset) to demonstrate its efficacy and efficiency.
For graph-based applications, this dissertation introduces ATMem, a runtime framework to optimize application data placement on heterogeneous memory systems. ATMem aims to maximize the utilization of the fast (small-capacity) memory by placing on it only the critical data regions that yield the highest performance gains. Experimental results show that ATMem achieves significant speedup over the baseline that places all data on the slow (large-capacity) memory while placing only a minority of the data on the fast memory. The future research direction is to adapt ML algorithms for software systems/architectures and to deeply bind the design of ML algorithms to the implementation of ML systems, to achieve optimal solutions for ML applications
Multiscale optimisation of dynamic properties for additively manufactured lattice structures
A framework for tailoring the dynamic properties of functionally graded lattice structures through the use of multiscale optimisation is presented in this thesis. The multiscale optimisation utilises a two scale approach to allow for complex lattice structures to be simulated in real time at a similar computational expense to traditional finite element problems. The micro and macro scales are linked by a surrogate model that predicts the homogenised material properties of the underlying lattice geometry based on the lattice design parameters. Optimisation constraints on the resonant frequencies and the Modal Assurance Criteria are implemented that can induce the structure to resonate at specific frequencies whilst simultaneously tracking and ensuring the correct mode shapes are maintained. This is where the novelty of the work lies, as dynamic properties have not previously been optimised for in a multiscale, functionally graded lattice structure.
Multiscale methods offer numerous benefits and increased design freedom when generating optimal structures for dynamic environments. These benefits are showcased in a series of optimised cantilever structures. The results show a significant improvement in dynamic behavior when compared to the unoptimised case as well as when compared to a single scale topology optimised structure. The validation of the resonant properties for the lattice structures is performed through a series of mechanical tests on additively manufactured lattices. These tests address both the micro and the macro scale of the multiscale method. The homogeneous and surrogate model assumptions of the micro scale are investigated through both compression and tensile tests of uniform lattice samples. The resonant frequency predictions of the macro scale optimisation are verified through mechanical shaker testing and computed tomography scans of the lattice structure. Sources of discrepancy between the predicted and observed behavior are also investigated and explained.
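The Modal Assurance Criterion used in this thesis for tracking mode shapes has a standard definition, MAC(a, b) = |a^H b|^2 / ((a^H a)(b^H b)), which the sketch below implements. The mode-matching helper and the sample vectors are hypothetical and only illustrate how MAC can be used to pair modes whose order or scaling has changed; they do not reproduce the thesis's optimisation constraints.

```python
import numpy as np

def mac(phi_a, phi_b):
    """Modal Assurance Criterion between two mode-shape vectors.
    Returns a value in [0, 1]; 1 means the shapes are collinear."""
    num = np.abs(np.vdot(phi_a, phi_b)) ** 2
    den = np.vdot(phi_a, phi_a).real * np.vdot(phi_b, phi_b).real
    return float(num / den)

def match_modes(reference, candidates):
    """For each reference mode (column), return the index of the candidate
    column with the highest MAC value, plus the full MAC table."""
    table = np.array([[mac(r, c) for c in candidates.T] for r in reference.T])
    return table.argmax(axis=1), table

# Two hypothetical reference mode shapes stored as columns.
reference = np.array([[1.0,  1.0],
                      [2.0, -1.0],
                      [3.0,  0.5]])

# The same shapes after a design update: swapped in order and rescaled.
candidates = np.column_stack([-2.0 * reference[:, 1], 0.5 * reference[:, 0]])

order, table = match_modes(reference, candidates)
```

Because MAC ignores scaling and sign, `order` pairs each reference mode with its rescaled counterpart even though the columns were swapped.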
Contributions to Robust Graph Clustering: Spectral Analysis and Algorithms
This dissertation details the design of fast, parameter-free graph clustering methods to robustly determine set cluster assignments. It provides spectral analysis as well as algorithms that adapt the obtained theoretical results to the implementation of robust graph clustering techniques. Sparsity is of importance in graph clustering and a first contribution of the thesis is the definition of a sparse graph model consistent with the graph clustering objectives. This model is based on an advantageous property, arising from a block diagonal representation, of a matrix that promotes the density of connections within clusters and sparsity between them. Spectral analysis of the sparse graph model, including the eigen-decomposition of the Laplacian matrix, is conducted. The analysis of the Laplacian matrix is simplified by defining a vector that carries all the relevant information that is contained in the Laplacian matrix. The obtained spectral properties of sparse graphs are adapted to sparsity-aware clustering based on two methods that formulate the determination of the sparsity level as approximations to spectral properties of the sparse graph models.
A second contribution of this thesis is to analyze the effects of outliers on graph clustering and to propose algorithms that address robustness and the level of sparsity jointly. The basis for this contribution is to specify fundamental outlier types that occur in the cases of extreme sparsity and the mathematical analysis of their effects on sparse graphs to develop graph clustering algorithms that are robust against the investigated outlier effects. Based on the obtained results, two different robust and sparsity-aware affinity matrix construction methods are proposed. Motivated by the outliers’ effects on eigenvectors, a robust Fiedler vector estimation method and a robust spectral clustering method are proposed. Finally, an outlier detection algorithm that is built upon the vertex degree is proposed and applied to gait analysis.
The results of this thesis demonstrate the importance of jointly addressing robustness and the level of sparsity for graph clustering algorithms. Additionally, simplified Laplacian matrix analysis provides promising results to design graph construction methods that may be computed efficiently through the optimization in a vector space instead of the usually used matrix space
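The Fiedler vector referred to in this abstract is the eigenvector of the graph Laplacian associated with its second-smallest eigenvalue. The sketch below computes the plain, non-robust version on a hypothetical six-node graph, assuming NumPy and the unnormalized Laplacian L = D - W; the robust estimators proposed in the thesis are not reproduced here.

```python
import numpy as np

def fiedler_vector(W):
    """Return the second-smallest eigenvalue of L = D - W and its
    eigenvector for a symmetric affinity matrix W."""
    L = np.diag(W.sum(axis=1)) - W
    vals, vecs = np.linalg.eigh(L)   # eigenvalues in ascending order
    return float(vals[1]), vecs[:, 1]

# Hypothetical graph: two triangles joined by a single weak edge.
W = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    W[i, j] = W[j, i] = 1.0
W[2, 3] = W[3, 2] = 0.1

lam2, f = fiedler_vector(W)

# The sign pattern of the Fiedler vector gives a two-way partition.
labels = (f > 0).astype(int)
```

The two triangles receive different labels. Which triangle gets label 1 depends on the arbitrary sign of the eigenvector, so only the grouping is meaningful.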
Flexible estimation of temporal point processes and graphs
Handling complex data types with spatial structures, temporal dependencies, or discrete values is generally a challenge in statistics and machine learning. In recent years, there has been an increasing need for methodological and theoretical work to analyse non-standard data types, for instance, data collected on protein structures, gene interactions, social networks or physical sensors. In this thesis, I will propose a methodology and provide theoretical guarantees for analysing two general types of discrete data emerging from interactive phenomena, namely temporal point processes and graphs.
On the one hand, temporal point processes are stochastic processes used to model event data, i.e., data that comes as discrete points in time or space where some phenomenon occurs. Some of the most successful applications of these discrete processes include online messages, financial transactions, earthquake strikes, and neuronal spikes. The popularity of these processes notably comes from their ability to model unobserved interactions and dependencies between temporally and spatially distant events. However, statistical methods for point processes generally rely on estimating a latent, unobserved, stochastic intensity process. In this context, designing flexible models and consistent estimation methods is often a challenging task.
On the other hand, graphs are structures made of nodes (or agents) and edges (or links), where an edge represents an interaction or relationship between two nodes. Graphs are ubiquitous in modelling real-world social, transport, and mobility networks, where edges can correspond to virtual exchanges, physical connections between places, or migrations across geographical areas. Besides, graphs are used to represent correlations and lead-lag relationships between time series, and local dependence between random objects. Graphs are typical examples of non-Euclidean data, where adequate distance measures, similarity functions, and generative models need to be formalised. In the deep learning community, graphs have become particularly popular within the field of geometric deep learning.
Structure and dependence can both be modelled by temporal point processes and graphs, although predominantly, the former act on the temporal domain while the latter conceptualise spatial interactions. Nonetheless, some statistical models combine graphs and point processes in order to account for both spatial and temporal dependencies. For instance, temporal point processes have been used to model the birth times of edges and nodes in temporal graphs. Moreover, some multivariate point process models have a latent graph parameter governing the pairwise causal relationships between the components of the process. In this thesis, I will notably study such a model, called the Hawkes model, as well as graphs evolving in time.
This thesis aims at designing inference methods that provide flexibility in the contexts of temporal point processes and graphs. This manuscript is presented in an integrated format, with four main chapters and two appendices. Chapters 2 and 3 are dedicated to the study of Bayesian nonparametric inference methods in the generalised Hawkes point process model. While Chapter 2 provides theoretical guarantees for existing methods, Chapter 3 also proposes, analyses, and evaluates a novel variational Bayes methodology. The other main chapters introduce and study model-free inference approaches for two estimation problems on graphs, namely spectral methods for the signed graph clustering problem in Chapter 4, and a deep learning algorithm for the network change point detection task on temporal graphs in Chapter 5.
Additionally, Chapter 1 provides an introduction and background preliminaries on point processes and graphs. Chapter 6 concludes this thesis with a summary of and critical reflection on the work in this manuscript, and proposals for future research. Finally, the appendices contain two supplementary papers. The first one, in Appendix A, initiated after the COVID-19 outbreak in March 2020, is an application of a discrete-time Hawkes model to COVID-related death counts during the first wave of the pandemic. The second work, in Appendix B, was conducted during an internship at Amazon Research in 2021, and proposes an explainability method for anomaly detection models acting on multivariate time series
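As background for the Hawkes model mentioned in this abstract, the sketch below evaluates the conditional intensity of the simplest univariate linear Hawkes process with an exponential kernel, lambda(t) = mu + sum over past events t_i < t of alpha * beta * exp(-beta * (t - t_i)). The parameter names mu, alpha and beta and the sample event times are assumptions made for illustration; the thesis studies a generalised, multivariate, nonparametric setting that this toy version does not cover.

```python
import numpy as np

def hawkes_intensity(t, events, mu, alpha, beta):
    """Conditional intensity at time t of a univariate linear Hawkes process
    with kernel alpha * beta * exp(-beta * s).

    mu    : baseline rate
    alpha : expected number of events triggered by one event
    beta  : decay rate of the excitation
    Only events strictly before t contribute.
    """
    events = np.asarray(events, dtype=float)
    past = events[events < t]
    return float(mu + np.sum(alpha * beta * np.exp(-beta * (t - past))))

events = [1.0, 1.5, 4.0]                                  # hypothetical event times
quiet = hawkes_intensity(0.5, events, mu=0.2, alpha=0.5, beta=1.0)
excited = hawkes_intensity(1.6, events, mu=0.2, alpha=0.5, beta=1.0)
decayed = hawkes_intensity(3.9, events, mu=0.2, alpha=0.5, beta=1.0)
```

Before any event the intensity equals the baseline, it jumps after a cluster of events, and it decays back towards the baseline as those events recede into the past.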
Aeroelastic instabilities of an airfoil in transitional flow regimes
This thesis investigates aeroelastic instability phenomena arising in coupled fluid–structure interactions, considering the flow around a rigid airfoil mounted on a torsion spring.
The focus is on the laminar separation flutter phenomenon, namely a self-sustained pitch oscillation detected experimentally on a NACA0012 airfoil in the transitional Reynolds number regime (Re in [10^4 – 10^5]) at low incidences, characterised by a detachment of an initially laminar boundary layer followed by its transition and subsequent reattachment. The main objective of the thesis is to explain this phenomenon in terms of instability concepts.
For this, a combination of numerical approaches involving two- and three-dimensional Navier–Stokes simulations (the latter referred to as Direct Numerical Simulations, DNS), along with linear stability analyses (LSA) around a mean flow or a periodic flow (Floquet analysis), is employed.
A second objective is to numerically explore the different nonlinear scenarios appearing in the low-to-moderate Reynolds number regime. The first part of the thesis is devoted to the characterisation of the fluid flow around the airfoil considering fixed incidences. Two-dimensional time-marching simulations are first employed, showing the emergence of high-frequency vortex shedding oscillations for Re = 8000. A hydrodynamic stability analysis (Floquet) is then employed to characterise the transition to a three-dimensional flow and DNS is eventually used to characterise both instantaneous and averaged flow quantities at Re = 50000. An analysis of the mean forces exerted on a fixed-incidence wing reveals a negative aerodynamic stiffness (torque-to-incidence ratio) in the range |alpha| < 2°, indicating a static instability. The second part of the thesis is devoted to the characterisation of the primary instability of the coupled fluid–structure system using LSA around the mean and periodic flow fields. Considering the symmetrical equilibrium position alpha = 0°, the analysis shows the presence of an unstable static mode, in accordance with the existence of a negative aerodynamic stiffness.
In the third part of the thesis, the emergence of self-sustained flutter oscillations is investigated via two-dimensional aeroelastic simulations. The investigation shows that the system first transitions towards a pitch oscillation around the nonsymmetrical equilibrium position (alpha > 0°), with coexistence of chaotic and quasi-periodic solutions for the same structural parameters, and subsequently transitions towards a pitch oscillation around the symmetrical position (alpha = 0°) as the Reynolds number increases. In the last part of the thesis, an attempt is made to explain the destabilisation of the nonsymmetrical equilibrium positions leading to a quasi-periodic behaviour using LSA around the mean and periodic flow fields at fixed incidences. Even if these analyses are unable to predict an unstable eigenmode, we conclude that the inclusion of the Reynolds stress term in the mean flow perturbation dynamics has an important effect