Novel Neural Network Applications to Mode Choice in Transportation: Estimating Value of Travel Time and Modelling Psycho-Attitudinal Factors
Whenever researchers wish to study the behaviour of individuals choosing among a set of alternatives, they usually rely on models based on random utility theory, which postulates that individuals adjust their behaviour so as to maximise their utility. These models, often identified as discrete choice models (DCMs), usually require the definition of a utility for each alternative, by first identifying the variables influencing the decisions. Traditionally, DCMs focused on observable variables and treated users as optimising agents with predetermined needs. However, such an approach is in contrast with results from the social sciences, which show that choice behaviour can be influenced by psychological factors such as attitudes and preferences. Recently, formulations of DCMs have appeared that include latent constructs to capture the impact of subjective factors; these are called hybrid choice models or integrated choice and latent variable (ICLV) models. However, DCMs are not exempt from issues, such as the fact that researchers have to choose the variables to include and specify their relations when defining the utilities. This is probably one of the reasons that has recently led to an influx of studies using machine learning (ML) methods to study mode choice, in which researchers have sought alternative ways to analyse travellers' choice behaviour. An ML algorithm is any generic method that uses the data itself to build a model, improving its performance the more it is allowed to learn. This means ML methods do not require a priori hypotheses on the structure and nature of the relationships between the variables used as inputs.
ML models are usually considered black-box methods, but whenever researchers felt the need for interpretability of ML results, they sought alternative ways to use ML methods, such as building them with a priori knowledge to induce specific constraints. Some researchers also transformed the outputs of ML algorithms so that they could be interpreted from an economic point of view, or built hybrid ML-DCM models. The objective of this thesis is to investigate the benefits and disadvantages of adopting either DCMs or ML methods to study mode choice in transportation. The strongest feature of DCMs is that they produce precise and descriptive results, allowing for a thorough interpretation of their outputs. On the other hand, ML models offer a substantial benefit by being truly data-driven, learning most relations from the data itself. As a first contribution, we tested an alternative method for calculating the value of travel time (VTT) from the results of ML algorithms. VTT is a very informative parameter, since the time consumed by travel normally represents an undesirable factor, so individuals are usually willing to exchange money to reduce travel times. The proposed method is independent of the mode-choice functions, so it can be applied equally to econometric models and ML methods, provided they allow the estimation of individual-level probabilities. Another contribution of this thesis is a neural network (NN) for the estimation of choice models with latent variables as an alternative to DCMs. This issue arose from wanting to include in ML models not only the level-of-service variables of the alternatives and the socio-economic attributes of the individuals, but also psycho-attitudinal indicators, to better describe the influence of psychological factors on choice behaviour. The results were estimated on two different datasets.
Since NN results depend on the values of their hyper-parameters and on their initialization, several NNs were estimated with different hyper-parameters to find the optimal values, which were then used to verify the stability of the results under different initializations.
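A mode-choice-function-independent VTT can be read off any model that outputs individual-level choice probabilities, as the ratio of the probability's sensitivities to travel time and travel cost. The sketch below illustrates the idea with a toy binary logit whose coefficients are purely hypothetical (the function names and values are not from the thesis); for a logit, the ratio reduces to the familiar coefficient ratio, which the finite-difference estimate should recover.

```python
import math

# Hypothetical binary-logit coefficients for the "car" alternative.
BETA_TIME = -0.08   # utility per minute of travel time
BETA_COST = -0.40   # utility per euro of travel cost

def p_car(time_min, cost_eur, v_other=0.0):
    """Choice probability of the car alternative in a binary logit."""
    v_car = BETA_TIME * time_min + BETA_COST * cost_eur
    return 1.0 / (1.0 + math.exp(v_other - v_car))

def vtt_finite_diff(time_min, cost_eur, eps=1e-4):
    """VTT = (dP/dtime) / (dP/dcost): model-agnostic, it only needs
    individual-level probabilities, so it applies to DCMs and ML models alike."""
    dp_dt = (p_car(time_min + eps, cost_eur) - p_car(time_min - eps, cost_eur)) / (2 * eps)
    dp_dc = (p_car(time_min, cost_eur + eps) - p_car(time_min, cost_eur - eps)) / (2 * eps)
    return dp_dt / dp_dc

# For a logit the ratio of derivatives equals BETA_TIME / BETA_COST,
# i.e. 0.2 euro per minute here.
vtt = vtt_finite_diff(30.0, 3.5)
```

For a black-box ML classifier, `p_car` would simply be replaced by the model's predicted probability, which is the point of keeping the estimator probability-based.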
IterativePFN: True Iterative Point Cloud Filtering
The quality of point clouds is often limited by noise introduced during their
capture process. Consequently, a fundamental 3D vision task is the removal of
noise, known as point cloud filtering or denoising. State-of-the-art learning
based methods focus on training neural networks to infer filtered displacements
and directly shift noisy points onto the underlying clean surfaces. In high
noise conditions, they iterate the filtering process. However, this iterative
filtering is only done at test time and is less effective at ensuring points
converge quickly onto the clean surfaces. We propose IterativePFN (iterative
point cloud filtering network), which consists of multiple IterationModules
that model the true iterative filtering process internally, within a single
network. We train our IterativePFN network using a novel loss function that
utilizes an adaptive ground truth target at each iteration to capture the
relationship between intermediate filtering results during training. This
ensures that the filtered results converge faster to the clean surfaces. Our
method is able to obtain better performance compared to state-of-the-art
methods. The source code can be found at:
https://github.com/ddsediri/IterativePFN.
Comment: This paper has been accepted to the IEEE/CVF CVPR Conference, 202
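To make the test-time iterative filtering that the abstract criticizes concrete, here is a deliberately naive baseline: repeatedly moving each point part-way toward the centroid of its k nearest neighbours. This is a generic hand-crafted smoother, not IterativePFN, whose contribution is to replace such an external loop with learned IterationModules and adaptive ground-truth targets inside one network.

```python
import numpy as np

def knn_mean_shift(points, k=8, iterations=5, step=0.5):
    """Naive test-time iterative filter: shift each point toward the centroid
    of its k nearest neighbours.  A hand-crafted stand-in for the iterative
    process that IterativePFN instead models internally with IterationModules."""
    pts = points.copy()
    for _ in range(iterations):
        d2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1)  # pairwise dists
        nn = np.argsort(d2, axis=1)[:, 1:k + 1]                  # skip self
        centroids = pts[nn].mean(axis=1)
        pts += step * (centroids - pts)
    return pts

# Noisy samples of the unit circle: filtering should tighten the radii.
rng = np.random.default_rng(0)
t = rng.uniform(0, 2 * np.pi, 200)
clean = np.stack([np.cos(t), np.sin(t)], axis=1)
noisy = clean + rng.normal(scale=0.05, size=clean.shape)
filtered = knn_mean_shift(noisy)
```

Such smoothers also shrink curved surfaces slightly toward their centre, one of the convergence problems a learned, supervised iteration is designed to avoid.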
Nonconvex third-order Tensor Recovery Based on Logarithmic Minimax Function
Recent research has shown that non-convex relaxation for low-rank tensor
recovery has gained extensive attention. In this context, we propose a new
Logarithmic Minimax (LM) function. The comparative analysis between the LM
function and the Logarithmic, Minimax concave penalty (MCP), and Minimax
Logarithmic concave penalty (MLCP) functions reveals that the proposed function
can protect large singular values while imposing stronger penalization on small
singular values. Based on this, we define a weighted tensor LM norm as a
non-convex relaxation for tensor tubal rank. Subsequently, we propose the
TLM-based low-rank tensor completion (LRTC) model and the TLM-based tensor
robust principal component analysis (TRPCA) model respectively. Furthermore, we
provide theoretical convergence guarantees for the proposed methods.
Comprehensive experiments were conducted on various real datasets, and a
comparative analysis was made with the similar EMLCP method. The results
demonstrate that the proposed method outperforms the state-of-the-art methods.
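The qualitative property claimed for the LM function, protecting large singular values while penalizing small ones more strongly, is shared by concave surrogates of the rank in general. Since the abstract does not spell out the LM function itself, the sketch below uses a generic logarithmic surrogate log(1 + sigma/gamma) purely to illustrate that behaviour; it is not the paper's penalty.

```python
import numpy as np

def log_penalty(sigma, gamma=1.0):
    """Generic concave surrogate log(1 + |sigma|/gamma); a stand-in for the
    paper's LM function, which is not given in the abstract."""
    return np.log1p(np.abs(sigma) / gamma)

def weighted_spectral_penalty(M, gamma=1.0):
    """Apply the surrogate to the singular values of a matrix; a tensor
    tubal-rank relaxation would apply this slice-wise in the Fourier domain."""
    s = np.linalg.svd(M, compute_uv=False)
    return log_penalty(s, gamma).sum()

# Concavity: an extra unit of a small singular value costs more than an extra
# unit of a large one, so small singular values are penalized harder.
small_step = log_penalty(1.0) - log_penalty(0.0)
large_step = log_penalty(10.0) - log_penalty(9.0)
```

In a completion or RPCA solver, this penalty replaces the nuclear norm in the proximal step, which is where the "protect large, penalize small" behaviour pays off.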
Geometric Data Analysis: Advancements of the Statistical Methodology and Applications
Data analysis has become fundamental to our society and comes in multiple facets and approaches. Nevertheless, in research and applications, the focus has primarily been on data from Euclidean vector spaces. Consequently, the majority of methods that are applied today are not suited for more general data types. Driven by needs from fields like image processing, (medical) shape analysis, and network analysis, more and more attention has recently been given to data from non-Euclidean spaces, particularly (curved) manifolds. This has led to the field of geometric data analysis, whose methods explicitly take the structure (for example, the topology and geometry) of the underlying space into account.
This thesis contributes to the methodology of geometric data analysis by generalizing several fundamental notions from multivariate statistics to manifolds. We thereby focus on two different viewpoints.
First, we use Riemannian structures to derive a novel regression scheme for general manifolds that relies on splines of generalized Bézier curves. It can accurately model non-geodesic relationships, for example, time-dependent trends with saturation effects or cyclic trends. Since Bézier curves can be evaluated with the constructive de Casteljau algorithm, working with data from manifolds of high dimensions (for example, a hundred thousand or more) is feasible. Relying on the regression, we further develop a hierarchical statistical model for an adequate analysis of longitudinal data in manifolds, and a method to control for confounding variables.
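The constructive de Casteljau evaluation mentioned above generalizes to manifolds by replacing straight-line interpolation with geodesic interpolation. A minimal sketch on the unit sphere, assuming spherical linear interpolation (slerp) as the geodesic step (this is an illustration of the principle, not the thesis's implementation):

```python
import numpy as np

def slerp(p, q, t):
    """Geodesic interpolation on the unit sphere: the manifold analogue of the
    straight-line step in the Euclidean de Casteljau algorithm."""
    omega = np.arccos(np.clip(p @ q, -1.0, 1.0))
    if omega < 1e-12:
        return p.copy()
    return (np.sin((1 - t) * omega) * p + np.sin(t * omega) * q) / np.sin(omega)

def de_casteljau_sphere(control_points, t):
    """Evaluate a generalized Bezier curve on S^2 by recursive geodesic
    averaging of the control points, exactly as in the Euclidean scheme."""
    pts = [p / np.linalg.norm(p) for p in control_points]
    while len(pts) > 1:
        pts = [slerp(pts[i], pts[i + 1], t) for i in range(len(pts) - 1)]
    return pts[0]

ctrl = [np.array([1.0, 0.0, 0.0]),
        np.array([0.0, 1.0, 0.0]),
        np.array([0.0, 0.0, 1.0])]
mid = de_casteljau_sphere(ctrl, 0.5)   # a point on the sphere, not in R^3's interior
```

Because each step needs only geodesics between pairs of points, the cost per evaluation scales with the number of control points, which is what keeps very high-dimensional manifolds tractable.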
We secondly focus on data that is not only manifold- but even Lie group-valued, which is frequently the case in applications. We can only achieve this by endowing the group with an affine connection structure that is generally not Riemannian. Utilizing it, we derive generalizations of several well-known dissimilarity measures between data distributions that can be used for various tasks, including hypothesis testing. Invariance under data translations is proven, and a connection to continuous distributions is given for one measure.
A further central contribution of this thesis is that it shows use cases for all notions in real-world applications, particularly in problems from shape analysis in medical imaging and archaeology. We can replicate or further quantify several known findings for shape changes of the femur and the right hippocampus under osteoarthritis and Alzheimer's, respectively. Furthermore, in an archaeological application, we obtain new insights into the construction principles of ancient sundials. Last but not least, we use the geometric structure underlying human brain connectomes to predict cognitive scores. Utilizing a sample selection procedure, we obtain state-of-the-art results.
Online Machine Learning for Inference from Multivariate Time-series
Inference and data analysis over networks have become significant areas of research due to the increasing prevalence of interconnected systems and the growing volume of data they produce. Many of these systems generate data in the form of multivariate time series, which are collections of time series data that are observed simultaneously across multiple variables. For example, EEG measurements of the brain produce multivariate time series data that record the electrical activity of different brain regions over time. Cyber-physical systems generate multivariate time series that capture the behaviour of physical systems in response to cybernetic inputs. Similarly, financial time series reflect the dynamics of multiple financial instruments or market indices over time. Through the analysis of these time series, one can uncover important details about the behaviour of the system, detect patterns, and make predictions. Therefore, designing effective methods for data analysis and inference over networks of multivariate time series is a crucial area of research with numerous applications across various fields. In this Ph.D. thesis, our focus is on identifying the directed relationships between time series and leveraging this information to design algorithms for data prediction as well as missing data imputation. This Ph.D. thesis is organized as a compendium of papers, which consists of seven chapters and appendices. The first chapter is dedicated to motivation and a literature survey, whereas in the second chapter, we present the fundamental concepts that readers should understand to grasp the material presented in the dissertation with ease. In the third chapter, we present three online nonlinear topology identification algorithms, namely NL-TISO, RFNL-TISO, and RFNL-TIRSO. In this chapter, we assume the data is generated from a sparse nonlinear vector autoregressive model (VAR), and propose online data-driven solutions for identifying nonlinear VAR topology.
We also provide convergence guarantees in terms of dynamic regret for the proposed algorithm RFNL-TIRSO. Chapters four and five of the dissertation delve into the issue of missing data and explore how the learned topology can be leveraged to address this challenge. Chapter five is distinct from other chapters in its exclusive focus on edge flow data and introduces an online imputation strategy based on a simplicial complex framework that leverages the known network structure in addition to the learned topology. Chapter six of the dissertation takes a different approach, assuming that the data is generated from nonlinear structural equation models. In this chapter, we propose an online topology identification algorithm using a time-structured approach, incorporating information from both the data and the model evolution. The algorithm is shown to have convergence guarantees achieved by bounding the dynamic regret. Finally, chapter seven of the dissertation provides concluding remarks and outlines potential future research directions.
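The online identification loop behind such methods can be sketched in its simplest, linear form: stream the samples of a VAR(1) model and update a sparse coefficient matrix with one proximal-gradient step per sample. The thesis's NL-TISO/RFNL-TISO/RFNL-TIRSO algorithms handle nonlinear VARs via kernel representations; the linear sketch below (all parameter values are illustrative) only shows the online estimation pattern.

```python
import numpy as np

def soft_threshold(x, tau):
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def online_var_topology(stream, lam=0.01, step=0.05):
    """One proximal-gradient update per incoming sample for the sparse linear
    VAR(1) model y[t] ~ A y[t-1]; a linear, illustrative analogue of online
    nonlinear topology identification.  Later iterates are averaged to tame
    gradient noise."""
    n = stream.shape[1]
    A = np.zeros((n, n)); A_avg = np.zeros((n, n)); count = 0
    T = stream.shape[0]
    for t in range(1, T):
        y_prev, y = stream[t - 1], stream[t]
        grad = np.outer(A @ y_prev - y, y_prev)   # grad of 0.5*||y - A y_prev||^2
        A = soft_threshold(A - step * grad, step * lam)
        if t > T // 2:
            A_avg += A; count += 1
    return A_avg / count

# Simulate a 3-node VAR(1) whose only directed edge is 0 -> 1.
rng = np.random.default_rng(1)
A_true = np.zeros((3, 3)); A_true[1, 0] = 0.8
y = np.zeros(3); data = []
for _ in range(4000):
    y = A_true @ y + rng.normal(size=3)
    data.append(y.copy())
A_hat = online_var_topology(np.array(data))
```

The support of the recovered matrix is the estimated topology: the entry for the true edge 0 → 1 is large, while the soft-threshold drives absent edges toward zero.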
FETA : fairness enforced verifying, training, and predicting algorithms for neural networks
Algorithmic decision-making driven by neural networks has become very prominent in applications that directly affect people's quality of life. This paper focuses on the problem of ensuring individual fairness in neural network models during verification, training, and prediction. A popular approach for enforcing fairness is to translate a fairness notion into constraints over the parameters of the model.
However, such a translation does not always guarantee fair predictions of the trained neural network model. To address this challenge, we develop a counterexample-guided post-processing technique to provably enforce fairness constraints at prediction time. Contrary to prior work that enforces fairness only on points around test or train data, we are able to enforce and guarantee fairness on all points in the domain. Additionally, we propose a counterexample-guided loss as an in-processing technique to use fairness as an inductive bias by iteratively incorporating fairness counterexamples in the learning process. We have implemented these techniques in a tool called FETA. Empirical evaluation on real-world datasets indicates that FETA is not only able to guarantee fairness on-the-fly at prediction time but is also able to train accurate models exhibiting a much higher degree of individual fairness.
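The in-processing idea of using fairness counterexamples as an inductive bias can be illustrated on a toy logistic model: for each training point, build the individual that differs only in a protected attribute and penalize the gap between the two scores. This is a simplified analogue of a counterexample-guided loss, not FETA's actual algorithm, and all data and parameter values below are synthetic.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logistic(X, y, protected_col=0, lam=0.0, epochs=500, lr=0.1):
    """Logistic regression with an optional counterexample-style penalty:
    flipping only the protected attribute should not change the score.
    A toy in-processing analogue of a counterexample-guided fairness loss."""
    w = np.zeros(X.shape[1])
    X_cf = X.copy()
    X_cf[:, protected_col] = 1 - X_cf[:, protected_col]  # the counterexamples
    n = len(y)
    for _ in range(epochs):
        p, p_cf = sigmoid(X @ w), sigmoid(X_cf @ w)
        grad = X.T @ (p - y) / n                          # cross-entropy part
        diff = p - p_cf                                   # fairness gap per point
        grad += lam * ((diff * p * (1 - p)) @ X
                       - (diff * p_cf * (1 - p_cf)) @ X_cf) / n
        w -= lr * grad
    return w

# Synthetic data: the label leaks the protected attribute a as well as a
# legitimate feature z (columns of X: [a, z, bias]).
rng = np.random.default_rng(4)
a = rng.integers(0, 2, 400)
z = rng.normal(size=400)
y = (rng.random(400) < sigmoid(1.5 * a - 0.75 + z)).astype(float)
X = np.column_stack([a, z, np.ones(400)])

w_base = train_logistic(X, y, lam=0.0)    # uses the protected attribute freely
w_fair = train_logistic(X, y, lam=20.0)   # penalty suppresses it
```

Unlike FETA's verification-backed guarantees, a penalty like this only biases training toward fairness; the abstract's post-processing step is what makes fairness provable on all points of the domain.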
Data-driven Topology Optimization of Channel Flow Problems
Typical topology optimization methods require complex iterative calculations, which cannot meet the requirements of fast-computing applications. Neural networks have been studied to reduce the time needed to compute the optimization result; however, data-driven methods for fluid topology optimization have received little discussion. This paper introduces a neural network architecture that avoids time-consuming iterative processes and has a strong generalization ability for topology optimization of Stokes flow.
Different neural network methods that have already been used successfully in solid structure optimization problems are adapted and examined for fluid topology optimization cases, including Convolutional Neural Networks (CNN), conditional Generative Adversarial Networks (cGAN), and Denoising Diffusion Implicit Models (DDIM). The presented neural network method is tested on channel flow topology optimization problems for Stokes flow. The results show that our presented method achieves high pixel accuracy and, on average, a 663-fold decrease in execution time compared with the conventional method.
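The pixel-accuracy metric used to compare predicted and reference designs is simple to state: binarize both density fields into solid/void and count matching pixels. A minimal sketch (the function name and threshold are illustrative, not taken from the paper):

```python
import numpy as np

def pixel_accuracy(pred, ref, threshold=0.5):
    """Share of pixels whose solid/void assignment matches after binarizing
    both density fields at `threshold`."""
    return float(np.mean((pred > threshold) == (ref > threshold)))

ref = np.zeros((8, 8)); ref[:, :4] = 1.0   # reference design: left half solid
pred = ref.copy(); pred[0, 0] = 0.1        # one mispredicted pixel
acc = pixel_accuracy(pred, ref)            # 63 of 64 pixels agree
```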
Temporal-spatial model via Trend Filtering
This research focuses on the estimation of a non-parametric regression function designed for data with simultaneous time and space dependencies. In such a context, we study Trend Filtering, a nonparametric estimator introduced by \cite{mammen1997locally} and \cite{rudin1992nonlinear}. For univariate settings, the signals we consider are assumed to have a kth weak derivative with bounded total variation, allowing for a general degree of smoothness. In the multivariate scenario, we study a K-Nearest Neighbor fused lasso estimator as in \cite{padilla2018adaptive}, employing an ADMM algorithm, suitable for signals with bounded variation that adhere to a piecewise Lipschitz continuity criterion. By aligning with lower bounds, the minimax optimality of our estimators is validated. A unique phase transition phenomenon, previously uncharted in Trend Filtering studies, emerges through our analysis. Both simulation studies and real data applications underscore the superior performance of our method when compared with established techniques in the existing literature.
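The lowest-order instance of trend filtering (the fused lasso, i.e. total-variation denoising) already shows the estimator's structure: penalize the l1 norm of discrete differences of the fitted signal. A small ADMM sketch for the univariate order-0 case follows; higher orders replace the first-difference matrix by higher-order differences, and this is an illustration rather than the paper's multivariate K-NN estimator.

```python
import numpy as np

def fused_lasso_admm(y, lam=1.0, rho=1.0, iters=300):
    """ADMM for min_x 0.5*||y - x||^2 + lam*||D x||_1, with D the first
    difference operator (order-0 trend filtering / total variation)."""
    n = len(y)
    D = np.diff(np.eye(n), axis=0)            # (n-1) x n difference matrix
    x = y.copy(); z = D @ x; u = np.zeros(n - 1)
    A_inv = np.linalg.inv(np.eye(n) + rho * D.T @ D)
    for _ in range(iters):
        x = A_inv @ (y + rho * D.T @ (z - u))                 # quadratic step
        Dx = D @ x
        z = np.sign(Dx + u) * np.maximum(np.abs(Dx + u) - lam / rho, 0.0)
        u += Dx - z                                           # dual update
    return x

# Piecewise-constant truth plus noise: the fit should beat the raw data.
rng = np.random.default_rng(2)
truth = np.concatenate([np.zeros(50), 2.0 * np.ones(50)])
obs = truth + rng.normal(scale=0.3, size=100)
est = fused_lasso_admm(obs, lam=2.0)
```

The l1 penalty on differences makes the estimate piecewise constant, which is exactly the adaptivity to inhomogeneous smoothness that drives the minimax results discussed above.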
PottsMGNet: A Mathematical Explanation of Encoder-Decoder Based Neural Networks
For problems in image processing and many other fields, a large class of effective neural networks has encoder-decoder-based architectures. Although these networks have achieved impressive performance, mathematical explanations of their architectures are still underdeveloped. In this paper, we study the encoder-decoder-based network architecture from the algorithmic perspective and provide a mathematical explanation. We use the two-phase Potts model for image segmentation as an example for our explanations. We associate the segmentation problem with a control problem in the continuous setting. Then, a multigrid method and an operator-splitting scheme, the PottsMGNet, are used to discretize the continuous control model. We show that the resulting discrete PottsMGNet is equivalent to an encoder-decoder-based network. With minor modifications, it is shown that a number of popular encoder-decoder-based neural networks are just instances of the proposed PottsMGNet. By incorporating Soft-Threshold-Dynamics into the PottsMGNet as a regularizer, the PottsMGNet is shown to be robust to network parameters such as network width and depth and achieves remarkable performance on datasets with very large noise. In nearly all our experiments, the new network performs as well as or better than existing networks for image segmentation in terms of accuracy and Dice score.
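The algorithmic skeleton underlying such schemes, alternating a diffusion step with a pointwise threshold that also accounts for the data-fidelity term, can be shown on a tiny two-phase example. This is a classical threshold-dynamics (MBO-style) sketch for illustration only; PottsMGNet derives its layers from a multigrid discretization and operator splitting of the continuous control problem, which this toy loop does not reproduce.

```python
import numpy as np

def box_blur(u):
    """3x3 box filter with periodic boundary: a cheap diffusion step."""
    return sum(np.roll(np.roll(u, i, 0), j, 1)
               for i in (-1, 0, 1) for j in (-1, 0, 1)) / 9.0

def mbo_two_phase(f, steps=20, tau=0.5):
    """Threshold-dynamics sketch of two-phase Potts segmentation: diffuse the
    phase indicator, then threshold it together with a region-fidelity term."""
    u = (f > f.mean()).astype(float)              # initial phase guess
    for _ in range(steps):
        c0 = f[u < 0.5].mean() if (u < 0.5).any() else 0.0
        c1 = f[u >= 0.5].mean() if (u >= 0.5).any() else 1.0
        fidelity = (f - c0) ** 2 - (f - c1) ** 2  # positive where f is near c1
        u = (box_blur(u) + tau * fidelity > 0.5).astype(float)
    return u

# Noisy image: bright square on a dark background.
rng = np.random.default_rng(3)
img = np.zeros((32, 32)); img[8:24, 8:24] = 1.0
noisy = img + rng.normal(scale=0.3, size=img.shape)
seg = mbo_two_phase(noisy)
```

The correspondence the paper exploits is that one such diffuse-then-threshold sweep resembles one convolution-plus-nonlinearity layer, which is how the discretized control model becomes an encoder-decoder network.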
Spherical and Hyperbolic Toric Topology-Based Codes On Graph Embedding for Ising MRF Models: Classical and Quantum Topology Machine Learning
The paper introduces the application of information geometry to describe the ground states of Ising models by utilizing parity-check matrices of cyclic and quasi-cyclic codes on toric and spherical topologies. The approach establishes a connection between machine learning and error-correcting coding. The proposed approach has implications for the development of new embedding methods based on trapping sets. Statistical physics and number geometry are applied to optimize error-correcting codes, leading to these embedding and sparse factorization methods. The paper establishes a direct connection between DNN architecture and error-correcting coding by demonstrating how state-of-the-art architectures (ChordMixer, Mega, Mega-chunk, CDIL, ...) from the long-range arena can be equivalent to block and convolutional LDPC codes (Cage-graph, Repeat Accumulate). QC codes correspond to certain types of chemical elements, with the carbon element being represented by the mixed automorphism Shu-Lin-Fossorier QC-LDPC code. The connections between Belief Propagation and the Permanent, Bethe-Permanent, Nishimori Temperature, and Bethe-Hessian Matrix are elaborated upon in detail. The Quantum Approximate Optimization Algorithm (QAOA) used in the Sherrington-Kirkpatrick Ising model can be seen as analogous to the back-propagation loss-function landscape in training DNNs. This similarity creates a comparable problem with TS pseudo-codewords, resembling the belief propagation method. Additionally, the layer depth in QAOA correlates with the number of decoding belief propagation iterations in the Wiberg decoding tree. Overall, this work has the potential to advance multiple fields, from Information Theory, DNN architecture design (sparse and structured prior graph topology), and efficient hardware design for Quantum and Classical DPU/TPU (graph, quantize and shift register architectures) to Materials Science and beyond.
Comment: 71 pages, 42 Figures, 1 Table, 1 Appendix. arXiv admin note: text overlap with arXiv:2109.08184 by other authors
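The basic link between parity-check matrices and Ising ground states can be made concrete on a small code: each row of the parity-check matrix defines a multi-spin interaction, and spin configurations satisfying every check (the codewords) are exactly the minimum-energy states. The sketch below uses the (7,4) Hamming code as a minimal illustration of this correspondence; the paper itself works with much larger cyclic and quasi-cyclic LDPC codes on toric and spherical topologies.

```python
import numpy as np
from itertools import product

# Parity-check matrix of the (7,4) Hamming code; each row is one check,
# i.e. one multi-spin interaction in the Ising picture.
H = np.array([[1, 0, 1, 0, 1, 0, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]])

def ising_energy(bits):
    """E = -sum over checks of prod_{i in check} s_i with s_i = (-1)^bit:
    a satisfied parity check contributes -1 and a violated one +1, so the
    codewords (all checks satisfied) are exactly the ground states."""
    s = 1 - 2 * np.asarray(bits)
    return -sum(np.prod(s[row == 1]) for row in H)

# Enumerate all 2^7 spin configurations and keep the minimum-energy ones.
ground = [b for b in product((0, 1), repeat=7)
          if ising_energy(b) == -H.shape[0]]
```

The 16 ground states found this way coincide with the 2^4 codewords of the code, which is the ground-state/codeword dictionary the abstract builds on.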