
    An Analysis by Synthesis Approach for Automatic Vertebral Shape Identification in Clinical QCT

    Quantitative computed tomography (QCT) is a widely used tool for osteoporosis diagnosis and monitoring. The assessment of cortical markers such as cortical bone mineral density (BMD) and thickness is a demanding task, mainly because of the limited spatial resolution of QCT. We propose a direct, model-based method to automatically identify the surface through the center of the cortex of the human vertebra. We develop a statistical bone model and analyze its probability distribution after the imaging process. Using an as-rigid-as-possible deformation, we find the cortical surface that maximizes the likelihood of our model given the input volume. Using the European Spine Phantom (ESP) and a high-resolution μCT scan of a cadaveric vertebra, we show that the proposed method is able to accurately identify the real center of the cortex ex vivo. To demonstrate the in vivo applicability of our method, we use manually obtained surfaces for comparison. Comment: Presented at the German Conference on Pattern Recognition (GCPR) 2018 in Stuttgart.
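    The analysis-by-synthesis loop behind this approach can be sketched on a toy 1-D problem: synthesize an image from the current model parameters, compare it to the observation, and keep the parameters that best explain the data. The box profile, moving-average blur, and candidate grid below are illustrative assumptions, not the paper's actual imaging model:

```python
def synthesize(center, width=4.0, n=64, blur=3):
    """Forward model: an idealized cortical profile blurred by the scanner.
    A box profile stands in for the cortex; a moving average mimics limited
    spatial resolution (both are toy assumptions)."""
    profile = [1.0 if abs(i - center) <= width / 2 else 0.0 for i in range(n)]
    half = blur // 2
    return [sum(profile[max(0, i - half):i + half + 1]) / (2 * half + 1)
            for i in range(n)]

def fit_center(observed, candidates):
    """Analysis by synthesis: pick the model parameter whose synthetic image
    best explains the observation (least squares = Gaussian likelihood)."""
    def sse(center):
        synth = synthesize(center)
        return sum((s - o) ** 2 for s, o in zip(synth, observed))
    return min(candidates, key=sse)

observed = synthesize(30.0)                  # pretend this came from the scanner
best = fit_center(observed, [c / 2 for c in range(20, 81)])  # grid 10.0 .. 40.0
print(best)  # recovers the true center, 30.0
```

    The paper optimizes a full deformable surface rather than a single scalar, but the principle is the same: the imaging process is modeled forward, never inverted directly.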

    Deformation Tracking in Depth and Color Video: An Analysis by Synthesis Approach

    The tracking of deforming objects and the reconstruction of deformations in image sequences is one of the current research areas in computer vision. In contrast to rigid scenes, which can be analyzed and reconstructed very well, general deformations come with an infinite number of sub-movements and ways to parametrize them, which makes it very difficult to formulate discrete tracking goals. In contrast to classic reconstructions based on color data alone, the combination of depth and color video provides tracking algorithms with a data foundation that leaves less room for ambiguities, but it also requires new algorithmic approaches to handle the different entities and to exploit the available data. This thesis discusses an Analysis by Synthesis (AbS) scheme as an approach to the deformation tracking problem, a method that differs in many key aspects from common reconstruction schemes. It is demonstrated that AbS-based deformation reconstruction can reconstruct complex deformations, deal with occlusions and self-occlusions, and can also be used for real-time tracking.

    Differentiable world programs

    Modern artificial intelligence (AI) has created exciting new opportunities for building intelligent robots. In particular, gradient-based learning architectures (deep neural networks) have tremendously improved 3D scene understanding in terms of perception, reasoning, and action. However, these advancements have diminished the appeal of many ``classical'' techniques developed over the last few decades. We postulate that a blend of ``classical'' and ``learned'' methods is the most promising path to developing flexible, interpretable, and actionable models of the world: a necessity for intelligent embodied agents. ``What is the ideal way to combine classical techniques with gradient-based learning architectures for a rich understanding of the 3D world?'' is the central question in this dissertation. This understanding enables a multitude of applications that fundamentally impact how embodied agents perceive and interact with their environment. This dissertation, dubbed ``differentiable world programs'', unifies efforts from multiple closely related but currently disjoint fields, including robotics, computer vision, computer graphics, and AI. Our first contribution---gradSLAM---is a fully differentiable dense simultaneous localization and mapping (SLAM) system.
By enabling gradient computation through otherwise non-differentiable components such as nonlinear least squares optimization, ray casting, visual odometry, and dense mapping, gradSLAM opens up new avenues for integrating classical 3D reconstruction and deep learning. Our second contribution---taskography---proposes a task-conditioned sparsification of large 3D scenes encoded as 3D scene graphs. This enables classical planners to match (and surpass) state-of-the-art learning-based planners by focusing computation on task-relevant scene attributes. Our third and final contribution---gradSim---is a fully differentiable simulator that composes differentiable physics and graphics engines to enable physical parameter estimation and visuomotor control, solely from videos or a still image.
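    The effect of task-conditioned sparsification can be illustrated with a minimal scene-graph pruning routine. The dictionary-based graph and the keep-ancestors criterion below are illustrative assumptions, not taskography's actual algorithm:

```python
def sparsify(scene_graph, task_objects):
    """Keep only nodes that lie on a path from the root to a task-relevant
    object; everything else is pruned before handing the graph to a planner."""
    # parent map: child -> parent, derived from the adjacency lists
    parent = {}
    for node, children in scene_graph.items():
        for child in children:
            parent[child] = node
    keep = set()
    for obj in task_objects:            # walk up from each relevant object
        while obj is not None and obj not in keep:
            keep.add(obj)
            obj = parent.get(obj)
    return {n: [c for c in cs if c in keep]
            for n, cs in scene_graph.items() if n in keep}

# A toy 3D scene graph: building -> rooms -> objects
graph = {
    "building": ["kitchen", "bedroom"],
    "kitchen": ["mug", "stove"],
    "bedroom": ["bed", "lamp"],
    "mug": [], "stove": [], "bed": [], "lamp": [],
}
print(sparsify(graph, ["mug"]))
# only the building -> kitchen -> mug chain survives
```

    A classical planner then searches the small pruned graph instead of the full scene, which is where the claimed speedup comes from.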

    Optimal Transport for Domain Adaptation

    Domain adaptation from one data space (or domain) to another is one of the most challenging tasks of modern data analytics. If the adaptation is done correctly, models built on a specific data space become more robust when confronted with data depicting the same semantic concepts (the classes) but observed by another observation system with its own specificities. Among the many strategies proposed to adapt one domain to another, finding a common representation has shown excellent properties: with a common representation for both domains, a single classifier can be effective in both and use labelled samples from the source domain to predict the unlabelled samples of the target domain. In this paper, we propose a regularized unsupervised optimal transportation model to perform the alignment of the representations in the source and target domains. We learn a transportation plan matching both probability density functions, which constrains labelled samples in the source domain to remain close during transport. In this way, we simultaneously exploit the few labelled samples in the source domain and the unlabelled distributions observed in both domains. Experiments on toy and challenging real-world visual adaptation examples show the promise of the method, which consistently outperforms state-of-the-art approaches.
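    Entropy-regularized optimal transport of this kind is commonly solved with Sinkhorn's alternating scaling iterations. A minimal pure-Python sketch between two uniform distributions (without the paper's label-based regularizer) is:

```python
import math

def sinkhorn(cost, reg=0.1, n_iter=200):
    """Entropy-regularized optimal transport between two uniform
    distributions via Sinkhorn's alternating row/column scaling."""
    n, m = len(cost), len(cost[0])
    a, b = [1.0 / n] * n, [1.0 / m] * m           # uniform marginals
    K = [[math.exp(-c / reg) for c in row] for row in cost]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iter):
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # transport plan: P[i][j] = u[i] * K[i][j] * v[j]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

# two source points, two target points; cost = squared distance
cost = [[0.0, 1.0], [1.0, 0.0]]
P = sinkhorn(cost)
print([[round(p, 3) for p in row] for row in P])
# mass concentrates on the cheap pairings (the diagonal)
```

    The paper's class-based regularization adds a penalty to this plan so that source samples sharing a label are transported together; the scaling loop itself stays the same.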

    LFGCN: Levitating over Graphs with Levy Flights

    Due to its high utility in many applications, from social networks to blockchains to power grids, deep learning on non-Euclidean objects such as graphs and manifolds, coined Geometric Deep Learning (GDL), continues to attract ever-increasing interest. We propose a new Lévy Flights Graph Convolutional Networks (LFGCN) method for semi-supervised learning, which casts Lévy flights as random walks on graphs and, as a result, both accurately accounts for the intrinsic graph topology and substantially improves classification performance, especially for heterogeneous graphs. Furthermore, we propose a new preferential P-DropEdge method based on the Girvan-Newman argument. That is, in contrast to the uniform removal of edges in DropEdge, we follow the Girvan-Newman algorithm: we detect network periphery structures using information on edge betweenness and then remove edges according to their betweenness centrality. Our experimental results on semi-supervised node classification tasks demonstrate that LFGCN coupled with P-DropEdge accelerates training, increases stability, and further improves the predictive accuracy of the learned graph topology structure. Finally, in our case studies we bring the machinery of LFGCN and other deep network tools to the analysis of power grid networks, an area where the utility of GDL remains untapped. Comment: To appear in the 2020 IEEE International Conference on Data Mining (ICDM).
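    The preferential edge removal can be sketched as follows, assuming the edge-betweenness scores have already been computed (e.g. with Brandes' algorithm); the toy graph, the scores, and the drop fraction are all illustrative:

```python
def p_dropedge(edges, betweenness, drop_frac=0.25):
    """Preferential DropEdge: instead of removing edges uniformly at random,
    remove the top `drop_frac` fraction ranked by betweenness centrality,
    targeting the bridge-like edges that the Girvan-Newman argument singles
    out."""
    k = int(len(edges) * drop_frac)
    ranked = sorted(edges, key=lambda e: betweenness[e], reverse=True)
    dropped = set(ranked[:k])
    return [e for e in edges if e not in dropped]

# toy graph: two triangles joined by a single bridge edge (1, 2);
# the scores are illustrative, not computed here
edges = [(0, 1), (0, 4), (1, 4), (1, 2), (2, 3), (2, 5), (3, 5)]
betweenness = {(0, 1): 4.0, (0, 4): 1.0, (1, 4): 4.0, (1, 2): 9.0,
               (2, 3): 4.0, (2, 5): 4.0, (3, 5): 1.0}
print(p_dropedge(edges, betweenness))
# the bridge (1, 2) is removed first
```

    A stochastic variant would sample the dropped edges with probability proportional to betweenness rather than deterministically taking the top of the ranking.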

    Quantum Computing for Machine Learning and Physics Simulation

    Quantum computing is widely thought to provide exponential speedups over classical algorithms for a variety of computational tasks. In classical computing, methods in artificial intelligence such as neural networks and adversarial learning have enabled drastic improvements in state-of-the-art performance for a variety of tasks. We consider the intersection of quantum computing with machine learning, including quantum algorithms for deep learning on classical datasets, quantum adversarial learning for quantum states, and variational quantum machine learning for improved physics simulation. We consider a standard deep neural network architecture and show that the conditions amenable to trainability by gradient descent coincide with those necessary for an efficient quantum algorithm. Considering the neural network in the infinite-width limit using the neural tangent kernel formalism, we propose a quantum algorithm to train the neural network with vanishing error as the training dataset size increases. Under a sparse approximation of the neural tangent kernel, the training time scales logarithmically with the number of training examples, providing the first known exponential quantum speedup for feedforward neural networks. Related approximations to the neural tangent kernel are discussed, with numerical studies showing successful convergence beyond the proven regime. Our work suggests the applicability of quantum computing to additional neural network architectures and common datasets such as MNIST, as well as to kernel methods beyond the neural tangent kernel. Generative adversarial networks (GANs) are one of the most widely adopted machine learning methods for data generation. We propose an entangling quantum GAN (EQ-GAN) that overcomes some limitations of previously proposed quantum GANs.
EQ-GAN guarantees convergence to a Nash equilibrium under minimax optimization of the discriminator and generator circuits by performing entangling operations between both the generator output and the true quantum data. We show that EQ-GAN has additional robustness against coherent errors and demonstrate its effectiveness experimentally on a Google Sycamore superconducting quantum processor. By adversarially learning efficient representations of quantum states, we prepare an approximate quantum random access memory and demonstrate its use in applications including the training of near-term quantum neural networks. With quantum computers providing a natural platform for physics simulation, we investigate the use of variational quantum circuits to simulate many-body systems with high fidelity in the near future. In particular, recent work shows that the teleportation caused by introducing a weak coupling between two entangled SYK models is dual to a particle traversing an AdS-Schwarzschild wormhole, providing a mechanism to probe quantum gravity theories in the lab. To simulate such a system, we propose the process of compressed Trotterization to improve the fidelity of time evolution on noisy devices. The task of learning approximate time-evolution circuits is shown to have a favorable training landscape, and numerical experiments demonstrate its relevance to simulating other many-body systems such as the Fermi-Hubbard model. For the SYK model in particular, we demonstrate the construction of a low-rank approximation that favors a shallower Trotterization. Finally, classical simulations of finite-N SYK models suggest that teleportation via a traversable wormhole, instead of random unitary scrambling, is achievable with O(20) qubits, providing further indication that such quantum gravity experiments may be realizable with near-term quantum hardware.
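    The compressed Trotterization above is learned variationally; the plain first-order product formula it improves on can be sketched on a toy two-level system with H = X + Z (an illustrative Hamiltonian, not the SYK model):

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

I = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]

def exp_pauli(P, theta):
    """exp(-i*theta*P) for a Pauli matrix P (P^2 = I):
    cos(theta) I - i sin(theta) P."""
    return [[math.cos(theta) * I[i][j] - 1j * math.sin(theta) * P[i][j]
             for j in range(2)] for i in range(2)]

def trotter(t, n):
    """First-order Trotterization of exp(-i (X + Z) t) with n steps."""
    step = matmul(exp_pauli(X, t / n), exp_pauli(Z, t / n))
    U = I
    for _ in range(n):
        U = matmul(U, step)
    return U

def exact(t):
    """Exact evolution: (X + Z)/sqrt(2) squares to I, so
    exp(-i H t) = cos(sqrt(2) t) I - i sin(sqrt(2) t) (X + Z)/sqrt(2)."""
    c, s = math.cos(math.sqrt(2) * t), math.sin(math.sqrt(2) * t)
    return [[c * I[i][j] - 1j * s * (X[i][j] + Z[i][j]) / math.sqrt(2)
             for j in range(2)] for i in range(2)]

def error(t, n):
    E, U = exact(t), trotter(t, n)
    return max(abs(E[i][j] - U[i][j]) for i in range(2) for j in range(2))

# the error shrinks roughly like 1/n for a first-order product formula,
# which is why deep circuits are needed on hardware; compressing that depth
# while keeping the fidelity is the point of the learned approach
print(error(1.0, 10) > error(1.0, 100) > error(1.0, 1000))
```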

    Modeling functional brain activity of human working memory using deep recurrent neural networks

    In cognitive systems, the role of working memory is crucial for visual reasoning and decision making. Tremendous progress has been made in understanding the mechanisms of human/animal working memory, as well as in formulating different frameworks of memory-augmented artificial neural networks. The overall objective of our project is to train artificial neural network models that are capable of consolidating memory over a short period of time to solve a memory task, and to relate them to the brain activity of humans who solved the same task. The project is interdisciplinary in nature, trying to bridge aspects of Artificial Intelligence (deep learning) and Neuroscience. The cognitive task used is the N-back task, a very popular one in Cognitive Neuroscience, in which subjects are presented with a sequence of images, each of which needs to be identified as to whether it was already seen or not. The functional imaging (fMRI) dataset used was collected as part of the Courtois NeuroMod Project. We study multiple variants of recurrent neural network models that learn to remember input images across timesteps. These trained neural networks, optimized for the memory task, are ultimately used to generate feature representations for the stimulus images seen by the human subjects during their recordings while solving the task. The representations derived from these neural networks are then used to create an encoding model to predict the fMRI BOLD activity of the subjects.
We then probe the relationship between the neural network model and brain activity by analyzing the model's predictive ability in different areas of the brain that are involved in working memory. This work presents a way of using artificial neural networks to model the behavior and information processing of the brain's working memory, and of using brain imaging data captured from human subjects during the N-back task to potentially understand some memory mechanisms of the brain in relation to these artificial neural network models.
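    The N-back task itself is easy to pin down in code. A minimal ground-truth generator for the match/no-match responses (the stimulus sequence below is illustrative) is:

```python
def n_back_targets(stimuli, n=2):
    """Ground-truth responses for the N-back task: at each step, is the
    current stimulus identical to the one presented n steps earlier?
    The first n positions have no valid comparison and are labeled False."""
    return [i >= n and stimuli[i] == stimuli[i - n]
            for i in range(len(stimuli))]

seq = ["A", "B", "A", "C", "A", "C", "C"]
print(n_back_targets(seq, n=2))
# [False, False, True, False, True, True, False]
```

    Sequences of images paired with targets of this form are what the recurrent networks in this work are trained on, step by step.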