Search CORE

455 research outputs found

Robotic-assisted approaches for image-controlled ultrasound procedures

Author: Correia Guilherme Alexandre da Costa
Publication date: 01/01/2019

Integrated master's thesis, Biomedical Engineering and Biophysics (Clinical Engineering and Medical Instrumentation), Universidade de Lisboa, Faculdade de Ciências, 2019. Ultrasound (US) imaging is currently one of the most widely used imaging modalities in medicine, for several reasons. Compared with other modalities such as computed tomography (CT) and magnetic resonance imaging (MRI), its combination of portability and low cost with real-time image acquisition gives it enormous flexibility in its medical applications. These applications range from simple diagnosis in gynaecology and obstetrics to tasks that require high precision, such as image-guided surgery or brachytherapy in oncology. However, unlike its counterparts, and because of the physical principle from which the images arise, image quality depends heavily on the user's skill in placing and orienting the US probe over the correct region of interest (ROI), as well as on the user's ability to interpret the acquired images and spatially locate structures in the patient's body. To make diagnostic procedures less error-prone and image-guided procedures more precise, it is increasingly common to couple this imaging modality with a robotic approach whose control is based on the acquired image. This makes it possible to build diagnostic and therapeutic systems that are semi-autonomous, fully autonomous, or cooperative with their user. The task requires knowledge and resources from multiple fields, including computer vision, image processing and control theory.
In approaches of this kind, the US probe acts as a camera into the patient's body, and the control process relies on parameters such as the spatial information of a given target structure present in the acquired image. This information, extracted through several image-processing stages, is used as feedback in the control loop of the robotic system. Spatial information extraction and control should be as autonomous and fast as possible, so that the resulting system can act in situations that require a real-time response. The aim of this project was therefore to develop, implement and validate, in MATLAB, the foundations of an approach for semi-autonomous image-based control of a robotic US system that enables tracking of target structures and the automation of general diagnostic procedures with this imaging modality. To this end, a semi-autonomous program was implemented on this platform that can track contours in US images and produce information about their position and orientation in the image. The program was designed to be compatible with a real-time approach using a SONOSITE TITAN acquisition system, whose image acquisition rate is 25 fps. The program relies heavily on computer vision concepts such as moment computation and active contours, the latter being the main engine of the tracking tool. In general terms, the program can be described as an implementation of contour tracking based on active contours. This type of contour benefits from an underlying physical model that allows it to be attracted to, and converge on, particular image features such as lines, boundaries, corners or specific regions, as a result of minimising an energy functional defined for its boundary.
To simplify and speed up its implementation, this dynamic model parametrises the contours with harmonic functions, so that its system variables are Fourier descriptors. Being based on the principle of least energy, the system fits the Euler-Lagrange formulation of mechanics for physical systems, from which systems of differential equations can be derived that describe the evolution of a contour over time. This evolution depends not only on the internal energy of the contour itself, due to the tension and cohesion forces between points, but also on external forces that guide it in the image. These external forces are chosen according to the purpose of the contour and are generally derived from information present in the image, such as intensities, gradients and higher-order derivatives. Finally, the system is implemented using an explicit Euler method, which discretises the system and provides an iterative expression for its evolution from a previous state to a future state that takes into account the external effects of the image. Once implemented, the performance of the semi-automatic tracking program was validated. The validation focused on two aspects: the robustness of contour tracking when coupled to a US probe, and the temporal efficiency of the program and its compatibility with real-time image acquisition systems. Before validation, the acquisition system was first spatially calibrated in a simple manner, using an N-wire phantom built in acrylic that produces recognisable patterns in the ultrasound image. Vertical, horizontal and diagonal patterns were used to calibrate the image, and it can be concluded that the first two yield better values for the real spacing between pixels of the US image.
Finally, the robustness of the program was tested using 5% (m/m) agar-agar phantoms embedded with hypoechoic structures, simulated by water balloons, built specifically for this purpose. For this setup the program shows satisfactory stability and robustness for various translational and rotational movements of the US probe within the image plane, and also shows promising results in responding to the elongation of structures caused by movements of the US probe out of the image plane. The temporal performance of the program was validated with the program running standalone, using videos acquired in the previous phase, for active contour models with different levels of detail. The computation time of the algorithm on each video frame was measured and its mean was calculated. This value is within the expected levels and is easily compatible with the current probe setup, whose acquisition rate is 25 fps, reaching standalone values in the range of 40 to 50 fps. Despite showing promising temporal performance and robustness, this approach still has some limitations for which it does not yet have a solution. These limitations include: support for simultaneous tracking of multiple contours for more complex target structures; the detection and resolution of topological events in the contours, such as merging, splitting and self-intersection; the automatic adaptation of the parameters of the system of equations to different image noise levels; and, finally, the specificity of the image potentials so that the approach converges on image regions that encode specific tissue types.
Although it could benefit from some improvements, this project achieved its stated objective, providing an efficient and robust implementation of a contour-tracking program and laying the foundations on which future work can build towards an autonomous US diagnostic system. It also demonstrated the usefulness of an active contour approach for building tracking algorithms that are robust to movements of target structures in the image and compatible with real-time approaches.
Ultrasound (US) systems are very popular in the medical field for several reasons. Compared to other imaging techniques such as CT or MRI, the combination of low-priced and portable hardware with real-time image acquisition enables great flexibility regarding medical applications, from simple diagnostic tasks to high-precision ones, including those with robotic assistance. Unlike other techniques, the image quality and procedure accuracy are highly dependent on user skills for spatial ultrasound probe positioning and orientation around a region of interest (ROI) for inspection. To make diagnostics less prone to error and guided procedures more precise, and consequently safer, the US approach can be coupled to a robotic system. The probe acts as a camera into the patient's body and relevant imaging information can be used to control a robotic arm, enabling the creation of semi-autonomous, cooperative and possibly fully autonomous diagnostics and therapeutics. In this project our aim is to develop a semi-autonomous tool for tracking defined structures of interest within US images that outputs meaningful spatial information about a target structure (location of the centre of mass [CM], main orientation and elongation). Such a tool must meet real-time requirements for future use in autonomous image-guided robotic systems.
To this end, the concepts of moment-based visual servoing and active contours are fundamental. Active contours possess an underlying physical model allowing deformation according to image information, such as edges, image regions and specific image features. Additionally, the mathematical framework of vision-based control enables us to establish the types of information needed to control a future autonomous system and how such information can be transformed to specify a desired task. Once implemented in MATLAB, the tracking and temporal performance of this approach is tested on purpose-built agar-agar phantoms embedded with water-filled balloons, demonstrating stability, robustness to translational and rotational probe motion, and a promising capability to respond to target structure deformations. The temporal performance of the developed framework is also within the expected levels, being compatible with a 25 frames per second image acquisition setup. The framework also includes a standalone tool capable of handling 50 fps. Thus, this work lays the foundation for US-guided procedures compatible with real-time approaches for moving and deforming targets.
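The spatial outputs named in the abstract above (centre of mass, main orientation and elongation) can be computed from image moments. The following is a minimal sketch in Python with NumPy, not the thesis's MATLAB implementation: the function name `shape_from_moments`, the binary-mask input and the definition of elongation as the ratio of principal axis lengths are all assumptions made for illustration.

```python
import numpy as np

def shape_from_moments(mask):
    """Centroid (x, y), main orientation (rad) and elongation of a binary mask.

    Hypothetical helper: the target structure is assumed to be given as a
    boolean or 0/1 array, e.g. the filled interior of a tracked contour.
    """
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        raise ValueError("mask contains no foreground pixels")

    # First-order moments: centre of mass in pixel coordinates.
    cx, cy = xs.mean(), ys.mean()

    # Second-order central moments, normalised by the area.
    mu20 = ((xs - cx) ** 2).mean()
    mu02 = ((ys - cy) ** 2).mean()
    mu11 = ((xs - cx) * (ys - cy)).mean()

    # Orientation of the principal axis, measured from the x axis
    # (image convention: y grows downwards).
    theta = 0.5 * np.arctan2(2.0 * mu11, mu20 - mu02)

    # Eigenvalues of the 2x2 covariance matrix give the axis variances.
    spread = np.sqrt(4.0 * mu11 ** 2 + (mu20 - mu02) ** 2)
    lam_major = 0.5 * (mu20 + mu02 + spread)
    lam_minor = 0.5 * (mu20 + mu02 - spread)

    # Assumed definition: ratio of principal axis lengths (>= 1).
    elongation = np.sqrt(lam_major / lam_minor) if lam_minor > 0 else np.inf
    return (cx, cy), theta, elongation


if __name__ == "__main__":
    demo = np.zeros((20, 40), dtype=bool)
    demo[5:10, 10:30] = True  # axis-aligned rectangle, wider than tall
    print(shape_from_moments(demo))
```

For an axis-aligned rectangle that is wider than it is tall, the orientation comes out as zero and the elongation is greater than one; a servoing loop could use the change in these three quantities between frames as its feedback signal.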

Universidade de Lisboa: Repositório.UL

InShaDe: Invariant Shape Descriptors for visual 2D and 3D cellular and nuclear shape analysis and classification

Author: Cali C.
Publication date: 01/01/2021

Institutional Research Information System University of Turin

Systems Structure and Control

Publication venue: IntechOpen
Publication date: 20/04/2021

The book Systems Structure and Control covers a broad field of theory and applications of many different control approaches applied to different classes of dynamic systems. Output and state feedback control methods include, among others, robust control, optimal control and intelligent control methods such as fuzzy or neural network approaches; the dynamic systems considered are, for example, linear or nonlinear, with or without time delay, fixed or uncertain, one-dimensional or multidimensional. The applications cover all branches of human activity, including any kind of industry, economics, biology, social sciences, etc.

Directory of Open Access Books (DOAB)

Structure-Preserving Model Reduction of Physical Network Systems

Author: Schaft Arjan van der
Publication venue: Springer International Publishing AG
Publication date: 01/01/2022

This paper considers physical network systems where the energy storage is naturally associated with the nodes of the graph, while the edges of the graph correspond to static couplings. The first sections deal with the linear case, covering examples such as mass-damper and hydraulic systems, which have a structure that is similar to symmetric consensus dynamics. The last section is concerned with a specific class of nonlinear physical network systems, namely detailed-balanced chemical reaction networks governed by mass action kinetics. In both cases, linear and nonlinear, the structure of the dynamics is similar, and is based on a weighted Laplacian matrix, together with an energy function capturing the energy storage at the nodes. We discuss two methods for structure-preserving model reduction. The first is clustering: aggregating the nodes of the underlying graph to obtain a reduced graph. The second approach is based on neglecting the energy storage at some of the nodes, and subsequently eliminating those nodes (called Kron reduction).
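As an illustration of the second method, Kron reduction of a weighted Laplacian is commonly written as a Schur complement: with the nodes partitioned into kept and eliminated sets, the reduced matrix is L11 - L12 L22^{-1} L21. The sketch below assumes that standard form; the function name `kron_reduce` and the three-node example are hypothetical and not taken from the paper.

```python
import numpy as np

def kron_reduce(laplacian, keep):
    """Eliminate all nodes not in `keep` from a weighted graph Laplacian.

    Assumes the standard Schur-complement form of Kron reduction:
    L_red = L11 - L12 @ inv(L22) @ L21, where block 1 holds the kept nodes
    and block 2 the eliminated ones. L22 must be invertible, which holds
    for a connected graph when at least one node is kept.
    """
    L = np.asarray(laplacian, dtype=float)
    keep = list(keep)
    drop = [i for i in range(L.shape[0]) if i not in keep]
    L11 = L[np.ix_(keep, keep)]
    if not drop:
        return L11  # nothing to eliminate
    L12 = L[np.ix_(keep, drop)]
    L21 = L[np.ix_(drop, keep)]
    L22 = L[np.ix_(drop, drop)]
    # Solve L22 @ X = L21 instead of forming the inverse explicitly.
    return L11 - L12 @ np.linalg.solve(L22, L21)


if __name__ == "__main__":
    # Path graph 0 - 1 - 2 with unit edge weights (illustrative only).
    L = np.array([[1.0, -1.0, 0.0],
                  [-1.0, 2.0, -1.0],
                  [0.0, -1.0, 1.0]])
    print(kron_reduce(L, keep=[0, 2]))
```

For the unit-weight path graph in the demo, eliminating the middle node leaves a single edge of weight 0.5 between the two end nodes, and the reduced matrix is again a symmetric Laplacian with zero row sums, which is the structure-preserving property the abstract refers to.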

Proceedings - University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

Output feedback sliding mode control for time delay systems

Author: Han Xiaoran
Publication date: 22/09/2022

This thesis considers Sliding Mode Control (SMC) for linear systems subject to uncertainties and delays using output feedback. Delay is a natural phenomenon in many practical systems, and its effect can be a potential cause of performance deterioration or even instability. To achieve better control performance, SMC with output feedback is considered for its inherent robustness and practicality of implementation. In highlighting the main results, firstly a novel output feedback SMC design is presented which formulates the problem as Linear Matrix Inequalities (LMIs). The efficiency of the design is compared with the existing literature on pole assignment, eigenstructure assignment and other LMI methods, which either require more constraints on system structures or are computationally less tractable. For systems with time-varying state delay, the method is extended to incorporate the delay effect in the controller synthesis. Both sliding surface and controller design are formulated as LMI problems. For systems with input/output delays and disturbances, the robustness of SMC is degraded by an arbitrarily small delay appearing in the high-frequency switching component of the controller. To solve this problem, a singular perturbation method is used to achieve bounded performance which is proportional to the magnitudes of the delay, disturbance and switching gain. The applied research has produced two practical implementation studies. The first relates to the pointing control of an autonomous vehicle subjected to external disturbances and friction resulting from the motion of the vehicle crossing rough terrain. The second concerns the attitude control of a flexible spacecraft with respect to roll, pitch and yaw attitude angles.

Kent Academic Repository

Development and application of force fields for molecular simulations

Author: Konrad Manuel
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 13/07/2021

Soft matter simulations cover a broad spectrum of applications, such as the modelling of biomolecules, polymers and materials for organic electronics. To reach the length and time scales of relevant phenomena, the interactions in these systems are usually computed with computationally efficient analytical force fields. One part of this thesis describes an example application of force-field-based modelling of amorphous organic semiconductors. The conventional force field approach, however, introduces parameters that must be assigned from parameter sets suitable for the molecule under consideration. Mainly because of the simple functional forms used for the non-covalent interactions, the procedure for determining these parameter sets requires empirical target values, which are not always available. Bottom-up approaches, such as bottom-up force fields with fixed functional forms or potentials based on neural networks, aim to replace the experimental data with results from ab initio calculations. For use in large-scale molecular dynamics simulations, these methods still face open challenges. Fixed functional forms suffer from limited flexibility in reproducing the ab initio potential energy surface and require manual type definitions to reduce the number of parameters. Potentials based on neural networks improve both aspects, but their high computational demands limit the accessible length and time scales. This thesis presents a novel bottom-up approach to modelling non-covalent interactions that is designed for large-scale simulations. The concept of efficient additive interactions is combined with the flexibility of artificial neural networks for interpolating across different chemical compositions and geometric arrangements.
The application of the model is demonstrated in molecular dynamics simulations, and a comparison of the computed thermodynamic properties of several small organic molecules with experimental data and conventional force fields shows promising predictive performance. In addition, the model preserves the energy decomposition into physically motivated components provided by symmetry-adapted perturbation theory, which is used for the ab initio reference calculations. This separability and the independence from empirical data make the model potentially useful for future materials design applications.

KITopen

MANIFOLD REPRESENTATIONS OF MUSICAL SIGNALS AND GENERATIVE SPACES

Author: A.C.A. CHEMLA ROMEU SANTOS
Publication venue: Università degli Studi di Milano
Publication date: 30/01/2020

Among the diverse research fields within computer music, synthesis and generation of audio signals epitomize the cross-disciplinarity of this domain, jointly nourishing both scientific and artistic practices since its creation. Inherent in computer music since its genesis, audio generation has inspired numerous approaches, evolving both with musical practices and scientific/technical advances. Moreover, some synthesis processes also naturally handle the reverse process, named analysis, such that synthesis parameters can also be partially or totally extracted from actual sounds, providing an alternative representation of the analyzed audio signals. On top of that, the recent rise of machine learning algorithms has earnestly questioned the field of scientific research, bringing powerful data-centred methods that raised several epistemological questions amongst researchers, in spite of their efficiency. In particular, a family of machine learning methods, called generative models, focuses on the generation of original content using features extracted from an existing dataset. Such methods have questioned not only previous approaches to generation, but also the way of integrating these methods into existing creative processes. While these new generative frameworks are progressively being introduced in the domain of image generation, the application of such generative techniques in audio synthesis is still marginal. In this work, we aim to propose a new audio analysis-synthesis framework based on these modern generative models, enhanced by recent advances in machine learning. We first review existing approaches, both in sound synthesis and in generative machine learning, and focus on how our work inserts itself in both practices and what can be expected from their collation.
Subsequently, we focus more closely on generative models, and on how modern advances in the domain can be exploited to allow us to learn complex sound distributions, while being sufficiently flexible to be integrated in the creative flow of the user. We then propose an inference/generation process, mirroring the analysis/synthesis paradigms that are natural in the audio processing domain, using latent models that are based on a continuous higher-level space, which we use to control the generation. We first provide preliminary results of our method applied to spectral information extracted from several datasets, and evaluate the obtained results both qualitatively and quantitatively. Subsequently, we study how to make these methods more suitable for learning audio data, tackling three different aspects in succession. First, we propose two different latent regularization strategies specifically designed for audio, based on signal/symbol translation and on perceptual constraints. Then, we propose different methods to address the inner temporality of musical signals, based on the extraction of multi-scale representations and on prediction, that allow the obtained generative spaces to also model the dynamics of the signal. In the last chapter, we move from our scientific approach to a more research-and-creation-oriented point of view: first, we describe the architecture and the design of our open-source library, vsacids, which aims to be used by expert and non-expert music makers as an integrated creation tool. Then, we propose a first musical use of our system through the creation of a real-time performance, called ægo, based jointly on our framework vsacids and on an explorative agent that uses reinforcement learning and is trained during the performance.
Finally, we draw some conclusions on the different ways to improve and reinforce the proposed generation method, as well as on possible further creative applications.

AIR Universita degli studi di Milano

New methods for deep dictionary learning and for image completion

Author: Huang Junjie
Publication venue: Electrical and Electronic Engineering, Imperial College London
Publication date: 01/01/2020

Digital imaging plays an essential role in many aspects of our daily life. However, due to the hardware limitations of the imaging devices, the image measurements are usually impaired and require further processing to enhance the quality of the raw images in order to enable applications on the user side. Image enhancement aims to improve the information content within image measurements by exploiting the properties of the target image and the forward model of the imaging device. In this thesis, we aim to tackle two specific image enhancement problems, that is, single image super-resolution and image completion. First, we present a new Deep Analysis Dictionary Model (DeepAM) which consists of multiple layers of analysis dictionaries with associated soft-thresholding operators and a single layer of synthesis dictionary for single image super-resolution. To achieve an effective deep model, each analysis dictionary has been designed to be composed of an Information Preserving Analysis Dictionary (IPAD), which passes essential information from the input signal to the output, and a Clustering Analysis Dictionary (CAD), which generates a discriminative feature representation. The parameters of the deep analysis dictionary model are optimized using a layer-wise learning strategy. We demonstrate that both the proposed deep dictionary design and the learning algorithm are effective. Simulation results show that the proposed method achieves comparable performance with Deep Neural Networks and other existing methods. We then generalize DeepAM to a Deep Convolutional Analysis Dictionary Model (DeepCAM) by learning convolutional dictionaries instead of unstructured dictionaries. The convolutional dictionary is more suitable for processing high-dimensional signals like images and has only a small number of free parameters. By exploiting the properties of a convolutional dictionary, we present an efficient convolutional analysis dictionary learning algorithm.
The IPAD and the CAD parts are learned using variations of the proposed convolutional analysis dictionary learning algorithm. We demonstrate that DeepCAM is an effective multi-layer convolutional model and achieves better performance than DeepAM while using a smaller number of parameters. Finally, we present an image completion algorithm based on dense correspondence between the input image and an exemplar image retrieved from the Internet which has been taken at a similar position. The dense correspondence, which is estimated using a hierarchical PatchMatch algorithm, is usually noisy and has a large occlusion area corresponding to the region to be completed. By modelling the dense correspondence as a smooth field, an Expectation-Maximization (EM) based method is presented to interpolate a smooth field over the occlusion area, which is then used to transfer image content from the exemplar image to the input image. Color correction is further applied to diminish the possible color differences between the input image and the exemplar image. Numerical results demonstrate that the proposed image completion algorithm is able to achieve photo-realistic image completion results.
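The layer structure described in this abstract (analysis dictionaries, each followed by soft-thresholding, then a single synthesis dictionary) can be sketched as a forward pass. This is a minimal sketch under stated assumptions, not the thesis's implementation: the function names, the use of plain matrices for the dictionaries and the scalar thresholds are hypothetical, and nothing here reproduces the IPAD/CAD split or the learning algorithm.

```python
import numpy as np

def soft_threshold(a, lam):
    """Element-wise soft-thresholding: shrink each entry towards zero by lam."""
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)

def deep_analysis_forward(x, analysis_dicts, thresholds, synthesis_dict):
    """Forward pass through stacked analysis layers and one synthesis layer.

    Assumed form: z_{k+1} = soft_threshold(Omega_k @ z_k, lam_k), followed by
    the output D @ z_K. The dictionaries are passed in as given matrices.
    """
    z = np.asarray(x, dtype=float)
    for omega, lam in zip(analysis_dicts, thresholds):
        z = soft_threshold(omega @ z, lam)
    return synthesis_dict @ z


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal(8)                # toy input signal
    omegas = [rng.standard_normal((16, 8)),   # layer 1: 8 -> 16 features
              rng.standard_normal((32, 16))]  # layer 2: 16 -> 32 features
    D = rng.standard_normal((8, 32))          # synthesis: 32 -> 8 samples
    print(deep_analysis_forward(x, omegas, [0.5, 0.5], D).shape)
```

Soft-thresholding zeroes small coefficients and shrinks the rest, so each layer produces a sparse feature vector; in the model described above, the dictionaries and thresholds would be learned layer by layer rather than drawn at random as in this toy demo.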

Spiral - Imperial College Digital Repository