
    MARS: Masked Automatic Ranks Selection in Tensor Decompositions

    Tensor decomposition methods are known to be efficient for compressing and accelerating neural networks. However, the problem of determining the optimal decomposition structure, while quite important, is still not well studied. Specifically, the decomposition ranks are the crucial parameters controlling the compression-accuracy trade-off. In this paper, we introduce MARS -- a new efficient method for the automatic selection of ranks in general tensor decompositions. During training, the procedure learns binary masks over decomposition cores that "select" the optimal tensor structure. The learning is performed via relaxed maximum a posteriori (MAP) estimation in a specific Bayesian model. The proposed method achieves better results than previous works on various tasks.
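    As a rough illustration of the mask idea (a toy sketch under assumed shapes, not the authors' implementation), consider a CP decomposition whose rank-one components are kept or dropped by thresholding a mask, so the effective rank is read off the mask rather than fixed in advance:

```python
import numpy as np

def cp_reconstruct(factors, mask):
    """Rebuild a 3-way tensor from CP factor matrices, keeping only the
    rank-one components whose (relaxed) mask value exceeds 0.5."""
    A, B, C = factors                      # shapes (I, R), (J, R), (K, R)
    keep = mask > 0.5                      # hard selection from a relaxed mask
    T = np.einsum('ir,jr,kr->ijk', A[:, keep], B[:, keep], C[:, keep])
    return T, int(keep.sum())              # tensor and the selected rank

rng = np.random.default_rng(0)
R = 5                                       # nominal (maximal) rank
factors = [rng.normal(size=(4, R)),
           rng.normal(size=(6, R)),
           rng.normal(size=(3, R))]
mask = np.array([0.9, 0.1, 0.8, 0.2, 0.7])  # stand-in for learned mask probabilities
T, rank = cp_reconstruct(factors, mask)     # 3 of the 5 components survive
```

    In MARS itself the masks are learned jointly with the decomposition cores via relaxed MAP estimation; the hard threshold above only mimics the final selection step.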

    AIDA: An Active Inference-based Design Agent for Audio Processing Algorithms

    In this paper we present AIDA, an active inference-based agent that iteratively designs a personalized audio processing algorithm through situated interactions with a human client. The target application of AIDA is to propose, on the spot, the most interesting alternative values for the tuning parameters of a hearing aid (HA) algorithm whenever an HA client is not satisfied with their HA performance. AIDA interprets searching for the "most interesting alternative" as a problem of optimal (acoustic) context-aware Bayesian trial design. In computational terms, AIDA is realized as an active inference-based agent with an Expected Free Energy criterion for trial design. This type of architecture is inspired by neuro-economic models of efficient (Bayesian) trial design in brains, and implies that AIDA comprises generative probabilistic models for acoustic signals and user responses. We propose a novel generative model for acoustic signals as a sum of time-varying auto-regressive filters, and a user response model based on a Gaussian Process Classifier. The full AIDA agent has been implemented as a factor graph for the generative model, and all tasks (parameter learning, acoustic context classification, trial design, etc.) are realized by variational message passing on the factor graph. All verification and validation experiments and demonstrations are freely accessible at our GitHub repository.
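    The trial-design loop can be caricatured in a few lines (a toy score with made-up numbers, not AIDA's actual generative model): suppose each candidate tuning value comes with a predicted probability that the client accepts it; an expected-free-energy-style score then trades off informativeness of the response against a prior preference for satisfied clients. In this degenerate one-step form the score collapses to a KL divergence to the preferred response distribution:

```python
import numpy as np

def efe_score(p_accept, prior_pref=0.8):
    """Toy expected-free-energy-style score per candidate setting:
    negative response entropy (epistemic term) minus log-preference for
    an accepting client (pragmatic term). Algebraically this equals
    KL(Bernoulli(p_accept) || Bernoulli(prior_pref)); lower is better."""
    p, q = p_accept, prior_pref
    return p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))

# hypothetical predicted acceptance probabilities for 10 candidate settings
candidates = np.linspace(0.05, 0.95, 10)
scores = efe_score(candidates)
best = candidates[np.argmin(scores)]   # neither a sure reject nor a sure accept
```

    The selected trial is neither one the client will surely reject nor one that teaches the agent nothing; AIDA's full criterion scores trials under a much richer acoustic and user-response model.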

    Message Passing-based Inference in Hierarchical Autoregressive Models


    Model-free optimization of power/efficiency tradeoffs in quantum thermal machines using reinforcement learning

    A quantum thermal machine is an open quantum system that enables the conversion between heat and work at the micro- or nano-scale. Optimally controlling such out-of-equilibrium systems is a crucial yet challenging task, with applications to quantum technologies and devices. We introduce a general model-free framework based on reinforcement learning to identify out-of-equilibrium thermodynamic cycles that are Pareto-optimal trade-offs between power and efficiency for quantum heat engines and refrigerators. The method requires no knowledge of the quantum thermal machine, the system model, or the quantum state. Instead, it observes only the heat fluxes, so it is applicable both to simulations and to experimental devices. We test our method on a model of an experimentally realistic refrigerator based on a superconducting qubit, and on a heat engine based on a quantum harmonic oscillator. In both cases, we identify the Pareto front representing optimal power-efficiency trade-offs, and the corresponding cycles. Such solutions outperform previous proposals made in the literature, such as optimized Otto cycles, reducing quantum friction.
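    The Pareto front mentioned here is simply the set of non-dominated (power, efficiency) pairs; given candidate cycles scored on both axes, it can be extracted with a direct dominance check (a generic sketch with made-up numbers, not the paper's reinforcement-learning machinery):

```python
def pareto_front(points):
    """Return the (power, efficiency) points not dominated by any other
    point, where higher is better on both axes."""
    front = []
    for i, (p, e) in enumerate(points):
        dominated = any(
            p2 >= p and e2 >= e and (p2 > p or e2 > e)
            for j, (p2, e2) in enumerate(points) if j != i
        )
        if not dominated:
            front.append((p, e))
    return sorted(front)

# hypothetical (power, efficiency) scores for five candidate cycles
cycles = [(1.0, 0.2), (0.8, 0.5), (0.5, 0.5), (0.3, 0.9), (0.9, 0.4)]
front = pareto_front(cycles)   # (0.5, 0.5) is dominated by (0.8, 0.5)
```

    In the paper the candidate cycles come from the learned policy rather than a fixed list, but the notion of dominance is the same.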

    Generative Models for Learning Robot Manipulation Skills from Humans

    A long-standing goal in artificial intelligence is to make robots seamlessly interact with humans in performing everyday manipulation skills. Learning from demonstrations, or imitation learning, provides a promising route to bridge this gap. In contrast to direct trajectory learning from demonstrations, many interactive robotic applications require a higher-level contextual understanding of the environment. This requires learning invariant mappings in the demonstrations that can generalize across different environmental situations, such as the size, position, and orientation of objects, the viewpoint of the observer, etc. In this thesis, we address this challenge by encapsulating invariant patterns in the demonstrations using probabilistic learning models for acquiring dexterous manipulation skills. We learn the joint probability density function of the demonstrations with a hidden semi-Markov model, and smoothly follow the generated sequence of states with a linear quadratic tracking controller. The model exploits the invariant segments (also termed sub-goals, options, or actions) in the demonstrations, and adapts the movement in accordance with external environmental situations, such as the size, position, and orientation of the objects in the environment, using a task-parameterized formulation. We incorporate high-dimensional sensory data for skill acquisition by parsimoniously representing the demonstrations using statistical subspace clustering methods, and exploit the coordination patterns in latent space. To adapt the models on the fly and/or teach new manipulation skills online with streaming data, we formulate a scalable non-parametric online sequence clustering algorithm with Bayesian non-parametric mixture models, avoiding the model selection problem while ensuring tractability under small-variance asymptotics.
    We exploit the developed generative models to perform manipulation skills with remotely operated vehicles over satellite communication in the presence of communication delays and limited bandwidth. A set of task-parameterized generative models is learned from the demonstrations of different manipulation skills provided by the teleoperator. The model captures the intention of the teleoperator on the one hand, and on the other provides assistance in performing remote manipulation tasks under varying environmental situations. The assistance is formulated under time-independent shared control, where the model continuously corrects the remote arm movement based on the current state of the teleoperator, and/or time-dependent autonomous control, where the model synthesizes the movement of the remote arm for autonomous skill execution. Using the proposed methodology with the two-armed Baxter robot as a mock-up for semi-autonomous teleoperation, we are able to learn manipulation skills such as opening a valve, pick-and-place of an object with obstacle avoidance, hot-stabbing (a specialized underwater task akin to a peg-in-hole task), screwdriver target snapping, and tracking a carabiner, in as few as 4 to 8 demonstrations. Our study shows that the proposed manipulation assistance formulations improve the performance of the teleoperator by reducing task errors and execution time, while catering for the environmental differences in performing remote manipulation tasks with limited bandwidth and communication delays.
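    The tracking side of the pipeline, following the state sequence generated by the hidden semi-Markov model with a linear quadratic tracking controller, rests on a standard finite-horizon Riccati recursion. The sketch below is generic, not the thesis implementation, and the dynamics and cost weights are made-up toy values:

```python
import numpy as np

def lq_tracking_gains(A, B, Q, R, T):
    """Backward Riccati recursion for the finite-horizon LQ feedback
    gains used when tracking a reference state sequence."""
    P = Q.copy()
    gains = []
    for _ in range(T):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        gains.append(K)
        P = Q + A.T @ P @ (A - B @ K)
    return gains[::-1]          # time-ordered: u_t = -K_t (x_t - x_ref_t)

A = np.array([[1.0, 0.1],       # toy discretized double-integrator dynamics
              [0.0, 1.0]])
B = np.array([[0.0],
              [0.1]])
Q = np.eye(2)                   # state-tracking cost
R = np.array([[0.01]])          # control-effort cost
gains = lq_tracking_gains(A, B, Q, R, T=50)
```

    In the thesis the reference states come from the HSMM's generated state sequence; here only the gain computation is shown.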

    Manifold Representations of Musical Signals and Generative Spaces

    Among the diverse research fields within computer music, the synthesis and generation of audio signals epitomize the cross-disciplinarity of this domain, jointly nourishing scientific and artistic practices since its creation. Inherent in computer music since its genesis, audio generation has inspired numerous approaches, evolving with both musical practices and scientific and technical advances. Moreover, some synthesis processes also naturally handle the reverse process, named analysis, such that synthesis parameters can be partially or totally extracted from actual sounds, providing an alternative representation of the analyzed audio signals. On top of that, the recent rise of machine learning algorithms has earnestly questioned the field of scientific research, bringing powerful data-centred methods that raised several epistemological questions amongst researchers, in spite of their efficiency. In particular, a family of machine learning methods, called generative models, focuses on the generation of original content using features extracted from an existing dataset. In that case, such methods not only question previous approaches to generation, but also the way of integrating these methods into existing creative processes. While these new generative frameworks are progressively being introduced in the domain of image generation, the application of such generative techniques to audio synthesis is still marginal. In this work, we aim to propose a new audio analysis-synthesis framework based on these modern generative models, enhanced by recent advances in machine learning. We first review existing approaches, both in sound synthesis and in generative machine learning, and focus on how our work inserts itself in both practices and what can be expected from their collation.
    Subsequently, we focus in more detail on generative models, and on how modern advances in the domain can be exploited to learn complex sound distributions while remaining sufficiently flexible to be integrated in the creative flow of the user. We then propose an inference/generation process, mirroring the analysis/synthesis paradigms that are natural in the audio processing domain, using latent models based on a continuous higher-level space that we use to control the generation. We first provide preliminary results of our method applied to spectral information extracted from several datasets, and evaluate the obtained results both qualitatively and quantitatively. Subsequently, we study how to make these methods more suitable for learning audio data, tackling successively three different aspects. First, we propose two different latent regularization strategies specifically designed for audio: one based on signal/symbol translation, and one based on perceptual constraints. Then, we propose different methods to address the inner temporality of musical signals, based on the extraction of multi-scale representations and on prediction, which allow the obtained generative spaces to also model the dynamics of the signal. As a last chapter, we shift our scientific approach to a more research-and-creation-oriented point of view: first, we describe the architecture and the design of our open-source library, vsacids, aiming to be used by expert and non-expert music makers as an integrated creation tool. Then, we propose a first musical use of our system through the creation of a real-time performance, called ægo, based jointly on our framework vsacids and an explorative agent trained by reinforcement learning during the performance.
    Finally, we draw some conclusions on the different ways to improve and reinforce the proposed generation method, as well as on possible further creative applications.
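    The encode/decode pairing around a continuous latent space can be illustrated with a linear stand-in (PCA) for the learned generative models described above; the data here are random placeholders for magnitude spectra, and all shapes are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
spectra = rng.normal(size=(200, 64))   # placeholder for 200 magnitude spectra
mean = spectra.mean(axis=0)

# a linear 'latent space': the top-k principal directions of the data
U, S, Vt = np.linalg.svd(spectra - mean, full_matrices=False)
k = 8

def encode(x):
    """Project a spectrum into the k-dimensional latent space (analysis)."""
    return (x - mean) @ Vt[:k].T

def decode(z):
    """Map a latent point back to a spectrum (generation/synthesis)."""
    return z @ Vt[:k] + mean

z = encode(spectra[0])      # spectrum -> latent point
x_hat = decode(z)           # latent point -> spectrum
```

    Moving z continuously and decoding gives the kind of controllable generation the latent models above aim for; the thesis replaces this linear map by learned nonlinear encoders and decoders.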

    Weakly-Labeled Data and Identity-Normalization for Facial Image Analysis

    This thesis deals with improving facial recognition and facial expression analysis using weak sources of information. Labeled data is often scarce, but unlabeled data often contains information which is helpful to learning a model. This thesis describes two examples of using this insight.
    The first is a novel method for face recognition based on leveraging weakly or noisily labeled data. Unlabeled data can be acquired in a way which provides additional features. These features, while not being available for the labeled data, may still be useful with some foresight. This thesis discusses combining a labeled facial recognition dataset with face images extracted from videos on YouTube and face images returned by a search engine. The web search engine and the video search engine can be viewed as very weak alternative classifiers which provide "weak labels." Using the results from these two different types of search queries as forms of weak labels, a robust method for classification can be developed. This method is based on graphical models, but also incorporates a probabilistic margin. More specifically, using a model inspired by the variational relevance vector machine (RVM), a probabilistic alternative to transductive support vector machines (TSVMs) is developed. In contrast to previous formulations of RVMs, the choice of an exponential hyperprior is introduced to produce an approximation to the L1 penalty. Experimental results where noisy labels are simulated, and separate experiments with noisy labels from image and video search results using names as queries, both indicate that weak label information can be successfully leveraged. Since the model depends heavily on sparse kernel regression methods, these methods are reviewed and discussed in detail. Several algorithms using different sparsity-inducing priors are described in detail. Experiments are shown which illustrate the behavior of each of these sparse priors. Used in conjunction with logistic regression, each sparsity-inducing prior is shown to have varying effects in terms of sparsity and model fit. Extending this to other machine learning methods is straightforward, since the approach is grounded firmly in Bayesian probability.
    An experiment in structured prediction using conditional random fields on a medical imaging task is shown to illustrate how sparse priors can easily be incorporated in other tasks, and can yield improved results. Labeled data may also contain weak sources of information that are not necessarily used to maximum effect. For example, facial image datasets for tasks such as performance-driven facial animation, emotion recognition, and facial key-point or landmark prediction often contain labels beyond those for the task at hand. In emotion recognition data, for example, emotion labels are often scarce. This may be because these images are extracted from a video, in which only a small segment depicts the labeled emotion. As a result, many images of the subject in the same setting, using the same camera, go unused. However, this data can be used to improve the ability of learning techniques to generalize to new and unseen individuals by explicitly modeling previously seen variations related to identity and expression. Once identity and expression variation are separated, simpler supervised approaches can work quite well to generalize to unseen subjects. More specifically, in this thesis, probabilistic modeling of these sources of variation is used to "identity-normalize" various facial image representations. A variety of experiments are described in which performance on emotion recognition, markerless performance-driven facial animation, and facial key-point tracking is consistently improved. This includes an algorithm which shows how this kind of normalization can be used for facial key-point localization. In many cases in facial images, additional sources of information may be available that can be used to improve tasks. This includes weak labels which are provided during data gathering, such as the search query used to acquire data, as well as identity information in the case of many experimental image databases.
    This thesis argues, in the main, that this information should be used, and describes methods for doing so using the tools of probability.
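    The link drawn above between an exponential hyperprior and the L1 penalty can be illustrated outside the RVM machinery: MAP estimation for logistic regression under a Laplace (double-exponential) prior is exactly L1-penalized logistic regression, solvable by proximal gradient descent. This is a generic sketch with made-up data, not the thesis model:

```python
import numpy as np

def l1_logistic_map(X, y, lam=0.1, lr=0.1, iters=500):
    """MAP estimate of logistic-regression weights under a Laplace (L1)
    prior, via proximal gradient descent (ISTA): a gradient step on the
    negative log-likelihood followed by soft-thresholding."""
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w = w - lr * (X.T @ (p - y) / len(y))                   # likelihood step
        w = np.sign(w) * np.maximum(np.abs(w) - lr * lam, 0.0)  # prior step
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
true_w = np.zeros(10)
true_w[:2] = [2.0, -2.0]                 # only the first two features matter
y = (X @ true_w + 0.1 * rng.normal(size=100) > 0).astype(float)
w = l1_logistic_map(X, y)                # irrelevant weights shrink toward zero
```

    The soft-threshold step is where the sparsity comes from: it is the proximal operator of the L1 penalty, i.e. of the negative log of the Laplace prior.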

    Friction, Vibration and Dynamic Properties of Transmission System under Wear Progression

    This reprint focuses on wear and fatigue analysis, the dynamic properties of coating surfaces in transmission systems, and non-destructive condition monitoring for the health management of transmission systems. Transmission systems play a vital role in various types of industrial structures, including wind turbines, vehicles, mining and material-handling equipment, offshore vessels, and aircraft. Surface wear is an inevitable phenomenon during the service life of transmission systems (such as gearboxes, bearings, and shafts), and wear propagation can reduce the durability of the contact coating surface. As a result, the performance of the transmission system can degrade significantly, which can cause a sudden shutdown of the whole system and lead to unexpected economic losses and accidents. Therefore, to ensure adequate health management of the transmission system, it is necessary to investigate the friction, vibration, and dynamic properties of its contact coating surface and to monitor its operating conditions.