12 research outputs found

    Animation of Hand-drawn Faces using Machine Learning

    Today's research in artificial vision has brought us new and exciting possibilities for the production and analysis of multimedia content. Pose estimation is an artificial vision technology that detects and identifies a human body's position and orientation within a picture or video. It locates key points on the body and uses them to create three-dimensional models. In digital animation, pose estimation has paved the way for new visual effects and 3D renderings. By detecting human movements, it is now possible to create fluid, realistic animations from still images. This bachelor thesis discusses the development of a pose-estimation-based program that is able to animate hand-drawn faces, in particular the caricatured faces in Papiri di Laurea, using machine learning and image manipulation. Building on existing techniques for motion capture and 3D animation, and making use of existing computer vision libraries such as OpenCV or dlib, the project produced a satisfying result: a short video of a hand-drawn caricatured figure that assumes the facial expressions fed to the program through an input video. The First Order Motion Model was used to create this facial animation. It is a model based on the idea of transferring the movement detected in a source video to an image. This model works best on close-ups of faces; the larger the background, the more the background of the image gets distorted. Possible future developments could include the creation of a website where the user uploads their drawing and a video of themselves to get a GIF version of their papiro. This could become a new feature to add to portraits and caricatures and, more specifically to this thesis, a new way to celebrate graduates in Padova.

    Severity scoring approach using modified optical flow method and lesion identification for facial nerve paralysis assessment

    The facial nerve controls facial movement and expression. Hence, a patient with facial nerve paralysis will experience impaired social interactions, psychological distress, and low self-esteem. At first presentation, it is crucial to determine the severity level of the paralysis and to rule out stroke or any other serious cause by recognising the type of lesion, so as to prevent mistreatment of the patient. Clinically, the facial nerve is assessed subjectively by observing voluntary facial movement and assigning a score based on the deductions made by the clinician. However, the results are not uniform among different examiners evaluating the same patients. This is highly undesirable for both diagnosis and treatment. Acknowledging the importance of this assessment, this research was conducted to develop a facial nerve assessment that can classify both the severity level of facial nerve function and the type of facial lesion, Upper Motor Neuron (UMN) or Lower Motor Neuron (LMN), through regional assessment and lesion assessment, respectively. For regional assessment, two optical flow techniques, Kanade-Lucas-Tomasi (KLT) and Horn-Schunck (HS), were used to determine the local and global motion information of facial features. However, the original KLT has a problem: its Eigen features cannot distinguish normal subjects from patients. Thus, the KLT method was modified by introducing polygonal measurements, with landmarks placed on each facial region. Similarly, for the HS method, evaluation over multiple frames was proposed in place of the single-frame evaluation of the original HS method, to avoid the differences between frames becoming too small. 
The features of these modified methods, Modified Local Sparse (MLS) and Modified Global Dense (MGD), were combined into the Combined Modified Local-Global (CMLG) features to capture both local (certain region) and global (entire image) flow features. Each feature set served as input to a k-NN classifier to assess its performance in determining the severity level of paralysis. For the lesion assessment, the Gabor filter method was used to extract forehead wrinkle features. Thereafter, the Gabor features were combined with the CMLG features, focusing only on the forehead region, to evaluate both the wrinkle and motion information of the facial features. This is because, in an LMN lesion, the patient cannot move the forehead symmetrically during the eyebrow-raising movement and cannot wrinkle the forehead, owing to the damaged frontalis muscle. A patient with a UMN lesion, however, exhibits the same behaviour as a normal subject: the forehead is spared and can be lifted symmetrically. The CMLG technique in regional assessment showed the best performance in distinguishing between patient and normal subjects, with an accuracy of 92.26% compared to 88.38% for MLS and 90.32% for MGD. From the results, several assessment tools were developed in this study, namely an individual score, a total score and a paralysis score chart, which were correlated with the House-Brackmann score and validated by a medical professional with 91.30% accuracy. In lesion assessment, the combined Gabor and CMLG features on the forehead region performed better in distinguishing UMN from LMN lesions, with an accuracy of 89.03% compared to 78.07% for Gabor alone. 
In conclusion, the proposed facial nerve assessment approach, consisting of both regional assessment and lesion assessment, is capable of determining the level of facial paralysis severity and recognising whether the facial lesion is a UMN or LMN lesion.
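
The classification step described in this abstract (local and global flow features combined into one vector and passed to a k-NN classifier) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the Euclidean distance, the default k=3 and the toy feature values in the usage note are all assumptions made for illustration.

```python
import math
from collections import Counter


def combine_features(local_feats, global_feats):
    """Concatenate local (sparse) and global (dense) flow features into one vector."""
    return list(local_feats) + list(global_feats)


def knn_predict(train_vectors, train_labels, query, k=3):
    """Return the majority label among the k training vectors nearest to `query`.

    Euclidean distance and k=3 are illustrative assumptions, not values
    taken from the abstract.
    """
    # Pair every training vector's distance to the query with its label,
    # then sort so the nearest neighbours come first.
    neighbours = sorted(
        (math.dist(vec, query), label)
        for vec, label in zip(train_vectors, train_labels)
    )
    top_labels = [label for _, label in neighbours[:k]]
    # Majority vote over the k nearest labels.
    return Counter(top_labels).most_common(1)[0][0]
```

With hypothetical feature vectors such as `combine_features([0.9, 0.8], [0.85])` labelled "normal" and `combine_features([0.2, 0.1], [0.15])` labelled "patient", a query is assigned the label held by most of its k nearest training vectors.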

    Image Processing Algorithms for Diagnostic Analysis of Microcirculation

    Microcirculation has become a key factor for the study and assessment of tissue perfusion and oxygenation. Detection and assessment of the microvasculature using videomicroscopy from the oral mucosa provides a metric on the density of blood vessels in each single frame. Information pertaining to the density of these microvessels within a field of view can be used to quantitatively monitor and assess the changes occurring in tissue oxygenation and perfusion over time. Automated analysis of this information can be used for real-time diagnostic and therapeutic planning of a number of clinical applications including resuscitation. The objective of this study is to design an automated image processing system to segment microvessels, estimate the density of blood vessels in video recordings, and identify the distribution of blood flow. The proposed algorithm consists of two main stages: video processing and image segmentation. The first step of video processing is stabilization. In the video stabilization step, block matching is applied to the video frames. Similarity is measured by cross-correlation coefficients. The main technique used in the segmentation step is multi-thresholding and pixel verification based on calculated geometric and contrast parameters. Segmentation results and differences of video frames are then used to identify the capillaries with blood flow. After categorizing blood vessels as active or passive, according to the amount of blood flow, quantitative measures identifying microcirculation are calculated. The algorithm is applied to the videos obtained using Microscan Side-stream Dark Field (SDF) imaging technique captured from healthy and critically ill humans/animals. Segmentation results were compared and validated using a blind detailed inspection by experts who used a commercial semi-automated image analysis software program, AVA (Automated Vascular Analysis). 
The algorithm was found to extract approximately 97% of functionally active capillaries and blood vessels in every frame. The aim of this study is to eliminate human interaction, increase accuracy and reduce computation time. The proposed method is an entirely automated process that can perform stabilization, pre-processing, segmentation, and microvessel identification without human intervention. The method may allow for assessment of microcirculatory abnormalities occurring in critically ill and injured patients, including close to real-time determination of the adequacy of resuscitation.
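
The stabilization step described in this abstract (block matching between frames, with similarity measured by cross-correlation coefficients) can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's code: frames are taken to be 2D lists of grey levels, and the function names, block size and search radius are hypothetical.

```python
import math


def ncc(a, b):
    """Normalized cross-correlation coefficient of two equal-length pixel lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    # A flat block has zero variance; treat it as uncorrelated.
    return num / den if den else 0.0


def extract_block(frame, top, left, size):
    """Flatten the size x size block whose top-left corner is (top, left)."""
    return [frame[r][c] for r in range(top, top + size) for c in range(left, left + size)]


def match_block(ref_frame, cur_frame, top, left, size, search):
    """Return the (dy, dx) shift within +/-search that best matches the reference block."""
    ref = extract_block(ref_frame, top, left, size)
    rows, cols = len(cur_frame), len(cur_frame[0])
    best_score, best_shift = -2.0, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            t, l = top + dy, left + dx
            # Skip candidate blocks that fall outside the current frame.
            if t < 0 or l < 0 or t + size > rows or l + size > cols:
                continue
            score = ncc(ref, extract_block(cur_frame, t, l, size))
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

The returned shift estimates how far the block moved between the two frames; shifting the current frame back by that amount would align it with the reference frame.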

    Seventh Biennial Report : June 2003 - March 2005

    No full text

    The elusive digital frame and the elasticity of time in painting

    How can we gain a deeper understanding of the emotional affects of painting with respect to temporality by working with the mechanisms and languages of the moving image? This practice-based doctoral research aims to add to our understanding of the perception of temporality in painterly surface and to investigate the relationship between subjective perceptions and emotional 'affect' in encounters with painting which offer an expanded and enhanced sense of lived temporality. The project sets out to do this by devising art works using the processes, apparatus and structures of 'experimental' film/video and photography. This work seeks to question what can cause the passing of time to become 'elastic' in the perception of the spectator encountering the 'strangeness' of painterly surface as an intense experience, and asks how this phenomenon may be connected with perceptions of time and vision for the embodied painter engaged in practice. In addition to painting practice within the project, works by Frank Auerbach are taken as examples of 'painterly' surface with which to consider temporality and spectator experience. The written thesis is used to document and reflect on the development of this practice-based work; in particular, insights derived from the two photo/video installation works Que Sera (2010) and Is It You? (2012), which juxtapose material made with high-speed filming and long exposures and which engage with the 'frame' as a marker of time passing. The reflective thesis draws on theoretical material, including Maurice Merleau-Ponty's essays, which propose painting as a form of metaphysics and a way of understanding how we see; Gilles Deleuze's work on the phenomenology of painting; the experimental film theory of Peter Gidal; and recent neuroscientific work by Antonio Damasio investigating vision and consciousness. 
This material is used in conjunction with observations from experimental and expanded film works as they deconstruct aspects of subjective temporality and visual perception.

    Photographic Mediation as a Mode of Production: Investigating the Agency of Commercial Institutions in Contemporary Vernacular Photography

    This dissertation argues that to understand what is at stake in contemporary vernacular photography, it is vital to account for the commercial imperatives that are invested in our photographic apparatus. The vernacular is often seen as emerging from the milieu of everyday life, operating outside of institutional constraints. However, commercial institutions have always played a vital role in shaping the meaning and matter of vernacular photography, producing the extended network of devices and protocols through which photographic activity takes place. Vernacular photography should therefore be seen to encapsulate a series of complex negotiations between individual desires and commercial imperatives. Through an examination of three central case studies - Kodak, Snapchat and Ditto Labs - this thesis aims to elucidate how the productive potential of vernacular photography is instrumentalized as a means of generating value. Bringing together approaches from western Marxism with contemporary theories of networked media and photography, the argument is made that photographic mediation can be usefully framed as a mode of production. Photographic mediation, referring to the processual and material dynamics of photography, is employed to investigate the circuits of labour, value and desire that flow through our photographic apparatus. In performing this analysis, the concept of deterritorialization is applied as a way of understanding how photographic mediation has become more productive through destabilizing the boundaries between photography, subjectivity and the everyday. As photography proliferates and disperses into the rhythms and atmospheres that constitute daily life, it is increasingly imbricated into the performance and production of identities, relationships and desires. 
Under these circumstances, it becomes all the more vital that we recognize the role of commercial actors in shaping not only our photographic apparatus, but also our ways of being in, and relating to, the world.

    Terrorizing Images

    This book combines two focal points: trauma and ekphrasis. It responds to the recognition of how terrorizing images permeate the public sphere in connection with traumatic experiences and conflicts, and emphasizes the ways in which such images are described and interpreted by words. Contributors analyze the use of verbally represented images in a variety of literary texts, written in several different languages.