737 research outputs found

    Patch-based models for visual object classes

    This thesis concerns models for visual object classes that exhibit a reasonable amount of regularity, such as faces, pedestrians, cells and human brains. Such models are useful for making “within-object” inferences such as determining the individual characteristics of objects and establishing their identity. For example, the model could be used to predict the identity of a face, the pose of a pedestrian or the phenotype of a cell, or to segment parts of a human brain. Existing object modelling techniques have several limitations. First, most current methods have targeted the above tasks individually using object-specific representations; therefore, they cannot be applied to other problems without major alterations. Second, most methods have been designed to work with small databases which do not contain the variations in pose, illumination, occlusion and background clutter seen in ‘real world’ images. Consequently, many existing algorithms fail when tested on unconstrained databases. Finally, the complexity of the training procedure in these methods makes it impractical to use large datasets. In this thesis, we investigate patch-based models for object classes. Our models are capable of exploiting very large databases of objects captured in uncontrolled environments. We represent the test image with a regular grid of patches from a library of images of the same object. All the domain-specific information is held in this library: we use one set of images of the object to help draw inferences about others. In each experimental chapter we investigate a different within-object inference task. In particular, we develop models for classification, regression, semantic segmentation and identity recognition. In each task, we achieve results that are comparable to or better than the state of the art. We conclude that patch-based representations can be used successfully for the above tasks and show promise for other applications such as generation and localization.
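The idea of representing a test image with a regular grid of patches taken from a library of images can be illustrated with a minimal sketch. This is not the thesis's model: the same-location matching, the squared Euclidean distance, the majority vote and all function names below are assumptions chosen only to make the idea concrete.

```python
from typing import Dict, List, Tuple

Image = List[List[float]]  # grayscale image stored as rows of pixel values


def extract_patch(img: Image, top: int, left: int, size: int) -> List[float]:
    """Flatten the size x size patch whose top-left corner is (top, left)."""
    return [img[r][c] for r in range(top, top + size) for c in range(left, left + size)]


def sq_dist(a: List[float], b: List[float]) -> float:
    """Squared Euclidean distance between two flattened patches."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_library_patches(
    test: Image, library: List[Image], size: int
) -> List[Tuple[int, int, int]]:
    """For each cell of a regular grid over the test image, return
    (top, left, library_index) for the library image whose patch at the
    same grid position is closest (assumption: images are aligned and
    share the same dimensions)."""
    matches = []
    for top in range(0, len(test) - size + 1, size):
        for left in range(0, len(test[0]) - size + 1, size):
            target = extract_patch(test, top, left, size)
            best = min(
                range(len(library)),
                key=lambda i: sq_dist(target, extract_patch(library[i], top, left, size)),
            )
            matches.append((top, left, best))
    return matches


def classify_by_vote(matches: List[Tuple[int, int, int]], labels: List[str]) -> str:
    """Hypothetical inference step: each matched library patch votes for
    the label of the library image it came from."""
    votes: Dict[str, int] = {}
    for _, _, idx in matches:
        votes[labels[idx]] = votes.get(labels[idx], 0) + 1
    return max(votes, key=votes.get)
```

In this sketch all domain knowledge lives in `library` and `labels`, so changing the task only means changing the library, which mirrors the abstract's point that one set of images is used to draw inferences about others.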

    Synthesizing and Editing Photo-realistic Visual Objects

    In this thesis we investigate novel methods of synthesizing new images of a deformable visual object from a collection of images of the object. We investigate parametric and non-parametric methods, as well as a combination of the two, for the problem of image synthesis. Our main focus is complex visual objects, specifically deformable objects and objects with varying numbers of visible parts. We first introduce a sketch-driven image synthesis system, which allows the user to draw ellipses and outlines to sketch the rough shape of an animal as a constraint on the synthesized image. This system interactively provides feedback in the form of ellipse and contour suggestions for the user's partial sketch. The user's sketch guides a non-parametric synthesis algorithm that blends patches from two exemplar images in a coarse-to-fine fashion to create the final image. We evaluate the method and the synthesized images through two user studies. Compared with non-parametric blending of patches, a parametric model of appearance is more desirable because its appearance representation is shared between all images of the dataset. Hence, we propose Context-Conditioned Component Analysis (C-CCA), a probabilistic generative parametric model that describes images as a linear combination of basis functions. The basis functions are evaluated for each pixel using a context vector computed from the local shape information. We evaluate C-CCA qualitatively and quantitatively on inpainting, appearance transfer and reconstruction tasks. Drawing samples from C-CCA generates novel, globally coherent images which, unfortunately, lack high-frequency details due to dimensionality reduction and misalignment. We therefore develop a non-parametric model that enhances the samples of C-CCA with locally coherent, high-frequency details. The non-parametric model efficiently finds patches from the dataset that match the C-CCA sample and blends the patches together. We analyze the results of the combined method on datasets of horse and elephant images.

    A dramaturgy of intermediality: composing with integrative design

    The thesis investigates and develops a compositional system on intermediality in theatre and performance as a dramaturgical practice through integrative design. The position of the visual/sonic media in theatre and performance has been altered by the digitalisation and networking of media technologies, which enables enhanced dynamic variables in the intermedial processes. The emergent intermediality sites are made accessible by developments in media technologies and form part of broader changes towards a mediatised society: a simultaneous shift in cultural contexts, theatre practice and audience perception. The practice-led research is situated within a postdramatic context and develops a system of compositional perspectives and procedures to enhance the knowledge of a dramaturgy on intermediality. The intermediality forms seem to re-situate the actual/virtual relations in theatre and re-construct the processes of theatricalisation in the composition of the stage narrative. The integration of media and performers produces a compositional environment of semiosis, where the theatre becomes a site of narration, and the designed integration in-between medialities emerges as intermediality sites in the performance event. A selection of performances and theatre directors is identified, who each in distinct ways integrate mediating technologies as a core element in their compositional design. These directors and performances constitute a source of reflection on compositional strategies from the perspective of practice, and enable comparative discussions on dramaturgical design and the consistency of intermediality sites. The practice-led research realised a series of prototyping processes situated in performance laboratories in 2004-5. The laboratories staged investigations into the relation between integrative design procedures and parameters for composition of intermediality sites, particularly the relative presence in-between the actual and the virtual, and the relative duration and distance in-between timeness and placeness. The integration of performer activities and media operations into dramaturgical structures was developed as a design process of identifying the mapping and experiencing the landscape through iterative prototyping. The developed compositional concepts and strategies were realised in the prototype performance Still I Know Who I Am, performed in October 2006. This final research performance was a full-scale professional production, which explored the developed dramaturgical designs through creative practice. The performance was realised as a public event, and composed of a series of scenes, each presenting a specific composite of the developed integrative design strategies, and generating a particular intermediality site. The research processes in the performance laboratories and the prototype performance developed on characteristics, parameters and procedures of compositional strategies, investigating the viability of a dramaturgy of intermediality through integrative design. The practice undertaken constitutes raw material from which the concepts are drawn and underpins the premises for the theoretical reflections.

    Engaged Humanities

    What is the role of the humanities at the start of the 21st century? In the last few decades, the various disciplines of the humanities (history, linguistics, literary studies, art history, media studies) have encountered a broad range of challenges, related to the future of print culture, to shifts in funding strategies, and to the changing contours of culture and society. Several publications have addressed these challenges as well as potential responses on a theoretical level. This coedited volume opts for a different strategy and presents accessible case studies that demonstrate what humanities scholars contribute to concrete and pressing social debates about topics including adoption, dementia, hacking, and conservation. These “engaged” forms of humanities research reveal the continued importance of thinking and rethinking the nature of art, culture, and public life.

    Barbara Morgan's Photographic Interpretation of American Culture, 1935-1980

    In 1935, Barbara Morgan, a recent arrival in Depression-era New York, reinvented her career as an artist when she abandoned painting and adopted the medium of photography. In the four-and-a-half decades that followed, Morgan witnessed the remaining years of the Great Depression, World War II, the Korean Conflict, the Cold War, the Vietnam War, and Three Mile Island. This dissertation will trace the photographic oeuvre of Morgan as she responded to these events both directly and indirectly, while simultaneously tracking the important artistic and cultural trends of each decade. The first chapter discusses Morgan's early photomontage work, in which she pushed the boundaries of American photography while exploring diverse metaphors for metropolitan splendor and urban isolation as well as the anxieties of the Great Depression and hope for a better future. Morgan's 1941 book Martha Graham: Sixteen Dances in Photographs anchors the second chapter. The influential dance photographs that comprise this publication highlight Morgan's modernist interpretations of Martha Graham's early dances and allow Morgan to examine beauty, strength, and a complex series of emotions through simple gestures and movement. The third chapter uses the light abstraction Morgan employed as a tailpiece for Sixteen Dances as the starting point to investigate her connections to broader artistic trends in the United States during and after the Second World War. In 1951, Morgan published Summer's Children, a photographic account of life in a summer camp that marked a major departure for the artist. Chapter four examines this book in the context of the Cold War and considers such diverse topics as summer camps, progressive education, fear-mongering, and the rise of the photo-spread. In the last two decades of her career, Morgan returned to the medium of photomontage. The fifth chapter examines this period, in which Morgan protested nuclear proliferation, environmental indifference, a perceived lack of scientific morality, and violent entertainment through her montages.

    Query-based video summary generation using machine learning and coordinated representations (Generación de resúmenes de videos basada en consultas utilizando aprendizaje de máquina y representaciones coordinadas)

    Video constitutes the primary substrate of information for humanity; consider, for example, the video data uploaded daily to platforms such as YouTube: 300 hours of video per minute. Video analysis is currently one of the most active areas in computer science and industry, and includes fields such as video classification, video retrieval and video summarization (VSUMM). VSUMM is an active research field because it lets human users simplify the information processing required to view and analyze sets of videos, for example by reducing the number of hours of recorded video that security personnel must analyze. In addition, many video analysis tasks and systems need to reduce the computational load using segmentation schemes, compression algorithms and video summarization techniques. Many approaches to VSUMM have been studied. However, it is not a single-solution problem: because of its subjective and interpretative nature, deciding which parts of the input video to preserve requires a subjective estimate of an importance score. This score can be related to how interesting some video segments are, how closely they represent the complete video, and how the segments relate to the task a human user is performing in a given situation. For example, making a movie trailer is, in part, a VSUMM task, but one concerned with preserving promising and interesting parts of the movie rather than with reconstructing the movie's content from them; that is, movie trailers contain interesting scenes but not representative ones. By contrast, in a surveillance situation, a summary of closed-circuit camera footage needs to be both representative and interesting, and in some situations related to particular objects of interest, for example when a person or a car must be found. Since written natural language is the main human-machine communication interface, some recent works have made progress on including textual queries in the VSUMM process to guide summarization, in the sense that video segments related to the query are considered important. In this thesis, we present a computational framework for summarizing an input video that lets the user enter free-form sentences and keyword queries to guide the process according to user or task intention, while also considering general objectives such as representativeness and interestingness. Our framework relies on pre-trained deep visual and linguistic models, although we also trained our own visual-linguistic coordination model. We expect this framework to be of interest in cases where VSUMM tasks require a high degree of specification of user or task intentions with minimal training stages and rapid deployment.
    Minciencias. Doctorad
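The trade-off the abstract describes, between relevance to a textual query and representativeness of the whole video, can be sketched as a simple scoring rule. This is an illustrative assumption, not the thesis's framework: it presumes that segment and query embeddings already live in a shared visual-linguistic space, it measures representativeness as similarity to the mean segment embedding, it leaves interestingness out, and `summarize` and `query_weight` are hypothetical names.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_vector(vectors: Sequence[Vector]) -> List[float]:
    """Component-wise mean of equally sized vectors."""
    count = len(vectors)
    return [sum(v[d] for v in vectors) / count for d in range(len(vectors[0]))]


def summarize(
    segments: Sequence[Vector],
    query: Vector,
    budget: int,
    query_weight: float = 0.5,
) -> List[int]:
    """Pick `budget` segment indices, returned in temporal order.

    Each segment's importance is a weighted sum of
      - query relevance: similarity to the (assumed) query embedding, and
      - representativeness: similarity to the mean segment embedding.
    """
    centroid = mean_vector(segments)
    scores = [
        query_weight * cosine(seg, query) + (1.0 - query_weight) * cosine(seg, centroid)
        for seg in segments
    ]
    ranked = sorted(range(len(segments)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:budget])
```

Setting `query_weight` close to 1 favours segments that match the query, as in the surveillance example where a person or car must be found, while a value close to 0 favours segments that resemble the video as a whole.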

    Good enough sculptures : what happens when sculptures are made to be filmed?

    This PhD proposes the camera as a tool in the creation of sculpture. Exploring the ways in which the sculptural process is transformed by its relationship to the moment of filming, it aligns itself with artistic practices and theories which foreground material exploration, uncertainty and improvisation, and draws on a number of key artists who have used film and video to extend and explore sculptural practice. It situates fine art practice as a vehicle for exploratory and open-ended research, forging strong links with contemporary art educational theory which sees the creative process as heuristic and immersed within a social context. Using Winnicott’s theory of transitional objects and conception of psychoanalytic practice as a specialised form of play, the PhD forges strong connections between the engaged, responsive and explorative work done by the artist, analyst, teacher and student. The research presents a form of artistic research which facilitates encounters between objects and cameras, through which learning can take place and knowledge can be created: knowledge that is not discrete or abstracted, but contextualised and embodied. The aim is to involve people in its processes and methods, as opposed to presenting finished works and findings, inscribing the reception of the work into the making process and thereby producing active viewers and participants who are thoughtfully and practically involved within the making process. The artistic research method revolves around a collection of objects made to prompt physical, material and imaginative exploration in front of the camera. The camera’s field of vision is re-considered as an arena or situation structured in order to facilitate exploratory activity. ‘Filming sculpture’ becomes the situation/set-up which organises the production of objects-as-sculpture in ways that open up questions around sculpture as a particular category of object, the nature of film experience, and objects more generally. The PhD submission comprises a series of films and gifs, documentation of exhibitions, screenings and discussions undertaken during the research, experimental workshops, and photographs of each of the sculptures. The main written element consists of a series of aphoristic texts and a contextual document, which both draw on ideas and concepts from art and film theory, psychoanalysis, phenomenology, object-oriented ontology and anthropology, outlining the development of the research and situating it within a wider network of practices.