
    Multi-Modal Aesthetic Assessment for Mobile Gaming Image

    With the proliferation of gaming technologies, services, game styles, and platforms, multi-dimensional aesthetic assessment of gaming content is becoming increasingly important for the gaming industry. To meet the diverse needs of game players, game designers, graphical developers, and others under particular conditions, multi-modal aesthetic assessment must consider different aesthetic dimensions/perspectives. Since different aesthetic dimensions share underlying relationships, e.g., between `Colorfulness' and `Color Harmony', it can be advantageous to leverage the information carried by multiple relevant dimensions. To this end, we address the problem via multi-task learning: we seek to learn the correlations between different aesthetics-relevant dimensions to boost generalization performance in predicting all of them. The `bottleneck' of obtaining good predictions from limited labeled data for one individual dimension can thus be relieved by harnessing complementary sources from other dimensions, i.e., the training data is augmented indirectly by sharing training information across dimensions. Experimental results show that the proposed model significantly outperforms state-of-the-art aesthetic metrics in predicting four gaming aesthetic dimensions.
    Comment: 5 pages
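    The abstract leaves the architecture unspecified; the following is a minimal sketch of the multi-task idea it describes, under the assumption of a shared encoder feeding one regression head per aesthetic dimension, trained on the summed per-dimension losses. The class name `AestheticMTL`, the layer sizes, and the dimension names beyond `Colorfulness' and `Color Harmony' are invented for illustration, not the paper's model.

```python
# Minimal multi-task regression sketch (illustrative, not the paper's model).
# Gradients from every dimension update the shared encoder, which is the
# "indirect augmentation" across dimensions that the abstract describes.
import torch
import torch.nn as nn

DIMENSIONS = ["colorfulness", "color_harmony", "fineness", "overall"]  # assumed names

class AestheticMTL(nn.Module):
    def __init__(self, feat_dim=512):
        super().__init__()
        self.backbone = nn.Sequential(      # stand-in for a real image encoder
            nn.Flatten(), nn.LazyLinear(feat_dim), nn.ReLU())
        self.heads = nn.ModuleDict(
            {d: nn.Linear(feat_dim, 1) for d in DIMENSIONS})

    def forward(self, x):
        z = self.backbone(x)                # features shared by all dimensions
        return {d: head(z).squeeze(-1) for d, head in self.heads.items()}

def multitask_loss(preds, targets, loss_fn=nn.functional.mse_loss):
    # Summing the per-dimension losses shares training signal across tasks.
    return sum(loss_fn(preds[d], targets[d]) for d in preds)
```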

    A Survey on Human-aware Robot Navigation

    Intelligent systems are increasingly part of our everyday lives and have been integrated so seamlessly that it is difficult to imagine a world without them. Physical manifestations of those systems, in the form of embodied agents or robots, have so far been used only for specific applications and are often limited to functional roles (e.g., in the industrial, entertainment, and military fields). Given the current growth and innovation in the research communities concerned with robot navigation, human-robot interaction, and human activity recognition, this might soon change. Robots are increasingly easy to obtain and use, and their acceptance in general is growing. However, the design of a socially compliant robot that can function as a companion needs to take various areas of research into account. This paper is concerned with the navigation aspect of a socially compliant robot and provides a survey of existing solutions for the relevant areas of research, as well as an outlook on possible future directions.
    Comment: Robotics and Autonomous Systems, 202

    Blickpunktabhängige Computergraphik (Gaze-Contingent Computer Graphics)

    Contemporary digital displays feature multi-million pixels at ever-increasing refresh rates. Reality, on the other hand, provides us with a view of the world that is continuous in space and time. The discrepancy between viewing the physical world and its sampled depiction on digital displays gives rise to perceptual quality degradations. By measuring or estimating where we look, gaze-contingent algorithms aim to exploit the way we visually perceive in order to remedy visible artifacts. This dissertation presents a variety of novel gaze-contingent algorithms and corresponding perceptual studies. Chapters 4 and 5 present methods to boost the perceived visual quality of conventional video footage when viewed on commodity monitors or projectors. Chapter 6 describes a novel head-mounted display with real-time gaze tracking, which enables a large variety of applications in the context of Virtual Reality and Augmented Reality. Using this gaze-tracking VR headset, Chapter 7 describes a novel gaze-contingent rendering method in which shading quality is analyzed and adjusted per pixel in real time on the basis of a perceptual model, greatly reducing the computational effort for shading virtual worlds. The described methods and studies show that gaze-contingent algorithms can improve the quality of displayed images and videos, or reduce the computational effort of image generation, without changing the display quality perceived by the user.
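    As a loose illustration of the gaze-contingent shading idea (not the thesis' perceptual-model-driven algorithm), one can scale the shading rate with angular eccentricity from the tracked gaze point, mimicking the falloff of visual acuity. The fovea size, falloff range, and pixels-per-degree value below are assumptions for the sketch.

```python
# Loose sketch of gaze-contingent shading-rate selection. The thesis drives
# this per pixel with a perceptual model; this formula is only illustrative.
import numpy as np

def shading_rate(px, py, gaze_px, gaze_py, px_per_degree=40.0,
                 fovea_deg=2.0, falloff_deg=10.0):
    """Return samples-per-pixel in [1/16, 1]: full rate inside the fovea,
    decaying quadratically with eccentricity toward the periphery."""
    ecc_deg = np.hypot(px - gaze_px, py - gaze_py) / px_per_degree
    t = np.clip((ecc_deg - fovea_deg) / falloff_deg, 0.0, 1.0)
    return (1.0 - t) ** 2 * (1.0 - 1.0 / 16.0) + 1.0 / 16.0

# Example: far from the gaze point, pixels are shaded at a fraction of the rate.
h, w = 1080, 1920
ys, xs = np.mgrid[0:h, 0:w]
rates = shading_rate(xs, ys, gaze_px=960, gaze_py=540)
print(rates.min(), rates.max())  # ~0.0625 in the periphery, 1.0 at the fovea
```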

    Scene understanding for autonomous robots operating in indoor environments

    The idea of having robots among us is not new. Great efforts are continually made to replicate human intelligence, with the vision of having robots perform different activities, including hazardous, repetitive, and tedious tasks. Research has demonstrated that robots are good at many tasks that are hard for us, mainly in terms of precision, efficiency, and speed. However, some tasks that humans do without much effort remain challenging for robots. Robots in domestic environments, especially, are far from fulfilling some tasks satisfactorily, mainly because these environments are unstructured and cluttered, with a variety of environmental conditions to control.

    This thesis addresses the problem of scene understanding in the context of autonomous robots operating in everyday human environments. It is developed under the HEROITEA research project, which aims to build a robot system that assists elderly people in domestic environments. Our main objective is to develop methods that allow robots to acquire more information from the environment and progressively build knowledge that improves their performance on high-level robotic tasks. Scene understanding is a broad research topic and is considered a complex task due to the multiple sub-tasks involved. In this thesis we focus on three of them: object detection, scene recognition, and semantic segmentation of the environment.

    First, we implement methods to recognize objects in real indoor environments, applying machine learning techniques that incorporate uncertainty as well as more modern techniques based on deep learning. Beyond detecting objects, it is essential to comprehend the scene in which they occur. We therefore propose an approach for scene recognition that considers the influence of the detected objects on the prediction process, and we demonstrate that the existing objects and their relationships can improve the inference of the scene class. Since a scene recognition model can also benefit from the strengths of other models, we propose a multi-classifier model for scene recognition based on weighted voting schemes. Experiments carried out in real-world indoor environments demonstrate that an adequate combination of independent classifiers yields a more robust and precise model for scene recognition.

    Moreover, to increase a robot's understanding of its surroundings, we propose a new region-based division of the environment to build a useful representation of it. Object and scene information is integrated in a probabilistic fashion, generating a semantic map of the environment that contains meaningful regions within each room. The proposed system has been assessed in simulated and real-world domestic scenarios, demonstrating its ability to generate consistent environment representations.

    Lastly, full knowledge of the environment can enhance more complex robotic tasks, so we also study how complete knowledge of the environment influences the robot's performance in high-level tasks. To do so, we select an essential task: searching for objects. This mundane task can be considered a precondition for many complex robotic tasks, such as fetch-and-carry, manipulation, and serving user requests, and its efficient execution by service robots requires full knowledge of the environment. We propose two search strategies that consider prior information, the semantic representation of the environment, and the relationships between known objects and the type of scene.

    All our developments are evaluated in simulated and real-world environments, integrated with other systems, and operated on real platforms, demonstrating their feasibility for real scenarios and, in some cases, outperforming other approaches. We also demonstrate how our representation of the environment can boost the performance of more complex robotic tasks compared with more standard environment representations.
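    The abstract names weighted voting as the combination scheme but does not specify the base classifiers or how weights are chosen, so the following is a minimal sketch under assumed inputs: three hypothetical classifiers producing class probabilities, combined with normalized per-model weights (e.g., reflecting validation accuracy).

```python
# Minimal weighted soft-voting sketch for scene recognition (illustrative;
# the thesis' actual classifiers and weighting scheme are not given here).
import numpy as np

def weighted_vote(probabilities, weights):
    """probabilities: list of (n_classes,) arrays, one per base classifier.
    weights: per-classifier reliability weights. Returns winning class index
    and the blended probability distribution."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                               # normalize the weights
    combined = sum(wi * p for wi, p in zip(w, probabilities))
    return int(np.argmax(combined)), combined

# Example with three hypothetical classifiers over {kitchen, bedroom, office}:
p_cnn     = np.array([0.6, 0.3, 0.1])   # image-based CNN
p_objects = np.array([0.5, 0.1, 0.4])   # object-occurrence model
p_depth   = np.array([0.2, 0.5, 0.3])   # geometry/depth model
label, dist = weighted_vote([p_cnn, p_objects, p_depth], weights=[0.5, 0.3, 0.2])
print(label, dist)  # -> 0 (kitchen), blended distribution [0.49, 0.28, 0.23]
```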

    Event Structure in Vision and Language

    Our visual experience is surprisingly rich: we see not only low-level properties such as colors or contours; we also see events, or what is happening. Within linguistics, the examination of how we talk about events suggests that relatively abstract elements exist in the mind which pertain to the relational structure of events, including general thematic roles (e.g., Agent), Causation, Motion, and Transfer. For example, “Alex gave Jesse flowers” and “Jesse gave Alex flowers” both refer to an event of transfer, with the directionality of the transfer having different social consequences. The goal of the present research is to examine the extent to which abstract event information of this sort (event structure) is generated in visual perceptual processing. Do we perceive this information, just as we do more ‘traditional’ visual properties like color and shape? In the first study (Chapter 2), I used a novel behavioral paradigm to show that event roles – who is acting on whom – are rapidly and automatically extracted from visual scenes, even when participants are engaged in an orthogonal task, such as color or gender identification. In the second study (Chapter 3), I provided functional magnetic resonance imaging (fMRI) evidence for commonality in content between neural representations elicited by static snapshots of actions and by full, dynamic action sequences. These two studies suggest that relatively abstract representations of events are spontaneously extracted from sparse visual information. In the final study (Chapter 4), I return to language, the initial inspiration for my investigation of events in vision. Here I test the hypothesis that the human brain represents verbs in part via their associated event structures. Using a model of verbs based on event-structure semantic features (e.g., Cause, Motion, Transfer), it was possible to successfully predict fMRI responses in language-selective brain regions as people engaged in real-time comprehension of naturalistic speech. Taken together, my research reveals that in both perception and language, the mind rapidly constructs a representation of the world that includes events with relational structure.
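    The final study's approach is a voxelwise encoding model: regress brain responses on hand-coded event-structure features of verbs and test prediction on held-out data. A minimal sketch of that general technique follows, using simulated data; the feature set shown, the toy sizes, the ridge penalty, and the correlation metric are assumptions for illustration, not the study's actual pipeline.

```python
# Minimal voxelwise encoding-model sketch on simulated data (illustrative).
# Regress fMRI-like responses on binary event-structure features of verbs
# (e.g., Cause, Motion, Transfer) and score held-out voxels by correlation.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_timepoints, n_features, n_voxels = 400, 3, 50        # toy sizes

X = rng.integers(0, 2, size=(n_timepoints, n_features)).astype(float)  # [Cause, Motion, Transfer]
true_w = rng.normal(size=(n_features, n_voxels))
Y = X @ true_w + 0.5 * rng.normal(size=(n_timepoints, n_voxels))       # simulated responses

X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, test_size=0.25, random_state=0)
model = Ridge(alpha=1.0).fit(X_tr, Y_tr)

# Per-voxel accuracy: correlation between measured and predicted held-out
# responses, a common way to evaluate encoding models.
Y_hat = model.predict(X_te)
r = [np.corrcoef(Y_te[:, v], Y_hat[:, v])[0, 1] for v in range(n_voxels)]
print(f"median held-out voxel correlation: {np.median(r):.2f}")
```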