
    Format-independent media resource adaptation and delivery


    Acquiring Qualitative Explainable Graphs for Automated Driving Scene Interpretation

    The future of automated driving (AD) is rooted in the development of robust, fair and explainable artificial intelligence methods. Upon request, automated vehicles must be able to explain their decisions to the driver and passengers, to pedestrians and other vulnerable road users, and potentially to external auditors in case of an accident. However, most explainable methods today still rely on quantitative analysis of the AD scene representations captured by multiple sensors. This paper proposes a novel representation of AD scenes, called the Qualitative eXplainable Graph (QXG), dedicated to qualitative spatiotemporal reasoning over long-term scenes. The construction of this graph exploits the recent Qualitative Constraint Acquisition paradigm. Our experimental results on NuScenes, an open real-world multi-modal dataset, show that the QXG of an AD scene composed of 40 frames can be computed in real time and is light in storage, which makes it a potentially interesting tool for improved and more trustworthy perception and control processes in AD.
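    To make the graph construction concrete, the sketch below builds, for every pair of tracked objects, the sequence of qualitative relations that hold between them across the frames of a scene. The class names, fields and the tiny relation vocabulary are hypothetical simplifications for illustration only; the paper's actual QXG is acquired with the Qualitative Constraint Acquisition paradigm, which is not reproduced here.

```java
import java.util.*;

// Minimal, hypothetical sketch of a qualitative scene graph: nodes are
// tracked objects, edges carry the per-frame qualitative relations between
// an object pair. Not the paper's algorithm, just an illustration.
public class QxgSketch {

    // One detected object in one frame (illustrative fields).
    record Detection(String objectId, double x, double y) {}

    // A tiny qualitative vocabulary standing in for a real calculus.
    enum Relation { LEFT_OF, RIGHT_OF, AHEAD_OF, BEHIND }

    // Qualify the relative position of a with respect to b.
    static List<Relation> qualify(Detection a, Detection b) {
        List<Relation> rels = new ArrayList<>();
        rels.add(a.x() < b.x() ? Relation.LEFT_OF : Relation.RIGHT_OF);
        rels.add(a.y() > b.y() ? Relation.AHEAD_OF : Relation.BEHIND);
        return rels;
    }

    // Edge key: unordered pair of object ids.
    static String edgeKey(String a, String b) {
        return a.compareTo(b) < 0 ? a + "|" + b : b + "|" + a;
    }

    public static void main(String[] args) {
        // frames.get(t) holds the detections of frame t
        // (a NuScenes scene in the paper has 40 such frames).
        List<List<Detection>> frames = List.of(
            List.of(new Detection("ego", 0, 0), new Detection("ped1", 2, 5)),
            List.of(new Detection("ego", 0, 2), new Detection("ped1", 2, 5)));

        // The graph: per object pair, the history of qualitative relations.
        Map<String, List<List<Relation>>> graph = new HashMap<>();
        for (List<Detection> frame : frames) {
            for (int i = 0; i < frame.size(); i++) {
                for (int j = i + 1; j < frame.size(); j++) {
                    Detection a = frame.get(i), b = frame.get(j);
                    graph.computeIfAbsent(edgeKey(a.objectId(), b.objectId()),
                                          k -> new ArrayList<>())
                         .add(qualify(a, b));
                }
            }
        }
        graph.forEach((pair, history) -> System.out.println(pair + " -> " + history));
    }
}
```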

    Interpretation of complex situations in a semantic-based surveillance framework

    The integration of cognitive capabilities in computer vision systems requires both high semantic expressiveness and the ability to cope with the high computational cost of analysing large amounts of data. This contribution describes a cognitive vision system conceived to automatically provide high-level interpretations of complex real-time situations in outdoor and indoor scenarios, and to eventually maintain communication with casual end users in multiple languages. The main contributions are: (i) the design of an integrative multilevel architecture for cognitive surveillance purposes; (ii) the proposal of a coherent taxonomy of knowledge to guide the process of interpretation, which leads to the conception of a situation-based ontology; (iii) the use of situational analysis for content detection and for the progressive interpretation of semantically rich scenes, managing incomplete or uncertain knowledge; and (iv) the use of such an ontological background to enable multilingual capabilities and advanced end-user interfaces. Experimental results are provided to show the feasibility of the proposed approach. This work was supported by the project 'CONSOLIDER-INGENIO 2010 Multimodal interaction in pattern recognition and computer vision' (V-00069), by EC Grants IST-027110 for the HERMES project and IST-045547 for the VIDI-video project, and by the Spanish MEC under Projects TIN2006-14606 and CONSOLIDER-INGENIO 2010 (CSD2007-00018). Jordi Gonzàlez also acknowledges the support of a Juan de la Cierva Postdoctoral fellowship from the Spanish MEC. Peer Reviewed

    MPEG-4's BIFS-Anim protocol: using MPEG-4 for streaming of 3D animations

    This thesis explores issues related to the generation and animation of synthetic objects within the context of MPEG-4. MPEG-4 was designed to provide a standard that will deliver rich multimedia content on many different platforms and networks. MPEG-4 should be viewed as a toolbox rather than as a monolithic standard, as each implementer will pick the tools adequate to their needs, likely a small subset of those available. The subset of MPEG-4 examined here comprises the tools for the generation of 3D scenes and for the animation of those scenes. A comparison with the most popular 3D standard, the Virtual Reality Modeling Language (VRML), will be included. An overview of the MPEG-4 standard will be given, describing the basic concepts. MPEG-4 uses a scene description language called Binary Format for Scene (BIFS) for the composition of scenes; this description language will be described. The potential of the technology used in BIFS to provide low-bitrate streaming of 3D animations will be analysed, and some examples of possible uses of this technology will be given. A tool for the encoding of streaming 3D animations will be described, and results will show that MPEG-4 provides a more efficient way of encoding 3D data than VRML. Finally, a look will be taken at the future of 3D content on the Internet.
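    As an informal illustration of why a binary, quantized animation stream can be far more compact than a textual scene description, the toy sketch below compares a single quantized position keyframe against the same keyframe written as a VRML-style text field. The quantization parameters are assumptions for the example; this is not the actual BIFS-Anim bitstream syntax.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Toy comparison of a quantized binary keyframe vs. the same keyframe as
// VRML-style text. Illustrative only; not the BIFS-Anim encoding itself.
public class BinaryVsTextKeyframe {

    // Quantize a coordinate in [-range, range] to a signed 16-bit value.
    static short quantize(double v, double range) {
        return (short) Math.round(v / range * Short.MAX_VALUE);
    }

    public static void main(String[] args) {
        double[] position = {1.2345, -0.6789, 3.1416};
        double range = 10.0; // assumed coordinate range of the scene

        // Binary: 2 bytes per component after quantization.
        ByteBuffer binary = ByteBuffer.allocate(3 * Short.BYTES);
        for (double c : position) binary.putShort(quantize(c, range));

        // Text: the same keyframe as a VRML-style field value.
        String text = "translation 1.2345 -0.6789 3.1416";

        System.out.println("binary frame: " + binary.position() + " bytes");
        System.out.println("text frame:   "
                + text.getBytes(StandardCharsets.UTF_8).length + " bytes");
    }
}
```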

    Self-organizing distributed digital library supporting audio-video

    The StreamOnTheFly network combines peer-to-peer networking and open-archive principles for community radio channels and TV stations in Europe. StreamOnTheFly demonstrates new methods of archive management and personalization technologies for both audio and video. It also provides a collaboration platform that suits the flexible activity patterns of these kinds of broadcaster communities.

    Rapid Prototyping for Virtual Environments

    Development of Virtual Environment (VE) applications is challenging: developers are required to have expertise in the target VE technologies along with expertise in the problem domain. New VE technologies impose a significant learning curve on even the most experienced VE developer. The proposed solution relies on synthesis to automate the migration of a VE application to a new, unfamiliar VE platform or technology. To solve the problem, the Common Scene Definition Framework (CSDF) is developed, which serves as a superset model representation of the target virtual world. Input modules are developed to populate the framework with the capabilities of the virtual world imported from the VRML 2.0 and X3D formats. A synthesis capability is built into the framework to synthesize the virtual world into a subset of the VRML 2.0, VRML 1.0, X3D, Java3D, JavaFX, JavaME, and OpenGL technologies, which may reside on different platforms. Interfaces are designed to keep the framework extensible to different and new VE formats and technologies. The framework demonstrated the ability to quickly synthesize a working prototype of the input virtual environment in different VE formats.
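    The sketch below illustrates the importer/synthesizer split described above: input modules populate a common scene model, and pluggable synthesizers emit it in a target format. The interface and class names are hypothetical stand-ins for illustration, not the actual CSDF API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a superset scene model with pluggable importers
// and synthesizers, in the spirit of the framework described above.
public class CsdfSketch {

    // Minimal stand-in for the common superset scene representation.
    static class SceneModel {
        final List<String> nodes = new ArrayList<>();
    }

    // Input modules populate the common model from a source format.
    interface Importer {
        SceneModel importScene(String source); // e.g. VRML 2.0 or X3D text
    }

    // Synthesis modules emit the model in a target technology.
    interface Synthesizer {
        String synthesize(SceneModel model);   // e.g. X3D, Java3D source
    }

    // A toy X3D-style exporter: wraps each node name in an element.
    static class X3dSynthesizer implements Synthesizer {
        @Override public String synthesize(SceneModel model) {
            StringBuilder sb = new StringBuilder("<X3D><Scene>");
            model.nodes.forEach(n -> sb.append("<").append(n).append("/>"));
            return sb.append("</Scene></X3D>").toString();
        }
    }

    public static void main(String[] args) {
        SceneModel model = new SceneModel();
        model.nodes.add("Transform");
        model.nodes.add("Shape");
        System.out.println(new X3dSynthesizer().synthesize(model));
    }
}
```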

    A graphics software architecture for high-end interactive TV terminals

    This thesis proposes a graphics architecture for next-generation digital television receivers. The starting assumption is that in the future, a number of multimedia terminals will have access, through a number of networks, to a variety of content and services. One example of such a device is a media station capable of integrating different kinds of multimedia objects such as 2D/3D graphics and video, reacting to user interaction, and supporting the temporal dimension of applications. Some of the services intended for these devices include, for example, games and enhanced information over broadcast video.

    First, this thesis provides an overview of the digital television environment, focusing on the limitations of current receivers and hinting at future directions. In addition, it compares different solutions from regional standardisation bodies such as DVB, CableLabs, and ARIB, and proposes the adoption of the most relevant initiative, GEM by DVB. Unfortunately, the GEM software middleware only considers the Java language as an authoring format, meaning that the declarative environment and advanced functionalities (e.g., 3D graphics support) remain to be standardised.

    Because in the future different user groups will have different demands with regard to television, this thesis identifies two major extensions to the GEM standard. First, it proposes a declarative environment for GEM that takes into account W3C standardisation efforts. This environment is divided into two configurations: one capable of rendering limited interactive applications such as information services, and another intended for more demanding applications, for example a distance-learning portal that synchronises videos of lecturers and slides. Second, this thesis proposes to extend the procedural environment of GEM with 3D graphics support. The potential services of this new profile, High-End Interactive, include games and commercials.

    Then, based on the requirements the proposed profiles should meet, this thesis defines a graphics architecture model composed of five layers. The hardware abstraction layer is in charge of rendering the final graphics output. The graphical context is a cross-platform abstraction of the rendering region and provides graphics primitives (e.g., rectangles and images). The graphical environment provides the means to control different graphical contexts. The GUI toolkit is a set of ready-made user interface widgets and layout schemes. Finally, high-level languages are easy-to-use tools for developing simple services.

    The thesis concludes with a report of my experience implementing a digital television receiver based on the proposals described. In addition to testing the application of the proposed graphics architecture to the design and implementation of a next-generation digital television receiver, the implementation permits the analysis of the requirements of such receivers and of the services they can provide.
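    The sketch below expresses the five layers enumerated above as minimal interfaces, to show how they stack from hardware output up to high-level authoring. The interface names and method signatures are illustrative assumptions, not the thesis's actual APIs.

```java
// Hypothetical sketch of the five-layer split described above; names and
// signatures are illustrative assumptions, not the thesis's actual APIs.
public class GraphicsStackSketch {

    // Layer 1: hardware abstraction renders the final graphics output.
    interface HardwareAbstraction {
        void present(int[] frameBuffer);
    }

    // Layer 2: graphical context, a cross-platform abstraction of the
    // rendering region offering primitives such as rectangles and images.
    interface GraphicsContext {
        void drawRect(int x, int y, int w, int h);
        void drawImage(String imageRef, int x, int y);
    }

    // Layer 3: graphical environment controls several contexts
    // (e.g. one for broadcast video, one for application graphics).
    interface GraphicsEnvironment {
        GraphicsContext createContext(String name);
    }

    // Layer 4: GUI toolkit of ready-made widgets and layout schemes.
    interface Widget {
        void paint(GraphicsContext ctx);
    }

    public static void main(String[] args) {
        // A trivial console-backed environment standing in for a receiver.
        GraphicsEnvironment env = name -> new GraphicsContext() {
            public void drawRect(int x, int y, int w, int h) {
                System.out.printf("[%s] rect %d,%d %dx%d%n", name, x, y, w, h);
            }
            public void drawImage(String imageRef, int x, int y) {
                System.out.printf("[%s] image %s at %d,%d%n", name, imageRef, x, y);
            }
        };

        // Layer 5 would be a high-level, declarative description that is
        // mapped onto widgets such as this button-style placeholder.
        Widget button = ctx -> ctx.drawRect(10, 10, 120, 40);
        button.paint(env.createContext("app"));
    }
}
```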