10 research outputs found

    Generalized motion and edge adaptive interpolation de-interlacing algorithm

    Get PDF
    This paper presents a generalized motion and edge adaptive de-interlacing framework, which offers a structured way to develop de-interlacing algorithm. The framework encompasses many typical de-interlacing algorithms, ranging from simple interpolation based algorithms, to more complex edge dependent and motion adaptive algorithms. Based on this framework, we develop a new de-interlacing algorithm which is efficient and artifacts-free. The proposed algorithm was evaluated by five video sequences, namely, "Akiyo", Mother and Daughter", "Silent", "Foreman" and "Stefan". Experimental results confirm that the proposed algorithm performs, both objectively and subjectively, much better than other similar algorithms. These promising results indicate that the proposed framework has good potential for realizing even better de-interlacing algorithms.postprin

    Motion and edge adaptive interpolation de-interlacing algorithm

    Get PDF
    This paper presents a new motion and edge adaptive de-interlacing algorithm, which is efficient and artifacts-free. It is novel in the sense that it introduces a way to properly interpolate the two (odd and even) field images according to the information provided by the simplest form of motion detection and edge orientation estimation methods. The proposed algorithm was evaluated by three video sequences, namely, 'Akiyo', 'Silent', 'Foreman'. Experimental results confirm that the proposed algorithm performs, both objectively and subjectively, much better than other similar algorithms. These promising results indicate that the proposed interpolation approach has good potential to realize even better de-interlacing algorithms, if more sophisticated motion detection and edge orientation estimation methods are employed.postprintThe 10th WSEAS international conference on Computers (ICCOMP'06), Athens, Greece, 13-15 July 2006. In Proceedings of ICCOMP, 2006, p. 1030-103

    Motion and edge adaptive interpolation de-interlacing algorithm

    Get PDF
    This paper presents a new motion and edge adaptive de-interlacing algorithm, which is efficient and artifacts-free. It is novel in the sense that it introduces a way to properly interpolate the two (odd and even) field images according to the information provided by the simplest form of motion detection and edge orientation estimation methods. The proposed algorithm was evaluated by three video sequences, namely, 'Akiyo', 'Silent', 'Foreman'. Experimental results confirm that the proposed algorithm performs, both objectively and subjectively, much better than other similar algorithms. These promising results indicate that the proposed interpolation approach has good potential to realize even better de-interlacing algorithms, if more sophisticated motion detection and edge orientation estimation methods are employed.postprintThe 10th WSEAS international conference on Computers (ICCOMP'06), Athens, Greece, 13-15 July 2006. In Proceedings of ICCOMP, 2006, p. 1030-103

    Automatic video segmentation employing object/camera modeling techniques

    Get PDF
    Practically established video compression and storage techniques still process video sequences as rectangular images without further semantic structure. However, humans watching a video sequence immediately recognize acting objects as semantic units. This semantic object separation is currently not reflected in the technical system, making it difficult to manipulate the video at the object level. The realization of object-based manipulation will introduce many new possibilities for working with videos like composing new scenes from pre-existing video objects or enabling user-interaction with the scene. Moreover, object-based video compression, as defined in the MPEG-4 standard, can provide high compression ratios because the foreground objects can be sent independently from the background. In the case that the scene background is static, the background views can even be combined into a large panoramic sprite image, from which the current camera view is extracted. This results in a higher compression ratio since the sprite image for each scene only has to be sent once. A prerequisite for employing object-based video processing is automatic (or at least user-assisted semi-automatic) segmentation of the input video into semantic units, the video objects. This segmentation is a difficult problem because the computer does not have the vast amount of pre-knowledge that humans subconsciously use for object detection. Thus, even the simple definition of the desired output of a segmentation system is difficult. The subject of this thesis is to provide algorithms for segmentation that are applicable to common video material and that are computationally efficient. The thesis is conceptually separated into three parts. In Part I, an automatic segmentation system for general video content is described in detail. Part II introduces object models as a tool to incorporate userdefined knowledge about the objects to be extracted into the segmentation process. Part III concentrates on the modeling of camera motion in order to relate the observed camera motion to real-world camera parameters. The segmentation system that is described in Part I is based on a background-subtraction technique. The pure background image that is required for this technique is synthesized from the input video itself. Sequences that contain rotational camera motion can also be processed since the camera motion is estimated and the input images are aligned into a panoramic scene-background. This approach is fully compatible to the MPEG-4 video-encoding framework, such that the segmentation system can be easily combined with an object-based MPEG-4 video codec. After an introduction to the theory of projective geometry in Chapter 2, which is required for the derivation of camera-motion models, the estimation of camera motion is discussed in Chapters 3 and 4. It is important that the camera-motion estimation is not influenced by foreground object motion. At the same time, the estimation should provide accurate motion parameters such that all input frames can be combined seamlessly into a background image. The core motion estimation is based on a feature-based approach where the motion parameters are determined with a robust-estimation algorithm (RANSAC) in order to distinguish the camera motion from simultaneously visible object motion. Our experiments showed that the robustness of the original RANSAC algorithm in practice does not reach the theoretically predicted performance. An analysis of the problem has revealed that this is caused by numerical instabilities that can be significantly reduced by a modification that we describe in Chapter 4. The synthetization of static-background images is discussed in Chapter 5. In particular, we present a new algorithm for the removal of the foreground objects from the background image such that a pure scene background remains. The proposed algorithm is optimized to synthesize the background even for difficult scenes in which the background is only visible for short periods of time. The problem is solved by clustering the image content for each region over time, such that each cluster comprises static content. Furthermore, it is exploited that the times, in which foreground objects appear in an image region, are similar to the corresponding times of neighboring image areas. The reconstructed background could be used directly as the sprite image in an MPEG-4 video coder. However, we have discovered that the counterintuitive approach of splitting the background into several independent parts can reduce the overall amount of data. In the case of general camera motion, the construction of a single sprite image is even impossible. In Chapter 6, a multi-sprite partitioning algorithm is presented, which separates the video sequence into a number of segments, for which independent sprites are synthesized. The partitioning is computed in such a way that the total area of the resulting sprites is minimized, while simultaneously satisfying additional constraints. These include a limited sprite-buffer size at the decoder, and the restriction that the image resolution in the sprite should never fall below the input-image resolution. The described multisprite approach is fully compatible to the MPEG-4 standard, but provides three advantages. First, any arbitrary rotational camera motion can be processed. Second, the coding-cost for transmitting the sprite images is lower, and finally, the quality of the decoded sprite images is better than in previously proposed sprite-generation algorithms. Segmentation masks for the foreground objects are computed with a change-detection algorithm that compares the pure background image with the input images. A special effect that occurs in the change detection is the problem of image misregistration. Since the change detection compares co-located image pixels in the camera-motion compensated images, a small error in the motion estimation can introduce segmentation errors because non-corresponding pixels are compared. We approach this problem in Chapter 7 by integrating risk-maps into the segmentation algorithm that identify pixels for which misregistration would probably result in errors. For these image areas, the change-detection algorithm is modified to disregard the difference values for the pixels marked in the risk-map. This modification significantly reduces the number of false object detections in fine-textured image areas. The algorithmic building-blocks described above can be combined into a segmentation system in various ways, depending on whether camera motion has to be considered or whether real-time execution is required. These different systems and example applications are discussed in Chapter 8. Part II of the thesis extends the described segmentation system to consider object models in the analysis. Object models allow the user to specify which objects should be extracted from the video. In Chapters 9 and 10, a graph-based object model is presented in which the features of the main object regions are summarized in the graph nodes, and the spatial relations between these regions are expressed with the graph edges. The segmentation algorithm is extended by an object-detection algorithm that searches the input image for the user-defined object model. We provide two objectdetection algorithms. The first one is specific for cartoon sequences and uses an efficient sub-graph matching algorithm, whereas the second processes natural video sequences. With the object-model extension, the segmentation system can be controlled to extract individual objects, even if the input sequence comprises many objects. Chapter 11 proposes an alternative approach to incorporate object models into a segmentation algorithm. The chapter describes a semi-automatic segmentation algorithm, in which the user coarsely marks the object and the computer refines this to the exact object boundary. Afterwards, the object is tracked automatically through the sequence. In this algorithm, the object model is defined as the texture along the object contour. This texture is extracted in the first frame and then used during the object tracking to localize the original object. The core of the algorithm uses a graph representation of the image and a newly developed algorithm for computing shortest circular-paths in planar graphs. The proposed algorithm is faster than the currently known algorithms for this problem, and it can also be applied to many alternative problems like shape matching. Part III of the thesis elaborates on different techniques to derive information about the physical 3-D world from the camera motion. In the segmentation system, we employ camera-motion estimation, but the obtained parameters have no direct physical meaning. Chapter 12 discusses an extension to the camera-motion estimation to factorize the motion parameters into physically meaningful parameters (rotation angles, focal-length) using camera autocalibration techniques. The speciality of the algorithm is that it can process camera motion that spans several sprites by employing the above multi-sprite technique. Consequently, the algorithm can be applied to arbitrary rotational camera motion. For the analysis of video sequences, it is often required to determine and follow the position of the objects. Clearly, the object position in image coordinates provides little information if the viewing direction of the camera is not known. Chapter 13 provides a new algorithm to deduce the transformation between the image coordinates and the real-world coordinates for the special application of sport-video analysis. In sport videos, the camera view can be derived from markings on the playing field. For this reason, we employ a model of the playing field that describes the arrangement of lines. After detecting significant lines in the input image, a combinatorial search is carried out to establish correspondences between lines in the input image and lines in the model. The algorithm requires no information about the specific color of the playing field and it is very robust to occlusions or poor lighting conditions. Moreover, the algorithm is generic in the sense that it can be applied to any type of sport by simply exchanging the model of the playing field. In Chapter 14, we again consider panoramic background images and particularly focus ib their visualization. Apart from the planar backgroundsprites discussed previously, a frequently-used visualization technique for panoramic images are projections onto a cylinder surface which is unwrapped into a rectangular image. However, the disadvantage of this approach is that the viewer has no good orientation in the panoramic image because he looks into all directions at the same time. In order to provide a more intuitive presentation of wide-angle views, we have developed a visualization technique specialized for the case of indoor environments. We present an algorithm to determine the 3-D shape of the room in which the image was captured, or, more generally, to compute a complete floor plan if several panoramic images captured in each of the rooms are provided. Based on the obtained 3-D geometry, a graphical model of the rooms is constructed, where the walls are displayed with textures that are extracted from the panoramic images. This representation enables to conduct virtual walk-throughs in the reconstructed room and therefore, provides a better orientation for the user. Summarizing, we can conclude that all segmentation techniques employ some definition of foreground objects. These definitions are either explicit, using object models like in Part II of this thesis, or they are implicitly defined like in the background synthetization in Part I. The results of this thesis show that implicit descriptions, which extract their definition from video content, work well when the sequence is long enough to extract this information reliably. However, high-level semantics are difficult to integrate into the segmentation approaches that are based on implicit models. Intead, those semantics should be added as postprocessing steps. On the other hand, explicit object models apply semantic pre-knowledge at early stages of the segmentation. Moreover, they can be applied to short video sequences or even still pictures since no background model has to be extracted from the video. The definition of a general object-modeling technique that is widely applicable and that also enables an accurate segmentation remains an important yet challenging problem for further research

    Multiresolution image models and estimation techniques

    Get PDF

    Hardware-accelerated algorithms in visual computing

    Get PDF
    This thesis presents new parallel algorithms which accelerate computer vision methods by the use of graphics processors (GPUs) and evaluates them with respect to their speed, scalability, and the quality of their results. It covers the fields of homogeneous and anisotropic diffusion processes, diffusion image inpainting, optic flow, and halftoning. In this turn, it compares different solvers for homogeneous diffusion and presents a novel \u27extended\u27 box filter. Moreover, it suggests to use the fast explicit diffusion scheme (FED) as an efficient and flexible solver for nonlinear and in particular for anisotropic parabolic diffusion problems on graphics hardware. For elliptic diffusion-like processes, it recommends to use cascadic FED or Fast Jacobi schemes. The presented optic flow algorithm represents one of the fastest yet very accurate techniques. Finally, it presents a novel halftoning scheme which yields state-of-the-art results for many applications in image processing and computer graphics.Diese Arbeit präsentiert neue parallele Algorithmen zur Beschleunigung von Methoden in der Bildinformatik mittels Grafikprozessoren (GPUs), und evaluiert diese im Hinblick auf Geschwindigkeit, Skalierungsverhalten, und Qualität der Resultate. Sie behandelt dabei die Gebiete der homogenen und anisotropen Diffusionsprozesse, Inpainting (Bildvervollständigung) mittels Diffusion, die Bestimmung des optischen Flusses, sowie Halbtonverfahren. Dabei werden verschiedene Löser für homogene Diffusion verglichen und ein neuer \u27erweiterter\u27 Mittelwertfilter präsentiert. Ferner wird vorgeschlagen, das schnelle explizite Diffusionsschema (FED) als effizienten und flexiblen Löser für parabolische nichtlineare und speziell anisotrope Diffusionsprozesse auf Grafikprozessoren einzusetzen. Für elliptische diffusionsartige Prozesse wird hingegen empfohlen, kaskadierte FED- oder schnelle Jacobi-Verfahren einzusetzen. Der vorgestellte Algorithmus zur Berechnung des optischen Flusses stellt eines der schnellsten und dennoch äußerst genauen Verfahren dar. Schließlich wird ein neues Halbtonverfahren präsentiert, das in vielen Bereichen der Bildverarbeitung und Computergrafik Ergebnisse produziert, die den Stand der Technik repräsentieren

    Interactive mixed reality rendering in a distributed ray tracing framework

    Get PDF
    The recent availability of interactive ray tracing opened the way for new applications and for improving existing ones in terms of quality. Since today CPUs are still too slow for this purpose, the necessary computing power is obtained by connecting a number of machines and using distributed algorithms. Mixed reality rendering - the realm of convincingly combining real and virtual parts to a new composite scene - needs a powerful rendering method to obtain a photorealistic result. The ray tracing algorithm thus provides an excellent basis for photorealistic rendering and also advantages over other methods. It is worth to explore its abilities for interactive mixed reality rendering. This thesis shows the applicability of interactive ray tracing for mixed (MR) and augmented reality (AR) applications on the basis of the OpenRT framework. Two extensions to the OpenRT system are introduced and serve as basic building blocks: streaming video textures and in-shader AR view compositing. Streaming video textures allow for inclusion of the real world into interactive applications in terms of imagery. The AR view compositing mechanism is needed to fully exploit the advantages of modular shading in a ray tracer. A number of example applications from the entire spectrum of the Milgram Reality-Virtuality continuum illustrate the practical implications. An implementation of a classic AR scenario, inserting a virtual object into live video, shows how a differential rendering method can be used in combination with a custom build real-time lightprobe device to capture the incident light and include it into the rendering process to achieve convincing shading and shadows. Another field of mixed reality rendering is the insertion of real actors into a virtual scene in real-time. Two methods - video billboards and a live 3D visual hull reconstruction - are discussed. The implementation of live mixed reality systems is based on a number of technologies beside rendering and a comprehensive understanding of related methods and hardware is necessary. Large parts of this thesis hence deal with the discussion of technical implementations and design alternatives. A final summary discusses the benefits and drawbacks of interactive ray tracing for mixed reality rendering.Die Verfügbarkeit von interaktivem Ray-Tracing ebnet den Weg für neue Anwendungen, aber auch für die Verbesserung der Qualität bestehener Methoden. Da die heute verfügbaren CPUs noch zu langsam sind, ist es notwendig, mehrere Maschinen zu verbinden und verteilte Algorithmen zu verwenden. Mixed Reality Rendering - die Technik der überzeugenden Kombination von realen und synthetischen Teilen zu einer neuen Szene - braucht eine leistungsfähige Rendering-Methode um photorealistische Ergebnisse zu erzielen. Der Ray-Tracing-Algorithmus bietet hierfür eine exzellente Basis, aber auch Vorteile gegenüber anderen Methoden. Es ist naheliegend, die Möglichkeiten von Ray-Tracing für Mixed-Reality-Anwendungen zu erforschen. Diese Arbeit zeigt die Anwendbarkeit von interaktivem Ray-Tracing für Mixed-Reality (MR) und Augmented-Reality (AR) Anwendungen anhand des OpenRT-Systems. Zwei Erweiterungen dienen als Grundbausteine: Videotexturen und In-Shader AR View Compositing. Videotexturen erlauben die reale Welt in Form von Bildern in den Rendering-Prozess mit einzubeziehen. Der View-Compositing-Mechanismus is notwendig um die Modularität einen Ray-Tracers voll auszunutzen. Eine Reihe von Beispielanwendungen von beiden Enden des Milgramschen Reality-Virtuality-Kontinuums verdeutlichen die praktischen Aspekte. Eine Implementierung des klassischen AR-Szenarios, das Einfügen eines virtuellen Objektes in eine Live-Übertragung zeigt, wie mittels einer Differential Rendering Methode und einem selbstgebauten Gerät zur Erfassung des einfallenden Lichts realistische Beleuchtung und Schatten erzielt werden können. Ein anderer Anwendungsbereich ist das Einfügen einer realen Person in eine künstliche Szene. Hierzu werden zwei Methoden besprochen: Video-Billboards und eine interaktive 3D Rekonstruktion. Da die Implementierung von Mixed-Reality-Anwendungen Kentnisse und Verständnis einer ganzen Reihe von Technologien nebem dem eigentlichen Rendering voraus setzt, ist eine Diskussion der technischen Grundlagen ein wesentlicher Bestandteil dieser Arbeit. Dies ist notwenig, um die Entscheidungen für bestimmte Designalternativen zu verstehen. Den Abschluss bildet eine Diskussion der Vor- und Nachteile von interaktivem Ray-Tracing für Mixed Reality Anwendungen

    Interactive mixed reality rendering in a distributed ray tracing framework

    Get PDF
    The recent availability of interactive ray tracing opened the way for new applications and for improving existing ones in terms of quality. Since today CPUs are still too slow for this purpose, the necessary computing power is obtained by connecting a number of machines and using distributed algorithms. Mixed reality rendering - the realm of convincingly combining real and virtual parts to a new composite scene - needs a powerful rendering method to obtain a photorealistic result. The ray tracing algorithm thus provides an excellent basis for photorealistic rendering and also advantages over other methods. It is worth to explore its abilities for interactive mixed reality rendering. This thesis shows the applicability of interactive ray tracing for mixed (MR) and augmented reality (AR) applications on the basis of the OpenRT framework. Two extensions to the OpenRT system are introduced and serve as basic building blocks: streaming video textures and in-shader AR view compositing. Streaming video textures allow for inclusion of the real world into interactive applications in terms of imagery. The AR view compositing mechanism is needed to fully exploit the advantages of modular shading in a ray tracer. A number of example applications from the entire spectrum of the Milgram Reality-Virtuality continuum illustrate the practical implications. An implementation of a classic AR scenario, inserting a virtual object into live video, shows how a differential rendering method can be used in combination with a custom build real-time lightprobe device to capture the incident light and include it into the rendering process to achieve convincing shading and shadows. Another field of mixed reality rendering is the insertion of real actors into a virtual scene in real-time. Two methods - video billboards and a live 3D visual hull reconstruction - are discussed. The implementation of live mixed reality systems is based on a number of technologies beside rendering and a comprehensive understanding of related methods and hardware is necessary. Large parts of this thesis hence deal with the discussion of technical implementations and design alternatives. A final summary discusses the benefits and drawbacks of interactive ray tracing for mixed reality rendering.Die Verfügbarkeit von interaktivem Ray-Tracing ebnet den Weg für neue Anwendungen, aber auch für die Verbesserung der Qualität bestehener Methoden. Da die heute verfügbaren CPUs noch zu langsam sind, ist es notwendig, mehrere Maschinen zu verbinden und verteilte Algorithmen zu verwenden. Mixed Reality Rendering - die Technik der überzeugenden Kombination von realen und synthetischen Teilen zu einer neuen Szene - braucht eine leistungsfähige Rendering-Methode um photorealistische Ergebnisse zu erzielen. Der Ray-Tracing-Algorithmus bietet hierfür eine exzellente Basis, aber auch Vorteile gegenüber anderen Methoden. Es ist naheliegend, die Möglichkeiten von Ray-Tracing für Mixed-Reality-Anwendungen zu erforschen. Diese Arbeit zeigt die Anwendbarkeit von interaktivem Ray-Tracing für Mixed-Reality (MR) und Augmented-Reality (AR) Anwendungen anhand des OpenRT-Systems. Zwei Erweiterungen dienen als Grundbausteine: Videotexturen und In-Shader AR View Compositing. Videotexturen erlauben die reale Welt in Form von Bildern in den Rendering-Prozess mit einzubeziehen. Der View-Compositing-Mechanismus is notwendig um die Modularität einen Ray-Tracers voll auszunutzen. Eine Reihe von Beispielanwendungen von beiden Enden des Milgramschen Reality-Virtuality-Kontinuums verdeutlichen die praktischen Aspekte. Eine Implementierung des klassischen AR-Szenarios, das Einfügen eines virtuellen Objektes in eine Live-Übertragung zeigt, wie mittels einer Differential Rendering Methode und einem selbstgebauten Gerät zur Erfassung des einfallenden Lichts realistische Beleuchtung und Schatten erzielt werden können. Ein anderer Anwendungsbereich ist das Einfügen einer realen Person in eine künstliche Szene. Hierzu werden zwei Methoden besprochen: Video-Billboards und eine interaktive 3D Rekonstruktion. Da die Implementierung von Mixed-Reality-Anwendungen Kentnisse und Verständnis einer ganzen Reihe von Technologien nebem dem eigentlichen Rendering voraus setzt, ist eine Diskussion der technischen Grundlagen ein wesentlicher Bestandteil dieser Arbeit. Dies ist notwenig, um die Entscheidungen für bestimmte Designalternativen zu verstehen. Den Abschluss bildet eine Diskussion der Vor- und Nachteile von interaktivem Ray-Tracing für Mixed Reality Anwendungen

    La traduzione specializzata all’opera per una piccola impresa in espansione: la mia esperienza di internazionalizzazione in cinese di Bioretics© S.r.l.

    Get PDF
    Global markets are currently immersed in two all-encompassing and unstoppable processes: internationalization and globalization. While the former pushes companies to look beyond the borders of their country of origin to forge relationships with foreign trading partners, the latter fosters the standardization in all countries, by reducing spatiotemporal distances and breaking down geographical, political, economic and socio-cultural barriers. In recent decades, another domain has appeared to propel these unifying drives: Artificial Intelligence, together with its high technologies aiming to implement human cognitive abilities in machinery. The “Language Toolkit – Le lingue straniere al servizio dell’internazionalizzazione dell’impresa” project, promoted by the Department of Interpreting and Translation (Forlì Campus) in collaboration with the Romagna Chamber of Commerce (Forlì-Cesena and Rimini), seeks to help Italian SMEs make their way into the global market. It is precisely within this project that this dissertation has been conceived. Indeed, its purpose is to present the translation and localization project from English into Chinese of a series of texts produced by Bioretics© S.r.l.: an investor deck, the company website and part of the installation and use manual of the Aliquis© framework software, its flagship product. This dissertation is structured as follows: Chapter 1 presents the project and the company in detail; Chapter 2 outlines the internationalization and globalization processes and the Artificial Intelligence market both in Italy and in China; Chapter 3 provides the theoretical foundations for every aspect related to Specialized Translation, including website localization; Chapter 4 describes the resources and tools used to perform the translations; Chapter 5 proposes an analysis of the source texts; Chapter 6 is a commentary on translation strategies and choices
    corecore