
    Towards Optimal Image Stitching for Virtual Microscopy

    In this paper we present an image stitching method based on dynamic programming and describe its application to automated slide acquisition for Virtual Microscopy (VM). Given a large number of fields of view (FOVs) acquired from a single microscope slide, we composite these images into a single large 'virtual slide' image. The location of each FOV is determined using a new algorithm based on dynamic programming. We compare the performance of the proposed algorithm to an existing greedy algorithm. A visual trial shows that the new algorithm provides a significant improvement in perceived image quality at image boundaries compared to the existing algorithm.

    A tracking framework for accurate face localization

    This paper proposes a complete framework for accurate face localization on video frames. Detection and forward tracking are first combined according to predefined rules to get a first set of face candidates. Backward tracking is then applied to provide another set of possible localizations. Finally a dynamic programming algorithm is used to select the candidates that minimize a specific cost function. This method was designed to handle different scale, pose and lighting conditions. The experiments show that it improves the face detection rate compared to a frame-based detector and provides a higher precision than a forward information-based tracker.
    IFIP International Conference on Artificial Intelligence in Theory and Practice - Machine Vision. Red de Universidades con Carreras en Informática (RedUNCI)
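The final selection step described in this abstract can be illustrated with a Viterbi-style dynamic program. The sketch below is not the authors' method: the abstract does not state the cost function, so `transition_cost`, the per-candidate `unary_cost`, the `Box` layout, and the function name `select_candidates` are all assumptions made for illustration. It picks one candidate per frame so that the summed cost along the whole sequence is minimal.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # hypothetical layout: (x, y, width, height)
Candidate = Tuple[Box, float]            # (box, unary_cost); lower cost is better


def transition_cost(a: Box, b: Box) -> float:
    """Assumed smoothness term: centre displacement plus size change."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    centre_shift = abs((ax + aw / 2) - (bx + bw / 2)) + abs((ay + ah / 2) - (by + bh / 2))
    size_change = abs(aw - bw) + abs(ah - bh)
    return centre_shift + size_change


def select_candidates(frames: Sequence[Sequence[Candidate]]) -> List[int]:
    """Return the index of the chosen candidate in each frame."""
    if not frames:
        return []
    if any(len(frame) == 0 for frame in frames):
        raise ValueError("every frame needs at least one candidate")

    # best[i] = minimal total cost of any path ending at candidate i of the current frame
    best = [unary for _, unary in frames[0]]
    back: List[List[int]] = []  # back[t - 1][i] = best predecessor of candidate i in frame t

    for t in range(1, len(frames)):
        new_best: List[float] = []
        pointers: List[int] = []
        for box, unary in frames[t]:
            costs = [
                best[j] + transition_cost(prev_box, box)
                for j, (prev_box, _) in enumerate(frames[t - 1])
            ]
            j_min = min(range(len(costs)), key=costs.__getitem__)
            new_best.append(costs[j_min] + unary)
            pointers.append(j_min)
        best = new_best
        back.append(pointers)

    # Trace the cheapest path backwards from the last frame.
    idx = min(range(len(best)), key=best.__getitem__)
    path = [idx]
    for pointers in reversed(back):
        idx = pointers[idx]
        path.append(idx)
    path.reverse()
    return path
```

Choosing the lowest unary cost in each frame independently can jump between distant candidates; the dynamic program trades a slightly worse per-frame score for a spatially consistent track.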

    Multiple Media Correlation: Theory and Applications

    This thesis introduces multiple media correlation, a new technology for the automatic alignment of multiple media objects such as text, audio, and video. This research began with the question: what can be learned when multiple multimedia components are analyzed simultaneously? Most ongoing research in computational multimedia has focused on queries, indexing, and retrieval within a single media type. Video is compressed and searched independently of audio; text is indexed without regard to temporal relationships it may have to other media data. Multiple media correlation provides a framework for locating and exploiting correlations between multiple, potentially heterogeneous, media streams. The goal is computed synchronization, the determination of temporal and spatial alignments that optimize a correlation function and indicate commonality and synchronization between media objects. The model also provides a basis for comparison of media in unrelated domains. There are many real-world applications for this technology, including speaker localization, musical score alignment, and degraded media realignment. Two applications, text-to-speech alignment and parallel text alignment, are described in detail with experimental validation. Text-to-speech alignment computes the alignment between a textual transcript and speech-based audio. The presented solutions are effective for a wide variety of content and are useful not only for retrieval of content, but also in support of automatic captioning of movies and video. Parallel text alignment provides a tool for the comparison of alternative translations of the same document that is particularly useful to the classics scholar interested in comparing translation techniques or styles.
    The results presented in this thesis include (a) new media models more useful in analysis applications, (b) a theoretical model for multiple media correlation, (c) two practical application solutions that have widespread applicability, and (d) Xtrieve, a multimedia database retrieval system that demonstrates this new technology and its application to information retrieval. This thesis demonstrates that computed alignment of media objects is practical and can provide immediate solutions to many information retrieval and content presentation problems. It also introduces a new area for research in media data analysis.

    Techniques for binocular markerless visual tracking of 3D articulated bodies

    Proceedings of the 2009 Joint Workshop of Fraunhofer IOSB and Institute for Anthropomatics, Vision and Fusion Laboratory

    The joint workshop of the Fraunhofer Institute of Optronics, System Technologies and Image Exploitation IOSB, Karlsruhe, and the Vision and Fusion Laboratory (Institute for Anthropomatics, Karlsruhe Institute of Technology (KIT)) has been organized annually since 2005 with the aim of reporting on the latest research and development findings of the doctoral students of both institutions. This book provides a collection of 16 technical reports on the research results presented at the 2009 workshop.

    SInCom 2015

    2nd Baden-Württemberg Center of Applied Research Symposium on Information and Communication Systems, SInCom 2015, 13. November 2015 in Konstan

    MIDAS: Multi-device Integrated Dynamic Activity Spaces

    Mobile phones, tablet computers, laptops, desktops, and large screen displays are increasingly available to individuals for information access, often simultaneously. Dominant content access protocols, such as HTTP/1.1, do not take advantage of this device multiplicity and support information access from single devices only. Changing devices means restarting an information session. Using devices in conjunction with each other poses several challenges, which include the presentation of content on devices with diverse form factors and propagation of the content changes across these devices. In this dissertation, I report on the design and implementation of MIDAS, an architecture and a prototype system for multi-device presentations. I propose a framework, called 12C, for characterizing multi-device systems and evaluate MIDAS within this framework. MIDAS is designed as a middleware that can work with multiple client-server architectures, such as the Web and context-aware Trellis, a non-Web hypertext system. It presents information content simultaneously on devices with diverse characteristics without requiring sensor-enhanced environments. The system adapts content elements for optimal presentation on the target device while also striving to retain fidelity with the original form from a human perceptual perspective. MIDAS reconfigures its presentation in response to user actions, availability of devices, and environmental context, such as a user's location or the time of day. I conducted a pilot study that explored human perception of similarity when image attributes such as size and color depth are modified in the process of presenting images on different devices. The results indicated that users tend to prefer scaling of images to color-depth reduction but gray scaling of images is preferable to either modification. Not all images scale equally gracefully; those dominated by natural elements or manmade structures scale exceptionally well.
    Images that depict recognizable human faces or textual elements should be scaled only to the extent that these features retain their integrity. Attributes of the 12C framework describe aspects of multi-device systems that include infrastructure, presentation, interaction, interface, and security. Based on these criteria, MIDAS is a flexible infrastructure, which lends itself to several content distribution and interaction strategies by separating client- and server-side configuration.