15 research outputs found

    Dimensionality reduction and sparse representations in computer vision

    Get PDF
    The proliferation of camera equipped devices, such as netbooks, smartphones and game stations, has led to a significant increase in the production of visual content. This visual information could be used for understanding the environment and offering a natural interface between the users and their surroundings. However, the massive amounts of data and the high computational cost associated with them, encumbers the transfer of sophisticated vision algorithms to real life systems, especially ones that exhibit resource limitations such as restrictions in available memory, processing power and bandwidth. One approach for tackling these issues is to generate compact and descriptive representations of image data by exploiting inherent redundancies. We propose the investigation of dimensionality reduction and sparse representations in order to accomplish this task. In dimensionality reduction, the aim is to reduce the dimensions of the space where image data reside in order to allow resource constrained systems to handle them and, ideally, provide a more insightful description. This goal is achieved by exploiting the inherent redundancies that many classes of images, such as faces under different illumination conditions and objects from different viewpoints, exhibit. We explore the description of natural images by low dimensional non-linear models called image manifolds and investigate the performance of computer vision tasks such as recognition and classification using these low dimensional models. In addition to dimensionality reduction, we study a novel approach in representing images as a sparse linear combination of dictionary examples. We investigate how sparse image representations can be used for a variety of tasks including low level image modeling and higher level semantic information extraction. Using tools from dimensionality reduction and sparse representation, we propose the application of these methods in three hierarchical image layers, namely low-level features, mid-level structures and high-level attributes. Low level features are image descriptors that can be extracted directly from the raw image pixels and include pixel intensities, histograms, and gradients. In the first part of this work, we explore how various techniques in dimensionality reduction, ranging from traditional image compression to the recently proposed Random Projections method, affect the performance of computer vision algorithms such as face detection and face recognition. In addition, we discuss a method that is able to increase the spatial resolution of a single image, without using any training examples, according to the sparse representations framework. In the second part, we explore mid-level structures, including image manifolds and sparse models, produced by abstracting information from low-level features and offer compact modeling of high dimensional data. We propose novel techniques for generating more descriptive image representations and investigate their application in face recognition and object tracking. In the third part of this work, we propose the investigation of a novel framework for representing the semantic contents of images. This framework employs high level semantic attributes that aim to bridge the gap between the visual information of an image and its textual description by utilizing low level features and mid level structures. This innovative paradigm offers revolutionary possibilities including recognizing the category of an object from purely textual information without providing any explicit visual example

    Data comparison schemes for Pattern Recognition in Digital Images using Fractals

    Get PDF
    Pattern recognition in digital images is a common problem with application in remote sensing, electron microscopy, medical imaging, seismic imaging and astrophysics for example. Although this subject has been researched for over twenty years there is still no general solution which can be compared with the human cognitive system in which a pattern can be recognised subject to arbitrary orientation and scale. The application of Artificial Neural Networks can in principle provide a very general solution providing suitable training schemes are implemented. However, this approach raises some major issues in practice. First, the CPU time required to train an ANN for a grey level or colour image can be very large especially if the object has a complex structure with no clear geometrical features such as those that arise in remote sensing applications. Secondly, both the core and file space memory required to represent large images and their associated data tasks leads to a number of problems in which the use of virtual memory is paramount. The primary goal of this research has been to assess methods of image data compression for pattern recognition using a range of different compression methods. In particular, this research has resulted in the design and implementation of a new algorithm for general pattern recognition based on the use of fractal image compression. This approach has for the first time allowed the pattern recognition problem to be solved in a way that is invariant of rotation and scale. It allows both ANNs and correlation to be used subject to appropriate pre-and post-processing techniques for digital image processing on aspect for which a dedicated programmer's work bench has been developed using X-Designer

    Texture and Colour in Image Analysis

    Get PDF
    Research in colour and texture has experienced major changes in the last few years. This book presents some recent advances in the field, specifically in the theory and applications of colour texture analysis. This volume also features benchmarks, comparative evaluations and reviews

    Connected Attribute Filtering Based on Contour Smoothness

    Get PDF

    Quality of Experience in Immersive Video Technologies

    Get PDF
    Over the last decades, several technological revolutions have impacted the television industry, such as the shifts from black & white to color and from standard to high-definition. Nevertheless, further considerable improvements can still be achieved to provide a better multimedia experience, for example with ultra-high-definition, high dynamic range & wide color gamut, or 3D. These so-called immersive technologies aim at providing better, more realistic, and emotionally stronger experiences. To measure quality of experience (QoE), subjective evaluation is the ultimate means since it relies on a pool of human subjects. However, reliable and meaningful results can only be obtained if experiments are properly designed and conducted following a strict methodology. In this thesis, we build a rigorous framework for subjective evaluation of new types of image and video content. We propose different procedures and analysis tools for measuring QoE in immersive technologies. As immersive technologies capture more information than conventional technologies, they have the ability to provide more details, enhanced depth perception, as well as better color, contrast, and brightness. To measure the impact of immersive technologies on the viewersĂą QoE, we apply the proposed framework for designing experiments and analyzing collected subjectsĂą ratings. We also analyze eye movements to study human visual attention during immersive content playback. Since immersive content carries more information than conventional content, efficient compression algorithms are needed for storage and transmission using existing infrastructures. To determine the required bandwidth for high-quality transmission of immersive content, we use the proposed framework to conduct meticulous evaluations of recent image and video codecs in the context of immersive technologies. Subjective evaluation is time consuming, expensive, and is not always feasible. Consequently, researchers have developed objective metrics to automatically predict quality. To measure the performance of objective metrics in assessing immersive content quality, we perform several in-depth benchmarks of state-of-the-art and commonly used objective metrics. For this aim, we use ground truth quality scores, which are collected under our subjective evaluation framework. To improve QoE, we propose different systems for stereoscopic and autostereoscopic 3D displays in particular. The proposed systems can help reducing the artifacts generated at the visualization stage, which impact picture quality, depth quality, and visual comfort. To demonstrate the effectiveness of these systems, we use the proposed framework to measure viewersĂą preference between these systems and standard 2D & 3D modes. In summary, this thesis tackles the problems of measuring, predicting, and improving QoE in immersive technologies. To address these problems, we build a rigorous framework and we apply it through several in-depth investigations. We put essential concepts of multimedia QoE under this framework. These concepts not only are of fundamental nature, but also have shown their impact in very practical applications. In particular, the JPEG, MPEG, and VCEG standardization bodies have adopted these concepts to select technologies that were proposed for standardization and to validate the resulting standards in terms of compression efficiency

    Interfaces for Modular Surgical Planning and Assistance Systems

    Get PDF
    Modern surgery of the 21st century relies in many aspects on computers or, in a wider sense, digital data processing. Department administration, OR scheduling, billing, and - with increasing pervasion - patient data management are performed with the aid of so called Surgical Information Systems (SIS) or, more general, Hospital Information Systems (HIS). Computer Assisted Surgery (CAS) summarizes techniques which assist a surgeon in the preparation and conduction of surgical interventions. Today still predominantly based on radiology images, these techniques include the preoperative determination of an optimal surgical strategy and intraoperative systems which aim at increasing the accuracy of surgical manipulations. CAS is a relatively young field of computer science. One of the unsolved "teething troubles" of CAS is the absence of technical standards for the interconnectivity of CAS system. Current CAS systems are usually "islands of information" with no connection to other devices within the operating room or hospital-wide information systems. Several workshop reports and individual publications point out that this situation leads to ergonomic, logistic, and economic limitations in hospital work. Perioperative processes are prolonged by the manual installation and configuration of an increasing amount of technical devices. Intraoperatively, a large amount of the surgeons'' attention is absorbed by the requirement to monitor and operate systems. The need for open infrastructures which enable the integration of CAS devices from different vendors in order to exchange information as well as commands among these devices through a network has been identified by numerous experts with backgrounds in medicine as well as engineering. This thesis contains two approaches to the integration of CAS systems: - For perioperative data exchange, the specification of new data structures as an amendment to the existing DICOM standard for radiology image management is presented. The extension of DICOM towards surgical application allows for the seamless integration of surgical planning and reporting systems into DICOM-based Picture Archiving and Communication Systems (PACS) as they are installed in most hospitals for the exchange and long-term archival of patient images and image-related patient data. - For the integration of intraoperatively used CAS devices, such as, e.g., navigation systems, video image sources, or biosensors, the concept of a surgical middleware is presented. A c++ class library, the TiCoLi, is presented which facilitates the configuration of ad-hoc networks among the modules of a distributed CAS system as well as the exchange of data streams, singular data objects, and commands between these modules. The TiCoLi is the first software library for a surgical field of application to implement all of these services. To demonstrate the suitability of the presented specifications and their implementation, two modular CAS applications are presented which utilize the proposed DICOM extensions for perioperative exchange of surgical planning data as well as the TiCoLi for establishing an intraoperative network of autonomous, yet not independent, CAS modules.Die moderne Hochleistungschirurgie des 21. Jahrhunderts ist auf vielerlei Weise abhĂ€ngig von Computern oder, im weiteren Sinne, der digitalen Datenverarbeitung. Administrative AblĂ€ufe, wie die Erstellung von NutzungsplĂ€nen fĂŒr die verfĂŒgbaren technischen, rĂ€umlichen und personellen Ressourcen, die Rechnungsstellung und - in zunehmendem Maße - die Verwaltung und Archivierung von Patientendaten werden mit Hilfe von digitalen Informationssystemen rationell und effizient durchgefĂŒhrt. Innerhalb der Krankenhausinformationssysteme (KIS, oder englisch HIS) stehen fĂŒr die speziellen BedĂŒrfnisse der einzelnen Fachabteilungen oft spezifische Informationssysteme zur VerfĂŒgung. Chirurgieinformationssysteme (CIS, oder englisch SIS) decken hierbei vor allen Dingen die Bereiche Operationsplanung sowie Materialwirtschaft fĂŒr spezifisch chirurgische Verbrauchsmaterialien ab. WĂ€hrend die genannten HIS und SIS vornehmlich der Optimierung administrativer Aufgaben dienen, stehen die Systeme der Computerassistierten Chirugie (CAS) wesentlich direkter im Dienste der eigentlichen chirugischen Behandlungsplanung und Therapie. Die CAS verwendet Methoden der Robotik, digitalen Bild- und Signalverarbeitung, kĂŒnstlichen Intelligenz, numerischen Simulation, um nur einige zu nennen, zur patientenspezifischen Behandlungsplanung und zur intraoperativen UnterstĂŒtzung des OP-Teams, allen voran des Chirurgen. Vor allen Dingen Fortschritte in der rĂ€umlichen Verfolgung von Werkzeugen und Patienten ("Tracking"), die VerfĂŒgbarkeit dreidimensionaler radiologischer Aufnahmen (CT, MRT, ...) und der Einsatz verschiedener Robotersysteme haben in den vergangenen Jahrzehnten den Einzug des Computers in den Operationssaal - medienwirksam - ermöglicht. Weniger prominent, jedoch keinesfalls von untergeordnetem praktischen Nutzen, sind Beispiele zur automatisierten Überwachung klinischer Messwerte, wie etwa Blutdruck oder SauerstoffsĂ€ttigung. Im Gegensatz zu den meist hochgradig verteilten und gut miteinander verwobenen Informationssystemen fĂŒr die Krankenhausadministration und Patientendatenverwaltung, sind die Systeme der CAS heutzutage meist wenig oder ĂŒberhaupt nicht miteinander und mit Hintergrundsdatenspeichern vernetzt. Eine Reihe wissenschaftlicher Publikationen und interdisziplinĂ€rer Workshops hat sich in den vergangen ein bis zwei Jahrzehnten mit den Problemen des Alltagseinsatzes von CAS Systemen befasst. Mit steigender IntensitĂ€t wurde hierbei auf den Mangel an infrastrukturiellen Grundlagen fĂŒr die Vernetzung intraoperativ eingesetzter CAS Systeme miteinander und mit den perioperativ eingesetzten Planungs-, Dokumentations- und Archivierungssystemen hingewiesen. Die sich daraus ergebenden negativen EinflĂŒsse auf die Effizienz perioperativer AblĂ€ufe - jedes GerĂ€t muss manuell in Betrieb genommen und mit den spezifischen Daten des nĂ€chsten Patienten gefĂŒttert werden - sowie die zunehmende Aufmerksamkeit, welche der Operateur und sein Team auf die Überwachung und dem Betrieb der einzelnen GerĂ€te verwenden muss, werden als eine der "Kinderkrankheiten" dieser relativ jungen Technologie betrachtet und stehen einer Verbreitung ĂŒber die Grenzen einer engagierten technophilen Nutzergruppe hinaus im Wege. Die vorliegende Arbeit zeigt zwei parallel von einander (jedoch, im Sinne der SchnittstellenkompatibilitĂ€t, nicht gĂ€nzlich unabhĂ€ngig voneinander) zu betreibende AnsĂ€tze zur Integration von CAS Systemen. - FĂŒr den perioperativen Datenaustausch wird die Spezifikation zusĂ€tzlicher Datenstrukturen zum Transfer chirurgischer Planungsdaten im Rahmen des in radiologischen Bildverarbeitungssystemen weit verbreiteten DICOM Standards vorgeschlagen und an zwei Beispielen vorgefĂŒhrt. Die Erweiterung des DICOM Standards fĂŒr den perioperativen Einsatz ermöglicht hierbei die nahtlose Integration chirurgischer Planungssysteme in existierende "Picture Archiving and Communication Systems" (PACS), welche in den meisten FĂ€llen auf dem DICOM Standard basieren oder zumindest damit kompatibel sind. Dadurch ist einerseits der Tatsache Rechnung getragen, dass die patientenspezifische OP-Planung in hohem Masse auf radiologischen Bildern basiert und andererseits sicher gestellt, dass die Planungsergebnisse entsprechend der geltenden Bestimmungen langfristig archiviert und gegen unbefugten Zugriff geschĂŒtzt sind - PACS Server liefern hier bereits wohlerprobte Lösungen. - FĂŒr die integration intraoperativer CAS Systeme, wie etwa Navigationssysteme, Videobildquellen oder Sensoren zur Überwachung der Vitalparameter, wird das Konzept einer "chirurgischen Middleware" vorgestellt. Unter dem Namen TiCoLi wurde eine c++ Klassenbibliothek entwickelt, auf deren Grundlage die Konfiguration von ad-hoc Netzwerken wĂ€hrend der OP-Vorbereitung mittels plug-and-play Mechanismen erleichtert wird. Nach erfolgter Konfiguration ermöglicht die TiCoLi den Austausch kontinuierlicher Datenströme sowie einzelner Datenpakete und Kommandos zwischen den Modulen einer verteilten CAS Anwendung durch ein Ethernet-basiertes Netzwerk. Die TiCoLi ist die erste frei verfĂŒgbare Klassenbibliothek welche diese FunktionalitĂ€ten dediziert fĂŒr einen Einsatz im chirurgischen Umfeld vereinigt. Zum Nachweis der Tauglichkeit der gezeigten Spezifikationen und deren Implementierungen, werden zwei modulare CAS Anwendungen prĂ€sentiert, welche die vorgeschlagenen DICOM Erweiterungen zum perioperativen Austausch von Planungsergebnissen sowie die TiCoLi zum intraoperativen Datenaustausch von Messdaten unter echzeitnahen Anforderungen verwenden

    The development of GIS to aid conservation of architectural and archaeological sites using digital terrestrial photogrammetry

    Get PDF
    This thesis is concerned with the creation and implementation of an Architectural/Archaeological information System (A/AIS) by integrating digital terrestrial photogrammetry and CAD facilities as applicable to the requirements of architects, archaeologists and civil engineers. Architects and archaeologists are involved with the measurement, analysis and recording of the historical buildings and monuments. Hard-copy photogrammetric methods supporting such analyses and documentation are well established. But the requirement to interpret, classify and quantitatively process photographs can be time consuming. Also, they have limited application and cannot be re-examined if the information desired is not directly presented and a much more challenging extraction of 3-D coordinates than in a digital photogrammetric environment. The A/AIS has been developed to the point that it can provide a precise and reliable technique for non-contact 3-D measurements. The speed of on-line data acquisition, high degree of automation and adaptability has made this technique a powerful measurement tool with a great number of applications for architectural or archaeological sites. The designed tool (A/AIS) has been successful in producing the expected results in tasks examined for St. Avit Senieur Abbey in France, Strome Castle in Scotland, Gilbert Scott Building of Glasgow University, Hunter Memorial in Glasgow University and Anobanini Rock in Iran. The goals of this research were: to extract, using digital photogrammetric digitising, 3-D coordinates of architectural/archaeological features, to identify an appropriate 3-D model, to import 3-D points/lines into an appropriate 3-D modeller, to generate 3-D objects. to design and implement a prototype architectural Information System using the above 3-D model, to compare this approach to traditional approaches of measuring and archiving required information. An assessment of the contribution of digital photogrammetry, GIS and CAD to the surveying, conservation, recording and documentation of historical buildings and cultural monuments include digital rectification and restitution, feature extraction for the creation of 3-D digital models and the computer visualisation are the focus of this research

    MediaSync: Handbook on Multimedia Synchronization

    Get PDF
    This book provides an approachable overview of the most recent advances in the fascinating field of media synchronization (mediasync), gathering contributions from the most representative and influential experts. Understanding the challenges of this field in the current multi-sensory, multi-device, and multi-protocol world is not an easy task. The book revisits the foundations of mediasync, including theoretical frameworks and models, highlights ongoing research efforts, like hybrid broadband broadcast (HBB) delivery and users' perception modeling (i.e., Quality of Experience or QoE), and paves the way for the future (e.g., towards the deployment of multi-sensory and ultra-realistic experiences). Although many advances around mediasync have been devised and deployed, this area of research is getting renewed attention to overcome remaining challenges in the next-generation (heterogeneous and ubiquitous) media ecosystem. Given the significant advances in this research area, its current relevance and the multiple disciplines it involves, the availability of a reference book on mediasync becomes necessary. This book fills the gap in this context. In particular, it addresses key aspects and reviews the most relevant contributions within the mediasync research space, from different perspectives. Mediasync: Handbook on Multimedia Synchronization is the perfect companion for scholars and practitioners that want to acquire strong knowledge about this research area, and also approach the challenges behind ensuring the best mediated experiences, by providing the adequate synchronization between the media elements that constitute these experiences
    corecore