57 research outputs found

    Unsupervised Trajectory Segmentation for Surgical Gesture Recognition in Robotic Training

    Dexterity and procedural knowledge are two critical skills that surgeons need to master to perform accurate and safe surgical interventions. However, current training systems do not allow an in-depth analysis of surgical gestures to precisely assess these skills. Our objective is to develop a method for the automatic and quantitative assessment of surgical gestures. To reach this goal, we propose a new unsupervised algorithm that can automatically segment kinematic data from robotic training sessions. Without relying on any prior information or model, this algorithm detects critical points in the kinematic data that define relevant spatio-temporal segments. Based on the association of these segments, we obtain an accurate recognition of the gestures involved in the surgical training task. We then perform an advanced analysis and assess our algorithm using datasets recorded during real expert training sessions. After comparing our approach with the manual annotations of the surgical gestures, we observe 97.4% accuracy for the learning purpose and an average matching score of 81.9% for the fully automated gesture recognition process. Our results show that trainees' workflow can be followed and surgical gestures may be automatically evaluated according to an expert database. This approach tends toward improving training efficiency by minimizing the learning curve.

    A Hybrid HMM/ANN Based Approach for Online Signature Verification

    Linking Sheet Music and Audio - Challenges and New Approaches

    Score and audio files are the two most important ways to represent, convey, record, store, and experience music. While a score describes a piece of music on an abstract level using symbols such as notes, keys, and measures, audio files allow for reproducing a specific acoustic realization of the piece. Each of these representations reflects different facets of music, yielding insights into aspects ranging from structural elements (e.g., motives, themes, musical form) to specific performance aspects (e.g., artistic shaping, sound). Therefore, simultaneous access to score and audio representations is of great importance. In this paper, we address the problem of automatically generating musically relevant linking structures between the various data sources that are available for a given piece of music. In particular, we discuss the task of sheet music-audio synchronization with the aim of linking regions in images of scanned scores to musically corresponding sections in an audio recording of the same piece. Such linking structures form the basis for novel interfaces that allow users to access and explore multimodal sources of music within a single framework. As our main contributions, we give an overview of the state of the art for this kind of synchronization task, present some novel approaches, and indicate future research directions. In particular, we address problems that arise in the presence of structural differences and discuss challenges when applying optical music recognition to complex orchestral scores. Finally, potential applications of the synchronization results are presented.

    Robust real-time tracking in smart camera networks

    Music Synchronization, Audio Matching, Pattern Detection, and User Interfaces for a Digital Music Library System

    Over the last two decades, growing efforts to digitize our cultural heritage could be observed. Most of these digitization initiatives pursue either one or both of the following goals: to conserve the documents - especially those threatened by decay - and to provide remote access on a grand scale. For music documents these trends are observable as well, and by now several digital music libraries are in existence. An important characteristic of these music libraries is an inherent multimodality resulting from the large variety of available digital music representations, such as scanned score, symbolic score, audio recordings, and videos. In addition, for each piece of music there exists not only one document of each type, but many. Considering and exploiting this multimodality and multiplicity, the DFG-funded digital library initiative PROBADO MUSIC aimed at developing a novel user-friendly interface for content-based retrieval, document access, navigation, and browsing in large music collections. The implementation of such a front end requires the multimodal linking and indexing of the music documents during preprocessing. As the considered music collections can be very large, automated or at least semi-automated calculation of these structures is desirable. The field of music information retrieval (MIR) is particularly concerned with the development of suitable procedures, and it was the goal of PROBADO MUSIC to include existing and newly developed MIR techniques to realize the envisioned digital music library system. In this context, the present thesis discusses the following three MIR tasks: music synchronization, audio matching, and pattern detection. We identify particular issues in these fields and provide algorithmic solutions as well as prototypical implementations. In music synchronization, for each position in one representation of a piece of music the corresponding position in another representation is calculated.
    This thesis focuses on the task of aligning scanned score pages of orchestral music with audio recordings. Here, a previously unconsidered piece of information is the textual specification of transposing instruments provided in the score. Our evaluations show that the neglect of such information can result in a measurable loss of synchronization accuracy. Therefore, we propose an OCR-based approach for detecting and interpreting the transposition information in orchestral scores. For a given audio snippet, audio matching methods automatically calculate all musically similar excerpts within a collection of audio recordings. In this context, subsequence dynamic time warping (SSDTW) is a well-established approach as it allows for local and global tempo variations between the query and the retrieved matches. Moving to real-life digital music libraries with larger audio collections, however, the quadratic runtime of SSDTW results in untenable response times. To improve on the response time, this thesis introduces a novel index-based approach to SSDTW-based audio matching. We combine the idea of inverted file lists introduced by Kurth and Müller (Efficient index-based audio matching, 2008) with the shingling techniques often used in the audio identification scenario. In pattern detection, all repeating patterns within one piece of music are determined. Usually, pattern detection operates on symbolic score documents and is often used in the context of computer-aided motivic analysis. Envisioned as a new feature of the PROBADO MUSIC system, this thesis proposes a string-based approach to pattern detection and a novel interactive front end for result visualization and analysis.
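
To make the subsequence dynamic time warping (SSDTW) mentioned in the abstract above more concrete, here is a minimal sketch. It is not the thesis's implementation: it assumes one-dimensional numeric features and an absolute-difference local cost, and the function name `subsequence_dtw` is hypothetical. The nested loop over query and sequence positions is where the quadratic runtime noted in the abstract comes from.

```python
def subsequence_dtw(query, sequence):
    """Find the subsequence of `sequence` that best matches `query`.

    Returns (cost, start, end) so that sequence[start:end] is the match.
    Assumption: features are plain numbers and the local cost is |a - b|.
    """
    n, m = len(query), len(sequence)
    inf = float("inf")
    # acc[i][j]: minimal accumulated cost of aligning query[:i] with some
    # subsequence of `sequence` that ends at index j - 1.
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    # begin[i][j]: start index of the subsequence behind acc[i][j].
    begin = [[0] * (m + 1) for _ in range(n + 1)]
    for j in range(m + 1):
        acc[0][j] = 0.0  # the match may start anywhere at no cost
        begin[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(query[i - 1] - sequence[j - 1])
            # Standard DTW steps: diagonal, vertical, horizontal.
            prev_cost, prev_begin = min(
                (acc[i - 1][j - 1], begin[i - 1][j - 1]),
                (acc[i - 1][j], begin[i - 1][j]),
                (acc[i][j - 1], begin[i][j - 1]),
            )
            acc[i][j] = local + prev_cost
            begin[i][j] = prev_begin
    # The match may also end anywhere: pick the cheapest end position.
    end = min(range(1, m + 1), key=lambda j: acc[n][j])
    return acc[n][end], begin[n][end], end


cost, start, end = subsequence_dtw([1, 2, 3], [5, 5, 1, 2, 3, 5, 5])
print(cost, start, end)
```

In this toy call the query occurs exactly once in the longer sequence, so the returned slice `sequence[start:end]` recovers it. An index-based approach such as the one the abstract describes would aim to avoid filling this full table for every recording.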

    Automated retinal layer segmentation and pre-apoptotic monitoring for three-dimensional optical coherence tomography

    The aim of this PhD thesis was to develop a segmentation algorithm adapted and optimized to retinal OCT data that provides objective 3D layer thickness measurements, which might be used to improve diagnosis and monitoring of retinal pathologies. Additionally, a 3D stack registration method was produced by modifying an existing algorithm. A related project was to develop pre-apoptotic retinal monitoring based on the changes in texture parameters of the OCT scans in order to enable treatment before the changes become irreversible; apoptosis refers to the programmed cell death that can occur in retinal tissue and lead to blindness. These issues can be critical for the examination of tissues within the central nervous system. A novel statistical model for segmentation has been created and successfully applied to a large data set. A broad range of future research possibilities into advanced pathologies has been created by the results obtained. A separate model has been created for segmentation of the choroid, located deep in the retina, as the appearance of the choroid is very different from the top retinal layers. Choroid thickness and structure is an important index of various pathologies (diabetes etc.). As part of the pre-apoptotic monitoring project it was shown that an increase in the proportion of apoptotic cells in vitro can be accurately quantified. Moreover, the data obtained indicate a similar increase in neuronal scatter in retinal explants following axotomy (removal of retinas from the eye), suggesting that UHR-OCT can be a novel non-invasive technique for the in vivo assessment of neuronal health. Additionally, an independent project within the computer science department in collaboration with the school of psychology has been successfully carried out, improving analysis of facial dynamics and behaviour transfer between individuals.
    Also, important improvements to a general signal processing algorithm, dynamic time warping (DTW), have been made, allowing potential application in a broad signal processing field.

    Automatic signature verification system

    Philosophiae Doctor - PhD. In this thesis, we explore dynamic signature verification systems. Unlike other signature models, we use genuine signatures in this project as they are more appropriate in real-world applications. Signature verification systems are typical examples of biometric devices that use physical and behavioral characteristics to verify that a person really is who he or she claims to be. Other popular biometric examples include fingerprint scanners and hand geometry devices. Handwritten signatures have been used for some time to endorse financial transactions and legal contracts, although little or no verification of signatures is done. This sets the signature apart from other biometrics, as it is a well-accepted method of authentication. Until recently, only hidden Markov models were used for model construction. Ongoing research on signature verification has revealed that more accurate results can be achieved by combining the results of multiple models. We also propose to use combinations of multiple single-variate models instead of the single multi-variate models that are currently adopted by many systems. Apart from these, the proposed system is an attractive way of making financial transactions more secure and authenticating electronic documents, as it can be easily integrated into existing transaction procedures and electronic communication.

    A new representation for matching words

    Ankara: The Department of Computer Engineering and the Institute of Engineering and Sciences of Bilkent University, 2007. Thesis (Master's) -- Bilkent University, 2007. Includes bibliographical references, leaves 77-82. Ataer, Esra. M.S.
    Large archives of historical documents are challenging to many researchers all over the world. However, these archives remain inaccessible since manual indexing and transcription of such a huge volume is difficult. In addition, electronic imaging tools and image processing techniques gain importance with the rapid increase in digitalization of materials in libraries and archives. In this thesis, a language-independent method is proposed for representation of word images, which leads to retrieval and indexing of documents. While character recognition methods suffer from preprocessing and overtraining, we make use of another method, which is based on extracting words from documents and representing each word image with the features of invariant regions. The bag-of-words approach, which is shown to be successful in classifying objects and scenes, is adapted for matching words. Since the curvature or connection points, or the dots, are important visual features to distinguish two words from each other, we make use of the salient points, which are shown to be successful in representing such distinctive areas and are heavily used for matching. The Difference of Gaussian (DoG) detector, which is able to find scale-invariant regions, and the Harris Affine detector, which detects affine-invariant regions, are used for detection of such areas, and detected keypoints are described with Scale Invariant Feature Transform (SIFT) features. Then, each word image is represented by a set of visual terms which are obtained by vector quantization of SIFT descriptors, and similar words are matched based on the similarity of these representations by using different distance measures. These representations are used both for document retrieval and word spotting.
    The experiments are carried out on Arabic, Latin and Ottoman datasets, which included different writing styles and different writers. The results show that the proposed method is successful at retrieval and indexing of documents even with different scripts and different writers, and since it is language independent, it can be easily adapted to other languages as well. Retrieval performance of the system is comparable to the state-of-the-art methods in this field. In addition, the system is successful at capturing semantic similarities, which is useful for indexing, and it does not include any supervised step.
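
The matching step described in the abstract above (vector quantization of descriptors into visual terms, followed by a distance between the resulting representations) can be sketched as follows. This is an illustrative sketch only, not the thesis's code: it assumes the codebook is already given, uses tiny 2-D tuples in place of 128-dimensional SIFT descriptors, and picks the L1 distance as one possible measure. All function names are hypothetical.

```python
def nearest_codeword(descriptor, codebook):
    """Index of the codebook entry closest to `descriptor` (squared Euclidean)."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((d - c) ** 2 for d, c in zip(descriptor, codebook[k])),
    )


def bag_of_visual_terms(descriptors, codebook):
    """Normalized histogram of visual terms for one word image."""
    hist = [0.0] * len(codebook)
    for descriptor in descriptors:
        hist[nearest_codeword(descriptor, codebook)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def l1_distance(hist_a, hist_b):
    """One possible distance measure between two histograms."""
    return sum(abs(a - b) for a, b in zip(hist_a, hist_b))


# Toy data: two codewords and three "word images" given as descriptor lists.
codebook = [(0.0, 0.0), (10.0, 10.0)]
word_a = [(1.0, 0.0), (9.0, 10.0), (0.0, 1.0)]
word_b = [(0.0, 0.0), (1.0, 1.0), (10.0, 9.0)]
word_c = [(10.0, 10.0), (9.0, 9.0)]

hist_a = bag_of_visual_terms(word_a, codebook)
hist_b = bag_of_visual_terms(word_b, codebook)
hist_c = bag_of_visual_terms(word_c, codebook)
print(l1_distance(hist_a, hist_b), l1_distance(hist_a, hist_c))
```

Here `word_a` and `word_b` quantize to the same histogram and so match more closely than `word_a` and `word_c`. In practice the codebook would typically be learned by clustering descriptors from a training set, which the sketch leaves out.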
