
    DeepMatching: Hierarchical Deformable Dense Matching

    We introduce a novel matching algorithm, called DeepMatching, to compute dense correspondences between images. DeepMatching relies on a hierarchical, multi-layer, correlational architecture designed for matching images and was inspired by deep convolutional approaches. The proposed matching algorithm can handle non-rigid deformations and repetitive textures and efficiently determines dense correspondences in the presence of significant changes between images. We evaluate the performance of DeepMatching, in comparison with state-of-the-art matching algorithms, on the Mikolajczyk (Mikolajczyk et al 2005), the MPI-Sintel (Butler et al 2012) and the KITTI (Geiger et al 2013) datasets. DeepMatching outperforms the state-of-the-art algorithms and shows excellent results in particular for repetitive textures. We also propose a method for estimating optical flow, called DeepFlow, by integrating DeepMatching in the large displacement optical flow (LDOF) approach of Brox and Malik (2011). Compared to existing matching algorithms, additional robustness to large displacements and complex motion is obtained thanks to our matching approach. DeepFlow obtains competitive performance on public benchmarks for optical flow estimation.
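    The core idea is a correlational pyramid: small patches of the first image are correlated against the second image, and the resulting score maps are repeatedly max-pooled and aggregated so that larger, deformable patches inherit the best responses of their quarter-patches. The sketch below is a simplified illustration of that bottom-up aggregation on grayscale numpy arrays, not the authors' released implementation; it omits the per-child offsets and the top-down backtracking used to extract actual correspondences.

```python
import numpy as np

def zncc_map(patch, image):
    """Zero-normalized cross-correlation of one patch against every valid
    position of `image` (brute force, written for clarity rather than speed)."""
    ph, pw = patch.shape
    p = (patch - patch.mean()) / (patch.std() + 1e-8)
    out = np.zeros((image.shape[0] - ph + 1, image.shape[1] - pw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            w = image[y:y + ph, x:x + pw]
            out[y, x] = np.mean(p * (w - w.mean()) / (w.std() + 1e-8))
    return out

def max_pool3(m):
    """3x3 local max filter; gives each parent patch some tolerance to
    non-rigid deformation of its children."""
    p = np.pad(m, 1, mode="edge")
    h, w = m.shape
    return np.max([p[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
                   for dy in (-1, 0, 1) for dx in (-1, 0, 1)], axis=0)

def aggregate(child_maps):
    """One pyramid step: a larger patch scores well wherever all of its
    children score well nearby.  (DeepMatching additionally shifts each
    child map by the child's offset inside the parent; omitted here.)"""
    return np.mean([max_pool3(c) for c in child_maps], axis=0)

def bottom_level(im1, im2, patch=4):
    """Correlation maps of every non-overlapping `patch` x `patch` block of
    im1 against im2; these form the lowest level of the pyramid."""
    im1 = im1.astype(float)
    im2 = im2.astype(float)
    return {(y, x): zncc_map(im1[y:y + patch, x:x + patch], im2)
            for y in range(0, im1.shape[0] - patch + 1, patch)
            for x in range(0, im1.shape[1] - patch + 1, patch)}
```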

    Differently stained whole slide image registration technique with landmark validation

    Abstract. One of the most significant features of digital pathology is the ability to visually compare and fuse successive, differently stained tissue sections, also called slides. To do so, the images must be aligned to a common frame, a ground truth. Current slide scanning tools can create images full of informative layers of digitized tissue, stored at high resolution as whole slide images. However, only a limited number of automatic alignment tools handle such large images precisely within acceptable processing time. The idea of this study is to propose a deep learning solution for histopathology image registration. The main focus is on understanding landmark validation and the impact of stain augmentation on differently stained histopathology images. The developed registration method is also compared with state-of-the-art algorithms that utilize whole slide images in the field of digital pathology. Previous studies on histopathology, digital pathology, whole slide imaging, image registration, color staining, data augmentation, and deep learning are referenced in this study. The goal is to develop a learning-based registration framework specifically for high-resolution histopathology image registration. Different whole slide tissue sample images are used, with a resolution of up to 40x magnification. The images are organized into sets of consecutive, differently dyed sections, and the aim is to register the images based only on the visible tissue while ignoring the background. Significant structures in the tissue are marked with landmarks. The quality measurements include, for example, the relative target registration error (rTRE), the structural similarity index metric (SSIM), visual evaluation, landmark-based evaluation, matching points, and image details. These results are comparable and can also be used in future research and in the development of new tools. Moreover, the results are expected to show how theory and practice are combined in whole slide image registration challenges. The DeepHistReg algorithm is studied to better understand the development of the stain color feature augmentation-based image registration tool of this study. Matlab and Aperio ImageScope are the tools used to annotate and validate the images, and Python is used to develop the algorithm of this new registration tool. As cancer is globally a serious disease regardless of age or lifestyle, it is important to find ways to develop systems that experts can use while working with patients' data. There is still a lot to improve in the field of digital pathology, and this study is one step toward it.
    Registration technique for differently stained whole slide images using landmark validation. Abstract. One of the most important features of digital pathology is to visually compare and fuse consecutive, differently stained tissue sections with one another. In doing so, the nearly identical images are aligned to a common frame, the so-called ground truth. Current sample scanning tools make it possible to create images that are full of layered information about the digitized samples, stored in very high-resolution whole slide images. At present, however, only a handful of automatic tools can handle such huge image files precisely within acceptable time limits. The purpose of this work is to find a deep-learning-based solution for registering histopathology images. The most important part is to understand the principles of landmark validation and the effect of augmenting different stains. In addition, the registration algorithm developed in this work is compared with other algorithms presented in the literature that also use whole slide images in the field of digital pathology. The literature review cites previous studies on, among other topics, histopathology, digital pathology, whole slide imaging, imaging and registration, sample staining, data augmentation, and deep learning. The goal is to develop a learning-based registration framework specifically for high-resolution digitized histopathology images. Magnifications of up to 40x are used in the various sample images. The tissue images are organized into consecutive series stained with different methods, and the aim of this work is to register the images based only on the visible parts of the tissue, ignoring the image background. The most significant structures of the tissue are marked with so-called landmarks. The quality measures of the work include values such as the relative target registration error (rTRE) and the structural similarity index metric (SSIM), as well as visual evaluation, landmark-based evaluation, matching points, and image file details. These values are comparable in future studies, and the same values can be used when developing new tools. The DeepHistReg method serves as the basis for the stain-augmentation-based registration tool developed in this work. Matlab and Aperio ImageScope are the software used to annotate and validate the images, and Python is used as the programming language. Cancer is a globally serious disease regardless of age or lifestyle, so it is important to find new ways to develop tools that experts can use in their daily work with patient data. There is still much to innovate in the field of digital pathology, and this work is one step forward in the fight against cancer.
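    Of the quality measurements listed above, the relative target registration error is the simplest to make concrete: the distance between a registered landmark and its ground-truth counterpart, normalized by the image diagonal so that errors are comparable across slide resolutions. The snippet below is a minimal sketch of that metric under this common definition (the exact normalization convention of a given benchmark may differ); the landmark coordinates in the example are made up.

```python
import numpy as np

def relative_tre(registered_landmarks, target_landmarks, image_shape):
    """Relative target registration error (rTRE): Euclidean distance between
    each registered landmark and its ground-truth counterpart, divided by
    the image diagonal (image_shape = (height, width) in pixels)."""
    reg = np.asarray(registered_landmarks, dtype=float)
    tgt = np.asarray(target_landmarks, dtype=float)
    diagonal = np.hypot(image_shape[0], image_shape[1])
    return np.linalg.norm(reg - tgt, axis=1) / diagonal

# Hypothetical landmark pairs (x, y in pixels) on a 30000 x 40000 px slide:
errors = relative_tre([[105.0, 230.0], [5120.0, 9980.0]],
                      [[100.0, 228.0], [5100.0, 10010.0]],
                      image_shape=(30000, 40000))
print(errors.mean(), np.median(errors))   # mean and median rTRE are both commonly reported
```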

    Optical techniques for 3D surface reconstruction in computer-assisted laparoscopic surgery

    One of the main challenges for computer-assisted surgery (CAS) is to determine the intra-operative morphology and motion of soft tissues. This information is a prerequisite to the registration of multi-modal patient-specific data for enhancing the surgeon's navigation capabilities by observing beyond exposed tissue surfaces and for providing intelligent control of robotic-assisted instruments. In minimally invasive surgery (MIS), optical techniques are an increasingly attractive approach for in vivo 3D reconstruction of the soft-tissue surface geometry. This paper reviews the state-of-the-art methods for optical intra-operative 3D reconstruction in laparoscopic surgery and discusses the technical challenges and future perspectives towards clinical translation. With the recent paradigm shift of surgical practice towards MIS and new developments in 3D optical imaging, this is a timely discussion about technologies that could facilitate complex CAS procedures in dynamic and deformable anatomical regions.
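    One family of the reviewed techniques, passive stereoscopy, recovers depth from the disparity between two rectified views using the pinhole relation Z = fB/d. The sketch below shows only that relation; the focal length, baseline and disparity values are illustrative and not taken from any specific stereo laparoscope.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_mm):
    """Depth of a point seen by a rectified stereo pair: Z = f * B / d,
    with the focal length f in pixels, baseline B in millimetres and
    disparity d in pixels, giving Z in millimetres."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_mm / disparity_px

# Illustrative numbers only: f = 900 px, B = 4.5 mm, d = 50 px -> Z = 81 mm.
print(depth_from_disparity(disparity_px=50.0, focal_px=900.0, baseline_mm=4.5))
```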

    Bridging the gap between reconstruction and synthesis

    3D reconstruction and image synthesis are two of the main pillars in computer vision. Early works focused on simple tasks such as multi-view reconstruction and texture synthesis. With the advent of Deep Learning, the field has rapidly progressed, making it possible to achieve more complex and high-level tasks. For example, the 3D reconstruction results of traditional multi-view approaches are currently obtained with single-view methods. Similarly, early pattern-based texture synthesis works have resulted in techniques that allow generating novel high-resolution images. In this thesis we have developed a hierarchy of tools that covers this range of problems, lying at the intersection of computer vision, graphics and machine learning. We tackle the problem of 3D reconstruction and synthesis in the wild. Importantly, we advocate for a paradigm in which not everything should be learned. Instead of applying Deep Learning naively, we propose novel representations, layers and architectures that directly embed prior 3D geometric knowledge for the task of 3D reconstruction and synthesis. We apply these techniques to problems including scene/person reconstruction and photo-realistic rendering. We first address methods to reconstruct a scene and the clothed people in it while estimating the camera position. Then, we tackle image and video synthesis for clothed people in the wild. Finally, we bridge the gap between reconstruction and synthesis under the umbrella of a unique novel formulation. Extensive experiments conducted along this thesis show that the proposed techniques improve the performance of Deep Learning models in terms of the quality of the reconstructed 3D shapes / synthesised images, while reducing the amount of supervision and training data required to train them. In summary, we provide a variety of low-, mid- and high-level algorithms that can be used to incorporate prior knowledge into different stages of the Deep Learning pipeline and improve performance in tasks of 3D reconstruction and image synthesis.
    3D reconstruction and image synthesis are two of the fundamental pillars of computer vision. Previous studies focus on simple tasks such as reconstruction from multi-camera information and texture synthesis. With the appearance of Deep Learning, this field has progressed rapidly, making it possible to tackle far more complex tasks. For example, 3D reconstructions that were traditionally obtained with multi-camera methods can now be obtained from a single image. Similarly, the first pattern-based texture synthesis works have given rise to techniques that can generate entirely new high-resolution images. In this thesis we have developed a series of tools that covers this whole range of problems, located at the intersection of computer vision, graphics and machine learning. We address the problem of 3D reconstruction and synthesis in the real world. Importantly, we defend a paradigm in which not everything should be learned. Instead of applying Deep Learning naively, we propose novel representations and architectures that directly incorporate existing geometric knowledge in order to achieve 3D reconstruction and image synthesis. We apply these techniques to problems such as the reconstruction of scenes and people and the rendering of photorealistic images. We first address methods to reconstruct a scene, the clothed people in it, and the position of the camera. Next, we address the synthesis of images and videos of clothed people in everyday situations. Finally, through a unique new formulation, we connect reconstruction with synthesis. The experiments carried out throughout this thesis show that the proposed techniques improve the performance of Deep Learning models in terms of the quality of the reconstructions and the synthesised images, while reducing the amount of data needed to train them. In summary, we provide a variety of low-, mid- and high-level algorithms that can be used to incorporate prior knowledge into the different stages of Deep Learning and improve performance in 3D reconstruction and image synthesis tasks.
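    The thesis repeatedly contrasts learned components with fixed geometric operations that need not be learned. A generic example of such a non-learned operation is perspective projection with known camera intrinsics, sketched below in plain numpy; this only illustrates the "embed prior geometry" idea and is not one of the specific layers proposed in the thesis.

```python
import numpy as np

def project_points(points_cam, K):
    """Project 3D points given in camera coordinates (N x 3) to pixel
    coordinates (N x 2) using the intrinsic matrix K.  An operation like
    this can sit inside a network as a fixed, parameter-free layer instead
    of being approximated from data."""
    X = np.asarray(points_cam, dtype=float)
    uvw = (K @ X.T).T                 # homogeneous image coordinates
    return uvw[:, :2] / uvw[:, 2:3]   # perspective division

K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
print(project_points([[0.1, -0.05, 2.0]], K))   # -> [[360. 220.]]
```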

    Analysis and Manipulation of Repetitive Structures of Varying Shape

    Self-similarity and repetitions are ubiquitous in man-made and natural objects. Such structural regularities often relate to form, function, aesthetics, and design considerations. Discovering structural redundancies along with their dominant variations from 3D geometry not only allows us to better understand the underlying objects, but is also beneficial for several geometry processing tasks including compact representation, shape completion, and intuitive shape manipulation. To identify these repetitions, we present a novel detection algorithm based on analyzing a graph of surface features. We combine general feature detection schemes with a RANSAC-based randomized subgraph searching algorithm in order to reliably detect recurring patterns of locally unique structures. A subsequent segmentation step based on simultaneous region growing is applied to verify that the actual data supports the patterns detected in the feature graphs. We introduce our graph-based detection algorithm using the example of rigid repetitive structure detection. Then we extend the approach to allow more general deformations between the detected parts. We introduce subspace symmetries, whereby we characterize similarity by requiring the set of repeating structures to form a low-dimensional shape space. We discover these structures based on detecting linearly correlated correspondences among graphs of invariant features. The found symmetries along with the modeled variations are useful for a variety of applications including non-local and non-rigid denoising. Employing subspace symmetries for shape editing, we introduce a morphable part model for smart shape manipulation. The input geometry is converted to an assembly of deformable parts with appropriate boundary conditions. Our method uses self-similarities from a single model or corresponding parts of shape collections as training input and also allows the user to reassemble the identified parts in new configurations, thus exploiting both the discrete and continuous learned variations while ensuring appropriate boundary conditions across part boundaries. We obtain an interactive yet intuitive shape deformation framework producing realistic deformations on classes of objects that are difficult to edit using repetition-unaware deformation techniques.
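    The "subspace symmetry" notion can be made concrete with a small test: stack the repeating parts, assumed to be in vertex-wise correspondence, as flattened coordinate vectors and check how many principal components are needed to explain their variation. The sketch below uses a hypothetical 95% energy threshold; the thesis's actual detection works on correspondences between invariant feature graphs rather than on whole pre-segmented parts.

```python
import numpy as np

def subspace_dimension(parts, energy=0.95):
    """Number of principal components needed to explain `energy` of the
    variation among repeating parts, each given as an (n_points, 3) array
    in vertex-wise correspondence.  A small number supports the hypothesis
    that the parts span a low-dimensional shape space."""
    X = np.stack([np.asarray(p, dtype=float).ravel() for p in parts])
    X -= X.mean(axis=0)                         # affine subspace: remove the mean shape
    s = np.linalg.svd(X, compute_uv=False)      # singular values
    explained = np.cumsum(s ** 2) / np.sum(s ** 2)
    return int(np.searchsorted(explained, energy) + 1)

# Example: uniformly scaled copies of one template span a 1-D subspace.
template = np.random.default_rng(0).normal(size=(50, 3))
parts = [s * template for s in (0.8, 1.0, 1.2, 1.5)]
print(subspace_dimension(parts))   # -> 1
```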

    Multi-Scale 3D Scene Flow from Binocular Stereo Sequences

    Scene flow methods estimate the three-dimensional motion field for points in the world, using multi-camera video data. Such methods combine multi-view reconstruction with motion estimation. This paper describes an alternative formulation for dense scene flow estimation that provides reliable results using only two cameras by fusing stereo and optical flow estimation into a single coherent framework. Internally, the proposed algorithm generates probability distributions for optical flow and disparity. Taking into account the uncertainty in the intermediate stages allows for more reliable estimation of the 3D scene flow than previous methods allow. To handle the aperture problems inherent in the estimation of optical flow and disparity, a multi-scale method along with a novel region-based technique is used within a regularized solution. This combined approach both preserves discontinuities and prevents over-regularization, two problems commonly associated with the basic multi-scale approaches. Experiments with synthetic and real test data demonstrate the strength of the proposed approach.
    National Science Foundation (CNS-0202067, IIS-0208876); Office of Naval Research (N00014-03-1-0108)
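    The deterministic core of any two-camera scene flow formulation is simple: back-project a pixel with its disparity at time t, follow its optical flow to time t+1, back-project again with the new disparity, and subtract. The sketch below shows only that core for a rectified stereo rig, with purely illustrative numbers; the paper's contribution is to carry probability distributions over flow and disparity through this computation rather than the point estimates used here.

```python
import numpy as np

def backproject(x, y, d, f, B, cx, cy):
    """Back-project pixel (x, y) with disparity d (pixels) into camera
    coordinates for a rectified stereo pair with baseline B and focal
    length f (pixels); (cx, cy) is the principal point."""
    Z = f * B / d
    return np.array([(x - cx) * Z / f, (y - cy) * Z / f, Z])

def scene_flow(x, y, d_t, d_t1, u, v, f, B, cx, cy):
    """3D motion of the surface point seen at (x, y): difference between its
    back-projected positions at time t and at time t+1, where the pixel has
    moved by the optical flow (u, v) and the disparity has changed to d_t1."""
    return (backproject(x + u, y + v, d_t1, f, B, cx, cy)
            - backproject(x, y, d_t, f, B, cx, cy))

# Illustrative values (pixels for image quantities, metres for the baseline B):
print(scene_flow(x=400, y=300, d_t=20.0, d_t1=22.0, u=3.0, v=-1.0,
                 f=700.0, B=0.12, cx=320.0, cy=240.0))
```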

    Structure-aware content creation : detection, retargeting and deformation

    Nowadays, access to digital information has become ubiquitous, while three-dimensional visual representation is becoming indispensable to knowledge understanding and information retrieval. Three-dimensional digitization plays a natural role in bridging connections between the real and virtual world, which prompts a huge demand for massive three-dimensional digital content. But reducing the effort required for three-dimensional modeling has been a practical problem and a long-standing challenge in computer graphics and related fields. In this thesis, we propose several techniques for easing the content creation process, which have the common theme of being structure-aware, i.e. maintaining global relations among the parts of a shape. We are especially interested in formulating our algorithms such that they make use of symmetry structures, because their concise yet highly abstract principles are universally applicable to most regular patterns. We introduce our work from three different aspects in this thesis. First, we characterized spaces of symmetry-preserving deformations, and developed a method to explore this space in real time, which significantly simplified the generation of symmetry-preserving shape variants. Second, we empirically studied three-dimensional offset statistics, and developed a fully automatic retargeting application, which is based on the verified sparsity. Finally, we made a step forward in solving the approximate three-dimensional partial symmetry detection problem, using a novel co-occurrence analysis method, which could serve as the foundation for high-level applications.
    Access to digital information has now become ubiquitous, and three-dimensional visual representation is becoming indispensable for understanding and information retrieval. Three-dimensional digitization connects the real and virtual worlds in a natural way, which prompts the great demand for massive amounts of three-dimensional digital content. Reducing the effort required for three-dimensional modeling remains a practical problem and a long-standing challenge in computer graphics and related fields. In this dissertation we propose various techniques for easing content creation, under the common theme of being structure-aware, i.e. preserving global relations among the parts of a shape. We are especially interested in formulating our algorithms so that they make use of symmetry structures, because their concise yet highly abstract principles are universally applicable to most regular patterns. We present our work from three different aspects in this dissertation. First, we characterize spaces of deformations that preserve symmetries and develop a method to explore this space in real time, which significantly simplifies the generation of shape variants that preserve symmetries. Second, we empirically study three-dimensional offset statistics and develop a fully automatic retargeting application based on the verified sparsity. Finally, we take a step towards solving the approximate three-dimensional partial symmetry detection problem, based on our new co-occurrence analysis method, which could serve as the foundation for many high-level applications.
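    The simplest instance of a symmetry-preserving deformation space is the reflective case: only displacement fields that map mirrored vertices to mirrored displacements are allowed, and a free-form edit can be projected onto that subspace by averaging it with its own mirror image. The sketch below illustrates that projection for a point set with a known reflection plane through the origin; it is a minimal illustration of the constraint, not the richer deformation spaces actually characterized in the thesis.

```python
import numpy as np

def symmetrize_displacement(points, disp, normal):
    """Project a per-vertex displacement field onto the subspace of
    deformations that preserve a reflective symmetry across the plane
    through the origin with unit normal `normal`: average each vertex's
    displacement with the mirrored displacement of its mirrored partner."""
    n = np.asarray(normal, dtype=float)
    n /= np.linalg.norm(n)
    R = np.eye(3) - 2.0 * np.outer(n, n)         # reflection matrix
    mirrored = points @ R.T                      # mirror image of every vertex
    # pair each vertex with its closest mirrored counterpart (exact pairs here)
    dists = np.linalg.norm(points[:, None, :] - mirrored[None, :, :], axis=2)
    partner = np.argmin(dists, axis=1)
    return 0.5 * (disp + disp[partner] @ R.T)

# Two vertices mirrored across the x = 0 plane; only one side is edited.
pts  = np.array([[ 1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]])
disp = np.array([[ 0.2, 0.1, 0.0], [ 0.0, 0.0, 0.0]])
print(symmetrize_displacement(pts, disp, normal=[1.0, 0.0, 0.0]))
```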

    Object detection and activity recognition in digital image and video libraries

    This thesis is a comprehensive study of object-based image and video retrieval, specifically for car and human detection and activity recognition purposes. The thesis focuses on the problem of connecting low-level features to high-level semantics by developing relational object and activity representations. With the rapid growth of multimedia information in the form of digital image and video libraries, there is an increasing need for intelligent database management tools. The traditional text-based query systems, built on a manual annotation process, are impractical for today's large libraries, which require an efficient information retrieval system. For this purpose, a hierarchical information retrieval system is proposed where shape, color and motion characteristics of objects of interest are captured in compressed and uncompressed domains. The proposed retrieval method provides object detection and activity recognition at different resolution levels, ranging from low complexity to low false rates. The thesis first examines extraction of low-level features from images and videos using intensity, color and motion of pixels and blocks. Local consistency based on these features and geometrical characteristics of the regions is used to group object parts. The problem of managing the segmentation process is solved by a new approach that uses object-based knowledge in order to group the regions according to a global consistency. A new model-based segmentation algorithm is introduced that uses feedback from a relational representation of the object. The selected unary and binary attributes are further extended for application-specific algorithms. Object detection is achieved by matching the relational graphs of objects with the reference model. The major advantages of the algorithm can be summarized as improving the object extraction by reducing the dependence on the low-level segmentation process and combining the boundary and region properties. The thesis then addresses the problem of object detection and activity recognition in the compressed domain in order to reduce computational complexity. New algorithms for object detection and activity recognition in JPEG images and MPEG videos are developed. It is shown that significant information can be obtained from the compressed domain in order to connect to high-level semantics. Since our aim is to retrieve information from images and videos compressed using standard algorithms such as JPEG and MPEG, our approach differs from previous compressed-domain object detection techniques in which the compression algorithms are governed by characteristics of the object of interest to be retrieved. An algorithm is developed using principal component analysis of MPEG motion vectors to detect human activities, namely walking, running, and kicking. Object detection in JPEG compressed still images and MPEG I-frames is achieved by using DC-DCT coefficients of the luminance and chrominance values in the graph-based object detection algorithm. The thesis finally addresses the problem of object detection in lower-resolution and monochrome images. Specifically, it is demonstrated that the structural information of human silhouettes can be captured from AC-DCT coefficients.
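    The activity recognition step described above, principal component analysis over MPEG motion vectors, can be summarized as a nearest-prototype classifier in a learned subspace. The sketch below assumes the motion-vector fields have already been parsed from the MPEG stream into fixed-size numpy arrays and uses made-up activity labels; it illustrates the general technique rather than the thesis's exact pipeline.

```python
import numpy as np

def fit_activity_pca(training_fields, n_components=8):
    """Learn a PCA basis over flattened motion-vector fields and the mean
    projection (prototype) of each labelled activity.  `training_fields`
    is a list of (field, label) pairs, each field an (H, W, 2) array."""
    X = np.stack([f.ravel() for f, _ in training_fields]).astype(float)
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    basis = Vt[:n_components]
    protos = {}
    for field, label in training_fields:
        protos.setdefault(label, []).append(basis @ (field.ravel() - mean))
    return mean, basis, {k: np.mean(v, axis=0) for k, v in protos.items()}

def classify_activity(field, mean, basis, prototypes):
    """Assign the activity whose prototype is nearest in the PCA subspace
    (e.g. 'walking', 'running' or 'kicking')."""
    z = basis @ (field.astype(float).ravel() - mean)
    return min(prototypes, key=lambda k: np.linalg.norm(z - prototypes[k]))
```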