
    Geometric deep learning reveals the spatiotemporal fingerprint of microscopic motion

    The characterization of dynamical processes in living systems provides important clues for their mechanistic interpretation and link to biological functions. Thanks to recent advances in microscopy techniques, it is now possible to routinely record the motion of cells, organelles, and individual molecules at multiple spatiotemporal scales in physiological conditions. However, the automated analysis of dynamics occurring in crowded and complex environments still lags behind the acquisition of microscopic image sequences. Here, we present a framework based on geometric deep learning that achieves the accurate estimation of dynamical properties in various biologically relevant scenarios. This deep-learning approach relies on a graph neural network enhanced by attention-based components. By processing object features with geometric priors, the network is capable of performing multiple tasks, from linking coordinates into trajectories to inferring local and global dynamic properties. We demonstrate the flexibility and reliability of this approach by applying it to real and simulated data corresponding to a broad range of biological experiments. (Comment: 17 pages, 5 figures, 2 supplementary figures)
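The attention-weighted message passing at the core of such a graph network can be sketched in a few lines. This is a minimal NumPy illustration of one attention step over a graph of detections, not the paper's actual architecture: the node features, edge construction, learned weights `W` and attention vector `a`, and all training machinery are assumptions made for the sketch.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_step(features, edges, W, a):
    """One attention-weighted message-passing step over a detection graph.

    features: (N, F) node features (e.g. positions of detected objects)
    edges:    list of (i, j) candidate links between detections
    W:        (F, F) shared linear transform (hypothetical learned weights)
    a:        (2F,) attention vector scoring concatenated node pairs
    """
    h = features @ W
    out = h.copy()
    for i in range(len(features)):
        nbrs = [j for (u, j) in edges if u == i]
        if not nbrs:
            continue
        # attention score for each neighbour from the concatenated features
        scores = np.array([a @ np.concatenate([h[i], h[j]]) for j in nbrs])
        alpha = softmax(scores)
        # aggregate neighbour messages, weighted by attention
        out[i] = h[i] + sum(w * h[j] for w, j in zip(alpha, nbrs))
    return out
```

With zero attention weights the step reduces to averaging over neighbours, which makes the mechanics easy to check by hand; a trained network would instead learn `W` and `a` so that attention focuses on the physically plausible links.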

    Efficient Learning-based Image Enhancement: Application to Compression Artifact Removal and Super-resolution

    Many computer vision and computational photography applications essentially solve an image enhancement problem. The image has been deteriorated by a specific noise process, such as aberrations from camera optics or compression artifacts, that we would like to remove. We describe a framework for learning-based image enhancement. At the core of our algorithm lies a generic regularization framework that comprises a prior on natural images, as well as an application-specific conditional model based on Gaussian processes. In contrast to prior learning-based approaches, our algorithm can instantly learn task-specific degradation models from sample images, which enables users to easily adapt the algorithm to a specific problem and data set of interest. This is facilitated by our efficient approximation scheme for large-scale Gaussian processes. We demonstrate the efficiency and effectiveness of our approach by applying it to example enhancement applications including single-image super-resolution, as well as artifact removal in JPEG- and JPEG 2000-encoded images.
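The Gaussian-process conditional model at the heart of this kind of approach can be illustrated with plain GP regression: observe noisy input/output pairs, then predict the clean value at a new input via the posterior mean. This is a generic sketch only; the paper's conditional model, kernel choice, and large-scale approximation scheme are not reproduced, and the length scale and noise level below are arbitrary assumptions.

```python
import numpy as np

def rbf_kernel(A, B, length_scale):
    # squared-exponential kernel between two sets of points
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length_scale**2)

def gp_predict(X_train, y_train, X_test, length_scale=0.2, noise=1e-4):
    """GP posterior mean: a stand-in for a degradation model learned
    from sample pairs (degraded feature -> clean value)."""
    K = rbf_kernel(X_train, X_train, length_scale) + noise * np.eye(len(X_train))
    Ks = rbf_kernel(X_test, X_train, length_scale)
    alpha = np.linalg.solve(K, y_train)   # weights for the training targets
    return Ks @ alpha
```

In the enhancement setting the training pairs would come from example degraded/clean images, which is what lets the model be re-learned "instantly" for a new degradation.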

    Artificial neural network-statistical approach for PET volume analysis and classification

    Copyright © 2012 The Authors. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. This article has been made available through the Brunel Open Access Publishing Fund. The increasing number of imaging studies and the prevailing application of positron emission tomography (PET) in clinical oncology have led to a real need for efficient PET volume handling and the development of new volume analysis approaches to aid clinicians in clinical diagnosis, treatment planning, and assessment of response to therapy. A novel automated system for oncological PET volume analysis is proposed in this work. The proposed intelligent system deploys two types of artificial neural networks (ANNs) for classifying PET volumes. The first methodology is a competitive neural network (CNN), whereas the second is based on a learning vector quantisation neural network (LVQNN). Furthermore, the Bayesian information criterion (BIC) is used in this system to assess the optimal number of classes for each PET data set and to assist the ANN blocks in achieving accurate analysis by providing the best number of classes. The system evaluation was carried out using experimental phantom studies (NEMA IEC image quality body phantom), simulated PET studies using the Zubal phantom, and clinical studies representative of non-small cell lung cancer and pharyngolaryngeal squamous cell carcinoma. The proposed analysis methodology of clinical oncological PET data has shown promising results and can successfully classify and quantify malignant lesions. This study was supported by the Swiss National Science Foundation under Grant SNSF 31003A-125246, the Geneva Cancer League, and the Indo Swiss Joint Research Programme ISJRP 138866.
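The BIC-guided selection of the number of classes can be sketched as follows. The competitive and LVQ networks themselves are not reproduced here: plain k-means stands in for the clustering stage, and a simplified BIC form (residual-based fit term plus a parameter-count penalty) stands in for the criterion, so this only illustrates the "score each candidate class count, keep the best" loop.

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Plain k-means, a stand-in for the competitive/LVQ clustering stage."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = ((X[:, None] - centers[None]) ** 2).sum(-1).argmin(1)
        centers = np.array([X[labels == j].mean(0) if (labels == j).any()
                            else centers[j] for j in range(k)])
    return labels, centers

def bic(X, labels, centers):
    # simplified BIC: fit term n*log(RSS/n) plus penalty (#params)*log(n)
    n, d = X.shape
    rss = ((X - centers[labels]) ** 2).sum()
    return n * np.log(rss / n + 1e-12) + len(centers) * d * np.log(n)

def best_k(X, k_range):
    """Pick the class count with the lowest BIC, as the system's BIC block does."""
    return min(k_range, key=lambda k: bic(X, *kmeans(X, k)))
```

On PET volumes the "points" would be voxel feature vectors rather than the 2D toy data used here.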

    A Novel Image Compression Method Based on Classified Energy and Pattern Building Blocks

    In this paper, a novel image compression method based on the generation of so-called classified energy and pattern blocks (CEPB) is introduced and evaluation results are presented. The CEPB is constructed using training images and then stored at both the transmitter and receiver sides of the communication system. The energy and pattern blocks of the input images to be reconstructed are then determined in the same way as in the construction of the CEPB. This process is also associated with a matching procedure that determines the index numbers of the classified energy and pattern blocks in the CEPB which best represent (match) the energy and pattern blocks of the input images. The encoding parameters are the block scaling coefficient and the index numbers of the energy and pattern blocks determined for each block of the input images. These parameters are sent from the transmitter to the receiver, and the classified energy and pattern blocks associated with the index numbers are retrieved from the CEPB. The input image is then reconstructed block by block at the receiver using a proposed mathematical model. Evaluation results show that the method provides considerable compression ratios and good image quality even at low bit rates. The work described in this paper was funded by the Isik University Scientific Research Fund (Project contract no. 10B301). The author would like to thank Professor B. S. Yarman (Istanbul University, College of Engineering, Department of Electrical-Electronics Engineering), Assistant Professor Hakan Gurkan (Isik University, Engineering Faculty, Department of Electrical-Electronics Engineering), the researchers in the International Computer Science Institute (ICSI) Speech Group, University of California at Berkeley, CA, USA, and the researchers in the SRI International Speech Technology and Research (STAR) Laboratory, Menlo Park, CA, USA, for many helpful discussions on this work during his postdoctoral fellowship years. The author also would like to thank the anonymous reviewers for their valuable comments and suggestions, which substantially improved the quality of this paper.
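The index-plus-scale encoding idea can be sketched with a single shared codebook of unit-norm blocks: each input block is reduced to a gain and the index of its best-matching codebook entry. This is only an illustration of the matching/indexing principle under assumed flattened blocks and a single toy codebook; the paper's separate energy and pattern codebooks and its reconstruction model are not reproduced.

```python
import numpy as np

def encode(block, codebook):
    """Encode a (flattened) image block as (scale, codebook index).

    codebook: (K, B) matrix whose rows are unit-norm reference blocks,
    standing in for the CEPB shared by transmitter and receiver.
    """
    scale = np.linalg.norm(block)
    unit = block / scale if scale > 0 else block
    idx = int(((codebook - unit) ** 2).sum(1).argmin())  # best match
    return scale, idx

def decode(scale, idx, codebook):
    # Receiver pulls the indexed block from its copy of the codebook
    # and rescales it; only (scale, idx) ever crosses the channel.
    return scale * codebook[idx]
```

The compression comes from transmitting one scalar and one small integer per block instead of all of its pixels; quality then depends on how well the codebook covers the blocks that actually occur.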

    Implicit-explicit Integrated Representations for Multi-view Video Compression

    With the increasing consumption of 3D displays and virtual reality, multi-view video has become a promising format. However, its high resolution and multi-camera shooting result in a substantial increase in data volume, making storage and transmission a challenging task. To tackle these difficulties, we propose an implicit-explicit integrated representation for multi-view video compression. Specifically, we first use the explicit representation-based 2D video codec to encode one of the source views. Subsequently, we propose employing the implicit neural representation (INR)-based codec to encode the remaining views. The implicit codec takes the time and view index of the multi-view video as coordinate inputs and generates the corresponding implicit reconstruction frames. To enhance the compressibility, we introduce a multi-level feature grid embedding and a fully convolutional architecture into the implicit codec. These components facilitate coordinate-feature and feature-RGB mapping, respectively. To further enhance the reconstruction quality from the INR codec, we leverage the high-quality reconstructed frames from the explicit codec to achieve inter-view compensation. Finally, the compensated results are fused with the implicit reconstructions from the INR to obtain the final reconstructed frames. Our proposed framework combines the strengths of both implicit neural representation and explicit 2D codecs. Extensive experiments conducted on public datasets demonstrate that the proposed framework can achieve comparable or even superior performance to the latest multi-view video compression standard MIV and other INR-based schemes in terms of view compression and scene modeling.
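The coordinate-to-feature stage, where a (time, view) coordinate is mapped to features by interpolating into grids at several resolutions, can be sketched as follows. The grid shapes and the idea of simply concatenating levels are assumptions for illustration; the paper's actual embedding and the convolutional feature-to-RGB decoder are not shown.

```python
import numpy as np

def grid_sample(grid, t, v):
    """Bilinearly interpolate a (T, V, C) feature grid at continuous
    coordinates (t, v), each normalised to [0, 1]."""
    T, V, C = grid.shape
    x, y = t * (T - 1), v * (V - 1)
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, T - 1), min(y0 + 1, V - 1)
    dx, dy = x - x0, y - y0
    return ((1 - dx) * (1 - dy) * grid[x0, y0] + dx * (1 - dy) * grid[x1, y0]
            + (1 - dx) * dy * grid[x0, y1] + dx * dy * grid[x1, y1])

def multilevel_features(grids, t, v):
    # Concatenate interpolated features from every resolution level;
    # a decoder network would then map this vector to pixels.
    return np.concatenate([grid_sample(g, t, v) for g in grids])
```

The learnable quantities here would be the grid entries themselves, optimised so the decoded frames match the source views; coarse levels capture content shared across time/views, fine levels the residual detail.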

    Text detection in natural scenes through weighted majority voting of DCT high pass filters, line removal, and color consistency filtering

    Detecting text in images presents the unique challenge of finding both in-scene and superimposed text of various sizes, fonts, colors, and textures in complex backgrounds. The goal of this system is not to recognize specific letters or words but only to determine whether a pixel is text or not. This pixel-level decision is made by applying a set of weighted classifiers created using a set of high pass filters, together with a series of image processing techniques. It is our assertion that the learned weighted combination of frequency filters in conjunction with image processing techniques may show better pixel-level text detection performance in terms of precision, recall, and f-metric than any of the components do individually. Qualitatively, our algorithm performs well and shows promising results. Quantitative numbers are not as high as desired, but not unreasonable. For the complete ensemble, the f-metric was found to be 0.36.
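The weighted voting of DCT high pass filters can be sketched per block: compute the block's 2D DCT, measure high-frequency energy above several cutoffs, and take a weighted vote of the resulting threshold decisions. The cutoffs, weights, and threshold below are illustrative assumptions, not the paper's learned values, and the line-removal and color-consistency stages are omitted.

```python
import numpy as np

def dct_matrix(n=8):
    # orthonormal DCT-II basis matrix
    k, i = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    M = np.cos(np.pi * (2 * i + 1) * k / (2 * n)) * np.sqrt(2 / n)
    M[0] /= np.sqrt(2)
    return M

def highpass_energy(block, cutoff):
    """Energy in DCT coefficients whose frequency index sum >= cutoff."""
    D = dct_matrix(len(block))
    coef = D @ block @ D.T
    mask = np.add.outer(np.arange(len(block)), np.arange(len(block))) >= cutoff
    return (coef[mask] ** 2).sum()

def vote_text(block, cutoffs, weights, thresh):
    # weighted majority vote of several high-pass "classifiers":
    # each cutoff yields one binary vote; text wins if the weighted
    # votes exceed half the total weight
    votes = np.array([highpass_energy(block, c) > thresh
                      for c in cutoffs]).astype(float)
    return (weights @ votes) > (weights.sum() / 2)
```

The intuition is that text edges put energy into high DCT frequencies while smooth background does not; the learned weights then decide how much each filter's opinion counts.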

    Off-line Thai handwriting recognition in legal amount

    Thai handwriting in legal amounts is a challenging problem and a new field in the area of handwriting recognition research. The focus of this thesis is to implement a Thai handwriting recognition system. A preliminary data set of Thai handwriting in legal amounts is designed. The samples in the data set are characters and words of Thai legal amounts and a set of legal amount phrases collected from a number of native Thai volunteers. In the preprocessing and recognition stages, techniques are introduced to improve the character recognition rates. The characters are divided into two smaller subgroups according to their writing levels, named the body and high groups. The recognition rates of both groups are increased based on their distinguishing features. The writing-level separation algorithms are implemented using the size and position of characters. Empirical experiments are set up to find the combination of features that best increases the recognition rates. Traditional recognition systems are modified to give the cumulative top-3 ranked answers to cover the possible character classes. At the postprocessing level, lexicon matching algorithms are implemented to match the ranked characters with the legal amount words. These matched words are joined together to form possible choices of amounts, whose syntax is checked in the last stage. Several syntax violations are caused by consequent faulty character segmentation and recognition resulting from connected or broken characters. The anomalies in handwriting caused by these characters are mainly detected by their size and shape. During the recovery process, possible word boundary patterns can be pre-defined and used to segment the hypothesis words. These words are identified by word recognition, and the results are joined with the previously matched words to form the full amounts, which are checked by the syntax rules again. From 154 amounts written by 10 writers, the rejection rate is 14.9 percent with the recovery processes. The recognition rate for the accepted amounts is 100 percent.
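The lexicon matching step, where the top-ranked candidates for each character position are combined and checked against the legal-amount vocabulary, can be sketched as follows. Latin letters stand in for Thai characters here, and the brute-force enumeration over candidate combinations is a simplification of whatever matching the thesis actually uses.

```python
from itertools import product

def match_lexicon(ranked, lexicon):
    """Match per-position recogniser candidates against a lexicon.

    ranked:  one list of top-ranked candidate characters per character cell,
             e.g. the accumulative top-3 answers from the recogniser
    lexicon: the legal-amount word list
    Returns the lexicon words reachable by picking one candidate per cell.
    """
    if not ranked:
        return []
    # every word spellable from the candidates (top-3 per cell keeps this small)
    candidates = {"".join(p) for p in product(*ranked)}
    return [w for w in lexicon if w in candidates]
```

Words surviving this filter would then be joined into full amount hypotheses and passed to the syntax-checking stage.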