
    High Quality Image Interpolation via Local Autoregressive and Nonlocal 3-D Sparse Regularization

    In this paper, we propose a novel image interpolation algorithm that combines the local autoregressive (AR) model and the nonlocal adaptive 3-D sparse model as regularized constraints under a unified regularization framework. Unlike conventional AR models, which compute weighted interpolation coefficients without considering the rough structural similarity between the low-resolution (LR) and high-resolution (HR) images, our local AR regularization estimates the high-resolution image while exploiting this similarity. The nonlocal adaptive 3-D sparse model is then formulated to regularize the interpolated HR image, providing a way to correct pixels affected by the numerical instability of the AR model. In addition, a new Split-Bregman based iterative algorithm is developed to solve the resulting optimization problem. Experimental results demonstrate that the proposed algorithm achieves significant performance improvements over traditional algorithms in terms of both objective quality and visual perception.
    Comment: 4 pages, 5 figures, 2 tables, to be published at IEEE Visual Communications and Image Processing (VCIP) 201
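    As a rough sketch of the regularization framework described above (our reading of the abstract; the operators, norms, and weights below are assumptions, not taken from the paper), the objective might take the form

        \min_{\mathbf{x}} \ \|\mathbf{y} - \mathbf{D}\mathbf{x}\|_2^2 + \lambda_1 \|(\mathbf{I} - \mathbf{A})\mathbf{x}\|_2^2 + \lambda_2 \|\Psi(\mathbf{x})\|_1

    where \mathbf{y} is the LR image, \mathbf{D} a downsampling operator, \mathbf{A} the local AR predictor, and \Psi the nonlocal adaptive 3-D sparse transform. Split-Bregman would then split the \ell_1 term with an auxiliary variable \mathbf{d} = \Psi(\mathbf{x}) and alternate between quadratic subproblems, shrinkage, and Bregman updates.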

    Labeling the Features Not the Samples: Efficient Video Classification with Minimal Supervision

    Feature selection is essential for effective visual recognition. We propose an efficient joint classifier learning and feature selection method that discovers sparse, compact representations of input features from a vast sea of candidates, with an almost unsupervised formulation. Our method requires only the following knowledge, which we call the feature sign: whether or not a particular feature has on average stronger values over positive samples than over negatives. We show how this can be estimated using as few as a single labeled training sample per class. Then, using these feature signs, we extend an initial supervised learning problem into an (almost) unsupervised clustering formulation that can incorporate new data without requiring ground truth labels. Our method works both as a feature selection mechanism and as a fully competitive classifier. It has important properties: low computational cost and excellent accuracy, especially in difficult cases of very limited training data. We experiment on large-scale recognition in video and show superior speed and performance compared with established feature selection approaches such as AdaBoost, Lasso, and greedy forward-backward selection, and with powerful classifiers such as SVM.
    Comment: arXiv admin note: text overlap with arXiv:1411.771
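    A minimal sketch of the feature-sign idea (not the paper's joint learning and clustering formulation; the function names and the top-k selection rule below are illustrative assumptions):

        import numpy as np

        def feature_signs(X_pos, X_neg):
            # +1 if a feature is on average stronger over positive samples than
            # over negatives, else -1; X_pos can hold as little as one sample.
            return np.where(X_pos.mean(axis=0) > X_neg.mean(axis=0), 1.0, -1.0)

        def select_and_score(X, signs, k=50):
            # Flip each feature by its sign so that large values vote "positive",
            # keep the k features with the strongest mean response on the
            # unlabeled data, and score samples by averaging them.
            Xs = X * signs
            top = np.argsort(Xs.mean(axis=0))[-k:]
            return Xs[:, top].mean(axis=1)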

    A learning-based CT prostate segmentation method via joint transductive feature selection and regression

    In recent years, there has been great interest in prostate segmentation, which is an important and challenging task for CT image guided radiotherapy. In this paper, a learning-based segmentation method via joint transductive feature selection and transductive regression is presented, which incorporates the physician's simple manual specification (taking only a few seconds) to aid accurate segmentation, especially for cases with large irregular prostate motion. More specifically, for the current treatment image, an experienced physician first manually assigns labels to a small subset of prostate and non-prostate voxels, especially in the first and last slices of the prostate region. The proposed method then follows two steps: in the prostate-likelihood estimation step, two novel algorithms, tLasso and wLapRLS, are sequentially employed for transductive feature selection and transductive regression, respectively, to generate the prostate-likelihood map; in the multi-atlas based label fusion step, the final segmentation result is obtained from the prostate-likelihood map and the previous images of the same patient. The proposed method has been extensively evaluated on a real prostate CT dataset of 24 patients with 330 CT images and compared with several state-of-the-art methods. Experimental results show that the proposed method outperforms the state of the art in terms of higher Dice ratio, higher true positive fraction, and lower centroid distance. The results also demonstrate that simple manual specification can help improve segmentation performance, which is clinically feasible in practice.
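    A stand-in sketch of the two-step pipeline (plain Lasso and kernel ridge regression substitute here for the paper's transductive tLasso and wLapRLS; all names and hyperparameters are placeholders):

        import numpy as np
        from sklearn.linear_model import Lasso
        from sklearn.kernel_ridge import KernelRidge

        def prostate_likelihood(F_labeled, y_labeled, F_all):
            # Step 1 (stand-in for tLasso): sparse selection of voxel features
            # using the physician-labeled voxels (1 = prostate, 0 = background).
            lasso = Lasso(alpha=0.01).fit(F_labeled, y_labeled)
            keep = np.flatnonzero(lasso.coef_)  # assumes some features survive
            # Step 2 (stand-in for wLapRLS): regress a prostate-likelihood value
            # for every voxel of the treatment image from the selected features.
            reg = KernelRidge(kernel="rbf").fit(F_labeled[:, keep], y_labeled)
            return reg.predict(F_all[:, keep])  # one likelihood per voxel

    The resulting map would then be combined with the patient's previous segmentations in the multi-atlas label fusion step.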

    Context-Patch Face Hallucination Based on Thresholding Locality-Constrained Representation and Reproducing Learning

    Face hallucination is a technique that reconstructs high-resolution (HR) faces from low-resolution (LR) faces using prior knowledge learned from HR/LR face pairs. Most state-of-the-art methods leverage position-patch prior knowledge of the human face to estimate the optimal representation coefficients for each image patch. However, they focus only on position information and usually ignore the context information of an image patch. In addition, when confronted with misalignment or the Small Sample Size (SSS) problem, their hallucination performance degrades sharply. To this end, this study incorporates the contextual information of image patches and proposes a powerful and efficient context-patch based face hallucination approach, namely Thresholding Locality-constrained Representation and Reproducing learning (TLcR-RL). Under the context-patch based framework, we advance a thresholding based representation method to enhance reconstruction accuracy and reduce computational complexity. To further improve performance, we propose a promotion strategy called reproducing learning: by adding the estimated HR face to the training set, we simulate the case in which the HR version of the input LR face is present in the training set, thereby iteratively enhancing the final hallucination result. Experiments demonstrate that the proposed TLcR-RL method achieves a substantial improvement in the hallucinated results, both subjectively and objectively. Additionally, the proposed framework is more robust to face misalignment and the SSS problem, and the hallucinated HR face remains of high quality when the LR test face comes from the real world. The MATLAB source code is available at https://github.com/junjun-jiang/TLcR-RL
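    A minimal sketch of a thresholding locality-constrained representation for a single patch (the reproducing-learning loop and the context-patch extraction are omitted; K and tau are illustrative values, not the paper's settings):

        import numpy as np

        def tlcr_patch(x_lr, Y_lr, Y_hr, K=90, tau=0.04):
            # Thresholding: keep only the K training patches nearest to the
            # input LR patch instead of the full dictionary.
            d = np.linalg.norm(Y_lr - x_lr, axis=1)
            idx = np.argsort(d)[:K]
            D = Y_lr[idx]
            # Locality-constrained representation: ridge regression whose
            # penalty grows with the distance of each retained patch.
            G = (D - x_lr) @ (D - x_lr).T
            w = np.linalg.solve(G + tau * np.diag(d[idx] ** 2), np.ones(K))
            w /= w.sum()
            # Reconstruct the HR patch as the same weighted combination of the
            # HR counterparts of the selected training patches.
            return w @ Y_hr[idx]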

    Contribution to Graph-based Manifold Learning with Application to Image Categorization

    Graph-based manifold learning algorithms are techniques that have proven to be powerful tools for feature extraction and dimensionality reduction in the fields of pattern recognition, computer vision, and machine learning. These algorithms use pairwise sample similarities and the resulting weighted graph to reveal the intrinsic geometric structure of the manifold.
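    A self-contained sketch of the graph-based recipe described above, using Laplacian eigenmaps as a representative algorithm (the kernel width, neighbourhood size, and embedding dimension are arbitrary choices):

        import numpy as np
        from scipy.linalg import eigh

        def laplacian_eigenmaps(X, k=10, sigma=1.0, dim=2):
            # Pairwise similarities: Gaussian weights on a symmetric kNN graph.
            D2 = ((X[:, None] - X[None]) ** 2).sum(-1)
            W = np.exp(-D2 / (2 * sigma ** 2))
            nn = np.argsort(D2, axis=1)[:, 1:k + 1]
            mask = np.zeros_like(W)
            mask[np.arange(len(X))[:, None], nn] = 1
            W *= np.maximum(mask, mask.T)
            deg = np.diag(W.sum(axis=1))
            L = deg - W  # unnormalized graph Laplacian
            # The smallest non-trivial generalized eigenvectors of (L, deg)
            # reveal the intrinsic low-dimensional structure of the data.
            _, vecs = eigh(L, deg)
            return vecs[:, 1:dim + 1]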

    GAN you train your network

    Zero-shot classifiers identify unseen classes, i.e., classes not seen during training. Specifically, zero-shot models classify attribute information associated with classes (e.g., a zebra has stripes but a lion does not). Lately, the use of generative adversarial networks (GANs) for zero-shot learning has significantly improved the recognition accuracy of unseen classes by producing visual features for any class. Here, I investigate how similar the visual features obtained from images of a class are to the visual features generated by a GAN. I find that, regardless of the metric, the two sets of visual features are disjoint. I also fine-tune a ResNet so that it produces visual features that are similar to the visual features generated by a GAN; this is novel because all standard approaches do the opposite: they train the GAN to match the output of the model. I conclude that these experiments emphasize the need to establish a standard input pipeline in zero-shot learning, because of the mismatch between generated and real features as well as the variation in features (and subsequent GAN performance) across different implementations of models such as ResNet-101.
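    A minimal PyTorch sketch of the reversed direction described above, i.e., fine-tuning the backbone toward the generator's output (gan_features is a hypothetical stub standing in for the trained GAN; the learning rate and pretrained weights are arbitrary choices):

        import torch
        import torch.nn as nn
        import torchvision

        backbone = torchvision.models.resnet101(weights="IMAGENET1K_V1")
        backbone.fc = nn.Identity()  # expose the 2048-d visual features
        opt = torch.optim.Adam(backbone.parameters(), lr=1e-5)

        def step(images, labels, gan_features):
            # Pull the real features toward the generated ones -- the reverse
            # of the usual setup, where the GAN is trained to match the model.
            real = backbone(images)
            loss = nn.functional.mse_loss(real, gan_features(labels))
            opt.zero_grad()
            loss.backward()
            opt.step()
            return loss.item()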