310 research outputs found

    Permutation-invariant Feature Restructuring for Correlation-aware Image Set-based Recognition

    We consider the problem of comparing the similarity of image sets that contain a variable number of unordered, heterogeneous images of varying quality. We use feature restructuring to exploit the correlations of both inner- and inter-set images. Specifically, residual self-attention restructures each feature using the other features within a set, emphasizing the discriminative images and eliminating redundancy. Then, a dependency-guided representation scheme based on sparse/collaborative learning reconstructs the probe features conditional on the gallery features in order to adaptively align the two sets. This makes our framework compatible with both verification and open-set identification. We show that the parametric self-attention network and the non-parametric dictionary learning can be trained end-to-end by a unified alternating optimization scheme, and that the full framework is permutation-invariant. In our numerical experiments, the method achieves top performance on competitive image set/video-based face recognition and person re-identification benchmarks. Comment: Accepted to ICCV 201
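
    The permutation-invariance claim above can be illustrated with a minimal sketch. It assumes a simplified residual self-attention with no learned projections (queries, keys and values are the raw features) followed by mean pooling; the function names are hypothetical, and this is not the authors' network.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def residual_self_attention(features):
    """Each feature attends to every feature in the set; the attended
    vector is added back to the original feature (residual connection)."""
    dim = len(features[0])
    scale = math.sqrt(dim)
    restructured = []
    for query in features:
        weights = softmax([dot(query, key) / scale for key in features])
        attended = [
            sum(w * value[d] for w, value in zip(weights, features))
            for d in range(dim)
        ]
        restructured.append([q + a for q, a in zip(query, attended)])
    return restructured


def set_descriptor(features):
    """Mean-pool the restructured features into one vector per set.
    Assumption: mean pooling stands in for the paper's set aggregation."""
    restructured = residual_self_attention(features)
    n = len(restructured)
    dim = len(restructured[0])
    return [sum(f[d] for f in restructured) / n for d in range(dim)]


if __name__ == "__main__":
    image_set = [[1.0, 0.0, 2.0], [0.5, -1.0, 0.0], [3.0, 1.0, 1.0]]
    forward = set_descriptor(image_set)
    shuffled = set_descriptor(image_set[::-1])
    # The two descriptors agree up to floating-point rounding.
    print(all(abs(x - y) < 1e-9 for x, y in zip(forward, shuffled)))
```

    Reordering the set only reorders the attention outputs, and the mean over them ignores order, so the set descriptor does not depend on how the images are listed. The sketch also accepts sets of any size.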

    An application of ordinal regression to extract social dysfunction levels through behavioral problems

    Psychological problems are complex in nature, and identifying them accurately is important. Interviews and questionnaires are among the preliminary tools for this identification, and questionnaires are preferred when the group under study is large. The Strengths and Difficulties Questionnaire (SDQ) is one of the most widely used and powerful questionnaires for identifying the behavioral problems and distress that respondents face and that affect their day-to-day lives (and are responsible for social dysfunction). This study was conducted among college/university students in India, with the objective of examining whether the extent of social dysfunction, as measured by an impact score, can be extracted from the behavioral problems that make up the difficulty score of the SDQ. Two surveys were conducted during the COVID-19 pandemic, in May–June 2020 and October 2020–February 2021. Only responses from participants who reported feeling distressed (“yes” to item 26 of the SDQ) were considered; these numbered 772/1020 and 584/743, respectively, in the two surveys. Distress levels were treated as ordered variables, and three categories of distress level, viz., “Normal”, “Borderline”, and “Abnormal”, were estimated from behavioral problems using ordinal regression (OR) methods with a negative log-log link function. The fit of the OR models was assessed and accepted using the Cox and Snell, Nagelkerke, and McFadden pseudo-R² measures. Hyperactivity-inattention and emotional symptoms were significant contributors to estimating levels of distress among respondents in survey 1 (p < 0.05). In survey 2, peer problems were also significant in addition to these components. The OR models estimated the extreme categories well; however, the “Borderline” category was not estimated well. One reason was the use of qualitative and complex data in which the “Borderline” category is the narrowest, both for the “Difficulty” and the “Impact” scores.
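
    To make the negative log-log link concrete, the following is a minimal sketch of how a cumulative ordinal model turns a linear predictor into probabilities for the three distress categories. It assumes the common parameterization link(P(Y <= j)) = theta_j - eta, with link(g) = -log(-log(g)); the threshold and predictor values are hypothetical and are not taken from the study.

```python
import math


def inverse_neg_loglog(z: float) -> float:
    """Inverse of the negative log-log link g -> -log(-log(g))."""
    return math.exp(-math.exp(-z))


def category_probs(eta: float, thresholds: list[float]) -> list[float]:
    """Category probabilities for an ordinal outcome with K = len(thresholds) + 1 levels.

    eta        -- linear predictor (e.g. a weighted sum of SDQ subscale scores)
    thresholds -- increasing cut-points theta_1 < ... < theta_{K-1}
    """
    # Cumulative probabilities P(Y <= j); the last category always reaches 1.
    cumulative = [inverse_neg_loglog(t - eta) for t in thresholds] + [1.0]
    # Differences of consecutive cumulative probabilities give P(Y = j).
    probs = [cumulative[0]]
    for j in range(1, len(cumulative)):
        probs.append(cumulative[j] - cumulative[j - 1])
    return probs


if __name__ == "__main__":
    labels = ["Normal", "Borderline", "Abnormal"]
    hypothetical_thresholds = [-0.5, 1.0]  # illustrative values only
    for eta in (0.0, 2.0):
        probs = category_probs(eta, hypothetical_thresholds)
        print(eta, dict(zip(labels, (round(p, 3) for p in probs))))
```

    Under this parameterization a larger linear predictor lowers every cumulative probability, which moves probability mass toward the “Abnormal” end of the scale.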

    AUTO3D: Novel view synthesis through unsupervisely learned variational viewpoint and global 3D representation

    This paper targets learning-based novel view synthesis from a single 2D image or a limited number of 2D images, without pose supervision. In viewer-centered coordinates, we construct an end-to-end trainable conditional variational framework that disentangles a relative pose/rotation, learned without supervision, from an implicit global 3D representation (shape, texture, the origin of the viewer-centered coordinates, etc.). The global appearance of the 3D object is given by several appearance-describing images taken from any number of viewpoints. Our spatial correlation module extracts a global 3D representation from the appearance-describing images in a permutation-invariant manner. Our system achieves implicit 3D understanding without explicit 3D reconstruction. With the viewer-centered relative-pose/rotation code learned without supervision, the decoder can hallucinate novel views continuously by sampling the relative pose from a prior distribution. In various applications, we demonstrate that our model achieves results comparable to, or even better than, pose-supervised or 3D model-supervised learning-based novel view synthesis (NVS) methods, with any number of input views. Comment: ECCV 202

    Techniques in Ordinal Classification and Image-to-Image Translation

    In this thesis we explore two research topics within the realm of deep learning and medical imaging. The first is ordinal classification, in which the classes to be predicted are discrete but have an ordering relation. Probability distributions over ordinal classes can possess undesired properties, such as non-unimodality. We propose a straightforward technique to constrain discrete ordinal probability distributions to be unimodal via the use of the Poisson and binomial probability distributions. We evaluate this approach on an age estimation dataset and a Kaggle diabetic retinopathy dataset and obtain competitive results. We conjecture that the unimodality constraint, in addition to making the probability distributions more interpretable, acts as a regulariser which can mitigate overfitting, especially in a low-data regime. In the second topic, we explore adversarial image-to-image translation and motivate its utility within the framework of semi-supervised learning. We evaluate an existing method and propose a new one, which we evaluate on several datasets such as the ones employed in our work on ordinal classification. In the case of the latter, we want to map from the domain of symptomatic patient scans to that of non-symptomatic patient scans. This effectively trains a model which can disentangle the underlying factors of variation and learn to detect and remove symptomatic regions in the image, which could be leveraged in several ways, such as aiding a network which relies on rich labels, or generating synthetic examples. We present some interesting qualitative results and motivate several promising avenues for future work.
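
    The unimodality constraint described above can be sketched with the binomial variant alone. The example assumes that a single scalar p in (0, 1), which in the thesis setting would presumably come from a network output, parameterizes a Binomial(K - 1, p) distribution over K ordered classes. The function names are hypothetical and this is not the thesis code.

```python
import math


def binomial_ordinal_probs(p: float, num_classes: int) -> list[float]:
    """Probabilities over classes 0..K-1 given by the Binomial(K-1, p) pmf."""
    n = num_classes - 1
    return [
        math.comb(n, k) * p**k * (1.0 - p) ** (n - k)
        for k in range(num_classes)
    ]


def is_unimodal(probs: list[float], tol: float = 1e-12) -> bool:
    """True if the sequence weakly rises to its maximum and weakly falls after it."""
    peak = probs.index(max(probs))
    rising = all(probs[i] <= probs[i + 1] + tol for i in range(peak))
    falling = all(
        probs[i] + tol >= probs[i + 1] for i in range(peak, len(probs) - 1)
    )
    return rising and falling


if __name__ == "__main__":
    # Five ordered classes, e.g. five severity grades (illustrative choice).
    for p in (0.1, 0.5, 0.9):
        probs = binomial_ordinal_probs(p, num_classes=5)
        print(p, [round(q, 4) for q in probs], is_unimodal(probs))
```

    Because the binomial pmf is unimodal for any p, every predicted distribution has a single peak, and moving p shifts that peak along the class ordering. An unconstrained softmax over the same classes carries no such guarantee.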