7 research outputs found

    Investigating human-perceptual properties of "shapes" using 3D shapes and 2D fonts

    Shapes are widely used to convey meaning. They appear in video games, films and other multimedia in diverse ways. 3D shapes may be destined for virtual scenes or may represent objects to be constructed in the real world. Fonts add character to an otherwise plain block of text, allowing the writer to make important points more visually prominent or distinct from other text. They can also indicate the structure of a document at a glance. Rather than studying shapes through traditional geometric shape descriptors, we provide alternative methods to describe and analyse shapes through the lens of human perception. This is done via the concepts of Schelling Points and Image Specificity. Schelling Points are the choices people make when they aim to match what they expect others to choose but cannot communicate with them to agree on an answer. We study whole-mesh selections in this setting, where Schelling Meshes are the most frequently selected shapes. The key idea behind Image Specificity is that different images evoke different descriptions, but ‘Specific’ images yield more consistent descriptions than others. We apply Specificity to 2D fonts. We show that each concept can be learned, and we predict Specificity for fonts and Schelling Meshes for 3D shapes using a depth image-based convolutional neural network. Results are shown for a range of fonts and 3D shapes, and we demonstrate that font Specificity and the Schelling Meshes concept are useful for visualisation, clustering, and search applications. Overall, we find that each concept captures similarities within its respective type of shape, even when there are discontinuities between the shape geometries themselves. The ‘context’ of these similarities lies in some kind of abstract or subjective meaning that is consistent among different people.
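The Specificity idea above (items whose descriptions agree with one another are more "Specific") can be sketched as a mean pairwise similarity over the descriptions collected for one item. This is only an illustration: the word-overlap (Jaccard) similarity, the function names, and the sample descriptions are assumptions made here, not the measure or data used in the work.

```python
from itertools import combinations


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity between two descriptions (illustrative stand-in)."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)


def specificity(descriptions: list[str]) -> float:
    """Mean pairwise similarity of the descriptions collected for one item."""
    pairs = list(combinations(descriptions, 2))
    if not pairs:
        raise ValueError("need at least two descriptions")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical descriptions of two fonts, written by three people each.
consistent = ["bold heavy serif", "heavy bold serif", "bold serif heavy"]
varied = ["thin elegant script", "playful rounded letters", "old typewriter"]

print(specificity(consistent), specificity(varied))
```

Under this toy measure, the font whose descriptions agree scores higher than the one whose descriptions share no words.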

    Morphological Analysis for Object Recognition, Matching, and Applications

    This thesis deals with the detection and classification of objects in visual images and with the analysis of shape changes between object instances. Whereas the task of object recognition focuses on learning models which describe common properties between instances of a specific category, the analysis of the specific differences between instances is also relevant to understand the objects and the categories themselves. This research is governed by the idea that important properties for the automatic perception and understanding of objects are transmitted through their geometry or shape. Therefore, models for object recognition and shape matching are devised which exploit the geometry and properties of the objects, using as little user supervision as possible. In order to learn object models for detection in a reliable manner, suitable object representations are required. The key idea in this work is to use a richer representation of the object shape within the object model in order to increase the description power and thus the performance of the whole system. For this purpose, we first investigate the integration of curvature information of shapes in the object model which is learned. Since natural objects intrinsically exhibit curved boundaries, an object is better described if this shape cue is integrated. This approach extends the widely used object representation based on gradient orientation histograms by incorporating a robust histogram-based description of curvature. We show that integrating this information substantially improves detection results over descriptors that solely rely upon histograms of oriented gradients. The impact of using richer shape representations for object recognition is further investigated through a novel method which goes beyond traditional bounding-box representations for objects. Visual recognition requires learning object models from training data.
Commonly, training samples are annotated by marking only the bounding box of objects, since this appears to be the best trade-off between labeling information and effectiveness. However, objects are typically not box-shaped. Thus, the usual parametrization of objects using a bounding box seems inappropriate, since such a box contains a significant amount of background clutter. Therefore, the presented approach learns object models for detection while simultaneously learning to segregate objects from clutter and extracting their overall shape, without, however, requiring manual segmentation of the training samples. Shape equivalence is another interesting property related to shape. It refers to the ability to perceive two distinct objects as having the same or similar shape. This thesis also explores the use of this ability to detect objects in unsupervised scenarios, that is, where no annotation of training data is available for learning a statistical model. For this purpose, a dataset of historical Chinese cartoons drawn during the Cultural Revolution and immediately thereafter is analyzed. Relevant objects in this dataset are emphasized through annuli of light rays. The idea of our method is to consider the different annuli as shape-equivalent objects, that is, as objects sharing the same shape, and to devise a method to detect them. Thereafter, it is possible to indirectly infer the position, size and scale of the emphasized objects using the annuli detections. Not only commonalities among objects, but also the specific differences between them are perceived by a visual system. These differences can be understood through the analysis of how objects and their shape change. For this reason, this thesis also develops a novel methodology for analyzing the shape deformation between a single pair of images under missing correspondences.
The key observation is that objects cannot deform arbitrarily; rather, the deformation follows the geometry and constraints imposed by the object itself. We describe the overall complex object deformation using a piecewise linear model. Thereby, we are able to identify each of the parts in the shape which share the same deformation. Thus, we are able to understand how an object and its parts were transformed. A remarkable property of the algorithm is the ability to automatically estimate the model complexity according to the overall complexity of the shape deformation. Specifically, the introduced methodology is used to analyze the deformation between original instances and reproductions of artworks. The nature of the analyzed alterations ranges from deliberate modifications by the artist to geometrical errors accumulated during the reproduction process of the image. The use of this method within this application shows how productive the interaction between computer vision and the field of the humanities is. The goal is not to supplant human expertise, but to enhance and deepen connoisseurship about a given problem.

    Analyse d’images de documents patrimoniaux : une approche structurelle à base de texture

    Over the last few years, there has been tremendous growth in digitizing collections of cultural heritage documents. Thus, many challenges and open issues have been raised, such as information retrieval in digital libraries or analyzing page content of historical books. Recently, an important need has emerged which consists in designing a computer-aided characterization and categorization tool, able to index or group historical digitized book pages according to several criteria, mainly the layout structure and/or typographic/graphical characteristics of the historical document image content. Thus, the work conducted in this thesis presents an automatic approach for characterization and categorization of historical book pages. The proposed approach is applicable to a large variety of ancient books. In addition, it does not assume a priori knowledge regarding document image layout and content. It is based on the use of texture and graph algorithms to provide a rich and holistic description of the layout and content of the analyzed book pages to characterize and categorize historical book pages. The categorization is based on the characterization of the digitized page content by texture, shape, geometric and topological descriptors. This characterization is represented by a structural signature. More precisely, the signature-based characterization approach consists of two main stages. The first stage is extracting homogeneous regions. Then, the second one is proposing a graph-based page signature which is based on the extracted homogeneous regions, reflecting its layout and content. Afterwards, by comparing the different obtained graph-based signatures using a graph-matching paradigm, the similarities of digitized historical book page layout and/or content can be deduced. Subsequently, book pages with similar layout and/or content can be categorized and grouped, and a table of contents/summary of the analyzed digitized historical book can be provided automatically. 
As a consequence, numerous signature-based applications (e.g. information retrieval in digital libraries according to several criteria, page categorization) can be implemented for effectively managing a corpus or collections of books. To illustrate the effectiveness of the proposed page signature, a detailed experimental evaluation has been conducted in this work to assess two possible categorization applications: unsupervised page classification and page stream segmentation. In addition, the different steps of the proposed approach have been evaluated on a large variety of historical document images. Recent progress in digitizing collections of heritage documents has raised new challenges for ensuring durable preservation and providing broader access to old documents. Alongside information retrieval in digital libraries and the analysis of digitized page content in old books, the characterization and categorization of old book pages has recently seen renewed interest. Efforts focus on developing fast, automatic tools for characterizing and categorizing old book pages, able to classify the pages of a digitized book according to several criteria, notably the layout structure and/or the typographic/graphical characteristics of the page content. Thus, in this thesis, we propose an approach for the automatic characterization and categorization of the pages of an old book. The proposed approach is intended to be independent of the structure and content of the analyzed book. The main advantage of this work is that the approach does not rely on prior knowledge, whether about the document's content or its structure.
It is based on an analysis of texture descriptors and a structural graph representation in order to provide a rich description enabling categorization based on graphical content (captured by texture) and layouts (represented by graphs). This categorization relies on characterizing the content of the digitized page through an analysis of texture, shape, geometric and topological descriptors. This characterization is defined by means of a structural representation. In detail, the categorization approach consists of two main successive stages. The first extracts homogeneous regions. The second proposes a texture-based structural signature, in the form of a graph, built from the extracted homogeneous regions and reflecting the structure of the analyzed page. This signature enables numerous applications for effectively managing a corpus or collections of heritage books (for example, information retrieval in digital libraries according to several criteria, or categorizing the pages of a single book). By comparing the different structural signatures using graph edit distance, the similarities between pages of the same book in terms of layout and/or content can be deduced. Subsequently, pages with similar layouts and/or content can be categorized, and a summary/table of contents of the analyzed book can then be generated automatically. To illustrate the effectiveness of the proposed signature, a detailed experimental study was carried out in this work to evaluate two possible applications of categorizing pages of the same book: unsupervised page classification and page stream segmentation.
In addition, the different steps of the proposed approach were evaluated through experiments conducted on a large corpus of heritage documents.

    Technology, Science and Culture

    Following the success of the first and second volumes of this series, we are enthusiastic to continue our discussions on research topics related to the fields of Food Science, Intelligent Systems, Molecular Biomedicine, Water Science, and Creation and Theories of Culture. Our aims are to discuss the newest topics, theories, and research methods in each of these fields, to promote debate among top researchers and graduate students, and to generate collaborative work among them.

    Active Learning for Reducing Labeling Effort in Text Classification Tasks

    Labeling data can be an expensive task as it is usually performed manually by domain experts. This is cumbersome for deep learning, as it is dependent on large labeled datasets. Active learning (AL) is a paradigm that aims to reduce labeling effort by only using the data which the model deems most informative. Little research has been done on AL in a text classification setting and next to none has involved the more recent, state-of-the-art Natural Language Processing (NLP) models. Here, we present an empirical study that compares different uncertainty-based algorithms with BERT$_{base}$ as the classifier. We evaluate the algorithms on two NLP classification datasets: Stanford Sentiment Treebank and KvK-Frontpages. Additionally, we explore heuristics that aim to solve presupposed problems of uncertainty-based AL; namely, that it is unscalable and that it is prone to selecting outliers. Furthermore, we explore the influence of the query-pool size on the performance of AL. Although the proposed heuristics did not improve the performance of AL, our results show that using uncertainty-based AL with BERT$_{base}$ outperforms random sampling of data. This difference in performance can decrease as the query-pool size gets larger. Comment: Accepted as a conference paper at the joint 33rd Benelux Conference on Artificial Intelligence and the 30th Belgian Dutch Conference on Machine Learning (BNAIC/BENELEARN 2021). This camera-ready version, submitted to BNAIC/BENELEARN, adds several improvements including a more thorough discussion of related work plus an extended discussion section. 28 pages including references and appendices.
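As a rough illustration of the uncertainty-based selection described above, the sketch below ranks unlabeled examples by how uncertain a classifier's predicted class probabilities are and picks the top `k` for labeling. The two scoring functions (entropy and least confidence) are common choices assumed here for illustration; the abstract does not say which algorithms the study compares. The probabilities are made up, and `select_queries` and `k` are hypothetical names.

```python
import math


def entropy(probs):
    """Shannon entropy of a predicted class distribution (higher = more uncertain)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def least_confidence(probs):
    """One minus the probability of the most likely class."""
    return 1.0 - max(probs)


def select_queries(pool_probs, k, score=entropy):
    """Return indices of the k pool examples the model is most uncertain about."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: score(pool_probs[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical class probabilities for four unlabeled texts.
pool_probs = [
    [0.98, 0.02],
    [0.55, 0.45],
    [0.80, 0.20],
    [0.50, 0.50],
]

print(select_queries(pool_probs, k=2))
```

In a full loop, the selected examples would be labeled, added to the training set, and the classifier retrained before the next round. Random sampling, the baseline mentioned above, would replace `select_queries` with a uniform draw from the pool.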