18 research outputs found

BoR: Bag-of-Relations for Symbol Retrieval

Authors: K.C. Santosh, Lamiroy Bart, Wendling Laurent
Publication venue: 'World Scientific Pub Co Pte Lt'
Publication date: 01/01/2014

In this paper, we present a new scheme for symbol retrieval based on bags-of-relations (BoRs), which are computed between extracted visual primitives (e.g. circle and corner). Our features consist of pairwise spatial relations from all possible combinations of individual visual primitives. The key characteristic of the overall process is that topological relation information is indexed in bags-of-relations and then used for recognition. As a consequence, directional relation matching takes place only with those candidates that have similar topological configurations. A comprehensive study is made using several well-known datasets, such as GREC, FRESH and SESYD, and includes a comparison with state-of-the-art descriptors. Experiments provide interesting results on symbol spotting and other user-friendly symbol retrieval applications.
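
The indexing idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the primitive representation (a kind plus a bounding box), the four coarse relation labels, and the function names `topological_relation` and `bag_of_relations` are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical primitive format: (kind, (xmin, ymin, xmax, ymax)).

def topological_relation(a, b):
    """Coarse relation between two axis-aligned boxes (illustrative only)."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    if ax1 < bx0 or bx1 < ax0 or ay1 < by0 or by1 < ay0:
        return "disjoint"
    if ax0 <= bx0 and ay0 <= by0 and ax1 >= bx1 and ay1 >= by1:
        return "contains"
    if bx0 <= ax0 and by0 <= ay0 and bx1 >= ax1 and by1 >= ay1:
        return "inside"
    return "overlaps"

def bag_of_relations(primitives):
    """Index every pair of primitives under the key (kind_a, kind_b, relation)."""
    bags = defaultdict(list)
    for first, second in combinations(enumerate(primitives), 2):
        # Order the pair by primitive kind so the key does not depend on input order.
        (i, a), (j, b) = sorted([first, second], key=lambda p: p[1][0])
        relation = topological_relation(a[1], b[1])
        bags[(a[0], b[0], relation)].append((i, j))
    return bags

# Toy symbol: one circle enclosing a corner, plus a second corner far away.
symbol = [
    ("circle", (0, 0, 10, 10)),
    ("corner", (2, 2, 4, 4)),
    ("corner", (20, 0, 22, 2)),
]
bags = bag_of_relations(symbol)
```

In a retrieval setting, a query pair would then only be compared in detail (for example, on directional relations) against pairs stored under the same key, which is the pruning effect the abstract describes.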

INRIA a CCSD electronic archive server

HAL Descartes

Linear combination of multiresolution descriptors: application to graphics recognition

Author: Ramos Terrades Oriol
Publication venue: Bellaterra : Universitat Autònoma de Barcelona
Publication date: 01/01/2006

Available from TDX. In the field of Document Analysis, we would like to be able to process any kind of digital document automatically and extract the relevant information. That is, we would like to recover the document layout, identify each of its parts and recognise its contents, in order to search among the components of a document and also across different documents. This is a challenging problem that has motivated different lines of research at different levels. Pre-processing techniques have been developed to improve the quality of the document image, reducing noise from the input devices and minimizing the effects of document degradation. Much work on segmentation has been carried out in order to separate the regions of interest from the rest of the document. Finally, many descriptors have been proposed for representing and identifying these regions of interest, from the end of the 1960s until now. In this thesis, we focus on this last problem, shape description, and also on classifier fusion, and we apply them to one of the application fields of Document Analysis: graphics recognition. In shape recognition, many applications have to face the problem of describing a large number of complex shapes for recognition or retrieval in large databases. Besides the large number of shapes, there are other challenges for shape description, such as the similarity among some of the shapes or the variability of the shape classes. In these cases, one of the key issues is the design of highly discriminant shape descriptors.
Unfortunately, a single kind of descriptor is not usually enough to achieve satisfactory results, so we have to combine information from different sources to improve the global performance of the recognition system. We have carried out this combination of information using classifier fusion. Concerning shape description, graphics have traditionally been represented using structural descriptors, which are based on a vectorial representation of the shape. Vectorization is quite sensitive to noise and to distortions of sketched symbols. We can try to overcome this problem by defining grammars or by building deformable models of the symbols. Another possibility, the one followed in this dissertation, is to use descriptors that do not need a vectorial representation of the symbol. In the context of shape description, we have proposed a descriptor based on the ridgelet transform. Having unified the terminology used in shape description and introduced a vocabulary for explaining and classifying descriptors, we can define it as a 2D, polar, multiresolution descriptor that preserves information and is invariant to similarities. Moreover, the multiresolution property of the ridgelet transform yields a representation at different resolution levels, which can be divided into groups of ridgelet coefficients that can each be considered a descriptor. Thus, we have trained a classifier for each descriptor and proposed two linear combination rules, IN and DN, that minimize the classification error for classifiers that satisfy a set of constraints on their distribution and dependence. These theoretical approaches have been evaluated through a set of experiments on ridgelet descriptors, on classifier fusion, and on the classifier fusion methods applied to ridgelet descriptors, with the following results. Ridgelet descriptors represent graphic symbols better than more general-purpose descriptors.
The IN and DN methods reduce the misclassification rate relative to other reference fusion methods. Finally, the IN method applied to ridgelet descriptors, in combination with boosting classifiers, reaches recognition rates close to 100% in the tests defined for the GREC'03 graphic symbol database.
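
As a rough illustration of what a linear combination rule for classifier fusion looks like, the sketch below fuses per-classifier class scores with a weighted sum. The weights and scores are made-up values and `combine_linear` is a hypothetical helper. The abstract does not specify how the IN and DN rules derive their weights, so this is a generic example and not either of those rules.

```python
def combine_linear(score_lists, weights):
    """Weighted sum of per-classifier class scores.

    score_lists[k][c] is the score classifier k assigns to class c.
    Returns the fused scores and the index of the winning class.
    """
    n_classes = len(score_lists[0])
    fused = [0.0] * n_classes
    for scores, weight in zip(score_lists, weights):
        for c, s in enumerate(scores):
            fused[c] += weight * s
    best = max(range(n_classes), key=fused.__getitem__)
    return fused, best

# Three hypothetical classifiers (e.g. one per group of coefficients), three classes.
scores = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.7, 0.2],
]
weights = [0.5, 0.25, 0.25]  # illustrative weights, chosen by hand
fused, label = combine_linear(scores, weights)
```

Here the first classifier alone would pick class 0, but the fused decision is class 1 because the other two classifiers agree on it strongly enough to outweigh it.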

Thèses en Ligne

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Tesis Doctorals en Xarxa

INRIA a CCSD electronic archive server

Diposit Digital de Documents de la UAB

Secretaría de Estado de Cultura

A Bayesian network for combining descriptors: application to symbol recognition

Authors: Barrat Sabine, Tabbone Salvatore
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 26/11/2009

In this paper, we propose a descriptor combination method that significantly improves the recognition rate compared with the rates obtained by each descriptor individually. The approach is based on a probabilistic graphical model, which can handle both discrete and continuous-valued variables. To improve the recognition rate, we combine two kinds of features: discrete features (corresponding to shape measures) and continuous features (corresponding to shape descriptors). To address the dimensionality problem caused by the large dimension of the visual features, we adapt a variable selection method. Experimental results, obtained in a supervised learning context on noisy and occluded symbols, show the feasibility of the approach.

INRIA a CCSD electronic archive server

Integrating Vocabulary Clustering with Spatial Relations for Symbol Recognition

Authors: K.C. Santosh, Lamiroy Bart, Wendling Laurent
Publication venue: Springer Verlag
Publication date: 17/05/2013

This paper develops a structural symbol recognition method with integrated statistical features. It applies spatial organization descriptors to the identified shape features, drawn from a fixed visual vocabulary, that compose a symbol. It builds an attributed relational graph expressing the spatial relations between those visual vocabulary elements. In order to adapt the chosen vocabulary features to multiple and possibly specialized contexts, we study the pertinence of unsupervised clustering to capture significant shape variations within a vocabulary class and thus refine the discriminative power of the method. This unsupervised clustering relies on cross-validation between several different cluster indices. The resulting approach is capable of determining part of the pertinent vocabulary and significantly improves recognition results with respect to the state of the art. It is experimentally validated on complex electrical wiring diagram symbols.

INRIA a CCSD electronic archive server

HAL Descartes

Statistical Characterization of Morphodynamic Signals Using Wavelet Analysis

Author: Gutierrez Ronald R.
Publication date: 25/09/2013

Morphodynamic and hydrodynamic properties are both part of the overall dynamics of river systems and commonly exhibit persistent variability in both time and space. Therefore, the study of both river morphodynamic signals (e.g. bed forms and meandering and anabranching river morphometrics) and hydrodynamic signals (e.g. velocity fields, sediment concentrations) requires both temporal and spatial multi-scale signal representations. The present research focuses on the former type of signals. It is a first attempt to discriminate such signals and, subsequently, to develop the theoretical background needed to link these processes at different spatial and temporal scales and to determine the scales that have the most influence on river evolution. The main contributions of this study are: [1] To design a methodology to discriminate bed form features (e.g. bars, dunes and ripples) via the combined application of robust spline filters and one-dimensional continuous wavelet transforms, allowing the quantitative recognition of bed form hierarchies. The methodology was tested using synthetic bed form signals and subsequently applied to the analysis of bed form features from the Parana River, Argentina. [2] To develop a methodology for the statistical analysis of the spatial distribution of meandering river morphometrics by coupling the capabilities of one-dimensional wavelet transforms, principal component analysis and the Fréchet distance. A universal river classification method is also proposed. [3] To perform a novel study of the planimetric configuration of confluences in tropical free meandering rivers located in the upper Amazon catchment. River confluences in tropical environments are areas where biota is concentrated; therefore, a better understanding and characterization of these features is of particular importance for the Amazonian ecosystem. [4] To evaluate the potential of two-dimensional wavelet transforms in the analysis of bed form features.
The broader impact will be an improved understanding of the morphodynamics of the upper Amazon River for practical applications such as navigability. Furthermore, the project will provide an updated statistical analysis of the dynamics of meandering rivers for practical applications, including erosion control, river ecology, and habitat restoration. The developed statistical tool will be included as an application of the RVR Meander platform (www.rvrmeander.org), which is widely used software for river restoration.
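
To give a feel for how a one-dimensional continuous wavelet transform can separate features by scale, the following sketch computes the wavelet energy of a synthetic profile at several scales and reports the scale with the most energy. It is only an illustration under stated assumptions: the Ricker (Mexican hat) wavelet, the direct-convolution implementation, the zero treatment of samples beyond the record, and the sinusoidal test signals are choices made here, not details taken from the thesis, and the spline-filtering step is omitted.

```python
import math

def ricker(t, a):
    """Ricker (Mexican hat) wavelet of width a, evaluated at offset t."""
    x = t / a
    norm = 2.0 / (math.sqrt(3.0 * a) * math.pi ** 0.25)
    return norm * (1.0 - x * x) * math.exp(-0.5 * x * x)

def cwt_energy(signal, scales):
    """Return {scale: sum of squared CWT coefficients} by direct convolution."""
    n = len(signal)
    energy = {}
    for a in scales:
        half = int(5 * a)  # truncate the wavelet support at +/- 5 widths
        kernel = [ricker(k, a) for k in range(-half, half + 1)]
        total = 0.0
        for i in range(n):
            coeff = 0.0
            for k in range(-half, half + 1):
                j = i + k
                if 0 <= j < n:  # samples outside the record count as zero
                    coeff += signal[j] * kernel[k + half]
            total += coeff * coeff
        energy[a] = total
    return energy

def dominant_scale(signal, scales):
    """Scale whose wavelet coefficients carry the most energy."""
    energy = cwt_energy(signal, scales)
    return max(energy, key=energy.get)

# Synthetic profiles: sinusoids with wavelengths of 20 and 40 samples.
short_wave = [math.sin(2 * math.pi * i / 20) for i in range(200)]
long_wave = [math.sin(2 * math.pi * i / 40) for i in range(200)]
scales = range(1, 21)
short_scale = dominant_scale(short_wave, scales)
long_scale = dominant_scale(long_wave, scales)
```

The longer-wavelength profile should peak at a larger scale than the shorter one, which is the property that lets a scale-by-scale analysis distinguish, say, large bed forms from small ones superimposed on them.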

D-Scholarship@Pitt

Modeling cognition with generative neural networks: The case of orthographic processing

Author: Testolin Alberto
Publication date: 01/01/2015

This thesis investigates the potential of generative neural networks to model cognitive processes. In contrast to many popular connectionist models, the computational framework adopted in this work emphasizes the generative nature of cognition, suggesting that one of the primary goals of cognitive systems is to learn an internal model of the surrounding environment that can be used to infer causes and make predictions about upcoming sensory information. In particular, we consider a powerful class of recurrent neural networks that learn probabilistic generative models from experience in a completely unsupervised way, by extracting high-order statistical structure from a set of observed variables. Notably, this type of network can be conveniently formalized within the more general framework of probabilistic graphical models, which provides a unified language to describe both neural networks and structured Bayesian models. Moreover, recent advances make it possible to extend basic network architectures to build more powerful systems, which exploit multiple processing stages to perform learning and inference over hierarchical models, or which exploit delayed recurrent connections to process sequential information. We argue that these advanced network architectures constitute a promising alternative to the more traditional feed-forward, supervised neural networks, because they more neatly capture the functional and structural organization of cortical circuits, providing a principled way to combine top-down, high-level contextual information with bottom-up sensory evidence. We provide empirical support for the use of these models by studying how efficient implementations of hierarchical and temporal generative networks can extract information from large datasets containing thousands of patterns.
In particular, we perform computational simulations of the recognition of handwritten and printed characters belonging to different writing scripts, which are subsequently combined spatially or temporally in order to build more complex orthographic units such as those constituting English words.

Archivio istituzionale della ricerca - Università di Padova

Design and analysis of a content-based image retrieval system

Author: Hernández Mesa Pilar
Publication venue: KIT Scientific Publishing, Karlsruhe
Publication date: 01/01/2017

The automatic retrieval of images according to the similarity of their content is a challenging task with many fields of application. This book considers the automatic retrieval of images according to spontaneous human perception, without further effort or knowledge. A system for this purpose is designed and analyzed. Methods for the detection and extraction of regions and for the extraction and comparison of color, shape, and texture features are also investigated.

KITopen

Directory of Open Access Books (DOAB)

Advances in Image Processing, Analysis and Recognition Technology

Publication venue: 'MDPI AG'
Publication date: 21/06/2022

For many decades, researchers have been trying to make computer analysis of images as effective as human vision. For this purpose, many algorithms and systems have been created. The whole process covers various stages, including image processing, representation and recognition. The results of this work can be applied to many computer-assisted areas of everyday life. They improve particular activities and provide handy tools, which are sometimes only for entertainment but quite often significantly increase our safety. In fact, the range of practical applications of image processing algorithms is particularly wide. Moreover, the rapid growth of computational complexity and computer efficiency has allowed for the development of more sophisticated and effective algorithms and tools. Although significant progress has been made so far, many issues remain, resulting in the need for novel approaches.

Directory of Open Access Books (DOAB)

Pattern Recognition

Publication venue: 'IntechOpen'
Publication date: 20/04/2021

Pattern recognition is a very wide research field. It involves factors as diverse as sensors, feature extraction, pattern classification, decision fusion, applications and others. The signals processed are commonly one-, two- or three-dimensional; the processing may be done in real time or take hours or days; some systems look for one narrow object class, while others search huge databases for entries with at least a small amount of similarity. No single person can claim expertise across the whole field, which develops rapidly, updates its paradigms and encompasses several philosophical approaches. This book reflects this diversity by presenting a selection of recent developments within the area of pattern recognition and related fields. It covers theoretical advances in classification and feature extraction as well as application-oriented works. The authors of these 25 works present and advocate recent achievements of their research related to the field of pattern recognition.

Directory of Open Access Books (DOAB)

Topics in Harmonic Analysis, Sparse Representations, and Data Analysis

Author: Li Weilin
Publication date: 01/01/2018

Classical harmonic analysis has traditionally focused on linear and invertible transformations. Motivated by modern applications, there is a growing interest in non-linear analysis and synthesis operators. This thesis encompasses applications of computational harmonic analysis, with a strong emphasis on time-frequency methods, to modern problems arising in deep learning, data analysis, imaging, and signal processing. The first focus of this thesis is scattering transforms, which are particular realizations of convolutional neural networks. While the latter use trained convolution kernels, scattering transforms use fixed ones, and this simplification allows mathematicians to develop a model of deep learning. Mallat originally introduced a wavelet scattering transform, but we study a complementary Fourier-based version. We prove that the Fourier scattering transform enjoys properties that make it an effective feature extractor for classification, and we also construct a rotationally invariant modification of this transform. We provide experimental evidence of its effectiveness at representing complicated spectral data. The second focus of this thesis is the mathematical foundations of super-resolution, which is concerned with the recovery of fine details from low-resolution observations. This imaging model can be mathematically formulated as an ill-posed inverse problem in the space of bounded complex measures. While the current theory primarily deals with the recovery of discrete measures with minimum separation greater than the Rayleigh length, we present alternative approaches. One direction exploits Beurling's results on minimal extrapolation to obtain a general theory that is pertinent to a wide class of measures, including those with geometric structure. Another approach is information-theoretic and studies the min-max error for robust super-resolution of discrete measures below the Rayleigh length.

Digital Repository at the University of Maryland