15 research outputs found

    Structural Data Recognition with Graph Model Boosting

    Get PDF
    This paper presents a novel method for structural data recognition using a large number of graph models. In general, prevalent methods for structural data recognition have two shortcomings: 1) Only a single model is used to capture structural variation. 2) Naive recognition methods are used, such as the nearest neighbor method. In this paper, we propose strengthening the recognition performance of these models as well as their ability to capture structural variation. The proposed method constructs a large number of graph models and trains decision trees using the models. This paper makes two main contributions. The first is a novel graph model that can quickly perform calculations, which allows us to construct several models in a feasible amount of time. The second contribution is a novel approach to structural data recognition: graph model boosting. Comprehensive structural variations can be captured with a large number of graph models constructed in a boosting framework, and a sophisticated classifier can be formed by aggregating the decision trees. Consequently, we can carry out structural data recognition with powerful recognition capability in the face of comprehensive structural variation. The experiments shows that the proposed method achieves impressive results and outperforms existing methods on datasets of IAM graph database repository.Comment: 8 page

    Efficient PET-CT image retrieval using graphs embedded into a vector space

    Get PDF
    Combined positron emission tomography and computed tomography (PET-CT) produces functional data (from PET) in relation to anatomical context (from CT) and it has made a major contribution to improved cancer diagnosis, tumour localisation, and staging. The ability to retrieve PET-CT images from large archives has potential applications in diagnosis, education, and research. PET-CT image retrieval requires the consideration of modality-specific 3D image features and spatial contextual relationships between features in both modalities. Graph-based retrieval methods have recently been applied to represent contextual relationships during PET-CT image retrieval. However, accurate methods are computationally complex, often requiring offline processing, and are unable to retrieve images at interactive rates. In this paper, we propose a method for PET-CT image retrieval using a vector space embedding of graph descriptors. Our method defines the vector space in terms of the distance between a graph representing a PET-CT image and a set of fixed-sized prototype graphs; each vector component measures the dissimilarity of the graph and a prototype. Our evaluation shows that our method is significantly faster (≈800× speedup, p 0.05)

    Designing labeled graph classifiers by exploiting the R\'enyi entropy of the dissimilarity representation

    Full text link
    Representing patterns as labeled graphs is becoming increasingly common in the broad field of computational intelligence. Accordingly, a wide repertoire of pattern recognition tools, such as classifiers and knowledge discovery procedures, are nowadays available and tested for various datasets of labeled graphs. However, the design of effective learning procedures operating in the space of labeled graphs is still a challenging problem, especially from the computational complexity viewpoint. In this paper, we present a major improvement of a general-purpose classifier for graphs, which is conceived on an interplay between dissimilarity representation, clustering, information-theoretic techniques, and evolutionary optimization algorithms. The improvement focuses on a specific key subroutine devised to compress the input data. We prove different theorems which are fundamental to the setting of the parameters controlling such a compression operation. We demonstrate the effectiveness of the resulting classifier by benchmarking the developed variants on well-known datasets of labeled graphs, considering as distinct performance indicators the classification accuracy, computing time, and parsimony in terms of structural complexity of the synthesized classification models. The results show state-of-the-art standards in terms of test set accuracy and a considerable speed-up for what concerns the computing time.Comment: Revised versio

    Data Reduction in the String Space for Efficient kNN Classification through Space Partitioning

    Get PDF
    Within the Pattern Recognition field, two representations are generally considered for encoding the data: statistical codifications, which describe elements as feature vectors, and structural representations, which encode elements as high-level symbolic data structures such as strings, trees or graphs. While the vast majority of classifiers are capable of addressing statistical spaces, only some particular methods are suitable for structural representations. The kNN classifier constitutes one of the scarce examples of algorithms capable of tackling both statistical and structural spaces. This method is based on the computation of the dissimilarity between all the samples of the set, which is the main reason for its high versatility, but in turn, for its low efficiency as well. Prototype Generation is one of the possibilities for palliating this issue. These mechanisms generate a reduced version of the initial dataset by performing data transformation and aggregation processes on the initial collection. Nevertheless, these generation processes are quite dependent on the data representation considered, being not generally well defined for structural data. In this work we present the adaptation of the generation-based reduction algorithm Reduction through Homogeneous Clusters to the case of string data. This algorithm performs the reduction by partitioning the space into class-homogeneous clusters for then generating a representative prototype as the median value of each group. Thus, the main issue to tackle is the retrieval of the median element of a set of strings. Our comprehensive experimentation comparatively assesses the performance of this algorithm in both the statistical and the string-based spaces. Results prove the relevance of our approach by showing a competitive compromise between classification rate and data reduction.This research work was partially funded by “Programa I+D+i de la Generalitat Valenciana” through grant ACIF/2019/ 042 and the Spanish Ministry through HISPAMUS project TIN2017-86576-R, partially funded by the EU

    A novel shape descriptor based on salient keypoints detection for binary image matching and retrieval

    Get PDF
    We introduce a shape descriptor that extracts keypoints from binary images and automatically detects the salient ones among them. The proposed descriptor operates as follows: First, the contours of the image are detected and an image transformation is used to generate background information. Next, pixels of the transformed image that have specific characteristics in their local areas are used to extract keypoints. Afterwards, the most salient keypoints are automatically detected by filtering out redundant and sensitive ones. Finally, a feature vector is calculated for each keypoint by using the distribution of contour points in its local area. The proposed descriptor is evaluated using public datasets of silhouette images, handwritten math expressions, hand-drawn diagram sketches, and noisy scanned logos. Experimental results show that the proposed descriptor compares strongly against state of the art methods, and that it is reliable when applied on challenging images such as fluctuated handwriting and noisy scanned images. Furthermore, we integrate our descripto
    corecore