Learning Visual Features from Snapshots for Web Search
When applying learning to rank algorithms to Web search, a large number of
features are usually designed to capture the relevance signals. Most of these
features are computed based on the extracted textual elements, link analysis,
and user logs. However, Web pages are not solely linked texts, but have
structured layout organizing a large variety of elements in different styles.
Such layout itself can convey useful visual information, indicating the
relevance of a Web page. For example, the query-independent layout (i.e., raw
page layout) can help identify the page quality, while the query-dependent
layout (i.e., page rendered with matched query words) can further tell rich
structural information (e.g., size, position and proximity) of the matching
signals. However, such visual information of layout has been seldom utilized in
Web search in the past. In this work, we propose to learn rich visual features
automatically from the layout of Web pages (i.e., Web page snapshots) for
relevance ranking. Both query-independent and query-dependent snapshots are
considered as the new inputs. We then propose a novel visual perception model
inspired by human's visual search behaviors on page viewing to extract the
visual features. This model can be learned end-to-end together with traditional
human-crafted features. We also show that such visual features can be
efficiently acquired in the online setting with an extended inverted indexing
scheme. Experiments on benchmark collections demonstrate that learning visual
features from Web page snapshots can significantly improve the performance of
relevance ranking in ad-hoc Web retrieval tasks. Comment: CIKM 201
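As a rough illustration of the idea of treating a page snapshot as an image and turning it into features that sit alongside traditional hand-crafted ones, here is a minimal NumPy sketch. All names, kernel sizes and feature values are invented for illustration; the paper's actual perception model is learned end-to-end with the ranking objective, which this toy fixed-kernel version does not attempt.

```python
import numpy as np

def conv2d_valid(img, kernel):
    """Naive 'valid' 2D correlation of a grayscale image with one kernel."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def snapshot_features(snapshot, kernels, pool=4):
    """Convolve the snapshot with each kernel, apply ReLU and max-pooling,
    and flatten the pooled maps into one visual feature vector."""
    feats = []
    for k in kernels:
        fmap = np.maximum(conv2d_valid(snapshot, k), 0.0)  # ReLU
        h, w = fmap.shape
        fmap = fmap[: h - h % pool, : w - w % pool]        # crop to pool grid
        pooled = fmap.reshape(h // pool, pool, w // pool, pool).max(axis=(1, 3))
        feats.append(pooled.ravel())
    return np.concatenate(feats)

# Toy query-dependent snapshot: a 32x32 grayscale rendering in which matched
# query words would appear as highlighted blocks.
rng = np.random.default_rng(0)
snap = rng.random((32, 32))
kernels = [rng.standard_normal((3, 3)) for _ in range(2)]

visual = snapshot_features(snap, kernels)
handcrafted = np.array([0.7, 0.2, 1.3])  # e.g. BM25, PageRank, click rate (made up)
ranking_input = np.concatenate([visual, handcrafted])  # fed to a learning-to-rank model
```

The point of the sketch is only the data flow: snapshot in, pooled visual vector out, concatenated with the usual text/link/log features before ranking.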
Content based retrieval of PET neurological images
Medical image management has posed challenges to many researchers, especially when the images have to be indexed and retrieved using their visual content that is meaningful to clinicians. In this study, an image retrieval system has been developed for 3D brain PET (positron emission tomography) images. It has been found that PET neurological images can be retrieved based upon their diagnostic status using only data pertaining to their content, and predominantly the visual content.
During the study, PET scans are spatially normalized, using existing techniques, and their visual data is quantified. The mid-sagittal plane of each individual 3D PET scan is found and then utilized in the detection of abnormal asymmetries, such as tumours or physical injuries. All the asymmetries detected are referenced to the Talairach and Tournoux anatomical atlas. The Cartesian co-ordinates in Talairach space of a detected lesion are employed, along with the associated anatomical structure(s), as the indices within the content based image retrieval system. The anatomical atlas is then also utilized to isolate distinct anatomical areas that are related to a number of neurodegenerative disorders. After segmentation of the anatomical regions of interest, algorithms are applied to characterize the texture of brain intensity using Gabor filters and to elucidate the mean index ratio of activation levels. These measurements are combined to produce a single feature vector that is incorporated into the content based image retrieval system.
Experimental results on images with known diagnoses show that physical lesions such as head injuries and tumours can be, to a certain extent, detected correctly. Images with correctly detected and measured lesions are then retrieved from the database of images when a query pertains to the measured locale. Images with neurodegenerative disorder patterns have been indexed and retrieved via texture-based features. Retrieval accuracy is increased, for images from patients diagnosed with dementia, by combining the texture feature and the mean index ratio value.
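The texture descriptor described above (Gabor responses plus a mean index ratio) can be sketched roughly as follows. The filter parameters, region size and the single-slice simplification are all assumptions made for illustration; the study's actual filter bank and atlas-based masking are not specified here.

```python
import numpy as np

def filter2d(img, kern):
    """'Same'-size correlation with zero padding (kept dependency-free)."""
    kh, kw = kern.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)))
    out = np.zeros(img.shape)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kern)
    return out

def gabor_kernel(freq, theta, sigma=3.0, size=15):
    """Real part of a Gabor filter: a Gaussian envelope modulating a
    sinusoid at spatial frequency `freq` and orientation `theta`."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    return np.exp(-(x**2 + y**2) / (2 * sigma**2)) * np.cos(2 * np.pi * freq * xr)

def gabor_texture_features(region, freqs=(0.1, 0.2), n_orient=4):
    """Mean and std of each Gabor response over a segmented region,
    concatenated into one texture descriptor."""
    feats = []
    for f in freqs:
        for k in range(n_orient):
            resp = np.abs(filter2d(region, gabor_kernel(f, k * np.pi / n_orient)))
            feats += [resp.mean(), resp.std()]
    return np.array(feats)

def mean_index_ratio(region, reference):
    """Mean activation in the region relative to a reference region."""
    return region.mean() / reference.mean()

# Toy 2D patch standing in for a segmented anatomical region of a PET slice:
rng = np.random.default_rng(0)
region = rng.random((32, 32))
descriptor = np.append(gabor_texture_features(region), mean_index_ratio(region, region))
```

The combined `descriptor` plays the role of the single feature vector the abstract mentions; in the real system it would be indexed per anatomical region.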
Multi-feature approach for writer-independent offline signature verification
Some of the fundamental problems facing handwritten signature verification are the large number of users, the large number of features, the limited number of reference signatures for training, the high intra-personal variability of the signatures and the unavailability of forgeries as counterexamples. This research first presents a survey of offline signature verification techniques, focusing on the feature extraction and verification strategies. The goal is to present the most important advances, as well as the current challenges in this field. Of particular interest are the techniques that allow for designing a signature verification system based on a limited amount of data. Next, a novel offline signature verification system is presented, based on multiple feature extraction techniques, dichotomy transformation and boosting feature selection. Using multiple feature extraction techniques increases the diversity of information extracted from the signature, thereby producing features that mitigate intra-personal variability, while dichotomy transformation ensures writer-independent classification, thus relieving the verification system from the burden of a large number of users. Finally, using boosting feature selection allows for a low-cost writer-independent verification system that selects features while learning. As such, the proposed system provides a practical framework to explore and learn from problems with numerous potential features. Comparison with simulation results from systems found in the literature confirms the viability of the proposed system, even when only a single reference signature is available. The proposed system provides an efficient solution to a wide range of problems (e.g. biometric authentication) with limited training samples, new training samples emerging during operations, numerous classes, and few or no counterexamples.
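The dichotomy transformation at the heart of the writer-independent design can be shown in a few lines: a (questioned, reference) signature pair is mapped into a distance vector, so genuine-vs-forgery becomes a single two-class problem shared by all writers. The feature values and the simple mean-distance threshold below are invented for illustration; the thesis instead selects thresholded features via boosting.

```python
import numpy as np

def dichotomy_transform(f_query, f_reference):
    """Map a (questioned, reference) signature pair into distance space:
    one 2-class problem (genuine vs forgery) for ALL writers."""
    return np.abs(np.asarray(f_query) - np.asarray(f_reference))

# Toy multi-feature vectors (e.g. contour, texture and grid features stacked):
ref  = np.array([0.9, 0.1, 0.4, 0.7])   # one reference signature
genu = np.array([0.8, 0.2, 0.5, 0.6])   # genuine attempt by the same writer
forg = np.array([0.1, 0.9, 0.9, 0.1])   # forgery

u_gen = dichotomy_transform(genu, ref)  # small componentwise distances
u_for = dichotomy_transform(forg, ref)  # large componentwise distances

def verify(u, threshold=0.5):
    """Writer-independent decision: accept if the mean distance is small.
    (Illustrative only; boosting feature selection would instead pick
    which distance components to threshold.)"""
    return bool(u.mean() < threshold)
```

Because the classifier operates on distance vectors rather than on writer-specific feature spaces, one trained model serves every user, including users enrolled after training.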
CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines
Based on the information provided by European projects and national initiatives related to multimedia search, as well as domain experts who participated in the CHORUS Think-Tanks and workshops, this document reports on the state of the art related to multimedia content search from a technical and socio-economic perspective.
The technical perspective includes an up-to-date view on content based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives to measure the performance of multimedia search engines.
From a socio-economic perspective, we take stock of the impact and legal consequences of these technical advances and point out future directions of research.
Evaluation and analysis of hybrid intelligent pattern recognition techniques for speaker identification
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. The rapid momentum of technological progress in recent years has led to a tremendous rise in the use of biometric authentication systems. The objective of this research is to investigate the problem of identifying a speaker from his or her voice regardless of the content (i.e. text-independent), and to design efficient methods of combining face and voice to produce a robust authentication system.
A novel approach towards speaker identification is developed using wavelet analysis and multiple neural networks, including the Probabilistic Neural Network (PNN), General Regression Neural Network (GRNN) and Radial Basis Function Neural Network (RBF NN), combined with an AND voting scheme. This approach is tested on the GRID and VidTIMIT corpora, and comprehensive test results have been validated against state-of-the-art approaches. The system was found to be competitive: it improved the recognition rate by 15% compared to the classical Mel-frequency Cepstral Coefficients (MFCC), and reduced the recognition time by 40% compared to the Back Propagation Neural Network (BPNN), Gaussian Mixture Models (GMM) and Principal Component Analysis (PCA).
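One plausible reading of the AND voting scheme over the three networks is unanimity on the predicted identity, which can be sketched as follows. The exact voting rule and the speaker labels are assumptions for illustration; the thesis may define the scheme differently.

```python
def and_vote_identity(predictions):
    """AND voting across classifiers (e.g. PNN, GRNN and RBF NN): accept a
    speaker identity only if every network predicts the same one,
    otherwise reject the utterance as undecided."""
    return predictions[0] if len(set(predictions)) == 1 else None

# Each network independently names a speaker for the same test utterance:
unanimous = and_vote_identity(["spk3", "spk3", "spk3"])  # agreement -> accept
split     = and_vote_identity(["spk3", "spk7", "spk3"])  # disagreement -> reject
```

The conservative AND rule trades a few rejections of genuine utterances for fewer confident misidentifications, which fits an authentication setting.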
Another novel approach using vowel formant analysis is implemented using Linear Discriminant Analysis (LDA). Vowel formant based speaker identification is well suited to real-time implementation and requires only a few bytes of information to be stored for each speaker, making it both storage and time efficient. Tested on GRID and VidTIMIT, the proposed scheme was found to be 85.05% accurate when Linear Predictive Coding (LPC) is used to extract the vowel formants, which is much higher than the accuracy of BPNN and GMM. Since the proposed scheme does not require any training time other than creating a small database of vowel formants, it is faster as well. Furthermore, an increasing number of speakers makes it difficult for BPNN and GMM to sustain their accuracy, but the accuracy of the proposed score-based methodology degrades almost linearly.
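The score-based matching step behind the small per-speaker formant database might look like the sketch below. The formant values, the two-formant simplification and the Euclidean scoring are all invented for illustration; the LPC extraction of the formants and the LDA stage are omitted.

```python
import numpy as np

# Hypothetical enrolment database: mean (F1, F2) formants in Hz for one
# vowel per speaker -- only a few bytes need to be stored per speaker.
formant_db = {
    "spk1": np.array([300.0, 2300.0]),
    "spk2": np.array([600.0, 1200.0]),
    "spk3": np.array([450.0, 1700.0]),
}

def identify(query_formants, db):
    """Score-based identification: return the enrolled speaker whose
    stored vowel formants are nearest to the measured ones."""
    scores = {spk: np.linalg.norm(query_formants - ref) for spk, ref in db.items()}
    return min(scores, key=scores.get)

# Formants measured (e.g. via LPC) from a test utterance of the same vowel:
who = identify(np.array([310.0, 2250.0]), formant_db)
```

Because identification reduces to a nearest-template lookup, adding a speaker only adds one small entry to the database, which is consistent with the near-linear scaling the abstract reports.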
Finally, a novel audio-visual fusion based identification system is implemented using GMM and MFCC for speaker identification and PCA for face recognition. The results of speaker identification and face recognition are fused at different levels, namely the feature, score and decision levels. Both the score-level and decision-level (with OR voting) fusions were shown to outperform feature-level fusion in terms of accuracy and error resilience. This result is in line with the distinct nature of the two modalities, which lose their individual character when combined at the feature level. The GRID and VidTIMIT test results validate that the proposed scheme is one of the best candidates for the fusion of face and voice due to its low computational time and high recognition accuracy.
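The difference between score-level and decision-level (OR) fusion can be made concrete with a minimal sketch. The equal weighting, the score values and the acceptance semantics are assumptions for illustration only, not the thesis's tuned configuration.

```python
def score_fusion(voice_score, face_score, w=0.5):
    """Score-level fusion: a weighted sum of the normalised GMM voice
    score and PCA face score (the weight here is illustrative)."""
    return w * voice_score + (1 - w) * face_score

def decision_or_fusion(voice_accept, face_accept):
    """Decision-level fusion with OR voting: accept the claimed identity
    if either modality accepts it on its own."""
    return voice_accept or face_accept

fused = score_fusion(0.9, 0.5)            # strong voice evidence lifts the total
accepted = decision_or_fusion(False, True)  # face alone is enough under OR voting
```

Keeping the modalities separate until the score or decision stage is what lets each one fail independently, which is the error resilience the abstract attributes to these two fusion levels.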