RAPID ANALYTICAL VERIFICATION OF HANDWRITTEN ALPHANUMERIC ADDRESS FIELDS
Microsoft, Motorola, Siemens, Hitachi, IAPR, NICI, IUF
This paper presents a combination of a fuzzy system and a dynamic analytical model to deal with imprecise data derived from feature extraction in handwritten address images, which are compared against postulated addresses for address verification. A dynamic building-number locator is able to locate and recognise the building number without knowing exactly where it starts in the candidate address line. The overall system achieved a correct sorting rate of 72.9%, a rejection rate of 27.1%, and an error rate of 0.0% on a blind test set of 450 cursive handwritten addresses.
One-Class Classification: Taxonomy of Study and Review of Techniques
One-class classification (OCC) algorithms aim to build classification models
when the negative class is either absent, poorly sampled or not well defined.
This unique situation constrains the learning of efficient classifiers,
because the class boundary must be defined using only knowledge of the positive class. The OCC
problem has been considered and applied under many research themes, such as
outlier/novelty detection and concept learning. In this paper we present a
unified view of the general problem of OCC by presenting a taxonomy of study
for OCC problems, which is based on the availability of training data,
algorithms used and the application domains applied. We further delve into each
of the categories of the proposed taxonomy and present a comprehensive
literature review of the OCC algorithms, techniques and methodologies with a
focus on their significance, limitations and applications. We conclude our
paper by discussing some open research problems in the field of OCC and present
our vision for future research.
Comment: 24 pages + 11 pages of references, 8 figures
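As a rough illustration of defining a class boundary from positive examples only, the sketch below fits a centroid and a radius to the positive samples and accepts a new point if it falls inside that radius. This is not a technique taken from the survey; the function names `fit_one_class` and `is_positive`, the `quantile` parameter, and the sample data are all assumptions made for the example.

```python
import math
from typing import List, Sequence, Tuple


def fit_one_class(
    positives: Sequence[Sequence[float]], quantile: float = 1.0
) -> Tuple[List[float], float]:
    """Fit a centroid and radius using positive samples only (hypothetical helper).

    quantile=1.0 makes the radius cover every training point; a smaller
    value leaves the farthest training points outside the boundary.
    """
    n = len(positives)
    dim = len(positives[0])
    # The centroid is the per-dimension mean of the positive samples.
    centroid = [sum(p[i] for p in positives) / n for i in range(dim)]
    # Sorted distances from each training point to the centroid.
    dists = sorted(math.dist(p, centroid) for p in positives)
    index = min(n - 1, max(0, math.ceil(quantile * n) - 1))
    return centroid, dists[index]


def is_positive(x: Sequence[float], centroid: Sequence[float], radius: float) -> bool:
    """Accept x if it lies within the learned radius of the centroid."""
    return math.dist(x, centroid) <= radius


# Illustrative data: no negative examples are used during fitting.
train = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.2)]
centroid, radius = fit_one_class(train)
print(is_positive((1.0, 1.05), centroid, radius))  # close to the training data
print(is_positive((3.0, 3.0), centroid, radius))   # far from the training data
```

A fixed-radius boundary like this is only a baseline; it is shown here because it makes the absence of negative training data explicit.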
Speaker Recognition in Content-based Image Retrieval for a High Degree of Accuracy
The purpose of this research is to measure speaker recognition accuracy in Content-Based Image Retrieval. The recognition system uses two approaches: identification, performed with a Mamdani fuzzy system, and verification, performed with the Manhattan distance. In the test results, the best mean distance is obtained at a size of 32x32, and the best verification distance rate is 965. The speaker recognition system has a standard error of 5% and an accuracy of 95%, an increase in accuracy of almost 2.5%. The improvement is attributed to combining the two approaches.
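The verification step named in the abstract above relies on the Manhattan distance. A minimal sketch of distance-based verification follows, assuming that each speaker is represented by a fixed-length feature vector and that a claim is accepted when the distance to the enrolled template is below a threshold. The feature values, the `verify` helper, and the threshold are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def manhattan_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute coordinate differences (L1 distance)."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


def verify(probe: Sequence[float], template: Sequence[float], threshold: float) -> bool:
    """Accept the identity claim if the probe is close enough to the template."""
    return manhattan_distance(probe, template) <= threshold


# Illustrative feature vectors and threshold (assumed values).
enrolled = [0.2, 0.5, 0.9]
genuine = [0.25, 0.45, 0.85]
impostor = [0.9, 0.1, 0.2]
THRESHOLD = 0.5

print(verify(genuine, enrolled, THRESHOLD))   # small distance, claim accepted
print(verify(impostor, enrolled, THRESHOLD))  # large distance, claim rejected
```

In practice the threshold would be tuned on held-out data to balance false acceptances against false rejections.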
Improving speaker turn embedding by crossmodal transfer learning from face embedding
Learning speaker turn embeddings has shown considerable improvement in
situations where conventional speaker modeling approaches fail. However, this
improvement is relatively limited when compared to the gain observed in face
embedding learning, which has been proven very successful for face verification
and clustering tasks. Assuming that face and voices from the same identities
share some latent properties (like age, gender, ethnicity), we propose three
transfer learning approaches to leverage the knowledge from the face domain
(learned from thousands of images and identities) for tasks in the speaker
domain. These approaches, namely target embedding transfer, relative distance
transfer, and clustering structure transfer, utilize the structure of the
source face embedding space at different granularities to regularize the target
speaker turn embedding space as optimizing terms. Our methods are evaluated on
two public broadcast corpora and yield promising advances over competitive
baselines in verification and audio clustering tasks, especially when dealing
with short speaker utterances. The analysis of the results also gives insight
into characteristics of the embedding spaces and shows their potential
applications.