6 research outputs found
An Online Content Based Email Attachments Retrieval System
E-mail is one of the most widely used applications today. With continuous daily use, thousands of messages accumulate in most users' mailboxes, which over time makes it difficult to retrieve the attachments of those messages. Most e-mail providers have steadily improved their search technology, but one capability is still largely missing: searching inside attachments. Some providers, such as Gmail, have added keyword search inside attachments for some file types (.pdf files, .doc documents, .ppt presentations), but this feature is not yet supported for image files. Neither e-mail providers nor recent research have focused on retrieving image attachments from the mailbox. This paper introduces the novel idea of using Content-Based Image Retrieval (CBIR) in an e-mail application to retrieve images from e-mail attachments based on their entire contents. The work has two main phases: the first connects to the e-mail server to read messages and extracts colour-based features, and the second retrieves similar image attachments. Tests were carried out on an inbox containing 100 messages with 500 image attachments and gave good precision and recall rates when the threshold value was less than or equal to 0.4.
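As an illustration of the retrieval phase described above, the following minimal sketch compares coarse RGB colour histograms and keeps the attachments whose distance to the query is at most 0.4. The abstract does not specify the colour features or the distance measure, so the histogram quantization, the half-L1 distance and the names `color_histogram`, `histogram_distance` and `retrieve_similar` are all assumptions; only the 0.4 threshold comes from the abstract.

```python
from collections import Counter


def color_histogram(pixels, bins_per_channel=4):
    """Quantize RGB pixels (0-255 per channel) into a normalized histogram."""
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {bin_: n / total for bin_, n in counts.items()}


def histogram_distance(h1, h2):
    """Half the L1 distance between two normalized histograms, in [0, 1]."""
    bins = set(h1) | set(h2)
    return 0.5 * sum(abs(h1.get(b, 0.0) - h2.get(b, 0.0)) for b in bins)


def retrieve_similar(query_pixels, attachments, threshold=0.4):
    """Return (name, distance) pairs with distance <= threshold, nearest first.

    `attachments` maps an attachment name to its list of RGB pixels
    (a hypothetical in-memory stand-in for images read from a mail server).
    """
    query_hist = color_histogram(query_pixels)
    scored = [
        (name, histogram_distance(query_hist, color_histogram(pixels)))
        for name, pixels in attachments.items()
    ]
    matches = [pair for pair in scored if pair[1] <= threshold]
    return sorted(matches, key=lambda pair: pair[1])
```

Reading the messages and decoding the image attachments into pixel lists is left out here; any mail and imaging library could supply the `attachments` mapping.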
Content-based indexing of low resolution documents
At multimedia presentations, it is increasingly common for attendees to photograph slides
that interest them using capture devices.
To make these images more useful, they can be linked to an image or
video database. Such a database can be used for file archiving, teaching
and learning, research and knowledge management, all of which involve image search.
However, images captured with such devices, including cameras and mobile phones, are of low
resolution and suffer from poor lighting and noise. Content-Based Image Retrieval
(CBIR) is considered one of the most interesting and promising fields in
image search. Image search is concerned with finding the images in a given
image database that are similar to a known query image. This thesis
is concerned with methods for identifying documents that are
captured using image capturing devices, and with a
technique for retrieving images from an indexed image database. Both
apply digital image processing techniques. To build an indexed
structure for fast and high quality content-based retrieval of an image, some existing
representative signatures and the key indexes used have been revised. Retrieval
performance depends heavily on how the indexing is done. Existing retrieval
approaches make use of shape, colour and
texture features. When these features are applied to individual
databases, the majority of retrieval approaches give poor results on low resolution
documents, take a lot of time and, in some cases, return
irrelevant images for the given query image. The identification and indexing method proposed in
the thesis uses a Visual Signature (VS). The VS consists of the graphical information of the captured slide’s
textual layout, shape moments and the spatial distribution of colour.
This signature-based approach is intended to provide fast and efficient
matching to fulfil the needs of real-time applications. The approach can also
overcome the problems of low resolution documents, such as noisy images,
varying lighting conditions and complex backgrounds. We present
hierarchical indexing techniques based on trees and clustering. K-means
clustering is used for visual features such as colour, since their spatial distribution gives
good global information about an image. Tree indexing of the extracted layout and shape
features is structured hierarchically, and Euclidean distance is used to find similar
images for CBIR. The assessment of the proposed indexing scheme is
based on recall and precision, the standard measures of CBIR retrieval performance. We
develop a CBIR system and conduct various retrieval experiments with the
fundamental aim of comparing image retrieval accuracy. A new algorithm
that can be used with integrated visual signatures, especially in late fusion queries, is
introduced. The algorithm is able to reduce the shortcomings associated
with normalisation in the initial fusion technique. Slides from conference, lecture and
meeting presentations are used as real data to compare the proposed technique’s performance
with that of existing approaches. The findings of the
thesis present exciting possibilities, as the CBIR system is able to produce high
quality results even for queries that use low resolution documents. For future work,
the use of multimodal signatures, relevance feedback and artificial intelligence
techniques is recommended to further enhance the performance of CBIR systems.
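The Euclidean-distance ranking and the recall/precision assessment mentioned in this abstract can be sketched as follows. This is a minimal illustration and not the thesis's indexing scheme: the signatures are plain numeric vectors, the database is a flat dictionary instead of a tree or cluster index, and the names `euclidean`, `rank_by_distance` and `precision_recall` are assumptions.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_by_distance(query, database):
    """Return database keys ordered from nearest to farthest signature."""
    return sorted(database, key=lambda key: euclidean(query, database[key]))


def precision_recall(retrieved, relevant):
    """Precision and recall of a retrieved list against a set of relevant items."""
    hits = len(set(retrieved) & set(relevant))
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

A typical use is to take the first k entries of `rank_by_distance(query, database)` and pass them to `precision_recall` together with the ground-truth relevant items for that query.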
Development of an integrated environment for the analysis and classification of mammographic images
Mammography is an effective and safe method to diagnose breast cancer. However,
the interpretation of mammograms involves difficulties for the radiologists.
Hence, Computer Aided Diagnosis (CAD) systems have been developed to provide
radiologists a second opinion. Microcalcifications are benign or malignant
findings that relate to breast cancer.
In this master’s thesis we present the CAD system Hippocrates-mst, which is based
on the analysis of single microcalcifications (MCs) and clusters. The implementation
includes: a) patient archiving, b) mammographic image analysis, c) MC
detection and analysis, and d) final diagnosis based on the SVM algorithm.
During this project a combined classification scheme has been developed to
classify mammographic images. This scheme consists of an SVM classifier and a
new classifier we have created. The SVM is trained with a small group of
features that were selected after calculations. The other classifier
categorizes new mammograms based on their content and relies on the calculation
of distances between the feature vector of the unknown image and the known
images. The decision is based on majority voting regarding the nearest known
images. The final prediction arises from the combination of the predictions of
the two classifiers by applying a simple rule.
In addition, we updated the online MIRACLE database of mammographic images.
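The distance-based classifier and the combination step described in this abstract can be sketched roughly as below. The abstract gives neither the number of neighbours nor the "simple rule", so `k=3` and the rule in `combine` (report malignant if either classifier does) are hypothetical choices, as are the function names; the SVM itself is not implemented and its prediction is simply passed in as a label.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_vote(unknown, known, k=3):
    """Majority label among the k known feature vectors nearest to `unknown`.

    `known` is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(known, key=lambda item: euclidean(unknown, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def combine(svm_label, knn_label):
    """Hypothetical combination rule: 'malignant' if either classifier says so."""
    return "malignant" if "malignant" in (svm_label, knn_label) else "benign"
```

An odd `k` avoids tied votes when there are only two classes.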
Indexing and Pictorial Documentation of Watermarks in Online Databases
The study of watermarks is a standard method in many source-oriented disciplines such as musicology and medieval studies. The first online watermark databases emerged in the 1990s. They gave researchers access to extensive comparative material for dating, attribution and authentication. However, indexing and pictorially documenting watermarks is a challenge, because they are complex non-textual objects. The thesis analyses and evaluates current watermark databases and discusses concepts for optimizing indexing and information retrieval. It first considers the specific subject domain of watermarks. Building on this, it formulates content-related and information-science requirements for indexing languages in watermark indexing. At the centre of the thesis is the analysis and evaluation of the database "Wasserzeicheninformationssystem Deutschland (WZIS)" (Watermark Information System Germany). The use of faceted indexing languages is discussed as an optimization strategy.
Framework for Automatic Identification of Paper Watermarks with Chain Codes
Title from PDF of title page, viewed May 21, 2018. Dissertation advisor: Reza Derakhshani. Vita. Includes bibliographical references (pages 220-235). Thesis (Ph.D.)--School of Computing and Engineering. University of Missouri--Kansas City, 2017.
In this dissertation, I present a new framework for automated description, archiving, and
identification of paper watermarks found in historical documents and manuscripts. The early
manufacturers of paper introduced the embedding of identifying marks and patterns as a sign
of a distinct origin and perhaps as a signature of quality. Thousands of watermarks have been
studied, classified, and archived. Most of the classification categories are based on image similarity
and are searchable based on a set of defined contextual descriptors. The novel method presented
here is for automatic classification, identification (matching) and retrieval of watermark images
based on chain code descriptors (CC). The approach for generation of unique CC includes a novel
image preprocessing method to provide a solution for rotation and scale invariant representation
of watermarks. The unique codes are truly reversible, providing high ratio lossless compression,
fast searching, and image matching. The development of a novel distance measure for CC
comparison is also presented. Examples for the complete process are given using the recently
acquired watermarks digitized with hyper-spectral imaging of Summa Theologica, the work of
Antonino Pierozzi (1389 – 1459). The performance of the algorithm on large datasets is
demonstrated using watermarks datasets from well-known library catalogue collections.
Introduction -- Paper and paper watermarks -- Automatic identification of paper watermarks -- Rotation, Scale and translation invariant chain code -- Comparison of RST_Invariant chain code -- Automatic identification of watermarks with chain codes -- Watermark composite feature vector -- Summary -- Appendix A. Watermarks from the Bernstein Collection used in this study -- Appendix B. The original and transformed images of watermarks -- Appendix C. The transformed and scaled images of watermarks -- Appendix D. Example of chain cod