Chronological Profiling for Paleography
This paper approaches manuscript dating from a Bayesian perspective. Prior work on paleographic date recovery has generally sought to predict a single date for a manuscript. Bayesian analysis makes it possible to estimate a probability distribution that varies with respect to time. This in turn enables a number of alternative analyses that may be of more use to practitioners. For example, it may be useful to identify a range of years that will include a document’s creation date with a particular confidence level. The methods are demonstrated on a selection of Syriac documents created prior to 1300 CE.
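As a rough illustration of the year-range idea in the abstract above, the following minimal sketch finds the narrowest contiguous range of years whose posterior mass reaches a chosen level. It assumes the posterior is already available as one probability per candidate year; the function name `credible_interval`, its parameters, and the toy numbers in the usage note are hypothetical and are not taken from the paper.

```python
import numpy as np

def credible_interval(years, posterior, level=0.9):
    """Return the narrowest contiguous year range with posterior mass >= level.

    Hypothetical helper: `years` is a sorted sequence of candidate years and
    `posterior` holds one non-negative weight per year.
    """
    years = np.asarray(years)
    p = np.asarray(posterior, dtype=float)
    p = p / p.sum()                              # normalize to a distribution
    cum = np.concatenate(([0.0], np.cumsum(p)))  # cum[k] = mass of the first k years
    best = (0, len(p) - 1)                       # fall back to the full range
    for start in range(len(p)):
        # Smallest index i with cum[i] >= cum[start] + level (small tolerance
        # for floating-point rounding); the interval then ends at year i - 1.
        target = cum[start] + level - 1e-12
        end = int(np.searchsorted(cum, target, side="left")) - 1
        if end >= len(p):
            break                                # not enough mass left from here on
        end = max(end, start)
        if end - start < best[1] - best[0]:
            best = (start, end)
    return int(years[best[0]]), int(years[best[1]])
```

For example, calling `credible_interval(range(1000, 1005), [0.02, 0.08, 0.8, 0.08, 0.02], level=0.95)` on this made-up posterior selects the three central years, because no shorter run of consecutive years holds 95% of the mass.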
Isolated Character Forms from Dated Syriac Manuscripts
This paper describes a set of hand-isolated character samples selected from securely dated manuscripts written in Syriac between 300 and 1300 C.E., which are being made available for research purposes. The collection can be used for a number of applications, including ground truth for character segmentation and form analysis for paleographical dating. Several applications based upon convolutional neural networks demonstrate the possibilities of the data set.
Re-ranking for Writer Identification and Writer Retrieval
Automatic writer identification is a common problem in document analysis. State-of-the-art methods typically focus on the feature extraction step with traditional or deep-learning-based techniques. In retrieval problems, re-ranking is a commonly used technique to improve the results. Re-ranking refines an initial ranking result by using the knowledge contained in the ranked result, e.g., by exploiting nearest neighbor relations. To the best of our knowledge, re-ranking has not been used for writer identification/retrieval. A possible reason might be that publicly available benchmark datasets contain only a few samples per writer, which makes re-ranking less promising. We show that a re-ranking step based on k-reciprocal nearest neighbor relationships is advantageous for writer identification, even if only a few samples per writer are available. We use these reciprocal relationships in two ways: by encoding them into new vectors, as originally proposed, or by integrating them in terms of query expansion. We show that both techniques outperform the baseline results in terms of mAP on three writer identification datasets.
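The k-reciprocal idea in the abstract above can be sketched roughly as follows. Two samples are treated as k-reciprocal neighbors when each appears in the other's top-k list, and a simple form of query expansion averages each descriptor with its reciprocal neighbors. This is a simplified illustration, not the paper's method: it assumes L2-normalized float descriptors compared by cosine similarity, and the function names and the plain averaging rule are hypothetical.

```python
import numpy as np

def k_reciprocal_sets(feats, k):
    """For each row i, list the indices j where j is in i's top-k and i is in j's top-k."""
    sim = feats @ feats.T                    # cosine similarity for L2-normalized rows
    np.fill_diagonal(sim, -np.inf)           # a sample is never its own neighbor
    topk = np.argsort(-sim, axis=1)[:, :k]   # k most similar samples per row
    neigh = [set(row.tolist()) for row in topk]
    return [sorted(j for j in neigh[i] if i in neigh[j]) for i in range(len(feats))]

def expand_queries(feats, k=2):
    """Average each descriptor with its k-reciprocal neighbors, then re-normalize.

    Assumes no averaged descriptor is the zero vector.
    """
    recip = k_reciprocal_sets(feats, k)
    expanded = np.stack([feats[[i] + r].mean(axis=0) for i, r in enumerate(recip)])
    return expanded / np.linalg.norm(expanded, axis=1, keepdims=True)
```

A new ranking could then be obtained by recomputing similarities between the expanded descriptors. Requiring the neighbor relation to hold in both directions is what keeps a sample that merely lies close to many others from being pulled into the expansion.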