
    Chronological Profiling for Paleography

    This paper approaches manuscript dating from a Bayesian perspective. Prior work on paleographic date recovery has generally sought to predict a single date for a manuscript. Bayesian analysis makes it possible to estimate a probability distribution that varies with respect to time. This in turn enables a number of alternative analyses that may be of more use to practitioners. For example, it may be useful to identify a range of years that will include a document’s creation date with a particular confidence level. The methods are demonstrated on a selection of Syriac documents created prior to 1300 CE.
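The interval estimate mentioned above can be sketched concretely. Assuming the Bayesian analysis yields a discrete posterior over candidate years, a short routine can find the narrowest contiguous span of years containing a given probability mass (a highest-density-style credible interval). Function and variable names here are illustrative, not taken from the paper:

```python
import numpy as np

def credible_interval(years, posterior, confidence=0.95):
    """Return the narrowest contiguous span of years whose posterior
    mass reaches the requested confidence level."""
    posterior = np.asarray(posterior, dtype=float)
    posterior = posterior / posterior.sum()  # normalize to a distribution
    cumulative = np.concatenate([[0.0], np.cumsum(posterior)])
    best = (years[0], years[-1])
    best_width = years[-1] - years[0]
    for lo in range(len(years)):
        # smallest hi such that the mass of years[lo..hi-1] >= confidence
        hi = np.searchsorted(cumulative, cumulative[lo] + confidence)
        if hi <= len(years):
            width = years[hi - 1] - years[lo]
            if width < best_width:
                best_width = width
                best = (years[lo], years[hi - 1])
    return best
```

For a posterior peaked near a single year this returns a tight bracket around the mode; flatter posteriors yield wider, honestly uncertain ranges.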

    Isolated Character Forms from Dated Syriac Manuscripts

    This paper describes a set of hand-isolated character samples selected from securely dated manuscripts written in Syriac between 300 and 1300 C.E., which are being made available for research purposes. The collection can be used for a number of applications, including ground truth for character segmentation and form analysis for paleographical dating. Several applications based upon convolutional neural networks demonstrate the possibilities of the data set.

    Image Enhancement Background for High Damage Malay Manuscripts using Adaptive Threshold Binarization

    The handwritten Jawi manuscripts kept at the Malaysia National Library (MNL) have aged over many decades. Despite MNL's intensive preservation efforts, these manuscripts are still not in good condition and are neither easy to read nor clear to view. Although many state-of-the-art methods have been developed for image enhancement, none of them can handle extremely degraded manuscripts. The degradation of old Malay manuscripts can be categorized into three types: uneven background, image effects, and expanding patch effects. The aim of this paper is to discuss methods for improving the quality of such manuscripts. The proposed approach consists of several main steps: Local Adaptive Equalization, Image Intensity Values, Automatic Threshold PP, and Adaptive Threshold Filtering. The goal is a clearer image that is easier to read. In terms of bit error rate (TKB), the proposed method (Adaptive Threshold Filtering Process, PAM) achieves a smaller error value of 0.0316 compared with Otsu's Threshold Method (MNAO), the Binary Threshold Value Method (MNAP), and the Automatic Local Threshold Value Method (MNATA). On ink-bleed images, the precision of the proposed method exceeds 95%, compared with 75.82%, 90.68%, and 91.2% for the state-of-the-art methods MNAO, MNAP, and MNATA respectively. On another ink-bleed measure, however, PAM, MNAO, MNAP, and MNATA score 45.74%, 54.80%, 53.23%, and 46.02% respectively. In conclusion, the proposed method produces better character shapes than the other methods.
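To illustrate the general idea behind adaptive (local) threshold binarization — not the paper's exact PAM pipeline — the following sketch compares each pixel against the mean intensity of its local window, which copes with uneven backgrounds better than a single global threshold. The window size and offset are illustrative assumptions:

```python
import numpy as np

def adaptive_threshold(image, window=15, offset=10):
    """Binarize a grayscale page by comparing each pixel to the mean
    intensity of its local window (a common local-threshold scheme;
    illustrative, not the paper's PAM method). Pixels darker than
    (local mean - offset) are treated as ink (0), the rest as
    background (255)."""
    img = np.asarray(image, dtype=float)
    pad = window // 2
    padded = np.pad(img, pad, mode="edge")
    # integral image (summed-area table) for fast window means
    integral = np.pad(padded.cumsum(0).cumsum(1), ((1, 0), (1, 0)))
    h, w = img.shape
    s = (integral[window:window + h, window:window + w]
         - integral[:h, window:window + w]
         - integral[window:window + h, :w]
         + integral[:h, :w])
    local_mean = s / (window * window)
    return np.where(img < local_mean - offset, 0, 255).astype(np.uint8)
```

Because the threshold follows the local background level, dark strokes stay black even where the page itself darkens, which a global Otsu threshold cannot guarantee.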

    The computerization of archaeology: survey on AI techniques

    This paper analyses the application of artificial intelligence techniques to various areas of archaeology and more specifically: a) The use of software tools as a creative stimulus for the organization of exhibitions; the use of humanoid robots and holographic displays as guides that interact with and involve museum visitors; b) The analysis of methods for the classification of fragments found in archaeological excavations and for the reconstruction of ceramics, with the recomposition of the parts of text missing from historical documents and epigraphs; c) The cataloguing and study of human remains to understand the social and historical context of belonging, with a demonstration of the effectiveness of the AI techniques used; d) The detection of particularly difficult terrestrial archaeological sites with the analysis of the architectures of the Artificial Neural Networks most suitable for solving the problems presented by the site; the design of a study for the exploration of marine archaeological sites, located at depths that cannot be reached by humans, through the construction of a freely explorable 3D version.

    Fine Art Pattern Extraction and Recognition

    This is a reprint of articles from the Special Issue published online in the open access journal Journal of Imaging (ISSN 2313-433X) (available at: https://www.mdpi.com/journal/jimaging/special issues/faper2020).

    Exploiting Spatio-Temporal Coherence for Video Object Detection in Robotics

    This paper proposes a method to enhance video object detection for indoor environments in robotics. Concretely, it exploits knowledge about the camera motion between frames to propagate previously detected objects to successive frames. The proposal is rooted in the concepts of planar homography to propose regions of interest where to find objects, and recursive Bayesian filtering to integrate observations over time. The proposal is evaluated on six virtual, indoor environments, accounting for the detection of nine object classes over a total of ∼7k frames. Results show that our proposal improves the recall and the F1-score by a factor of 1.41 and 1.27, respectively, as well as it achieves a significant reduction of the object categorization entropy (58.8%) when compared to a two-stage video object detection method used as baseline, at the cost of small time overheads (120 ms) and precision loss (0.92).
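The two ingredients named in the abstract — homography-based propagation of detections and recursive Bayesian fusion of class beliefs — can each be sketched in a few lines. This is a minimal illustration of the concepts, not the paper's implementation; names and the bounding-box convention are assumptions:

```python
import numpy as np

def propagate_box(box, H):
    """Warp a bounding box (x1, y1, x2, y2) from the previous frame into
    the current one with a planar homography H, then take the axis-aligned
    envelope of the warped corners as the new region of interest."""
    x1, y1, x2, y2 = box
    corners = np.array([[x1, y1, 1], [x2, y1, 1],
                        [x2, y2, 1], [x1, y2, 1]], dtype=float).T
    warped = H @ corners
    warped = warped[:2] / warped[2]  # back from homogeneous coordinates
    return (warped[0].min(), warped[1].min(), warped[0].max(), warped[1].max())

def bayes_update(prior, likelihood):
    """Recursive Bayesian fusion of per-class beliefs: multiply the prior
    by the detector's per-class likelihood and renormalize."""
    posterior = np.asarray(prior, float) * np.asarray(likelihood, float)
    return posterior / posterior.sum()
```

Repeating `bayes_update` frame after frame is what lets consistent observations sharpen the class belief (reducing categorization entropy), while `propagate_box` supplies the region in which to look.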

    Unsupervised quantification of entity consistency between photos and text in real-world news

    In today’s information age, the World Wide Web and social media are important sources for news and information. Different modalities (in the sense of information encoding) such as photos and text are typically used to communicate news more effectively or to attract attention.
Communication scientists, linguists, and semioticians have studied the complex interplay between modalities for decades and investigated, e.g., how their combination can carry additional information or add a new level of meaning. The number of shared concepts or entities (e.g., persons, locations, and events) between photos and text is an important aspect to evaluate the overall message and meaning of an article. Computational models for the quantification of image-text relations can enable many applications. For example, they allow for more efficient exploration of news, facilitate semantic search and multimedia retrieval in large (web) archives, or assist human assessors in evaluating news for credibility. To date, only a few approaches have been suggested that quantify relations between photos and text. However, they either do not explicitly consider the cross-modal relations of entities – which are important in the news – or rely on supervised deep learning approaches that can only detect the cross-modal presence of entities covered in the labeled training data. To address this research gap, this thesis proposes an unsupervised approach that can quantify entity consistency between photos and text in multimodal real-world news articles. The first part of this thesis presents novel approaches based on deep learning for information extraction from photos to recognize events, locations, dates, and persons. These approaches are an important prerequisite to measure the cross-modal presence of entities in text and photos. First, an ontology-driven event classification approach that leverages new loss functions and weighting schemes is presented. It is trained on a novel dataset of 570,540 photos and an ontology with 148 event types. The proposed system outperforms approaches that do not use structured ontology information. 
Second, a novel deep learning approach for geolocation estimation is proposed that uses additional contextual information on the environmental setting (indoor, urban, natural) and from earth partitions of different granularity. The proposed solution outperforms state-of-the-art approaches, which are trained with significantly more photos. Third, we introduce the first large-scale dataset for date estimation with more than one million photos taken between 1930 and 1999, along with two deep learning approaches that treat date estimation as a classification and regression problem. Both approaches achieve very good results that are superior to human annotations. Finally, a novel approach is presented that identifies public persons and their co-occurrences in news photos extracted from the Internet Archive, which collects time-versioned snapshots of web pages that are rarely enriched with metadata relevant to multimedia retrieval. Experimental results confirm the effectiveness of the deep learning approach for person identification. The second part of this thesis introduces an unsupervised approach capable of quantifying image-text relations in real-world news. Unlike related work, the proposed solution automatically provides novel measures of cross-modal consistency for different entity types (persons, locations, and events) as well as the overall context. The approach does not rely on any predefined datasets to cope with the large amount and diversity of entities and topics covered in the news. State-of-the-art tools for natural language processing are applied to extract named entities from the text. Example photos for these entities are automatically crawled from the Web. The proposed methods for information extraction from photos are applied to both news images and example photos to quantify the cross-modal consistency of entities. Two tasks are introduced to assess the quality of the proposed approach in real-world applications. 
    Experimental results for document verification and retrieval of news with either low (potential misinformation) or high cross-modal similarities demonstrate the feasibility of the approach and its potential to support human assessors in studying news.
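The core scoring step described above — comparing a news photo against example photos crawled for an entity — can be sketched as an embedding comparison. This is a hypothetical illustration of the idea; the function name and the choice of maximum cosine similarity as the consistency measure are assumptions, not the thesis's exact formulation:

```python
import numpy as np

def entity_consistency(news_photo_emb, exemplar_embs):
    """Score how consistent a news photo is with an entity mentioned in
    the text: compare the photo's feature embedding against embeddings of
    example photos crawled for that entity, and report the best cosine
    similarity (illustrative sketch)."""
    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    return max(cosine(news_photo_emb, e) for e in exemplar_embs)
```

Aggregating such scores per entity type (persons, locations, events) would yield the type-wise consistency measures the abstract describes; a low score across all exemplars flags a potential image-text mismatch.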

    An overview on user profiling in online social networks

    Advances in online social networks are creating huge volumes of data every day, providing many opportunities for users to express their interests and opinions. Because of the popularity and reach of social networks, many intruders use these platforms for illegal purposes. Identifying such users is challenging and requires extracting extensive knowledge from the data flowing through social media. This work gives an insight into profiling users in online social networks. User profiles are established from the behavioral patterns, correlations, and activities of the user, analyzed from aggregated data using techniques such as clustering, behavioral analysis, content analysis, and face detection. The mechanism used to profile users varies with the application and purpose. Further study of other profiling mechanisms is left to future work.
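Of the techniques listed above, clustering is the most readily sketched: users described by behavioral feature vectors (e.g., posting frequency, share of link posts) can be grouped with a minimal k-means loop. The feature choices and names here are illustrative assumptions, not taken from the survey:

```python
import numpy as np

def kmeans(features, k=2, iters=20, seed=0):
    """Minimal k-means sketch for grouping users by behavioral features
    (illustrative only): assign each user to the nearest center, then
    move each center to the mean of its assigned users."""
    rng = np.random.default_rng(seed)
    X = np.asarray(features, float)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):  # skip empty clusters
                centers[j] = X[labels == j].mean(0)
    return labels, centers
```

Users landing in a sparse or anomalous cluster would then be candidates for the closer behavioral and content analysis the survey describes.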

    RITA: Group Attention is All You Need for Timeseries Analytics

    Timeseries analytics is of great importance in many real-world applications. Recently, the Transformer model, popular in natural language processing, has been leveraged to learn high quality feature embeddings from timeseries, core to the performance of various timeseries analytics tasks. However, the quadratic time and space complexities limit Transformers' scalability, especially for long timeseries. To address these issues, we develop a timeseries analytics tool, RITA, which uses a novel attention mechanism, named group attention, to address this scalability issue. Group attention dynamically clusters the objects based on their similarity into a small number of groups and approximately computes the attention at the coarse group granularity. It thus significantly reduces the time and space complexity, yet provides a theoretical guarantee on the quality of the computed attention. The dynamic scheduler of RITA continuously adapts the number of groups and the batch size in the training process, ensuring group attention always uses the fewest groups needed to meet the approximation quality requirement. Extensive experiments on various timeseries datasets and analytics tasks demonstrate that RITA outperforms the state-of-the-art in accuracy and is significantly faster -- with speedups of up to 63X.
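The group-attention idea — cluster the keys, then attend to group centroids instead of every key — can be sketched as follows. This is an illustrative approximation of the mechanism, not RITA's actual algorithm; the k-means grouping, count weighting, and all names are assumptions:

```python
import numpy as np

def group_attention(Q, K, V, n_groups=4, seed=0):
    """Sketch of group attention: cluster keys into a few groups,
    score each query against group centroids only, and mix the group
    mean values. Complexity drops from O(len(K)) to O(n_groups) per
    query (illustrative, not RITA's exact method)."""
    rng = np.random.default_rng(seed)
    centers = K[rng.choice(len(K), n_groups, replace=False)]
    for _ in range(10):  # a few k-means steps to form the groups
        labels = np.argmin(((K[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for g in range(n_groups):
            if np.any(labels == g):
                centers[g] = K[labels == g].mean(0)
    counts = np.array([max((labels == g).sum(), 1) for g in range(n_groups)])
    group_V = np.stack([V[labels == g].mean(0) if np.any(labels == g)
                        else np.zeros(V.shape[1]) for g in range(n_groups)])
    scores = Q @ centers.T / np.sqrt(Q.shape[1])
    # each centroid stands in for `counts[g]` keys, so scale its score mass
    weights = np.exp(scores) * counts
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ group_V
```

The count scaling reflects that one centroid substitutes for all keys in its group; the tighter the clusters, the closer this comes to full softmax attention, which is the intuition behind an approximation-quality guarantee.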

    SEEKING A COMMON THEME: A STUDY OF CERAMIC EFFIGY ARTIFACTS IN THE PRE-HISPANIC AMERICAN SOUTHWEST AND NORTHERN MEXICO USING COMPUTER IMAGE PATTERN RECOGNITION AND PHYLOGENETIC ANALYSIS

    Effigy artifacts are found throughout the Pre-Hispanic American Southwest and Northern Mexico (PHASNM), as well as in other cultures around the world, with many sharing the same forms and design features. The earliest figurines within the PHASNM were partial anthropomorphic figurines made from fired clay, dating to between A.D. 287 and A.D. 312 (Morss 1954:27). They were found in a pit house village of Bluff Ruin in the Forestdale Valley of eastern Arizona, and they appeared to be associated with the Mogollon culture. The temporal range of the samples examined in this study is from approximately 200 A.D. to 1650 A.D., and the geographical range includes the Southwestern United States (Arizona, New Mexico, Texas, Colorado, and Utah) and the northcentral section of Mexico (Casas Grandes and the surrounding area). This research looks at the similarities among the markings of ceramic effigy artifacts from the PHASNM, using computer image pattern recognition, design analysis, and phylogenetics, to determine whether their ceramic traditions share a common theme and whether the specific method of social learning responsible for the transmission of information relating to ceramic effigy decoration can be identified. Transmission is possible in one of three ways: vertical transmission, where parents/teachers distribute information by encouraging imitation and sharing learned traditions with children/students (Richerson and Boyd 2005; Shennan 2002); horizontal transmission, where information is transmitted among peers, either from within the individual’s group or from interaction with peers from neighboring populations (Borgerhoff Mulder et al. 
    2006), and where the individual comes into contact with a wide range of attributes related to the item of interest and then adopts those that allow for the fastest, most economical methods of production and distribution (Eerkens et al. 2006; Rogers 1983); and oblique transmission, where information is transmitted by adults, masters, or institutions of elite or higher social status, either internally or externally to the adopting cultural type (Jensen 2016; Jordan 2014), and where particular traits are adopted or left out in disproportionate ways, creating patterns in localized traditions that can be empirically identified. Horizontal transmission can be broken into two types: unlimited, where contact is not confined to a particular group; and limited, where contact is restricted to a particular set of contacts. Using criteria for each of the categories as set forth by the New Mexico Office of Archaeological Studies Pottery Typology Project, the samples were classified in terms of cultural area (culture), branch, tradition, ware, and type. The research group consisted of 360 photographic samples represented by 868 images that were resized to a 640x640 pixel format. The images were then examined through computer image pattern recognition (using YOLOv5) and through manual observation. This study resulted in a database representing 230 traits. These traits were assembled into groups by cultural area, branch, tradition, ware, and type, and phylogenetic analysis was applied to show how the different entities transfer information among one another.
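A common first step for the kind of distance-based phylogenetic analysis described above is turning binary presence/absence trait codings into a pairwise distance matrix. The sketch below uses Jaccard distance for this purpose; it is an illustration of the standard technique, not the study's actual code, and the input layout (one row per pottery type, one column per trait) is an assumption:

```python
import numpy as np

def jaccard_distance_matrix(trait_table):
    """Pairwise Jaccard distances between binary trait codings:
    1 - (shared traits / traits present in either), a standard input
    to distance-based tree-building methods (illustrative sketch)."""
    X = np.asarray(trait_table, dtype=bool)
    n = len(X)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            union = np.logical_or(X[i], X[j]).sum()
            inter = np.logical_and(X[i], X[j]).sum()
            D[i, j] = D[j, i] = 1 - inter / union if union else 0.0
    return D
```

Such a matrix can then be fed to a neighbor-joining or hierarchical clustering routine to recover a tree of trait-sharing relationships among cultural areas, branches, traditions, wares, and types.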