6 research outputs found

    Fully convolutional neural networks for newspaper article segmentation

    Segmenting newspaper pages into articles whose parts semantically belong together is a necessary prerequisite for article-based information retrieval on print media collections such as archives and libraries. It is challenging due to vastly differing newspaper layouts, various content types and different languages, but commercially very relevant, e.g. for media monitoring. We present a semantic segmentation approach based on the visual appearance of each page. We apply a fully convolutional neural network (FCN) that we train in an end-to-end fashion to transform the input image into a segmentation mask in one pass. We show experimentally that the FCN performs very well: it outperforms a deep learning-based commercial solution by a large margin in terms of segmentation quality while also being two orders of magnitude more computationally efficient.

    Analyse multicouche de la structure et de la forme des journaux

    Understanding newspaper structure and design remains a challenging task due to the complex composition of pages with many visual and textual elements. Current approaches have focused on simple design types and analysed only broad classes of page components. In this paper, we propose an approach to obtain a comprehensive understanding of a newspaper page through a multi-layered analysis of structure and design. Our system takes images of newspaper front pages as input and consists of two parts. The first uses a combination of computer vision techniques to segment newspapers with complex layouts into meaningful blocks of varying degrees of granularity. The second classifies each identified block with a convolutional neural network (CNN). The final output is a visualization of the various layers of design elements present in the newspaper. Compared to previous approaches, our method introduces a much larger set of design-related labels (23 labels versus fewer than 10 previously), resulting in a very fine description of the pages, with high accuracy (83%). Overall, this automated analysis has potential applications such as cross-medium content adaptation, digital archiving, and UX design.

    Deep learning in the wild

    Invited paper. Deep learning with neural networks is applied by an increasing number of people outside of classic research environments, due to the vast success of the methodology on a wide range of machine perception tasks. While this interest is fueled by beautiful success stories, practical work in deep learning on novel tasks without existing baselines remains challenging. This paper explores the specific challenges arising in real-world tasks, based on case studies from research & development in conjunction with industry, and extracts lessons learned from them. It thus fills a gap between the publication of the latest algorithmic and methodical developments and the usually omitted nitty-gritty of how to make them work. Specifically, we give insight into deep learning projects on face matching, print media monitoring, industrial quality control, music scanning, strategy game playing, and automated machine learning, thereby providing best practices for deep learning in practice.

    News Article Layout Extraction from Bitmaps Files

    The aim of this thesis is the extraction of newspaper articles from bitmap files. The state of the art in object detection is surveyed, with a focus on R-CNN architectures. These methods were implemented with detectron2, a Python library. The provided newspaper datasets also had to be preprocessed: the pages were converted to bitmap files, and annotation files were created from the provided XML files. The experiments mainly explore how the model behaves with different training datasets. A secondary output of the thesis is the detection of individual newspaper elements.
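The annotation step mentioned in this abstract (building annotation files from XML) can be illustrated with a minimal sketch. The abstract does not describe the XML schema or the target format, so everything here is an assumption: the `<page>`/`<region>` layout, the attribute names, the category list, and the choice of a COCO-style dictionary as output (a format that detectron2 can load). The function name `pages_to_coco` is hypothetical.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical page XML; the thesis abstract does not describe its real schema.
SAMPLE_XML = """
<page file="page_001.png" width="2000" height="3000">
  <region type="article" x="100" y="150" w="800" h="1200"/>
  <region type="article" x="950" y="150" w="900" h="600"/>
  <region type="image" x="950" y="800" w="900" h="500"/>
</page>
"""


def pages_to_coco(xml_strings, categories=("article", "image")):
    """Convert page XML strings into one COCO-style annotation dict."""
    cat_ids = {name: i + 1 for i, name in enumerate(categories)}
    coco = {
        "images": [],
        "annotations": [],
        "categories": [{"id": i, "name": n} for n, i in cat_ids.items()],
    }
    ann_id = 1
    for image_id, xml_text in enumerate(xml_strings, start=1):
        page = ET.fromstring(xml_text)
        coco["images"].append({
            "id": image_id,
            "file_name": page.get("file"),
            "width": int(page.get("width")),
            "height": int(page.get("height")),
        })
        for region in page.iter("region"):
            label = region.get("type")
            if label not in cat_ids:
                continue  # skip region types that are not in the category list
            x, y, w, h = (float(region.get(k)) for k in ("x", "y", "w", "h"))
            coco["annotations"].append({
                "id": ann_id,
                "image_id": image_id,
                "category_id": cat_ids[label],
                "bbox": [x, y, w, h],  # COCO boxes are [x, y, width, height]
                "area": w * h,
                "iscrowd": 0,
            })
            ann_id += 1
    return coco


coco = pages_to_coco([SAMPLE_XML])
print(json.dumps(coco["annotations"][0]))
```

Once written to disk with `json.dump`, a file in this shape could be registered with detectron2's `register_coco_instances` helper; whether the thesis used that route is not stated in the abstract.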

    Метод передачі даних за допомогою нейронної мережі

    This thesis is devoted to the development and study of a method of data transmission using a neural network. Methods of text processing using neural networks were studied. The architecture and operating principle of autoencoders, and the loss functions used to train them, were analysed. Autoencoders were built for different types and sizes of data. The training parameters of the neural network were selected empirically. To test the effectiveness of the method, a client-server program running over TCP was developed. Data transmission speed over a local area network was tested without compression and with prior compression, on different types and sizes of data. A comparative analysis of the developed method against the baseline data transmission method, comparing the amount of data transmitted in one minute, showed a reduction in network load when the proposed method is used.
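The comparison described in this abstract (sending data over TCP with and without prior compression) can be sketched as follows. This is not the thesis's implementation: the learned autoencoder is replaced here by `zlib`, a lossless stand-in, purely so that the example runs with the standard library. The length-prefixed framing and the helper names (`send_frame`, `recv_frame`, `transfer`) are likewise assumptions, and the sketch counts bytes on the wire rather than measuring throughput over one minute.

```python
import socket
import struct
import threading
import zlib


def send_frame(sock, payload):
    """Send one message prefixed with its length as a 4-byte big-endian integer."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes, since recv may return fewer than requested."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed early")
        buf.extend(chunk)
    return bytes(buf)


def recv_frame(sock):
    """Receive one length-prefixed message."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)


def transfer(data, encode, decode):
    """Send data over loopback TCP; return (decoded data, bytes on the wire)."""
    result = {}
    server = socket.create_server(("127.0.0.1", 0))  # port 0: OS picks a free port
    port = server.getsockname()[1]

    def serve():
        conn, _ = server.accept()
        with conn:
            result["data"] = decode(recv_frame(conn))

    worker = threading.Thread(target=serve)
    worker.start()
    encoded = encode(data)  # stand-in for the autoencoder's encoder half
    with socket.create_connection(("127.0.0.1", port)) as client:
        send_frame(client, encoded)
    worker.join()
    server.close()
    return result["data"], len(encoded) + 4  # payload plus length prefix


text = ("The quick brown fox jumps over the lazy dog. " * 200).encode()
plain, plain_bytes = transfer(text, lambda b: b, lambda b: b)
packed, packed_bytes = transfer(text, zlib.compress, zlib.decompress)
print(plain_bytes, packed_bytes)
```

An autoencoder would take the place of the `encode`/`decode` pair, with the encoder half running on the client and the decoder half on the server. Unlike `zlib`, it is generally lossy, so the received data would have to be compared against a reconstruction tolerance instead of byte equality.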