Optical Graph Recognition

Author: Reislhuber Josef (M. Sc.)
Publication date: 08/01/2018

Graphs are an important model for representing structural information between objects. Objects are modeled as nodes, and a binary relation between objects is modeled as edges. Graphs have many uses, e.g., in the social sciences, life sciences, and engineering. There are two primary representations: abstract and visual. The abstract representation is well suited for processing graphs by computer and is given by an adjacency list, an adjacency matrix, or any abstract data structure. A visual representation is used by human users, who prefer a picture. Common terms are diagram, scheme, plan, or network. The objective of Graph Drawing is to transform a graph into a visual representation called the drawing of a graph. The goal is a “nice” drawing. In this thesis we introduce Optical Graph Recognition. Optical Graph Recognition (OGR) reverses Graph Drawing and transforms a digital image of a graph into an abstract representation. Our approach consists of four phases: Preprocessing, where we determine which pixels of an image are part of the graph; Segmentation, where we recognize the nodes; Topology Recognition, where we detect the edges; and Postprocessing, where we enrich the recognized graph with additional information. We apply established digital image processing methods and make use of the special property that the image contains nodes that are connected by edges. We have focused on developing algorithms that need as few parameters as possible or that calibrate their parameters automatically. Most false recognition results are caused by crossing edges, as these make tracing the edges difficult and can lead to other recognition errors. We have evaluated hand-drawn and computer-drawn graphs. Our algorithms have a very high recognition rate for computer-drawn graphs, e.g., from a set of 100000 computer-drawn graphs over 90% were correctly recognized. Most false recognition results were observed for hand-drawn graphs, as they can include drawing errors and inaccuracies.
For universal usability we have implemented a prototype called OGRup for mobile devices like smartphones or tablet computers. With our software it is possible to directly take a picture of a graph via a built-in camera, recognize the graph, and then use the result for further processing. Furthermore, in order to gain more insight into the way a person draws a graph by hand, we have conducted a field study.
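
The abstract names the adjacency list and the adjacency matrix as abstract representations of a graph. As a minimal illustrative sketch (not taken from the thesis or from OGRup; the function names and the sample edges are hypothetical), the two representations of a small undirected graph could look like this:

```python
def adjacency_list(edges):
    """Build an undirected adjacency list from (u, v) node pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def adjacency_matrix(adj):
    """Convert an adjacency list into a 0/1 matrix with sorted node order."""
    nodes = sorted(adj)
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for u, neighbours in adj.items():
        for v in neighbours:
            matrix[index[u]][index[v]] = 1
    return nodes, matrix


# Hypothetical sample graph: four nodes, four edges.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
adj = adjacency_list(edges)
nodes, matrix = adjacency_matrix(adj)
```

The adjacency list stores only existing edges and suits sparse graphs, while the matrix answers "is there an edge between u and v?" with a single lookup.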

QCompere @ REPERE 2013

Author: Barras Claude
Besacier Laurent
Bredin Hervé
Ekenel Hazim Kemal
Fortier Guillaume
Hua Gao
Le Viet-Bac
Mignon Alexis
Poignant Johann
Quénot Georges
Rosset Sophie
Roy Anindya
Sarkar Achintya
Stiefelhagen Rainer
Tapaswi Makarand
Verbeek Jakob
Yang Qian
Publication venue: HAL CCSD
Publication date: 22/08/2013

We describe the QCompere consortium submissions to the REPERE 2013 evaluation campaign. The REPERE challenge aims at gathering four communities (face recognition, speaker identification, optical character recognition and named entity detection) towards the same goal: multimodal person recognition in TV broadcast. First, four mono-modal components are introduced (one for each foregoing community), constituting the elementary building blocks of our various submissions. Then, depending on the target modality (speaker or face recognition) and on the task (supervised or unsupervised recognition), four different fusion techniques are introduced: they can be summarized as propagation-, classifier-, rule- or graph-based approaches. Finally, their performance is evaluated on the REPERE 2013 test set and their advantages and limitations are discussed.

Hal - Université Grenoble Alpes

Graph Distillation for Action Detection with Privileged Modalities

Author: Bingbing Ni
C Zach
HS Koppula
J Liu
L Shao
M Liu
M Yu
R Caruana
SJ Pan
V Escorcia
V Vapnik
W Li
Z Ding
Z Qin
Publication date: 27/07/2018

We propose a technique that tackles action detection in multimodal videos under a realistic and challenging condition in which only limited training data and partially observed modalities are available. Common methods in transfer learning do not take advantage of the extra modalities potentially available in the source domain. On the other hand, previous work on multimodal learning only focuses on a single domain or task and does not handle the modality discrepancy between training and testing. In this work, we propose a method termed graph distillation that incorporates rich privileged information from a large-scale multimodal dataset in the source domain, and improves the learning in the target domain where training data and modalities are scarce. We evaluate our approach on action classification and detection tasks in multimodal videos, and show that our model outperforms the state of the art by a large margin on the NTU RGB+D and PKU-MMD benchmarks. The code is released at http://alan.vision/eccv18_graph/. Comment: ECCV 2018

arXiv.org e-Print Archive

Crossref

A Unified Multilingual Handwriting Recognition System using multigrams sub-lexical units

Author: Paquet Thierry
Soullard Yann
Swaileh Wassim
Publication venue: Elsevier BV
Publication date: 28/08/2018

We address the design of a unified multilingual system for handwriting recognition. Most multilingual systems rest on specialized models that are each trained on a single language, one of which is selected at test time. While some recognition systems are based on a unified optical model, dealing with a unified language model remains a major issue, as traditional language models are generally trained on corpora composed of large word lexicons per language. Here, we bring a solution by considering language models based on sub-lexical units, called multigrams. Dealing with multigrams strongly reduces the lexicon size and thus decreases the language model complexity. This makes possible the design of an end-to-end unified multilingual recognition system where both a single optical model and a single language model are trained on all the languages. We discuss the impact of the language unification on each model and show that our system reaches the performance of state-of-the-art methods with a strong reduction in complexity. Comment: preprint

arXiv.org e-Print Archive

HAL - Normandie Université

Semi-Supervised First-Person Activity Recognition in Body-Worn Video

Author: Akar Osman
Bertozzi Andrea L.
Brantingham P. Jeffrey
Chen Honglin
Dhillon Adam
Haberland Matt
Li Hao
Song Alexander
Zhou Tiankuang
Publication date: 18/04/2019

Body-worn cameras are now commonly used for logging daily life, sports, and law enforcement activities, creating a large volume of archived footage. This paper studies the problem of classifying frames of footage according to the activity of the camera-wearer, with an emphasis on application to real-world police body-worn video. Real-world datasets pose a different set of challenges from existing egocentric vision datasets: the amount of footage of different activities is unbalanced, the data contains personally identifiable information, and in practice it is difficult to provide substantial training footage for a supervised approach. We address these challenges by extracting features based exclusively on motion information, then segmenting the video footage using a semi-supervised classification algorithm. On publicly available datasets, our method achieves results comparable to, if not better than, supervised and/or deep learning methods using a fraction of the training data. It also shows promising results on real-world police body-worn video.

arXiv.org e-Print Archive

eScholarship - University of California

Similarity suppression algorithm for designing pattern discrimination filters

Author: Selviah D.R.
Stamos E.
Publication date: 01/01/2002

UCL Discovery