Search CORE

323 research outputs found

A Deep Understanding of Structural and Functional Behavior of Tabular and Graphical Modules in Technical Documents

Author: Alexiou Michail
Publication venue: CORE Scholar
Publication date: 01/01/2021
Field of study

The rapid increase of published research papers in recent years has escalated the need for automated ways to process and understand them. The successful recognition of the information that is contained in technical documents, depends on the understanding of the document’s individual modalities. These modalities include tables, graphics, diagrams and etc. as defined in Bourbakis’ pioneering work. However, the depth of understanding is correlated to the efficiency of detection and recognition. In this work, a novel methodology is proposed for automatic processing of and understanding of tables and graphics images in technical document. Previous attempts on tables and graphics understanding retrieve only superficial knowledge such as table contents and axis values. However, the focus on capturing the internal associations and relations between the extracted data from each figure is studied here. The proposed methodology is divided into the following steps: 1) figure detection, 2) figure recognition, 3) figure understanding, by figures we mean tables, graphics and diagrams. More specifically, we evaluate different heuristic and learning methods for classifying table and graphics images as part of the detection module. Table recognition and deep understanding includes the extraction of the knowledge that is illustrated in a table image along with the deeper associations between the table variables. The graphics recognition module follows a clustering based approach in order to recognize middle points. Middle points are 2D points where the direction of the curves changes. They delimit the straight line segments that construct the graphics curves. We use these detected middle points in order to understand various features of each line segment and the associations between them. Additionally, we convert the extracted internal tabular associations and the captured curves’ structural and functional behavior into a common and at the same time unique form of representation, which is the Stochastic Petri-net (SPN) graphs. The use of SPN graphs allow for the merging of different document modalities through the functions that describe them, without any prior knowledge about what these functions are. Finally, we achieve a higher level of document understanding through the synergistic merging of the aforementioned SPN graphs that we extract from the table and graphics modalities. We provide results from every step of the document modalities understanding methodologies and the synergistic merging as proof of concept for this research

CORE

Mathematical Formula Recognition and Automatic Detection and Translation of Algorithmic Components into Stochastic Petri Nets in Scientific Documents

Author: Kostalia Elisavet Elli
Publication venue: CORE Scholar
Publication date: 01/01/2021
Field of study

A great percentage of documents in scientific and engineering disciplines include mathematical formulas and/or algorithms. Exploring the mathematical formulas in the technical documents, we focused on the mathematical operations associations, their syntactical correctness, and the association of these components into attributed graphs and Stochastic Petri Nets (SPN). We also introduce a formal language to generate mathematical formulas and evaluate their syntactical correctness. The main contribution of this work focuses on the automatic segmentation of mathematical documents for the parsing and analysis of detected algorithmic components. To achieve this, we present a synergy of methods, such as string parsing according to mathematical rules, Formal Language Modeling, optical analysis of technical documents in forms of images, structural analysis of text in images, and graph and Stochastic Petri Net mapping. Finally, for the recognition of the algorithms, we enriched our rule based model with machine learning techniques to acquire better results

CORE

Architectural Refinement in HETS

Author: Codescu Mihai
Publication venue
Publication date: 01/01/2012
Field of study

The main objective of this work is to bring a number of improvements to the Heterogeneous Tool Set HETS, both from a theoretical and an implementation point of view. In the first part of the thesis we present a number of recent extensions of the tool, among which declarative specifications of logics, generalized theoroidal comorphisms, heterogeneous colimits and integration of the logic of the term rewriting system Maude. In the second part we concentrate on the CASL architectural refinement language, that we equip with a notion of refinement tree and with calculi for checking correctness and consistency of refinements. Soundness and completeness of these calculi is also investigated. Finally, we present the integration of the VSE refinement method in HETS as an institution comorphism. Thus, the proof manangement component of HETS remains unmodified

E-LIB Dokumentserver - Staats und Universitätsbibliothek Bremen

A Relational Triple Extraction Method Based on Feature Reasoning for Technological Patents

Author: Du Junping
Fang Runze
Guan Zeli
Shao Yingxia
Publication venue
Publication date: 06/10/2022
Field of study

The relation triples extraction method based on table filling can address the issues of relation overlap and bias propagation. However, most of them only establish separate table features for each relationship, which ignores the implicit relationship between different entity pairs and different relationship features. Therefore, a feature reasoning relational triple extraction method based on table filling for technological patents is proposed to explore the integration of entity recognition and entity relationship, and to extract entity relationship triples from multi-source scientific and technological patents data. Compared with the previous methods, the method we proposed for relational triple extraction has the following advantages: 1) The table filling method that saves more running space enhances the speed and efficiency of the model. 2) Based on the features of existing token pairs and table relations, reasoning the implicit relationship features, and improve the accuracy of triple extraction. On five benchmark datasets, we evaluated the model we suggested. The result suggest that our model is advanced and effective, and it performed well on most of these datasets

arXiv.org e-Print Archive

Probabilistic Tractable Models in Mixed Discrete-Continuous Domains

Author: Acharya
Albarghouthi
Bacchus
Bach
Baldoni
Baldoni
Barrett
Bekker
Belle
Belle
Belle
Bengio
Chan
Chavira
Chistikov
Dheeru
Dougherty
Fierens
Gebelein
Gens
Heckerman
Hsu
Kolb
Koller
Kulessa
Lauritzen
Liang
Lopez-Cruz
Molina
Molina
Morettin
Murphy
Murphy
Nitti
Peharz
Peharz
Peharz
Poon
Rashwan
Ravkic
Sanner
Schwarz
Shenoy
Shenoy
Speichert
Trapp
Valiant
Vergari
Vergari
Yang
Zeng
Publication venue: 'MIT Press - Journals'
Publication date: 02/06/2021
Field of study

Crossref

Edinburgh Research Explorer

Recommended from our members

Towards Segment-level Video Understanding: Detecting Activities from Untrimmed Videos

Author: Zhang Da
Publication venue: eScholarship, University of California
Publication date: 01/01/2020
Field of study

We generate massive amounts of video data every day. While most real-world videos are long and untrimmed with sparsely localized segments of interest, existing AI systems that can interpret videos today often rely on static image analysis or can only process temporal information in a short video snippet. To automatically understand the content of long video streams, this thesis mainly describes the efforts to design accurate, efficient, and intelligent deep learning algorithms for temporal activity detection in untrimmed videos. Detecting segments of interest from untrimmed videos is a key step towards segment-level video understanding. Depending on the purposes of tasks being performed, we address three different activity detection tasks: detecting activities of interest from videos without specific purposes (i.e., temporal activity detection); detecting temporal segment that best corresponds to a language query (i.e., natural language moment retrieval); and detecting activities given less supervision (i.e., weakly-supervised or few-shot activity detection).In temporal activity detection, We first propose a highly unified single-shot temporal activity detector based on fully 3D convolutional networks, by eliminating explicit temporal proposal and classification stages. Evaluations show that it achieves state-of-the-art on temporal activity detection while being super efficient to operate at 1271 FPS. We then investigate how to effectively apply a multi-scale architecture to model activities with various temporal length and frequency. We propose three novel architecture designs: (1) dynamic temporal sampling; (2) two-branch feature hierarchy; (3) multi-scale contextual feature fusion, and we combine all these components into a uniform network and achieve the state-of-the-art on a much larger temporal activity detection benchmark.In natural language moment retrieval, we aim to localize the segment that best corresponds to a given language query. We present a language-guided temporal attention module and an iterative graph adjustment network to handle the semantic and structural misalignment between video and language. The proposed model demonstrates superior capability to handle temporal relations, thus, significantly improves the state-of-the-art by a large margin.Finally, we study the problem of weakly-supervised and few-shot temporal activity detection to mitigate the drawbacks of huge amounts of supervision needed to train a temporal detection model. Namely, we answer the question if we can learn a temporal activity detector under weak supervision that is able to localize unseen activity classes. A novel meta-learning based detection method is accordingly proposed by adopting the few-shot learning technique of Relation Network. Results show that our method achieves performance superior or competitive to state-of-the-art approaches with stronger supervision.In summary, we propose a suite of algorithms and solutions to automatically detect segments of interest in long untrimmed videos. We hope our studies could provide insights for researchers to explore new deep learning paradigms for future computer vision research, especially on video-related topics

eScholarship - University of California

A Survey on Recent Named Entity Recognition and Relation Classification Methods with Focus on Few-Shot Learning Approaches

Author: Alqaaidi Sakher
Bozorgi Elika
Publication venue
Publication date: 29/10/2023
Field of study

Named entity recognition and relation classification are key stages for extracting information from unstructured text. Several natural language processing applications utilize the two tasks, such as information retrieval, knowledge graph construction and completion, question answering and other domain-specific applications, such as biomedical data mining. We present a survey of recent approaches in the two tasks with focus on few-shot learning approaches. Our work compares the main approaches followed in the two paradigms. Additionally, we report the latest metric scores in the two tasks with a structured analysis that considers the results in the few-shot learning scope

arXiv.org e-Print Archive

Combining Representation Learning with Logic for Language Processing

Author: Rocktäschel Tim
Publication venue
Publication date: 27/12/2017
Field of study

The current state-of-the-art in many natural language processing and automated knowledge base completion tasks is held by representation learning methods which learn distributed vector representations of symbols via gradient-based optimization. They require little or no hand-crafted features, thus avoiding the need for most preprocessing steps and task-specific assumptions. However, in many cases representation learning requires a large amount of annotated training data to generalize well to unseen data. Such labeled training data is provided by human annotators who often use formal logic as the language for specifying annotations. This thesis investigates different combinations of representation learning methods with logic for reducing the need for annotated training data, and for improving generalization.Comment: PhD Thesis, University College London, Submitted and accepted in 201

arXiv.org e-Print Archive

UCL Discovery