
81 research outputs found

Learning Invariant Representations of Images for Computational Pathology

Author: Lafarge Maxime Wallace
Publication venue: Technische Universiteit Eindhoven
Publication date: 15/02/2021

Pure OAI Repository

Typologically robust statistical machine translation: Understanding and exploiting differences and similarities between languages in machine translation

Author: Daiber J.
Publication date: 01/01/2018

International Migration, Integration and Social Cohesion online publications

Automatic Generation of Training Corpus for Natural Language Processing Tasks

Author: Gulati Anmol
Gupta Shruti
Hoskere Jayakumar
Publication venue: Technical Disclosure Commons
Publication date: 06/10/2020

Machine learning models that perform grammar error correction (GEC) suffer from insufficient training data. This disclosure describes techniques that automatically generate a large corpus of training data for GEC and other natural language processing tasks. With specific user permission, the techniques leverage the edit histories of documents by identifying changes to documents attributable to grammatical corrections by users. The training set for the GEC machine learning model is automatically augmented with sentences known to be ungrammatical (e.g., original text, before revision by the user) or grammatical (e.g., text after revision by the user), and labeled as such. The techniques enable the provision of a very large corpus of training data for grammar error correction or other natural language processing ML models.
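The labeling step described in this abstract can be sketched as follows. This is a minimal illustration, not the disclosure's method: the function name `label_revision_pairs`, the `min_similarity` threshold, and the heuristic that a small token-level edit indicates a grammatical correction are all assumptions made here, since the abstract does not say how grammatical corrections are identified.

```python
from difflib import SequenceMatcher


def label_revision_pairs(revisions, min_similarity=0.8):
    """Turn (before, after) sentence pairs into labeled training examples.

    Assumption: a revision that changes only a small share of the tokens
    is treated as a grammatical correction. Larger rewrites are skipped.
    """
    examples = []
    for before, after in revisions:
        if before == after:
            continue  # no edit, nothing to learn from
        # Token-level similarity between the two versions, in [0, 1].
        ratio = SequenceMatcher(None, before.split(), after.split()).ratio()
        if ratio >= min_similarity:
            examples.append((before, "ungrammatical"))
            examples.append((after, "grammatical"))
    return examples


# Hypothetical edit history: one small correction, one full rewrite.
history = [
    ("He go to school every day .", "He goes to school every day ."),
    ("Hello world", "Completely different sentence here"),
]
labeled = label_revision_pairs(history)
```

Only the first pair passes the similarity threshold, so the rewrite in the second pair contributes no examples. A real pipeline would need a stronger signal than edit size to separate grammatical fixes from changes in meaning.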

Technical Disclosure Commons

2nd Conference on Language, Data and Knowledge (LDK 2019), May 20–23, 2019, Leipzig, Germany

Author: Buitelaar Paul
Chiarcos Christian
de Melo Gerard
Dojchinovski Milan
Eskevich Maria
Fäth Christian
Klimek Bettina
McCrae John P.
Publication date: 27/04/2023

OPUS Augsburg

FST Morphology for the Endangered Skolt Sami Language

Author: Hämäläinen Mika
Rueter Jack
Publication venue: European Language Resources Association (ELRA)
Publication date: 01/01/2020

Peer reviewed

arXiv.org e-Print Archive

Helsingin yliopiston digitaalinen arkisto

XV. Magyar Számítógépes Nyelvészeti Konferencia

Publication date: 01/01/2019

University of Szeged

Exploiting Spatio-Temporal Coherence for Video Object Detection in Robotics

Author: Fernandez-Chaves David
Gonzalez-Jimenez Javier
Matez-Bandera Jose Luis
Monroy Javier
Petkov Nicolai
Ruiz-Sarmiento Jose Raul
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2021

This paper proposes a method to enhance video object detection for indoor environments in robotics. Concretely, it exploits knowledge about the camera motion between frames to propagate previously detected objects to successive frames. The proposal is rooted in the concepts of planar homography, to propose regions of interest in which to find objects, and recursive Bayesian filtering, to integrate observations over time. The proposal is evaluated on six virtual indoor environments, accounting for the detection of nine object classes over a total of ∼7k frames. Results show that our proposal improves the recall and the F1-score by factors of 1.41 and 1.27, respectively, and achieves a significant reduction of the object categorization entropy (58.8%) when compared to a two-stage video object detection method used as baseline, at the cost of small time overheads (120 ms) and precision loss (0.92).
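The idea of integrating observations over time with a recursive Bayesian filter, and of measuring categorization entropy, can be illustrated with a short sketch. It is a generic discrete Bayes update under assumed inputs, not the paper's implementation: the function names, the three-class example, and the per-frame detector scores are hypothetical.

```python
import math


def bayes_update(prior, likelihood):
    """One recursive Bayes step: posterior is proportional to prior * likelihood."""
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def entropy(dist):
    """Shannon entropy in bits of a discrete class distribution."""
    return -sum(p * math.log2(p) for p in dist if p > 0)


# Hypothetical example: three object classes, uniform initial belief,
# and the same detector scores observed for one object in two frames.
uniform = [1 / 3, 1 / 3, 1 / 3]
belief = uniform
for scores in ([0.6, 0.3, 0.1], [0.6, 0.3, 0.1]):
    belief = bayes_update(belief, scores)
```

Each consistent observation concentrates the belief on one class, so the entropy of `belief` falls below that of the uniform prior. This is the kind of entropy reduction the abstract reports, although the paper's actual filter and observation model may differ.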

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

Fishing massive black hole binaries with THAMES

Author: Chandra Koustav
Pai Archana
Sharma Kritti
Publication date: 04/08/2022

Hierarchical mergers in a dense environment are one of the primary formation channels of intermediate-mass black hole (IMBH) binary systems. We expect that the resulting massive binary system will exhibit mass asymmetry. The emitted gravitational waves (GWs) carry a significant contribution from higher-order modes and hence have complex waveform morphology due to the superposition of different modes. Further, IMBH binaries exhibit lower merger frequency and shorter signal duration in the LIGO detector, which increases the risk of them being misclassified as short-duration noisy glitches. Deep learning algorithms can be trained to discriminate noisy glitches from short GW transients. We present $\mathtt{THAMES}$, a deep-learning-based end-to-end signal detection algorithm for GW signals from quasi-circular, nearly edge-on, mass-asymmetric IMBH binaries in advanced GW detectors. Our study shows that it outperforms matched-filter-based $\mathtt{PyCBC}$ searches for higher-mass-asymmetric, nearly edge-on IMBH binaries. The maximum gain in the sensitive volume-time product for mass ratio $q \in (5, 10)$ is by a factor of 5.24 (2.92) against the $\mathtt{PyCBC-IMBH}$ ($\mathtt{PyCBC-HM}$) search at a false alarm rate of 1 in 100 years. Compared to the broad $\mathtt{PyCBC}$ search, this factor is $\sim100$ for $q \in (10,18)$. One of the reasons for this leap in volumetric sensitivity is its ability to discriminate between signals with complex waveform morphology and noisy transients, clearly demonstrating the potential of deep learning algorithms in probing complex signal morphology in the field of gravitational wave astronomy. With the current training set, $\mathtt{THAMES}$ slightly underperforms with respect to $\mathtt{PyCBC}$-based searches targeting intermediate-mass black hole binaries with mass ratio $q \in (5, 10)$ and detector-frame total mass $M_T(1+z) \in (100,200)~M_\odot$.

Comment: 21 pages, 19 figures

arXiv.org e-Print Archive

A Computational Lexicon and Representational Model for Arabic Multiword Expressions

Author: Alghamdi Ayman Ahmad O.
Publication venue: University of Leeds
Publication date: 01/10/2018

The phenomenon of multiword expressions (MWEs) is increasingly recognised as a serious and challenging issue that has attracted the attention of researchers in various language-related disciplines. Research in these many areas has emphasised the primary role of MWEs in the process of analysing and understanding language, particularly in the computational treatment of natural languages. Ignoring MWE knowledge in any NLP system reduces the possibility of achieving high precision outputs. However, despite the enormous wealth of MWE research and language resources available for English and some other languages, research on Arabic MWEs (AMWEs) still faces multiple challenges, particularly in key computational tasks such as extraction, identification, evaluation, language resource building, and lexical representations. This research aims to remedy this deficiency by extending knowledge of AMWEs and making noteworthy contributions to the existing literature in three related research areas on the way towards building a computational lexicon of AMWEs. First, this study develops a general understanding of AMWEs by establishing a detailed conceptual framework that includes a description of an adopted AMWE concept and its distinctive properties at multiple linguistic levels. Second, in the use of AMWE extraction and discovery tasks, the study employs a hybrid approach that combines knowledge-based and data-driven computational methods for discovering multiple types of AMWEs. Third, this thesis presents a representative system for AMWEs which consists of multilayer encoding of extensive linguistic descriptions. This project also paves the way for further in-depth AMWE-aware studies in NLP and linguistics to gain new insights into this complicated phenomenon in standard Arabic. 
The implications of this research relate to the vital role of the AMWE lexicon, as a new lexical resource, in the improvement of various ANLP tasks and the potential opportunities this lexicon provides for linguists to analyse and explore AMWE phenomena.

White Rose E-theses Online