Search CORE

31 research outputs found

DCU's experiments in NTCIR-8 IR4QA task

Author: Jiang Jie
Jones Gareth J.F.
Leveling Johannes
Min Jinming
Way Andy
Publication venue: 'National Institute of Informatics (NII)'
Publication date: 01/01/2010
Field of study

We describe DCU's participation in the NTCIR-8 IR4QA task [16]. This task is a cross-language information retrieval(CLIR) task from English to Simplified Chinese which seeks to provide relevant documents for later cross language question answering (CLQA) tasks. For the IR4QA task, we submitted 5 official runs including two monolingual runs and three CLIR runs. For the monolingual retrieval we tested two information retrieval models. The results show that the KL-Divergence language model method performs better than the Okapi BM25 model for the Simplified Chinese retrieval task. This agrees with our previous CLIR experimental results at NTCIR-5. For the CLIR task, we compare query translation and document translation methods. In the query translation based runs, we tested a method for query expansion from external resource (QEE) before query translation. Our result for this run is slightly lower than the run without QEE. Our results show that the document translation method achieves 68.24% MAP performance compared to our best query translation run. For the document translation method, we found that the main issue is the lack of named entity translation in the documents since we do not have a suitable parallel corpus for training data for the statistical machine translation system. Our best CLIR run comes from the combination of query translation using Google translate and the KL-Divergence language model retrieval method. It achieves 79.94% MAP relative to our best monolingual run

DCU Online Research Access Service

Predicting visual context for unsupervised event segmentation in continuous photo-streams

Author: Bolanos Marc
Dang-Nguyen Duc-Tien
del Molino Ana Garcia
del Molino Ana Garcia
Gygli Michael
Lee Yong Jae
Lin Jie
Lin Wei-Hao
Ng Hamg Wei
Srivastava Nitish
Yamamoto Shuhei
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 07/08/2018
Field of study

Segmenting video content into events provides semantic structures for indexing, retrieval, and summarization. Since motion cues are not available in continuous photo-streams, and annotations in lifelogging are scarce and costly, the frames are usually clustered into events by comparing the visual features between them in an unsupervised way. However, such methodologies are ineffective to deal with heterogeneous events, e.g. taking a walk, and temporary changes in the sight direction, e.g. at a meeting. To address these limitations, we propose Contextual Event Segmentation (CES), a novel segmentation paradigm that uses an LSTM-based generative network to model the photo-stream sequences, predict their visual context, and track their evolution. CES decides whether a frame is an event boundary by comparing the visual context generated from the frames in the past, to the visual context predicted from the future. We implemented CES on a new and massive lifelogging dataset consisting of more than 1.5 million images spanning over 1,723 days. Experiments on the popular EDUB-Seg dataset show that our model outperforms the state-of-the-art by over 16% in f-measure. Furthermore, CES' performance is only 3 points below that of human annotators.Comment: Accepted for publication at the 2018 ACM Multimedia Conference (MM '18

arXiv.org e-Print Archive

Crossref

Institutional Knowledge at Singapore Management University

A knowledge regularized hierarchical approach for emotion cause analysis

Author: Bing Lidong
Du Jiachen
Fan Chuang
Gui Lin
Mao Ruibin
Xu Ruifeng
Yan Hongyu
Yang Min
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2019
Field of study

Emotion cause analysis, which aims to identify the reasons behind emotions, is a key topic in sentiment analysis. A variety of neural network models have been proposed recently, however, these previous models mostly focus on the learning architecture with local textual information, ignoring the discourse and prior knowledge, which play crucial roles in human text comprehension. In this paper, we propose a new method to extract emotion cause with a hierarchical neural model and knowledge-based regularizations, which aims to incorporate discourse context information and restrain the parameters by sentiment lexicon and common knowledge. The experimental results demonstrate that our proposed method achieves the state-of-the-art performance on two public datasets in different languages (Chinese and English), outperforming a number of competitive baselines by at least 2.08% in F-measure

Crossref

Warwick Research Archives Portal Repository

Contextual search and exploration

Author: Clarke C.L.A.
Kamps J.
Kiseleva J.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016
Field of study

International Migration, Integration and Social Cohesion online publications

Making Presentation Math Computable

Author: Greiner-Petter André
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 20/01/2023
Field of study

This Open-Access-book addresses the issue of translating mathematical expressions from LaTeX to the syntax of Computer Algebra Systems (CAS). Over the past decades, especially in the domain of Sciences, Technology, Engineering, and Mathematics (STEM), LaTeX has become the de-facto standard to typeset mathematical formulae in publications. Since scientists are generally required to publish their work, LaTeX has become an integral part of today's publishing workflow. On the other hand, modern research increasingly relies on CAS to simplify, manipulate, compute, and visualize mathematics. However, existing LaTeX import functions in CAS are limited to simple arithmetic expressions and are, therefore, insufficient for most use cases. Consequently, the workflow of experimenting and publishing in the Sciences often includes time-consuming and error-prone manual conversions between presentational LaTeX and computational CAS formats. To address the lack of a reliable and comprehensive translation tool between LaTeX and CAS, this thesis makes the following three contributions. First, it provides an approach to semantically enhance LaTeX expressions with sufficient semantic information for translations into CAS syntaxes. Second, it demonstrates the first context-aware LaTeX to CAS translation framework LaCASt. Third, the thesis provides a novel approach to evaluate the performance for LaTeX to CAS translations on large-scaled datasets with an automatic verification of equations in digital mathematical libraries. This is an open access book

Directory of Open Access Books (DOAB)

Making Presentation Math Computable

Author: Greiner-Petter André
Publication venue: 'Springer Science and Business Media LLC'
Publication date
Field of study

OAPEN Library

Tree-Based Representation and Generation of Natural and Mathematical Language

Author: Lan Andrew
Scarlatos Alexander
Publication venue
Publication date: 15/02/2023
Field of study

Mathematical language in scientific communications and educational scenarios is important yet relatively understudied compared to natural languages. Recent works on mathematical language focus either on representing stand-alone mathematical expressions, especially in their natural tree format, or mathematical reasoning in pre-trained natural language models. Existing works on jointly modeling and generating natural and mathematical languages simply treat mathematical expressions as text, without accounting for the rigid structural properties of mathematical expressions. In this paper, we propose a series of modifications to existing language models to jointly represent and generate text and math: representing mathematical expressions as sequences of node tokens in their operator tree format, using math symbol and tree position embeddings to preserve the semantic and structural properties of mathematical expressions, and using a constrained decoding method to generate mathematically valid expressions. We ground our modifications in GPT-2, resulting in a model MathGPT, and demonstrate that it outperforms baselines on mathematical expression generation tasks

arXiv.org e-Print Archive

Developing a distributed electronic health-record store for India

Author: Dowling Jim
Publication venue
Publication date: 01/01/2008
Field of study

The DIGHT project is addressing the problem of building a scalable and highly available information store for the Electronic Health Records (EHRs) of the over one billion citizens of India

RISE – Research Institutes of Sweden

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Swedish Institute of Computer Science Publications Database

Software institutes' Online Digital Archive