263 research outputs found
A Survey on Retrieval of Mathematical Knowledge
We present a short survey of the literature on indexing and retrieval of
mathematical knowledge, with pointers to 72 papers and tentative taxonomies of
both retrieval problems and recurring techniques.Comment: CICM 2015, 20 page
Experiments in lifelog organisation and retrieval at NTCIR
Lifelogging can be described as the process by which individuals use various software and hardware devices to gather large archives of multimodal personal data from multiple sources and store them in a personal data archive, called a lifelog. The Lifelog task at NTCIR was a comparative benchmarking exercise with the aim of encouraging research into the organisation and retrieval of data from multimodal lifelogs. The Lifelog task ran for over 4 years from NTCIR-12 until NTCIR-14 (2015.02–2019.06); it supported participants to submit to five subtasks, each tackling a different challenge related to lifelog retrieval. In this chapter, a motivation is given for the Lifelog task and a review of progress since NTCIR-12 is presented. Finally, the lessons learned and challenges within the domain of lifelog retrieval are presented
SemEval-2016 task 5 : aspect based sentiment analysis
International audienceThis paper describes the SemEval 2016 shared task on Aspect Based Sentiment Analysis (ABSA), a continuation of the respective tasks of 2014 and 2015. In its third year, the task provided 19 training and 20 testing datasets for 8 languages and 7 domains, as well as a common evaluation procedure. From these datasets, 25 were for sentence-level and 14 for text-level ABSA; the latter was introduced for the first time as a subtask in SemEval. The task attracted 245 submissions from 29 teams
Math Search for the Masses: Multimodal Search Interfaces and Appearance-Based Retrieval
We summarize math search engines and search interfaces produced by the
Document and Pattern Recognition Lab in recent years, and in particular the min
math search interface and the Tangent search engine. Source code for both
systems are publicly available. "The Masses" refers to our emphasis on creating
systems for mathematical non-experts, who may be looking to define unfamiliar
notation, or browse documents based on the visual appearance of formulae rather
than their mathematical semantics.Comment: Paper for Invited Talk at 2015 Conference on Intelligent Computer
Mathematics (July, Washington DC
Overview of NTCIR-12 Lifelog Task
In this paper we review the NTCIR12-Lifelog pilot task,
which ran at NTCIR-12. We outline the test collection employed,
along with the tasks, the eight submissions and the
findings from this pilot task. We finish by suggesting future
plans for the task
MIRACLE at NTCIR-7 MOAT: First experiments on multilingual opinion analysis
This paper describes the participation of MIRACLE research consortium at NTCIR-7 Multilingual Opinion Analysis Task, our first attempt on sentiment analysis and second on East Asian languages. We took part in the main mandatory opinionated sentence judgment subtask (to decide whether each sentence expresses an opinion or not) and the optional relevance and polarity judgment subtasks (to decide whether a given sentence is relevant to the given topic and also the polarity of the expressed opinion). Our approach combines a semantic languagedependent tagging of the terms of the sentence and the topic and three different ad-hoc classifiers that provide the specific annotation for each subtask, run in cascade. These models have been trained with the corpus provided in NTCIR-6 Opinion Analysis pilot task
DCU at the NTCIR-12 SpokenQuery&Doc-2 task
We describe DCU’s participation in the NTCIR-12 SpokenQuery&Doc (SQD-2) task. In the context of the slide-group
retrieval sub-task, we experiment with a passage retrieval
method that re-scores each passage according to the relevance score of the document from which the passage is taken.
This is performed by linearly interpolating their relevance
scores which are calculated using the Okapi BM25 model of
probabilistic retrieval for passages and documents independently. In conjunction with this, we assess the benefits of
using pseudo-relevance feedback for expanding the textual
representation of the spoken queries with terms found in the
top-ranked documents and passages, and experiment with
a general multidimensional optimisation method to jointly
tune the BM25 and query expansion parameters with queries
and relevance data from the NTCIR-11 SQD-1 task. Retrieval experiments performed over the SQD-1 and SQD-2
queries confirm previous findings which affirm that integrating document information when ranking passages can lead
to improved passage retrieval effectiveness. Furthermore,
results indicate that no significant gains in retrieval effectiveness can be obtained by using query expansion in combination with our retrieval models over these two query sets
Cross-Language Question Re-Ranking
We study how to find relevant questions in community forums when the language
of the new questions is different from that of the existing questions in the
forum. In particular, we explore the Arabic-English language pair. We compare a
kernel-based system with a feed-forward neural network in a scenario where a
large parallel corpus is available for training a machine translation system,
bilingual dictionaries, and cross-language word embeddings. We observe that
both approaches degrade the performance of the system when working on the
translated text, especially the kernel-based system, which depends heavily on a
syntactic kernel. We address this issue using a cross-language tree kernel,
which compares the original Arabic tree to the English trees of the related
questions. We show that this kernel almost closes the performance gap with
respect to the monolingual system. On the neural network side, we use the
parallel corpus to train cross-language embeddings, which we then use to
represent the Arabic input and the English related questions in the same space.
The results also improve to close to those of the monolingual neural network.
Overall, the kernel system shows a better performance compared to the neural
network in all cases.Comment: SIGIR-2017; Community Question Answering; Cross-language Approaches;
Question Retrieval; Kernel-based Methods; Neural Networks; Distributed
Representation
- …