4 research outputs found
Modelling semantic relations with distributional semantics and deep learning: question answering, entailment recognition and paraphrase detection
This dissertation presents an approach to the task of modelling semantic relations between
two texts, which is based on distributional semantic models and deep learning.
The present work takes advantage of various disciplines of cognitive science, mainly
computation, linguistics and artificial intelligence, with strong influences from neuroscience
and cognitive psychology.
Distributional semantic models (also known as word embeddings) are used to
represent the meaning of words. Word semantic representations can be further combined
to obtain the meaning of a larger chunk of text using a deep learning
approach, namely with the support of convolutional neural networks.
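
As a rough illustration of the idea described above, the following minimal sketch combines word vectors into a fixed-size text representation with a convolution and max-pooling step. It is not the model used in the dissertation: the embeddings and filters are random and untrained, and the helper names (`make_embeddings`, `conv_max_pool`, `cosine`), the window size and the use of cosine similarity to compare two questions are all assumptions made for the example.

```python
import math
import random


def make_embeddings(vocab, dim, seed=0):
    """Assign each word a random vector (a stand-in for trained embeddings)."""
    rng = random.Random(seed)
    return {w: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for w in vocab}


def conv_max_pool(tokens, emb, filters, window=3):
    """Slide each filter over windows of concatenated word vectors,
    apply tanh, and keep the maximum response per filter."""
    dim = len(next(iter(emb.values())))
    pad = [[0.0] * dim] * (window - 1)  # zero padding on both sides
    vectors = pad + [emb[t] for t in tokens] + pad
    pooled = []
    for f in filters:
        best = -math.inf
        for i in range(len(vectors) - window + 1):
            flat = [x for v in vectors[i:i + window] for x in v]
            best = max(best, math.tanh(sum(a * b for a, b in zip(f, flat))))
        pooled.append(best)
    return pooled


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical example questions; any tokenised text would do.
q1 = "how do i install python".split()
q2 = "how do i install python".split()
q3 = "what is the capital of france".split()

dim, window, n_filters = 8, 3, 4
emb = make_embeddings(sorted(set(q1) | set(q2) | set(q3)), dim)
rng = random.Random(1)
filters = [[rng.uniform(-0.5, 0.5) for _ in range(window * dim)]
           for _ in range(n_filters)]

r1 = conv_max_pool(q1, emb, filters, window)
r2 = conv_max_pool(q2, emb, filters, window)
r3 = conv_max_pool(q3, emb, filters, window)
print(cosine(r1, r2), cosine(r1, r3))
```

In a trained system the embeddings and filters would be learned from labelled question pairs; here they only show how variable-length texts are mapped to vectors of the same size that can then be compared.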
This approach is used to replicate the experiment carried out by Bogdanova
et al. (2015) on the task of detecting questions that can be answered by exactly the
same answer in online user forums. The performance results obtained in my experiments
are comparable to or better than the ones reported in that reference work.
I also present a study on the impact of appropriate text preprocessing on
the results that can be obtained by the approaches adopted in that reference
work. Removing certain clues that can unduly help the system to detect equivalent
questions leads to a significant decrease in the performance of the system developed in that
reference work.
I also present a study of the impact that pre-trained word embeddings have on the
task of detecting semantically equivalent questions. Replacing pre-trained word
embeddings with randomly initialised ones improves the performance of the system.
Additionally, the model was applied to the task of entailment recognition for Portuguese
and showed an accuracy on a par with the baseline.
This dissertation also reports on the results of an experimental study on the application
of the adopted approach to the shared task of sentence paraphrase detection
in Russian. The final setup includes two improvements: it uses several convolutional
filters and it uses character embeddings instead of word embeddings. It was tested in the
Task 2 standard run of the relevant shared task and showed competitive results.
The Gamification of Crowdsourcing Systems: Empirical Investigations and Design
Recent developments in modern information and communication technologies have spawned two rising phenomena, gamification and crowdsourcing, which are increasingly being combined into gamified crowdsourcing systems. While a growing number of organizations employ crowdsourcing as a way to outsource tasks related to the inventing, producing, funding, or distributing of their products and services to the crowd – a large group of people reachable via the internet – crowdsourcing initiatives are increasingly enriched with design features from games to motivate the crowd to participate in these efforts. From a practical perspective, this combination seems intuitively appealing, since using gamification in crowdsourcing systems promises to increase motivation, participation and output quality, as well as to replace traditionally used financial incentives. However, people in large groups all have individual interests and motivations, which makes it complex to design gamification approaches for crowds. Further, crowdsourcing systems exist in various forms and are used for various tasks and problems, thus requiring different incentive mechanisms for different crowdsourcing types. The lack of a coherent understanding of the different facets of gamified crowdsourcing systems and the lack of knowledge about the motivational and behavioral effects of applying various types of gamification features in different crowdsourcing systems prevent us from designing solutions that harness gamification’s full potential. Further, previous research predominantly uses competitive gamification, although crowdsourcing systems often strive to produce cooperative outcomes. However, the potentially relevant field of cooperative gamification has to date barely been explored. With a specific focus on these shortcomings, this dissertation presents several studies to advance the understanding of using gamification in crowdsourcing systems.