    An attentive neural architecture for joint segmentation and parsing and its application to real estate ads

    In processing human-produced text using natural language processing (NLP) techniques, two fundamental subtasks arise: (i) segmentation of the plain text into meaningful subunits (e.g., entities), and (ii) dependency parsing, to establish relations between subunits. In this paper, we develop a relatively simple and effective neural joint model that performs segmentation and dependency parsing together, instead of one after the other as in most state-of-the-art works. We focus in particular on the real estate ad setting, aiming to convert an ad into a structured description, which we call a property tree. This comprises the tasks of (1) identifying important entities of a property (e.g., rooms) from classifieds and (2) structuring them into a tree format. The proposed joint model tackles the two tasks simultaneously and constructs the property tree by (i) avoiding the error propagation that would arise from executing the subtasks one after the other in a pipelined fashion, and (ii) exploiting the interactions between the subtasks. To this end, we perform an extensive comparative study of the pipeline methods and the newly proposed joint model, reporting an improvement of over three percentage points in the overall edge F1 score of the property tree. We also propose attention methods to encourage our model to focus on salient tokens during the construction of the property tree. Thus, we experimentally demonstrate the usefulness of attentive neural architectures for the proposed joint model, showing a further improvement of two percentage points in edge F1 score for our application.
    Comment: Preprint, accepted for publication in Expert Systems with Applications
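    The edge F1 score mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact matching criteria, so this example assumes that each edge is a (parent, child) pair of entity strings and that a predicted edge counts as correct only if it matches a gold edge exactly. The function name `edge_f1` and the toy property tree are hypothetical.

```python
def edge_f1(predicted_edges, gold_edges):
    """F1 over tree edges, assuming exact (parent, child) matching."""
    predicted = set(predicted_edges)
    gold = set(gold_edges)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical gold and predicted property trees for one ad.
gold = {("property", "house"), ("house", "bedroom"), ("house", "kitchen")}
pred = {("property", "house"), ("house", "bedroom"), ("bedroom", "kitchen")}

score = edge_f1(pred, gold)
```

    Under this assumption, a segmentation error in a pipeline would also remove every edge attached to the mis-segmented entity, which is the error propagation the joint model is meant to avoid.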

    Identification and monitoring polarization from social network perspective

    Abstract. Polarization is a new phenomenon that threatens the cohesion and social development of our society. The rise of social media is known to have contributed significantly to the emergence of this phenomenon, as can be seen from the multiplication of far-right and racist online communities as well as ill-structured political discourse. This becomes evident when scrutinizing recent US or EU elections. Automatic identification of polarization from social media plays a key role in devising an appropriate defence strategy to tackle the issue and avoid escalation. This thesis implements several methods to identify polarization in Twitter data from the Trump-Clinton US election campaign, using metrics such as the Belief Polarization Index (BPI) and sentiment analysis. Furthermore, semantic role labelling and argument mining were applied to derive the argument structure of polarized discourse. Specifically, we constructed thirteen topics of interest that were used as potential candidates for polarized discourse. For each topic, the cosine distance between the two candidates' frequencies of the topic over time was used to indicate the polarization (called the Belief Polarization Index). Statistical inference on sentiment scores was used to convey either a positive or negative polarity, which was then further examined using argument structure. All the proposed approaches attempt to measure the polarization between two individuals from different perspectives, which may give some hints or references for future research.
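    The cosine-distance computation behind the Belief Polarization Index can be sketched as follows. The abstract does not specify how topic frequencies are binned or normalized, so the weekly counts, the variable names, and the function name `cosine_distance` are assumptions made purely for illustration.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length frequency vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


# Hypothetical weekly mention counts of one topic for each candidate.
candidate_a = [12, 30, 7, 0, 5]
candidate_b = [3, 2, 25, 18, 4]

# Assumed reading of the BPI for this topic: a larger distance means the
# two candidates addressed the topic at more dissimilar times.
bpi = cosine_distance(candidate_a, candidate_b)
```

    With non-negative counts the value lies between 0 (identical temporal profiles up to scale) and 1 (no overlap in time), which makes scores comparable across the thirteen topics.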

    Chatbot for digital marketing and customer support: an artificial intelligence approach

    Master's dissertation in Computer Science. Human interaction with machines has never been as frequent as it is today. To reduce the redundant workload of a human who answers repeated and trivial customer support questions on a digital marketing website, this work aims to replace this tedious job with a software tool, namely a dialogue tool. A dialogue tool such as a Chatbot that can handle customer support for a digital marketing website gives companies the opportunity to assign human resources to "non-mechanical tasks". Given that Chatbots exchange messages directly with customers, they can collect the protocol information that is required in every interaction. Even when a dialogue still needs human assistance, the human agent will not have to ask these standard questions, which improves the efficiency of customer support. By automating the dialogues required to answer questions about certain products, which would otherwise be answered by a human, organizations will save time and money and can place human resources in other sectors that are not so easily automated.