    The Usage of Twitter from the Spanish media during the elections

    Traditional media outlets have adopted Twitter as another channel for disseminating information and reaching digital audiences. This study analyzes the characteristics of the tweets published by Spanish media outlets over twelve months in which two electoral processes took place, in order to test for differences in the use of the platform according to the content of the coverage and the phase in which it was published. The research relies on a quantitative methodology, applying computerized content analysis to the tweets published by different types of media between 2015 and 2016. The results reveal a promotional use of tools such as hashtags and mentions among media with fewer followers. In addition, political information is characterized by a higher number of mentions and candidate statements, especially during the election campaign, which favors the emergence of echo chambers in the political Twittersphere.

    Tracking Knowledge Propagation Across Wikipedia Languages

    In this paper, we present a dataset of inter-language knowledge propagation in Wikipedia. Covering all 309 language editions and 33M articles, the dataset aims to track the full propagation history of Wikipedia concepts and to enable follow-up research on predictive models of that propagation. For this purpose, we align all Wikipedia articles in a language-agnostic manner according to the concept they cover, which results in 13M propagation instances. To the best of our knowledge, this dataset is the first to explore the full inter-language propagation at a large scale. Together with the dataset, we provide a holistic overview of the propagation and key insights about the underlying structural factors to aid future research. For example, we find that although long cascades are unusual, the propagation tends to continue further once it reaches more than four language editions. We also find that the size of a language edition is associated with the speed of propagation. We believe the dataset not only contributes to the prior literature on Wikipedia growth but also enables new use cases such as edit recommendation for addressing knowledge gaps, detection of disinformation, and cultural relationship analysis.
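
    The alignment step described above can be illustrated with a minimal sketch. It assumes each article is available as a (concept_id, language, created_on) record; the field names, the function names build_cascades and continuation_rate, and the sample data are hypothetical and are not taken from the dataset itself. Articles are grouped by concept and ordered by creation date to obtain a cascade per concept, and a simple statistic estimates how often a cascade that has reached k editions goes on to reach k + 1.

```python
from collections import defaultdict
from datetime import date


def build_cascades(articles):
    """Group (concept_id, language, created_on) records into cascades.

    Returns a dict mapping each concept to the list of language codes
    ordered by article creation date (earliest first).
    """
    by_concept = defaultdict(list)
    for concept_id, language, created_on in articles:
        by_concept[concept_id].append((created_on, language))
    return {
        concept: [language for _, language in sorted(entries)]
        for concept, entries in by_concept.items()
    }


def continuation_rate(cascades, k):
    """Fraction of cascades with at least k editions that reach k + 1.

    Returns None when no cascade has reached k editions.
    """
    reached = [c for c in cascades.values() if len(c) >= k]
    if not reached:
        return None
    return sum(len(c) >= k + 1 for c in reached) / len(reached)


# Hypothetical sample records, for illustration only.
sample = [
    ("Q1", "en", date(2001, 1, 1)),
    ("Q1", "es", date(2003, 5, 1)),
    ("Q1", "de", date(2002, 2, 2)),
    ("Q2", "ko", date(2010, 1, 1)),
]
cascades = build_cascades(sample)
print(cascades["Q1"])
print(continuation_rate(cascades, 1))
```

    Comparing continuation_rate across values of k is one way to examine the kind of observation reported above about cascades that pass four language editions, although the dataset's own definition of a propagation instance may differ from this sketch.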

    Galileo, a data platform for viewing news on social networks

    This article introduces Galileo, a platform for extracting and organizing news media data on social networks. Galileo integrates publications made on the main social networks used in the information ecosystem, namely Facebook, Twitter, and Instagram. Currently, the system includes 97 media outlets from nine countries: Brazil, Chile, Germany, Japan, Mexico, South Korea, Spain, United Kingdom, and United States. Galileo uses a Twitter API to collect tweets and the CrowdTangle service to download Facebook and Instagram posts. This data is stored in a local database and can be accessed through a user-friendly interface, which allows for the analysis of different characteristics of the posts, such as their text, source popularity, and temporal dimension. Galileo is a tool for researchers interested in understanding news cycles and analyzing news content on social networks.
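
    The storage layer described above might look roughly like the following sketch. It is not Galileo's actual schema: the posts table, its columns, and the helper names create_store, add_post, and posts_per_day are assumptions made for illustration, and an in-memory SQLite database stands in for the local database. The example covers the temporal dimension by counting posts per day for one platform.

```python
import sqlite3


def create_store(conn):
    """Create a hypothetical table holding posts from all platforms."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS posts (
            outlet TEXT NOT NULL,
            platform TEXT NOT NULL,
            published_at TEXT NOT NULL,  -- ISO 8601 timestamp
            text TEXT NOT NULL,
            interactions INTEGER NOT NULL DEFAULT 0
        )
        """
    )


def add_post(conn, outlet, platform, published_at, text, interactions=0):
    """Insert one downloaded post into the store."""
    conn.execute(
        "INSERT INTO posts VALUES (?, ?, ?, ?, ?)",
        (outlet, platform, published_at, text, interactions),
    )


def posts_per_day(conn, platform):
    """Return (day, post_count) rows for one platform, ordered by day."""
    cursor = conn.execute(
        """
        SELECT substr(published_at, 1, 10), COUNT(*)
        FROM posts
        WHERE platform = ?
        GROUP BY substr(published_at, 1, 10)
        ORDER BY 1
        """,
        (platform,),
    )
    return cursor.fetchall()


# Hypothetical sample posts, for illustration only.
conn = sqlite3.connect(":memory:")
create_store(conn)
add_post(conn, "Outlet A", "twitter", "2021-03-01T10:00:00", "Morning headline", 5)
add_post(conn, "Outlet B", "twitter", "2021-03-01T12:30:00", "Midday update", 2)
add_post(conn, "Outlet A", "facebook", "2021-03-02T09:00:00", "Feature story", 9)
print(posts_per_day(conn, "twitter"))
```

    Keeping all platforms in one table with a platform column is only one possible design; it makes cross-platform comparisons a matter of changing the filter.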

    Finding relevant people in online social networks

    The objective of this thesis is to develop novel techniques to find relevant people in Online Social Networks (OSN). To that end, we consider different notions of relevance, taking the point of view of OSN providers (like Facebook) and advertisers, as well as that of people who are trying to push new ideas and topics on the network. We go beyond people's popularity, showing that users with many followers are not necessarily the most relevant. Specifically, we develop three algorithms: (i) one that computes the monetary value that each user produces for the OSN provider; (ii) one that finds users who push new ideas and create trends; and (iii) a recommender system that allows advertisers (focusing on local shops, like restaurants or pubs) to find potential customers. Furthermore, we provide useful insights about users' behavior according to their relevance and popularity, showing, among other things, that the most active users are usually more relevant than the popular ones. Moreover, we show that very popular users usually arrive late to new trends, and that there are less popular but very active users who generate value and push new ideas in the network.

    Making sense of massive amounts of scientific publications: the scientific knowledge miner project

    The World Wide Web has become the largest repository ever for scientific publications, and it continues to grow at an unprecedented rate. This information overload, however, makes exploring the content a very time-consuming task. In this landscape, text mining tools that characterize and explore distinctive features of the scientific literature are essential. We present the Scientific Knowledge Miner (SKM) Project, which aims to investigate new approaches and frameworks to facilitate the extraction of knowledge from scientific publications across different disciplines. More specifically, we will focus on citation characterization, recommendation, and scientific document summarization. This work is supported by the Spanish Ministry of Economy and Competitiveness under the Maria de Maeztu Units of Excellence Programme (MDM-2015-0502), by the European Project Dr. Inventor (FP7-ICT-2013.8.1 - Grant: 611383), the Catalonia Trade and Investment Agency (Agència per la competitivitat de l’empresa, ACCIÓ) and the TUNER project (TIN2015-65308-C5-5-R, MINECO/FEDER, UE).

    Information Collection of COVID-19 Pandemic Using Wikipedia Template Network

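    Access to accurate information is essential for reducing the social damage caused by the COVID-19 pandemic. Wikipedia is a highly accessible online encyclopedia that users can edit directly, so information about ongoing events such as COVID-19 is updated quickly. However, existing methods for retrieving information from Wikipedia have difficulty collecting information that includes the relationships between articles. Wikipedia templates are sets of links applied selectively to highly related articles, and they therefore reflect the structure of the information well. In this study, we used templates to collect COVID-19 information from Wikipedia in 10 languages and reconstructed it as networks. Among the 10 networks, which comprise a total of 130,662 nodes and 202,258 edges, the languages with more users had larger and deeper template networks, and we confirmed that articles highly related to COVID-19 lie within three hops in the connection structure. By proposing a new information retrieval method that is applicable to multiple languages, this study contributes to building up articles on specific topics.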
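
    The three-hop neighborhood mentioned above can be computed with a breadth-first search. The sketch below is only an illustration under stated assumptions: the template network is given as a dictionary mapping each article title to the titles it links to through templates, and the function name within_hops and the toy graph are hypothetical rather than taken from the study.

```python
from collections import deque


def within_hops(graph, seed, max_hops=3):
    """Breadth-first search from seed, limited to max_hops.

    graph maps an article title to an iterable of titles it links to.
    Returns a dict mapping each reachable title to its hop distance.
    """
    distance = {seed: 0}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return distance


# Hypothetical toy network, for illustration only.
toy_graph = {
    "COVID-19 pandemic": ["Article A", "Article B"],
    "Article A": ["Article C"],
    "Article C": ["Article D"],
    "Article D": ["Article E"],
}
print(within_hops(toy_graph, "COVID-19 pandemic"))
```

    In the toy network, "Article E" sits four hops from the seed and is therefore excluded when max_hops is 3.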

    Global Gender Differences in Wikipedia Readership

    Wikipedia represents the largest and most popular source of encyclopedic knowledge in the world, aiming to provide equal access to information worldwide. From a global online survey of 65,031 readers of Wikipedia and their corresponding reading logs, we present first evidence of gender differences in Wikipedia readership and how they manifest in records of user behavior. More specifically, we report that (1) women are underrepresented among readers of Wikipedia, (2) women view fewer pages per reading session than men do, (3) men and women visit Wikipedia for similar reasons, and (4) men and women exhibit specific topical preferences. Our findings lay the foundation for identifying pathways toward knowledge equity in the usage of online encyclopedic knowledge.