124 research outputs found

    Leveraging social relevance : using social networks to enhance literature access and microblog search

    Get PDF
    L'objectif principal d'un système de recherche d'information est de sélectionner les documents pertinents qui répondent au besoin en information exprimé par l'utilisateur à travers une requête. Depuis les années 1970-1980, divers modèles théoriques ont été proposés dans ce sens pour représenter les documents et les requêtes d'une part et les apparier d'autre part, indépendamment de tout utilisateur. Plus récemment, l'arrivée du Web 2.0 ou le Web social a remis en cause l'efficacité de ces modèles du fait qu'ils ignorent l'environnement dans lequel l'information se situe. En effet, l'utilisateur n'est plus un simple consommateur de l'information mais il participe également à sa production. Pour accélérer la production de l'information et améliorer la qualité de son travail, l'utilisateur échange de l'information avec son voisinage social dont il partage les mêmes centres d'intérêt. Il préfère généralement obtenir l'information d'un contact direct plutôt qu'à partir d'une source anonyme. Ainsi, l'utilisateur, influencé par son environnement socio-cultuel, donne autant d'importance à la proximité sociale de la ressource d'information autant qu'à la similarité des documents à sa requête. Dans le but de répondre à ces nouvelles attentes, la recherche d'information s'oriente vers l'implication de l'utilisateur et de sa composante sociale dans le processus de la recherche. Ainsi, le nouvel enjeu de la recherche d'information est de modéliser la pertinence compte tenu de la position sociale et de l'influence de sa communauté. Le second enjeu est d'apprendre à produire un ordre de pertinence qui traduise le mieux possible l'importance et l'autorité sociale. C'est dans ce cadre précis, que s'inscrit notre travail. Notre objectif est d'estimer une pertinence sociale en intégrant d'une part les caractéristiques sociales des ressources et d'autre part les mesures de pertinence basées sur les principes de la recherche d'information classique. Nous proposons dans cette thèse d'intégrer le réseau social d'information dans le processus de recherche d'information afin d'utiliser les relations sociales entre les acteurs sociaux comme une source d'évidence pour mesurer la pertinence d'un document en réponse à une requête. Deux modèles de recherche d'information sociale ont été proposés à des cadres applicatifs différents : la recherche d'information bibliographique et la recherche d'information dans les microblogs. Les importantes contributions de chaque modèle sont détaillées dans la suite. Un modèle social pour la recherche d'information bibliographique. Nous avons proposé un modèle générique de la recherche d'information sociale, déployé particulièrement pour l'accès aux ressources bibliographiques. Ce modèle représente les publications scientifiques au sein d'réseau social et évalue leur importance selon la position des auteurs dans le réseau. Comparativement aux approches précédentes, ce modèle intègre des nouvelles entités sociales représentées par les annotateurs et les annotations sociales. En plus des liens de coauteur, ce modèle exploite deux autres types de relations sociales : la citation et l'annotation sociale. Enfin, nous proposons de pondérer ces relations en tenant compte de la position des auteurs dans le réseau social et de leurs mutuelles collaborations. Un modèle social pour la recherche d'information dans les microblogs.} Nous avons proposé un modèle pour la recherche de tweets qui évalue la qualité des tweets selon deux contextes: le contexte social et le contexte temporel. Considérant cela, la qualité d'un tweet est estimé par l'importance sociale du blogueur correspondant. L'importance du blogueur est calculée par l'application de l'algorithme PageRank sur le réseau d'influence sociale. Dans ce même objectif, la qualité d'un tweet est évaluée selon sa date de publication. Les tweets soumis dans les périodes d'activité d'un terme de la requête sont alors caractérisés par une plus grande importance. Enfin, nous proposons d'intégrer l'importance sociale du blogueur et la magnitude temporelle avec les autres facteurs de pertinence en utilisant un modèle Bayésien.An information retrieval system aims at selecting relevant documents that meet user's information needs expressed with a textual query. For the years 1970-1980, various theoretical models have been proposed in this direction to represent, on the one hand, documents and queries and on the other hand to match information needs independently of the user. More recently, the arrival of Web 2.0, known also as the social Web, has questioned the effectiveness of these models since they ignore the environment in which the information is located. In fact, the user is no longer a simple consumer of information but also involved in its production. To accelerate the production of information and improve the quality of their work, users tend to exchange documents with their social neighborhood that shares the same interests. It is commonly preferred to obtain information from a direct contact rather than from an anonymous source. Thus, the user, under the influenced of his social environment, gives as much importance to the social prominence of the information as the textual similarity of documents at the query. In order to meet these new prospects, information retrieval is moving towards novel user centric approaches that take into account the social context within the retrieval process. Thus, the new challenge of an information retrieval system is to model the relevance with regards to the social position and the influence of individuals in their community. The second challenge is produce an accurate ranking of relevance that reflects as closely as possible the importance and the social authority of information producers. It is in this specific context that fits our work. Our goal is to estimate the social relevance of documents by integrating the social characteristics of resources as well as relevance metrics as defined in classical information retrieval field. We propose in this work to integrate the social information network in the retrieval process and exploit the social relations between social actors as a source of evidence to measure the relevance of a document in response to a query. Two social information retrieval models have been proposed in different application frameworks: literature access and microblog retrieval. The main contributions of each model are detailed in the following. A social information model for flexible literature access. We proposed a generic social information retrieval model for literature access. This model represents scientific papers within a social network and evaluates their importance according to the position of respective authors in the network. Compared to previous approaches, this model incorporates new social entities represented by annotators and social annotations (tags). In addition to co-authorships, this model includes two other types of social relationships: citation and social annotation. Finally, we propose to weight these relationships according to the position of authors in the social network and their mutual collaborations. A social model for information retrieval for microblog search. We proposed a microblog retrieval model that evaluates the quality of tweets in two contexts: the social context and temporal context. The quality of a tweet is estimated by the social importance of the corresponding blogger. In particular, blogger's importance is calculated by the applying PageRank algorithm on the network of social influence. With the same aim, the quality of a tweet is evaluated according to its date of publication. Tweets submitted in periods of activity of query terms are then characterized by a greater importance. Finally, we propose to integrate the social importance of blogger and the temporal magnitude tweets as well as other relevance factors using a Bayesian network model

    Local experts finding using user comments in location-based social networks

    Get PDF
    The opinions of local experts in the location-based social network are of great significance to the collection and dissemination of local information. In this paper, we investigated in-depth how the user comments can be used to identify the local expert over social networks. We first illustrate the existences of potential local experts in a social network using a scored model by considering the personal profiles, comments, friend relationship, and location preferences. Then, a multi-dimensional model is proposed to evaluate the local expert candidates and a local expert discovery algorithm is proposed to identify local experts. Meanwhile, a scoring algorithm is proposed to train the weights in the model. Finally, an expert recommendation list can be given based on the score ranks of the candidates. Experimental results demonstrate that effectiveness of proposed model and algorithm

    A computational approach to measuring the correlation between expertise and social media influence for celebrities on microblogs

    Get PDF
    Social media influence analysis, sometimes also called authority detection, aims to rank users based on their influence scores in social media. Existing approaches of social influence analysis usually focus on how to develop effective algorithms to quantize users’ influence scores. They rarely consider a person’s expertise levels which are arguably important to influence measures. In this paper, we propose a computational approach to measuring the correlation between expertise and social media influence, and we take a new perspective to understand social media influence by incorporating expertise into influence analysis. We carefully constructed a large dataset of 13,684 Chinese celebrities from Sina Weibo (literally ”Sina microblogging”). We found that there is a strong correlation between expertise levels and social media influence scores. Our analysis gave a good explanation of the phenomenon of “top across-domain influencers”. In addition, different expertise levels showed influence variation patterns: e.g., (1) high-expertise celebrities have stronger influence on the “audience” in their expertise domains; (2) expertise seems to be more important than relevance and participation for social media influence; (3) the audiences of top expertise celebrities are more likely to forward tweets on topics outside the expertise domains from high-expertise celebrities

    Mining and Analyzing the Academic Network

    Get PDF
    Social Network research has attracted the interests of many researchers, not only in analyzing the online social networking applications, such as Facebook and Twitter, but also in providing comprehensive services in scientific research domain. We define an Academic Network as a social network which integrates scientific factors, such as authors, papers, affiliations, publishing venues, and their relationships, such as co-authorship among authors and citations among papers. By mining and analyzing the academic network, we can provide users comprehensive services as searching for research experts, published papers, conferences, as well as detecting research communities or the evolutions hot research topics. We can also provide recommendations to users on with whom to collaborate, whom to cite and where to submit.In this dissertation, we investigate two main tasks that have fundamental applications in the academic network research. In the first, we address the problem of expertise retrieval, also known as expert finding or ranking, in which we identify and return a ranked list of researchers, based upon their estimated expertise or reputation, to user-specified queries. In the second, we address the problem of research action recommendation (prediction), specifically, the tasks of publishing venue recommendation, citation recommendation and coauthor recommendation. For both tasks, to effectively mine and integrate heterogeneous information and therefore develop well-functioning ranking or recommender systems is our principal goal. For the task of expertise retrieval, we first proposed or applied three modified versions of PageRank-like algorithms into citation network analysis; we then proposed an enhanced author-topic model by simultaneously modeling citation and publishing venue information; we finally incorporated the pair-wise learning-to-rank algorithm into traditional topic modeling process, and further improved the model by integrating groups of author-specific features. For the task of research action recommendation, we first proposed an improved neighborhood-based collaborative filtering approach for publishing venue recommendation; we then applied our proposed enhanced author-topic model and demonstrated its effectiveness in both cited author prediction and publishing venue prediction; finally we proposed an extended latent factor model that can jointly model several relations in an academic environment in a unified way and verified its performance in four recommendation tasks: the recommendation on author-co-authorship, author-paper citation, paper-paper citation and paper-venue submission. Extensive experiments conducted on large-scale real-world data sets demonstrated the superiority of our proposed models over other existing state-of-the-art methods

    Using contextual and social links in information retrieval

    Get PDF
    [no abstract

    An exploratory data analysis of the #Crowdfunding network on Twitter

    Get PDF
    Together, social media and crowdsourcing can help entrepreneurs to attract external finance and early-stage customers. This paper investigates the characteristics and discourse of an issue-centered public on Twitter organized around the hashtag #crowdfunding through the lens of social network theory. Using a dataset of 2,732,144 tweets published during a calendar year, we use exploratory data analysis to generate insights and hypotheses on who the users in the #crowdfunding network are, what they share, and how they are connected to each other. In order to do so, we adopt a range of descriptive, content, network analytics techniques. The results suggest that platforms, crowdfunders, and other actors who derive income from the crowdfunding economy play a key role in creating the network. Furthermore, latent ties (strangers) play a direct role in disseminating information, investing, and sending signals to platforms that further raises campaign prominence. We also introduce a new type of social tie, the “computer as a social actor”, previously unaddressed in entrepreneurial network literature, which play a role in sending signals to both platforms and networks. Our results suggest that homophily is a key driver for creating network sub-communities built around specific platforms, project types, domains, or geograph

    SOCIAL MEDIA ANALYTICS − A UNIFYING DEFINITION, COMPREHENSIVE FRAMEWORK, AND ASSESSMENT OF ALGORITHMS FOR IDENTIFYING INFLUENCERS IN SOCIAL MEDIA

    Get PDF
    Given its relative infancy, there is a dearth of research on a comprehensive view of business social media analytics (SMA). This dissertation first examines current literature related to SMA and develops an integrated, unifying definition of business SMA, providing a nuanced starting point for future business SMA research. This dissertation identifies several benefits of business SMA, and elaborates on some of them, while presenting recent empirical evidence in support of foregoing observations. The dissertation also describes several challenges facing business SMA today, along with supporting evidence from the literature, some of which also offer mitigating solutions in particular contexts. The second part of this dissertation studies one SMA implication focusing on identifying social influencer. Growing social media usage, accompanied by explosive growth in SMA, has resulted in increasing interest in finding automated ways of discovering influencers in online social interactions. Beginning 2008, many variants of multiple basic approaches have been proposed. Yet, there is no comprehensive study investigating the relative efficacy of these methods in specific settings. This dissertation investigates and reports on the relative performance of multiple methods on Twitter datasets containing between them tens of thousands to hundreds of thousands of tweets. Accordingly, the second part of the dissertation helps further an understanding of business SMA and its many aspects, grounded in recent empirical work, and is a basis for further research and development. This dissertation provides a relatively comprehensive understanding of SMA and the implementation SMA in influencer identification

    Generating social media for the movie world: TweetMovies

    Get PDF
    PFC realitzat en el marc d'un programa de mobilitat amb la Helsinki Metropolia University of Applied Sciences.TweetMovies is a social network developed using edge web technologies, like Ruby on Rails, jQuery or Ruby on Rails. In the process, some advanced development techniques were used, like Test-Driven Development or Behavior-Driven Development. The basics of SEO are used and explained in the project

    Automatically learning topics and difficulty levels of problems in online judge systems

    Get PDF
    Online Judge (OJ) systems have been widely used in many areas, including programming, mathematical problems solving, and job interviews. Unlike other online learning systems, such as Massive Open Online Course, most OJ systems are designed for self-directed learning without the intervention of teachers. Also, in most OJ systems, problems are simply listed in volumes and there is no clear organization of them by topics or difficulty levels. As such, problems in the same volume are mixed in terms of topics or difficulty levels. By analyzing large-scale users’ learning traces, we observe that there are two major learning modes (or patterns). Users either practice problems in a sequential manner from the same volume regardless of their topics or they attempt problems about the same topic, which may spread across multiple volumes. Our observation is consistent with the findings in classic educational psychology. Based on our observation, we propose a novel two-mode Markov topic model to automatically detect the topics of online problems by jointly characterizing the two learning modes. For further predicting the difficulty level of online problems, we propose a competition-based expertise model using the learned topic information. Extensive experiments on three large OJ datasets have demonstrated the effectiveness of our approach in three different tasks, including skill topic extraction, expertise competition prediction and problem recommendation
    corecore