6 research outputs found

    Community Assessment using Evidence Networks

    Abstract. Community mining is a prominent approach for identifying (user) communities in social and ubiquitous contexts. While there are a variety of methods for community mining and detection, the effective evaluation and validation of the mined communities is usually non-trivial. Often there is no evaluation data at hand in order to validate the discovered groups. This paper proposes evidence networks using implicit information for the evaluation of communities. The presented evaluation approach is based on the idea of reconstructing existing social structures for the assessment and evaluation of a given clustering. We analyze and compare the presented evidence networks using user data from the real-world social bookmarking application BibSonomy. The results indicate that the evidence networks reflect the relative rating of the explicit ones very well.

    Information search and similarity based on Web 2.0 and semantic technologies

    The World Wide Web puts a huge amount of information, described in natural language, at society's disposal. Web search engines were born from the need to find a particular piece of that information. Their ease of use and their utility have made them some of the most widely used web tools on a daily basis. To make a query, users just enter a set of words (keywords) in natural language, and the engine answers with an ordered list of resources that contain those words. The order is given by ranking algorithms. These algorithms basically use two types of features: dynamic and static factors. The dynamic factor takes the query into account; that is, documents that contain the keywords used to describe the query are more relevant for that query. The hyperlink structure among documents is an example of a static factor used by most current algorithms. For example, if many documents link to a particular document, that document may be more relevant than others because it is more popular. Even though there is currently wide consensus on the good results that most web search engines provide, these tools still suffer from some limitations, basically 1) the solitary nature of the search activity itself; and 2) the simple retrieval process, based mainly on returning the documents that contain the exact terms used to describe the query. Regarding the first problem, searching for relevant information on the World Wide Web is undoubtedly a solitary and time-consuming process. Thousands of users repeat previously executed queries, spending time deciding which documents are relevant; these are decisions that may already have been made and that could serve other users issuing similar or identical queries.
Regarding the second problem, the textual nature of the current Web restricts the reasoning capability of web search engines; queries and web resources are described in natural language, which in some cases leads to ambiguity or other semantics-related difficulties. Computers do not understand text; however, if semantics is incorporated into the text, meaning and sense are incorporated too. This way, queries and web resources are no longer mere sets of terms, but lists of well-defined concepts. This thesis proposes a semantic layer, known as Itaca, which combines simplicity and effectiveness to endow with semantics both the resources stored on the World Wide Web and the queries users issue to find those resources. This is achieved through collaborative annotations and relevance feedback made by the users themselves, which describe both the queries and the web resources by means of Wikipedia concepts. Itaca extends the functional capabilities of current web search engines, providing a new ranking algorithm without dispensing with traditional ranking models. Experiments show that this new architecture offers more precision in the final results while keeping the simplicity and usability of existing web search engines. Its design as a layer makes it simple to add to current engines.
Official Postgraduate Program in Telematics Engineering (Programa Oficial de Posgrado en Ingeniería Telemática). President: Asunción Gómez Pérez. Secretary: Mario Muñoz Organero. Member: Anselmo Peñas Padill

    Identificação de critérios para avaliação de ideias: um método utilizando folksonomias

    Dissertation (master's) - Universidade Federal de Santa Catarina, Centro Tecnológico, Programa de Pós-Graduação em Engenharia e Gestão do Conhecimento, Florianópolis, 2016.
Co-creation tools find a rich source of knowledge in the social interactions that occur on the Web. This collective interaction is the main characteristic of innovation support systems, especially idea management systems. However, to evaluate ideas, current solutions are limited to methods based on forms with pre-established criteria or to metrics of social engagement. The organizational context is critical to the success of an idea. Nevertheless, when only popularity ratings are considered, the evaluations do not semantically aggregate the knowledge attributed by the user, nor do they determine which criteria were weighed by the community. In order to understand this collective knowledge, the present research proposes a method for identifying and analyzing criteria for idea evaluation. The development of this artefact is based on the design science research methodology, and it explores the knowledge in social attributions made through grades and tags, also known as folksonomies. Thus, within the front end of innovation, the method represents a semantic, qualitative appropriation of the criteria attributed by the community. The artefact was verified using folksonomy mining techniques on a database represented by a hypergraph model. As a result, the method makes it possible to highlight a set of characteristics to be considered by the organization as evaluation criteria.
In addition, the results showed that popularity is not a measure of the community's consensus; sub-communities therefore yield more precise measurements in their attributions, and the temporal flexibility inherent to social interactions contributes to recommending ideas based on trends and on the organizational context.

    Leveraging social relevance : using social networks to enhance literature access and microblog search

    An information retrieval system aims at selecting the relevant documents that meet the user's information need, expressed as a textual query. Since the 1970s and 1980s, various theoretical models have been proposed to represent documents and queries on the one hand, and to match them on the other, independently of any user. More recently, the arrival of Web 2.0, also known as the social Web, has called the effectiveness of these models into question, since they ignore the environment in which the information is located. Indeed, the user is no longer a simple consumer of information but is also involved in its production. To accelerate the production of information and improve the quality of their work, users exchange information with their social neighborhood, which shares the same interests. They generally prefer to obtain information from a direct contact rather than from an anonymous source. Thus, the user, influenced by the social environment, gives as much importance to the social prominence of the information as to the textual similarity of documents to the query. To meet these new expectations, information retrieval is moving towards user-centric approaches that take the social context into account within the retrieval process. The new challenge for information retrieval is therefore to model relevance with regard to the social position and influence of individuals in their community. The second challenge is to produce a relevance ranking that reflects as closely as possible the importance and social authority of information producers. Our work fits within this specific context.
Our goal is to estimate the social relevance of documents by integrating the social characteristics of resources with relevance measures defined in classical information retrieval. We propose in this thesis to integrate the social information network into the retrieval process and to exploit the social relations between social actors as a source of evidence for measuring the relevance of a document in response to a query. Two social information retrieval models are proposed for different application settings: literature access and microblog retrieval. The main contributions of each model are detailed below.
A social model for literature access. We proposed a generic social information retrieval model, deployed in particular for access to bibliographic resources. This model represents scientific papers within a social network and evaluates their importance according to the position of their authors in the network. Compared to previous approaches, this model incorporates new social entities represented by annotators and social annotations (tags). In addition to co-authorship, this model exploits two other types of social relationships: citation and social annotation. Finally, we propose to weight these relationships according to the position of authors in the social network and their mutual collaborations.
A social model for microblog search. We proposed a microblog retrieval model that evaluates the quality of tweets in two contexts: the social context and the temporal context. The quality of a tweet is estimated by the social importance of the corresponding blogger; in particular, the blogger's importance is calculated by applying the PageRank algorithm to the social influence network. With the same aim, the quality of a tweet is evaluated according to its date of publication: tweets submitted during periods of activity of the query terms are given greater importance.
Finally, we propose to integrate the social importance of the blogger and the temporal magnitude of tweets with the other relevance factors using a Bayesian network model.
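As a rough illustration of the blogger-importance step described in this abstract, the following minimal sketch runs a plain power-iteration PageRank over a toy influence graph. The edge list, the user names, the damping factor of 0.85, and the iteration count are all assumptions made for illustration; the abstract does not say how the thesis builds its influence network or which parameters it uses, and the Bayesian combination with other relevance factors is not shown.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over directed (source, target) pairs."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for source, target in edges:
        out_links[source].append(target)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread evenly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {node: (1 - damping) / n + damping * dangling / n
                    for node in nodes}
        for source in nodes:
            for target in out_links[source]:
                new_rank[target] += damping * rank[source] / len(out_links[source])
        rank = new_rank
    return rank


# Hypothetical influence edges (assumption): an edge (a, b) means
# that user a is influenced by user b, so b gains importance.
influence_edges = [
    ("ana", "carl"),
    ("bob", "carl"),
    ("carl", "ana"),
    ("dave", "carl"),
]

scores = pagerank(influence_edges)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this toy graph the user who receives the most influence edges ends up ranked first; in a retrieval setting such a score would be only one relevance signal among several.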

    Logsonomy - social information retrieval with logdata
