867 research outputs found

    Biases in scholarly recommender systems: impact, prevalence, and mitigation

    We create a simulated financial market and examine the effect of different levels of active and passive investment on fundamental market efficiency. In our simulated market, active, passive, and random investors interact with each other by issuing orders. Active and passive investors select their portfolio weights by optimizing Markowitz-based utility functions. We find that higher fractions of active investment within a market lead to increased fundamental market efficiency. The marginal increase in fundamental market efficiency per additional active investor is lower in markets with higher levels of active investment. Furthermore, we find that a large fraction of passive investors within a market may facilitate technical price bubbles, resulting in market failure. By examining the effect of specific parameters on market outcomes, we find that lower transaction costs, lower individual forecasting errors of active investors, and less restrictive portfolio constraints tend to increase fundamental market efficiency in the market.

    Knowledge-Based Techniques for Scholarly Data Access: Towards Automatic Curation

    Accessing up-to-date and quality scientific literature is a critical preliminary step in any research activity. Identifying relevant scholarly literature for the purposes of a given task or application is, however, a complex and time-consuming activity. Despite the large number of tools developed over the years to support scholars in their literature surveying activity, such as Google Scholar, Microsoft Academic search, and others, the best way to access quality papers remains asking a domain expert who is actively involved in the field and knows research trends and directions. State-of-the-art systems, in fact, either do not allow exploratory search activity, such as identifying the active research directions within a given topic, or do not offer proactive features, such as content recommendation, which are both critical to researchers. To overcome these limitations, we strongly advocate a paradigm shift in the development of scholarly data access tools: moving from traditional information retrieval and filtering tools towards automated agents able to make sense of the textual content of published papers and therefore monitor the state of the art. Building such a system is, however, a complex task that implies tackling non-trivial problems in the fields of Natural Language Processing, Big Data Analysis, User Modelling, and Information Filtering. In this work, we introduce the concept of Automatic Curator System and present its fundamental components. Doctoral programme in Computer Science (Dottorato di ricerca in Informatica). De Nart, Dari

    Citation recommendation: approaches and datasets

    Citation recommendation describes the task of recommending citations for a given text. Due to the overload of published scientific works in recent years on the one hand, and the need to cite the most appropriate publications when writing scientific texts on the other hand, citation recommendation has emerged as an important research topic. In recent years, several approaches and evaluation data sets have been presented. However, to the best of our knowledge, no literature survey has been conducted explicitly on citation recommendation. In this article, we give a thorough introduction to automatic citation recommendation research. We then present an overview of the approaches and data sets for citation recommendation and identify differences and commonalities using various dimensions. Last but not least, we shed light on the evaluation methods and outline general challenges in the evaluation and how to meet them. We restrict ourselves to citation recommendation for scientific publications, as this document type has been studied the most in this area. However, many of the observations and discussions included in this survey are also applicable to other types of text, such as news articles and encyclopedic articles.

    From Evaluating to Forecasting Performance: How to Turn Information Retrieval, Natural Language Processing and Recommender Systems into Predictive Sciences

    We describe the state of the art in performance modeling and prediction for Information Retrieval (IR), Natural Language Processing (NLP) and Recommender Systems (RecSys), along with its shortcomings and strengths. We present a framework for further research, identifying five major problem areas: understanding measures, performance analysis, making underlying assumptions explicit, identifying application features determining performance, and the development of prediction models describing the relationship between assumptions, features and resulting performance.

    Citation Recommendation: Approaches and Datasets

    Citation recommendation describes the task of recommending citations for a given text. Due to the overload of published scientific works in recent years on the one hand, and the need to cite the most appropriate publications when writing scientific texts on the other hand, citation recommendation has emerged as an important research topic. In recent years, several approaches and evaluation data sets have been presented. However, to the best of our knowledge, no literature survey has been conducted explicitly on citation recommendation. In this article, we give a thorough introduction to automatic citation recommendation research. We then present an overview of the approaches and data sets for citation recommendation and identify differences and commonalities using various dimensions. Last but not least, we shed light on the evaluation methods and outline general challenges in the evaluation and how to meet them. We restrict ourselves to citation recommendation for scientific publications, as this document type has been studied the most in this area. However, many of the observations and discussions included in this survey are also applicable to other types of text, such as news articles and encyclopedic articles. Comment: to be published in the International Journal on Digital Libraries.

    Recommendations in Academic Social Media: the shaping of scholarly communication through algorithmic mediation

    Scholarly communication is increasingly being mediated by Academic Social Media (ASM) platforms, which combine the functions of a scientific repository with social media features such as personal profiles, followers and comments. In ASM, algorithmic mediation is responsible for filtering the content and distributing it in personalised individual feeds and recommendations according to inferred relevance to users. However, if communication among researchers is intertwined with these platforms, in what ways may the recommendation algorithms in ASM shape scholarly communication? Scientific literature has been investigating how content is mediated in data-driven environments ranging from social media platforms to specific apps, whereas algorithmic mediation in scientific environments remains neglected. This thesis starts from the premise that ASM platforms are sociocultural artefacts embedded in a mutually shaping relationship with research practices and economic, political and social arrangements. Therefore, implications of algorithmic mediation can be studied through the artefact itself, people's practices and the social/political/economic arrangements that affect and are affected by such interactions. Most studies on ASM focus on one of these elements at a time, either examining design elements or the users' behaviour on and perceptions about such platforms. In this thesis, a multifaceted approach is taken to analyse the artefact as well as the practices and arrangements traversed by algorithmic mediation. Chapter 1 reviews the literature about ASM platforms, and explains the history of algorithmic recommendations, starting from the first Information Retrieval systems to current Recommender Systems, highlighting the use of different data sources and techniques. The chapter also presents the mediation framework and how it applies to ASM platforms, before outlining the thesis. The rest of the thesis is divided into two parts. 
Part I focuses on how recommender systems in ASM shape what users can see and how users interact with and through the platform. Part II investigates how, in turn, researchers make sense of their online interactions within ASM. The end of Chapter 1 shows the methodological choices for each following chapter. Part I presents a case study of one of the most popular ASM platforms in which a walkthrough method was conducted in four steps (interface analysis, web code inspection, patent analysis and company inquiry using the General Data Protection Regulation (GDPR)). In Chapter 2 it is shown that almost all the content in ASM platforms is algorithmically mediated through mechanisms of profiling, information selection and commodification. It is also discussed how the company avoids explaining the workings of recommender systems and the mutually shaping characteristic of ASM platforms. Chapter 3 explores the distortions and biases that ASM platforms can uphold. Results show how profiling, datafication and prioritization have the potential to foster homogeneity bias, discrimination, the Matthew effect of cumulative advantage in science and other distortions. Part II consists of two empirical studies involving participants from different countries in interviews (n=11) and a research game (n=13). Chapter 4 presents the interviews combined with the show and tell technique. The results show the participants' perceptions of ASM affordances, which revolve around six main themes: (1) getting access to relevant content; (2) reaching out to other scholars; (3) algorithmic impact on exposure to content; (4) to see and to be seen; (5) blurred boundaries of potential ethical or legal infringements, and (6) the more I give, the more I get. We argue that algorithmic mediation not only constructs a narration of the self, but also a narration of the relevant other in ASM platforms, configuring an image of the relevant other that is both participatory and productive. 
Chapter 5 presents the design process of a research game and the results of the empirical sessions, where participants were observed while playing the game. There are two outcomes for the study. First, the human values researchers relate to algorithmic features in ASM, the most prominent being stimulation, universalism and self-direction. Second, the role of the researcher's approach (collaborative, competitive or ambivalent) in academic tasks, showing the consequential choices people make regarding algorithmic features and the motivations behind those choices. The results led to four archetypal profiles: (1) the collaborative reader; (2) the competitive writer; (3) the collaborative disseminator; and (4) the ambivalent evaluator. The final chapter (Chapter 6) summarises the ways in which ASM platforms forge people's perceptions and the strategies people employ to use the systems to benefit their careers, answering each research question. The chapter also discusses the implications of algorithmic mediation for scholarly communication and science in general. The dissertation ends with reflections on human agency in data-driven environments, the role of algorithmic inferences in science and the challenge of reconciling individual users' needs with broader goals of the scientific community. By doing so, the contribution of this thesis is twofold: (1) providing in-depth knowledge about the ASM artefact, and (2) unfolding different aspects of the human perspective in dealing with algorithmic mediation in ASM. Both perspectives are discussed in light of social arrangements that are mutually shaped by artefact and practices.