1,202 research outputs found

    Capturing the influence of geopolitical ties from Wikipedia with reduced Google matrix

    Get PDF
    Interactions between countries originate from diverse aspects such as geographic proximity, trade, socio-cultural habits, language, religion, etc. Geopolitics studies the influence of a country's geographic space on its political power and its relationships with other countries. This work reveals the potential of Wikipedia mining for geopolitical study. Wikipedia offers solid knowledge and strong correlations among countries by linking web pages together for different types of information (e.g. economic, historical, political, and many others). The major finding of this paper is that meaningful results on the influence of country ties can be extracted from the hyperlinked structure of Wikipedia. We leverage a novel stochastic matrix representation of Markov chains on complex directed networks called the reduced Google matrix theory. For a selected small set of nodes, the reduced Google matrix concentrates the direct and indirect links of the million-node Wikipedia network into a small Perron-Frobenius matrix that preserves the PageRank probabilities of the global Wikipedia network. We perform a novel sensitivity analysis that leverages this reduced Google matrix to characterize the influence of relationships between countries in the global network. We apply this analysis to two chosen sets of countries (the set of 27 European Union countries and a set of 40 top worldwide countries) and show that our sensitivity analysis easily exhibits very meaningful geopolitical information from five different Wikipedia editions (English, Arabic, Russian, French and German).
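    The construction the abstract describes can be illustrated on a toy network. The following is a minimal numpy sketch, assuming the standard block formula G_R = G_rr + G_rs (1 - G_ss)^{-1} G_sr for the reduced Google matrix; the 6-page network and damping factor are illustrative assumptions, not data from the paper:

```python
import numpy as np

def google_matrix(adj, alpha=0.85):
    """Column-stochastic Google matrix from an adjacency matrix.
    adj[i, j] = 1 if page j links to page i (columns are source pages)."""
    N = adj.shape[0]
    S = adj.astype(float).copy()
    col = S.sum(axis=0)
    S[:, col == 0] = 1.0                      # dangling pages link to everyone
    S = S / S.sum(axis=0)
    return alpha * S + (1 - alpha) / N * np.ones((N, N))

def pagerank(G, tol=1e-12):
    """Power iteration for the leading eigenvector (probability vector)."""
    p = np.full(G.shape[0], 1.0 / G.shape[0])
    while True:
        q = G @ p
        if np.abs(q - p).sum() < tol:
            return q
        p = q

def reduced_google_matrix(G, r):
    """G_R = G_rr + G_rs (I - G_ss)^{-1} G_sr for the selected node set r:
    direct links among r plus all indirect paths through the rest of the net."""
    s = [i for i in range(G.shape[0]) if i not in r]
    Grr, Grs = G[np.ix_(r, r)], G[np.ix_(r, s)]
    Gsr, Gss = G[np.ix_(s, r)], G[np.ix_(s, s)]
    return Grr + Grs @ np.linalg.inv(np.eye(len(s)) - Gss) @ Gsr

# toy 6-page "Wikipedia": (source, destination) hyperlinks
edges = [(0, 1), (1, 0), (1, 2), (2, 0), (3, 1), (3, 4), (4, 5), (5, 3), (5, 0)]
adj = np.zeros((6, 6))
for src, dst in edges:
    adj[dst, src] = 1.0

G = google_matrix(adj)
P = pagerank(G)                               # global PageRank
r = [0, 1, 2]                                 # the "selected small set of nodes"
GR = reduced_google_matrix(G, r)
PR = pagerank(GR)                             # PageRank of the reduced network
```

    By construction G_R is itself column-stochastic, and its PageRank vector coincides with the normalised restriction of the global PageRank to the selected subset, which is the property the abstract relies on when it says the reduced matrix keeps the PageRank probabilities of the global network.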

    Google matrix analysis of Wikipedia networks

    Get PDF
    This thesis focuses on the analysis of the directed network extracted from the hyperlink structure of Wikipedia. Our goal is to measure the interactions linking a subset of pages of the Wikipedia network. To this end, we propose to take advantage of a new matrix representation called the reduced Google matrix. This reduced Google matrix (GR) is defined for a given subset of pages (i.e. a reduced network). As with the standard Google matrix, one component of GR captures the probability that two nodes of the reduced network are directly connected in the full network. One of the particularities of GR is the existence of another component that captures the probability that two nodes are indirectly connected through all possible paths of the entire network. In this thesis, the results of our case studies show that GR offers a reliable representation of direct and indirect (hidden) links. We show that GR analysis is complementary to PageRank analysis and can be exploited to study the influence of a link variation on the rest of the network structure. The case studies are based on Wikipedia networks from different language editions. The interactions between several groups of interest were studied in detail: painters, countries and terrorist groups. For each study, a reduced network was built. Direct and indirect interactions were analysed and confronted with historical, geopolitical or scientific facts. A sensitivity analysis was carried out in order to understand the influence of the links within each group on other nodes (e.g. countries in our case). Our analysis shows that it is possible to extract valuable interactions between painters, countries and terrorist groups. For example, in the painter network derived from GR, we find a grouping of artists by major movement in the history of painting. Well-known interactions between the major countries of the EU or worldwide are also highlighted in our results. Likewise, the network of terrorist groups exhibits relevant links in line with their ideology or their historical or geopolitical relationships. We conclude this study by showing that reduced Google matrix analysis is a powerful new analysis method for large directed networks. We argue that this approach can equally be applied to data represented in the form of dynamic graphs. It offers new possibilities for an efficient analysis of the interactions of a group of nodes embedded in a large directed network.

    The infrastructural conditions of (de-)growth: The case of the internet

    Get PDF
    Infrastructure studies represent a domain that remains significantly uncharted among degrowth scholars. This is paradoxical considering that infrastructures constitute a fundamental prerequisite for the equitable distribution of many aspects of human well-being that degrowth proponents emphasize. Nonetheless, the substantial resource and energy consumption associated with infrastructures cannot be overlooked. The internet offers an instructive case study in this sense: at its best, it forges human connections and produces considerable societal value. The resource implications of the often-overlooked physical layer of the internet, its data centres and submarine cables, need to be acknowledged. Furthermore, the ways in which assumptions of perpetual growth are built into this global infrastructure via the logic layer of internet protocols and other governing mechanisms such as finance and network design need to be examined if we are to determine the extent to which such infrastructures are inherently growth dependent. In making these two arguments, we draw upon the work of both Science and Technology Studies (STS) and Large Technological System (LTS) studies on the inherent problems of large infrastructures, which have thus far seen little engagement with questions of degrowth. We review the case of the internet and suggest a number of scenarios that illustrate potential roles for such infrastructures in any planned reduction of economic activity.

    Risk assessment at AGI companies: A review of popular risk assessment techniques from other safety-critical industries

    Full text link
    Companies like OpenAI, Google DeepMind, and Anthropic have the stated goal of building artificial general intelligence (AGI) - AI systems that perform as well as or better than humans on a wide variety of cognitive tasks. However, there are increasing concerns that AGI would pose catastrophic risks. In light of this, AGI companies need to drastically improve their risk management practices. To support such efforts, this paper reviews popular risk assessment techniques from other safety-critical industries and suggests ways in which AGI companies could use them to assess catastrophic risks from AI. The paper discusses three risk identification techniques (scenario analysis, fishbone method, and risk typologies and taxonomies), five risk analysis techniques (causal mapping, Delphi technique, cross-impact analysis, bow tie analysis, and system-theoretic process analysis), and two risk evaluation techniques (checklists and risk matrices). For each of them, the paper explains how they work, suggests ways in which AGI companies could use them, discusses their benefits and limitations, and makes recommendations. Finally, the paper discusses when to conduct risk assessments, when to use which technique, and how to use any of them. The reviewed techniques will be obvious to risk management professionals in other industries, and they will not be sufficient to assess catastrophic risks from AI. However, AGI companies should not skip the straightforward step of reviewing best practices from other industries.
    Comment: 44 pages, 13 figures, 9 tables
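    As one concrete illustration of the risk evaluation techniques the paper reviews, a risk matrix combines ordinal likelihood and severity ratings into a qualitative rating. The sketch below is a hypothetical example only: the five-point scales, score thresholds and category names are assumptions for illustration, not a standard from any particular industry or from the paper.

```python
# Illustrative five-point ordinal scales (assumed, not from the paper).
LIKELIHOOD = ["rare", "unlikely", "possible", "likely", "almost certain"]
SEVERITY = ["negligible", "minor", "moderate", "major", "catastrophic"]

def risk_rating(likelihood: str, severity: str) -> str:
    """Combine ordinal likelihood and severity into a qualitative rating.
    The score is the product of the two 1-based scale positions; the
    thresholds below are illustrative assumptions."""
    score = (LIKELIHOOD.index(likelihood) + 1) * (SEVERITY.index(severity) + 1)
    if score >= 15:
        return "intolerable"   # must be mitigated before proceeding
    if score >= 6:
        return "tolerable"     # mitigate as far as reasonably practicable
    return "acceptable"

print(risk_rating("possible", "catastrophic"))   # 3 * 5 = 15 -> "intolerable"
```

    A catastrophic outcome rated merely "possible" already lands in the top band, which is one reason the paper pairs risk matrices with the richer identification and analysis techniques rather than using them alone.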

    A Complex Social Network Analysis of Online Financial Communities in Times of Geopolitical Military and Terrorist Events

    Get PDF
    Given the advances in technology, the field of social network analysis has very much come to the forefront in recent years. The information age harnesses social network analysis across multiple industries and for solving complex problems. Social network analysis is an important tool in the world of the military and counter-intelligence: whether in the capture of Osama Bin Laden or the uncovering of hidden Al Qaeda terrorist networks, the world around us is built on networks, hidden or otherwise. Online social networks yield new information for intelligence agencies, just as online financial communities such as Yahoo Finance yield intelligence to knowledge-hungry investors. This thesis is concerned with the exploration and exploitation of online financial community dynamics and networks using social network analysis (SNA) as a mechanism. Social network analysis measurement techniques are applied to understand the reaction of online investors to military and terrorist geopolitical events, the stock market's reaction to these events, and whether it is possible to predict military stock prices after such events.
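    As a minimal illustration of the kind of SNA measurement such a study relies on, the sketch below computes degree centrality over a small reply network from a hypothetical online financial forum; the user names and edges are invented, and degree centrality stands in for the broader family of measures (betweenness, closeness, etc.) a full analysis would use.

```python
from collections import defaultdict

# Hypothetical reply network: an edge means one forum user replied to
# another's post around a geopolitical event (names and links invented).
edges = [("ann", "bob"), ("ann", "caz"), ("bob", "caz"),
         ("caz", "dan"), ("dan", "eve")]

def degree_centrality(edges):
    """Fraction of the other users each user is directly connected to
    (treating the reply network as undirected)."""
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    n = len(neighbours)
    return {u: len(nb) / (n - 1) for u, nb in neighbours.items()}

c = degree_centrality(edges)
# "caz" bridges the two halves of the network and scores highest
```

    In an event study, centrality scores like these would be tracked before and after a military or terrorist event and compared with movements in the relevant stocks.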

    The Network Dynamics Of Social Influence In The Wisdom Of Crowds

    Get PDF
    Research on the wisdom of crowds is motivated by the observation that the average belief in a large group can be accurate even when group members are individually inaccurate. A common theoretical assumption in previous research is that accurate group beliefs can emerge only when group members are statistically independent. However, network models of belief formation suggest that the effect of social influence depends on the structure of social networks. We present a theoretical overview and two experimental studies showing that, under the right conditions, social influence can improve the accuracy of both individual group members and the group as a whole. The results support the argument that interacting groups can produce collective intelligence that surpasses the collected intelligence of independent individuals.
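    The effect described above can be sketched with a simple DeGroot-style averaging model. This is a toy illustration under assumed parameters (a fully connected influence structure, fixed self-weight and random seed), not the experimental design of the studies: individuals hold independent noisy estimates, and each revision step pulls beliefs toward the group mean, which leaves the group average exactly unchanged while shrinking individual error toward it.

```python
import random

random.seed(7)                     # fixed for reproducibility (illustrative)
TRUTH = 100.0                      # the quantity the crowd estimates
N = 50
beliefs = [random.gauss(TRUTH, 20) for _ in range(N)]  # independent noisy guesses

def mean_abs_error(beliefs):
    """Average individual error with respect to the true value."""
    return sum(abs(b - TRUTH) for b in beliefs) / len(beliefs)

def revise(beliefs, self_weight=0.5):
    """One DeGroot-style influence step: everyone moves toward the group
    mean. The group mean itself is left exactly unchanged by this update."""
    mean = sum(beliefs) / len(beliefs)
    return [self_weight * b + (1 - self_weight) * mean for b in beliefs]

before = mean_abs_error(beliefs)   # error of statistically independent individuals
for _ in range(5):
    beliefs = revise(beliefs)
after = mean_abs_error(beliefs)    # individual error after social influence
```

    Because the sample mean is typically far closer to the truth than any individual, pulling individuals toward it reduces their average error, which mirrors the paper's point that influence can help when the network does not let a few inaccurate voices dominate.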

    Mapping Crisis: Participation, Datafication, and Humanitarianism in the Age of Digital Mapping

    Get PDF
    This book brings together critical perspectives on the role that mapping people, knowledges and data now plays in humanitarian work, both in cartographic terms and through data visualisations. Since the rise of Google Earth in 2005, there has been an explosion in the use of mapping tools to quantify and assess the needs of the poor, including those affected by climate change and the wider neo-liberal agenda. Yet, while there has been a huge upsurge in the data produced around these issues, the representation of people remains questionable. Some have argued that representation has diminished in humanitarian crises as people are increasingly reduced to data points. In turn, this data becomes ever more difficult to analyse without vast computing power, leading to a dependency on the old colonial powers to refine the data of the poor before selling it back to them. These issues are not entirely new, and questions around representation, participation and humanitarianism can be traced back beyond the speeches of Truman, but the digital age throws these issues back to the fore, as machine learning, algorithms and big data centres take over the process of mapping the subjugated and subaltern. This book questions whether, as we map crises, it is the map itself that is in crisis.

    Effective distant supervision for end-to-end knowledge base population systems

    Get PDF
    The growing amounts of textual data require automatic methods for structuring relevant information so that it can be further processed by computers and systematically accessed by humans. The scenario dealt with in this dissertation is known as Knowledge Base Population (KBP), where relational information about entities is retrieved from a large text collection and stored in a database, structured according to a pre-specified schema. Most of the research in this dissertation is placed in the context of the KBP benchmark of the Text Analysis Conference (TAC KBP), which provides a test-bed to examine all steps in a complex end-to-end relation extraction setting. In this dissertation a new state of the art for the TAC KBP benchmark was achieved by focussing on the following research problems: (1) The KBP task was broken down into a modular pipeline of sub-problems, and the most pressing issues were identified and quantified at all steps. (2) The quality of semi-automatically generated training data was increased by developing noise-reduction methods, decreasing the influence of false-positive training examples. (3) A focus was laid on fine-grained entity type modelling, entity expansion, entity matching and tagging, to maintain as much recall as possible on the relational argument level. (4) A new set of effective methods for generating training data, encoding features and training relational classifiers was developed and compared with previous state-of-the-art methods.
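    The semi-automatic training-data generation mentioned in point (2) is typically done by distant supervision, which can be sketched in a few lines. The mini knowledge base, relation names and sentences below are invented for illustration: any sentence that mentions both arguments of a known KB fact is heuristically labelled with that fact's relation.

```python
# Hypothetical mini knowledge base: (subject, object) -> relation.
KB = {("Marie Curie", "Warsaw"): "per:city_of_birth",
      ("Saarland University", "Saarbruecken"): "org:city_of_headquarters"}

# Invented example sentences standing in for a large text collection.
sentences = [
    "Marie Curie was born in Warsaw in 1867.",
    "Saarland University is located in Saarbruecken.",
    "Marie Curie received the Nobel Prize in 1903.",
]

def distant_label(sentences, kb):
    """Label each sentence with every KB relation whose two arguments
    it mentions; sentences matching no KB pair yield no training example."""
    examples = []
    for sent in sentences:
        for (subj, obj), relation in kb.items():
            if subj in sent and obj in sent:
                examples.append((sent, subj, obj, relation))
    return examples

train = distant_label(sentences, KB)   # 2 labelled examples from 3 sentences
```

    The heuristic also produces false positives, since a sentence can mention both arguments without actually expressing the relation; that is exactly the noise which the noise-reduction methods of point (2) are designed to suppress.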
