31,786 research outputs found

    Graphs in machine learning: an introduction

    Full text link
    Graphs are commonly used to characterise interactions between objects of interest. Because they are based on a straightforward formalism, they are used in many scientific fields from computer science to historical sciences. In this paper, we give an introduction to some methods relying on graphs for learning. This includes both unsupervised and supervised methods. Unsupervised learning algorithms usually aim at visualising graphs in latent spaces and/or clustering the nodes. Both focus on extracting knowledge from graph topologies. While most existing techniques are only applicable to static graphs, where edges do not evolve through time, recent developments have shown that they could be extended to deal with evolving networks. In a supervised context, one generally aims at inferring labels or numerical values attached to nodes using both the graph and, when they are available, node characteristics. Balancing the two sources of information can be challenging, especially as they can disagree locally or globally. In both contexts, supervised and un-supervised, data can be relational (augmented with one or several global graphs) as described above, or graph valued. In this latter case, each object of interest is given as a full graph (possibly completed by other characteristics). In this context, natural tasks include graph clustering (as in producing clusters of graphs rather than clusters of nodes in a single graph), graph classification, etc. 1 Real networks One of the first practical studies on graphs can be dated back to the original work of Moreno [51] in the 30s. Since then, there has been a growing interest in graph analysis associated with strong developments in the modelling and the processing of these data. Graphs are now used in many scientific fields. In Biology [54, 2, 7], for instance, metabolic networks can describe pathways of biochemical reactions [41], while in social sciences networks are used to represent relation ties between actors [66, 56, 36, 34]. Other examples include powergrids [71] and the web [75]. Recently, networks have also been considered in other areas such as geography [22] and history [59, 39]. In machine learning, networks are seen as powerful tools to model problems in order to extract information from data and for prediction purposes. This is the object of this paper. For more complete surveys, we refer to [28, 62, 49, 45]. In this section, we introduce notations and highlight properties shared by most real networks. In Section 2, we then consider methods aiming at extracting information from a unique network. We will particularly focus on clustering methods where the goal is to find clusters of vertices. Finally, in Section 3, techniques that take a series of networks into account, where each network i

    Exploring the academic invisible web

    Get PDF
    Purpose: To provide a critical review of Bergman's 2001 study on the Deep Web. In addition, we bring a new concept into the discussion, the Academic Invisible Web (AIW). We define the Academic Invisible Web as consisting of all databases and collections relevant to academia but not searchable by the general-purpose internet search engines. Indexing this part of the Invisible Web is central to scientific search engines. We provide an overview of approaches followed thus far. Design/methodology/approach: Discussion of measures and calculations, estimation based on informetric laws. Literature review on approaches for uncovering information from the Invisible Web. Findings: Bergman's size estimate of the Invisible Web is highly questionable. We demonstrate some major errors in the conceptual design of the Bergman paper. A new (raw) size estimate is given. Research limitations/implications: The precision of our estimate is limited due to a small sample size and lack of reliable data. Practical implications: We can show that no single library alone will be able to index the Academic Invisible Web. We suggest collaboration to accomplish this task. Originality/value: Provides library managers and those interested in developing academic search engines with data on the size and attributes of the Academic Invisible Web.Comment: 13 pages, 3 figure

    Link Prediction in Complex Networks: A Survey

    Full text link
    Link prediction in complex networks has attracted increasing attention from both physical and computer science communities. The algorithms can be used to extract missing information, identify spurious interactions, evaluate network evolving mechanisms, and so on. This article summaries recent progress about link prediction algorithms, emphasizing on the contributions from physical perspectives and approaches, such as the random-walk-based methods and the maximum likelihood methods. We also introduce three typical applications: reconstruction of networks, evaluation of network evolving mechanism and classification of partially labelled networks. Finally, we introduce some applications and outline future challenges of link prediction algorithms.Comment: 44 pages, 5 figure

    Artificial table testing dynamically adaptive systems

    Get PDF
    Dynamically Adaptive Systems (DAS) are systems that modify their behavior and structure in response to changes in their surrounding environment. Critical mission systems increasingly incorporate adaptation and response to the environment; examples include disaster relief and space exploration systems. These systems can be decomposed in two parts: the adaptation policy that specifies how the system must react according to the environmental changes and the set of possible variants to reconfigure the system. A major challenge for testing these systems is the combinatorial explosions of variants and envi-ronment conditions to which the system must react. In this paper we focus on testing the adaption policy and propose a strategy for the selection of envi-ronmental variations that can reveal faults in the policy. Artificial Shaking Table Testing (ASTT) is a strategy inspired by shaking table testing (STT), a technique widely used in civil engineering to evaluate building's structural re-sistance to seismic events. ASTT makes use of artificial earthquakes that simu-late violent changes in the environmental conditions and stresses the system adaptation capability. We model the generation of artificial earthquakes as a search problem in which the goal is to optimize different types of envi-ronmental variations

    Moving outside the box: Researching e-learning in disruptive times

    Get PDF
    Indexación: Scopus.The rise of technology’s influence in a cross-section of fields within formal education, not to mention in the broader social world, has given rise to new forms in the way we view learning, i.e. what constitutes valid knowledge and how we arrive at that knowledge. Some scholars have claimed that technology is but a tool to support the meaning-making that lies at the root of knowledge production while others argue that technology is increasingly and inextricably intertwined not just with knowledge construction but with changes to knowledge makers themselves. Regardless which side one stands in this growing debate, it is difficult to deny that the processes we use to research learning supported by technology in order to understand these growing intricacies, have profound implications. In this paper, my aim is to argue and defend a call in the research on ICT for a critical reflective approach to researching technology use. Using examples from qualitative research in e-learning I have conducted on three continents over 15 years, and in diverse educational contexts, I seek to unravel the means and justification for research approaches that can lead to closing the gap between research and practice. These studies combined with those from a cross-disciplinary array of fields support the promotion of a research paradigm that examines the socio-cultural contexts of learning with ICT, at a time that coincides with technology becoming a social networking facilitator. Beyond the examples and justification of the merits and power of qualitative research to uncover the stories that matter in these socially embodied e-learning contexts, I discuss the methodologically and ethically charged decisions using emerging affordances of technology for analyzing and representing results, including visual ethnography. The implications both for the consumers and producers of research of moving outside the box of established research practices are yet unfathomable but excitinghttp://www.ejel.org/volume15/issue1/p5

    Mitigating Gender Bias in Machine Learning Data Sets

    Full text link
    Artificial Intelligence has the capacity to amplify and perpetuate societal biases and presents profound ethical implications for society. Gender bias has been identified in the context of employment advertising and recruitment tools, due to their reliance on underlying language processing and recommendation algorithms. Attempts to address such issues have involved testing learned associations, integrating concepts of fairness to machine learning and performing more rigorous analysis of training data. Mitigating bias when algorithms are trained on textual data is particularly challenging given the complex way gender ideology is embedded in language. This paper proposes a framework for the identification of gender bias in training data for machine learning.The work draws upon gender theory and sociolinguistics to systematically indicate levels of bias in textual training data and associated neural word embedding models, thus highlighting pathways for both removing bias from training data and critically assessing its impact.Comment: 10 pages, 5 figures, 5 Tables, Presented as Bias2020 workshop (as part of the ECIR Conference) - http://bias.disim.univaq.i
    corecore