
    Cognitive Component Analysis


    Entropy Based Sensitivity Analysis and Visualization of Social Networks

    This paper introduces a technique to analyze and visualize a social network using Shannon's entropy model. We used degree entropy and presented novel measures, such as betweenness and closeness entropies, to conduct network sensitivity analysis by evaluating the change in graph entropy under those measures. We integrated the results of our analyses into a visualization application in which the social network is presented as a node-link diagram. The size of an actor's visual representation depends on the amount of change in system entropy caused by that actor, and color information is extracted from the graph clustering analysis. Filtering of edges and nodes is also provided to improve the perception of complex graphs. The main contribution is that the information communicated by a social network is enhanced by the sensitivity analyses and visualization techniques provided in this work.
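The entropy-based sensitivity idea in this abstract can be illustrated with a minimal sketch. It assumes one common definition of degree entropy (Shannon entropy of the normalized degree distribution) and scores each actor by the entropy change when that actor is removed. The function names and the toy graph are hypothetical, and the paper's exact formulation may differ.

```python
import math


def degree_entropy(adj):
    """Shannon entropy (in bits) of p_i = deg(i) / sum of all degrees."""
    degrees = [len(neighbours) for neighbours in adj.values()]
    total = sum(degrees)
    if total == 0:
        return 0.0  # no edges left: treat entropy as zero
    return -sum((d / total) * math.log2(d / total) for d in degrees if d > 0)


def remove_node(adj, node):
    """Return a copy of the graph without `node` and its incident edges."""
    return {u: nbrs - {node} for u, nbrs in adj.items() if u != node}


def entropy_sensitivity(adj):
    """Entropy change caused by removing each actor (assumed sensitivity score)."""
    base = degree_entropy(adj)
    return {v: base - degree_entropy(remove_node(adj, v)) for v in adj}


# Hypothetical toy network: a star with one hub and three leaves.
graph = {
    "hub": {"a", "b", "c"},
    "a": {"hub"},
    "b": {"hub"},
    "c": {"hub"},
}
scores = entropy_sensitivity(graph)
```

In a visualization such as the one described, a score like `scores[v]` could drive the drawn size of actor `v`; here the hub changes the entropy far more than any leaf.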

    Intelligence and Security Informatics

    The book constitutes the proceedings of the First European Conference on Intelligence and Security Informatics, EuroISI 2008. Intelligence and security informatics (ISI) is a multidisciplinary field encompassing methodologies, models, algorithms, and advanced tools for intelligence analysis, homeland security, and terrorism research, as well as security-related public policies. These proceedings contain 25 original papers, selected from 48 submissions, on topics in intelligence and security informatics. The papers cover a broad range of fields, including social network analysis, knowledge discovery, web-based intelligence and analysis, privacy protection, access control, digital rights management, malware and intrusion detection, surveillance, crisis management, and computational intelligence.

    Probabilistic Inference of Twitter Users' Age based on What They Follow

    Twitter provides an open and rich source of data for studying human behaviour at scale and is widely used in social and network sciences. However, a major criticism of Twitter data is that demographic information is largely absent. Enhancing Twitter data with user ages would advance our ability to study social network structures, information flows and the spread of contagions. Approaches to age detection of Twitter users typically focus on specific properties of tweets, e.g., linguistic features, which are language dependent. In this paper, we devise a language-independent methodology for determining the age of Twitter users from data that is native to the Twitter ecosystem. The key idea is to use a Bayesian framework to generalise ground-truth age information from a few Twitter users to the entire network based on what/whom they follow. Our approach scales to inferring the age of 700 million Twitter accounts with high accuracy. Comment: 9 pages, 9 figures
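To make the "Bayesian framework based on what they follow" idea concrete, here is a minimal sketch using a Bernoulli naive Bayes model with Laplace smoothing. This is a swapped-in textbook technique, not the paper's actual model; the age buckets, account names, and function names are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def fit(labelled, smoothing=1.0):
    """Count follows per age bucket from (age_bucket, followed_accounts) pairs."""
    class_counts = Counter(age for age, _ in labelled)
    follow_counts = defaultdict(Counter)
    accounts = set()
    for age, follows in labelled:
        accounts |= follows
        for account in follows:
            follow_counts[age][account] += 1
    return class_counts, follow_counts, accounts, smoothing


def posterior(model, follows):
    """P(age bucket | follows) under a Bernoulli naive Bayes assumption."""
    class_counts, follow_counts, accounts, s = model
    n_users = sum(class_counts.values())
    log_scores = {}
    for age, count in class_counts.items():
        log_p = math.log(count / n_users)  # prior from the labelled users
        for account in accounts:
            # Smoothed probability that a user in this bucket follows the account.
            p = (follow_counts[age][account] + s) / (count + 2 * s)
            log_p += math.log(p if account in follows else 1.0 - p)
        log_scores[age] = log_p
    # Normalize in log space for numerical stability.
    top = max(log_scores.values())
    norm = sum(math.exp(v - top) for v in log_scores.values())
    return {age: math.exp(v - top) / norm for age, v in log_scores.items()}


# Hypothetical ground-truth users: (age bucket, accounts they follow).
labelled = [
    ("under_25", {"pop_star", "game_streamer"}),
    ("under_25", {"pop_star"}),
    ("over_40", {"news_daily"}),
    ("over_40", {"news_daily", "gardening"}),
]
model = fit(labelled)
post = posterior(model, {"pop_star"})
```

The sketch uses only follow relationships, so it is language independent in the same sense as the abstract describes; scaling to hundreds of millions of accounts would need a very different implementation.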

    Helping with inquiries: theory and practice in forensic science

    This thesis investigates the reasoning practices of forensic scientists, with specific focus on the application of the Bayesian form of probabilistic reasoning to forensic science matters. Facilitated in part by the insights of evidence scholarship, Bayes Theorem has been advocated as an essential resource for the interpretation and evaluation of forensic evidence, and has been used to support the production of specific technologies designed to aid forensic scientists in these processes. In the course of this research I have explored the ways in which Bayesian reasoning can be regarded as a socially constructed collection of practices, despite proposals that it is simply a logical way to reason about evidence. My data are drawn from two case studies. In the first, I demonstrate how the Bayesian algorithms used for the interpretation of complex DNA profiles are themselves elaborately constructed devices necessary for the anchoring of scientific practice to forensic contexts. In the second case study, an investigation of a more generalised framework of forensic investigation known as the Case Assessment and Interpretation (CAI) model, I show how the enactment of Bayesian reasoning is dependent on a series of embodied, experiential and intersubjective knowledge-forming activities. Whilst these practices may seem to be largely independent of theoretical representations of Bayesian reasoning, they are nonetheless necessary to bring the latter into being. This is at least partially due to the ambiguities and liminalities encountered in the process of applying Bayesianism to forensic investigation, and also may result from the heavy informational demands placed on the reasoner. I argue that these practices, or 'forms of Bayes', are necessary in order to negotiate areas of ontological uncertainty. 
    The results of this thesis therefore challenge prevailing conceptions of Bayes Theorem as a universal, immutable signifier, able to be put to work unproblematically in any substantive domain. Instead, I have been able to highlight the diverse range of practices required for 'Bayesian' reasoners to negotiate the sociomaterial contingencies exposed in the process of its application.

    A survey of statistical network models

    Networks are ubiquitous in science and have become a focal point for discussion in everyday life. Formal statistical models for the analysis of network data have emerged as a major topic of interest in diverse areas of study, and most of these involve a form of graphical representation. Probability models on graphs date back to 1959. Along with empirical studies in social psychology and sociology from the 1960s, these early works generated an active network community and a substantial literature in the 1970s. This effort moved into the statistical literature in the late 1970s and 1980s, and the past decade has seen a burgeoning network literature in statistical physics and computer science. The growth of the World Wide Web and the emergence of online networking communities such as Facebook, MySpace, and LinkedIn, along with a host of more specialized professional network communities, has intensified interest in the study of networks and network data. Our goal in this review is to provide the reader with an entry point to this burgeoning literature. We begin with an overview of the historical development of statistical network modeling and then introduce a number of examples that have been studied in the network literature. Our subsequent discussion focuses on a number of prominent static and dynamic network models and their interconnections. We emphasize formal model descriptions, and pay special attention to the interpretation of parameters and their estimation. We end with a description of some open problems and challenges for machine learning and statistics. Comment: 96 pages, 14 figures, 333 references

    Statistical learning methods for mining marketing and biological data

    The value of data is now broadly recognized and emphasized. More and more decisions are made based on data and analysis rather than solely on experience and intuition. With the fast development of networking, data storage, and data collection capacity, data volumes have increased dramatically in industry, science and engineering, which brings both great opportunities and challenges. To take advantage of the data flood, new computational methods are needed to process, analyze and understand these datasets. This dissertation focuses on the development of statistical learning methods for online advertising and bioinformatics to model real-world data with temporal or spatial changes. First, a collaborated online change-point detection method is proposed to identify the change-points in sparse time series. It leverages signals from auxiliary time series, such as engagement metrics, to compensate for the sparse revenue data and to improve detection efficiency and accuracy through smart collaboration. Second, a task-specific multi-task learning algorithm is developed to model ever-changing video viewing behaviors. With ℓ1-regularized task-specific features and jointly estimated shared features, it allows different models to seek common ground while reserving differences. Third, an empirical Bayes method is proposed to identify 3' and 5' alternative splicing in RNA-seq data. It formulates alternative 3' and 5' splicing site selection as a change-point problem and provides, for the first time, a systematic framework to pool information across genes and to integrate various information when available, in particular the useful junction read information, in order to obtain better performance.

    Text mining analysis roadmap (TMAR) for service research

    Purpose: The purpose of this paper is to offer a step-by-step text mining analysis roadmap (TMAR) for service researchers. The paper provides guidance on how to choose between alternative tools, using illustrative examples from a range of business contexts.
    Design/methodology/approach: The authors provide a six-stage TMAR on how to use text mining methods in practice. At each stage, the authors provide a guiding question, articulate the aim, identify a range of methods and demonstrate how machine learning and linguistic techniques can be used in practice with illustrative examples drawn from business, from an array of data types, services and contexts.
    Findings: At each of the six stages, this paper demonstrates useful insights that result from the text mining techniques to provide an in-depth understanding of the phenomenon and actionable insights for research and practice.
    Originality/value: There is little research to guide scholars and practitioners on how to gain insights from the extensive “big data” that arises from the different data sources. In a first, this paper addresses this important gap, highlighting the advantages of using text mining to gain useful insights for theory testing and practice in different service contexts.