24 research outputs found

    Uncovering Information Operations On Twitter Using Natural Language Processing And The Dynamic Wavelet Fingerprint

    Information Operations (IO) are campaigns waged by covert, powerful entities to distort public discourse in a direction that is advantageous for them. It is the behaviors of the underlying networks that signal these campaigns in action, not the specific content they are posting. In this dissertation we introduce a social media analysis system that uncovers these behaviors by analyzing the specific post timings of underlying accounts and networks. The presented method first clusters tweets based on content using Natural Language Processing (NLP). Each of these clusters - referred to as topics - is plotted in time using the metadata attached to each tweet. These topic signals are then analyzed using the Dynamic Wavelet Fingerprint (DWFP), which creates binary images of each topic that describe localized behaviors in the topic's propagation through Twitter. The features extracted from the DWFP and the underlying tweet metadata can be applied to various analyses. In this dissertation we present four applications of the presented method. First, we break down seven culturally significant tweet storms to identify characteristic, localized behaviors that are common across and unique to each tweet storm. Next, we use the DWFP signal processing to identify bot accounts. Then this method is applied to a large dataset of tweets from the early weeks of the COVID-19 pandemic to identify densely connected communities, many of which display potential IO behaviors. Finally, this method is applied to a live stream of Turkish tweets to identify coordinated networks working to push various agendas through a volatile time in Turkish politics.
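
    The abstract outlines a pipeline of content clustering, per-topic time series construction, and wavelet-based fingerprinting. The following is a hedged sketch of that general idea, not the dissertation's DWFP implementation: it clusters tweet texts with TF-IDF and k-means, bins each topic's timestamps into a signal, and thresholds a continuous wavelet transform into a binary image as a stand-in for the fingerprint step. The inputs (texts, timestamps) and all parameter values are assumptions.

```python
# Hedged sketch of the described pipeline (not the author's DWFP code).
import numpy as np
import pywt  # PyWavelets, assumed available
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

def topic_time_series(texts, timestamps, n_topics=10, n_bins=288):
    """Cluster tweets into topics and count tweets per topic per time bin."""
    vectors = TfidfVectorizer(max_features=5000, stop_words="english").fit_transform(texts)
    labels = KMeans(n_clusters=n_topics, n_init=10).fit_predict(vectors)
    t = np.asarray(timestamps, dtype=float)            # e.g. Unix seconds
    edges = np.linspace(t.min(), t.max(), n_bins + 1)
    return np.stack([np.histogram(t[labels == k], bins=edges)[0]
                     for k in range(n_topics)])

def wavelet_fingerprint(signal, scales=np.arange(1, 31), threshold=0.5):
    """Binary image of normalised wavelet coefficients for one topic signal."""
    coeffs, _ = pywt.cwt(signal.astype(float), scales, "mexh")
    norm = np.abs(coeffs) / (np.abs(coeffs).max() + 1e-12)
    return (norm > threshold).astype(np.uint8)
```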

    2016 Oklahoma Research Day Full Program

    This document contains all abstracts from the 2016 Oklahoma Research Day held at Northeastern State University.

    Urban Informatics

    This open access book is the first to systematically introduce the principles of urban informatics and its application to every aspect of the city that involves its functioning, control, management, and future planning. It introduces new models and tools being developed to understand and implement these technologies that enable cities to function more efficiently – to become ‘smart’ and ‘sustainable’. The smart city has quickly emerged as computers have become ever smaller to the point where they can be embedded into the very fabric of the city, as well as being central to new ways in which the population can communicate and act. When cities are wired in this way, they have the potential to become sentient and responsive, generating massive streams of ‘big’ data in real time as well as providing immense opportunities for extracting new forms of urban data through crowdsourcing. This book offers a comprehensive review of the methods that form the core of urban informatics, from various kinds of urban remote sensing to new approaches to machine learning and statistical modelling. It provides a detailed technical introduction to the wide array of tools information scientists need to develop the key urban analytics that are fundamental to learning about the smart city, and it outlines ways in which these tools can be used to inform design and policy so that cities can become more efficient with a greater concern for the environment and equity.

    Scanning the Science-Society Horizon

    Science communication approaches have evolved over time, gradually placing more importance on understanding the context of the communication and the audience. The increase in people participating in social media on the Internet offers a new resource for monitoring what people are discussing. People self-publish their views on social media, which provides a rich source of everyday thinking from everyday people. This introduces the possibility of using passive monitoring of this public discussion to find information useful to science communicators, to allow them to better target their communications about different topics. This research study is focussed on understanding what open source intelligence, in the form of public tweets on Twitter, reveals about the contexts in which the word 'science' is used by the English-speaking public. By conducting a series of studies based on simpler questions, I gradually build up a view of who is contributing on Twitter, how often, and what topics are being discussed that include the keyword 'science'. An open source data gathering tool for Twitter was developed and used to collect a dataset from Twitter with the keyword 'science' during 2011. After collection was completed, the data was prepared for analysis by removing unwanted tweets. The size of the dataset (12.2 million tweets by 3.6 million users (authors)) required the use of mainly quantitative approaches, even though this only represents a very small proportion, about 0.02%, of the total tweets per day on Twitter. Fourier analysis was used to create a model of the underlying temporal pattern of tweets per day and revealed a weekly pattern. The number of users per day followed a similar pattern, and most of these users did not use the word 'science' often on Twitter. An investigation of types of tweets suggests that people using the word 'science' were engaged in more sharing of both links and other people's tweets than is usual on Twitter. Consideration of word frequency and bigrams in the text of the tweets found that while word frequencies were not particularly effective when trying to understand such a large dataset, bigrams were able to give insight into the contexts in which 'science' is being used in up to 19.19% of the tweets. The final study used Latent Dirichlet Allocation (LDA) topic modelling to identify the contexts in which 'science' was being used and gave a much richer view of the whole corpus than the bigram analysis. Although the thesis has focused on the single keyword 'science', the techniques developed should be applicable to other keywords and so be able to provide science communicators with a near real-time source of information about what issues the public is concerned about, what they are saying about those issues, and how that is changing over time.
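
    The thesis reports using Fourier analysis to uncover a weekly pattern in tweets per day. As an illustrative sketch only (the daily_counts input is an assumed array of tweets per day, not the thesis's data or code), a discrete Fourier transform can expose such a periodicity:

```python
# Illustrative sketch: find dominant periodicities in a daily count series.
import numpy as np

def dominant_periods(daily_counts, top_k=3):
    """Return the strongest periods (in days) in a tweets-per-day series."""
    x = np.asarray(daily_counts, dtype=float)
    x = x - x.mean()                            # remove the constant (DC) component
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(x.size, d=1.0)      # cycles per day
    order = np.argsort(spectrum[1:])[::-1] + 1  # rank bins, skipping zero frequency
    return [1.0 / freqs[i] for i in order[:top_k]]

# A period near 7 days in the output would correspond to the weekly pattern
# described in the abstract above.
```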

    High-Performance Modelling and Simulation for Big Data Applications

    This open access book was prepared as a Final Publication of the COST Action IC1406 “High-Performance Modelling and Simulation for Big Data Applications (cHiPSet)” project. Long considered important pillars of the scientific method, Modelling and Simulation have evolved from traditional discrete numerical methods to complex data-intensive continuous analytical optimisations. Resolution, scale, and accuracy have become essential to predict and analyse natural and complex systems in science and engineering. As their level of abstraction rises to give better discernment of the domain at hand, their representations become increasingly demanding of computational and data resources. On the other hand, High Performance Computing typically entails the effective use of parallel and distributed processing units coupled with efficient storage, communication and visualisation systems to underpin complex data-intensive applications in distinct scientific and technical domains. Seamless interaction between High Performance Computing and Modelling and Simulation is therefore arguably required in order to store, compute, analyse, and visualise large data sets in science and engineering. Funded by the European Commission, cHiPSet has provided a dynamic trans-European forum for its members and distinguished guests to openly discuss novel perspectives and topics of interest for these two communities. This cHiPSet compendium presents a set of selected case studies related to healthcare, biological data, computational advertising, multimedia, finance, bioinformatics, and telecommunications.

    Personality Identification from Social Media Using Deep Learning: A Review

    Social media helps in the sharing of ideas and information among people scattered around the world and thus helps in creating communities, groups, and virtual networks. Identification of personality is significant in many types of applications, such as detecting the mental state or character of a person, predicting job satisfaction and professional and personal relationship success, and in recommendation systems. Personality is also an important factor in determining individual variation in thoughts, feelings, and conduct. According to a 2018 global social media research survey, there were approximately 3.196 billion social media users worldwide. This number is estimated to grow rapidly with the use of mobile smart devices and advances in technology. Support vector machines (SVM), Naive Bayes (NB), multilayer perceptron neural networks, and convolutional neural networks (CNN) are some of the machine learning techniques used for personality identification in the literature. This paper presents various studies conducted on identifying the personality of social media users with the help of machine learning approaches, and recent studies that aimed to predict the personality of online social media (OSM) users are reviewed.
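
    One of the techniques the review names is the support vector machine applied to user-generated text. The snippet below is a minimal sketch of that idea only; the posts and the two toy personality labels are hypothetical and do not come from any dataset in the paper.

```python
# Minimal sketch: SVM over TF-IDF features for toy personality labels.
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

posts = ["I love meeting new people at parties",
         "I prefer quiet evenings with a good book"]
labels = ["extravert", "introvert"]            # hypothetical trait labels

model = make_pipeline(TfidfVectorizer(), LinearSVC())
model.fit(posts, labels)
print(model.predict(["had a great time at the concert with friends"]))
```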

    Tracking the Temporal-Evolution of Supernova Bubbles in Numerical Simulations

    The study of low-dimensional, noisy manifolds embedded in a higher dimensional space has been extremely useful in many applications, from the chemical analysis of multi-phase flows to simulations of galactic mergers. Building a probabilistic model of the manifolds has helped in describing their essential properties and how they vary in space. However, when the manifold is evolving through time, joint spatio-temporal modelling is needed in order to fully comprehend its nature. We propose a first-order Markovian process that propagates the spatial probabilistic model of a manifold at a fixed time to its adjacent temporal stages. The proposed methodology is demonstrated using a particle simulation of an interacting dwarf galaxy to describe the evolution of a cavity generated by a Supernova.
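
    The abstract describes propagating a spatial probabilistic model from one snapshot to the next in a first-order Markovian fashion. The sketch below captures that idea under stated assumptions only: a Gaussian mixture stands in for the paper's manifold model, each fit is warm-started from the previous snapshot's parameters, and the particle snapshots are assumed inputs. It is not the authors' method.

```python
# Hedged sketch: first-order Markovian propagation of a spatial model,
# using a Gaussian mixture as a stand-in for the manifold model.
from sklearn.mixture import GaussianMixture

def propagate_model(snapshots, n_components=8):
    """Fit a GMM to each particle snapshot, warm-started from the previous fit."""
    models, prev = [], None
    for points in snapshots:                  # points: (n_particles, 3) array
        if prev is None:
            gmm = GaussianMixture(n_components=n_components)
        else:
            gmm = GaussianMixture(
                n_components=n_components,
                weights_init=prev.weights_,
                means_init=prev.means_,
                precisions_init=prev.precisions_,
            )
        gmm.fit(points)
        models.append(gmm)
        prev = gmm
    return models
```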