    SAMPLING AND CHARACTERIZING EVOLVING COMMUNITIES IN SOCIAL NETWORKS

    One of the most important structures in social networks is the community. Understanding communities is useful in many applications, such as suggesting a friend to a user in an online friendship network or recommending a product to a user in an e-commerce network. However, before studying communities, researchers first need to collect appropriate data, and complete access to the data needed for community studies is unrealistic in most cases. In this work, we address the problem of crawling networks to identify community structure. First, we present a network sampling technique to crawl the community structure of dynamic networks when there is a limit on the number of nodes that can be queried. The process begins by obtaining a sample for the first time step. In subsequent time steps, the crawling process is guided by the community structure discovered in earlier steps. Experiments comparing the proposed approach with baseline techniques show that it achieves at least a 35% performance increase when the total query budget is fixed over the entire period and at least an 8% increase when the query budget is fixed per time step. Second, we propose a technique to sample communities in node-attributed edge streams when there is a limit on the maximum number of nodes that can be stored. The process learns whether the nodal information can characterize communities. If it can, the nodal information is leveraged together with the structural information to generate representative communities. If the nodal information does not characterize communities, only structural information is considered in assigning nodes to communities. The proposed approach provides a performance improvement of up to about 5 times that of the baselines. Finally, we investigate factors that characterize the evolution of communities with respect to the number of active users. We perform this investigation on the Reddit social media platform. We begin by analyzing individual conversations in one community and see how the findings generalize to other communities. The first community studied is Reddit’s changemyview. In addition to being a rich data source, the changemyview community has an interesting property: members whose views are changed award points to the users who successfully changed their minds. From the changemyview community, we observe that the linguistic style and interactions of community members can significantly differentiate susceptible and non-susceptible users. Next, we examine other communities (subreddits) and investigate how the user behaviors observed in changemyview relate to patterns of community evolution. We learn that the linguistic style and interactions of members in a community can also significantly differentiate the different phases of the community's evolution with respect to the number of active users.

    Identifying Twitter users who repost unreliable news sources with linguistic information

    Social media has become a popular source for online news consumption, with millions of users worldwide. However, it has also become a primary platform for spreading disinformation, with severe societal implications. Automatically identifying social media users who are likely to propagate posts from handles of unreliable news sources at some point in the future is of utmost importance for early detection and prevention of disinformation diffusion in a network, and has yet to be explored. To that end, we present a novel task for predicting whether a user will repost content from Twitter handles of unreliable news sources by leveraging linguistic information from the user’s own posts. We develop a new dataset of approximately 6.2K Twitter users mapped into two categories: (1) those who have reposted content from unreliable news sources; and (2) those who repost content only from reliable sources. For our task, we evaluate a battery of supervised machine learning models as well as state-of-the-art neural models, achieving up to 79.7 macro F1. In addition, our linguistic feature analysis uncovers differences in language use and style between the two user categories.
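For readers unfamiliar with the macro F1 metric reported in the abstract above, the following is a minimal sketch of how it is commonly computed: the F1 score is calculated separately for each class and the per-class scores are averaged with equal weight. The function name `macro_f1` and the toy labels are hypothetical and chosen only for illustration; they are not taken from the paper, and the paper's own evaluation code may differ. Note that the abstract reports the score on a 0 to 100 scale, whereas this sketch returns a value between 0 and 1.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (illustrative sketch)."""
    labels = sorted(set(y_true) | set(y_pred))
    per_class_f1 = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        # Count true positives, false positives and false negatives for this class.
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0  # convention: F1 is 0 when precision and recall are both 0
        per_class_f1.append(f1)
    return sum(per_class_f1) / len(per_class_f1)


# Toy example (hypothetical labels): 1 = reposted unreliable sources, 0 = did not.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
score = macro_f1(y_true, y_pred)
print(round(100 * score, 1))
```

Because each class contributes equally regardless of its size, macro F1 is less forgiving than accuracy when one of the two user categories is much smaller than the other.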

    Recommending access control decisions to social media users

    Social media has become an integral part of the Internet and has revolutionized interpersonal communication. The lines of separation between content creators and content consumers have blurred, as ordinary users now have platforms such as social media sites, blogs and microblogs at their disposal on which they can create and consume content and interact with other users. This change has also led to several well-documented privacy problems for users. The privacy problems faced by social media users can be categorized into institutional privacy problems (related to the social network provider) and social privacy problems (related to the interpersonal communication between social media users). The work presented in this thesis focuses on the social privacy issues that affect users on social media due to their interactions with members of their network who may represent various facets of their lives (such as work, family, school, etc.). In such a scenario, it is imperative for users to be able to control access to their information so that it reaches the appropriate audience. For example, a person may not want to share the same piece of information with their boss at work and with their family members. These boundaries are defined by the nature of the relationships people share with each other and are enforced by controlling access during communication. In real life, people are accustomed to doing this, but it becomes a greater challenge when interacting online. The primary contribution of the work presented in this thesis is the design of an access control recommendation mechanism for social media users that eases the burden on the user of sharing information with their contacts on the social network. The recommendation mechanism presented in this thesis, REACT (REcommending Access Control decisions To social media users), leverages information defining interpersonal relationships between social media users in conjunction with information about the content in order to appropriately represent the context of information disclosure. Prior research has pointed towards ways of employing information residing in the social network to represent social relationships between individuals. REACT relies on extensive empirical evaluation of such information in order to identify the most suitable types of information for predicting access control decisions made by social media users. In particular, the work in this thesis advances the state of the art in the following ways: (i) An empirical study to identify the most appropriate network-based community detection algorithm to represent the type of interpersonal relationships in the resulting access control recommendation mechanism. This empirical study examines the goodness of fit between the communities produced by 8 popular network-based community detection algorithms and the access control decisions made by social media users. (ii) Systematic feature engineering to derive the most appropriate profile attributes to represent the strength or closeness of relationships between social media users. Relationship strength is an essential indicator of access control preferences, and the aim is to identify the minimal subset of attributes that can accurately represent it in the resulting access control recommendation mechanism. (iii) A suitable representation of interpersonal relationships in conjunction with information about the content, which results in the design of an access control recommendation mechanism, REACT, that considers the overall context of information disclosure and is shown to produce highly accurate recommendations.

    Machine perception and learning of complex social systems

    Thesis (Ph.D.), Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2005. By Nathan Norfleet Eagle. Includes bibliographical references (p. 125-136). The study of complex social systems has traditionally been an arduous process, involving extensive surveys, interviews, ethnographic studies, or analysis of online behavior. Today, however, it is possible to use the unprecedented amount of information generated by pervasive mobile phones to provide insights into the dynamics of both individual and group behavior. Information such as continuous proximity, location, communication and activity data has been gathered from the phones of 100 human subjects at MIT. Systematic measurements from these 100 people over the course of eight months have generated one of the largest datasets of continuous human behavior ever collected, representing over 300,000 hours of daily activity. In this thesis we describe how this data can be used to uncover regular rules and structure in the behavior of both individuals and organizations, infer relationships between subjects, verify self-report survey data, and study social network dynamics. By combining theoretical models with rich and systematic measurements, we show it is possible to gain insight into the underlying behavior of complex social systems.

    NON-VERBAL COMMUNICATION WITH PHYSIOLOGICAL SENSORS. THE AESTHETIC DOMAIN OF WEARABLES AND NEURAL NETWORKS

    Historically, communication implies the transfer of information between bodies, yet this phenomenon is constantly adapting to new technological and cultural standards. In a digital context, it is commonplace to envision systems that revolve around verbal modalities. However, behavioural analysis grounded in psychology research calls attention to the emotional information disclosed by non-verbal social cues, in particular by actions that are involuntary. This notion has circulated widely through various interdisciplinary computing research fields, from which multiple studies have arisen that correlate non-verbal activity with socio-affective inferences. These are often derived from some form of motion capture and other wearable sensors, measuring the ‘invisible’ bioelectrical changes that occur inside the body. This thesis proposes a motivation and methodology for using physiological sensory data as an expressive resource for technology-mediated interactions. It begins with a thorough discussion of state-of-the-art technologies and established design principles regarding this topic, which is then applied to a novel approach alongside a selection of practice works that complement it. We advocate for aesthetic experience, experimenting with abstract representations. Unlike prevailing Affective Computing systems, the intention is not to infer or classify emotion but rather to create new opportunities for rich gestural exchange, unconfined to the verbal domain. Given the preliminary proposition of non-representation, we justify a correspondence with modern Machine Learning and multimedia interaction strategies, applying an iterative, human-centred approach to improve personalisation without compromising the emotional potential of bodily gesture. Although related studies have successfully produced strong design concepts through innovative fabrications, they are typically limited to simple, linear, one-to-one mappings and often neglect multi-user environments; with this work, we foresee a much wider potential. In our use cases, we adopt neural network architectures to generate highly granular biofeedback from low-dimensional input data. We present the following proofs of concept: Breathing Correspondence, a wearable biofeedback system inspired by Somaesthetic design principles; Latent Steps, a real-time auto-encoder to represent bodily experiences from sensor data, designed for dance performance; and Anti-Social Distancing Ensemble, an installation for public space interventions that analyses physical distance to generate a collective soundscape. Key findings are extracted from the individual reports to formulate an extensive technical and theoretical framework around this topic. The projects first aim to embrace some perspectives that are alternatives to those already established within Affective Computing research. From here, these concepts evolve further, bridging theories from contemporary creative and technical practices with the advancement of biomedical technologies.

    Towards gestural understanding for intelligent robots

    Fritsch JN. Towards gestural understanding for intelligent robots. Bielefeld: Universität Bielefeld; 2012. A strong driving force of scientific progress in the technical sciences is the quest for systems that assist humans in their daily lives and make those lives easier and more enjoyable. Nowadays smartphones are probably the most typical instances of such systems. Another class of systems receiving increasing attention is intelligent robots. Instead of offering a smartphone touch screen to select actions, these systems are intended to offer a more natural human-machine interface to their users. Out of the large range of actions performed by humans, gestures performed with the hands play a very important role, especially when humans interact with their direct surroundings, e.g., by pointing to an object or manipulating it. Consequently, a robot has to understand such gestures to offer an intuitive interface. Gestural understanding is, therefore, a key capability on the way to intelligent robots. This book deals with vision-based approaches for gestural understanding. Over the past two decades, this has been an intensive field of research which has resulted in a variety of algorithms to analyze human hand motions. Following a categorization of different gesture types and a review of other sensing techniques, the design of vision systems that achieve hand gesture understanding for intelligent robots is analyzed. For each of the individual algorithmic steps (hand detection, hand tracking, and trajectory-based gesture recognition), a separate chapter introduces common techniques and algorithms and provides example methods. The resulting recognition algorithms consider gestures in isolation and are often not sufficient for interacting with a robot, which can only understand such gestures when it incorporates context, e.g., which object was pointed at or manipulated. Going beyond purely trajectory-based gesture recognition by incorporating context is an important prerequisite for achieving gesture understanding and is addressed explicitly in a separate chapter of this book. Two types of context, user-provided context and situational context, are reviewed together with existing approaches for incorporating context into gestural understanding. Example approaches for both context types provide a deeper algorithmic insight into this field of research. An overview of recent robots capable of gesture recognition and understanding summarizes the currently realized quality of human-robot interaction. The approaches for gesture understanding covered in this book are manually designed, whereas humans learn to recognize gestures automatically while growing up. Promising research aimed at analyzing developmental learning in children in order to mimic this capability in technical systems is highlighted in the last chapter of this book, as this research direction may be highly influential for creating future gesture understanding systems.

    TEAK: A Novel Computational and GUI Software Pipeline for Reconstructing Biological Networks, Detecting Activated Biological Subnetworks, and Querying Biological Networks

    As high-throughput gene expression data becomes cheaper and cheaper, researchers are faced with a deluge of data from which biological insights need to be extracted, since the rate of data accumulation far exceeds the rate of data analysis. There is a need for computational frameworks to bridge this gap and assist researchers in their tasks. The Topology Enrichment Analysis frameworK (TEAK) is an open source GUI and software pipeline that seeks to be one of many tools that fill this gap, and it consists of three major modules. The first module, the Gene Set Cultural Algorithm, infers biological networks de novo from gene sets using the KEGG pathways as prior knowledge. The second and third modules query against the KEGG pathways using molecular profiling data and query graphs, respectively. In particular, the second module, also called TEAK, is a network partitioning module that partitions the KEGG pathways into both linear and nonlinear subpathways. In conjunction with molecular profiling data, the subpathways are ranked and displayed to the user within the TEAK GUI. Using a public microarray yeast data set, previously unreported fitness defects for dpl1 delta and lag1 delta mutants under conditions of nitrogen limitation were found using TEAK. Finally, the third module, the Query Structure Enrichment Analysis framework, is a network query module that allows researchers to query their biological hypotheses, in the form of Directed Acyclic Graphs, against the KEGG pathways.

    Privacy Preserved Model Based Approaches for Generating Open Travel Behavioural Data

    Location-aware technologies and smart phones are growing fast in usage and adoption as a medium for requesting and delivering services for daily activities. However, the increasing usage of these technologies has given rise to new challenges that need to be addressed. Privacy protection and the need for disaggregate mobility data for transportation modelling need to be balanced for applications and academic research. This dissertation focuses on developing modern privacy mechanisms that seek to satisfy requirements on privacy and data utility for fine-grained travel behavioural modelling applications using large-scale mobility data. To accomplish this, we review the challenges and opportunities that must be addressed in order to harness the full potential of “Big Transportation Data”. We also perform a quantitative evaluation of the degree of privacy provided by popular location anonymization techniques when applied to sensitive location data (i.e. homes, offices) from a travel survey. As a step towards resolving the trade-off between privacy and utility, we develop a differentially-private generative model for simultaneously synthesizing both socio-economic attributes and sequences of activity diaries. Adversarial attack models are proposed and tested to evaluate the effectiveness of the proposed system against privacy attacks. The results show that datasets from the developed privacy-enhancing system can be used for travel behavioural modelling with satisfactory results while ensuring an acceptable level of privacy.

    Hybrid intelligence for data mining

    Today, enormous amounts of data are being recorded in all kinds of activities. This sheer size provides an excellent opportunity for data scientists to retrieve valuable information using data mining techniques. Due to the complexity of data in many neoteric problems, one-size-fits-all solutions are seldom able to provide satisfactory answers. Although data mining has been an active area of study, hybrid techniques are rarely scrutinized in detail. Currently, not many techniques can handle time-varying properties while performing their core functions, nor do they retrieve and combine information from heterogeneous dimensions, e.g., textual and numerical horizons. This thesis summarizes our investigations into hybrid methods that provide data mining solutions to problems involving non-trivial datasets, such as trajectories, microblogs, and financial data. First, time-varying dynamic Bayesian networks are extended to consider both causal and dynamic regularization requirements. Combined with density-based clustering, the enhancements overcome the difficulties in modeling spatial-temporal data where heterogeneous patterns, data sparseness and distribution skewness are common. Secondly, topic-based methods are proposed for emerging outbreak and virality predictions on microblogs. Complicated models that consider structural details are popular, while others adopt overly simplified assumptions that sacrifice accuracy for efficiency. Our proposed virality prediction solution delivers the benefits of both worlds: it considers the important characteristics of a structure without the burden of fine details, which reduces complexity. Thirdly, the proposed topic-based approach for microblog mining is extended to sentiment prediction problems in finance. Sentiment-of-topic models are learned from both commentaries and prices for better risk management. Moreover, the previously proposed supervised topic model provides an avenue for associating market volatility with financial news, yet it displays poor resolution in extreme regions. To overcome this problem, an extreme topic model is proposed to predict volatility in financial markets using supervised learning. By mapping extreme events into Poisson point processes, volatile regions are magnified to reveal their hidden volatility-topic relationships. Lastly, some of the proposed hybrid methods are applied to service computing to verify that they are sufficiently generic for wider applications.

    Bayesian methods for source attribution using HIV deep sequence data

    The advent of pathogen deep-sequencing technology provides new opportunities for infectious disease surveillance, especially for fast-evolving viruses like human immunodeficiency virus (HIV). In particular, multiple reads per host contain detailed information on viral within-host diversity. This information allows the reconstruction of partial directed transmission networks, where estimates of who is the source and who is the recipient are directly available from the phylogenetic ordering of the viruses of any two individuals. This is a new approach for phylodynamics, and the topic of my thesis. In this thesis, I present updates to the bioinformatics pipeline used by the Phylogenetics And Networks for Generalised Epidemics in Africa consortium for processing HIV deep sequence data and running the phyloscanner program. I then present a semi-parametric Bayesian Poisson model for inferring infectious disease transmission flows and the sources of infection at the population level. The framework is computationally scalable in high-dimensional flow spaces thanks to Hilbert Space Gaussian process approximations, allows for sampling bias adjustments, and supports estimation of gender- and age-specific transmission flows at a finer resolution than previously possible. In this sense, the methods that I developed enable us to overcome some problems that conventional phylodynamic approaches have been unable to solve. We apply the approach to densely sampled, population-based HIV deep-sequence data from Rakai, Uganda. I focus on characterising age-specific transmission dynamics, and on examining the sources of HIV infections in adolescent and young women in particular.