103 research outputs found

    Reliable online social network data collection

    Large quantities of information are shared through online social networks, making them attractive sources of data for social network research. When studying the usage of online social networks, these data may not properly describe users’ behaviours. For instance, the data collected often include only content shared by the users, or content accessible to the researchers, hence obscuring a large amount of data that would help in understanding users’ behaviours and privacy concerns. Moreover, the data collection methods employed in experiments may also affect data reliability when participants self-report inaccurate information or are observed while using a simulated application. Understanding the effects of these collection methods on data reliability is paramount for the study of social networks; for understanding user behaviour; for designing socially-aware applications and services; and for mining data collected from such social networks and applications. This chapter reviews previous research which has looked at social network data collection and user behaviour in these networks. We highlight shortcomings in the methods used in these studies, and introduce our own methodology and user study based on the Experience Sampling Method; we claim our methodology leads to the collection of more reliable data by capturing both those data which are shared and those which are not shared. We conclude with suggestions for collecting and mining data from online social networks.

    Predicting the Evolution of Communities with Online Inductive Logic Programming

    In recent years, research on dynamic social networks has increased, partly due to the availability of data sets from streaming media. Modeling a network's dynamic behaviour can be performed at the level of communities, which represent its mesoscale structure. Communities arise as a result of user-to-user interaction. In the current work we aim to predict the evolution of communities, i.e. to predict their future form. While this problem has been studied in the past as a supervised learning problem with a variety of classifiers, the "knowledge" of a classifier is opaque and consequently incomprehensible to a human. Thus we have employed first order logic, and in particular the event calculus, to represent the communities and their evolution. We addressed the problem of predicting the evolution as an online Inductive Logic Programming (ILP) problem, where the issue is to learn first order logical clauses that associate evolutionary events, in particular Growth, Shrinkage, Continuation and Dissolution, to lower level events. The lower level events are features that represent the structural and temporal characteristics of communities. Experiments have been performed on a real life data set from the Mathematics StackExchange forum, with the OLED framework for ILP. In doing so we have produced clauses that model both short term and long term correlations.

    Uncovering Hierarchical Structure in Social Networks using Isospectral Reductions

    We employ the recently developed theory of isospectral network reductions to analyze multi-mode social networks. This procedure allows us to uncover the hierarchical structure of the networks we consider as well as the hierarchical structure of each mode of the network. Additionally, by performing a dynamical analysis of these networks we are able to analyze the evolution of their structure, allowing us to find a number of other network features. We apply both of these approaches to the Southern Women Data Set, one of the most studied social networks, and demonstrate that these techniques provide new information, which complements previous findings. Comment: 17 pages, 5 figures, 5 tables

    A compression-based method for detecting anomalies in textual data

    Nowadays, information and communications technology systems are fundamental assets of our social and economic model, and thus they should be properly protected against the malicious activity of cybercriminals. Defence mechanisms are generally articulated around tools that trace and store information in several ways, the simplest one being the generation of plain text files known as security logs. Such log files are usually inspected, in a semi-automatic way, by security analysts to detect events that may affect system integrity, confidentiality and availability. On this basis, we propose a parameter-free method to detect security incidents from structured text regardless of its nature. We use the Normalized Compression Distance to obtain a set of features that can be used by a Support Vector Machine to classify events from a heterogeneous cybersecurity environment. In particular, we explore and validate the application of our method in four different cybersecurity domains: HTTP anomaly identification, spam detection, Domain Generation Algorithms tracking and sentiment analysis. The results obtained show the validity and flexibility of our approach in different security scenarios with a low configuration burden. This research has received funding from the European Union’s Horizon 2020 Research and Innovation Programme under grant agreement No. 872855 (TRESCA project), from the Comunidad de Madrid (Spain) under the projects CYNAMON (P2018/TCS-4566) and S2017/BMD-3688, co-financed with FSE and FEDER EU funds, by the Consejo Superior de Investigaciones Científicas (CSIC) under the project LINKA20216 (“Advancing in cybersecurity technologies”, i-LINK+ program), and by Spanish project MINECO/FEDER TIN2017-84452-
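
    The Normalized Compression Distance mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not say which compressor or reference samples the authors use, so the choice of zlib and the helper names compressed_size, ncd and ncd_features are assumptions made purely for illustration; the resulting distance vector is the kind of feature that could then be passed to a Support Vector Machine.

```python
import zlib
from typing import List


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (zlib is an assumed compressor)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance:
    NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)),
    where C(.) is the compressed size."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def ncd_features(sample: bytes, references: List[bytes]) -> List[float]:
    """Distances from one sample to a set of hypothetical reference samples,
    usable as a feature vector for a downstream classifier."""
    return [ncd(sample, ref) for ref in references]
```

    Values near 0 suggest the two inputs share most of their structure, while values near 1 suggest they share little; in practice the quality of the features depends on the compressor chosen.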

    The Effectiveness of a Career Services’ Digital Dirt Workshop for Undergraduate Students

    Undergraduate students use Facebook or Myspace to communicate with their peers on the internet. Some of these individuals do not realize that their future employers may have access to their Facebook or Myspace profiles. Any negative information these employers discover about their candidates is “Digital Dirt”. The purpose of this study was to assess the effectiveness of a university-based career services’ Digital Dirt workshop for undergraduate students. This study sought to determine whether participants would give different survey responses after the Digital Dirt workshop intervention (post-test) than they had given before it (pre-test). The results of this study indicated that participants are more likely to remove pictures and personal information from their social networking profiles after participation in the Digital Dirt workshop than before attending the workshop.

    A fuzzy logic-based text classification method for social media

    Social media offer abundant information for studying people’s behaviors, emotions and opinions during the evolution of various rare events such as natural disasters. It is useful to analyze the correlation between social media and human-affected events. This study uses Twitter text data related to Hurricane Sandy (2012) to conduct information extraction and text classification. Because the original data cover different topics, we need to find the data related to Hurricane Sandy. A fuzzy logic-based approach is introduced to solve this text classification problem. The inputs to the proposed fuzzy logic-based model are multiple useful features extracted from each Twitter message. The output is each message's degree of relevance to Sandy. A number of fuzzy rules are designed and different defuzzification methods are combined in order to obtain the desired classification results. This work compares the proposed method with the well-known keyword search method in terms of correctness rate and quantity. The result shows that the proposed fuzzy logic-based approach is more suitable for classifying Twitter messages than the keyword search method.

    An Empirical Approach for Extreme Behavior Identification through Tweets Using Machine Learning

    This research was supported by the Ministry of Trade, Industry & Energy (MOTIE, Korea) under the Industrial Technology Innovation Program (No. 10063130), the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2019R1A2C1006159), and MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2019-2016-0-00313) supervised by the IITP (Institute for Information & communications Technology Promotion), and the 2018 Yeungnam University Research Grant. Peer reviewed