29 research outputs found

    The origin of bursts and heavy tails in human dynamics

    The dynamics of many social, technological and economic phenomena are driven by individual human actions, turning the quantitative understanding of human behavior into a central question of modern science. Current models of human dynamics, used from risk assessment to communications, assume that human actions are randomly distributed in time and thus well approximated by Poisson processes. In contrast, there is increasing evidence that the timing of many human activities, ranging from communication to entertainment and work patterns, follows non-Poisson statistics, characterized by bursts of rapidly occurring events separated by long periods of inactivity. Here we show that the bursty nature of human behavior is a consequence of a decision-based queuing process: when individuals execute tasks based on some perceived priority, the timing of the tasks will be heavy-tailed, with most tasks being rapidly executed while a few experience very long waiting times. In contrast, priority-blind execution is well approximated by uniform interevent statistics. These findings have important implications from resource management to service allocation in both communications and retail. Comment: Supplementary Material available at http://www.nd.edu/~network
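
The queuing mechanism described in this abstract can be illustrated with a small simulation. The sketch below is only one possible reading: the fixed list length `queue_len`, the selection probability `p`, and the function name `simulate_waits` are assumptions made for illustration and are not specified in the abstract.

```python
import random


def simulate_waits(num_steps, queue_len=2, p=0.999, seed=0):
    """Return the waiting time of every executed task.

    Assumed model (not taken from the abstract): the list always holds
    `queue_len` tasks with uniform random priorities. At each step the
    highest-priority task is executed with probability `p`; otherwise a
    task is picked at random. The executed task is replaced by a new one.
    """
    rng = random.Random(seed)
    # Each task is stored as (priority, step at which it joined the list).
    queue = [(rng.random(), 0) for _ in range(queue_len)]
    waits = []
    for step in range(1, num_steps + 1):
        if rng.random() < p:
            # Priority-driven choice: execute the most urgent task.
            idx = max(range(queue_len), key=lambda i: queue[i][0])
        else:
            # Priority-blind choice: execute any task.
            idx = rng.randrange(queue_len)
        _, arrived = queue[idx]
        waits.append(step - arrived)
        queue[idx] = (rng.random(), step)
    return waits


if __name__ == "__main__":
    for label, prob in (("priority-driven", 0.999), ("priority-blind", 0.0)):
        waits = simulate_waits(20000, p=prob, seed=1)
        share_fast = sum(w == 1 for w in waits) / len(waits)
        print(label, "longest wait:", max(waits), "share executed at once:", share_fast)
```

Under these assumptions, low-priority tasks can stay in the list for a very long time when selection is priority-driven, whereas random selection keeps all waits short.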

    Word statistics in Blogs and RSS feeds: Towards empirical universal evidence

    We focus on the statistics of word occurrences and of the waiting times between such occurrences in Blogs. Due to the heterogeneity of words' frequencies, the empirical analysis is performed by studying classes of "frequently-equivalent" words, i.e. by grouping words according to their frequencies. Two limiting cases are considered: the dilute limit, i.e. words that are used less than once a day, and the dense limit for frequent words. In both cases, extreme events occur more frequently than expected from the Poisson hypothesis. These deviations from Poisson statistics reveal non-trivial time correlations between events that are associated with bursts of activity. The distribution of waiting times is shown to behave like a stretched exponential and to have the same shape for different sets of words sharing a common frequency, thereby revealing universal features. Comment: 16 pages, 6 figures
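
As a rough illustration of the comparison with the Poisson hypothesis, the sketch below computes waiting times from a list of occurrence timestamps and compares the observed share of very long waits with the share an exponential (Poisson) waiting-time law of the same mean would predict. The function names, the threshold `multiple`, and the toy timestamps are assumptions for illustration; the abstract does not describe the authors' actual procedure.

```python
import math


def waiting_times(timestamps):
    """Gaps between consecutive occurrences, after sorting the timestamps."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def tail_excess(waits, multiple=5.0):
    """Observed / Poisson-expected share of waits longer than multiple * mean.

    For a Poisson process, waits are exponential, so the expected share of
    waits above `multiple` times the mean is exp(-multiple). A ratio well
    above 1 indicates more extreme waits than the Poisson hypothesis allows.
    """
    mean = sum(waits) / len(waits)
    observed = sum(w > multiple * mean for w in waits) / len(waits)
    return observed / math.exp(-multiple)


if __name__ == "__main__":
    # Hypothetical data: two bursts of 20 occurrences separated by a long gap.
    bursty = list(range(20)) + [200 + i for i in range(20)]
    print(tail_excess(waiting_times(bursty)))
```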

    KISS: Stochastic Packet Inspection Classifier for UDP Traffic

    This paper proposes KISS, a novel Internet classification engine. Motivated by the expected rise of UDP traffic, which stems from the momentum of Peer-to-Peer (P2P) streaming applications, we propose a novel classification framework that leverages a statistical characterization of the payload. Statistical signatures are derived by means of a Chi-Square-like test, which extracts the protocol "format" but ignores the protocol "semantic" and "synchronization" rules. The signatures feed a decision process based either on the geometric distance among samples or on Support Vector Machines. KISS is very accurate, and its signatures are intrinsically robust to packet sampling, reordering, and flow asymmetry, so that it can be used on almost any network. KISS is tested in different scenarios, considering traditional client-server protocols, VoIP, and both traditional and new P2P Internet applications. Results are astonishing. The average True Positive percentage is 99.6%, with the worst case equal to 98.1%, while results are almost perfect when dealing with new P2P streaming applications.
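
To make the idea of a Chi-Square-like payload signature concrete, here is a minimal sketch. It assumes the payload is split into 4-bit chunks and that each chunk's observed value counts are compared with a uniform distribution; the chunk size, the number of bytes inspected, and the function name `chi_square_signature` are illustrative assumptions, not details given in the abstract.

```python
def chi_square_signature(payloads, num_bytes=4):
    """Return one chi-square value per 4-bit chunk of the first `num_bytes` bytes.

    Assumption for illustration: observed counts of each chunk value across
    the packets of a flow are tested against a uniform distribution.
    Constant fields give large values, random-looking fields give small ones.
    """
    usable = [p for p in payloads if len(p) >= num_bytes]
    if not usable:
        raise ValueError("no payload is long enough")
    expected = len(usable) / 16  # uniform expectation for each of 16 values
    signature = []
    for byte_idx in range(num_bytes):
        for shift in (4, 0):  # high nibble first, then low nibble
            counts = [0] * 16
            for payload in usable:
                counts[(payload[byte_idx] >> shift) & 0x0F] += 1
            chi2 = sum((c - expected) ** 2 / expected for c in counts)
            signature.append(chi2)
    return signature


if __name__ == "__main__":
    # Hypothetical flow: a constant first byte followed by a counter byte.
    flow = [bytes([0x12, i]) for i in range(16)]
    print(chi_square_signature(flow, num_bytes=2))
```

A vector like this could then be passed to a distance-based rule or a Support Vector Machine, as the abstract describes; that step is left out here.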

    Interactive Web-based Applications Enforcing Communication and Cooperation in Distributed Teams

    In this paper we present ongoing work at CRS4 on collaboration tools. We describe DJ-Lab, a plugin for the popular integrated development environment IntelliJ IDEA that supports remote pair programming; XP4IDE, which automates the tracking of XP-managed development projects and integrates a view of the project tasks into the IDE; and WebRogue, an application for virtual presence on Web sites that lets web users see the other people connected to a web server and communicate and cooperate with them in various ways. The philosophies of these applications are analyzed to identify analogies and differences, potential evolutions, and technical and human limitations, and to chart a path for future development.

    Creating dynamic groups using context-awareness

    This article presents the conceptual communication model of dynamic groups, which dynamically utilizes three traditional communication metaphors through the use of context-based information. Dynamic groups make the creation, management and usage of groups easy. They enable social network structures to be maintained in virtual and face-to-face settings as well as in combinations of the two. This article defines the dynamic management of advanced contact lists, which can include presence and status information, a/synchronous multimedia communication tools, and methods for structuring social networks. It also contains an initial evaluation and a proposed architecture for technical realisation.

    Behavioral Profiling of SCADA Network Traffic using Machine Learning Algorithms

    Mixed traffic networks containing both traditional ICT network traffic and SCADA network traffic are more commonplace now due to the desire for remote control and monitoring of industrial processes. The ability to identify SCADA devices on a mixed traffic network with zero prior knowledge, such as port, protocol or IP address, is desirable since SCADA devices communicate over corporate networks but typically use non-standard ports and proprietary protocols. Four supervised ML algorithms are tested on a mixed traffic dataset containing 116,527 dataflows from both SCADA and traditional ICT networks: Naive Bayes, NBTree, BayesNet, and J4.8. Using packet timing, packet size and data throughput as traffic behavior categories, this research calculates 24 attributes from each device dataflow. All four algorithms are tested with three attribute subsets: a full set and two reduced attribute subsets. The attributes and ML algorithms chosen for experimentation successfully demonstrate that a true positive rate (TPR) of 0.9935 for SCADA network traffic is feasible on a given network. The research also identifies an optimal attribute subset that maintains a TPR of at least 0.99. The optimal attribute subset provides the SCADA network traffic behaviors that most effectively differentiate it from traditional ICT network traffic.

    Text Mining and Cybercrime

    This chapter describes the state of technology for studying Internet crimes against children, specifically sexual predation and cyberbullying. We begin by presenting a survey of relevant research articles that are related to the study of cybercrime. This survey includes a discussion of our work on the classification of chat logs that contain bullying or predatory behavior. Many commercial enterprises have developed parental control software to monitor these behaviors, and the latest version of some of these tools provides features that profess to protect children against predators and bullies. The chapter concludes with a discussion of these products and offers suggestions for continued research in this interesting and timely sub-field of text mining.

    Remaining popular: power-law regularities in network dynamics

    The structure of networks has been a focal research topic over the past few decades. These research efforts have enabled the discovery of numerous structural patterns and regularities, bringing forth advancements in many fields. In particular, the ubiquitous power-law patterns evident in degree distributions, graph eigenvalues and human mobility patterns have provided the opportunity to model many different complex systems. However, regularities in the dynamical patterns of networks remain a considerably less explored terrain. In this study we examine the dynamics of networks, focusing on stability characteristics of node popularity, and present our results using various empirical datasets. Specifically, we address several intriguing questions – for how long are popular nodes expected to remain so? How much time is expected to pass between two consecutive popularity periods? What characterizes nodes which manage to maintain their popularity for long periods of time? Surprisingly, we find that such temporal aspects are governed by a power-law regime, and that these power-law regularities are equally likely across all node ages.