
    TONTA: Trend-based Online Network Traffic Analysis in ad-hoc IoT networks

    Internet of Things (IoT) refers to a system of interconnected heterogeneous smart devices communicating without human intervention. A significant portion of existing IoT networks falls under the umbrella of ad-hoc and quasi ad-hoc networks. Ad-hoc based IoT networks suffer from the lack of resource-rich network infrastructures that are able to perform heavyweight network management tasks using, e.g., machine learning-based Network Traffic Monitoring and Analysis (NTMA) techniques. Designing light-weight NTMA techniques that do not need to be (re-)trained has received much attention due to the time complexity of the training phase. In this study, a novel pattern recognition method, called Trend-based Online Network Traffic Analysis (TONTA), is proposed for ad-hoc IoT networks to monitor network performance. The proposed method uses a statistical light-weight Trend Change Detection (TCD) method in an online manner. TONTA discovers predominant trends and recognizes abrupt or gradual time-series dataset changes to analyze the IoT network traffic. TONTA is then compared with RuLSIF as an offline benchmark TCD technique. The results show that TONTA raises approximately 60% fewer false-positive alarms than RuLSIF.

    Modeling Individual Cyclic Variation in Human Behavior

    Cycles are fundamental to human health and behavior. However, modeling cycles in time series data is challenging because in most cases the cycles are not labeled or directly observed and need to be inferred from multidimensional measurements taken over time. Here, we present CyHMMs, a cyclic hidden Markov model method for detecting and modeling cycles in a collection of multidimensional heterogeneous time series data. In contrast to previous cycle modeling methods, CyHMMs deal with a number of challenges encountered in modeling real-world cycles: they can model multivariate data with discrete and continuous dimensions; they explicitly model and are robust to missing data; and they can share information across individuals to model variation both within and between individual time series. Experiments on synthetic and real-world health-tracking data demonstrate that CyHMMs infer cycle lengths more accurately than existing methods, with 58% lower error on simulated data and 63% lower error on real-world data compared to the best-performing baseline. CyHMMs can also perform functions which baselines cannot: they can model the progression of individual features/symptoms over the course of the cycle, identify the most variable features, and cluster individual time series into groups with distinct characteristics. Applying CyHMMs to two real-world health-tracking datasets -- of menstrual cycle symptoms and physical activity tracking data -- yields important insights including which symptoms to expect at each point during the cycle. We also find that people fall into several groups with distinct cycle patterns, and that these groups differ along dimensions not provided to the model. For example, by modeling missing data in the menstrual cycles dataset, we are able to discover a medically relevant group of birth control users even though information on birth control is not given to the model. Comment: Accepted at WWW 201

    Avalanche analysis from multi-electrode ensemble recordings in cat, monkey and human cerebral cortex during wakefulness and sleep

    Self-organized critical states are found in many natural systems, from earthquakes to forest fires; they have also been observed in neural systems, particularly in neuronal cultures. However, the presence of critical states in the awake brain remains controversial. Here, we compared avalanche analyses performed on different in vivo preparations during wakefulness, slow-wave sleep and REM sleep, using high-density electrode arrays in cat motor cortex (96 electrodes), monkey motor cortex and premotor cortex and human temporal cortex (96 electrodes) in epileptic patients. In neuronal avalanches defined from units (up to 160 single units), the size of avalanches never clearly scaled as a power law, but rather scaled exponentially or displayed intermediate scaling. We also analyzed the dynamics of local field potentials (LFPs) and in particular LFP negative peaks (nLFPs) among the different electrodes (up to 96 sites in temporal cortex or up to 128 sites in adjacent motor and pre-motor cortices). In this case, the avalanches defined from nLFPs displayed power-law scaling in double log representations, as reported previously in monkey. However, avalanches defined from positive LFP (pLFP) peaks, which are less directly related to neuronal firing, also displayed apparent power-law scaling. Closer examination of this scaling using more reliable cumulative distribution functions (CDF) and other rigorous statistical measures did not confirm power-law scaling. The same pattern was seen for cat, monkey and human, as well as for different brain states of wakefulness and sleep. We also tested other alternative distributions. Multiple exponential fitting yielded optimal fits of the avalanche dynamics with bi-exponential distributions. Collectively, these results show no clear evidence for power-law scaling or self-organized critical states in the awake and sleeping brain of mammals, from cat to man. Comment: In press in: Frontiers in Physiology, 2012, special issue "Critical Brain Dynamics" (Edited by He BY, Daffertshofer A, Boonstra TW); 33 pages, 13 figures, 3 tables.
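
As an illustration of why an apparent straight line on a log-log plot is weak evidence, the following minimal sketch compares maximum-likelihood fits of a continuous power law and a shifted exponential to the same sample and reports their log-likelihoods. It is not the procedure used in the paper above: the function name `compare_tail_fits`, the cutoff `xmin`, and the synthetic samples are assumptions made for the example, and the paper's own analysis relies on CDFs and bi-exponential fits.

```python
import math


def compare_tail_fits(sizes, xmin):
    """Fit a power law and a shifted exponential to values >= xmin by
    maximum likelihood and return both log-likelihoods (hypothetical helper)."""
    tail = [x for x in sizes if x >= xmin]
    n = len(tail)
    sum_log = sum(math.log(x / xmin) for x in tail)
    sum_excess = sum(x - xmin for x in tail)
    if n == 0 or sum_log <= 0 or sum_excess <= 0:
        raise ValueError("need values strictly above xmin")

    # Power law p(x) = (alpha - 1) / xmin * (x / xmin) ** -alpha
    alpha = 1.0 + n / sum_log
    ll_power = n * math.log((alpha - 1.0) / xmin) - alpha * sum_log

    # Shifted exponential p(x) = lam * exp(-lam * (x - xmin))
    lam = n / sum_excess
    ll_exp = n * math.log(lam) - lam * sum_excess

    return {
        "alpha": alpha,
        "lambda": lam,
        "loglik_power_law": ll_power,
        "loglik_exponential": ll_exp,
    }


# Synthetic, deterministic sample: quantiles of a shifted exponential.
n = 200
u = [(i + 0.5) / n for i in range(n)]
exp_like = [1.0 - math.log(1.0 - p) for p in u]

fit = compare_tail_fits(exp_like, xmin=1.0)
print(fit["loglik_exponential"] > fit["loglik_power_law"])
```

A higher log-likelihood alone does not establish the better model; a likelihood-ratio or goodness-of-fit test would still be needed before drawing conclusions.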

    Analysis of Heterogeneous Data Sources for Veterinary Syndromic Surveillance to Improve Public Health Response and Aid Decision Making

    The standard technique for implementing veterinary syndromic surveillance (VSyS) is the detection of temporal or spatial anomalies in the occurrence of health incidents above a set threshold in an observed population using the Frequentist modelling approach. Most implementations of this technique also require the removal of historical outbreaks from the datasets to construct baselines. Unfortunately, some challenges exist, such as data scarcity, delayed reporting of health incidents, and variable data availability from sources, which make the VSyS implementation and alarm interpretation difficult, particularly when quantifying surveillance risk with associated uncertainties. This problem indicates that alternate or improved techniques are required to interpret alarms when incorporating uncertainties and previous knowledge of health incidents into the model to inform decision-making. Such methods must be capable of retaining historical outbreaks to assess surveillance risk. In this research work, the Stochastic Quantitative Risk Assessment (SQRA) model was proposed and developed for detecting and quantifying the risk of disease outbreaks with associated uncertainties using the Bayesian probabilistic approach in PyMC3. A systematic and comparative evaluation of the available techniques was used to select the most appropriate method and software packages based on flexibility, efficiency, usability, ability to retain historical outbreaks, and the ease of developing a model in Python. The social media datasets (Twitter) were first applied to infer a possible disease outbreak incident with associated uncertainties. The inferences were then updated using datasets from clinical and other healthcare sources to reduce uncertainties in the model and validate the outbreak. The proposed SQRA model therefore demonstrates an approach that uses the successive refinement of analysis of different data streams to define a changepoint signalling a disease outbreak. The SQRA model was tested and validated to show the method's effectiveness and reliability for differentiating and identifying risk regions with corresponding changepoints to interpret an ongoing disease outbreak incident. This demonstrates that a technique such as the SQRA method obtained through this research may aid in overcoming some of the difficulties identified in VSyS, such as data scarcity, delayed reporting, and variable availability of data from sources, ultimately contributing to science and practice.

    An outlier detection method to improve gathered datasets for network behavior analysis in IoT

    Outlier detection is a subfield of data mining that identifies data points that deviate notably from the rest of a dataset. Their deviation can indicate that these data points were generated by errors and should therefore be removed or repaired. There are many reasons for outliers in a network dataset, such as human or instrument errors, noise, or system behavior changes. Network Behavior Analysis (NBA), in turn, is a way to monitor traffic and recognize unusual actions in a network. Analyzing data trends in NBA methods is a common way to interpret the network situation. Outliers can produce erroneous trends that influence the results of the NBA methods. This paper presents an approach that, based on a trend detection method, divides the dataset into subsets in which contextual outliers are discovered. The outliers can then be removed to obtain a cleaner dataset that better shows the network behavior when using NBA methods. Increasing accuracy and reliability is the goal of our method. We compare the proposed method with the Hampel method on simulated IoT network data.
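
For readers unfamiliar with the Hampel method used as the comparison baseline above, here is a minimal sketch of a sliding-window Hampel test. It is a generic textbook form, not the paper's implementation; the function name `hampel_outliers`, the window size, the threshold, and the sample `traffic` values are all assumptions chosen for illustration.

```python
from statistics import median


def hampel_outliers(values, half_window=3, n_sigmas=3.0):
    """Return indices whose value lies more than n_sigmas scaled MADs
    from the median of the surrounding window (hypothetical helper)."""
    k = 1.4826  # scales the MAD to a standard-deviation estimate for Gaussian data
    flagged = []
    for i in range(len(values)):
        lo = max(0, i - half_window)
        hi = min(len(values), i + half_window + 1)
        window = values[lo:hi]
        med = median(window)
        mad = median([abs(v - med) for v in window])
        if abs(values[i] - med) > n_sigmas * k * mad:
            flagged.append(i)
    return flagged


# Made-up packet counts per interval with one obvious spike at index 5.
traffic = [10, 11, 10, 12, 11, 95, 10, 11, 12, 10, 11]
print(hampel_outliers(traffic))  # [5]
```

Because the test uses only a local window, it treats every point the same way regardless of the prevailing trend, which is the limitation the trend-based subset approach in the abstract aims to address.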

    Methods for event time series prediction and anomaly detection

    Event time series are sequences of events occurring in continuous time. They arise in many real-world problems and may represent, for example, posts in social media, administrations of medications to patients, or adverse events, such as episodes of atrial fibrillation or earthquakes. In this work, we study and develop methods for prediction and anomaly detection on event time series. We study two general approaches. The first approach converts event time series to regular time series of counts via time discretization. We develop methods relying on (a) nonparametric time series decomposition and (b) dynamic linear models for regular time series. The second approach models the events in continuous time directly. We develop methods relying on point processes. For prediction, we develop a new model based on point processes to combine the advantages of existing models. It is flexible enough to capture complex dependency structures between events, while not sacrificing applicability in common scenarios. For anomaly detection, we develop methods that can detect new types of anomalies in continuous time and that show advantages compared to time discretization.
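
The first approach described above, converting event times into a regular series of counts, can be sketched as follows. This is only an illustrative sketch under assumed conventions: the function name `events_to_counts`, the half-open binning rule, and the example timestamps are not taken from the work itself.

```python
import math


def events_to_counts(event_times, bin_width, start=0.0, end=None):
    """Count events in consecutive bins of width bin_width covering
    [start, end]; an event exactly at end goes into the last bin."""
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    if end is None:
        end = max(event_times, default=start)
    n_bins = max(1, math.ceil((end - start) / bin_width))
    counts = [0] * n_bins
    for t in event_times:
        if start <= t <= end:
            idx = min(int((t - start) // bin_width), n_bins - 1)
            counts[idx] += 1
    return counts


# Made-up event timestamps (e.g. in hours) binned into 1-hour intervals.
events = [0.2, 0.7, 1.1, 3.5, 3.9, 4.0]
print(events_to_counts(events, bin_width=1.0, end=5.0))  # [2, 1, 0, 2, 1]
```

The choice of `bin_width` trades temporal resolution against count sparsity, which is one reason the abstract contrasts this approach with modeling events directly in continuous time.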

    Statistical methods used for intrusion detection

    Thesis (Master)--Izmir Institute of Technology, Computer Engineering, Izmir, 2006. Includes bibliographical references (leaves: 58-64). Text in English; Abstract: Turkish and English. x, 71 leaves. Computer networks are being attacked every day. Intrusion detection systems are used to detect these attacks and reduce their effects. Signature-based intrusion detection systems can only identify known attacks and are ineffective against novel and unknown attacks. Intrusion detection using anomaly detection aims to detect unknown attacks, and algorithms have been developed for this goal. In this study, the performance of five anomaly detection algorithms and a signature-based intrusion detection system is demonstrated on synthetic and real data sets. A portion of the attacks is detected using Snort and the SPADE algorithm. PHAD and the other algorithms could not detect a considerable portion of the attacks in tests because the training data were not sufficiently long.