78,635 research outputs found

    Complex graph stream mining

    Full text link
    University of Technology Sydney. Faculty of Engineering and Information Technology.Recent years have witnessed a dramatic increase of information due to the ever development of modern technologies. The large scale of information makes data analysis, particularly data mining and knowledge discovery tasks, unprecedentedly challenging. First, data is becoming more and more interconnected. In a variety of domains such as social networks, chemical compounds, and XML documents, data is no longer represented by a flat table with instance-feature format, but exhibits complex structures indicating dependency relationships. Second, data is evolving more and more dynamically. Emerging applications such as social networks continuously generate information over time. Third, the learning tasks in many real-life applications become more and more complicated in that there are various constraints on the number of labelled data, class distributions, misclassification costs, or the number of learning tasks etc. Considering the above challenges, this research aims to investigate theoretical foundations, study new algorithm designs and system frameworks to enable the mining of complex graph streams from three aspects, including (1) Correlated Graph Stream Mining, (2) Graph Stream Classifications, and (3) Complex Task Graph Classification. In particular, correlated graph stream mining intends to carry out structured pattern search and support the query of similar graphs from a graph stream. Due to the dynamic changing nature of the streaming data and the inherent complexity of the graph query process, treating graph streams as static datasets is computationally infeasible or ineffective. Therefore, we proposed a novel algorithm, CGStream, to identify correlated graphs from a data stream, by using a sliding window, which covers a number of consecutive batches of stream data records. Experimental results demonstrate that the proposed algorithm is several times, or even an order of magnitude, more efficient than the straightforward algorithms. Graph stream classification aims to build effective and efficient classification models for graph streams with continuous growing volumes and dynamic changes. We proposed two methods for complex graph stream classification. Due to the inherent complexity of graph structure, labelling graph data is very expensive. To solve this problem, we proposed a gLSU algorithm, which aims to select discriminative subgraph features with minimum redundancy by using both labelled and unlabelled graphs for graph streams. The second approach handles graph streams with imbalanced class distributions and noise. Both frameworks use an instance weighting scheme to capture the underlying concept drifts of graph streams and achieve significant performance gain on benchmark graph streams. Complex task graph classification aims to address the graph classification problems with complex constraints. We studied two complex task graph classification problems, cost-sensitive graph classification of large-scale graphs and multi-task graph classification. As in medical diagnosis the misclassification cost/risk for different classes is inherently different and large scale graph classification is highly demanded in real-life applications, we proposed a CogBoost algorithm for cost-sensitive classification of large scale graphs. To overcome the limitation of insufficient labelled graphs for a specific learning task, we further proposed effective algorithms to leverage multiple graph learning tasks to select subgraph features and regularize multiple tasks to achieve better generalization performance for all learning tasks

    Graph ensemble boosting for imbalanced noisy graph stream classification

    Full text link
    © 2014 IEEE. Many applications involve stream data with structural dependency, graph representations, and continuously increasing volumes. For these applications, it is very common that their class distributions are imbalanced with minority (or positive) samples being only a small portion of the population, which imposes significant challenges for learning models to accurately identify minority samples. This problem is further complicated with the presence of noise, because they are similar to minority samples and any treatment for the class imbalance may falsely focus on the noise and result in deterioration of accuracy. In this paper, we propose a classification model to tackle imbalanced graph streams with noise. Our method, graph ensemble boosting, employs an ensemble-based framework to partition graph stream into chunks each containing a number of noisy graphs with imbalanced class distributions. For each individual chunk, we propose a boosting algorithm to combine discriminative subgraph pattern selection and model learning as a unified framework for graph classification. To tackle concept drifting in graph streams, an instance level weighting mechanism is used to dynamically adjust the instance weight, through which the boosting framework can emphasize on difficult graph samples. The classifiers built from different graph chunks form an ensemble for graph stream classification. Experiments on real-life imbalanced graph streams demonstrate clear benefits of our boosting design for handling imbalanced noisy graph stream

    An Empirical Study on Budget-Aware Online Kernel Algorithms for Streams of Graphs

    Full text link
    Kernel methods are considered an effective technique for on-line learning. Many approaches have been developed for compactly representing the dual solution of a kernel method when the problem imposes memory constraints. However, in literature no work is specifically tailored to streams of graphs. Motivated by the fact that the size of the feature space representation of many state-of-the-art graph kernels is relatively small and thus it is explicitly computable, we study whether executing kernel algorithms in the feature space can be more effective than the classical dual approach. We study three different algorithms and various strategies for managing the budget. Efficiency and efficacy of the proposed approaches are experimentally assessed on relatively large graph streams exhibiting concept drift. It turns out that, when strict memory budget constraints have to be enforced, working in feature space, given the current state of the art on graph kernels, is more than a viable alternative to dual approaches, both in terms of speed and classification performance.Comment: Author's version of the manuscript, to appear in Neurocomputing (ELSEVIER

    Developing a novel approach to analyse the regimes of temporary streams and their controls on aquatic biota

    Get PDF
    Temporary streams are those water courses that undergo the recurrent cessation of flow or the complete drying of their channel. The biological communities in temporary stream reaches are strongly dependent on the temporal changes of the aquatic habitats determined by the hydrological conditions. The use of the aquatic fauna structural and functional characteristics to assess the ecological quality of a temporary stream reach can not therefore be made without taking into account the controls imposed by the hydrological regime. This paper develops some methods for analysing temporary streams' aquatic regimes, based on the definition of six aquatic states that summarize the sets of mesohabitats occurring on a given reach at a particular moment, depending on the hydrological conditions: flood, riffles, connected, pools, dry and arid. We used the water discharge records from gauging stations or simulations using rainfall-runoff models to infer the temporal patterns of occurrence of these states using the developed aquatic states frequency graph. The visual analysis of this graph is complemented by the development of two metrics based on the permanence of flow and the seasonal predictability of zero flow periods. Finally, a classification of the aquatic regimes of temporary streams in terms of their influence over the development of aquatic life is put forward, defining Permanent, Temporary-pools, Temporary-dry and Episodic regime types. All these methods were tested with data from eight temporary streams around the Mediterranean from MIRAGE project and its application was a precondition to assess the ecological quality of these streams using the current methods prescribed in the European Water Framework Directive for macroinvertebrate communities

    Two Stream Scene Understanding on Graph Embedding

    Full text link
    The paper presents a novel two-stream network architecture for enhancing scene understanding in computer vision. This architecture utilizes a graph feature stream and an image feature stream, aiming to merge the strengths of both modalities for improved performance in image classification and scene graph generation tasks. The graph feature stream network comprises a segmentation structure, scene graph generation, and a graph representation module. The segmentation structure employs the UPSNet architecture with a backbone that can be a residual network, Vit, or Swin Transformer. The scene graph generation component focuses on extracting object labels and neighborhood relationships from the semantic map to create a scene graph. Graph Convolutional Networks (GCN), GraphSAGE, and Graph Attention Networks (GAT) are employed for graph representation, with an emphasis on capturing node features and their interconnections. The image feature stream network, on the other hand, focuses on image classification through the use of Vision Transformer and Swin Transformer models. The two streams are fused using various data fusion methods. This fusion is designed to leverage the complementary strengths of graph-based and image-based features.Experiments conducted on the ADE20K dataset demonstrate the effectiveness of the proposed two-stream network in improving image classification accuracy compared to conventional methods. This research provides a significant contribution to the field of computer vision, particularly in the areas of scene understanding and image classification, by effectively combining graph-based and image-based approaches

    Computing Graph Descriptors on Edge Streams

    Full text link
    Feature extraction is an essential task in graph analytics. These feature vectors, called graph descriptors, are used in downstream vector-space-based graph analysis models. This idea has proved fruitful in the past, with spectral-based graph descriptors providing state-of-the-art classification accuracy. However, known algorithms to compute meaningful descriptors do not scale to large graphs since: (1) they require storing the entire graph in memory, and (2) the end-user has no control over the algorithm's runtime. In this paper, we present streaming algorithms to approximately compute three different graph descriptors capturing the essential structure of graphs. Operating on edge streams allows us to avoid storing the entire graph in memory, and controlling the sample size enables us to keep the runtime of our algorithms within desired bounds. We demonstrate the efficacy of the proposed descriptors by analyzing the approximation error and classification accuracy. Our scalable algorithms compute descriptors of graphs with millions of edges within minutes. Moreover, these descriptors yield predictive accuracy comparable to the state-of-the-art methods but can be computed using only 25% as much memory.Comment: Extension of work accepted to PAKDD 202

    Network Sampling: From Static to Streaming Graphs

    Full text link
    Network sampling is integral to the analysis of social, information, and biological networks. Since many real-world networks are massive in size, continuously evolving, and/or distributed in nature, the network structure is often sampled in order to facilitate study. For these reasons, a more thorough and complete understanding of network sampling is critical to support the field of network science. In this paper, we outline a framework for the general problem of network sampling, by highlighting the different objectives, population and units of interest, and classes of network sampling methods. In addition, we propose a spectrum of computational models for network sampling methods, ranging from the traditionally studied model based on the assumption of a static domain to a more challenging model that is appropriate for streaming domains. We design a family of sampling methods based on the concept of graph induction that generalize across the full spectrum of computational models (from static to streaming) while efficiently preserving many of the topological properties of the input graphs. Furthermore, we demonstrate how traditional static sampling algorithms can be modified for graph streams for each of the three main classes of sampling methods: node, edge, and topology-based sampling. Our experimental results indicate that our proposed family of sampling methods more accurately preserves the underlying properties of the graph for both static and streaming graphs. Finally, we study the impact of network sampling algorithms on the parameter estimation and performance evaluation of relational classification algorithms

    Graph-Based Object Classification for Neuromorphic Vision Sensing

    Get PDF
    Neuromorphic vision sensing (NVS) devices represent visual information as sequences of asynchronous discrete events (a.k.a., "spikes'") in response to changes in scene reflectance. Unlike conventional active pixel sensing (APS), NVS allows for significantly higher event sampling rates at substantially increased energy efficiency and robustness to illumination changes. However, object classification with NVS streams cannot leverage on state-of-the-art convolutional neural networks (CNNs), since NVS does not produce frame representations. To circumvent this mismatch between sensing and processing with CNNs, we propose a compact graph representation for NVS. We couple this with novel residual graph CNN architectures and show that, when trained on spatio-temporal NVS data for object classification, such residual graph CNNs preserve the spatial and temporal coherence of spike events, while requiring less computation and memory. Finally, to address the absence of large real-world NVS datasets for complex recognition tasks, we present and make available a 100k dataset of NVS recordings of the American sign language letters, acquired with an iniLabs DAVIS240c device under real-world conditions

    Newton flows for elliptic functions II:Structural stability: classification and representation

    Get PDF
    In our previous paper we associated to each non-constant elliptic function f on a torus T a dynamical system, the elliptic Newton flow corresponding to f. We characterized the functions for which these flows are structurally stable and showed a genericity result. In the present paper we focus on the classification and representation of these structurally stable flows. The phase portrait of a structurally stable elliptic Newton flow generates a connected, cellularly embedded graph G(f) on a torus T with r vertices, 2r edges and r faces that fulfil certain combinatorial properties (Euler, Hall) on some of its subgraphs. The graph G(f) determines the conjugacy class of the flow [classification]. A connected, cellularly embedded toroidal graph G with the above Euler and Hall properties, is called a Newton graph. Any Newton graph G can be realized as the graph G(f) of the structurally stable Newton flow for some function f. This leads to: up till conjugacy between flows and (topological) equivalency between graphs, there is a one to one correspondence between the structurally stable Newton flows and Newton graphs, both with respect to the same order r of the underlying functions f [representation]. Finally, we clarify the analogy between rational and elliptic Newton flows, and show that the detection of elliptic Newton flows is possible in polynomial time. The proofs of the above results rely on Peixoto’s characterization/classification theorems for structurally stable dynamical systems on compact 2-dimensional manifolds, Stiemke’s theorem of the alternatives, Hall’s theorem of distinct representatives, the Heffter–Edmonds–Ringer rotation principle for embedded graphs, an existence theorem on gradient dynamical systems by Smale, and an interpretation of Newton flows as steady streams

    Activities Recognition and Fall Detection in Continuous Data Streams Using Radar Sensor

    Get PDF
    This student paper presents a Quadratic-kernel Support Vector Machine (SVM) based FMCW (Frequency Modulated Continuous Wave) radar system to recognize daily activities and detect fall accidents. Data collected in this work is divided into two different collection modes, namely, snapshots mode (different activities individually collected in isolation) and continuous activity mode (continuous streams of activities collected one after the other). For the continuous activity streams, a sliding window approach with 4s duration and 70% overlapping has achieved 84.7% classification accuracy and subsequent improvement of 2.6% has been proved by using Sequential Forward Selection (SFS) on six participants to identify an optimal feature set. A ‘tracking’ graph has been utilized to verify that the radar system can correctly identify falls as critical events among the other activities
    • …
    corecore