76 research outputs found

    Online Spectral Clustering on Network Streams

    Get PDF
    Graph is an extremely useful representation of a wide variety of practical systems in data analysis. Recently, with the fast accumulation of stream data from various type of networks, significant research interests have arisen on spectral clustering for network streams (or evolving networks). Compared with the general spectral clustering problem, the data analysis of this new type of problems may have additional requirements, such as short processing time, scalability in distributed computing environments, and temporal variation tracking. However, to design a spectral clustering method to satisfy these requirements certainly presents non-trivial efforts. There are three major challenges for the new algorithm design. The first challenge is online clustering computation. Most of the existing spectral methods on evolving networks are off-line methods, using standard eigensystem solvers such as the Lanczos method. It needs to recompute solutions from scratch at each time point. The second challenge is the parallelization of algorithms. To parallelize such algorithms is non-trivial since standard eigen solvers are iterative algorithms and the number of iterations can not be predetermined. The third challenge is the very limited existing work. In addition, there exists multiple limitations in the existing method, such as computational inefficiency on large similarity changes, the lack of sound theoretical basis, and the lack of effective way to handle accumulated approximate errors and large data variations over time. In this thesis, we proposed a new online spectral graph clustering approach with a family of three novel spectrum approximation algorithms. Our algorithms incrementally update the eigenpairs in an online manner to improve the computational performance. Our approaches outperformed the existing method in computational efficiency and scalability while retaining competitive or even better clustering accuracy. We derived our spectrum approximation techniques GEPT and EEPT through formal theoretical analysis. The well established matrix perturbation theory forms a solid theoretic foundation for our online clustering method. We facilitated our clustering method with a new metric to track accumulated approximation errors and measure the short-term temporal variation. The metric not only provides a balance between computational efficiency and clustering accuracy, but also offers a useful tool to adapt the online algorithm to the condition of unexpected drastic noise. In addition, we discussed our preliminary work on approximate graph mining with evolutionary process, non-stationary Bayesian Network structure learning from non-stationary time series data, and Bayesian Network structure learning with text priors imposed by non-parametric hierarchical topic modeling

    Signal separation of musical instruments: simulation-based methods for musical signal decomposition and transcription

    Get PDF
    This thesis presents techniques for the modelling of musical signals, with particular regard to monophonic and polyphonic pitch estimation. Musical signals are modelled as a set of notes, each comprising of a set of harmonically-related sinusoids. An hierarchical model is presented that is very general and applicable to any signal that can be decomposed as the sum of basis functions. Parameter estimation is posed within a Bayesian framework, allowing for the incorporation of prior information about model parameters. The resulting posterior distribution is of variable dimension and so reversible jump MCMC simulation techniques are employed for the parameter estimation task. The extension of the model to time-varying signals with high posterior correlations between model parameters is described. The parameters and hyperparameters of several frames of data are estimated jointly to achieve a more robust detection. A general model for the description of time-varying homogeneous and heterogeneous multiple component signals is developed, and then applied to the analysis of musical signals. The importance of high level musical and perceptual psychological knowledge in the formulation of the model is highlighted, and attention is drawn to the limitation of pure signal processing techniques for dealing with musical signals. Gestalt psychological grouping principles motivate the hierarchical signal model, and component identifiability is considered in terms of perceptual streaming where each component establishes its own context. A major emphasis of this thesis is the practical application of MCMC techniques, which are generally deemed to be too slow for many applications. Through the design of efficient transition kernels highly optimised for harmonic models, and by careful choice of assumptions and approximations, implementations approaching the order of realtime are viable.Engineering and Physical Sciences Research Counci

    Predicting the emotions expressed in music

    Get PDF

    Unique networks: a method to identity disease-specific regulatory networks from microarray data

    Get PDF
    This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University.The survival of any organismis determined by the mechanisms triggered in response to the inputs received. Underlying mechanisms are described by graphical networks that can be inferred from different types of data such as microarrays. Deriving robust and reliable networks can be complicated due to the microarray structure of the data characterized by a discrepancy between the number of genes and samples of several orders of magnitude, bias and noise. Researchers overcome this problem by integrating independent data together and deriving the common mechanisms through consensus network analysis. Different conditions generate different inputs to the organism which reacts triggering different mechanisms with similarities and differences. A lot of effort has been spent into identifying the commonalities under different conditions. Highlighting similarities may overshadow the differences which often identify the main characteristics of the triggered mechanisms. In this thesis we introduce the concept of study-specific mechanism. We develop a pipeline to semiautomatically identify study-specific networks called unique-networks through a combination of consensus approach, graphical similarities and network analysis. The main pipeline called UNIP (Unique Networks Identification Pipeline) takes a set of independent studies, builds gene regulatory networks for each of them, calculates an adaptation of the sensitivity measure based on the networks graphical similarities, applies clustering to group the studies who generate the most similar networks into study-clusters and derives the consensus networks. Once each study-cluster is associated with a consensus-network, we identify the links that appear only in the consensus network under consideration but not in the others (unique-connections). Considering the genes involved in the unique-connections we build Bayesian networks to derive the unique-networks. Finally, we exploit the inference tool to calculate each gene prediction-accuracy across all studies to further refine the unique-networks. Biological validation through different software and the literature are explored to validate our method. UNIP is first applied to a set of synthetic data perturbed with different levels of noise to study the performance and verify its reliability. Then, wheat under stress conditions and different types of cancer are explored. Finally, we develop a user-friendly interface to combine the set of studies by using AND and NOT logic operators. Based on the findings, UNIP is a robust and reliable method to analyse large sets of transcriptomic data. It easily detects the main complex relationships between transcriptional expression of genes specific for different conditions and also highlights structures and nodes that could be potential targets for further research
    • …
    corecore