2,929 research outputs found

    A survey of machine learning techniques applied to self organizing cellular networks

    In this paper, a survey of the literature of the past fifteen years involving Machine Learning (ML) algorithms applied to self organizing cellular networks is performed. For future networks to overcome the limitations of current cellular systems, more intelligence needs to be deployed so that a fully autonomous and flexible network can be enabled. This paper focuses on the learning perspective of Self Organizing Networks (SON) solutions. It provides an overview of the most common ML techniques encountered in cellular networks and classifies each paper in terms of its learning solution, with some examples. The authors also classify each paper in terms of its self-organizing use case and discuss how each proposed solution performed. In addition, the most commonly found ML algorithms are compared in terms of certain SON metrics, and general guidelines on when to choose each ML algorithm for each SON function are proposed. Lastly, this work provides future research directions and describes the new paradigms that more robust and intelligent algorithms, together with data gathered by operators, can bring to the cellular networks domain to fully enable the concept of SON in the near future.

    A Survey of Adaptive Resonance Theory Neural Network Models for Engineering Applications

    This survey samples from the ever-growing family of adaptive resonance theory (ART) neural network models used to perform the three primary machine learning modalities, namely, unsupervised, supervised and reinforcement learning. It comprises a representative list from classic to modern ART models, thereby painting a general picture of the architectures developed by researchers over the past 30 years. The learning dynamics of these ART models are briefly described, and their distinctive characteristics such as code representation, long-term memory and corresponding geometric interpretation are discussed. Useful engineering properties of ART (speed, configurability, explainability, parallelization and hardware implementation) are examined along with current challenges. Finally, a compilation of online software libraries is provided. It is expected that this overview will be helpful to new and seasoned ART researchers.

    Unsupervised and semi-supervised clustering with learnable cluster dependent kernels.

    Despite the large number of existing clustering methods, clustering remains a challenging task, especially when the structure of the data does not correspond to easily separable categories and when clusters vary in size, density and shape. Existing kernel-based approaches make the problem easier by adapting a specific similarity measure to the data. Although good results have been obtained using the Gaussian kernel function, its performance depends on the selection of the scaling parameter. Moreover, since one global parameter is used for the entire data set, it may not be possible to find one optimal scaling parameter when there are large variations between the distributions of the different clusters in the feature space. One way to learn optimal scaling parameters is through an exhaustive search for one optimal scaling parameter for each cluster. However, this approach is not practical because it is computationally expensive, especially when the data include a large number of clusters and when the dynamic range of possible values of the scaling parameters is large. Moreover, it is not trivial to evaluate the resulting partition in order to select the optimal parameters. To overcome this limitation, we introduce two new fuzzy relational clustering techniques that learn cluster-dependent Gaussian kernels. The first algorithm, called the clustering and Local Scale Learning algorithm (LSL), minimizes one objective function for both the optimal partition and the cluster-dependent scaling parameters that reflect the intra-cluster characteristics of the data. The second algorithm, called Fuzzy clustering with Learnable Cluster dependent Kernels (FLeCK), learns the scaling parameters by optimizing both the intra-cluster and the inter-cluster dissimilarities. Consequently, the learned scale parameters reflect the relative density, size, and position of each cluster with respect to the other clusters. We also introduce semi-supervised versions of LSL and FLeCK. These algorithms generate a fuzzy partition of the data and learn the optimal kernel resolution of each cluster simultaneously. We show that the incorporation of a small set of constraints can guide the clustering process to better learn the scaling parameters and the fuzzy memberships in order to obtain a better partition of the data. In particular, we show that partial supervision is even more useful on real high-dimensional data sets, where the algorithms are more susceptible to local minima. All of the proposed algorithms are optimized iteratively by dynamically updating the partition and the scaling parameters in each iteration. This makes these algorithms simple and fast. Moreover, our algorithms are formulated to work on relational data. This makes them applicable to data where objects cannot be represented by vectors or where clusters of similar objects cannot be represented efficiently by a single prototype. Our extensive experiments show that FLeCK and SS-FLeCK outperform existing algorithms. In particular, we show that when data include clusters with various inter-cluster and intra-cluster distances, learning cluster-dependent kernels is crucial in obtaining a good partition.
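
    The idea of a cluster-dependent Gaussian kernel on relational data can be sketched as follows. This is only an illustration and not the LSL or FLeCK update rules, which the abstract does not state. The function names, the kernel form 1 - exp(-d^2 / sigma_i), and the fuzzifier m are assumptions chosen for the example.

```python
import numpy as np

def cluster_kernel_dissimilarity(sq_dist, sigmas):
    """Per-cluster Gaussian dissimilarity (assumed form, for illustration).

    sq_dist: (n, n) matrix of pairwise squared distances (relational data).
    sigmas:  (c,) one scaling parameter per cluster.
    Returns an array of shape (c, n, n) holding 1 - exp(-d_jk^2 / sigma_i).
    """
    sq_dist = np.asarray(sq_dist, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    # Broadcast so that each cluster i rescales the same distances by its own sigma_i.
    return 1.0 - np.exp(-sq_dist[None, :, :] / sigmas[:, None, None])

def intra_cluster_cost(sq_dist, memberships, sigmas, m=2.0):
    """Membership-weighted sum of within-cluster dissimilarities, one value per cluster.

    memberships: (c, n) fuzzy memberships; m is a hypothetical fuzzifier exponent.
    """
    d = cluster_kernel_dissimilarity(sq_dist, sigmas)
    w = np.asarray(memberships, dtype=float) ** m
    # Sum over all object pairs (j, k) of w[i, j] * w[i, k] * d[i, j, k].
    return np.einsum("ij,ik,ijk->i", w, w, d)

# Two objects at squared distance 4, seen through a narrow and a wide kernel.
example_sq_dist = np.array([[0.0, 4.0], [4.0, 0.0]])
example_sigmas = np.array([1.0, 100.0])
example_d = cluster_kernel_dissimilarity(example_sq_dist, example_sigmas)
```

    With a small scaling parameter the two objects look far apart, while a large one makes them look close; this is why a single global parameter can be a poor fit when clusters differ in density or size.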

    Scalable Teacher Forcing Network for Semi-Supervised Large Scale Data Streams

    The large-scale data stream problem refers to high-speed information flow which cannot be processed in a scalable manner on a traditional computing platform. This problem also imposes expensive labelling costs, making the deployment of fully supervised algorithms infeasible. On the other hand, the problem of semi-supervised large-scale data streams is little explored in the literature because most works are designed for traditional single-node computing environments and are fully supervised approaches. This paper offers the Weakly Supervised Scalable Teacher Forcing Network (WeScatterNet) to cope with the scarcity of labelled samples and large-scale data streams simultaneously. WeScatterNet is crafted on the distributed computing platform of Apache Spark with a data-free model fusion strategy for model compression after the parallel computing stage. It features an open network structure to address the global and local drift problems while integrating a data augmentation, annotation and auto-correction (DA^3) method for handling partially labelled data streams. The performance of WeScatterNet is numerically evaluated on six large-scale data stream problems with only 25% label proportions. It shows highly competitive performance even when compared with fully supervised learners with 100% label proportions. Comment: This paper has been accepted for publication in Information Sciences.

    Searching for network modules

    When analyzing complex networks, a key target is to uncover their modular structure, which means searching for a family of modules, namely node subsets each spanning a subnetwork more densely connected than the average. This work proposes a novel type of objective function for graph clustering, in the form of a multilinear polynomial whose coefficients are determined by network topology. It may be thought of as a potential function, to be maximized, taking its values on fuzzy clusterings, or families of fuzzy subsets of nodes over which every node distributes a unit membership. When suitably parametrized, this potential is shown to attain its maximum when every node concentrates all of its unit membership on some module. The output is thus a partition, while the original discrete optimization problem is turned into a continuous version that allows alternative search strategies to be conceived. The instance of the problem being a pseudo-Boolean function assigning real-valued cluster scores to node subsets, modularity maximization is employed to exemplify a so-called quadratic form, in that the scores of singletons and pairs also fully determine the scores of larger clusters, while the resulting multilinear polynomial potential function has degree 2. After considering further quadratic instances, different from modularity and obtained by interpreting network topology in alternative manners, a greedy local-search strategy for the continuous framework is analytically compared with an existing greedy agglomerative procedure for the discrete case. Overlapping is finally discussed in terms of multiple runs, i.e. several local searches with different initializations. Comment: 10 pages
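
    As a rough illustration of evaluating a modularity-style score on fuzzy memberships, the sketch below uses the standard Newman modularity matrix and a plain quadratic relaxation. It is not the multilinear potential proposed in the paper, whose exact form the abstract does not give. The function names and the example graph are assumptions.

```python
import numpy as np

def modularity_matrix(adj):
    """Return (B, 2m) with B_ij = A_ij - k_i * k_j / (2m) for an undirected graph."""
    adj = np.asarray(adj, dtype=float)
    degrees = adj.sum(axis=1)
    two_m = degrees.sum()  # sum of degrees equals twice the number of edges
    return adj - np.outer(degrees, degrees) / two_m, two_m

def fuzzy_modularity(adj, memberships):
    """Quadratic relaxation Q(X) = trace(X^T B X) / (2m).

    memberships: (n, k) matrix whose rows are non-negative and sum to 1.
    For one-hot rows this reduces to the usual modularity of a partition.
    """
    b, two_m = modularity_matrix(adj)
    x = np.asarray(memberships, dtype=float)
    return np.trace(x.T @ b @ x) / two_m

# Hypothetical example: two triangles joined by a single edge (nodes 2 and 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
adj = np.zeros((6, 6))
for u, v in edges:
    adj[u, v] = adj[v, u] = 1.0

hard = np.zeros((6, 2))
hard[:3, 0] = 1.0  # first triangle in module 0
hard[3:, 1] = 1.0  # second triangle in module 1
uniform = np.full((6, 2), 0.5)  # every node splits its membership evenly

q_hard = fuzzy_modularity(adj, hard)
q_uniform = fuzzy_modularity(adj, uniform)
```

    In this example the hard partition into the two triangles scores higher than the uniform fuzzy clustering, consistent with the abstract's statement that the maximum is attained when each node concentrates its membership on one module.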

    Machine learning techniques implementation in power optimization, data processing, and bio-medical applications

    The rapid progress and development in machine-learning algorithms has become a key factor in determining the future of humanity. These algorithms and techniques have been used to solve a wide spectrum of problems ranging from data mining and knowledge discovery to unsupervised learning and optimization. This dissertation consists of two study areas. The first area investigates the use of reinforcement learning and adaptive critic design algorithms in the field of power grid control. The second area in this dissertation, consisting of three papers, focuses on developing and applying clustering algorithms on biomedical data. The first paper presents a novel modelling approach for demand-side management of electric water heaters using Q-learning and action-dependent heuristic dynamic programming. The implemented approaches provide an efficient load management mechanism that reduces the overall power cost and smooths the grid load profile. The second paper implements an ensemble statistical and subspace-clustering model for analyzing the heterogeneous data of autism spectrum disorder. The paper implements a novel k-dimensional algorithm that shows efficiency in handling heterogeneous datasets. The third paper provides a unified learning model for clustering neuroimaging data to identify the potential risk factors for suboptimal brain aging. In the last paper, clustering and clustering validation indices are used to identify the groups of compounds that are responsible for plant uptake and contaminant transportation from roots to plants' edible parts.