    Improving Quality of the Solution for the Team Formation Problem in Social Networks Using SCAN Variant and Evolutionary Computation

    Social Network Analysis helps to visualize and understand the roles and relationships that ease or impede collaboration and the sharing of information and knowledge in an organization. In this research work, we focus on the Team Formation Problem (TFP), an open problem in which we need to identify an ideal team, with members of complementary talents or skills, to solve a given task. Current research shows that TFP solutions have been attempted with evolutionary computation approaches, namely Cultural Algorithms (CA) and Genetic Algorithms (GA). However, SCAN (Structural Clustering Algorithm for Networks) variants such as WSCAN (Weighted Structural Clustering Algorithm for Networks) demonstrate a high capability to find solutions for other types of network problems. In this thesis, we first propose the WSCAN-TFP algorithm to deal with the problem of team formation in social networks. Our findings indicate that WSCAN-TFP runs faster than its evolutionary counterparts but produces lower-quality solutions than CAs and GAs. Next, we propose two hybrid solutions that combine GA and CA with a modified WSCAN-TFP algorithm. To test the performance of our proposed approaches, we define multiple quality criteria based on communication cost (CC), average fitness score (AFS) and average processing time. We used large datasets from the DBLP network with sizes of 50K and 100K nodes. The results show that our proposed methods, HGA and HCA, find near-optimal solutions faster and with minimum communication cost, improving average fitness by approximately 66% and 57% over the existing GA and CA methods, respectively.

    Nonparametric Feature Extraction from Dendrograms

    We propose feature extraction from dendrograms in a nonparametric way. The Minimax distance measures correspond to building a dendrogram with the single linkage criterion, together with specific forms of a level function and a distance function defined over it. We therefore extend this method to arbitrary dendrograms. We develop a generalized framework wherein different distance measures can be inferred from different types of dendrograms, level functions and distance functions. Via an appropriate embedding, we compute a vector-based representation of the inferred distances, so that many numerical machine learning algorithms can employ such distances. Then, to address the model selection problem, we study the aggregation of different dendrogram-based distances in solution space and in representation space, respectively, in the spirit of deep representations. In the first approach, for example for the clustering problem, we build a graph with positive and negative edge weights according to the consistency of the clustering labels of different objects among different solutions, in the context of ensemble methods. Then, we use an efficient variant of correlation clustering to produce the final clusters. In the second approach, we investigate the sequential combination of different distances and features, in the spirit of multi-layered architectures, to obtain the final features. Finally, we demonstrate the effectiveness of our approach via several numerical studies.
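
The link between Minimax distances and single linkage mentioned in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the function name `minimax_distances` and the toy data are assumptions made for illustration. The Minimax distance between two objects is taken here as the smallest achievable "largest edge" over all paths connecting them, which matches the height at which single linkage first merges the two objects.

```python
def minimax_distances(weights):
    """All-pairs Minimax distances for a dense, symmetric distance matrix.

    Hypothetical helper: a Floyd-Warshall-style pass in which a path costs
    the maximum edge along it, and the cheapest such path is kept.
    """
    n = len(weights)
    d = [row[:] for row in weights]  # copy so the input is left untouched
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # Cost of going i -> k -> j is the larger of the two legs.
                via_k = max(d[i][k], d[k][j])
                if via_k < d[i][j]:
                    d[i][j] = via_k
    return d


# Toy data (assumed): four points on a line, one of them far away.
points = [0.0, 1.0, 2.0, 10.0]
pairwise = [[abs(a - b) for b in points] for a in points]
d = minimax_distances(pairwise)
```

In this toy example the first three points are chained together by gaps of size 1, so their mutual Minimax distance is 1 even though the direct distance between the first and third is 2. Reaching the fourth point always requires crossing the gap of size 8.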

    Cost functions for pairwise data clustering

    Cost functions for non-hierarchical pairwise clustering are introduced, in the probabilistic autoencoder framework, by requiring maximal average similarity between the input and the output of the autoencoder. The partition provided by these cost functions identifies clusters with dense connected regions in data space; differences and similarities with respect to a well-known cost function for pairwise clustering are outlined. Comment: 5 pages, 4 figures

    Practical Attacks Against Graph-based Clustering

    Graph modeling allows numerous security problems to be tackled in a general way; however, little work has been done to understand the ability of graph-based systems to withstand adversarial attacks. We design and evaluate two novel graph attacks against a state-of-the-art network-level, graph-based detection system. Our work highlights areas in adversarial machine learning that have not yet been addressed, specifically: graph-based clustering techniques, and a global feature space where realistic attackers without perfect knowledge must be accounted for (by the defenders) in order to be practical. Even though less informed attackers can evade graph clustering at low cost, we show that some practical defenses are possible. Comment: ACM CCS 201