594 research outputs found

    A review of clustering techniques and developments

    Full text link
    © 2017 Elsevier B.V. This paper presents a comprehensive study on clustering: exiting methods and developments made at various times. Clustering is defined as an unsupervised learning where the objects are grouped on the basis of some similarity inherent among them. There are different methods for clustering the objects such as hierarchical, partitional, grid, density based and model based. The approaches used in these methods are discussed with their respective states of art and applicability. The measures of similarity as well as the evaluation criteria, which are the central components of clustering, are also presented in the paper. The applications of clustering in some fields like image segmentation, object and character recognition and data mining are highlighted

    A Trust Management Framework for Decision Support Systems

    Get PDF
    In the era of information explosion, it is critical to develop a framework which can extract useful information and help people to make “educated” decisions. In our lives, whether we are aware of it, trust has turned out to be very helpful for us to make decisions. At the same time, cognitive trust, especially in large systems, such as Facebook, Twitter, and so on, needs support from computer systems. Therefore, we need a framework that can effectively, but also intuitively, let people express their trust, and enable the system to automatically and securely summarize the massive amounts of trust information, so that a user of the system can make “educated” decisions, or at least not blind decisions. Inspired by the similarities between human trust and physical measurements, this dissertation proposes a measurement theory based trust management framework. It consists of three phases: trust modeling, trust inference, and decision making. Instead of proposing specific trust inference formulas, this dissertation proposes a fundamental framework which is flexible and can be adapted by many different inference formulas. Validation experiments are done on two data sets: the Epinions.com data set and the Twitter data set. This dissertation also adapts the measurement theory based trust management framework for two decision support applications. In the first application, the real stock market data is used as ground truth for the measurement theory based trust management framework. Basically, the correlation between the sentiment expressed on Twitter and stock market data is measured. Compared with existing works which do not differentiate tweets’ authors, this dissertation analyzes trust among stock investors on Twitter and uses the trust network to differentiate tweets’ authors. The results show that by using the measurement theory based trust framework, Twitter sentiment valence is able to reflect abnormal stock returns better than treating all the authors as equally important or weighting them by their number of followers. In the second application, the measurement theory based trust management framework is used to help to detect and prevent from being attacked in cloud computing scenarios. In this application, each single flow is treated as a measurement. The simulation results show that the measurement theory based trust management framework is able to provide guidance for cloud administrators and customers to make decisions, e.g. migrating tasks from suspect nodes to trustworthy nodes, dynamically allocating resources according to trust information, and managing the trade-off between the degree of redundancy and the cost of resources

    Relational clustering models for knowledge discovery and recommender systems

    Get PDF
    Cluster analysis is a fundamental research field in Knowledge Discovery and Data Mining (KDD). It aims at partitioning a given dataset into some homogeneous clusters so as to reflect the natural hidden data structure. Various heuristic or statistical approaches have been developed for analyzing propositional datasets. Nevertheless, in relational clustering the existence of multi-type relationships will greatly degrade the performance of traditional clustering algorithms. This issue motivates us to find more effective algorithms to conduct the cluster analysis upon relational datasets. In this thesis we comprehensively study the idea of Representative Objects for approximating data distribution and then design a multi-phase clustering framework for analyzing relational datasets with high effectiveness and efficiency. The second task considered in this thesis is to provide some better data models for people as well as machines to browse and navigate a dataset. The hierarchical taxonomy is widely used for this purpose. Compared with manually created taxonomies, automatically derived ones are more appealing because of their low creation/maintenance cost and high scalability. Up to now, the taxonomy generation techniques are mainly used to organize document corpus. We investigate the possibility of utilizing them upon relational datasets and then propose some algorithmic improvements. Another non-trivial problem is how to assign suitable labels for the taxonomic nodes so as to credibly summarize the content of each node. Unfortunately, this field has not been investigated sufficiently to the best of our knowledge, and so we attempt to fill the gap by proposing some novel approaches. The final goal of our cluster analysis and taxonomy generation techniques is to improve the scalability of recommender systems that are developed to tackle the problem of information overload. Recent research in recommender systems integrates the exploitation of domain knowledge to improve the recommendation quality, which however reduces the scalability of the whole system at the same time. We address this issue by applying the automatically derived taxonomy to preserve the pair-wise similarities between items, and then modeling the user visits by another hierarchical structure. Experimental results show that the computational complexity of the recommendation procedure can be greatly reduced and thus the system scalability be improved

    A survey of machine learning techniques applied to self organizing cellular networks

    Get PDF
    In this paper, a survey of the literature of the past fifteen years involving Machine Learning (ML) algorithms applied to self organizing cellular networks is performed. In order for future networks to overcome the current limitations and address the issues of current cellular systems, it is clear that more intelligence needs to be deployed, so that a fully autonomous and flexible network can be enabled. This paper focuses on the learning perspective of Self Organizing Networks (SON) solutions and provides, not only an overview of the most common ML techniques encountered in cellular networks, but also manages to classify each paper in terms of its learning solution, while also giving some examples. The authors also classify each paper in terms of its self-organizing use-case and discuss how each proposed solution performed. In addition, a comparison between the most commonly found ML algorithms in terms of certain SON metrics is performed and general guidelines on when to choose each ML algorithm for each SON function are proposed. Lastly, this work also provides future research directions and new paradigms that the use of more robust and intelligent algorithms, together with data gathered by operators, can bring to the cellular networks domain and fully enable the concept of SON in the near future

    A semantic enhanced hybrid recommendation approach: A case study of e-Government tourism service recommendation system

    Full text link
    © 2015 Elsevier B.V.All rights reserved. Recommender systems are effectively used as a personalized information filtering technology to automatically predict and identify a set of interesting items on behalf of users according to their personal needs and preferences. Collaborative Filtering (CF) approach is commonly used in the context of recommender systems; however, obtaining better prediction accuracy and overcoming the main limitations of the standard CF recommendation algorithms, such as sparsity and cold-start item problems, remain a significant challenge. Recent developments in personalization and recommendation techniques support the use of semantic enhanced hybrid recommender systems, which incorporate ontology-based semantic similarity measure with other recommendation approaches to improve the quality of recommendations. Consequently, this paper presents the effectiveness of utilizing semantic knowledge of items to enhance the recommendation quality. It proposes a new Inferential Ontology-based Semantic Similarity (IOBSS) measure to evaluate semantic similarity between items in a specific domain of interest by taking into account their explicit hierarchical relationships, shared attributes and implicit relationships. The paper further proposes a hybrid semantic enhanced recommendation approach by combining the new IOBSS measure and the standard item-based CF approach. A set of experiments with promising results validates the effectiveness of the proposed hybrid approach, using a case study of the Australian e-Government tourism services

    Networks and trust: systems for understanding and supporting internet security

    Get PDF
    Includes bibliographical references.2022 Fall.This dissertation takes a systems-level view of the multitude of existing trust management systems to make sense of when, where and how (or, in some cases, if) each is best utilized. Trust is a belief by one person that by transacting with another person (or organization) within a specific context, a positive outcome will result. Trust serves as a heuristic that enables us to simplify the dozens decisions we make each day about whom we will transact with. In today's hyperconnected world, in which for many people a bulk of their daily transactions related to business, entertainment, news, and even critical services like healthcare take place online, we tend to rely even more on heuristics like trust to help us simplify complex decisions. Thus, trust plays a critical role in online transactions. For this reason, over the past several decades researchers have developed a plethora of trust metrics and trust management systems for use in online systems. These systems have been most frequently applied to improve recommender systems and reputation systems. They have been designed for and applied to varied online systems including peer-to-peer (P2P) filesharing networks, e-commerce platforms, online social networks, messaging and communication networks, sensor networks, distributed computing networks, and others. However, comparatively little research has examined the effects on individuals, organizations or society of the presence or absence of trust in online sociotechnical systems. Using these existing trust metrics and trust management systems, we design a set of experiments to benchmark the performance of these existing systems, which rely heavily on network analysis methods. Drawing on the experiments' results, we propose a heuristic decision-making framework for selecting a trust management system for use in online systems. In this dissertation we also investigate several related but distinct aspects of trust in online sociotechnical systems. Using network/graph analysis methods, we examine how trust (or lack of trust) affects the performance of online networks in terms of security and quality of service. We explore the structure and behavior of online networks including Twitter, GitHub, and Reddit through the lens of trust. We find that higher levels of trust within a network are associated with more spread of misinformation (a form of cybersecurity threat, according to the US CISA) on Twitter. We also find that higher levels of trust in open source developer networks on GitHub are associated with more frequent incidences of cybersecurity vulnerabilities. Using our experimental and empirical findings previously described, we apply the Systems Engineering Process to design and prototype a trust management tool for use on Reddit, which we dub Coni the Trust Moderating Bot. Coni is, to the best of our knowledge, the first trust management tool designed specifically for use on the Reddit platform. Through our work with Coni, we develop and present a blueprint for constructing a Reddit trust tool which not only measures trust levels, but can use these trust levels to take actions on Reddit to improve the quality of submissions within the community (a subreddit)
    • …
    corecore