115 research outputs found

    Consensus-Based Agglomerative Hierarchical Clustering

    In this contribution, we consider a set of agents who assess a set of alternatives through numbers in the unit interval. In this setting, we introduce a measure that assigns a degree of consensus to each subset of agents with respect to every subset of alternatives. This consensus measure is defined as 1 minus the outcome of a symmetric aggregation function applied to the distances between the corresponding individual assessments. We establish some properties of the consensus measure, some of which depend on the aggregation function used. We also introduce an agglomerative hierarchical clustering procedure generated by similarity functions based on these consensus measures. Funding: Ministerio de Economía, Industria y Competitividad (ECO2012-32178); Junta de Castilla y León (research project support program, Ref. VA066U13).
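The definition above (1 minus a symmetric aggregation of distances) can be illustrated with a minimal sketch. The abstract does not fix the distance or the aggregation function, so the absolute difference and the arithmetic mean used here are assumptions, and the function name `consensus` and the sample data are hypothetical.

```python
from itertools import combinations
from statistics import mean


def consensus(assessments, agents, alternatives, aggregate=mean):
    """Degree of consensus of `agents` with respect to `alternatives`.

    assessments[agent][alternative] is a number in the unit interval.
    Assumptions (not taken from the paper): the distance is the absolute
    difference and the default symmetric aggregation is the arithmetic mean.
    """
    distances = [
        abs(assessments[a][x] - assessments[b][x])
        for a, b in combinations(list(agents), 2)
        for x in alternatives
    ]
    if not distances:
        # A single agent (or no alternatives) trivially agrees with itself.
        return 1.0
    return 1.0 - aggregate(distances)


# Hypothetical data: three agents assessing two alternatives.
sample = {
    "a1": [0.2, 0.8],
    "a2": [0.2, 0.8],
    "a3": [1.0, 0.0],
}
print(consensus(sample, ["a1", "a2"], [0, 1]))
print(consensus(sample, ["a1", "a3"], [0, 1]))
```

Any other symmetric aggregation (for example `max` or `statistics.median`) can be passed as `aggregate`; which properties of the measure hold would then depend on that choice, as the abstract notes.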

    Consensus-based clustering under hesitant qualitative assessments

    In this paper, we consider that agents judge the feasible alternatives through linguistic terms, when they are confident in their opinions, or through linguistic expressions formed by several consecutive linguistic terms, when they hesitate. In this context, we propose an agglomerative hierarchical clustering process where the clusters of agents are generated by using a distance-based consensus measure. Funding: Ministerio de Economía, Industria y Competitividad (ECO2012-32178); Junta de Castilla y León (research project support program, Ref. VA066U13).

    Informational Paradigm, management of uncertainty and theoretical formalisms in the clustering framework: A review

    Fifty years have gone by since the publication of the first paper on clustering based on fuzzy set theory. In 1965, L.A. Zadeh published “Fuzzy Sets” [335]. Only one year later, the first effects of this seminal paper began to emerge with the pioneering paper on clustering by Bellman, Kalaba, and Zadeh [33], in which they proposed a prototype clustering algorithm based on fuzzy set theory.

    Modified balanced random forest for improving imbalanced data prediction

    This paper proposes a Modified Balanced Random Forest (MBRF) algorithm as a classification technique to address imbalanced data. MBRF modifies the Balanced Random Forest procedure by applying a clustering-based under-sampling strategy to the bootstrap sample of each decision tree in the Random Forest algorithm. To find the best-performing configuration of the proposed method, four clustering techniques are compared: K-Means, Spectral Clustering, Agglomerative Clustering, and Ward Hierarchical Clustering. The experimental results show that Ward Hierarchical Clustering achieved the best performance, and that the proposed MBRF method outperformed the Balanced Random Forest (BRF) and Random Forest (RF) algorithms, with a sensitivity or true positive rate (TPR) of 93.42%, a specificity or true negative rate (TNR) of 93.60%, and the best AUC value of 93.51%. Moreover, MBRF also reduced running time.

    Knowledge aggregation in people recommender systems : matching skills to tasks

    People recommender systems (PRS) are a special type of recommender system (RS). They are often adopted to identify people capable of performing a task. Recommending people poses several challenges not exhibited in traditional RS. Elements such as availability, overload, unresponsiveness, and bad recommendations can have adverse effects. This thesis explores how people’s preferences can be elicited for single-event matchmaking under uncertainty and how to align them with appropriate tasks. Different methodologies are introduced to profile people, each based on the nature of the information from which it was obtained. These methodologies are developed into three use cases to illustrate the challenges of PRS and the steps taken to address them. Each one emphasizes the priorities of the matching process and the constraints under which these recommendations are made. First, multi-criteria profiles are derived entirely from heterogeneous sources in an implicit manner, characterizing users from multiple perspectives and multi-dimensional points of view without influence from the user. The profiles are applied to the conference reviewer assignment problem. Attention is given to distributing people across items in order to reduce the potential overloading of a person and the neglect or rejection of a task. Second, people’s areas of interest are inferred from their resumes and expressed in terms of their uncertainty, avoiding explicit elicitation from an individual or outsider. The profile is applied to a personnel selection problem where emphasis is placed on the preferences of the candidate, leading to an asymmetric matching process. Third, profiles are created by integrating implicit information and explicitly stated attributes. A model is developed to classify citizens according to their lifestyles which maintains the original information in the data set throughout the cluster formation. These use cases serve as pilot tests for generalization to real-life implementations. 
Areas for future application are discussed from new perspectives.
    Catalan abstract (translated): People recommender systems (PRS) are a special type of recommender system (RS). They are often used to identify people to perform a task. Recommending people involves several challenges not present in traditional RS. Elements such as availability, overload, unresponsiveness, and incorrect recommendations can have adverse effects. This thesis explores how users' preferences can be obtained for defining assignments under uncertainty and how these assignments can be aligned with defined tasks. Different methodologies are introduced to define user profiles, each according to the nature of the information required. These methodologies are developed and applied in three use cases to illustrate the challenges of PRS and the steps taken to address them. Each one highlights the priorities of the process, the fit of the recommendations, and their limitations. In the first case, profiles are derived implicitly from heterogeneous variables in order to characterize users from multiple perspectives and multidimensional points of view without the explicit influence of the user. This is applied to the problem of assigning reviewers to conference papers. Special attention is paid to distributing reviewers across papers in order to reduce the potential overload of a person and unease about or rejection of the task. In the second case, the areas of interest used to characterize people are inferred from their resumes and expressed in terms of uncertainty, so that people are not asked explicitly for their interests. The system is applied to a personnel selection problem where emphasis is placed on the candidate's preferences, leading to an asymmetric matching process. In the third case, user profiles are defined by integrating implicit information and explicitly stated attributes. A model is developed to classify citizens according to their lifestyles which maintains the original information of the data set of the cluster to which they belong. Finally, these cases are analyzed as pilot tests for generalizing implementations to future real cases. Future application areas and new perspectives are discussed.
    Postprint (published version)

    COMMUNITY DETECTION AND INFLUENCE MAXIMIZATION IN ONLINE SOCIAL NETWORKS

    Detecting and clustering data and users into communities on the social web are important and complex tasks for developing smart marketing models in changing and evolving social ecosystems. These marketing models are driven by individual decisions to purchase a product, which are influenced by friends and acquaintances. This leads to novel marketing models that view users as members of online social network communities, rather than the traditional view of marketing to individuals. This thesis starts by examining models that detect communities in online social networks. An enhanced community detection approach that clusters similar nodes together is then suggested. Social relationships play an important role in determining user behavior. For example, a user might purchase a product that a friend recently bought. Such a phenomenon is called social influence and is used to study how far the action of one user can affect the behaviors of others. Next, an original metric is proposed to compute the influential power of social network users based on logs of common actions, in order to infer a probabilistic influence propagation model. Finally, combining the community detection algorithm with the suggested influence propagation approach yields a new influence maximization model that identifies and uses the most influential users within their communities. In doing so, we employed a fuzzy logic based technique to determine the key users who drive this influence in their communities and diffuse a certain behavior. This original approach contrasts with previous influence propagation models, which did not use similarity opportunities among members of communities to maximize influence propagation. The performance results show that the model activates a higher number of overall nodes in contemporary social networks, starting from a smaller set of key users, as compared to existing landmark approaches, which influence fewer nodes yet employ a larger set of key users.

    Determine OWA operator weights using kernel density estimation

    Some subjective methods must divide the input values into local clusters before determining the ordered weighted averaging (OWA) operator weights from the data distribution characteristics of those values. However, the process of clustering input values is complex. In this paper, a novel probability density based OWA (PDOWA) operator is put forward based on the data distribution characteristics of input values. To capture the local cluster structures of input values, kernel density estimation (KDE) is used to estimate the probability density function (PDF) that fits the input values. The derived PDF contains the density information of the input values, which reflects their importance. Therefore, input values with high probability densities (PDs) should be assigned large weights, while those with low PDs should be assigned small weights. Afterwards, the desirable properties of the proposed PDOWA operator are investigated. Finally, the proposed PDOWA operator is applied to a multicriteria decision making problem concerning the evaluation of smart phones, and it is compared with some existing OWA operators. The comparative analysis shows that the proposed PDOWA operator is simpler and more efficient than the existing OWA operators.
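The idea of giving dense input values large weights can be sketched as follows. This is only an illustration under stated assumptions, not the PDOWA operator as defined in the paper: the Gaussian kernel, the fixed `bandwidth` parameter, the normalization of densities into weights, and all function names are hypothetical choices.

```python
import math


def gaussian_kde_density(values, point, bandwidth):
    """Gaussian kernel density estimate of `values`, evaluated at `point`."""
    norm = len(values) * bandwidth * math.sqrt(2 * math.pi)
    kernel_sum = sum(
        math.exp(-0.5 * ((point - v) / bandwidth) ** 2) for v in values
    )
    return kernel_sum / norm


def density_weights(values, bandwidth=0.1):
    """Weights proportional to the estimated density at each input value.

    Assumption: weights are the densities rescaled to sum to 1.
    """
    densities = [gaussian_kde_density(values, v, bandwidth) for v in values]
    total = sum(densities)
    return [d / total for d in densities]


def density_weighted_aggregate(values, bandwidth=0.1):
    """Aggregate inputs in descending order using density-based weights."""
    ordered = sorted(values, reverse=True)
    weights = density_weights(ordered, bandwidth)
    return sum(w * v for w, v in zip(weights, ordered))


# Hypothetical scores: three values close together and one outlier.
scores = [0.70, 0.72, 0.71, 0.10]
print(density_weights(scores))
print(density_weighted_aggregate(scores))
```

In this sketch the outlier sits in a low-density region, so it receives the smallest weight and the aggregate is pulled toward the cluster of similar values rather than toward the plain average.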