A kernel-based framework for learning graded relations from data
Driven by a large number of potential applications in areas like
bioinformatics, information retrieval and social network analysis, the problem
setting of inferring relations between pairs of data objects has recently been
investigated quite intensively in the machine learning community. To this end,
current approaches typically consider datasets containing crisp relations, so
that standard classification methods can be adopted. However, relations between
objects like similarities and preferences are often expressed in a graded
manner in real-world applications. A general kernel-based framework for
learning relations from data is introduced here. It extends existing approaches
because both crisp and graded relations are considered, and it unifies existing
approaches because different types of graded relations can be modeled,
including symmetric and reciprocal relations. This framework establishes
important links between recent developments in fuzzy set theory and machine
learning. Its usefulness is demonstrated through various experiments on
synthetic and real-world data. Comment: This work has been submitted to the IEEE for possible publication.
Copyright may be transferred without notice, after which this version may no
longer be accessible.
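To make the distinction between general, symmetric and reciprocal relations concrete, here is a minimal sketch of a kernel over pairs of objects. It is an assumption for illustration, not the construction taken from the paper: the abstract does not specify the kernels used, and the names `rbf` and `pair_kernel` are hypothetical. The sketch uses a product of base kernels on the two pair members, a common choice in the pairwise-kernel literature, and adds or subtracts the "swapped" term to enforce symmetry or antisymmetry.

```python
import math


def rbf(x, y, gamma=1.0):
    """Gaussian base kernel on two feature tuples (illustrative choice)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def pair_kernel(p, q, base=rbf, kind="general"):
    """Kernel between two object pairs p = (a, b) and q = (c, d).

    kind="general":    k(a, c) * k(b, d)
    kind="symmetric":  adds the swapped term, so the order within a pair
                       does not matter
    kind="reciprocal": subtracts the swapped term, so swapping a pair
                       flips the sign
    """
    (a, b), (c, d) = p, q
    direct = base(a, c) * base(b, d)
    if kind == "general":
        return direct
    swapped = base(a, d) * base(b, c)
    if kind == "symmetric":
        return direct + swapped
    if kind == "reciprocal":
        return direct - swapped
    raise ValueError(f"unknown kind: {kind!r}")


# Hypothetical toy objects described by two features each.
a, b, c, d = (0.0, 1.0), (1.0, 0.5), (0.2, 0.9), (1.5, 0.0)
print(pair_kernel((a, b), (c, d), kind="symmetric"))
print(pair_kernel((a, b), (c, d), kind="reciprocal"))
```

Any model that is a linear combination of such kernel evaluations inherits the property: with the symmetric variant its prediction for (a, b) equals that for (b, a), and with the reciprocal variant the two predictions have opposite signs. A graded output in [0, 1] would additionally need a suitable output transformation.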
Integrating Segmentation and Similarity in Melodic Analysis
The recognition of melodic structure depends on both the segmentation into structural units, the melodic motifs, and the relations between motifs, which are mainly determined by similarity. Existing models and studies of segmentation and motivic similarity cover only certain aspects and do not provide a comprehensive or coherent theory. In this paper an Integrated Segmentation and Similarity Model (ISSM) for melodic analysis is introduced. The ISSM yields an interpretation similar to a paradigmatic analysis for a given melody. An interpretation comprises a segmentation, assignments of related motifs and notes, and detailed information on the differences between assigned motifs and notes. The ISSM is based on generating and rating interpretations to find the most adequate one. For this rating a neuro-fuzzy system is used, which combines knowledge with learning from data. The ISSM is an extension of a system for rhythm analysis. This paper covers the model structure and the features relevant for melodic and motivic analysis. Melodic segmentation and similarity ratings are described, together with the results of a small experiment, which show that the ISSM can learn structural interpretations from data and that integrating similarity improves the segmentation performance of the model.
Tree similarity measure-based recommender systems
University of Technology, Sydney. Faculty of Science. The rapid growth of web information provides excellent opportunities for developing e-services in many applications, but it has also caused increasingly severe information overload: users cannot efficiently locate information that exactly meets their needs using current Internet search functions. A personalised recommender system aims to handle this issue.
A big challenge in current recommender system research is that the items and user profiles in many recommender system applications, such as e-business and e-learning recommender systems, are so complex that they can only be described in complicated tree structures. Therefore, the item or user similarity measure, as the core technique of the recommendation approach, becomes a tree similarity measure, which existing recommender systems cannot provide. Another challenge is that in many real-life situations, online recommendations that help customers select the most suitable products or services are made under incomplete and uncertain information, which calls for fuzzy set theory and techniques. Thus, how to use fuzzy set techniques to handle data uncertainty in tree-structured items or user profiles needs to be investigated.
This research aims to handle these two challenges in both theoretical and practical aspects. It first defines a tree-structured data model, which can be used to model tree-structured items, user profiles and user preferences. A comprehensive similarity measure on tree-structured data considering all the information on tree structures, nodes' concepts, weights and values is then developed, which can be used to compute the semantic similarity between tree-structured items or users, and the matching degree of items to tree-structured user requests. Based on the tree-structured data model, the tree-structured items and user requirements are modelled as item trees and user request trees respectively. An item tree and user request tree-based hybrid recommendation approach is then developed. To model users' fuzzy tree-structured preferences, a fuzzy preference tree model is proposed. A fuzzy preference tree-based recommendation approach is then developed. Experimental results on an Australian business dataset and the Movielens dataset show that the proposed recommendation approaches perform well and are well suited to dealing with tree-structured data in recommender systems. Using the proposed tree similarity measure and the recommendation approaches based on it, two real-world applications, a business partner recommender system, Smart BizSeeker, and an e-learning recommender system, ELRS, are designed and implemented, which demonstrate the applicability and effectiveness of the proposed approaches.
Toward a multilevel representation of protein molecules: comparative approaches to the aggregation/folding propensity problem
This paper builds upon the fundamental work of Niwa et al. [34], which
provides the unique possibility to analyze the relative aggregation/folding
propensity of the elements of the entire Escherichia coli (E. coli) proteome in
a cell-free standardized microenvironment. The hardness of the problem comes
from the superposition of the driving forces of intra- and inter-molecule
interactions, and it is mirrored by evidence of shifts from folding to
aggregation phenotypes caused by single-point mutations [10]. Here we apply several
state-of-the-art classification methods coming from the field of structural
pattern recognition, with the aim to compare different representations of the
same proteins gathered from the Niwa et al. database; such representations
include sequences and labeled (contact) graphs enriched with chemico-physical
attributes. Through this comparison, we are also able to identify some interesting
general properties of proteins. Notably, (i) we suggest a threshold around 250
residues discriminating "easily foldable" from "hardly foldable" molecules
consistent with other independent experiments, and (ii) we highlight the
relevance of contact graph spectra for folding behavior discrimination and
characterization of the E. coli solubility data. The soundness of the
experimental results presented in this paper is proved by the statistically
relevant relationships discovered among the chemico-physical description of
proteins and the developed cost matrix of substitution used in the various
discrimination systems. Comment: 17 pages, 3 figures, 46 references
Extended Fuzzy Clustering Algorithms
Fuzzy clustering is a widely applied method for obtaining fuzzy models from data. It has been applied successfully in various fields including finance and marketing. Despite the successful applications, there are a number of issues that must be dealt with in practical applications of fuzzy clustering algorithms. This technical report proposes two extensions to the objective function based fuzzy clustering for dealing with these issues. First, the (point) prototypes are extended to hypervolumes whose size is determined automatically from the data being clustered. These prototypes are shown to be less sensitive to a bias in the distribution of the data. Second, cluster merging by assessing the similarity among the clusters during optimization is introduced. Starting with an over-estimated number of clusters in the data, similar clusters are merged during clustering in order to obtain a suitable partitioning of the data. An adaptive threshold for merging is introduced. The proposed extensions are applied to Gustafson-Kessel and fuzzy c-means algorithms, and the resulting extended algorithms are given. The properties of the new algorithms are illustrated in various examples. Keywords: fuzzy clustering; cluster merging; similarity; volume prototypes
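For orientation, the baseline that the report extends can be sketched as follows. This is a minimal sketch of standard fuzzy c-means only; it does not implement the volume prototypes, the cluster merging, or the adaptive threshold proposed in the report. The function name `fuzzy_c_means`, its parameters, and the toy data are assumptions made for illustration.

```python
import math
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means on a list of equal-length numeric tuples.

    Returns (centers, memberships); memberships[i][k] is the degree to
    which points[i] belongs to cluster k, and each row sums to 1.
    """
    rng = random.Random(seed)
    dim = len(points[0])

    # Random initial memberships, normalised so that each row sums to 1.
    u = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(n_iter):
        # Centre update: mean of the points weighted by membership ** m.
        centers = []
        for k in range(n_clusters):
            weights = [row[k] ** m for row in u]
            total = sum(weights)
            centers.append(tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dim)
            ))

        # Membership update: inverse-distance weighting with exponent
        # 2 / (m - 1); distances are clamped to avoid division by zero.
        max_change = 0.0
        for i, p in enumerate(points):
            dists = [max(math.dist(p, c), 1e-12) for c in centers]
            inv = [dist ** (-2.0 / (m - 1.0)) for dist in dists]
            total = sum(inv)
            new_row = [v / total for v in inv]
            change = max(abs(x - y) for x, y in zip(new_row, u[i]))
            max_change = max(max_change, change)
            u[i] = new_row

        if max_change < tol:
            break
    return centers, u


# Hypothetical toy data: two well-separated groups of three points.
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
        (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
centers, memberships = fuzzy_c_means(data, n_clusters=2)
for point, row in zip(data, memberships):
    print(point, [round(v, 3) for v in row])
```

The fuzzifier `m` controls how soft the partition is: values close to 1 approach a hard assignment, while larger values spread membership more evenly across clusters.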