3,172 research outputs found

    Scalable approximate FRNN-OWA classification

    Fuzzy Rough Nearest Neighbour classification with Ordered Weighted Averaging operators (FRNN-OWA) is an algorithm that classifies unseen instances according to their membership in the fuzzy upper and lower approximations of the decision classes. Previous research has shown that the use of OWA operators increases the robustness of this model. However, calculating membership in an approximation requires a nearest neighbour search. In practice, the query time complexity of exact nearest neighbour search algorithms in more than a handful of dimensions is near-linear, which limits the scalability of FRNN-OWA. Therefore, we propose approximate FRNN-OWA, a modified model that calculates upper and lower approximations of decision classes using the approximate nearest neighbours returned by Hierarchical Navigable Small Worlds (HNSW), a recent approximate nearest neighbour search algorithm with logarithmic query time complexity at constant near-100% accuracy. We demonstrate that approximate FRNN-OWA is sufficiently robust to match the classification accuracy of exact FRNN-OWA while scaling much more efficiently. We test four parameter configurations of HNSW, and evaluate their performance by measuring classification accuracy and construction and query times for samples of various sizes from three large datasets. We find that with two of the parameter configurations, approximate FRNN-OWA achieves near-identical accuracy to exact FRNN-OWA for most sample sizes, with query times that are up to several orders of magnitude shorter.
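To make the classification step in this abstract more concrete, here is a minimal sketch of OWA-weighted upper and lower approximation memberships using an exact brute-force neighbour search. Several details are assumptions that the abstract does not state: the similarity measure (one minus the mean absolute feature difference, with features scaled to [0, 1]), the linearly decreasing weights, the averaging of upper and lower memberships, and the names `linear_weights` and `frnn_owa_scores`. The HNSW-based approximate search proposed in the paper is not shown.

```python
import numpy as np

def linear_weights(k):
    """Linearly decreasing OWA weights of length k that sum to 1 (assumed weighting scheme)."""
    w = np.arange(k, 0, -1, dtype=float)
    return w / w.sum()

def frnn_owa_scores(X_train, y_train, x_query, k=3):
    """Return {class: mean of upper and lower approximation membership} for one query."""
    # Assumed similarity in [0, 1]: one minus the mean absolute feature difference.
    sim = 1.0 - np.abs(X_train - x_query).mean(axis=1)
    scores = {}
    for c in np.unique(y_train):
        # k most similar instances inside and outside the class, in descending order.
        in_c = np.sort(sim[y_train == c])[::-1][:k]
        out_c = np.sort(sim[y_train != c])[::-1][:k]
        # Upper approximation: soft maximum of similarity to class members.
        upper = np.dot(linear_weights(len(in_c)), in_c)
        # Lower approximation: soft minimum of dissimilarity to non-members.
        lower = np.dot(linear_weights(len(out_c)), 1.0 - out_c)
        scores[c.item()] = (upper + lower) / 2.0
    return scores

X_train = np.array([[0.0, 0.0], [0.1, 0.1], [1.0, 1.0], [0.9, 0.9]])
y_train = np.array([0, 0, 1, 1])
scores = frnn_owa_scores(X_train, y_train, np.array([0.05, 0.05]), k=2)
predicted = max(scores, key=scores.get)
```

Swapping the two `np.sort` calls for an approximate neighbour index would change only how `in_c` and `out_c` are obtained, which is the part of the computation the abstract identifies as the scalability bottleneck.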

    Fuzzy-rough-learn 0.1 : a Python library for machine learning with fuzzy rough sets

    We present fuzzy-rough-learn, the first Python library of fuzzy rough set machine learning algorithms. It contains three algorithms previously implemented in R and Java, as well as two new algorithms from the recent literature. We briefly discuss the use cases of fuzzy-rough-learn and the design philosophy guiding its development, before providing an overview of the included algorithms and their parameters

    Unsupervised Learning of Sentence Embeddings using Compositional n-Gram Features

    The recent tremendous success of unsupervised word embeddings in a multitude of applications raises the obvious question of whether similar methods could be derived to improve embeddings (i.e. semantic representations) of word sequences as well. We present a simple but efficient unsupervised objective to train distributed representations of sentences. Our method outperforms the state-of-the-art unsupervised models on most benchmark tasks, highlighting the robustness of the produced general-purpose sentence embeddings. Comment: NAACL 201

    Una combinación basada en operadores OWA para la Clasificación de Género Multi-etiqueta de páginas web

    This paper presents a new method for genre identification that combines homogeneous classifiers using OWA (Ordered Weighted Averaging) operators. Our method uses character n-grams extracted from different information sources such as URL, title, headings and anchors. To deal with the complexity of web pages, we applied MLKNN as a multi-label classifier, in which a web page can be assigned more than one genre. Experiments conducted using a known multi-label corpus show that our method achieves good results.

    Fuzzy-Pattern-Classifier Based Sensor Fusion for Machine Conditioning


    BENCHMARKING CLASSIFIERS - HOW WELL DOES A GOWA-VARIANT OF THE SIMILARITY CLASSIFIER DO IN COMPARISON WITH SELECTED CLASSIFIERS?

    Digital data is ubiquitous in nearly all modern businesses. Organizations have more data available, in various formats, than ever before. Machine learning algorithms and predictive analytics utilize the knowledge contained in that data to support business-related decision-making. This study explores predictive analytics by comparing different classification methods, the main interest being the Generalized Ordered Weighted Averaging (GOWA) variant of the similarity classifier. The aim of this research is to find out what the GOWA-variant of the similarity classifier is and how well it performs compared to other selected classifiers. This study also investigates whether the GOWA-variant of the similarity classifier is a sufficient method for business-related decision-making. Four different classical classifiers were selected as reference classifiers on the basis of their common usage in machine learning research and their availability in the Statistics and Machine Learning Toolbox in MATLAB. Three different data sets from the UCI Machine Learning Repository were used for benchmarking the classifiers. The benchmarking process uses a fitness function instead of pure classification accuracy to determine the performance of the classifiers. The fitness function combines several measurement criteria into one common value. With one data set, the GOWA-variant of the similarity classifier performed the best. One of the data sets contains credit card client data. It was more complex than the other two data sets and contains clearly business-related data. The GOWA-variant also performed well with this data set. Therefore it can be claimed that the GOWA-variant of the similarity classifier is a viable option for solving business-related problems.
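For readers unfamiliar with the operator named in this abstract, the following is a small sketch of a generalized OWA aggregation as it is commonly defined: the inputs are sorted in descending order, raised to a power, combined with position-based weights, and the result is taken to the inverse power. The function name `gowa` and the parameter name `lam` are illustrative choices, and how the similarity classifier in the study uses the operator is not shown here.

```python
import numpy as np

def gowa(values, weights, lam=1.0):
    """Generalized OWA: (sum_j w_j * b_j**lam) ** (1/lam), with b_j the j-th largest value.

    Assumes non-negative values, weights that sum to 1, and lam != 0.
    """
    b = np.sort(np.asarray(values, dtype=float))[::-1]
    w = np.asarray(weights, dtype=float)
    return float(np.dot(w, b ** lam) ** (1.0 / lam))

similarities = [0.2, 0.8]
plain_owa = gowa(similarities, [0.5, 0.5], lam=1.0)   # lam = 1 reduces to the ordinary OWA
quadratic = gowa(similarities, [0.5, 0.5], lam=2.0)   # lam = 2 gives a weighted quadratic mean
```

With equal weights, raising `lam` pulls the result toward the larger inputs, so the exponent acts as an extra tuning parameter alongside the weights.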

    Learning ordered pooling weights in image classification

    Spatial pooling is an important step in computer vision systems like Convolutional Neural Networks or the Bag-of-Words method. The purpose of spatial pooling is to combine neighbouring descriptors to obtain a single descriptor for a given region (local or global). The resulting combined vector must be as discriminative as possible; in other words, it must contain relevant information while removing irrelevant and confusing details. Maximum and average are the most common aggregation functions used in the pooling step. To improve the aggregation of relevant information without degrading its discriminative power for image classification, we introduce a simple but effective scheme based on Ordered Weighted Average (OWA) aggregation operators. We present a method to learn the weights of the OWA aggregation operator in a Bag-of-Words framework and in Convolutional Neural Networks, and provide an extensive evaluation showing that OWA-based pooling outperforms classical aggregation operators.
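As an illustration of why an OWA operator can stand in for the usual pooling functions, the sketch below shows that maximum and average pooling are both special cases of one sorted, weighted sum. The function name `owa_pool` and the example values are hypothetical, and the weights here are fixed by hand; the weight-learning procedure described in the abstract is not reproduced.

```python
import numpy as np

def owa_pool(values, weights):
    """Apply an OWA operator: sort values in descending order, then take the weighted sum."""
    ordered = np.sort(np.asarray(values, dtype=float))[::-1]
    return float(np.dot(np.asarray(weights, dtype=float), ordered))

region = [0.2, 0.9, 0.4, 0.1]                     # activations in one pooling region

max_pool = owa_pool(region, [1, 0, 0, 0])         # all weight on the largest value
avg_pool = owa_pool(region, [0.25] * 4)           # uniform weights
in_between = owa_pool(region, [0.6, 0.3, 0.1, 0]) # a hand-picked intermediate weighting
```

Because the weights attach to rank positions rather than to spatial locations, any weight vector between these two extremes yields a pooling function that lies between average and maximum.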